from Hacker News

The matrix calculus you need for deep learning (2018)

by cpp_frog on 7/30/23, 5:18 PM with 40 comments

  • by dang on 7/31/23, 7:56 PM

    Related:

    The matrix calculus you need for deep learning (2018) - https://news.ycombinator.com/item?id=26676729 - April 2021 (40 comments)

    Matrix calculus for deep learning part 2 - https://news.ycombinator.com/item?id=23358761 - May 2020 (6 comments)

    Matrix Calculus for Deep Learning - https://news.ycombinator.com/item?id=21661545 - Nov 2019 (47 comments)

    The Matrix Calculus You Need for Deep Learning - https://news.ycombinator.com/item?id=17422770 - June 2018 (77 comments)

    Matrix Calculus for Deep Learning - https://news.ycombinator.com/item?id=16267178 - Jan 2018 (81 comments)

  • by quanto on 7/31/23, 4:55 PM

    The article/webpage is a nice walk-through for the uninitiated. Half the challenge of doing matrix calculus is remembering the dimension of the object you are dealing with (scalar, vector, matrix, higher-dim tensor).

    Ultimately, the point of using matrix calculus (or matrices in general) is not just concision of notation but also understanding that matrices are operators acting on members of some spaces, i.e. vectors. It is this higher level abstraction that makes matrices powerful.

    For people who are familiar with the concepts but need a concise refresher, the Wikipedia page serves well:

    https://en.wikipedia.org/wiki/Matrix_calculus
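
    The dimension bookkeeping mentioned in this comment can be checked numerically. Below is a minimal sketch in plain Python, assuming the numerator-layout convention (a function from R^n to R^m has an m x n Jacobian). The helper `numerical_jacobian` and the two example functions are hypothetical names made up for illustration, not taken from the article.

```python
# Finite-difference sketch for checking the shape of a derivative.
# numerical_jacobian is a hypothetical helper, not part of any library.

def numerical_jacobian(f, x, eps=1e-6):
    """Approximate the Jacobian of f at x using central differences.

    f maps a list of n floats to a list of m floats.
    Returns an m x n nested list (numerator layout: row i is output i).
    """
    n = len(x)
    m = len(f(x))
    jac = [[0.0] * n for _ in range(m)]
    for j in range(n):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[j] += eps
        x_minus[j] -= eps
        f_plus = f(x_plus)
        f_minus = f(x_minus)
        for i in range(m):
            jac[i][j] = (f_plus[i] - f_minus[i]) / (2 * eps)
    return jac


def linear_map(x):
    """Example map from R^3 to R^2: y = W x with a fixed 2 x 3 matrix W."""
    W = [[1.0, 2.0, 3.0],
         [4.0, 5.0, 6.0]]
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def squared_norm(x):
    """Example scalar function R^3 -> R, wrapped as a length-1 list."""
    return [sum(xi * xi for xi in x)]


x0 = [0.5, -1.0, 2.0]
J_linear = numerical_jacobian(linear_map, x0)  # 2 x 3, approximately W
J_norm = numerical_jacobian(squared_norm, x0)  # 1 x 3, approximately 2 * x0

print(len(J_linear), len(J_linear[0]))  # prints: 2 3
print(len(J_norm), len(J_norm[0]))      # prints: 1 3
```

    Under this convention the derivative of a scalar with respect to a vector comes out as a row; texts that use the denominator layout would write the same quantities as a column, which is one reason the shapes are easy to mix up.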

  • by SnooSux on 7/31/23, 3:15 PM

    This is the resource I wish I had in 2018. Every grad school course had a Linear Algebra review lecture but never got into the Matrix Calculus I actually needed.

  • by cs702 on 7/31/23, 3:16 PM

    Please change the link to the original source:

    https://arxiv.org/abs/1802.01528

    ---

    EDIT: It turns out explained.ai is the personal website of one of the authors, so there's no need to change the link. See comment below.

  • by trolan on 7/31/23, 4:01 PM

    I finished Vector Calculus last year and have no experience in machine learning, but this seems exceptionally thorough. Having a practical explanation rather than a purely mathematical one would have made my life easier, but woe is the life of the engineering student, I guess.

  • by rdedev on 7/31/23, 6:58 PM

    I had followed this when I was learning DL through Andrew Ng's course. In one of the lessons, he had the formula for calculating the loss as well as its derivatives.

    I tried deriving these formulas from scratch using what I learned from OP's post, but it felt like there was something missing. I think it boils down to me not knowing how to aggregate those element-wise derivatives into matrix form. AFAIK it was the Matrix Cookbook and certain notes from Stanford's CS231n that helped me grok it fully.
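
    The step of aggregating element-wise derivatives into a matrix can be sketched for one common case. This is a minimal example in plain Python, assuming a linear layer Y = XW and a made-up loss L = 0.5 * sum(Y^2), so that the upstream gradient dL/dY equals Y. The shapes, values, and helper names are assumptions for illustration and do not come from the course or the article.

```python
# Sketch: collecting element-wise derivatives dL/dW[i][j] into X^T G.
# All names, shapes, and the loss are illustrative assumptions.

def matmul(A, B):
    """Plain-list matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def transpose(A):
    return [list(col) for col in zip(*A)]


def loss(X, W):
    """L = 0.5 * sum of squared entries of Y = X W."""
    Y = matmul(X, W)
    return 0.5 * sum(y * y for row in Y for y in row)


X = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]             # N=3 samples, D=2 features
W = [[0.1, -0.2, 0.3, 0.0], [0.5, 0.4, -0.1, 0.2]]   # D=2 inputs, K=4 outputs

G = matmul(X, W)  # upstream gradient dL/dY, equal to Y for this loss

# Element-wise form: dL/dW[i][j] = sum over n of X[n][i] * G[n][j]
grad_elementwise = [[sum(X[n][i] * G[n][j] for n in range(len(X)))
                     for j in range(len(W[0]))] for i in range(len(W))]

# Matrix form: the same sums are exactly the entries of X^T G
grad_matrix = matmul(transpose(X), G)


def numerical_grad(X, W, eps=1e-6):
    """Central-difference check of dL/dW, one entry at a time."""
    grad = [[0.0] * len(W[0]) for _ in W]
    for i in range(len(W)):
        for j in range(len(W[0])):
            W_plus = [row[:] for row in W]
            W_minus = [row[:] for row in W]
            W_plus[i][j] += eps
            W_minus[i][j] -= eps
            grad[i][j] = (loss(X, W_plus) - loss(X, W_minus)) / (2 * eps)
    return grad


grad_numeric = numerical_grad(X, W)
max_diff = max(abs(grad_matrix[i][j] - grad_numeric[i][j])
               for i in range(len(W)) for j in range(len(W[0])))
print(max_diff < 1e-4)
```

    The gradient has the same D x K shape as W, which is a quick sanity check that the transpose sits on the right factor.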

  • by bluerooibos on 7/31/23, 7:07 PM

    Oh nice, I did most of this in school, and during my non-CS engineering degree. Thanks for sharing!

    Always wanted to dip my toes into ML, but I've never been convinced of its usefulness to the average solo developer, in terms of things you can build with this new knowledge. Likely I don't know enough about it to make that call though.

  • by godelski on 7/31/23, 6:38 PM

    There's a common belief that you don't need math for ML or that you need a lot of math for ML. So let me clarify:

    You don't need math to make a model perform well, but you do need math to know why your model is wrong.

  • by nsajko on 7/31/23, 7:12 PM

    Another matrix math reference: https://github.com/r-barnes/MatrixForensics

  • by _the_inflator on 7/31/23, 4:56 PM

    I just had a quick glance at it. A good summary.

    It seems that these topics are covered by the first one or two semesters of a Math degree. Of course university is a bit more advanced.

  • by jayro on 7/31/23, 7:20 PM

    We just released a comprehensive online course on Multivariable Calculus (https://mathacademy.com/courses/multivariable-calculus), and we also have a course on Mathematics for Machine Learning (https://mathacademy.com/courses/mathematics-for-machine-lear...) that covers just the matrix calculus you need in addition to just the linear algebra and statistics you need, etc. I'm a founder and would be happy to answer any questions you might have.

  • by thatsadude on 7/31/23, 5:47 PM

    vec(ABC)=kron(C.T,A)vec(B) is all you need for matrix calculus!
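
    A minimal sketch checking the identity vec(ABC) = kron(C.T, A) vec(B) on small integer matrices, assuming vec stacks columns (column-major order). It uses plain Python lists so it runs without dependencies; the helper names and the example matrices are made up for illustration. In NumPy, np.kron and B.flatten(order="F") would play the roles of kron and vec.

```python
# Sketch: verify vec(ABC) == kron(C^T, A) @ vec(B) with column-major vec.
# Helper names and example matrices are illustrative assumptions.

def matmul(A, B):
    """Plain-list matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def transpose(A):
    return [list(col) for col in zip(*A)]


def kron(A, B):
    """Kronecker product: block (i, j) of the result is A[i][j] * B."""
    return [[A[i][j] * B[p][q]
             for j in range(len(A[0])) for q in range(len(B[0]))]
            for i in range(len(A)) for p in range(len(B))]


def vec(A):
    """Stack the columns of A into one flat list (column-major order)."""
    return [A[i][j] for j in range(len(A[0])) for i in range(len(A))]


def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]


A = [[1, 2, 3], [4, 5, 6]]        # 2 x 3
B = [[1, 0], [2, -1], [0, 3]]     # 3 x 2
C = [[1, 2, 0, -1], [3, 0, 1, 2]] # 2 x 4

lhs = vec(matmul(matmul(A, B), C))  # vec of the 2 x 4 product, length 8
K = kron(transpose(C), A)           # (4*2) x (2*3) = 8 x 6 matrix
rhs = matvec(K, vec(B))             # length 8

print(lhs == rhs)  # prints: True
```

    The identity depends on the column-stacking convention: with a row-major vec (the default flatten order in many array libraries) the two Kronecker factors change places.
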
  • by scrubs on 7/31/23, 1:20 AM

    Darn good post!