
Any good textbooks for this topic?


If you google "matrix derivatives" there are a ton of cheat sheets, like this one:

http://www.gatsby.ucl.ac.uk/teaching/courses/sntn/sntn-2017/...

I used matrix derivatives in grad school a lot (my dissertation was on nonconvex optimization) and I'm not sure I've ever needed an entire textbook. Matrix calculus is just plain calculus with a few extra rules to generalize it to matrices/vectors.

(To be effective, though, you do need to know the rules of matrix algebra.)
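To illustrate the "plain calculus with a few extra rules" point, here is a quick numerical sanity check of one standard rule (my own sketch, not from the comment): for f(x) = xᵀAx, the gradient is (A + Aᵀ)x. Central finite differences confirm it.

```python
import numpy as np

# Rule being checked: grad of f(x) = x^T A x is (A + A^T) x.
rng = np.random.default_rng(0)
n = 5
A = rng.standard_normal((n, n))
x = rng.standard_normal(n)

analytic = (A + A.T) @ x

# Central finite differences, one coordinate at a time.
eps = 1e-6
numeric = np.empty(n)
for i in range(n):
    e = np.zeros(n)
    e[i] = eps
    numeric[i] = ((x + e) @ A @ (x + e) - (x - e) @ A @ (x - e)) / (2 * eps)

print(np.allclose(analytic, numeric, atol=1e-4))  # True
```

The same check works for any entry in a matrix-derivative cheat sheet, which makes finite differences a handy way to catch sign and transpose mistakes.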


I am a big fan of "Matrix Differential Calculus with Applications in Statistics and Econometrics" by Jan R. Magnus and Heinz Neudecker. They are very thorough and rigorous in their exposition and proofs, which demands a degree of mathematical maturity from the reader. If you are comfortable with multivariate calculus (at the level of mathematical analysis for some of the material) and linear algebra, then this book will teach you matrix calculus well.

On the other hand, if you are already familiar with calculus and linear algebra, then most of the material consists of straightforward derivations using theorems you already know, so you might not need a textbook at all. I liked this book because it took the mystery out of formulas I had been taught without ever being shown their derivations, and because it was actually a quick read.

There are some limitations, though. The book stops at roughly Hessian matrices and matrix derivatives of vectors, because the chain rule starts breaking down and you need multilinear algebra for higher-order derivatives. For deep learning, however, you don't need to worry about that: the book will teach you enough to derive the gradient of a convolution realized as a matrix product.
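To make the last point concrete, here is a small sketch (my own illustration, with a hypothetical `conv_matrix` helper, not from the book) of a 1-D "valid" correlation written as a matrix product y = Wx, so the input gradient for any loss L is just Wᵀ(dL/dy) by the ordinary matrix product rule.

```python
import numpy as np

def conv_matrix(w, n):
    """Build the banded matrix W so that W @ x equals the
    'valid' sliding-window correlation of x with kernel w."""
    k = len(w)
    m = n - k + 1
    W = np.zeros((m, n))
    for i in range(m):
        W[i, i:i + k] = w
    return W

rng = np.random.default_rng(1)
x = rng.standard_normal(8)
w = rng.standard_normal(3)
W = conv_matrix(w, len(x))

y = W @ x                            # the convolution as a matrix product
grad_y = rng.standard_normal(len(y)) # stand-in for an upstream gradient dL/dy
grad_x = W.T @ grad_y                # dL/dx from the matrix product rule

# Sanity check against central finite differences of L = grad_y . y.
eps = 1e-6
numeric = np.empty(len(x))
for i in range(len(x)):
    e = np.zeros(len(x))
    e[i] = eps
    numeric[i] = (grad_y @ (W @ (x + e)) - grad_y @ (W @ (x - e))) / (2 * eps)

print(np.allclose(grad_x, numeric, atol=1e-4))  # True
```

Real frameworks never materialize W, of course, but the matrix-product view is exactly what makes the gradient derivation mechanical.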


Mentioned in another comment as well: "tensor calculus" is the right term to search for if you want a full textbook.



