by mariuz on 4/29/25, 3:18 AM with 24 comments
by gwf on 4/30/25, 4:42 PM
[*] G.W. Flake & B.A. Pearlmutter, "Differentiating Functions of the Jacobian with Respect to the Weights," NeurIPS 1999. https://proceedings.neurips.cc/paper_files/paper/1999/file/b...
by rdyro on 4/30/25, 6:02 AM
Computing sparse Jacobians can save a lot of compute when there's a genuine lack of dependency between parts of the input and parts of the output. Discovering this sparsity automatically through graph coloring is very appealing (a minimal sketch of the coloring trick follows after this comment).
Another alternative is to implement sparse rules for each operation yourself, but that often requires a custom autodiff implementation, which isn't easy to get right. I wrote a small toy version of a sparse rule-based autodiff here: https://github.com/rdyro/SpAutoDiff.jl
Another example (a much more serious one) is https://github.com/microsoft/folx
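To make the coloring idea concrete, here is a minimal sketch in JAX (my own illustration, not code from the linked repos; the stencil function f and the hard-coded tridiagonal sparsity pattern are assumptions). Because columns j and j+3 of a tridiagonal Jacobian never share a nonzero row, three JVPs with color-indicator seed vectors recover all n columns:

    import jax
    import jax.numpy as jnp
    import numpy as np

    def f(x):
        # 1-D Laplacian stencil with zero boundaries: f_i = x_{i-1} - 2*x_i + x_{i+1}.
        # Its Jacobian is tridiagonal, so columns j and j+3 are structurally orthogonal.
        xp = jnp.pad(x, 1)
        return xp[:-2] - 2.0 * xp[1:-1] + xp[2:]

    n = 9
    x = jnp.arange(1.0, n + 1)
    colors = np.arange(n) % 3  # a valid coloring for a tridiagonal pattern

    J = jnp.zeros((n, n))
    for c in range(3):  # 3 JVPs instead of n
        seed = jnp.asarray(colors == c, dtype=x.dtype)
        _, compressed = jax.jvp(f, (x,), (seed,))  # sums all columns of this color
        for j in np.flatnonzero(colors == c):
            rows = np.arange(max(j - 1, 0), min(j + 2, n))  # known nonzeros of column j
            J = J.at[rows, j].set(compressed[rows])  # scatter each column back into place

    assert jnp.allclose(J, jax.jacfwd(f)(x))  # matches the dense Jacobian

The libraries discussed here automate both steps this sketch hard-codes: detecting the sparsity pattern and choosing the column grouping.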
by whitten on 4/30/25, 4:09 AM
Is this type of analysis part of a particular mathematical heritage?
What would it be called?
Is this article relevant? https://medium.com/@lobosi/calculus-for-machine-learning-jac...
by FilosofumRex on 4/30/25, 2:35 PM
https://davidtabora.wordpress.com/wp-content/uploads/2015/01...
A short overview is Chapter 11 of Gilbert Strang's Introduction to Linear Algebra: https://math.mit.edu/~gs/linearalgebra/ila5/linearalgebra5_1...
AD comes from a different tradition, dating back to FORTRAN 77 programmers' attempts to differentiate non-elementary functions (for loops, procedural functions, subroutines, etc.). Note the hardware specs for some nostalgia: https://www.mcs.anl.gov/research/projects/adifor/
by nathan_douglas on 4/29/25, 1:38 PM
by patrick451 on 5/2/25, 11:48 PM
by goosedragons on 4/30/25, 5:22 PM
by oulipo on 4/30/25, 6:11 AM