Here’s my preferred formulation of the general derivative problem (skip to the last paragraph if you just want the summary): you have some function $f(x)$. We’ll assume that it’s been “flattened out”, i.e. all the loops and recursive calls have been expanded, so it’s just a straight-line numerical function. Adopting hilariously bad variable names, suppose the $i$-th line of $f$ computes $y_i$. We’ll also assume that the first lines of $f$ just load in $x$, so e.g. $y_0 = x_0$. If $f$ has $n$ lines, then the output of $f$ is $y_n$.
Now, we create a vector-valued function $F(y)$, which runs each line of $f$ in parallel: $F_i(y)$ = (line $i$ of $f$ evaluated at $y$). $f(x)$ computes a fixed point $y = F(y)$ (it may take a moment of thought or an example for that part to make sense). It’s that fixed-point formula which we differentiate. The result: we get $\frac{\partial F}{\partial x} = A \frac{dy}{dx}$, where $A = I - \frac{\partial F}{\partial y}$ is a very sparse triangular matrix (here $F$ depends on $x$ only through the lines that load it in). In fact, we don’t even need to solve the whole thing; we only need $\frac{dy_n}{dx}$. Backprop just uses the usual method for solving triangular matrices: start at the end and work back.
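To make the fixed-point step concrete, here is a small worked example. The function is my own illustrative choice: take $f(x) = x \sin(x)$, which flattens to three lines, $y_0 = x$, $y_1 = \sin(y_0)$, $y_2 = y_0 y_1$. Running all three lines in parallel gives

$$F(y) = \begin{pmatrix} x \\ \sin(y_0) \\ y_0 y_1 \end{pmatrix},$$

and the values that $f$ computes are exactly the $y$ with $y = F(y)$. Differentiating both sides with respect to $x$ and collecting the $\frac{dy}{dx}$ terms on one side:

$$\begin{pmatrix} 1 & 0 & 0 \\ -\cos(y_0) & 1 & 0 \\ -y_1 & -y_0 & 1 \end{pmatrix} \begin{pmatrix} \frac{dy_0}{dx} \\ \frac{dy_1}{dx} \\ \frac{dy_2}{dx} \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}.$$

The matrix on the left is $I - \frac{\partial F}{\partial y}$. It is lower triangular because each line only reads earlier lines, and sparse because each line only reads a few of them. The right-hand side is nonzero only in the row that loads $x$. Solving it gives $\frac{dy_2}{dx} = y_1 + y_0 \cos(y_0) = \sin(x) + x \cos(x)$, which matches the product rule.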
Main point: derivative calculation, in general, can be done by solving a (sparse, triangular) system of linear equations. There’s a whole field devoted to solving sparse linear systems, especially in parallel. Different methods work better depending on the matrix structure (which will follow the structure of the computation DAG of $f$), so different methods will work better for different functions. Pick your favorite sparse matrix solver, ideally one which will leverage triangularity, and boom, you have a derivative calculator.
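A minimal sketch of the "derivative as a triangular solve" idea in plain Python, assuming the toy function $f(x) = x \sin(x)$ flattened into three lines. All the names here (`trace`, `jacobian_rows`, `forward_solve`, `backward_solve`) are hypothetical and just for illustration. A real implementation would presumably hand the matrix to a proper sparse solver rather than use hand-written substitution.

```python
import math

def trace(x):
    """Run the flattened program f(x) = x * sin(x), recording each line's value."""
    y0 = x               # line 0: load the input
    y1 = math.sin(y0)    # line 1
    y2 = y0 * y1         # line 2: the output
    return [y0, y1, y2]

def jacobian_rows(y):
    """Sparse dF/dy, one dict per line: {column j: dF_i/dy_j}."""
    return [
        {},                     # y0 = x reads no other line
        {0: math.cos(y[0])},    # d sin(y0) / dy0
        {0: y[1], 1: y[0]},     # d (y0 * y1) / dy0 and / dy1
    ]

def forward_solve(J, b):
    """Solve (I - J) d = b from the top down (J is strictly lower triangular)."""
    d = []
    for i, row in enumerate(J):
        d.append(b[i] + sum(c * d[j] for j, c in row.items()))
    return d  # d[i] = dy_i/dx for every line

def backward_solve(J, b):
    """Solve (I - J)^T lam = e_n from the bottom up, then return lam . b.

    This is the 'start at the end and work back' order: it yields only
    dy_n/dx, without computing dy_i/dx for the intermediate lines.
    """
    n = len(J)
    lam = [0.0] * n
    lam[-1] = 1.0
    for i in range(n - 1, -1, -1):
        for j, c in J[i].items():
            lam[j] += c * lam[i]
    return sum(l * bi for l, bi in zip(lam, b))

x = 2.0
y = trace(x)
J = jacobian_rows(y)
b = [1.0, 0.0, 0.0]  # dF/dx: only the load line depends on x directly

print(forward_solve(J, b)[-1])  # dy_2/dx via forward substitution
print(backward_solve(J, b))     # same derivative via the transposed solve
print(math.sin(x) + x * math.cos(x))  # product-rule answer, for comparison
```

Both solves work on the same triangular system; the forward one sweeps from the inputs, and the backward one sweeps from the output, which is the cheaper direction when there are many inputs and one output.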
Side note: do these comments support LaTeX? Is there a page explaining what comments do support? It doesn’t seem to be markdown, no idea what we’re using here.
It is a WYSIWYG markdown editor, and the dollar sign is the symbol that opens the LaTeX editor (I’ve LaTeXed your comment for you, hope that’s okay).
Added: @habryka oops, double-comment!
Ooooh, that makes much more sense now; I was confused by the auto-formatting as I typed. Thank you for taking the time to clean up my comment. Also, thank you @habryka.
Also, how do images work in posts? I was writing up a post the other day, but when I tried to paste in an image it just created a camera symbol. Alternatively, is this stuff documented somewhere?
My transatlantic flight permitting, I’ll reply with a post tomorrow with full descriptions of how to use the editor.
Thank you very much! I really appreciate the time you guys are putting into this.
You’re welcome :-) Here’s a mini-guide to the editor.
The thing is now in LaTeX! Beautiful!
Yep, we support LaTeX and do a WYSIWYG translation of markdown as soon as you type it (i.e. words between asterisks get bolded, etc.). You can start typing LaTeX by typing $, and then a small equation editor shows up. You can also insert block-level equations by pressing CTRL+M.
Typing $ does nothing on my iPhone.
Because the mobile editing experience was pretty buggy, we replaced the mobile editor with a markdown-only editor two days ago. We will activate LaTeX for that editor pretty soon (which will probably mean replacing equations between “$$” with the LaTeX rendered version), but that means LaTeX is temporarily unavailable on phones (though the previous LaTeX editor didn’t really work with phones anyways, so it’s mostly just a strict improvement on what we have).
Ok, no problem; I don’t really know LaTeX anyway.