For the 40 parameters thing, this link should work. See also this earlier paper.
BTW: the way I found that first link was by searching the title on Google Scholar, finding the paper, and clicking “All 5 versions” below it (right next to “Cited by 7” and “Related articles”). That brought me to a bunch of versions, one of which was a seemingly ungated PDF. This will probably work often, because AI researchers usually make their papers publicly available (at least in preprint form).
Thanks for the link! It looks like they do put optimization effort into choosing the subspace, but it’s still interesting that the training process can be factored into two pieces like that: first pick the subspace, then optimize within it.
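For anyone who wants the factoring made concrete, here’s a minimal sketch under toy assumptions: piece 1 fixes a projection into a low-dimensional subspace (here just a frozen random matrix, whereas the paper also spends optimization effort on this step), and piece 2 runs ordinary gradient descent on only the subspace coordinates. The quadratic loss and all names are illustrative, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

D, d = 1000, 40                    # full parameter count vs. subspace dimension
theta_0 = rng.normal(size=D)       # random init in the full space
theta_star = rng.normal(size=D)    # toy "true" parameters for the toy loss

# Piece 1: choose the subspace. Here it's a frozen random projection;
# nothing about it is trained afterwards.
P = rng.normal(size=(D, d)) / np.sqrt(D)

def loss(theta):
    # Toy quadratic loss standing in for the real training objective.
    return 0.5 * np.sum((theta - theta_star) ** 2)

def grad_z(z):
    # Chain rule: dL/dz = P^T (dL/dtheta), with dL/dtheta = theta - theta_star.
    return P.T @ (theta_0 + P @ z - theta_star)

# Piece 2: plain gradient descent on the d subspace coordinates only;
# the D full-space parameters are never touched directly.
z = np.zeros(d)
lr = 0.5
for _ in range(300):
    z -= lr * grad_z(z)

print(f"loss at init:            {loss(theta_0):.1f}")
print(f"loss after subspace fit: {loss(theta_0 + P @ z):.1f}")
```

The loss won’t reach zero, since the optimum generally doesn’t lie in the 40-dimensional affine slice; how close you can get for a given d is exactly what makes the subspace dimension an interesting quantity.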