Thanks for the link! Looks like they do put optimization effort into choosing the subspace, but it’s still interesting that the training process can be factored into two pieces like that.