NTK training requires training time that scales quadratically with the number of training examples, so it’s not usable for large training datasets (nor with data augmentation, since that simulates a larger dataset). (I’m not an NTK expert, but, from what I understand, this quadratic growth is not easy to get rid of.)
NTK training requires training time that scales quadratically with the number of training examples, so it’s not usable for large training datasets (nor with data augmentation, since that simulates a larger dataset). (I’m not an NTK expert, but, from what I understand, this quadratic growth is not easy to get rid of.)