I think you could, but you’d be missing out on the 9% (for gpt2-small) of the variance that isn’t in one of those three dimensions, so you might degrade your performance.
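For a concrete sense of what that leftover fraction means, here is a rough sketch of how one might measure the share of variance lying outside a chosen set of directions. It assumes the directions are the top-k principal components of some activation matrix, which may not match exactly how the three dimensions here were picked. The data is synthetic, and `residual_variance_fraction` is a hypothetical helper name, not something from gpt2 or any library.

```python
import numpy as np

def residual_variance_fraction(X, k=3):
    """Fraction of total variance NOT captured by the top-k principal directions.

    X is an (n_samples, n_features) array. Assumption: the "kept" dimensions
    are the top-k principal components of X.
    """
    # Center the data so squared singular values are proportional to variance.
    Xc = X - X.mean(axis=0, keepdims=True)
    # Singular values come back sorted in descending order.
    s = np.linalg.svd(Xc, compute_uv=False)
    var = s ** 2
    return float(var[k:].sum() / var.sum())

# Synthetic stand-in for real activations: three high-variance directions
# plus low-variance noise in the remaining 13 dimensions.
rng = np.random.default_rng(0)
scales = np.array([10.0, 8.0, 6.0] + [1.0] * 13)
X = rng.normal(size=(1000, 16)) * scales

frac = residual_variance_fraction(X, k=3)
print(f"variance outside the top 3 directions: {frac:.1%}")
```

Whatever this fraction turns out to be on real activations is the part you'd be discarding by keeping only those directions, which is why performance could degrade.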