Sounds plausible, but why does this differentially impact the generalizing algorithm over the memorizing algorithm?
Perhaps under normal circumstances both are learned so fast that you just don’t notice that one is slower than the other, and this slows both of them down enough that you can see the difference?