I have a distinct, strong memory of grokking at a very young age (naturally, well before I had heard the term): learning the basic idea of base-10 digit increments from a long list of examples of the integer sequence. What I find most interesting is how rewarding grokking is. There’s a clear, strong "aha!" reward feeling in the brain when one learns a low-complexity rule that compresses some (presumably important) high-level sensory token stream.
There’s a pretty strong Bayesian argument that learning is sensory compression, and so this is just the general low-level mechanism underlying much of the brain, but at the same time it seems to have a high-level System 2 counterpart.