How much data does it take to pretrain a (human) brain? I conducted a (fairer) Fermi estimate.
The post goes through the following questions:
How long does it take to grow a human brain?
How many waking seconds do we have in our life?
How many “tokens” or “data points” does a human brain process in a second?
Can we simply count the spikes?
How many bits (spikes and non-spikes) does it take for the brain to process 1 sensory “piece of information”?
How do those numbers stack up against LLMs?
To get to this conclusion table:
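As a rough sketch of how these pieces multiply together (every number below is an illustrative placeholder, not the post’s actual estimate):

```python
# Back-of-the-envelope sketch of the kind of calculation the post walks through.
# Every number below is an illustrative placeholder, not the post's actual figure.

YEARS = 20                     # assumed "pretraining" window for a human brain
WAKING_HOURS_PER_DAY = 16      # sleep excluded, matching the conservative choice in the post
TOKENS_PER_SECOND = 10         # assumed sensory "tokens"/"data points" per waking second

waking_seconds = YEARS * 365 * WAKING_HOURS_PER_DAY * 3600
human_tokens = waking_seconds * TOKENS_PER_SECOND

LLM_TOKENS = 15e12             # assumed LLM pretraining corpus (roughly the scale reported for Llama 3)

print(f"Waking seconds:      {waking_seconds:.2e}")
print(f"Human 'tokens':      {human_tokens:.2e}")
print(f"LLM tokens:          {LLM_TOKENS:.2e}")
print(f"Ratio (LLM / human): {LLM_TOKENS / human_tokens:.0f}x")
```

With these placeholders the LLM corpus comes out a few thousand times larger, but the ratio swings by orders of magnitude depending mainly on what you assume for tokens per second.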
I think you’re missing a trick here by focusing purely on waking seconds.
Demis explains this beautifully in this talk here:
TLDW:
The brain replays memories stochastically during sleep, and replays them OOMs faster than they were experienced. This “multi epoch training” allows the brain to learn much more from the environment, and it can prioritise salient experiences.
Perhaps? I’m not fully understanding your point; could you explain a bit more what I’m missing? How does accounting for sleep and memory replay change the comparison of pretraining dataset sizes between human brains and LLMs? At first glance, my reading of your point is that adding in sleep seconds would increase the training set size for humans by a third or more. I wanted to keep my estimate conservative, so I didn’t add in sleep seconds, but I’m sure there’s a case for an adjustment that adds them in.
Yeah, I don’t think it makes sense to add sleep if you are estimating “data points”, since it’s rehearsing remixes of the data from awake times.
On the other hand, if you are estimating “training steps”, then it does make sense to count sleep, just as you’d count additional passes over the same data.
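To make that concrete, here’s a minimal sketch; the replay speedup and the token rate are free parameters I’m not committing to, just placeholders:

```python
# "Data points" vs "training steps" once sleep replay is counted.
# All parameter values are illustrative assumptions, not claims made in this thread.

WAKING_HOURS = 16
SLEEP_HOURS = 8
TOKENS_PER_SECOND = 10       # placeholder sensory-token rate, as in the sketch above
REPLAY_SPEEDUP = 10          # how much faster replay runs than lived experience (disputed below)

waking_tokens_per_day = WAKING_HOURS * 3600 * TOKENS_PER_SECOND

# "Data points": only unique waking experience counts; replay adds nothing new.
unique_data_points_per_day = waking_tokens_per_day

# "Training steps": replay during sleep acts like extra passes over (a subset of) the same data.
replay_steps_per_day = SLEEP_HOURS * 3600 * TOKENS_PER_SECOND * REPLAY_SPEEDUP
training_steps_per_day = waking_tokens_per_day + replay_steps_per_day

print(f"Unique data points/day: {unique_data_points_per_day:.2e}")
print(f"Training steps/day:     {training_steps_per_day:.2e}")
print(f"Effective epochs/day:   {training_steps_per_day / unique_data_points_per_day:.1f}")
```

With a 10x replay speedup over 8 hours of sleep you get roughly 6 effective epochs per day over the same unique data, which is why the “training steps” number moves a lot even though the “data points” number doesn’t.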
“OOMs faster”? Where do you get that idea?
Dreams indicate a need for more processing than what happens when we’re awake, but likely less than 2x waking time, not OOMs more.
The linked video says so at 30:45.