GPT-175bee
Epistemic status: whimsical
Bees: a new unit of measurement for ML model size
Talking about modern ML models inevitably leads to a bunch of hard-to-intuit large numbers, especially when it comes to parameter count.
To address this, we propose that we adopt a new, human-friendly unit to measure the number of learnable parameters in an architecture:
1 beepower = 1 BP = 1 billion parameters
Bees have about one billion[1] synapses[2] in their forebrain[3], so this gives a nice basis for comparisons[4] between animal brains and artificial neural nets.
Like horsepower and candlepower,[5] the unit of beepower expresses the scale of a new and unfamiliar technology in terms that we are already familiar with. And it makes discussion of model specs flow better.
“This model has twenty bees”, you might say. Or “wow, look at all that beepower; did you train long enough to make good use of it?”
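If you'd like to do the conversion yourself, here's a minimal Python sketch. The helper `bees` and the `models` table are ours, and the parameter counts are the approximate figures quoted later in this post, not official specs:

```python
BP = 1e9  # 1 beepower = one billion parameters

def bees(params: float) -> float:
    """Convert a raw parameter count into bees."""
    return params / BP

# Approximate parameter counts as cited in this post (estimates, not specs).
models = {
    "Ada": 350e6,
    "Babbage": 1.3e9,
    "Curie": 6.7e9,
    "Chinchilla": 70e9,
    "Davinci": 175e9,
    "Gopher": 280e9,
    "PaLM": 540e9,
}

for name, params in models.items():
    print(f"{name}: {bees(params):g} bees")
# Ada: 0.35 bees, Babbage: 1.3 bees, ..., PaLM: 540 bees
```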
Here’s a helpful infographic to calibrate you on this new unit of measurement:
Other animals
We can even benchmark[6] against more or less brainy animals.
The smallest OpenAI API model, Ada, is probably[7] 350 million parameters, or about a third of a bee, which is comparable to a cricket:
The next size up, Babbage, is around 1.3 BP, or cockroach-sized.
Curie has almost seven bees, which is… sort of in an awkward gap between insects and mammals.
Davinci is a 175-bee model, which gets us up to hedgehog (or quail) scale:
Gopher (280 BP) is partridge (or ferret) sized. More research into actual gophers is needed to know how many gophers' worth of parameters Gopher has.
Amusingly, PaLM, at 540 bees, has about as many parameters as a chinchilla has synapses:[8]
Tragically, we could not figure out how many palms' worth of parameters Chinchilla (70 bees) has. We leave this as an exercise for the reader.
There are about 170,000 neurons in the corpora pedunculata of a honeybee, or roughly 140,000 after adjusting for the tendency of optical fractionators to overcount. Some sources give about 7,000 synapses per neuron for the human brain, and since humans and mice have comparable synapse-per-neuron counts, that figure doesn't seem to scale too badly with brain size. Skeptical readers are encouraged to shut up and multiply.
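For those who would rather multiply in Python, a minimal sketch using the figures above (the 0.8 overcount correction is the one from footnote [6]):

```python
# Bee forebrain synapse estimate, from the figures above.
measured_neurons = 170_000    # optical-fractionator count, mushroom bodies
overcount_factor = 0.8        # fractionators tend to overcount (footnote [6])
synapses_per_neuron = 7_000   # human figure; mice are comparable

neurons = measured_neurons * overcount_factor        # ~140,000
synapses = neurons * synapses_per_neuron             # ~9.5e8
print(f"~{synapses:.2e} synapses, i.e. about 1 BP")  # ~9.52e+08
```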
Is one synapse equivalent to one parameter? Well, there are about five bits of recoverable information encoded in the strength of a synaptic connection, and neural-net parameters can be compressed to eight bits (or even four bits!) without too much loss of performance, so kinda-ish, yeah.
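As a back-of-the-envelope check (a sketch, assuming the five-bits-per-synapse figure above and a one-bee, i.e. 1 BP, model; the comparison itself is ours):

```python
# Information capacity: one bee's synapses vs. a quantized 1 BP model.
bee_bits = 1e9 * 5             # 1e9 synapses x ~5 recoverable bits each

for bits_per_param in (8, 4):  # common quantization widths
    model_bits = 1e9 * bits_per_param
    print(f"{bits_per_param}-bit 1 BP model: {model_bits:.0e} bits "
          f"vs. bee: {bee_bits:.0e} bits")
# 8-bit 1 BP model: 8e+09 bits vs. bee: 5e+09 bits
# 4-bit 1 BP model: 4e+09 bits vs. bee: 5e+09 bits
```

Within a factor of two either way, which is about as close as this genre of estimate gets.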
Wikipedia claims the corpora pedunculata (“mushroom bodies”) of insects, which “are known to play a role in olfactory learning and memory”, are analogous to the mammalian cerebral cortex or the avian hypopallium. Sure, why not.
The thing about apples and oranges is that nobody can actually stop you from comparing them.
A hundred-watt incandescent gets about a thousand candlepower per horsepower; LEDs can get 6,000 candles per horse or even more.
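A rough check of those figures (our assumptions, not measurements: roughly 15 lm/W luminous efficacy for an incandescent bulb, roughly 100 lm/W for an LED, and isotropic emission, so candlepower = lumens / 4π):

```python
import math

WATTS_PER_HORSEPOWER = 745.7

def candles_per_horse(lumens_per_watt: float) -> float:
    # Candlepower (candela) = lumens per steradian; a full sphere is 4*pi sr.
    candela_per_watt = lumens_per_watt / (4 * math.pi)
    return candela_per_watt * WATTS_PER_HORSEPOWER

print(f"incandescent: {candles_per_horse(15):.0f} candles/horse")   # ~890
print(f"LED:          {candles_per_horse(100):.0f} candles/horse")  # ~5900
```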
Based on “forebrain” neuron numbers from this Wikipedia article, assuming 7,000 synapses per neuron, with optical fractionator counts discounted by a factor of about 0.8 based on the pairwise comparisons available there.
https://blog.eleuther.ai/gpt3-model-sizes/
Due to a lack of interest in studying chinchillas, there doesn’t seem to have been a direct measurement of chinchilla synapse or neuron counts. That being said, rabbits have around 500 billion synapses, and (domesticated) chinchillas are around the same size and body weight as (smaller) rabbits and have the same cerebellum weight, so we feel justified in making this claim anyway. :)