My take is that it's irrelevant, so I want to hear opposing viewpoints.
The really simple argument for its irrelevance is that evolution used a lot more compute to produce human brains than the compute used inside a single human brain. If you are making an argument about how much compute it takes to find an intelligent mind, you have to look at how much compute was used by all of evolution. (This includes the compute to simulate the environment, which Ajeya Cotra's bioanchors report wrongly ignores.)
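To put rough numbers on the gap (these are order-of-magnitude assumptions for illustration, not figures from the report, and they leave out the environment-simulation compute entirely):

```python
import math

# Order-of-magnitude assumptions (illustrative, not measured values):
# - one human brain: ~1e15 FLOP/s-equivalent over ~1e9 seconds (~30 years)
# - evolution: ~1e25 FLOP/s-equivalent across all nervous systems, sustained
#   over ~1e16 seconds of animal evolution
lifetime_flop = 1e15 * 1e9     # ~1e24 FLOP for a single lifetime
evolution_flop = 1e25 * 1e16   # ~1e41 FLOP for the evolutionary "search"

gap = math.log10(evolution_flop / lifetime_flop)
print(f"single lifetime:  ~1e{math.log10(lifetime_flop):.0f} FLOP")
print(f"all of evolution: ~1e{math.log10(evolution_flop):.0f} FLOP")
print(f"gap: ~{gap:.0f} orders of magnitude (before adding environment simulation)")
```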
What am I missing?
In the strongest sense, neither the human brain analogy nor the evolution analogy really applies to AI. They only apply in a weaker sense, where you are aware you're working with an analogy and should hopefully be tracking some more detailed model behind the scenes.
The best argument for considering human development a stronger analogy than evolutionary history is that present-day AIs work more like human brains than they work like evolution. See e.g. papers finding that you can use a linear function to translate some concepts between brain scans and internal layers of an LLM, or the extremely close correspondence between ConvNet features and neurons in the visual cortex. In contrast, I predict it's extremely unlikely that you'll be able to find a nontrivial correspondence between the internals of AI and evolutionary history or the trajectory of ecosystems or similar.
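To make "a linear function" concrete, here is a toy sketch of that style of analysis. The data here is synthetic; the actual papers fit maps between real brain recordings (e.g. fMRI or ECoG features) and real LLM hidden states:

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_stimuli, brain_dim, llm_dim = 500, 200, 64

# Fake data: brain responses that are a noisy linear function of LLM features.
# In the real analyses, llm_acts would be hidden states from some layer of an
# LLM and brain_acts would be recorded responses to the same stimuli.
llm_acts = rng.normal(size=(n_stimuli, llm_dim))
true_map = rng.normal(size=(llm_dim, brain_dim))
brain_acts = llm_acts @ true_map + 0.5 * rng.normal(size=(n_stimuli, brain_dim))

X_train, X_test, y_train, y_test = train_test_split(llm_acts, brain_acts, random_state=0)

# Fit a ridge-regularized linear map from LLM space to "brain" space,
# then check how well it predicts held-out responses.
model = Ridge(alpha=1.0).fit(X_train, y_train)
pred = model.predict(X_test)
corr = np.mean([np.corrcoef(pred[:, i], y_test[:, i])[0, 1] for i in range(brain_dim)])
print(f"mean held-out correlation per 'voxel': {corr:.2f}")
```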
Of course, just because they work more like human brains after training doesn’t necessarily mean they learn similarly—and they don’t learn similarly! In some ways AI’s better (backpropagation is great, but it’s basically impossible to implement in a brain), in other ways AI’s worse (biological neurons are way smarter than artificial ‘neurons’). Don’t take the analogy too literally. But most of the human brain (the neocortex) already learns its ‘weights’ from experience over a human lifetime, in a way that’s not all that different from self-supervised learning if you squint.
I would love links to these if you have time.
But also, let's say it's true that there is a similarity in the internal structure of the end results, i.e. the adult human brain and the trained LLM. The adult human brain was produced by evolution plus learning after birth. The trained LLM was produced by gradient descent. This does not tell me that evolution doesn't matter and learning after birth does.
> But most of the human brain (the neocortex) already learns its ‘weights’ from experience over a human lifetime, in a way that’s not all that different from self-supervised learning if you squint.
The difference is that the weights are not initialised with random values at birth (or at the embryo stage, to be more precise).
What do you mean by a weaker sense? I say irrelevant and you say weaker sense, so we're not yet in agreement. How much predictive power do you personally think this analogy has?
Some survey articles:
https://arxiv.org/abs/2306.05126
https://arxiv.org/pdf/2001.07092
The human cortex (the part we have way more of than chimps) is initialized as a bunch of cortical column units, with slowly varying properties over the surface of the brain. But there's decent evidence that there's not much more initialization than that, and that this huge fraction of the brain has to slowly pick up knowledge within the human lifetime before it starts being useful, e.g. https://pmc.ncbi.nlm.nih.gov/articles/PMC9957955/
Or you could think about it like this: our DNA has on the order of a megabyte to spend on the brain, and the adult brain holds on the order of a terabyte of information. So 99.99[..]% of the information in the adult brain comes from the learning algorithm, not the initialization.
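The same arithmetic in code, treating the megabyte and terabyte as round order-of-magnitude assumptions:

```python
# If the genome can spend ~1 MB on wiring up the brain and the adult brain
# holds ~1 TB of information, almost all of that information must come from
# within-lifetime learning rather than the initialization.
genome_budget_bytes = 1e6      # ~1 MB (assumption from the comment above)
adult_brain_bytes = 1e12       # ~1 TB (assumption from the comment above)

learned_fraction = 1 - genome_budget_bytes / adult_brain_bytes
print(f"fraction not explainable by initialization: {learned_fraction:.6%}")
# -> 99.999900%
```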
Yeah, it's way more informative than the evolution analogy to me, because I expect human researchers plus computers spending resources to design AI to be pretty hard to analogize to evolution, but learning within an AI to be within a few orders of magnitude, on various resources, of learning within a brain's lifetime.
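Roughly what I mean by "within a few orders of magnitude", in code. The brain numbers are common order-of-magnitude estimates rather than measurements, the GPT-3 figure is roughly its reported training compute, and the "frontier run" number is a made-up placeholder:

```python
import math

brain_flop_per_s = 1e15      # rough order-of-magnitude estimate of brain "FLOP/s"
lifetime_seconds = 1e9       # ~30 years of experience
brain_lifetime_flop = brain_flop_per_s * lifetime_seconds   # ~1e24 FLOP

training_runs = {
    "GPT-3 (roughly as reported)": 3e23,
    "hypothetical larger frontier run": 1e25,
}
for name, flop in training_runs.items():
    gap = math.log10(flop / brain_lifetime_flop)
    print(f"{name}: {gap:+.1f} orders of magnitude vs. one brain-lifetime of learning")
```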
Thanks for the links. I might go through them when I find time.
Even if the papers prove that there are similarities, I don't see how this proves anything about evolution versus within-lifetime learning.
This seems like your strongest argument. I will have to study more to understand this.
That’s it? Really? That is new information for me.
Tbh your arguments might end up being persuasive to me. So thank you for writing them.
The problem is that building a background in neuroscience, to the point where I'm confident I'm not being fooled, will take time. And I'm interested in neuroscience, but not interested enough to study it just for AI safety reasons. If you have a post that covers this argument well (around initialisation not storing a lot of information), that would be nice. (But not necessary of course, that's up to you.)