The biggest stretch here seems to me to be evaluating the brain on the basis of how much compute existing hardware requires to emulate the brain.
Where did I do that? I never used emulation in that context. Closely emulating a brain—depending on what you mean—could require arbitrarily more compute than the brain itself.
This article is about analyzing how close the brain is to known physical computational limits.
Perhaps you were confused by my comparisons to GPUs? Those are there to establish points of comparison. Naturally they also relate to the compute/energy cost of simulating brain-sized circuits, but that’s only because GPUs are optimized for efficiently converting raw transistor ops into the matrix-multiply ops needed to simulate neural nets.
However, the question of how much of the compute that is being attributed to the brain here is actually necessary for cognition remains open.
I address that partly in the circuits and data section, but much depends on what you mean by ‘cognition’. If you want a system that does all the things the brain does, then there isn’t much hope for doing that using multiple OOM less energy than the brain, at least on conventional computers.
The part of the article where you compare contemporary approaches in particular seems to leave on the table the possibility that both brains and deep learning systems are orders of magnitude inefficient.
I’m not entirely clear on which efficiency metric you are considering.
For DL, energy of course—at the hardware level GPUs are a few OOM less energy efficient than the brain, and you probably then lose a few more OOM because DL software is not yet fully optimized (for sparsity, etc.). The conclusion of the section on energy/thermodynamics was that the brain is within an OOM or so of physical limits, given its size/cooling constraints. I didn’t use direct comparisons of the DL compute required for tasks, partly because of DL’s known compute inefficiency, and partly because the article was already pretty long.
For data efficiency, it should be obvious that much larger networks/models could potentially learn almost proportionally faster—as implied by the scaling laws—but at the expense of more compute.
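To make the data-efficiency point concrete, here is a minimal sketch assuming a Chinchilla-style parametric loss with illustrative constants (my assumption for illustration, not figures from the article): for a fixed target loss, a larger model needs fewer training tokens, though total training compute can grow once the model passes the compute-optimal size.

```python
# Minimal sketch of what scaling laws imply for data efficiency.
# Assumed loss form: L(N, D) = E + A/N^alpha + B/D^beta, with illustrative
# constants (not numbers from the article). For a fixed target loss, a larger
# model N needs fewer training tokens D; training compute is roughly 6*N*D FLOPs.

E, A, B, ALPHA, BETA = 1.7, 400.0, 410.0, 0.34, 0.28  # illustrative fit constants

def tokens_needed(n_params: float, target_loss: float) -> float:
    """Solve E + A/N^alpha + B/D^beta = target_loss for D (training tokens)."""
    residual = target_loss - E - A / n_params**ALPHA
    if residual <= 0:
        return float("inf")  # model too small to ever reach this loss
    return (B / residual) ** (1.0 / BETA)

if __name__ == "__main__":
    target = 2.1
    for n in (1e9, 1e10, 1e11):
        d = tokens_needed(n, target)
        print(f"N={n:.0e} params: D~{d:.1e} tokens, compute~{6 * n * d:.1e} FLOPs")
```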
Just pointing out that humans doing arithmetic and GPT3 doing arithmetic are both awful in efficiency compared to raw processor instructions. I think what FeepingCreature is considering is how many other tasks are like that?
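For a sense of scale, a rough back-of-envelope sketch, assuming the standard estimate of about 2 FLOPs per parameter per generated token and GPT-3’s published 175B parameter count:

```python
# Rough back-of-envelope: hardware ops a GPT-3-scale model spends to emit one
# token of an arithmetic answer, versus the single ADD instruction a CPU needs.
# Assumes ~2 FLOPs per parameter per generated token (standard approximation).

GPT3_PARAMS = 175e9
flops_per_token = 2 * GPT3_PARAMS   # ~3.5e11 ops just to produce one output token
cpu_ops_for_add = 1                 # one integer ADD instruction

overhead = flops_per_token / cpu_ops_for_add
print(f"~{flops_per_token:.1e} ops/token vs {cpu_ops_for_add} op: "
      f"roughly {overhead:.0e}x overhead, before counting the several tokens "
      f"needed to spell out the full answer")
```

By this metric the brain doing mental arithmetic is likewise many OOM above a bare ADD instruction, which is the point being made.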
The set of tasks like that is simply traditional computer science. AGI is defined as doing what the brain does very efficiently, not doing what computers are already good at.
Don’t dismiss these tasks just by saying they aren’t part of AGI by definition.
The human brain is reasonably good at some tasks and utterly hopeless at others. The tasks early crude computers got turned to were mostly the places where those crude computers could compete with brains, i.e. the tasks brains were hopeless at. So the first computers did arithmetic: brains are really, really bad at arithmetic, so even vacuum tubes were an improvement.
The modern field of AI is what is left when all the tasks that it is easy to do perfectly are removed.
Suppose someone finds a really good algorithm tomorrow for quickly finding physics equations from experimental data. No, the algorithm doesn’t contain anything resembling a neural network. Would you dismiss that as “just traditional computer science”? Do you think this can’t happen?
Imagine a hypothetical world in which there was an algorithm that could do everything that the human brain does better, and with a millionth of the compute. Suppose someone invented this algorithm last week, and it were still true that
AGI is defined as doing what the brain does very efficiently, not doing what computers are already good at.
Wouldn’t that mean no such thing as AGI was possible? There was literally nothing the brain did efficiently; it was all stuff computers were already good at. You just didn’t know the right algorithm to do it.
Imagine a hypothetical world in which there was an algorithm that could do everything that the human brain does better, and with a millionth of the compute.
Based on the evidence at hand (as summarized in this article), we probably don’t live in that world. The burden of proof is on you to show otherwise.
But in those hypothetical worlds, AGI would come earlier, probably well before the end phase of Moore’s Law.
I was using that as a hypothetical example to show that your definitions were bad. (In particular, the attempt to define arithmetic as not AI because computers were so much better at it.)
I also don’t think that you have significant evidence that we don’t live in this world, beyond the observation that if such an algorithm exists, it is sufficiently non-obvious that neither evolution nor humans have found it so far.
A lot of the article is claiming the brain is thermodynamically efficient at turning energy into compute. The rest is comparing the brain to existing deep learning techniques.
I admit that I have little evidence that such an algorithm does exist, so it’s largely down to priors.
I also don’t think that you have significant evidence that we don’t live in this world, beyond the observation that if such an algorithm exists, it is sufficiently non-obvious that neither evolution nor humans have found it so far.
FWIW, I totally think that mental savants like Ramanujan (or “ordinary” geniuses like von Neumann) make a super-strong case for the existence of “algorithms evolution knows not”.
(Yes, they were humans, and were therefore running on the same evolutionary hardware as everybody else. But I don’t think it makes sense to credit their remarkable achievements to the hardware evolution produced; indeed, it seems almost certain that they were using that same hardware to run a better algorithm, producing much better results with the same amount of compute—or possibly less, in Ramanujan’s case!)