Short summary: Biological anchors are a bad way to predict AGI. It’s a case of “argument from comparable resource consumption.” Analogy: human brains use about 20 watts. Therefore, once we have computers that run on 20 watts, we’ll have AGI! The 2020 OpenPhil estimate of AGI in 2050 is based on a biological anchor, so we should ignore it.
Longer summary:
Lots of folks made bad AGI predictions by asking:
How much compute is needed for AGI?
When will that compute be available?
To answer the first question, they use a “biological anchor,” like the computing power of the human brain, or the total compute used to evolve human brains.
Hans Moravec, 1988: the human brain performs about 10^13 ops/s, and computers with this much power will be available in 2010.
Eliezer objects that:
“We’ll have computers as fast as human brains in 2010” doesn’t imply “we’ll have strong AI in 2010.”
The compute needed depends on how well we understand cognition and computer science. It might be done with a hypercomputer but very little knowledge, or a modest computer but lots of knowledge.
An AGI wouldn’t actually need 10^13 ops/s, because human brains are inefficient. For example, they do lots of operations in parallel that could be replaced with fewer operations in series (a toy illustration follows just below).
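To make that parallel-vs-serial point concrete, here’s a toy illustration of my own (not from Eliezer’s essay): a massively parallel membership check over N items still performs N comparisons in total, while a serial algorithm that exploits sorted structure needs only about log2(N) of them.

```python
import math

# Toy illustration (my own example, not from the essay): many parallel
# operations can sometimes be replaced by far fewer serial ones, when the
# serial algorithm exploits structure the parallel brute-force approach ignores.

def parallel_scan_ops(n: int) -> int:
    # A fully parallel membership check still does n comparisons in total,
    # even though they all happen at the same time.
    return n

def serial_binary_search_ops(n: int) -> int:
    # A serial binary search over sorted data needs only ~log2(n) comparisons.
    return math.ceil(math.log2(n))

n = 10**9
print(parallel_scan_ops(n))         # 1000000000 total operations
print(serial_binary_search_ops(n))  # 30 total operations
```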
Eliezer, 1999: he mentions that he, too, made bad AGI predictions as a teenager.
Ray Kurzweil, 2001: Same idea as Moravec, but with 10^16 ops/s. Not worth repeating.
Someone, 2006: it took ~10^43 ops for evolution to produce human brains, and it will be a very long time before computers can perform 10^43 total ops, so AGI is very far away (a rough scale calculation is sketched below).
Eliezer objects that the use of a biological anchor is sufficient to make this estimate useless. It’s a case of a more general “argument from comparable resource consumption.”
Analogy: human brains use about 20 watts. Therefore, once we have computers that run on 20 watts, we’ll have AGI!
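Going back to the 10^43 figure: for a rough sense of scale, here’s a back-of-the-envelope calculation of my own (the exascale hardware number is an illustrative assumption, not something from the essay).

```python
# Rough scale check (my own illustrative numbers, not from the essay):
# how many machine-years would 1e43 operations take at exascale speeds?

evolution_ops = 1e43            # claimed total compute spent by evolution
ops_per_second = 1e18           # roughly an exascale supercomputer
seconds_per_year = 3.15e7

ops_per_machine_year = ops_per_second * seconds_per_year   # ~3e25
machine_years = evolution_ops / ops_per_machine_year       # ~3e17

print(f"~{ops_per_machine_year:.0e} ops per machine-year")
print(f"~{machine_years:.0e} machine-years to accumulate 1e43 ops")
```

On those assumptions, accumulating 10^43 operations would take on the order of 10^17 machine-years of today’s fastest hardware, which is the intuition behind the “very far away” conclusion.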
OpenPhil, 2020: A much more sophisticated estimate, but still based on a biological anchor. They predict AGI in 2050.
How the new model works (a toy numerical sketch follows this list):
Demand side: Estimate how many neural-network parameters it would take to emulate a brain, then use this to find the computational cost of training such a model. (I think this part mischaracterizes OpenPhil’s work; see my comments at the bottom.)
Supply side: Moore’s law, assuming
Willingness to spend on AGI training is a fixed percent of GDP
“Computation required to accomplish a fixed task decreases by half every 2-3 years due to better algorithms.”
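Here’s a minimal numerical sketch of that demand-vs-supply structure, with made-up placeholder parameters (the real Bio Anchors report works with probability distributions over several anchors, not point estimates like these): the training compute required falls as algorithms improve, the compute affordable rises with hardware price-performance and spending, and the model’s “AGI year” is where the two curves cross.

```python
# Toy sketch of the demand-vs-supply structure described above.
# Every parameter value below is an illustrative placeholder, not
# OpenPhil's actual number.

TRAINING_FLOP_2020 = 1e35        # assumed FLOP to train a brain-scale model with 2020 algorithms
ALGO_HALVING_YEARS = 2.5         # compute requirement halves every ~2-3 years
HARDWARE_DOUBLING_YEARS = 2.5    # FLOP-per-dollar doubling time (Moore's-law-ish)
FLOP_PER_DOLLAR_2020 = 1e17      # assumed price-performance in 2020
BUDGET_2020 = 1e9                # assumed dollars available for one training run
BUDGET_GROWTH = 1.03             # budget grows roughly with GDP (fixed share of GDP)

def required_flop(year):
    """Demand side: training compute needed, falling with algorithmic progress."""
    return TRAINING_FLOP_2020 * 0.5 ** ((year - 2020) / ALGO_HALVING_YEARS)

def affordable_flop(year):
    """Supply side: compute the largest actor can buy for one training run."""
    flop_per_dollar = FLOP_PER_DOLLAR_2020 * 2 ** ((year - 2020) / HARDWARE_DOUBLING_YEARS)
    budget = BUDGET_2020 * BUDGET_GROWTH ** (year - 2020)
    return flop_per_dollar * budget

for year in range(2020, 2101):
    if affordable_flop(year) >= required_flop(year):
        print("Toy model's crossover year:", year)
        break
```

With these particular placeholder numbers the toy model crosses over in the 2050s, but the point of the sketch is only the structure: every input is a tunable assumption, which is exactly what Eliezer’s “tunable underdetermined parameters” objection below is about.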
Eliezer’s objections:
(Surprise!) It’s still founded on a biological anchor, which is sufficient to make it invalid
OpenPhil models theoretical AI progress as algorithms getting twice as efficient every 2-3 years. This is a bad model, because folks keep finding entirely new approaches. Specifically, it implies “we should be able to replicate any modern feat of deep learning performed in 2021, using techniques from before deep learning and around fifty times as much computing power” (the arithmetic behind “fifty times” is sketched after this list).
Some of OpenPhil’s parameters make it easy for the modelers to cheat, and make sure it comes up with an answer they like: “I was wondering what sort of tunable underdetermined parameters enabled your model to nail the psychologically overdetermined final figure of ’30 years’ so exactly.”
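Where the “fifty times” figure comes from, as best I can reconstruct it (the 2007 start date and the 2.5-year halving time are my assumptions for illustration): if a fixed task’s compute requirement halves every 2-3 years, then winding the clock back roughly 14 years from 2021 to just before the deep-learning era multiplies the requirement by about 2^(14/2.5) ≈ 50.

```python
# Reconstruction of the "fifty times" arithmetic (the 2007 start date and
# the 2.5-year halving time are my assumptions, not stated in the summary).

halving_time_years = 2.5            # "halves every 2-3 years"
start_year, end_year = 2007, 2021   # roughly "before deep learning" to "modern"

multiplier = 2 ** ((end_year - start_year) / halving_time_years)
print(f"Implied extra compute with pre-deep-learning methods: ~{multiplier:.0f}x")
# -> ~49x, i.e. roughly the "fifty times as much computing power" quoted above
```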
Can’t we use this as an upper bound? Maybe AGI will come sooner, but surely it won’t take longer than this estimate.
Eliezer thinks this is the same non-sequitur as Moravec’s. If you train a model big enough to emulate a brain, that doesn’t mean AGI will pop out at the end.
Other commentary: Eliezer mentions several times that he’s feeling old, tired, and unhealthy. He’s frustrated that researchers today keep repeating decades-old bad arguments, and rebutting them costs him a lot of energy.
My thoughts:
I found this persuasive, but I also think it mischaracterizes the OpenPhil model.
My understanding is that OpenPhil didn’t just estimate the number of neural-network parameters required to emulate a human brain. They used six different biological anchors, including the “evolution anchor,” which I find very useful for an upper bound.
Holden Karnofsky, who seems to put much more stock in the Bio Anchors model than Eliezer, explains the model really well here. But I was frustrated to see that the write-up on Holden’s blog gives 50% by 2090 (first graph) using the evolution anchor, while the same graph in the old calcs gives only 11%. Was this model tuned after seeing the results?
My conclusion: Bio Anchors is a terrible way to model when AGI will actually arrive. But I don’t agree with Eliezer’s dismissal of using Bio Anchors to get an upper bound, because I think the evolution anchor achieves this.
Holden also mentions something a bit like Eliezer’s criticism in his own write-up:
In particular, I think it’s hard to rule out the possibility of ingenuity leading to transformative AI in some far more efficient way than the “brute-force” method contemplated here.
When Holden talks about ‘ingenuity’ methods, that seems consistent with Eliezer’s point:
They’re not going to be taking your default-imagined approach algorithmically faster, they’re going to be taking an algorithmically different approach that eats computing power in a different way than you imagine it being consumed.
I.e. if you wanted to fold this consideration into OpenPhil’s estimate, you’d have to do it by adding a giant, incredibly uncertain, free-floating ‘speedup factor’ variable, because you’d be nonsensically trying to estimate the ‘speed-up’ over brain processing that comes from using some completely non-deep-learning or non-brainlike algorithm for intelligence. All your uncertainty just gets moved into that one factor, and you’re back where you started.
It’s possible that Eliezer is confident in this objection partly because of his ‘core of generality’ model of intelligence: he’s implicitly imagining enormous numbers of varied paths to improvement that end up practically in the same place, while ‘stack more layers in a brainlike DL model’ is just one of those paths (and one that probably won’t even work). So he naturally thinks it’s useless to estimate the difficulty of this one path we definitely won’t take (and which probably wouldn’t work even if we did try it) out of the huge number of varied paths to generality.
However, if you don’t have this model, then perhaps you can be more confident that what we’re likely to build will look at least somewhat like a compute-limited DL system, and that these other paths will have to share some properties of this path. Relatedly, it’s an implication of the Bio Anchors view that there’s some imaginable (and not, e.g., galaxy-sized) model we could build right now that would be an AGI, which I think Eliezer disputes?