To be blunt, it’s not just that Eliezer lacks a positive track record in predicting the nature of AI progress, which might be forgivable if we thought he had really good intuitions about this domain. Empiricism isn’t everything; theoretical arguments are important too and shouldn’t be dismissed. But:
Eliezer thought AGI would be developed from a recursively self-improving seed AI coded up by a small group, “brain in a box in a basement” style. He dismissed and mocked connectionist approaches to building AI. His writings repeatedly downplayed the importance of compute, and he has straw-manned writers like Moravec who did a better job at predicting when AGI would be developed than he did.
Old MIRI intuition pumps about why alignment should be difficult, like the “Outcome Pump” and the “Sorcerer’s Apprentice”, are now forgotten; it came as a surprise that it would be easy to create helpful genies like LLMs that basically just do what we want. The remaining arguments for the difficulty of alignment are esoteric considerations about inductive biases, counting arguments, etc. So yes, let’s actually look at these arguments and not just dismiss them, but let’s not pretend that MIRI has a good track record.
I think the core concerns remain, and more importantly, there are other rather doom-y scenarios that have opened up, involving AI systems more similar to the ones we have now, which aren’t the straight-up singleton ASI foom. The problem here is IMO not “this specific doom scenario will become a thing” but “we don’t have anything resembling a GOOD vision of the future with this tech that we are nevertheless developing at breakneck pace”. Yet the number of possible dystopian or apocalyptic scenarios is enormous. Part of this is “what if we lose control of the AIs” (singleton or multipolar), and part of it is “what if we fail to structure our society around having AIs” (loss of control, mass wireheading, and a lot of other scenarios I’m not sure how to name). The only positive vision the “optimists” on this have to offer is “don’t worry, it’ll be fine; this clearly revolutionary, never-before-seen technology that calls into question our very role in the world will play out the same way every invention ever did”. And that’s not terribly convincing.
I’m not saying anything object-level about MIRI models; my point is that “outcomes are more predictable than trajectories” is a pretty standard, epistemically non-suspicious statement about a wide range of phenomena. Moreover, in these particular circumstances (and many others) you can reduce it to an object-level claim, like “do observations on current AIs generalize to future AIs?”
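(An illustrative aside, not part of the original exchange: a minimal toy sketch of the “outcomes are more predictable than trajectories” point. In a biased random walk between two absorbing barriers, any single trajectory is essentially unpredictable step by step, yet which barrier gets hit is nearly certain in aggregate. The parameters below are arbitrary choices for illustration.)

```python
# Toy sketch: individual trajectories are noisy and hard to predict,
# but the outcome distribution is sharply concentrated.
# A biased random walk (p_up = 0.55) between absorbing barriers at -20 and +20:
# roughly 98% of runs end at the upper barrier.
import random

def run_walk(p_up=0.55, start=0, lower=-20, upper=20, rng=None):
    """Step a biased random walk until it hits an absorbing barrier."""
    rng = rng or random.Random()
    x = start
    while lower < x < upper:
        x += 1 if rng.random() < p_up else -1
    return x

rng = random.Random(0)
outcomes = [run_walk(rng=rng) for _ in range(1000)]
frac_upper = sum(o >= 20 for o in outcomes) / len(outcomes)
print(f"Fraction of walks absorbed at the upper barrier: {frac_upper:.3f}")
```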
How does the question of whether AI outcomes are more predictable than AI trajectories reduce to the (vague) question of whether observations on current AIs generalize to future AIs?
ChatGPT falsifies a prediction about future superintelligent, recursively self-improving AI only if ChatGPT is a generalizable predictor of the design of future superintelligent AIs.
There will be future superintelligent AIs that improve themselves. But they will be neural networks; they will at the very least start out as a compute-intensive project; and in the infant stages of their self-improvement cycles they will understand and be motivated by human concepts, rather than being dumb specialized systems that are only good for bootstrapping themselves to superintelligence.
Edit: Retracted because some of my exegesis of the historical seed AI concept may not be accurate