Doesn’t do what? I understand Eliezer to be saying that he figured out AI risk via thinking things through himself (e.g., writing a story that involved outcome pumps; reflecting on orthogonality and instrumental convergence; etc.), rather than being argued into it by someone else who was worried about AI risk. If Eliezer didn’t do that, there would still presumably be someone prior to him who did that, since conclusions and ideas have to enter the world somehow. So I’m not understanding what you’re modeling as ridiculous.
(I don’t know that foom falls into the same category; did Vinge’s or I.J. Good’s arguments help persuade EY here?)
“I can’t find any good alignment researchers. The only way I know how to find them is by explaining that the field is important, using arguments for AI risk and doomerism, which means they didn’t come up with those arguments on their own, and thus cannot be ‘worthy’.”
This is phrased in a way that’s meant to make the standard sound unfair or impossible. But it seems like a perfectly fine Bayesian update:
There’s no logical necessity that we live in a world that lacks dozens of independent “Eliezers” who all come up with this stuff and write about it. I think Nick Bostrom had some AI risk worries independently of Eliezer, so he gets at least partial credit on this dimension. Others who had thoughts along these lines independently include Norbert Wiener and I.J. Good (timeline with more examples).
You could imagine a world that has much more independent discovery on this topic, or one where all the basic concepts of AI risk were being widely discussed and analyzed back in the 1960s. It’s a fair Bayesian update to note that we don’t live in worlds that are anything like that, even if it’s not a fair test of individual ability for people who, say, encountered all of Eliezer’s writing as soon as they even learned about the concept of AI.
(I could also imagine a world where more of the independent discoveries result in serious research programs being launched, rather than just resulting in someone writing a science fiction story and then moving on with a shrug!)
Your summary leaves out that “coming up with stuff without needing to be argued into it” is a matter of degree, and that there are many important claims here beyond just ‘AI risk is worth paying attention to at all’.
It’s logically possible to live in a world where people need to have AI risk brought to their attention, but then they immediately “get it” when they hear the two-sentence version, rather than needing an essay-length or seven-essay-length explanation. To the extent we live in a world where many key players need the full essay, and many other smart, important people don’t even “get it” after hours of conversation (e.g., LeCun), that’s a negative update about humanity’s odds of success.
Similarly, it’s logically possible to live in a world where people needed persuading to accept the core ‘AI risk’ thing, but then they have an easy time generating all the other important details and subclaims themselves. “Maximum doom” and “minimum doom” aren’t the only options; the exact level of doominess matters a lot.
E.g., my Eliezer-model thinks that nearly all public discussion of ‘practical implications of logical decision theory’ outside of MIRI (e.g., discussion of humans trying to acausally trade with superintelligences) has been utterly awful. If instead this discourse had managed to get a ton of stuff right even though EY wasn’t releasing much of his own detailed thoughts about acausal trade, then that would have been an important positive update.
Eliezer spent years alluding to his AI risk concerns on Overcoming Bias without writing them all up, and deliberately withheld many related arguments for years (including as recently as last year) in order to test whether anyone else would generate them independently. It isn’t the case that humanity had to passively wait to hear the full argument from Eliezer before it was permitted for them to start thinking and writing about this stuff.
My understanding of the history is that Eliezer did not realize the importance of alignment at first, and that he only did so later after arguing about it online with people like Nick Bostrom. See e.g. this thread. I don’t know enough of the history here, but it also seems logically possible that Bostrom could have, say, only realized the importance of alignment after conversing with other people who also didn’t realize the importance of alignment. In that case, there might be a “bubble” of humans who together satisfy the null string criterion, but no single human who does.
The null string criterion does seem a bit silly nowadays, since I think the people who would have satisfied it would have read about AI risk on e.g. LessWrong first. So they wouldn’t even get the chance to reach age ~21 and see whether they spontaneously invent the ideas.
Look, maybe you’re right. But I’m not good at complicated reasoning; I can’t confidently verify these results you’re giving me. My brain is using a much simpler heuristic that says: look at all of these other fields whose core insights could have been arrived at way earlier than they actually were. Look at Newton! Look at Darwin! Certainly game theorists could have come along a lot sooner. But that doesn’t mean the founder of each of these fields was the only one Great enough to make progress. So what are you saying, exactly?