I figured this stuff out using the null string as input, and frankly, I have a hard time myself feeling hopeful about getting real alignment work out of somebody who previously sat around waiting for somebody else to input a persuasive argument into them. This ability to “notice lethal difficulties without Eliezer Yudkowsky arguing you into noticing them” is currently an opaque piece of cognitive machinery to me; I do not know how to train it into others. It probably relates to ‘security mindset’, and a mental motion where you refuse to play out scripts, and being able to operate in a field that’s in a state of chaos.
I find this hard to believe. I’m sure you had some conversations with others that helped you arrive at these conclusions. In particular, your Intelligence Explosion Microeconomics paper uses data from the evolution of humans to argue that increasing intelligence was easy for evolution once the ball got rolling, which is not the null string.
Null string socially. I obviously was allowed to look at the external world to form these conclusions, which is not the same as needing somebody to nag me into doing so.
This makes more sense. I think you should clarify that this is what you mean by the null-string analogy in the future, especially when discussing what thinking about hard-to-think-about topics should look like. The analogy seems fine, and probably useful, as long as you know it’s a vast overstatement; but because it is a vast overstatement, it doesn’t actually provide much actionable advice.
Concretely, instead of talking about the null string, it would be more helpful to talk about the amount of discussion it should take a prospective researcher to reach correct conclusions: from the literal null string for the optimal agent, to vague pointing in the right direction for a pretty good researcher, to a fully formal and certain proof listing every claim and counter-claim imaginable for someone who probably shouldn’t go into alignment.
If you read the linked tweet (https://twitter.com/ESYudkowsky/status/1500863629490544645), it’s talking about the persuasion/convincing/pushing you need in addition to whatever raw data makes it possible to reach the conclusion; it’s not saying that humans can get by without any Bayesian evidence about the external world.
I did read the linked tweet, and now that you bring it up, my third sentence (about the Intelligence Explosion Microeconomics paper) no longer applies. But I think my first and second sentences still do (ignoring Eliezer’s recent clarification).