Thank you for posting this; it was interesting. That said, I think the middle section is bad.
Roughly from the point where Lance digresses from an anthropomorphic argument to castigate people who think AI might do bad things for anthropomorphising, through to the end of the discussion of Solomonoff induction, I think there was a lot of misconstruing of ideas, or arguing against nonexistent people.
Like, I personally don’t agree with people who expect optimization daemons to arise in gradient descent, but I wouldn’t claim their view is motivated by whether the Solomonoff prior is malign.
I do think that Solomonoff-flavored intuitions motivate much of the credence people around here put on scheming. Apparently Evan Hubinger puts a decent amount of weight on it, because he kept bringing it up in our discussion in the comments to “Counting arguments provide no evidence for AI doom”.
I was curious about the context and so I went over and ctrl+F’ed “Solomonoff” and found Evan saying:
I think you’re misunderstanding the nature of my objection. It’s not that Solomonoff induction is my real reason for believing in deceptive alignment or something, it’s that the reasoning in this post is mathematically unsound, and I’m using the formalism to show why. If I weren’t responding to this post specifically, I probably wouldn’t have brought up Solomonoff induction at all.
Intuitions about simplicity in regimes where speed is unimportant (e.g. Turing machines with minimal speed bound) != intuitions from the Solomonoff prior being malign due to the emergence of life within these Turing machines.
It seems important to not equivocate between these.
(Sorry for the terse response, hopefully this makes sense.)
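For concreteness, here is a minimal sketch of the distinction being drawn above, using the standard definition of the universal (Solomonoff) prior; the notation is mine, not something from the exchange. The universal semimeasure weights a program only by its length, with no penalty for runtime:

$$ M(x) \;=\; \sum_{p \,:\, U(p) = x*} 2^{-|p|}, $$

where U is a universal machine and the sum ranges over programs p whose output begins with x. Two different intuitions get read off this object: (a) “simplicity” arguments that compare hypotheses by description length alone, ignoring how long they take to run, and (b) the separate “malign prior” argument that some short but very slow programs simulate universes containing agents, and that those agents can come to dominate M’s predictions. Both start from the same prior, but they are distinct arguments, which is the equivocation being pointed at.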
Yeah, I think Evan is basically opportunistically changing his position during that exchange, and has no real coherent argument.