Thank you for posting this; it was interesting. That said, I think the middle section is bad.
Roughly from the point where Lance digresses from an anthropomorphic argument to castigate people who think AI might do bad things for anthropomorphising, through to the end of the discussion of Solomonoff induction, I think there was a lot of misconstruing of ideas, or arguing against nonexistent people.
Like, I personally don’t agree with people who expect optimization daemons to arise in gradient descent, but I wouldn’t claim their view is motivated by whether the Solomonoff prior is malign.
I do think that Solomonoff-flavored intuitions motivate much of the credence people around here put on scheming. Apparently Evan Hubinger puts a decent amount of weight on it, because he kept bringing it up in our discussion in the comments to “Counting arguments provide no evidence for AI doom”.
I was curious about the context and so I went over and ctrl+F’ed “Solomonoff” and found Evan saying:
I think you’re misunderstanding the nature of my objection. It’s not that Solomonoff induction is my real reason for believing in deceptive alignment or something, it’s that the reasoning in this post is mathematically unsound, and I’m using the formalism to show why. If I weren’t responding to this post specifically, I probably wouldn’t have brought up Solomonoff induction at all.
Intuitions about simplicity in regimes where speed is unimportant (e.g. Turing machines with minimal speed bound) != intuitions from the Solomonoff prior being malign due to the emergence of life within these Turing machines.
It seems important to not equivocate between these.
(Sorry for the terse response, hopefully this makes sense.)
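For concreteness, here is a minimal sketch of the distinction being drawn above, using the standard definition of the universal (Solomonoff) prior; the notation is mine, not something from the exchange. The universal semimeasure weights a program only by its length, with no penalty for runtime:

$$ M(x) \;=\; \sum_{p \,:\, U(p) = x*} 2^{-|p|}, $$

where U is a universal machine and the sum ranges over programs p whose output begins with x. Two different intuitions get read off this object: (a) “simplicity” arguments that compare hypotheses by description length alone, ignoring how long they take to run, and (b) the separate “malign prior” argument that some short but very slow programs simulate universes containing agents, and that those agents can come to dominate M’s predictions. Both start from the same prior, but they are distinct arguments, which is the equivocation being pointed at.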
Yeah, I think Evan is basically opportunistically changing his position during that exchange, and has no real coherent argument.