I think it’s totally doable (and I have done it myself) to convince people who haven’t yet staked a claim as an Alignment Skeptic. There are specific people, such as Yann LeCun, who are publicly skeptical of alignment research; it is those people I imagine we could not convince.
OK, I myself am sceptical of alignment research, but not at all sceptical of the necessity for it.
Do you think someone like Eliezer has had a proper go at convincing him that there’s a problem? Or will he just not give us the time of day? Has he written anything coherent on the internet that I could read in order to see what his objections are?
Personally I would love to lose my doom-related beliefs, so I’d like to try to understand his position as well as I can for two reasons.
Here’s an example
Great, thanks, so I’m going to write down my response to his thoughts as I hear them:
Before reading the debate I read the Scientific American article it’s about. On a first read it seems convincing: OK, relax! And then take a closer look.
What’s he saying (paraphrasing stuff from Scientific American):
Superintelligence is possible, can be made to act in the world, and is likely coming soon.
Intelligence and goals are decoupled.
Why would a sentient AI want to take over the world? It wouldn’t.
Intelligence per se does not generate the drive for domination.
Not all animals care about dominance.
I’m a bit worried by the mention of the First Law of Robotics. I thought the point of all those stories was to show all the ways such laws might lead to weird outcomes.
Blah blah joblessness, military robots, blah inequality; all true, and I might even care if I thought there was going to be anyone around to worry about it. But it does mean that he’s not a starry-eyed optimist who thinks nothing can go wrong.
That’s great! I agree with all of that, it’s often very hard to get people that far. I think he’s on board with most of our argument.
And then right at the end (direct quote):
Even in the worst case, the robots will remain under our command, and we will have only ourselves to blame.
OK, so he thinks that because you made a robot, it will stay loyal to you and follow your commands. No justification given.
It’s not really fair to close-read an article in a popular magazine, but at this point I think maybe he realises that you can make a superintelligent wish-granting machine, but hasn’t thought about what happens if you make the wrong wish and want to change it later.
(I’m supposed to not be throwing mythological and literary references into things any more, but I can’t help but think about the Sibyl, rotting in her jar because Apollo had granted her eternal life but not eternal youth, or T. S. Eliot’s: “That is not it at all, That is not what I meant, at all.”)
So let’s go and look at the debate itself, rather than the article.
I was talking to someone recently who had talked to Yann and got him to agree with some very alignment-y things, but then a couple of days later Yann was saying very capabilities-y things instead.
Their theory was that Yann’s incentives and environment all push towards capabilities research.