God, absolutely, yes, do I get to talk to the sceptic in question regularly over the three months?
Given three months of dialogue with someone who thinks like me about computers and maths, where we both promise to take each other’s ideas seriously, if I haven’t changed his mind far enough to convince him that there are serious reasons to be scared, he will have changed mine.
I have actually managed this with a couple of sceptic friends, although the three months of dialogue has been spread out over the last decade!
And I don’t know what I’m talking about. Are you seriously saying that our best people can’t do this?! Eliezer used to make a sport of getting people to let him out of his box, and he has always been really, really good at explaining complicated thoughts persuasively.
Maybe our arguments aren’t worth listening to. Maybe we’re just wrong.
Give me this challenge!! Nobody needs to pay me; I will try to do this for fun and curiosity with anyone on the other side who is open-minded enough to commit to regular chats. An hour every evening?
In person would be better, so Cambridge or maybe London? I can face the afternoon train for this.
I think it’s totally doable (and I have done it myself) to convince people who haven’t yet staked out a position as an Alignment Skeptic. There are specific people, such as Yann LeCun, who are publicly skeptical of alignment research; those are the people I imagine we could not convince.
OK, I myself am sceptical of alignment research, but not at all sceptical of the necessity for it.
Do you think someone like Eliezer has had a proper go at convincing him that there’s a problem? Or will he just not give us the time of day? Has he written anything coherent on the internet that I could read in order to see what his objections are?
Personally I would love to lose my doom-related beliefs, so I’d like to try to understand his position as well as I can for two reasons.
Here’s an example.

Great, thanks, so I’m going to write down my response to his thoughts as I hear them:
Before reading the debate I read the Scientific American article it’s about. On a first read it seems convincing: OK, relax! And then take a closer look.
What’s he saying (paraphrasing stuff from Scientific American):
Superintelligence is possible, can be made to act in the world, and is likely coming soon.
Intelligence and goals are decoupled.
Why would a sentient AI want to take over the world? It wouldn’t.
Intelligence per se does not generate the drive for domination.
Not all animals care about dominance.
I’m a bit worried by the mention of the First Law of Robotics. I thought the point of all those stories was all the ways such laws might lead to weird outcomes.
Blah blah joblessness, military robots, blah inequality: all true, and I might even care if I thought there was going to be anyone around to worry about it. But it does mean that he’s not a starry-eyed optimist who thinks nothing can go wrong.
That’s great! I agree with all of that, it’s often very hard to get people that far. I think he’s on board with most of our argument.
And then right at the end (direct quote):
Even in the worst case, the robots will remain under our command, and we will have only ourselves to blame.
OK, so he thinks that because you made a robot, it will stay loyal to you and follow your commands. No justification given.
It’s not really fair to close-read an article in a popular magazine, but at this point I think maybe he realises that you can make a superintelligent wish-granting machine, yet hasn’t thought about what happens if you make the wrong wish and want to change it later.
(I’m not supposed to be throwing mythological and literary references into things any more, but I can’t help thinking about the Sibyl, withering away in her jar because Apollo had granted her eternal life but not eternal youth, or T. S. Eliot’s: “That is not it at all, That is not what I meant, at all.”)
So let’s go and look at the debate itself, rather than the article.
I was talking to someone recently who had talked to Yann and got him to agree with very alignment-y things, but then a couple of days later Yann was saying very capabilities-focused things instead.
The “someone”’s theory was that Yann’s incentives and environment all point towards capabilities research.