My understanding is that the issue is with Timeless Decision Theory, and AIs that can do acausal trade.
Roko wasn’t arguing against TDT. Roko’s post was about acausal trade, but the conclusion he was trying to argue for was just ‘utilitarian AI is evil because it causes suffering for the sake of the greater good’. But if that’s your concern, you can just post about some variant on the trolley problem. If utilitarianism is risky because a utilitarian might employ blackmail and blackmail is evil, then there should be innumerable other evil things a utilitarian would also do that require less theoretical apparatus.
As I understand Roko’s motivation, it was to convince people that we should not build an AI that would do basilisks. Not to spread infohazards for no reason.
On Roko’s view, if no one finds out about basilisks, the basilisk can’t blackmail anyone. So publicizing the idea doesn’t make sense, unless Roko didn’t take his own argument all that seriously. (Maybe Roko was trying to protect himself from personal blackmail risk at others’ expense, but this seems odd if he also increased his own blackmail risk in the process.)
Possibly Roko was thinking: ‘If I don’t prevent utilitarian AI from being built, it will cause a bunch of atrocities in general. But LessWrong users are used to dismissing anti-utilitarian arguments, so I need to think of one with extra shock value to get them to do some original seeing. This blackmail argument should work—publishing it puts people at risk of blackmail, but it serves the greater good of protecting us from other evil utilitarian tradeoffs.’
(… Irony unintended.)
Still, if that’s right, I’m inclined to think Roko should have tried to post other arguments against utilitarianism that don’t (in his view) put anyone at risk of torture. I’m not aware of him having done that.
Roko wasn’t arguing against TDT. Roko’s post was about acausal trade, but the conclusion he was trying to argue for was just ‘utilitarian AI is evil because it causes suffering for the sake of the greater good’. But if that’s your concern, you can just post about some variant on the trolley problem. If utilitarianism is risky because a utilitarian might employ blackmail and blackmail is evil, then there should be innumerable other evil things a utilitarian would also do that require less theoretical apparatus.
OK, that makes a bit less sense to me. I didn’t think it was against utilitarianism in general, which is much less controversial than TDT. But I can definitely still see his argument.
When people talk about the trolley problem, they don’t usually imagine that they might be the ones tied to the second track. The deeply unsettling thing about the basilisk isn’t that the AI might torture people for the greater good. It’s that you are the one who is going to be tortured. That’s a pretty compelling case against utilitarianism.
On Roko’s view, if no one finds out about basilisks, the basilisk can’t blackmail anyone. So publicizing the idea doesn’t make sense, unless Roko didn’t take his own argument all that seriously.
Roko found out. It disturbed him greatly. So it absolutely made sense for him to try to stop the development of such an AI any way he could. By telling other people, he made it their problem too and converted them to his side.
It’s that you are the one who is going to be tortured. That’s a pretty compelling case against utilitarianism.
It doesn’t appear to me to be a case against utilitarianism at all. “Adopting utilitarianism might lead to me getting tortured, and that might actually be optimal in utilitarian terms, therefore utilitarianism is wrong” doesn’t even have the right shape to be a valid argument. It’s like “If there is no god then many bad people will prosper and not get punished, which would be awful, therefore there is a god.” (Or, from the other side, “If there is a god then he may choose to punish me, which would be awful, therefore there is no god”—which has a thing or two in common with the Roko basilisk, of course.)
he made it their problem too and converted them to his side.
Perhaps he hoped to. I don’t see any sign that he actually did.
“Adopting utilitarianism might lead to me getting tortured, and that might actually be optimal in utilitarian terms, therefore utilitarianism is wrong” doesn’t even have the right shape to be a valid argument.
You are strawmanning the argument significantly. I would word it more like this:
“Building an AI that follows utilitarianism will lead to me getting tortured. I don’t want to be tortured. Therefore I don’t want such an AI to be built.”
Perhaps he hoped to. I don’t see any sign that he actually did.
That’s partially because EY fought against it so hard and even silenced the discussion.
So there are two significant differences between your version and mine. The first is that mine says “might” and yours says “will”, but I’m pretty sure Roko wasn’t by any means certain that that would happen. The second is that yours ends “I don’t want such an AI to be built”, which doesn’t seem to me like the right ending for “a case against utilitarianism”.
(Unless you meant “a case against building a utilitarian AI” rather than “a case against utilitarianism as one’s actual moral theory”?)
The first is that mine says “might” and yours says “will”, but I’m pretty sure Roko wasn’t by any means certain that that would happen.
I should have mentioned that it’s conditional on the basilisk argument being correct. If we build an AI that follows that line of reasoning, then it will torture. If the basilisk argument fails for unrelated reasons, then this whole line of reasoning is irrelevant.
Anyway, the exact certainty isn’t too important. You use the word “might” as if the probability of you being tortured were really small, as though the AI would only do it in really obscure scenarios. But you are just as likely to be picked for torture as anyone else.
Roko believed that the probability was much higher, and therefore worth worrying about.
The second is that yours ends “I don’t want such an AI to be built”, which doesn’t seem to me like the right ending for “a case against utilitarianism”.
Unless you meant “a case against building a utilitarian AI” rather than “a case against utilitarianism as one’s actual moral theory”?
Well, the AI is just implementing the conclusions of utilitarianism (again, conditional on the basilisk argument being correct). If you don’t like those conclusions, and if you don’t want AIs to be utilitarian, then do you really support utilitarianism?
It’s a minor semantic point though. The important part is the practical consequences for how we should build AI. Whether or not utilitarianism is “right” is more subjective and mostly irrelevant.
Roko believed that the probability was much higher
All I know about what Roko believed about the probability is that (1) he used the word “might” just as I did and (2) he wrote “And even if you only think that the probability of this happening is 1%, …” suggesting that (a) he himself probably thought it was higher and (b) he thought it was somewhat reasonable to estimate it at 1%. So I’m standing by my “might” and robustly deny your claim that writing “might” was strawmanning.
if you don’t want AIs to be utilitarian
If you’re standing in front of me with a gun and telling me that you have done some calculations suggesting that on balance the world would be a happier place without me in it, then I would probably prefer you not to be utilitarian. This has essentially nothing to do with whether I think utilitarianism produces correct answers. (If I have a lot of faith in your reasoning and am sufficiently strong-minded then I might instead decide that you ought to shoot me. But my likely failure to do so merely indicates typical human self-interest.)
The important part is the practical consequences for how we should build AI.
Perhaps so, in which case calling the argument “a case against utilitarianism” is simply incorrect.
Roko’s argument implies the AI will torture. The probability you assign to his argument being correct is a different matter. Roko was saying “if you think there is a 1% chance that my argument is correct”, not “if my argument is correct, there is a 1% chance the AI will torture.”
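To spell out the decomposition I have in mind (just a sketch, with Roko’s 1% used purely as an illustrative number):

$$
P(\text{torture}) = P(\text{argument correct}) \times P(\text{torture} \mid \text{argument correct}) \approx 0.01 \times 1 = 0.01
$$

The “might” belongs to the first factor; on my reading of Roko, the second factor is essentially 1.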
This really isn’t important though. The point is, if an AI has some likelihood of torturing you, you shouldn’t want it to be built. You can call that self-interest, but that’s admitting you don’t really want utilitarianism to begin with. Which is the point.
Anyway this is just steel-manning Roko’s argument. I think the issue is with acausal trade, not utilitarianism. And that seems to be the issue most people have with it.