The vast majority of people who read about Pascal’s Mugging won’t actually be convinced to give money to someone promising them ludicrous fulfilment of their utility function. The vast majority of people who read about Roko’s Basilisk do not immediately go out and throw themselves into a research institute dedicated to building the basilisk. However, they also do not stop believing in the principles underpinning these “radical” scenarios/courses of action (the maximization of utility, for one). Many of them will go on to affirm the very same thought processes that would lead you to give all your money to a mugger or build an evil AI, for instance by donating money to charities they think will be most effective.
This suggests that most people have some innate way of distinguishing between “good” and “bad” implementations of certain ideas or principles that isn’t just “throw the idea away completely”. It might* be helpful if we could dig out this innate method and apply it more consciously.
*I say might because there’s a real chance that the method turns out to be just “accept implementations that are societally approved of, like giving money to charity, and dismiss implementations that are not societally approved of, like building rogue AIs”. If that’s the case, then it’s not very useful. But it’s probably worth at least some investigation.
I don’t need any defense mechanisms against these, because I can just see the fallacies in the arguments.
In one description, Blaise Pascal is accosted by a mugger who has forgotten his weapon. However, the mugger proposes a deal: the philosopher gives him his wallet, and in exchange the mugger will return twice the amount of money tomorrow. Pascal declines, pointing out that it is unlikely the deal will be honoured. The mugger then continues naming higher rewards, pointing out that even if it is just one chance in 1000 that he will be honourable, it would make sense for Pascal to make a deal for a 2000 times return. Pascal responds that the probability for that high return is even lower than one in 1000. The mugger argues back that for any low probability of being able to pay back a large amount of money (or pure utility) there exists a finite amount that makes it rational to take the bet – and given human fallibility and philosophical scepticism a rational person must admit there is at least some non-zero chance that such a deal would be possible. In one example, the mugger succeeds by promising Pascal 1,000 quadrillion happy days of life. Convinced by the argument, Pascal gives the mugger the wallet.
When a mugger promises to return twice my money tomorrow, I can see that it is almost certainly a hoax. There is maybe a one in a million chance he’s telling the truth. The expected value of the wager is negative. If he promises a 2000x return, that’s even less likely to be true; I estimate it at one in two billion. The expected value is roughly the same, and still negative. And so on: the more lavish the reward the mugger promises, the less likely I am to trust him, so the expected value can always stay negative.
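To make that arithmetic explicit, here is a minimal sketch in Python. The wallet size and the exact probability figures are just the illustrative guesses from the paragraph above, not claims about the real odds.

```python
# Minimal sketch of the expected-value reasoning above.
# The probabilities are the illustrative guesses from the comment,
# not measured values.

def expected_value(wallet, multiplier, p_honest):
    """EV of handing over `wallet` for a promised `multiplier`x return."""
    return p_honest * (multiplier * wallet) - wallet

wallet = 100  # assumed wallet contents, in arbitrary units

for multiplier, p_honest in [(2, 1e-6), (2000, 0.5e-9)]:
    ev = expected_value(wallet, multiplier, p_honest)
    print(f"{multiplier}x return at p = {p_honest}: EV = {ev:.4f}")

# Both EVs come out at essentially -100: as long as the estimated
# probability of honesty shrinks at least as fast as the promised
# multiplier grows, the product p_honest * multiplier stays bounded
# and the expected value stays negative.
```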
Roko’s basilisk
Why don’t I throw myself into a research institute dedicated to building the basilisk? Because there is no such institute, and if someone seriously tried to build one, they’d just end up in prison or a mental asylum for extortion. Unless they kept their work secret, but then it’s just an unnecessarily convoluted way of building an AI that kills everyone. So there is no reason why I would want to do that.
I stopped reading right after “Roko’s basilisk”.

ETA: I suggest you label it an info-hazard.

Roko’s basilisk was mentioned in the original comment, so I’m not doing any additional harm by mentioning it again in the same thread. I suggest you stop calling everything an “infohazard”, because it devalues the term and makes you look silly. Some information really is dangerous, e.g. a bioweapon recipe. Wouldn’t it be good to have a term for dangerous information like that, and to have people take it seriously? I think you’ve already failed at the second part. On this site, I’ve seen the term “infohazard” applied to such information as “we are all going to die”, “there is a covid pandemic” (https://www.lesswrong.com/posts/zTK8rRLr6RT5yWEmX/what-is-the-appropriate-way-to-communicate-that-we-are), and “CDC made some mistakes” (https://www.lesswrong.com/posts/nx94BD6vBY23rk6To/thoughts-on-the-scope-of-lesswrong-s-infohazard-policies).

I’m sorry, but I can’t take infohazard warnings seriously any longer. And yes, Roko’s basilisk is another example of a ridiculous infohazard, because almost all AGI designs are evil anyway. What if someone creates an AI that tortures everyone who doesn’t know about Roko’s basilisk? Then I’m doing a public service.
I think anyone seriously anxious about some potential future AGI torturing them is ridiculously emotionally fragile and should grow up. People get tortured all the time. If you weren’t born in a nice first-world country, you’d live your whole life knowing you could be tortured by your government at any moment. Two of my friends got tortured. Learning that my government tortures people makes one more likely to go protest against it and end up tortured too. Yet I don’t give people any infohazard warnings before talking about it, and I’m not going to. How are you even supposed to solve a problem if you aren’t allowed to discuss some of its aspects?
And if I’m mistaken somewhere, why don’t you explain why, instead of just downvoting me?
I don’t accept your authority on what “looks silly”. And I don’t optimize for how I look, so I’m unmoved by your social pressure. Most of your post is summed up by “Come on, be courageous”.
I strong-downvoted because your post relies on patterns of social pressure instead of just giving me arguments for why I’m wrong.
The only two arguments I can find are:
What if someone creates an AI that tortures everyone who doesn’t know about Roko’s basilisk?
[...]
How are you even supposed to solve a problem if you aren’t allowed to discuss some of its aspects?
I doubt good answers to those questions would change your mind on calling Roko’s basilisk an info-hazard. (Would they?)
Well, looking bad leads to attracting less donor money, so it is somewhat important how you look. The argument about why Roko’s basilisk won’t actually be built on purpose is my central point; that’s what you’d have to refute to change my mind. (While I understand how it might get created by accident, spreading awareness to prevent such an accident is more helpful than covering it up. And covering it up is impossible by now anyway: thanks to the Streisand effect, the topic comes up all the time.)