Suppose I were to threaten to increase existential risk by 0.0001% unless SIAI agrees to program its FAI to give me twice the post-Singularity resource allocation (or whatever the unit of caring will be) that I would otherwise receive. Can you see why it might have a policy against responding to threats? If Eliezer does not agree with you that censorship increases existential risk, he might censor some future post just to prove the credibility of his precommitment.
If you really think censorship is bad even by Eliezer’s values, I suggest withdrawing your threat and just trying to convince him of that using rational arguments. I rather doubt that Eliezer has some sort of unfixable bug regarding censorship that has to be patched using such extreme measures. It’s probably just that he got used to exercising strong moderation powers on SL4 (which never blew up like this, at least to my knowledge), and I’d guess that he has already updated on the new evidence and will be much more careful next time.
If you really think censorship is bad even by Eliezer’s values, I suggest withdrawing your threat and just trying to convince him of that using rational arguments.
I do not expect that (non-costly signalling by someone who does not have significant status) would work any more than threats would. A better suggestion would be to forget raw threats and consider what other means wfg has available for deploying an equivalent amount of power to the desired effect. Eliezer moved the game from one of persuasion (you should not talk about this) to one of power and enforcement (public humiliation, censorship, and threats). You don’t bring a pen to a gun fight.
I don’t understand why, just because Eliezer chose to move the game from one of persuasion to one about power and enforcement, you have to keep playing it that way.
If Eliezer is really so irrational that once he has exercised power on some issue, he is no longer open to any rational arguments on that topic, then what are we all doing here? Shouldn’t we be trying to hinder his efforts (to “not take over the world”) instead of (however indirectly) helping him?
Good questions; these were really fun to think about / write up :)
First off, let’s kill a background assumption that’s been messing up this discussion: that EY/SIAI/anyone needs a known policy toward credible threats.
It seems to me that stated policies toward credible threats are irrational unless a large number of the people you encounter will change their behavior based on those policies. To put it simply: policies are posturing.
If an AI credibly threatened to destroy the world unless EY became a vegetarian for the rest of the day, and he was already driving to a BBQ, is eating meat the only rational thing for him to do? (It sure would prevent future credible threats!)
If EY planned on parking in what looked like an empty space near the entrance to his local supermarket, only to discover that on closer inspection it was a handicapped-only parking space (with a tow truck only 20 feet away), is getting his car towed the only rational thing to do? (If he didn’t, an AI might find out his policy isn’t ironclad!)
This is ridiculous. It’s posturing. It’s clearly not optimal.
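To make the “posturing” point concrete, here is a toy cost comparison in Python. Everything in it is invented for illustration (the deterrence rate, the costs, the assumption that deterrence is the policy’s only benefit); it’s a sketch of the argument, not anyone’s actual model:

```python
# Toy model: a blanket "never give in" policy is only worth its stated
# cost if enough would-be threateners are actually deterred by it.
# Every number below is invented purely for illustration.

def policy_cost(deterrence_rate, n_threats, cost_of_defying):
    """Expected cost of a stated never-comply policy.

    deterrence_rate: fraction of would-be threateners deterred by the policy.
    n_threats:       threats you would otherwise face.
    cost_of_defying: average cost you eat when an undeterred threat is carried out.
    """
    return n_threats * (1 - deterrence_rate) * cost_of_defying

def case_by_case_cost(n_threats, cost_of_complying):
    # No stated policy: you face every threat, but just pay the (smaller)
    # cost of handling each one on its merits.
    return n_threats * cost_of_complying

# If almost nobody knows or cares about the policy, it is pure posturing:
print(policy_cost(deterrence_rate=0.05, n_threats=10, cost_of_defying=100))  # 950.0
print(case_by_case_cost(n_threats=10, cost_of_complying=10))                 # 100

# It only wins once most potential threateners are actually deterred:
print(policy_cost(deterrence_rate=0.95, n_threats=10, cost_of_defying=100))  # 50.0
```

The policy’s entire value lives in the deterrence term, so if almost nobody changes their behavior because of it, stating it buys you nothing but posturing.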
In answer to your question: Do the thing that’s actually best. The answer might be to give you 2x the resources. It depends on the situation: what SIAI/EY knows about you, about the likely effect of cooperating with you or not, and about the costs vs. benefits of cooperating with you.
Maybe there’s a good chance that knowing you’ll get more resources makes you impatient for SIAI to make an FAI, causing you to donate more. Who knows. Depends on the situation.
(If the above doesn’t work when an AI is involved, how about EY makes a policy that only applies to AIs?)
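For what “do the thing that’s actually best” might cash out to, here is a minimal per-threat sketch: compare the expected value of refusing against the cost of complying plus whatever future threats compliance invites. The probabilities and costs are hypothetical placeholders, not estimates anyone has actually made:

```python
# Case-by-case response to a single threat: weigh the expected harm of
# refusing against the direct cost of complying plus whatever future
# threats compliance is expected to invite. All parameters are
# hypothetical placeholders.

def best_response(p_carried_out, harm_if_carried_out,
                  cost_of_complying, expected_cost_of_future_threats):
    ev_refuse = -p_carried_out * harm_if_carried_out
    ev_comply = -(cost_of_complying + expected_cost_of_future_threats)
    return "comply" if ev_comply > ev_refuse else "refuse"

# A credible threat with enormous downside and a tiny ask:
print(best_response(p_carried_out=0.9, harm_if_carried_out=1e6,
                    cost_of_complying=2.0,
                    expected_cost_of_future_threats=50.0))   # comply

# A cheap bluff that costs little to call:
print(best_response(p_carried_out=0.01, harm_if_carried_out=10.0,
                    cost_of_complying=2.0,
                    expected_cost_of_future_threats=50.0))   # refuse
```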
In answer to your second paragraph: I could withdraw my threat, but that would lessen my posturing power for future credible threats.
(har har...)
The real reason is that I’m worried about what happens while I’m trying to convince him.
I’d love to discuss what sort of moderation is correct for a community like Less Wrong—it sounds amazing. Let’s do it.
But no way I’m taking the risk of undoing my fix until I’m sure EY’s (and LW’s) bugs are gone.