It doesn’t quite kill Pascal’s mugging: the threat does have to have some minimum level of credibility, but that minimum credibility can still be low enough that you hand over the cash. Pascal’s mugging is only killed if the expected utility of handing over the cash is negative. To show this, I think you really do need to evaluate the probability all the way to the end.
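To make that concrete, here is a minimal sketch of the expected-utility check (all numbers invented for illustration):

```python
# Minimal sketch: paying is rational exactly when
# p(threat is real) * (disutility averted) > cost of paying,
# so even a very low credibility can justify paying if the
# claimed stakes are large enough. All numbers are made up.

def should_pay(p_threat, disutility_averted, cost):
    """Pay iff the expected disutility averted exceeds the cost."""
    return p_threat * disutility_averted > cost

# One-in-a-trillion credibility still beats a $5 cost when the
# mugger claims 10^15 utilons are at stake...
print(should_pay(p_threat=1e-12, disutility_averted=1e15, cost=5))  # True
# ...but not when the claimed stakes are merely 10^9.
print(should_pay(p_threat=1e-12, disutility_averted=1e9, cost=5))   # False
```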
Neither does it kill paperclip maximizers. A collection of N paperclips requires about log2(N) bits to describe, plus the description of the properties of a paperclip. So the paperclip maximizer can still have a constantly increasing utility as it makes more paperclips; your rule would just bound it to growing like log(N).
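For concreteness, the log2(N) claim in code (just an illustration of naive binary encoding):

```python
import math

# Writing N out in binary takes N.bit_length() ~ log2(N) bits, so a
# utility bounded by description length grows roughly like log(N).
for n in (10, 1_000_000, 10**100):
    print(n.bit_length(), round(math.log2(n), 1))
```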
Good line of thought though: there may still be something in here.
To “kill Pascal’s mugging” one doesn’t have to give advice on how to deal with threats generally.
I think that N paperclips takes about complexity-of-N, plus the complexity of a paperclip, bits to describe. “Complexity of N” can be much lower than log(N): e.g., the complexity of 3^^^3 is smaller than the Wikipedia article on Knuth’s up-arrow notation. “3^^^3 paperclips” has very low complexity and very high utility.
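A quick way to see this: the recursive definition below pins down 3^^^3 in a handful of lines, so its description complexity is tiny even though its magnitude (and hence log2 of it) is astronomical. (A sketch only; actually evaluating 3^^^3 is physically impossible.)

```python
# Knuth's up-arrow notation: arrow(a, n, b) computes a with n up-arrows
# applied to b. The *program* is a few lines, so it upper-bounds the
# description complexity of 3^^^3 at a small constant, even though
# 3^^^3 itself (and therefore log2(3^^^3)) is astronomically large.
def arrow(a, n, b):
    if n == 1:
        return a ** b
    if b == 0:
        return 1
    return arrow(a, n - 1, arrow(a, n, b - 1))

print(arrow(3, 2, 3))  # 3^^3 = 3**(3**3) = 7625597484987
# arrow(3, 3, 3) would be 3^^^3 -- don't call it; no computer could
# finish -- but the description above stays just as short.
```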
Ah, you’re right. But I think that a decision theory is better (better fulfills desiderata of universality, simplicity, etc. etc.) if it treats Pascal’s mugging with the same method it uses for other threats.
Why? Is “threat” a particularly “natural” category?
From my perspective, Pascal’s mugging is simply an argument showing that a human-friendly utility function should have a certain property, not a special class of problem to be solved.
Hah. Well, we can apply my exact same argument with different words to show why I agree with you:
But I think that a decision theory is better (better fulfills desiderata of universality, simplicity, etc. etc.) if it treats threats with the same method it uses for other decision problems.
Pascal’s mugging is only killed if the expected utility of handing over the cash is negative.
This will be the case in the scenario under discussion, due to the low probability of the mugger’s threat (in the “3^^^^3 disutilons” version), or the (relatively!) low disutility (in the “3^^^^3 persons” version, under Michael Vassar’s proposal).
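To illustrate how that could come out negative, here is a toy model (my own paraphrase of the kind of rule under discussion, not Michael Vassar’s actual proposal): penalize the scenario’s prior by about 2^-K for a description of length K bits, and cap the utility at stake at something proportional to K.

```python
# Toy model (an assumption, not the actual proposal): a scenario that
# takes K bits to describe gets prior ~ 2^-K, while the utility at
# stake is capped at ~ K. The expected disutility averted is then
# K * 2^-K, which vanishes as the mugger's claim gets more exotic.

def expected_disutility_averted(k_bits, utility_per_bit=1.0):
    prior = 2.0 ** -k_bits                    # complexity-penalized credibility
    capped_stakes = utility_per_bit * k_bits  # utility bounded by K
    return prior * capped_stakes

COST_OF_PAYING = 5.0  # the mugger's $5, in utilons, for illustration
for k in (10, 100, 1000):
    eu = expected_disutility_averted(k)
    print(k, eu, eu > COST_OF_PAYING)  # False every time here
```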
So the paperclip maximizer can still have a constantly increasing utility as it makes more paperclips; your rule would just bound it to growing like log(N)
Yes; it would be a “less pure” paperclip maximizer, but still an unfriendly AI.
The rule is (proposed to be) necessary for friendliness, not sufficient by any means.