For the most part, when person P says, “I will do X,” that is evidence that P will do X, and the probability of P doing X increases. If, instead, P has a reputation for sarcasm and says the same thing, then the probability that P will do X decreases. Clearly, then, our estimate of P’s position in mindspace determines whether P’s claim raises or lowers our probability that P will do X. For the mugging situation, we might adopt a model in which the mugger’s claims about very improbable actions do not affect what we expect him to do at all, since we have no useful estimate of the mugger’s position in mindspace; how could we? We cannot assume the mugger tends to be more honest than not, as we can with humans. The expected utilities balance and cancel, so I should keep my wallet.
You’re right about this. However, the main problem we have here is this:
A compactly specified wager can grow in size much faster than it grows in complexity. The utility of a Turing machine can grow much faster than its prior probability shrinks.
If the utility is proportional to −2^(2^n) while the prior probability shrinks only in proportion to 2^(−n) in the complexity n (measured in Kolmogorov complexity, for all I care), then even if the information we get from their utterance leads us to a posterior different from the prior, the expected utility goes to −∞ as n goes to ∞.
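To make the mismatch in rates concrete, here is a minimal Python sketch using only the illustrative growth rates named above; nothing in it is specific to any actual mugger:

```python
from fractions import Fraction

# Illustrative rates from the argument above: the utility of the claimed
# outcome is -2^(2^n), while the prior on a claim of complexity n
# shrinks only like 2^(-n).
def utility(n):
    return -(2 ** (2 ** n))

def prior(n):
    return Fraction(1, 2 ** n)

# Expected-utility contribution of the claim at each complexity level.
for n in range(1, 6):
    print(n, float(utility(n) * prior(n)))
# The product is -(2^(2^n)) / 2^n, which still diverges:
# -2.0, -4.0, -32.0, -4096.0, -134217728.0, ...
```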
The fallacy here is that you’re assuming the prior probability shrinks only because of complexity. The probability could also shrink because, for instance, higher utilities are more useful to dishonest muggers than lower ones.
Fair enough, though I wouldn’t call that my prior but rather my posterior, after updating on my belief about what their expected utility might be.
So you propose that I update my probability to be proportional to the inverse of their expected utility? How do I even begin to guess their utility function if this is a one-shot interaction? How do I distinguish between honest and dishonest people?
Under ignorance of the mugger’s position in mindspace, we should then assign the same probability to the mugger’s claim and to its opposite. Then, for all n, (n utilons) · Pr(mugger causes +n utilons) + (−n utilons) · Pr(mugger causes −n utilons) = 0. This response seems to handle the difference in growth rates between the utility and the probability.
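A minimal sketch of that cancellation, assuming only the symmetric probability assignment described above; the particular choice of p_n is an arbitrary illustration:

```python
from fractions import Fraction

# Symmetry response above: if the claim "I will cause +n utilons" and
# its opposite "-n utilons" get the same probability p_n for every n,
# the two contributions cancel term by term and the total expected
# utility is exactly zero, however the p_n are chosen.
def expected_utility(symmetric_prob, max_n):
    total = Fraction(0)
    for n in range(1, max_n + 1):
        p_n = symmetric_prob(n)
        total += n * p_n + (-n) * p_n  # cancels exactly at every n
    return total

# Purely illustrative choice of p_n = 2^-(n+1):
print(expected_utility(lambda n: Fraction(1, 2 ** (n + 1)), 1000))  # -> 0
```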
The question is not only about their position in mindspace. There may well be as many possible minds (not just humans) that believe they can simulate 3^^^3 people and torture them as there are minds that do not believe so. But that does not mean there are as many possible minds that could actually do it. So I shouldn’t use a maximum-entropy prior for my belief in their ability to do it, but rather for my belief in their belief in their ability to do it!
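A small numerical sketch of that distinction, with purely hypothetical numbers: the maximum-entropy 50/50 goes on whether the mugger believes the claim, while the probability that a believing mind can actually carry it out is a separate, tiny factor:

```python
# Hypothetical placeholder numbers, only to illustrate the structure.
p_believes = 0.5              # max-entropy prior on the mugger's belief
p_can_given_believes = 1e-30  # believing it is cheap; actually doing it is not
p_can_given_not = 0.0         # a mind that doesn't believe it won't even try

p_can = (p_can_given_believes * p_believes
         + p_can_given_not * (1 - p_believes))
print(p_can)  # 5e-31: nowhere near the 50/50 we'd get by putting max
              # entropy directly on "the mugger can do it"
```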
This is one of those cases where it helps to be a human, because we’re dumb enough that we can’t possibly calculate the true probabilities involved, and so the expected utilities sum to zero in any reasonable approximation of the situation, by human standards.
Unfortunately, a superintelligent AI would be able to get a much better calculation out of something like this, and while a .0000000000000001 probability might round down to 0 for us lowly humans, an AI wouldn’t round it down. (After all, why should it? Unlike us, it has no reason to doubt its capacity for calculation.) And with enormous utilities like 3^^^^3, even a .0000000000000001 difference in probability is too much. The problem isn’t with us, directly, but with the actions a hypothetical AI agent might take. We certainly don’t want our newly built FAI to suddenly decide to devote all of humanity’s resources to serving the first person who comes up with the bright idea of Pascal’s Mugging it.
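A minimal sketch of why that residual probability matters to a naive expected-utility maximizer; 3^^^^3 is far too large to represent, so a vastly smaller placeholder stands in, and the other numbers are hypothetical:

```python
# Hypothetical numbers for a naive expected-utility comparison.
p_claim_true = 1e-16          # the residual that "rounds down to 0" for humans
utility_if_true = 10 ** 100   # placeholder for the claimed 3^^^^3-scale payoff
cost_of_wallet = 100          # hypothetical utility of keeping the wallet

ev_pay = p_claim_true * utility_if_true - cost_of_wallet
ev_refuse = 0.0
print("pay" if ev_pay > ev_refuse else "refuse")  # -> "pay", by a huge margin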