Dufaer comments on Resolving the unexpected hanging paradox

Dufaer 26 Jan 2011 1:48 UTC
OK, let’s look at this: The prisoner receives 2 pieces of information from the warden at the beginning:
- The first piece of information is: He will be killed at noon of one day of the next five days.
Assuming that the warden’s claim is true, there are 5 possible outcomes:

Death at noon of Mon, Tue, Wed, Thu or Fri.

Assuming furthermore that the prisoner has no other information and that he uses probability theory, he will construct the following uniform probability distribution:

P(Death at noon of X.)=1/5 where X can be Mon, Tue, Wed, Thu or Fri.

Furthermore he can now also infer the conditional probabilities

P(Death at noon of X.|Not dead after noon of Mon.)=1/4 for X = Tue, Wed, Thu or Fri.

P(Death at noon of X.|Not dead after noon of Tue.)=1/3 for X = Wed, Thu or Fri.

P(Death at noon of X.|Not dead after noon of Wed.)=1/2 for X = Thu or Fri.

And finally:

P(Death at noon of Fri.|Not dead after noon of Thu.)=1
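The conditional probabilities above can be checked mechanically. The following is only a minimal sketch: the names `DAYS` and `death_distribution` are made up for illustration, and the code assumes nothing beyond the uniform prior stated above, renormalized over the days that are still possible.

```python
from fractions import Fraction

DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri"]

def death_distribution(survived_past=None):
    """P(Death at noon of X | Not dead after noon of `survived_past`).

    Starts from the uniform prior P = 1/5 and renormalizes over the
    days that are still possible. `survived_past=None` means that no
    noon has passed yet. (Hypothetical helper, for illustration only.)
    """
    prior = {day: Fraction(1, len(DAYS)) for day in DAYS}
    first_open = 0 if survived_past is None else DAYS.index(survived_past) + 1
    open_days = DAYS[first_open:]
    remaining_mass = sum(prior[day] for day in open_days)
    return {day: prior[day] / remaining_mass for day in open_days}

# Print the distribution before Monday and after each survived noon.
for survived_past in [None, "Mon", "Tue", "Wed", "Thu"]:
    dist = death_distribution(survived_past)
    print(survived_past, {day: str(p) for day, p in dist.items()})
```

Exact fractions are used instead of floats so that the values 1/5, 1/4, 1/3, 1/2 and 1 come out exactly.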

Thus the only outcome that will not ‘surprise’ the prisoner is a death at noon of Friday, where ‘not surprised’ means that the event that occurred had P>1/2.

Is this the proper notion of ‘surprise’?

I don’t think so. Surprise should always be measured quantitatively.

But observe that in the ‘death at noon of Friday’ scenario there is no surprise whatsoever. It is qualitatively absent under the condition that the warden speaks the truth:

P(Death at noon of Fri.|Not dead after noon of Thu., The warden speaks truth.)=1

P(Death at noon of Fri.|Death at noon of Fri.)=1 (duh)

There is no updating.
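One possible way to measure surprise quantitatively is Shannon surprisal, log2(1/P), in bits. That choice of measure is an assumption made for this sketch and is not named in the comment itself; `surprisal_bits` and `surprisal_of_death` are likewise hypothetical names. Under the uniform belief structure from above, the surprisal shrinks day by day and is exactly zero for Friday.

```python
import math
from fractions import Fraction

DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri"]

def surprisal_bits(p):
    """Shannon surprisal log2(1/p) in bits (an assumed measure of surprise).

    Returns infinity for an event the believer considered impossible.
    """
    p = Fraction(p)
    if p == 0:
        return math.inf
    return math.log2(float(1 / p))

def surprisal_of_death(day):
    """Surprisal of death at noon of `day`, given survival up to that noon,
    under the uniform belief structure (1/5, 1/4, 1/3, 1/2, 1)."""
    days_left = len(DAYS) - DAYS.index(day)
    return surprisal_bits(Fraction(1, days_left))

for day in DAYS:
    print(day, round(surprisal_of_death(day), 3))
```

On this measure a Friday death carries no surprise at all, while every earlier day carries some, which is the qualitative difference pointed out above.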

The second piece of information from the warden is:
- The prisoner will be surprised by his death.
What does this mean, anyway? Can we actually alter our probability distribution based on this datum?

I highly doubt that, but the prisoner in the canonical treatment certainly does update. Since it holds that:

P(Death at noon of Fri.|Not dead after noon of Thu.)=1

He concludes:

P(Not dead after noon of Thu. AND The prisoner will be surprised by his death.)=0

And:

P(Death at noon of Fri.|Not dead after noon of Thu., The prisoner will be surprised by his death. )=0

As a special case of ‘P(Anything.|Contradiction.)=0’

He then runs a few iterations and concludes that all outcomes have P=0 under all conditions.
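The prisoner's iteration can be replayed step by step. This is a sketch of the canonical (and, as argued in this comment, incorrect) procedure, not an endorsement of it; `canonical_elimination` is a hypothetical name introduced here.

```python
DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri"]

def canonical_elimination(days):
    """Replay the prisoner's iteration.

    Returns the days in the order in which he rules them out.
    """
    candidates = list(days)
    ruled_out = []
    while candidates:
        last = candidates[-1]
        # Having survived every earlier candidate, only `last` is left,
        # so P(death on `last` | survival so far) = 1 and the death would
        # be no surprise. The prisoner therefore assigns it P = 0 and
        # drops it from the list of candidates.
        candidates.pop()
        ruled_out.append(last)
    return ruled_out

print(canonical_elimination(DAYS))
```

The loop only stops once no candidate day is left, which is the "P=0 for all outcomes" conclusion.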

Here the background assumption is still that the warden’s words are true. This assumption, however, is contradictory if the updating procedure of the prisoner is correct. But we can easily see that a prisoner holding the new belief structure will be surprised by any of the outcomes, rendering the warden correct. Thus the paradox.

The solution is simply that the prisoner’s updating procedure is incorrect.

The datum ‘The prisoner will be surprised by his death.’ does not warrant the update. The warden’s statements are contradictory if the original belief structure is retained and if the only remaining outcome is death at noon of Friday. However, after the first change of the belief structure by the prisoner this no longer holds. The further ‘iterations’ make even less sense and the whole ‘update’ is unstable, as our simple reflection shows: now the prisoner will be surprised by any outcome.

So obviously, this is not Bayesian updating.

The prisoner tried to reason. He concluded that he couldn’t be killed without proving the warden wrong, and changed his probability distribution over outcomes to reflect this, thus changing the prerequisites for his initial conclusion. He did not examine the implications of the changed prerequisites, and so ensured that the warden could always be right.

He thus updated wrongly: his beliefs do not reflect reality.

Observe that the outcome ‘The warden was correct.’ and its negation ‘The warden was incorrect.’ regarding the proposition ‘The prisoner will be surprised by his death.’, given ‘He will be killed at noon of one day of the next five days.’, depend solely on the belief structure of the prisoner.

Given that a belief structure is normally used by an agent to maximize utility and yet the prisoner is not an agent (he lacks a utility function), the belief structure is inconsequential apart from proving the warden right or wrong. The shaping of the structure is the only choice given to the prisoner and as such it can hardly be called a structure of belief at all.

If there were a non-constant utility function over ‘The warden was correct.’ and ‘The warden was incorrect.’, this would be a ‘belief-determined problem’, which is likely an inconsistent class of problems by itself: an agent trying to maximize utility in such a problem would have to simultaneously represent the problem and ‘believe’ in things which generally contradict this representation in order to maximize the payoff, thus making the ‘belief’ something indistinguishable from a ‘mere’ decision.

Nevertheless, in the canonical treatment the prisoner ensured by ‘incorrect’ updating that the warden was always right.

Likewise, we can construct an ‘incorrect’ belief structure that ensures that the warden will always be incorrect:

P(Death at noon of day #(N).|Not dead after noon of day #(N-1).)=1

This structure will be ‘surprised’ by any survival, as it expects certain death each day, but never by the death itself, so the warden is always incorrect.

Of course, this is total BS from the perspective of probability theory, but so is the original updating scheme.
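The three belief structures discussed in this comment can be compared directly. The sketch below assumes the threshold notion of surprise used earlier (not surprised if and only if the occurring event had P>1/2); the function names are hypothetical, and each belief function returns the probability the prisoner assigns to death at noon of a day, given that he is still alive that morning.

```python
from fractions import Fraction

DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri"]

def uniform_belief(day):
    """Original structure: 1/5, 1/4, 1/3, 1/2, 1 as the days pass."""
    return Fraction(1, len(DAYS) - DAYS.index(day))

def canonical_belief(day):
    """Structure after the prisoner's 'iterations': every day has P = 0."""
    return Fraction(0)

def certain_death_belief(day):
    """Structure that expects certain death on every day: P = 1."""
    return Fraction(1)

def warden_correct(belief, death_day, threshold=Fraction(1, 2)):
    """The warden is right iff the death surprises the prisoner, i.e. the
    prisoner did not assign it P > threshold (assumed notion of surprise)."""
    return not belief(death_day) > threshold

for name, belief in [("uniform", uniform_belief),
                     ("canonical", canonical_belief),
                     ("certain death", certain_death_belief)]:
    print(name, {day: warden_correct(belief, day) for day in DAYS})
```

Under these assumptions the canonical structure makes the warden right whichever day is chosen, the certain-death structure makes him wrong whichever day is chosen, and the uniform structure makes him wrong only for Friday.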