OK, so just spelling out the argument for my own amusement:
The agent can prove the implications “agent() == 1 implies world() == 1000000” and “agent() == 2 implies world() == 1000″, both of which are true.
Let x and y in {1, 2} be such that agent() == x and agent() != y.
Let Ux be the greatest value of U for which the agent proves “agent() == x implies world() == U”.
Let Uy be the greatest value of U for which the agent proves “agent() == y implies world() == U”.
Then Ux is greater than or equal to Uy.
If the agent’s formalised reasoning is consistent then since agent() == x is true, Ux must be the only U such that “agent() == x implies world() == U” is provable.
Now, if there is more than one U such that “agent() == y implies world() == U” is provable then “agent() == y implies contradiction” is provable, and hence so is “agent() == y implies world() == U” for some value of U larger than Ux. Hence Uy would be larger than Ux, which is impossible.
Therefore, the implications “agent() == 1 implies world() == 1000000” and “agent() == 2 implies world() == 1000″ are the only ones provable.
Thanks for this comment! The idea that only a single Uy can be provable, otherwise we get a contradiction and hence we can prove an arbitrarily high utility has greatly clarified this post for me. I still haven’t quite figured out why the agent can’t use the above proof that those are the only two provable and the fact that agent()==1 gives a higher value to prove that agent()!=2 and then create a contradiction.
Edit: Actually, I think I now understand this. After finding those two proofs, the agent can only conclude that there aren’t any more proofs of the form “agent()=_ implies world()=_”if its reasoning (say PA) is consistent. But it can’t actually prove this!
OK, so just spelling out the argument for my own amusement:
The agent can prove the implications “agent() == 1 implies world() == 1000000” and “agent() == 2 implies world() == 1000″, both of which are true.
Let x and y in {1, 2} be such that agent() == x and agent() != y.
Let Ux be the greatest value of U for which the agent proves “agent() == x implies world() == U”. Let Uy be the greatest value of U for which the agent proves “agent() == y implies world() == U”.
Then Ux is greater than or equal to Uy.
If the agent’s formalised reasoning is consistent then since agent() == x is true, Ux must be the only U such that “agent() == x implies world() == U” is provable.
Now, if there is more than one U such that “agent() == y implies world() == U” is provable then “agent() == y implies contradiction” is provable, and hence so is “agent() == y implies world() == U” for some value of U larger than Ux. Hence Uy would be larger than Ux, which is impossible.
Therefore, the implications “agent() == 1 implies world() == 1000000” and “agent() == 2 implies world() == 1000″ are the only ones provable.
Therefore, agent() == 1.
Thanks for this comment! The idea that only a single Uy can be provable, otherwise we get a contradiction and hence we can prove an arbitrarily high utility has greatly clarified this post for me. I still haven’t quite figured out why the agent can’t use the above proof that those are the only two provable and the fact that agent()==1 gives a higher value to prove that agent()!=2 and then create a contradiction.
Edit: Actually, I think I now understand this. After finding those two proofs, the agent can only conclude that there aren’t any more proofs of the form “agent()=_ implies world()=_”if its reasoning (say PA) is consistent. But it can’t actually prove this!
Yes, this looks right.