See the posts linked to from UDT, ADT, TDT, section “Decision theory” in this post.
I don’t see why an agent can’t know about itself that it’s a utility maximizer.
It can, it’s just a consideration it can’t use in figuring out how the outcome depends on its decision. While looking for what is influenced by a decision, the decision itself should remain unknown (given that the decision is determined by what this dependence turns out to be, it’s not exactly self-handicapping).
Yes, A=1 and A=2 were mixed up in the (Ax2), fixed it now. But I don’t see your contradiction. (Ax3) + [A=1 ⇒ W=0] + [A=2 ⇒ W=1000000] does not imply A=1. It implies NOT(A=1)
It seems the mixing up of A=1 and A=2 didn’t spare this argument, but it’s easy enough to fix. From A=2 we have NOT(A=1), so [A=1 ⇒ W=1,000,000,000] and together with [A=2 ⇒ W=1,000,000] by (Ax3) we have A=1, contradiction.
If I attempted to fix this, I would try to change (Ax3) to something like:
forall a1,a2,w1,w2 ((A=a1 ⇒ W=w1) AND (A=a2 ⇒ W=w2) AND (w1>w2)) ⇒ NOT (FinalA=a2)
where FinalA is the actual decision. But then, why should the world’s axiom (Ax2) be defined in terms of A and not FinalA? This seems conceptually wrong...
See the posts linked to from UDT, ADT, TDT, section “Decision theory” in this post.
It can, it’s just a consideration it can’t use in figuring out how the outcome depends on its decision. While looking for what is influenced by a decision, the decision itself should remain unknown (given that the decision is determined by what this dependence turns out to be, it’s not exactly self-handicapping).
It seems the mixing up of A=1 and A=2 didn’t spare this argument, but it’s easy enough to fix. From A=2 we have NOT(A=1), so [A=1 ⇒ W=1,000,000,000] and together with [A=2 ⇒ W=1,000,000] by (Ax3) we have A=1, contradiction.
I thought about a different axiomatization, which would not have the same consistency problems. Not sure whether this is the right place, but:
Let X be a variable ranging over all possible agents.
(World Axioms)
forall X [Decides(X)=1 ⇒ Receives(X)=1000000]
forall X [Decides(X)=2 ⇒ Receives(X)=1000]
(Agent Axioms—utility maximization)
BestX = argmax{X} Receives(X)
Decision = Decides(BestX)
Then Decision=1 is easily provable, regardless of whether any specific BestX is known.
Does not seem to lead to contradictions.
Yes, I see now, thanks.
If I attempted to fix this, I would try to change (Ax3) to something like:
forall a1,a2,w1,w2 ((A=a1 ⇒ W=w1) AND (A=a2 ⇒ W=w2) AND (w1>w2)) ⇒ NOT (FinalA=a2)
where FinalA is the actual decision. But then, why should the world’s axiom (Ax2) be defined in terms of A and not FinalA? This seems conceptually wrong...
Ok, I’ll go read up on the posts :)