Hi Nate, thx for commenting!

It seems to me this problem can be avoided by allowing access to random bits. See my reply to KnaveOfAllTrades and my reply to V_V. Formally, we should allow pi in (4′) to be a randomized algorithm.
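To spell out the intended modification (this is my own shorthand rather than a restatement of (4′), so take the notation as an assumption about its general shape): instead of restricting the maximization to deterministic programs, we let pi range over programs with access to a source of fair coins, and average over those coin flips as well:

$$\pi^{*} \in \operatorname*{arg\,max}_{\pi}\ \mathbb{E}_{r}\,\mathbb{E}_{L}\!\left[\,U \;\middle|\; \text{the precursor outputs } \pi,\ \pi\text{'s coin flips come up } r\,\right]$$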
...The above reasoning process selects a decision theory that logically-causes the worse outcome, and I don’t think that’s the right move.
I don’t think “logical causation” in the sense you are using here is the right way to think about the anti-Newcomb problem. From the precursor’s point of view, there is no loss in utility from choosing XDT over UDT.
If you’re only trying to make agents that work well on “fair” games (where the obvious formalization of “fair” is “extensional” as defined above), then you should probably make that much more explicit.
Of course. I didn’t attempt to formalize “fairness” in that post, but the idea is to approach optimality for decision-determined problems in the sense of Yudkowsky 2010.
I have the impression that the agents you define in this post, while interesting, aren’t really attacking the core of the problem, which is this: how can one reason under false premises?
I realize that the logical expectation values I’m using are so far mostly wishful thinking. However, I think there is benefit in attacking the problem from both ends: understanding how logical probabilities are used may shed light on the desiderata they should satisfy.
...UDT choosing strategies without regard for its inputs is the mechanism by which it is able to trade with counterfactual versions of itself.
Consider two UDT agents A & B with identical utility functions living in different universes. Each of the agents is charged with making a certain decision, while receiving no input. If both agents are aware of each other’s existence, we expect [in the sense of “hope” rather than “are able to prove” :)] them to make decisions that maximize overall utility, even though, on the surface, each agent is only maximizing over its own decisions rather than the decisions of both agents.
What is the difference between this scenario and the scenario of a single agent existing in both universes which receives a single bit of input that indicates in which universe the given copy is?
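To make the equivalence concrete, here is a toy sketch (the payoff numbers are made up purely for illustration): a policy for the single one-bit-input agent is just a pair (action if the bit says universe 1, action if the bit says universe 2), which is exactly the object the two input-less agents jointly optimize.

```python
# Toy illustration with hypothetical payoffs: one decision is made in each universe,
# and the shared utility function depends on both decisions.
ACTIONS = ["a", "b"]

def utility(choice_in_u1, choice_in_u2):
    # Hypothetical payoff table; coordinating on ("a", "b") is jointly best.
    table = {("a", "a"): 1, ("a", "b"): 3, ("b", "a"): 0, ("b", "b"): 2}
    return table[(choice_in_u1, choice_in_u2)]

# View 1: two input-less UDT agents. Each optimizes the joint outcome,
# predicting that its counterpart (which runs the same reasoning) does likewise,
# so together they settle on the jointly optimal pair of decisions.
best_pair = max(
    ((c1, c2) for c1 in ACTIONS for c2 in ACTIONS),
    key=lambda pair: utility(*pair),
)

# View 2: a single agent that receives one bit saying which universe it is in,
# and optimizes over policies, i.e. functions from that bit to an action.
# A policy is fully described by the pair (action_on_bit_0, action_on_bit_1),
# so the search space and the objective are literally the same as in View 1.
best_policy = max(
    ((a0, a1) for a0 in ACTIONS for a1 in ACTIONS),
    key=lambda policy: utility(policy[0], policy[1]),
)

assert best_pair == best_policy == ("a", "b")
```

The two searches coincide precisely because a policy over a single input bit carries no more information than a pair of unconditional decisions, which is the sense in which the two scenarios come out the same.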
… I’m not quite sure why you’re so concerned with avoiding quining here.
See my reply to Wei Dai.
...how do we resolve the problem where sometimes it seems like agents with less computing power have some sort of logical-first-mover-advantage?
You’re referring to the agent-simulates-predictor problem? Actually, I think my (4′) may contain a clue for solving it. As I commented, the logical expectation values should only use about as much computing power as the precursor has, rather than as much computing power as the successor has. Therefore, if the predictor is at least as strong as the precursor, the successor wins by choosing a policy pi which is a UDT agent symmetric to the predictor.
There are certainly some places where our thinking has diverged...
Hopefully, further discussion will lead us to a practical demonstration of Aumann’s theorem :)