I note here that simply enumerating possible worlds evades this problem as far as I can tell.
The analogous unfair decision problem would be “punish the agent if they simply enumerate possible worlds and then choose the action that maximizes their expected payout”. Not calling something a decision theory doesn’t mean it isn’t one.
Please propose a mechanism by which you can make an agent who enumerates the worlds seen as possible by every agent, no matter what their decision theory is, end up in a world with lower utility than some other agent.
Say you have an agent A who follows the world-enumerating algorithm outlined in the post. Omega makes a perfect copy of A and presents the copy with a red button and a blue button, while telling it the following:
“I have predicted in advance which button A will push. (Here is a description of A; you are welcome to peruse it for as long as you like.) If you press the same button as I predicted A would push, you receive nothing; if you push the other button, I will give you $1,000,000. Refusing to push either button is not an option; if I predict that you do not intend to push a button, I will torture you for 3^^^3 years.”
The copy’s choice of button is then noted, after which the copy is terminated. Omega then presents the real agent facing the problem with the exact same scenario as the one faced by the copy.
Your world-enumerating agent A will always fail to obtain the maximum $1,000,000 reward accessible in this problem. However, a simple agent B who chooses randomly between the red and blue buttons has a 50% chance of obtaining this reward, for an expected utility of $500,000. Therefore, A ends up in a world with lower expected utility than B.
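Q.E.D.

(A minimal sketch of the arithmetic behind this claim, not part of dxu’s original comment. It assumes A is deterministic, that Omega predicts A by running a faithful simulation of it, and that the payout rule for whoever faces the buttons is always keyed to Omega’s prediction of A. The function names are purely illustrative.)

```python
import random

def world_enumerating_A():
    # Stand-in for the deterministic world-enumerating agent: whatever
    # deliberation it runs, it always outputs the same button.
    return "red"

def random_B():
    # A simple agent that ignores the problem and flips a fair coin.
    return random.choice(["red", "blue"])

def omega_predicts(agent):
    # Assumption: Omega predicts a deterministic agent perfectly,
    # e.g. by running a faithful copy of it.
    return agent()

def payout(button, predicted_button_of_A):
    # $1,000,000 iff the button pushed differs from Omega's prediction of A.
    return 1_000_000 if button != predicted_button_of_A else 0

prediction = omega_predicts(world_enumerating_A)
trials = 100_000

avg_A = sum(payout(world_enumerating_A(), prediction) for _ in range(trials)) / trials
avg_B = sum(payout(random_B(), prediction) for _ in range(trials)) / trials

print(f"A's average payout: ${avg_A:,.0f}")  # $0: A always matches the prediction of A
print(f"B's average payout: ${avg_B:,.0f}")  # roughly $500,000
```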
Your scenario is somewhat ambiguous, but let me attempt to answer all versions of it that I can see.
First: does the copy of A (hereafter, A′) know that it’s a copy?
If yes, then the winning strategy is “red if I am A, blue if I am A′”. (Or the reverse, of course; but whichever variant A selects, we can be sure that A′ selects the same one, being a perfect copy and all.)
If no, then indeed A receives nothing, but then of course this has nothing to do with any copies; it is simply the same scenario as if Omega predicted A’s choice, then gave A the money if A chose differently than predicted—which is, of course, impossible (Omega is a perfect predictor), and thus this, in turn, is the same as “Omega shows up, doesn’t give A any money, and leaves”.
Or is it? You claim that in the scenario where Omega gives the money iff A chooses otherwise than predicted, A could receive the money with 50% probability by choosing randomly. But this requires us to reassess the terms of the “Omega, a perfect predictor” stipulation, as previously discussed by cousin_it. In any case, until we’ve specified just what kind of predictor Omega is, and how its predictive powers interact with sources of (pseudo-)randomness—as well as whether, and how, Omega’s behavior changes in situations involving randomness—we cannot evaluate scenarios such as the one you describe.
dxu did not claim that A could receive the money with 50% probability by choosing randomly. They claimed that a simple agent B that chose randomly would receive the money with 50% probability. The point is that Omega is only trying to predict A, not B, so it doesn’t matter how well Omega can predict B’s actions.
The point can be made even clearer by introducing an agent C that just does the opposite of whatever A would do. Then C gets the money 100% of the time (unless A gets tortured, in which case C also gets tortured).
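(Continuing the illustrative sketch above, and again assuming A is deterministic, that Omega’s prediction is of A, and that whoever pushes the non-predicted button collects the money. This is an editorial illustration of Dacyn’s point, not anything from the thread itself.)

```python
def A():
    return "red"  # any deterministic world-enumerator: a fixed output

def C():
    # C consults (a simulation of) A and pushes the other button.
    return "blue" if A() == "red" else "red"

prediction_of_A = A()  # a perfect predictor of a deterministic A
print(C() != prediction_of_A)  # True: C collects the $1,000,000 every time
```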
This doesn’t make a whole lot of sense. Why, and on what basis, are agents B and C receiving any money?
Are you suggesting some sort of scenario where Omega gives A money iff A does the opposite of what Omega predicted A would do, and then also gives any other agent (such as B or C) money iff said other agent does the opposite of what Omega predicted A would do?
This is a strange scenario (it seems to be very different from the sort of scenario one usually encounters in such problems), but sure, let’s consider it. My question is: how is it different from “Omega doesn’t give A any money, ever (due to a deep-seated personal dislike of A). Other agents may, or may not, get money, depending on various factors (the details of which are moot)”?
This doesn’t seem to have much to do with decision theories. Maybe shminux ought to rephrase his challenge. After all—
Please propose a mechanism by which you can make an agent who enumerates the worlds seen as possible by every agent, no matter what their decision theory is, end up in a world with lower utility than some other agent.
… can be satisfied with “Omega punches A in the face, thus causing A to end up with lower utility than B, who remains un-punched”. What this tells us about decision theories, I can’t rightly see.
This is a strange scenario (it seems to be very different from the sort of scenario one usually encounters in such problems), but sure, let’s consider it. My question is: how is it different from “Omega doesn’t give A any money, ever (due to a deep-seated personal dislike of A). Other agents may, or may not, get money, depending on various factors (the details of which are moot)”?
This doesn’t seem to have much to do with decision theories.
Yes, this is correct, and is precisely the point EYNS was trying to make when they said
Intuitively, this problem is unfair to Fiona, and we should compare her performance to Carl’s not on the “act differently from Fiona” game, but on the analogous “act differently from Carl” game.
“Omega doesn’t give A any money, ever (due to a deep-seated personal dislike of A)” is a scenario that does not depend on the decision theory A uses, and hence is an intuitively “unfair” scenario to examine; it tells us nothing about the quality of the decision theory A is using, and therefore is useless to decision theorists. (However, formalizing this intuitive notion of “fairness” is difficult, which is why EYNS brought it up in the paper.)
I’m not sure why shminux seems to think that his world-counting procedure manages to avoid this kind of “unfair” punishment; the whole point of such punishment is that it is unfair, and hence unavoidable. There is no way for an agent to win if the problem setup is biased against them to start with, so I can only conclude that shminux misunderstood what EYNS was trying to say when he (shminux) wrote
I note here that simply enumerating possible worlds evades this problem as far as I can tell.
I didn’t read shminux’s post as suggesting that his scheme allows an agent to avoid, say, being punched in the face apropos of nothing. (And that’s what all the “unfair” scenarios described in the comments here boil down to!) I think we can all agree that “arbitrary face-punching by an adversary capable of punching us in the face” is not something we can avoid, no matter our decision theory, no matter how we make choices, etc.
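I am not sure how else to interpret the part of shminux’s post quoted by dxu. How do you interpret it?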
can be satisfied with “Omega punches A in the face, thus causing A to end up with lower utility than B, who remains un-punched”.
It seems to be a good summary of what dxu and Dacyn were suggesting! I think it preserves the salient features without all the fluff of copying and destroying, or having multiple agents, which makes it clear why the counterexample does not work: I said “the worlds seen as possible by every agent, no matter what their decision theory is,” and the unpunched world is not a possible one for the world enumerator in this setup.
My point was that CDT makes a suboptimal decision in Newcomb’s problem, and that FDT struggles to pick the best decision in some of these problems as well, because it gets lost in the forest of causal trees (or at least that is my impression from the EYNS paper). Once you stop worrying about causality and the agent’s ability to change the world by their actions, you are left with the simpler question “what possible world does this agent live in, and with what probability?”
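(To illustrate the flavor of that question, here is a rough sketch of what “enumerate the possible worlds and their probabilities” might look like for Newcomb’s problem. This is only a guess at the shape of the calculation, not shminux’s actual procedure; the 99% predictor accuracy and the payouts are the usual illustrative numbers, and conditioning on the agent’s action is one possible way to cash out “which world does this agent live in.”)

```python
# Possible worlds for Newcomb's problem, assuming the predictor is right
# with probability 0.99. Each entry: (action, predicted action,
# probability of that prediction given the action, payout in that world).
possible_worlds = [
    ("one-box", "one-box", 0.99, 1_000_000),
    ("one-box", "two-box", 0.01, 0),
    ("two-box", "one-box", 0.01, 1_001_000),
    ("two-box", "two-box", 0.99, 1_000),
]

def expected_payout(action):
    # Keep only the worlds in which the agent takes this action,
    # renormalise their probabilities, and average the payouts.
    worlds = [(p, u) for a, _, p, u in possible_worlds if a == action]
    total = sum(p for p, _ in worlds)
    return sum(p * u for p, u in worlds) / total

for action in ("one-box", "two-box"):
    print(action, expected_payout(action))
# one-box: $990,000   two-box: $11,000   -> the world-enumerator one-boxes
```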
A mind-reader looks to see whether this is an agent’s decision procedure, and then tortures them if it is. The point of unfair decision problems is that they are unfair.
enumerates the worlds seen as possible by every agent, no matter what their decision theory is
Can you clarify this?
One interpretation is that you’re talking about an agent who enumerates every world that any agent sees as possible. But your post further down seems to contradict this: “the unpunched world is not a possible one for the world enumerator”. And it’s not obvious to me that such an agent can exist.
Another is that the agent enumerates only the worlds that every agent sees as possible, but that agent doesn’t seem likely to get good results. And it’s not obvious to me that there are guaranteed to be any worlds at all in this intersection.
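Am I missing an interpretation?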