I’ve been tinkering with the idea of making a top level post on this issue, but figured it would get excessively downvoted. So I’ll risk it here.
For any decision theory, isn’t there some hypothetical where Omega can say, “I’ve analyzed your decision theory, and I’m giving you proposition X, such that if you act the way your decision theory believes is optimal, you will lose?” The “Omega scans your brain and tortures you if you’re too rational” would be an obvious example of this.
Designing a decision theory around any such problem seems relatively trivial. Recognizing when such a proposition is actually legitimate, on the other hand, seems virtually if not actually impossible. In other words, the evidence one would need about Omega’s predictive capacity and honesty is quite staggering. Absent that evidence, you should always two-box. The Counterfactual mugging is even more problematic; the relative chances of running into a trickster versus an honest entity in those circumstances are probably so large that, to the human mind, they may as well be infinite.
If this sense is correct, then designing an agent to be able to accommodate Newcomb’s or the Counterfactual mugging would actually be a reduction in its rationality. These events are so phenomenally unlikely to occur that actually executing the behaviour specified for them would almost certainly be a misfiring. The entity would be better off losing on the 1/3^^^3 occasions when it actually encounters Newcomb’s, and winning the remaining ~100% of the time.
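To put rough numbers on that intuition, here’s a quick sketch (Python, with entirely made-up probabilities and the conventional $1,000,000/$1,000 payoffs) comparing an agent that one-boxes whenever something looks like Newcomb’s against one that always two-boxes, over a mix of genuine Omegas and tricksters whose boxes don’t actually depend on your choice:

```python
# Rough expected-value comparison of two policies over a mixture of
# encounter types.  All numbers are illustrative placeholders.

P_GENUINE = 1e-12            # chance a Newcomb-looking offer comes from a real, honest Omega
P_TRICKSTER = 1 - P_GENUINE  # chance it's a trickster whose boxes don't depend on you

BIG, SMALL = 1_000_000, 1_000  # conventional Newcomb payoffs (assumed)

def ev_one_boxer():
    # A genuine Omega predicts the one-boxing and fills the opaque box;
    # the trickster is assumed to have left it empty.
    return P_GENUINE * BIG + P_TRICKSTER * 0

def ev_two_boxer():
    # A genuine Omega predicted two-boxing, so only the small box pays;
    # against the trickster you still pocket the small box.
    return P_GENUINE * SMALL + P_TRICKSTER * SMALL

print(ev_one_boxer(), ev_two_boxer())
# With these placeholders the two-boxer wins easily; push P_GENUINE above
# SMALL / BIG (0.001 here) and the ranking flips.
```

The numbers are not the point; the point is that whether building in one-boxing is worth it comes down to whether you think the chance of a genuine, honest Omega is anywhere near that break-even ratio.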
In other words, for want of a better term, much of the discussion of decision theory seems masturbatory. You have an existing system. Someone thinks of how to create a problem for your existing system. Someone solves said problem. Someone thinks of a new problem. Repeat ad infinitum. The marginal cases of something like Newcomb’s so thoroughly lack any practical consequence as to be wholly irrelevant for any actual entity that needs to make decisions.
I’m entirely open to the idea that I’m wrong and that Newcomblike problems occur, or that maybe there is some uber-decision theory that can never be broken by Omega. But if neither of those conditions are satisfied, this seems like something of a waste of mental effort. Of course, if it’s fun to discuss despite being essentially useless, that’s cool. It’s just best not to pretend otherwise.
I think of Omega as a simplified stand-in for other people.
The part about Omega being omniscient and knowably trustworthy isn’t solved. But I think the problem of Omega rewarding bizarre irrational behaviour on your part mostly goes away if you assume it’s fairly human-like, perhaps following UDT or some other decision theory itself. A human-like motivation for posing Newcomb’s problem could be that it wants one of the boxes kept closed for some reason, and will reward you for keeping it closed. To make it fit this explanation, Omega should say it doesn’t want you to open the box, and preferably give a reason.
Kinds of things the human-like Omega might do:
trust you or not based on its prediction of your behaviour.
prefer you to be rewarded if you act how it wants.
prefer you be punished if you harm it.
tell you what it wants of you.
But it should be less likely to reward you for acting irrationally for no reason, or for doing what it wants you not to do.
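A toy version of that kind of Omega might look like this (invented payoffs, nothing canonical): it tells you what it wants, forms a prediction of what you’ll do, and rewards or punishes you based on what you actually do.

```python
# Toy model of a "human-like" Omega: it states a request, forms a prediction
# of your behaviour, and rewards or punishes based on what you actually do.
# Payoff numbers are invented for illustration.

def human_like_omega(request, predicted_action, your_action):
    trusts_you = predicted_action == "comply"    # trust based on its prediction of you
    if your_action == "comply":
        return 100 if trusts_you else 10         # rewarded for acting how it wants
    if your_action == "harm":
        return -100                              # punished for harming it
    return 0                                     # no reward for pointless "irrational" moves

# Omega says "leave the box closed", predicts compliance, and you comply.
print(human_like_omega("leave the box closed", "comply", "comply"))  # 100
```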
For any decision theory, isn’t there some hypothetical where Omega can say, “I’ve analyzed your decision theory, and I’m giving you proposition X, such that if you act the way your decision theory believes is optimal, you will lose?” The “Omega scans your brain and tortures you if you’re too rational” would be an obvious example of this.
This isn’t obvious. In particular, note that your “obvious example” violates the basic assumption all these attempts at a decision theory are using, that the payoff depends only on your choice and not how you arrived at it. Of course this is not necessarily a realistic assumption, but that is, if I’m not mistaken, the problem they’re trying to solve.
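To make that assumption concrete (just a sketch, not anyone’s official formalism): the problems these theories are built for hand out payoffs as a function of the chosen action alone, whereas the brain-scanning example needs a payoff that gets to inspect the decision procedure itself.

```python
# Sketch of the distinction (not anyone's official formalism).  The decision
# problems these theories target assume payoffs of the first kind; "tortured
# for being too rational" needs the second kind.

from typing import Callable

Action = str
Procedure = Callable[[str], Action]   # maps a described scenario to an action

# Choice-based: the payoff is a function of the action taken, nothing else.
def choice_payoff(action: Action) -> float:
    return {"one-box": 1_000_000, "two-box": 1_000}.get(action, 0)

# Procedure-based: the payoff inspects how you decide, not just what you chose.
def procedure_payoff(procedure: Procedure, action: Action) -> float:
    too_rational = all(procedure(s) == "two-box" for s in ("newcomb", "variant"))
    return -1_000_000 if too_rational else choice_payoff(action)
```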
This isn’t obvious. In particular, note that your “obvious example” violates the basic assumption all these attempts at a decision theory are using, that the payoff depends only on your choice and not how you arrived at it.
Omega simulates you in a variety of scenarios. If you consistently make rational decisions he tortures you.
That does make it somewhat more useful, if that’s the constraint under which it’s operating. It still strikes me as probable that, insofar as decision theory A+ makes decisions that theory A- does not, there must be some way to reward A- and punish A+. I may well be wrong about this. The other flaw, namely that actual decision makers do not encounter omniscient, unwaveringly honest entities with entirely inscrutable motives, still seems to render the pursuit futile. It’s decidedly less futile if Omega is constrained to outcome-based reward/punishment.
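The construction I have in mind is roughly this (hypothetical scenario names and procedures, not a worked-out theorem): wherever A+ and A- pick different actions, an adversary can define a payoff for that scenario, depending only on the action taken, that favours A-’s choice.

```python
# Sketch: if two decision procedures ever pick different actions, an adversary
# can build a payoff that depends only on the chosen action and still favours
# one procedure over the other.  Scenario names and procedures are hypothetical.

def adversarial_payoff(a_plus, a_minus, scenarios):
    """Return (scenario, payoff_fn) rewarding a_minus's choice and punishing
    a_plus's, or None if the two procedures agree on every scenario."""
    for s in scenarios:
        if a_plus(s) != a_minus(s):
            favoured = a_minus(s)
            return s, (lambda action: 1 if action == favoured else -1)
    return None  # identical behaviour: no choice-only payoff can separate them

# Hypothetical example: the two procedures disagree only on a Newcomb-like case.
a_plus = lambda s: "one-box" if s == "newcomb" else "cooperate"
a_minus = lambda s: "two-box" if s == "newcomb" else "cooperate"
scenario, payoff = adversarial_payoff(a_plus, a_minus, ["newcomb", "prisoners"])
print(scenario, payoff(a_plus(scenario)), payoff(a_minus(scenario)))  # newcomb -1 1
```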
Omega simulates you in a variety of scenarios. If you consistently make rational decisions he tortures you.
My reply to this was going to be essentially the same as my comment on bentarm’s thread, so I’ll just point you there.