The backward link isn’t causal. It’s a logical/Platonic-dependency link, which is indeed how TDT handles counterfactuals (i.e., how it handles the propagation of “surgical alterations” to the decision node C).
(I refrained from doing this for the problem described in Gary’s post, since it doesn’t mention UDT at all, and therefore I’m assuming you want to find a TDT-only solution.)
Yes, I was focusing on a specific difficulty in TDT, but I certainly have no objection to bringing UDT into the thread too. (I myself haven’t yet gotten around to giving UDT the attention I think it deserves.)
By “unsolvable” I mean that you’re screwed over in final outcomes, not that TDT fails to have an output.
Oh ok. So it’s unsolvable in the same sense that “Choose red or green. Then I’ll shoot you.” is unsolvable. Sometimes choice really is futile. :) [EDIT: Oops, I probably misunderstood what you’re referring to by “screwed over”.]
The interesting part of the problem is that, whatever you decide, you deduce facts about the background such that you know that what you are doing is the wrong thing.
Yes, assuming that you’re the sort of algorithm that can (without inconsistency) know its own choice here before the choice is executed.
If you’re the sort of algorithm that may revise its intended action in response to the updated deduction, and if you have enough time left to perform the updated deduction, then the (previously) intended action may not be reliable evidence of what you will actually do, so it fails to provide sound reason for the update in the first place.
When:
D(M) = true, D(!M) = true, E = true
Omega fails.
No, but it seems that way because I neglected in my OP to supply some key details of the transparent-boxes scenario. See my new edit at the end of the OP.
In the setup in question, D goes into an infinite loop (since in the general case it must call a copy of C, but because the box is transparent, C takes as input the output of D).
No, because by stipulation here, D only simulates the hypothetical case in which the box contains $1M, which does not necessarily correspond to the output of D (see my earlier reply to JGWeissman:
http://lesswrong.com/lw/1qo/a_problem_with_timeless_decision_theory_tdt/1kpk).
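To make that concrete, here’s a minimal sketch (the function names and the particular decision rule are mine, purely illustrative) of why there’s no regress: D hands C the fixed hypothetical observation, never its own output.

```
def C(observed_box_contents):
    # Stand-in for the agent's decision computation; the rule here is illustrative.
    return "one-box" if observed_box_contents == 1_000_000 else "two-box"

def D():
    # The predictor's derivation: it simulates only the hypothetical in which the
    # box contains $1M, so it never needs to feed its own output back into C.
    return C(observed_box_contents=1_000_000)

print(D())  # terminates; no infinite loop
```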
I think this problem is based (at least in part) on an incoherence in the basic transparent box variant of Newcomb’s problem.
If the subject of the problem will two-box if he sees that the big box has the million dollars, but will one-box if he sees that the big box is empty, then there is no action Omega could take to satisfy the conditions of the problem.
The rules of the transparent-boxes problem (as specified in Good and Real) are: the predictor conducts a simulation that tentatively presumes there will be $1M in the large box, and then puts $1M in the box (for real) iff the simulation showed one-boxing. So the subject you describe gets an empty box and one-boxes, but that doesn’t violate the conditions of the problem, which do not require the empty box to be predictive of the subject’s choice.
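Here’s a minimal sketch of those rules (my own function names, illustrative only), showing that the subject described above ends up with an empty box and one-boxes without contradicting anything:

```
def predictor_fills_box(decision):
    # Simulate the agent under the tentative presumption that the box holds $1M,
    # and actually fill the box iff that simulation one-boxes.
    simulated_choice = decision(big_box=1_000_000)
    return 1_000_000 if simulated_choice == "one-box" else 0

def subject(big_box):
    # The subject described above: two-boxes on seeing $1M, one-boxes on seeing an empty box.
    return "two-box" if big_box == 1_000_000 else "one-box"

contents = predictor_fills_box(subject)  # the simulation two-boxes, so the box is left empty
choice = subject(contents)               # the subject then sees the empty box and one-boxes
print(contents, choice)                  # 0 one-box -- consistent with the rules
```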
For now, let me just reply to your incidental concluding point, because that’s brief.
I disagree that the red/green problem is unsolvable. I’d say the solution is that, with respect to the available information, both choices have equal (low) utility, so it’s simply a toss-up. A correct decision algorithm will just flip a coin or whatever.
Having done so, will a correct decision algorithm try to revise its choice in light of its (tentative) new knowledge of what its choice is? Only if it has nothing more productive to do with its remaining time.
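As a sketch of that tie-breaking behavior (the zero utilities are just placeholders for “equally bad”):

```
import random

def decide(options, expected_utility):
    # With respect to the available information both options are equally (un)attractive,
    # so the choice is a genuine toss-up: pick one of the maximizers at random.
    best = max(expected_utility(o) for o in options)
    return random.choice([o for o in options if expected_utility(o) == best])

print(decide(["red", "green"], lambda colour: 0.0))  # either answer; no point revising it
```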
Actually, you’re in a different camp than Laura: she agrees that it’s incorrect to two-box regardless of any preference you have about the specified digit of pi. :)
The easiest way to see why two-boxing is wrong is to imagine a large number of trials, with a different chooser, and a different value of i, for each trial. Suppose each chooser strongly prefers that their trial’s particular digit of pi be zero. The proportion of two-boxer simulations that end up with the digit equal to zero is no different than the proportion of one-boxer simulations that end up with the digit equal to zero (both are approximately .1). But the proportion of the one-boxer simulations that end up with an actual $1M is much higher (.9) than the proportion of two-boxer simulations that end up with an actual $1M (.1).
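Here’s a small Monte Carlo sketch of those proportions (my own code; a uniformly random digit stands in for the unknown digit of pi, and the box is filled iff (D xor E), with D = “the simulation one-boxes” and E = “the digit is zero”):

```
import random

def run_trials(one_boxer, n=100_000):
    digit_zero = got_million = 0
    for _ in range(n):
        E = (random.randrange(10) == 0)  # stand-in for "this trial's digit of pi is zero"
        D = one_boxer                    # what the simulation of this chooser outputs
        box_has_million = (D != E)       # the predictor fills the box iff (D xor E)
        digit_zero += E
        got_million += box_has_million
    return digit_zero / n, got_million / n

print(run_trials(one_boxer=True))   # roughly (0.1, 0.9)
print(run_trials(one_boxer=False))  # roughly (0.1, 0.1)
```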
Everything you just said is true.*
Everything you just said is also consistent with everything I said in my original post.
*Except for one typo: you wrote (D or E) instead of (D xor E).
If D=false and E=true and there’s $1M in the box and I two-box, then (in the particular Newcomb’s variant described above) the predictor is not wrong. The predictor correctly computed that (D xor E) is true, and set up the box accordingly, as the rules of this particular variant prescribe.
Sorry, the above post omits some background information. If E “depends on” C in the particular sense defined, then the TDT algorithm mandates that when you “surgically alter” the output of C in the factored causal graph, you must correspondingly surgically alter the output of E in the graph.
So it’s not at all a matter of any intuitive connotation of “depends on”. Rather, “depends on”, in this context, is purely a technical term that designates a particular test that the TDT algorithm performs. And the algorithm’s prescribed use of that test culminates in the algorithm making the wrong decision in the case described above (namely, it tells me to two-box when I should one-box).
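To illustrate how that plays out here, a toy rendering of the surgery (my own, heavily simplified), for the instance where the transparent box visibly holds $1M, so (D xor E) is known to be true:

```
def surgically_evaluate(action):
    D = (action == "one-box")  # the simulation's output is altered along with C
    E = not D                  # E passed the dependency test, so its node is altered too,
                               # preserving the known fact that (D xor E) is true
    box = 1_000_000 if (D != E) else 0
    return box + (1_000 if action == "two-box" else 0)

print(max(["one-box", "two-box"], key=surgically_evaluate))  # "two-box" -- the wrong answer
```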
Better now?
Hm, sorry, it’s displaying for me in the same size as the rest of the site, so I’m not sure what you’re seeing. I’ll strip the formatting and see if that helps.
Done.
[In TDT] If you desire to smoke cigarettes, this would be observed and screened off by conditioning on the fixed initial conditions of the computation—the fact that the utility function had a positive term for smoking cigarettes, would already tell you that you had the gene. (Eells’s “tickle”.) If you can’t observe your own utility function then you are actually taking a step outside the timeless decision theory as formulated.
Consider a different scenario where people with and without the gene both desire to smoke, but the gene makes that desire stronger, and the stronger it is, the more likely one is to smoke. Even when you observe your own utility function, you don’t necessarily have a clue whether the utility assigned to smoking is the level caused by the gene or else by the gene’s absence. So your observation of your utility function doesn’t necessarily help you to move away from the base-level probability of having cancer here.
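A small numerical sketch of that point (the numbers are mine, purely illustrative): even after observing your own desire level, the posterior probability of having the gene can stay close to the base rate.

```
from fractions import Fraction

p_gene = Fraction(1, 2)  # base rate of having the gene

# Desire strength 1..10: the gene shifts the distribution upward a bit,
# but the two distributions overlap heavily.
p_desire_given_gene    = {d: Fraction(d, 55) for d in range(1, 11)}
p_desire_given_no_gene = {d: Fraction(11 - d, 55) for d in range(1, 11)}

def posterior_gene(desire):
    num = p_gene * p_desire_given_gene[desire]
    den = num + (1 - p_gene) * p_desire_given_no_gene[desire]
    return num / den

print(float(posterior_gene(6)))  # ~0.545: a middling desire level barely moves the base rate
```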
Thanks, Eliezer—that’s a clear explanation of an elegant theory. So far, TDT (I haven’t looked carefully at UDT) strikes me as more promising than any other decision theory I’m aware of (including my own efforts, past and pending). Congratulations are in order!
I agree, of course, that TDT doesn’t make the A6/A7 mistake. That was just a simple illustration of the need, in counterfactual reasoning (broadly construed), to specify somehow what to hold fixed and what not to, and that different ways of doing so specify different senses of counterfactual inference (i.e., that there are different kinds of ‘if-counterfactually’). If counterfactual inference is construed a la Pearl, for example, then such inferences (causal-counterfactual) correspond to causal links (if-causally).
As you say, TDT’s utility formula doesn’t perform general logical inferences (or evidential-counterfactual inferences) from the antecedents it evaluates (i.e. the candidate outputs of the Platonic computation). Rather, the utility formula performs causal-counterfactual inferences from the set of nodes that designate the outputs of the Platonic computation, in all places where that Platonic computation is approximately physically instantiated.
However, it seems to me we can, if we wish, use TDT to define what we can call a TDT-counterfactual, which tells us what would be true ‘if-timelessly’ were a particular physical agent’s particular physical action to occur. In particular, whereas CDT says that what would be true (if-causally) consists of what’s causally downstream from that action, TDT says that what would be true (if-timelessly) consists of what’s causally downstream from the output of the suitably-specified Platonic computation that the particular physical agent approximately implements, and also what’s causally downstream from that same Platonic computation in all other places where that computation is approximately physically instantiated. (And the physical TDT agent argmaxes over the utilities of the TDT-counterfactual consequences of that agent’s candidate actions.)
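Here is a schematic sketch of that argmax as I’ve paraphrased it (the names and the toy Newcomb instance are my own, not part of any actual TDT implementation):

```
def tdt_choose(candidate_outputs, instantiation_nodes, propagate, utility):
    # For each candidate output of the Platonic computation, set that output at every
    # node where the computation is (approximately) instantiated, propagate ordinary
    # causal consequences downstream, and take the candidate with the highest utility.
    def value(output):
        world = propagate({node: output for node in instantiation_nodes})
        return utility(world)
    return max(candidate_outputs, key=value)

# Toy Newcomb instance: the computation is instantiated both in me and in the
# predictor's simulation of me, so both nodes receive the surgically-set output.
def propagate(settings):
    world = dict(settings)
    world["box"] = 1_000_000 if settings["simulation"] == "one-box" else 0
    world["payout"] = world["box"] + (1_000 if settings["me"] == "two-box" else 0)
    return world

print(tdt_choose(["one-box", "two-box"], ["me", "simulation"],
                 propagate, lambda w: w["payout"]))  # "one-box"
```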
I think there are a few reasons we might sometimes find it useful to think in terms of the TDT-counterfactual consequences of a physical agent’s actions, rather than directly in terms of the standard TDT formulation (even though they’re merely two different ways of expressing the same decision theory, unless I’ve misunderstood).
The TDT-counterfactual perspective places TDT in a common framework with other decision theories that (implicitly or explicitly) use other kinds of counterfactual reasoning, starting with a physical agent’s action as the antecedent. Then we can apply some meta-criterion to ask which of those alternative theories is correct, and why. (That was the intuition behind my MCDT proposal, although MCDT itself was hastily specified and too simpleminded to be correct.)
Plausibly, people are agents who think in terms of the counterfactual consequences of an action, rather than being hardwired to use TDT. If we are to choose to act in accordance with TDT from now on (or, equivalently, if we are to build AIs who act in accordance with TDT), we need to be persuaded that doing so is for the best (even if e.g. a Newcomb snapshot was already taken before we became persuaded). (I’m assuming here that our extant choice machinery allows us the flexibility to be persuaded about what sort of counterfactual to use; if not, alas, we can’t necessarily get there from here).
In the standard formulation of TDT, you effectively view yourself as an abstract computation with one or more approximate physical instantiations, and you ask what you (thus construed) cause (i.e. what follows causal-counterfactually). In the alternative formulation, I view myself as a particular physical agent that is among one or more approximate instantiations of an abstract computation, and I ask what follows TDT-counterfactually from what I (thus construed) choose.
The original formulation seems to require a precommitment to identify oneself with all instantiations (in the causal net) of the abstract computation (or at least seems to require that in order for us non-TDT agents to decide to emulate TDT). And that identification is indeed plausible in the case of fairly exact replication. But consider, say, a 1-shot PD game between Eliezer and me. Our mutual understanding of reflexive consistency would let us win. And I agree that we both approximately instantiate, at some level of abstraction, a common decision computation, which is what lets the TDT framework apply and lets us both win.
But (in contrast with an exact-simulation case) that common computation is at a level of abstraction that does not preserve our respective personal identities. (That’s kind of the point of the abstraction. My utility function for the game places value on Gary’s points and not Eliezer’s points; the common abstract computation lacks that bias.) So I would hesitate to identify either of us with the common abstraction. (And I see in other comments that Eliezer explicitly agrees.) Rather, I’d like to reason that if-timelessly I, Gary, choose ‘Cooperate’, then so does Eliezer. That way, “I am you as you are me” emerges as a (metaphorical) conclusion about the situation (we each have a choice about the other’s action in the game, and are effectively acting together) rather than being needed as the point of departure.
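A toy sketch of that reasoning (illustrative PD payoffs, mine): once both of our actions are tied to the output of the common computation, only the diagonal outcomes are reachable, and my own-points-only utility then favors cooperating.

```
# (points for Gary, points for Eliezer) -- illustrative numbers
payoffs = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
           ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def gary_value(output):
    # If the common computation outputs `output`, both instantiations play it,
    # so only the diagonal outcomes (C,C) and (D,D) are on the table.
    gary_points, _ = payoffs[(output, output)]
    return gary_points

print(max(["C", "D"], key=gary_value))  # "C": cooperation falls out of the shared computation
```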
Again, the foregoing is just an alternative but equivalent (unless I’ve erred) way of viewing TDT, an alternative that may be useful for some purposes.
If you could spend a day with any living person
I think you’d find me anticlimactic. :) But I do appreciate the kind words.
I agree that “choose” connotes multiple alternatives, but they’re counterfactual antecedents, and when construed as such, are not inconsistent with determinism.
I don’t know about being ontologically basic, but (what I think of as) physical/causal laws have the important property that they compactly specify the entirety of space-time (together with a specification of the initial conditions).
Just as a matter of terminology, I prefer to say that we can choose (or that we have a choice about) the output, rather than that we control it. To me, control has too strong a connotation of cause.
It’s tricky, of course, because the concepts of choice-about and causal-influence-over are so thoroughly conflated that most people will use the same word to refer to both without distinction. So my terminology suggestion is kind of like most materialists’ choice to relinquish the word soul to refer to something extraphysical, retaining consciousness to refer to the actual physical/computational process. (Causes, unlike souls, are real, but still distinct from what they’re often conflated with.)
Again, this is just terminology, nothing substantive.
EDIT: In the (usual) special case where a means-end link is causal, I agree with you that we control something that’s ultimately mathematical, even in my proposed sense of the term.
Yes, and that’s the intent in this example as well. Still, it can be useful to look at the expected distribution of outcomes over a large enough number of trials that have the same structure, in order to infer the (counterfactual) probabilities that apply to a single trial.