Thank you—Yes, that is what I meant by recursion, and your second attempt seems to go in the right direction to answer my concerns, but I’ll have to think about this for a while to see if I can be convinced.
As for the G-Formula: I don’t think there is free will in it either, just that in contrast with UDT/TDT, it is not inconsistent with my concept of free will.
As an interesting anecdote, I am a Teaching Assistant for Jamie (who came up with the G-Formula), so I have heard him lecture on it several times now. The last couple of years he brought up experiments that seemingly provide evidence against free will and promised to discuss the implications for his theory. Unfortunately, both years he ran out of time before he got around to it. I should ask him the next time I meet him.
re: free will, this is one area where Jamie and Judea agree, it seems.
I think one thing I personally (and probably others) find very confusing about UDT is reconciling the picture with temporal constraints on causality: nothing should create physical retrocausality. Here is my posited sequence of events.
step 1. You are an agent with source code read/write access. You suspect there will be (in the future) Omegas in your environment, posing tricky problems. At this point (step 1), you realize you should “preprocess” your own source code in such a way as to maximize expected utility in such problems.
That is, for all causal graphs (possibly with Omega causal pathways), you find where nodes for [my source code goes here] are, and you “pick the optimal treatment regime”.
step 2. Omega scans your source code, and puts things in boxes based on examining this code, or simulating it.
step 3. Omega gives you boxes, with stuff already in them. You already preprocessed what to do, so you one box immediately and walk away with a million.
Given that Omega can scan your source, and given that you can credibly rewrite your own decision-making source code, there is nothing exotic in this sequence of steps; in particular, there is no retrocausality anywhere. It is just that there are some constraints (what people here call “logical counterfactuals”) that force the output of Omega’s sim at step 2 and your output at step 3 to coincide. This constraint is what led you to preprocess to one box in step 1, by drawing an “exotic causal graph” in which Omega’s sim creates an additional causal link that seemingly violates “no retrocausality.”
The “decision” is in step 1. Had you counterfactually decided not to preprocess there, or to preprocess to something else, you would walk away poorer in step 3. There is this element to UDT of “preprocessing decisions in advance.” It seems all real choice, that is, the examining of alternatives and picking one wisely, happens there.
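To make the ordering concrete, here is a minimal toy sketch of those three steps in Python; the string labels, the payoff numbers, and the idea of “scanning the source” by simply calling the agent’s current policy function are illustrative simplifications of my own, not part of the problem statement.

```python
# Toy model of the three steps: the agent rewrites its decision rule in
# advance, Omega inspects that rule, then the boxes are handed over.

def default_policy(problem):
    # The rule the agent happens to start with: grab everything on the table.
    return "two-box"

def one_box_policy(problem):
    # The "preprocessed" rule: one-box on anything that looks Newcomblike.
    return "one-box" if problem == "newcomb" else "two-box"

# Step 1: anticipating Omegas, the agent overwrites its own decision rule.
agent_policy = default_policy
agent_policy = one_box_policy

# Step 2: Omega "scans the source" by running the current rule on the Newcomb
# input and fills the opaque box accordingly.
opaque_box = 1_000_000 if agent_policy("newcomb") == "one-box" else 0

# Step 3: the boxes are presented; the agent acts on its already-fixed rule.
choice = agent_policy("newcomb")
payoff = opaque_box if choice == "one-box" else opaque_box + 1_000
print(choice, payoff)  # -> one-box 1000000
```

The “logical counterfactual” constraint is visible here as the fact that step 2 and step 3 invoke the very same function, so their outputs cannot come apart, and nothing flows backwards in time.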
step 1. You are an agent with source code read/write access. You suspect there will be (in the future) Omegas in your environment, posing tricky problems. At this point (step 1), you realize you should “preprocess” your own source code in such a way as to maximize expected utility in such problems.
This is closer to describing the self-modifying CDT approach. One of the motivations for development of TDT and UDT is that you don’t necessarily get an opportunity to do such self-modification beforehand, let alone to compute the optimal decisions for all possible scenarios you think might occur.
So the idea of UDT is that the design of the code should already suffice to guarantee that if you end up in a Newcomblike situation, you behave “as if” you did have the opportunity to do whatever precommitment would have been useful. When prompted for a decision, UDT asks “what is the (fixed) optimal conditional strategy?” and outputs the result of applying that strategy to its current state of knowledge.
That is, for all causal graphs (possibly with Omega causal pathways), you find where nodes for [my source code goes here] are, and you “pick the optimal treatment regime”.
Basically this, except there’s no need to actually do it beforehand.
If you like, you can consider the UDT agent’s code itself to be the output of such “preprocessing”… except that there is no real pre-computation required, apart from giving the UDT agent a realistic prior.
Basically this, except there’s no need to actually do it beforehand.
Actually, no. To implement things correctly, UDT needs to determine its entire strategy all at once. It cannot decide whether to one-box or two-box in Newcomb just by considering the Newcomb that it is currently dealing with. It must also consider all possible hypothetical scenarios where any other agent’s action depends on whether or not UDT one-boxes.
Furthermore, UDT cannot decide what it does in Newcomb independently of what it does in the Counterfactual Mugging, because some hypothetical entity might give it rewards based on some combination of the two behaviors. UDT needs to compute its entire strategy (i.e. its response to all possible scenarios) all at the same time before it can determine what it should do in any particular situation. [OK, not quite true. It might be able to prove that, whatever the optimal strategy is, it involves doing X in situation Y, without actually determining the optimal strategy. Then again, this seems really hard, since doing almost anything directly from Kolmogorov priors is basically impossible.]
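As a toy illustration of why the scenarios cannot be optimized separately, here is a sketch in Python; the three scenarios, their weights, and their payoffs are invented placeholders (a real UDT agent would be working from something like a universal prior, not a three-entry table):

```python
from itertools import product

# Two "situations" the agent might face, each with two possible responses.
newcomb_options = ["one-box", "two-box"]
mugging_options = ["pay", "refuse"]

# Invented payoff functions, each depending on the agent's *whole* strategy.
def newcomb_payoff(strategy):
    return 1_000_000 if strategy["newcomb"] == "one-box" else 1_000

def mugging_payoff(strategy):
    # Paying the $100 when asked buys $10,000 in the other coin-flip branch.
    return (10_000 - 100) / 2 if strategy["mugging"] == "pay" else 0

def combiner_payoff(strategy):
    # A hypothetical entity that rewards one particular combination only.
    return 50_000 if (strategy["newcomb"], strategy["mugging"]) == ("one-box", "pay") else 0

scenarios = [(0.5, newcomb_payoff), (0.4, mugging_payoff), (0.1, combiner_payoff)]

best = max(
    ({"newcomb": n, "mugging": m} for n, m in product(newcomb_options, mugging_options)),
    key=lambda s: sum(weight * payoff(s) for weight, payoff in scenarios),
)
print(best)  # the whole strategy is chosen jointly, then read off per situation
```

The maximization runs over whole strategies (pairs of answers), and the combiner scenario couples the two coordinates, so neither answer can be fixed in isolation.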
To implement things correctly, UDT needs to determine its entire strategy all at once.
Conceptually, yes. The point is that you don’t need to actually literally explicitly compute your entire strategy at t=-∞. All you have to do is prove a particular property of the strategy (namely, its action in situation Y) at the time when you are asked for a decision.
Obviously, like every computational activity ever, you must still make approximations, because it is usually infeasible to make inferences over the entire Tegmark IV multiverse when you need to make a decision. An example of such an approximation would be neglecting the measure of “entities that give it rewards based on some combination of [Newcomb’s and the Counterfactual Mugging]” in many situations, because I expect such things to be rare (significantly rarer than Newcomb’s and the Counterfactual Mugging themselves).
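A minimal sketch of that kind of approximation, with a made-up cutoff and a made-up hypothesis table: drop the hypotheses whose prior weight is negligible, renormalize, and only then do the (now much smaller) strategy search.

```python
def prune_hypotheses(prior, cutoff=1e-6):
    """Drop hypotheses whose weight is below `cutoff`, then renormalize.
    `prior` maps hypothesis labels to weights; all numbers here are made up."""
    kept = {h: w for h, w in prior.items() if w >= cutoff}
    total = sum(kept.values())
    return {h: w / total for h, w in kept.items()}

prior = {"plain Newcomb": 0.01,
         "plain Counterfactual Mugging": 0.005,
         "rewards a combination of the two": 1e-9}
print(prune_hypotheses(prior))  # the combination-rewarder gets neglected
```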
One of the motivations for development of TDT and UDT is that you don’t necessarily get an opportunity to do such self-modification beforehand,
Basically this, except there’s no need to actually do it beforehand.
Consider the set of all possible source codes an agent might have. This set is partitioned in two: those on which Omega rewards you (where you both one box, and your source code is such that Omega can tell you will), and those on which Omega punishes you (all others). Call the former set A, and the latter set B.
Agents are not guaranteed to start with a source code in set A; some start with source code in B. (Maybe they are classically trained causal decision theorists? Maybe they are skeptical about UDT? Maybe their programmers were careless? Who knows!) The point is, there comes a time in an agent’s life when it needs to grow up and move its source code to set A. Maybe it does not immediately self-modify to directly do UDTish things on Newcomb-like causal graphs; maybe it self-modifies to self-modify before being asked to one box.
But it is crucial for the agent to move itself from set B to set A at some point before Omega shows up. This is what I mean by step 1.
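In toy code, the A/B partition and the “move yourself into A before Omega shows up” step might look like the following; representing a source code as a bare Python function, and testing membership in A by just calling it, is of course a drastic simplification of what Omega actually has to verify.

```python
# Set A: code that one-boxes on the Newcomb input (transparently enough that
# Omega can tell).  Set B: everything else.
def in_set_A(source_code):
    return source_code("newcomb") == "one-box"

def cdt_code(problem):
    # A classically trained (or carelessly programmed) starting point: set B.
    return "two-box"

def udt_ish_code(problem):
    return "one-box" if problem == "newcomb" else "do whatever is optimal"

my_code = cdt_code
assert not in_set_A(my_code)

# "Step 1": at some point before Omega shows up, rewrite yourself into set A.
my_code = udt_ish_code
assert in_set_A(my_code)
```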
step 1. You are an agent with source code read/write access. You suspect there will be (in the future) Omegas in your environment, posing tricky problems. At this point (step 1), you realize you should “preprocess” your own source code in such a way as to maximize expected utility in such problems.
That is, for all causal graphs (possibly with Omega causal pathways), you find where nodes for [my source code goes here] are, and you “pick the optimal treatment regime”.
I think what confuses me is that if we want the logical connections to hold (between my decision and the decision of another agent running the same source code), it is necessary that when he preprocesses his source code he will deterministically make the same choice as me. Which means that my decision about how to preprocess has already been made by some deeper part of my source code.
My understanding of the statement of Newcomb is that Omega puts things in boxes based only on what your source code says you will do when faced with input that looks like Newcomb’s problem. Since the agent already preprocessed the source code (possibly using other complicated bits of its own source code) to one box on Newcomb, Omega will populate the boxes based on that. Omega does not need to deal with any other part of the agent’s source code, including some unspecified complicated part that dealt with preprocessing and rewriting, except to prove to itself that one boxing will happen.
All that matters is that the code currently has the property that IF it sees the Newcomb input, THEN it will one box.
An Omega that examines the agent’s code before the agent has preprocessed it will also put the million dollars in, if it can prove the agent will self-modify to one-box before choosing a box.
My understanding of the statement of Newcomb is that Omega puts things in boxes based only on what your source code says you will do when faced with input that looks like Newcomb’s problem.
Phrasing it in terms of source code makes it more obvious that this is equivalent to expecting Omega to be able to solve the halting problem.
This is fighting the hypothetical; Omega can say it will only put a million in if it can find a proof of you one-boxing quickly enough.
If Omega only puts the million in if it finds a proof fast enough, it is then possible that you will one-box and not get the million.
(And saying “there isn’t any such Omega” may be fighting the hypothetical. Saying there can’t in principle be such an Omega is not.)
Yes, it’s possible, and it serves you right for trying to be clever. Solving the halting problem isn’t actually hard for a large class of programs, including the usual case for an agent in a typical decision problem (i.e. those that in fact do halt quickly enough to make an actual decision about the boxes in less than a day). If you try to deliberately write a very hard-to-predict program, then of course Omega takes away the money in retaliation, just like with the other attempts to “trick” Omega by acting randomly or looking inside the boxes with x-rays.
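Here is a sketch of such a budget-limited Omega; modelling the agent’s deliberation as a Python generator and counting one “step” per yield is my own simplification, standing in for a real bounded proof search or metered interpreter.

```python
# Omega with a bounded simulation budget: it steps through the agent's
# deliberation, but gives up (and leaves the box empty) after max_steps.

def bounded_predict(agent, max_steps=1000):
    """Run the agent's deliberation for at most max_steps "steps"."""
    gen = agent()
    for _ in range(max_steps):
        try:
            next(gen)              # one step of deliberation
        except StopIteration as done:
            return done.value      # the agent finished and returned a choice
    return None                    # budget exhausted: treat as unpredictable

def prompt_one_boxer():
    # Halts immediately; easy for Omega to verify.
    return "one-box"
    yield  # (never reached; only makes this function a generator)

def omega_simulator():
    # Tries to simulate Omega simulating it... never converges.
    while True:
        yield

for agent in (prompt_one_boxer, omega_simulator):
    prediction = bounded_predict(agent)
    opaque_box = 1_000_000 if prediction == "one-box" else 0
    print(agent.__name__, opaque_box)
```

An agent that halts promptly gets the million; an agent that tries to simulate Omega blows the budget and gets an empty box, just like the other attempts at cleverness mentioned above.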
The problem requires that Omega always be able to figure out what you will do. If Omega can only figure out what you will do under a limited set of circumstances, you’ve changed one of the fundamental constraints of the problem.
You seem to be thinking of this as “the only time someone won’t come to a decision fast enough is if they deliberately stall”, which is sort of the reverse of fighting the hypothetical—you’re deciding that an objection can’t apply because the objection applies to an unlikely situation.
Suppose that, in order to decide what to do, I simulate Omega in my head as one of the steps of the process. That is not intentionally delaying, but it could still run into halting-problem considerations. Or do you just say that Omega doesn’t give me the money if I try to simulate him?
Usually, in the thought experiment, we assume that Omega has enough computation power to simulate the agent, but that the agent does not have enough computation power to compute Omega. We usually further assume that the agent halts and that Omega is a perfect predictor. However, these are expositional simplifications, and none of these assumptions are necessary in order to put the agent into a Newcomblike scenario.
For example, in the game nshepperd is describing (where Omega plays Newcomb’s problem, but only puts the money in the box if it has very high confidence that you will one-box), if you try to simulate Omega, you won’t get the money. You’re still welcome to simulate Omega, but while you’re doing that, I’ll be walking away with a million dollars and you’ll be spending lots of money on computing resources.
No one’s saying you can’t, they’re just saying that if you find yourself in a situation where someone is predicting you and rewarding you for obviously acting like they want you to, and you know this, then it behooves you to obviously act like they want you to.
Or to put it another way, consider a game where Omega is only a pretty good predictor, who only puts the money in the box if it predicts that you one-box unconditionally (e.g. without using a source of randomness), and whose predictions are correct 99% of the time. Omega here doesn’t have any perfect knowledge, and we’re not necessarily assuming that anyone has superpowers, but I’d still one-box.
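The arithmetic behind that preference, using the usual $1,000,000 / $1,000 payoffs and the stated 99% accuracy:

```python
accuracy = 0.99
big, small = 1_000_000, 1_000

# If you one-box: with probability 0.99 Omega predicted it and filled the box.
ev_one_box = accuracy * big + (1 - accuracy) * 0
# If you two-box: with probability 0.99 Omega predicted it and left it empty.
ev_two_box = (1 - accuracy) * (big + small) + accuracy * small
print(ev_one_box, ev_two_box)  # 990000.0 vs 11000.0
```

One-boxing has an expected value of $990,000 against $11,000 for two-boxing, so the predictor does not need to be anywhere near perfect for one-boxing to come out ahead.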
Or if you want to see a more realistic problem (where the predictor has only human-level accuracy) then check out Hintze’s formulation of Parfit’s Hitchhiker (though be warned, I’m pretty sure he’s wrong about TDT succeeding on this formulation of Parfit’s Hitchhiker. UDT succeeds on this problem, but TDT would fail.)