Psy-Kosh comments on A problem with Timeless Decision Theory (TDT)

Psy-Kosh 5 Feb 2010 5:45 UTC
3 points
Actually, one can do even better than that. As (I think), Eliezer implied, the key is Omega saying those words. (about the simulated you getting it wrong)

Did the simulated version receive that message too? (if yes, and if we assume Omega is always truthful, this implies an infinite recursion of simulations… let us not go invoking infinite nested computations willy-nilly.) If there was only a single layer of simulation, them Omega either gave that statement as input to it or did not. If yes, Omega is untruthful, which throws pretty much all of the standard reasoning about Omega out the window and we can simply take into account the possibility that Omega is blatantly lying.

If Omega is truthful, even to the simulations, then the simulation would not have received that prefix message. In which case you are in a different state than simulated you was. So all you have to do is make the decision opposite to what you would have done if you hadn’t heard that particular extra message. This may be guessed by simply one iteration of “I automatically want to guess color1… but wait, simulated me got it wrong, so I’ll guess color2 instead” since “actual” you has the knowledge that the previous version of you got it wrong.

If Omega lies to simulations and tells truth to “actuals” (and can somehow simulate without the simulation being conscious, so there’s no ambiguity about which you are, yet still be accurate… (am skeptical but confused on that point)), then we have an issue. But then it would require Omega to take a risk: if when telling the lie to the simulation, the simulation then gets it right, then what does Omega tell “actual” you?

(“actual” in quotes because I honestly don’t know whether or not one could be modeled with sufficient accuracy, however indirectly, without the model being conscious. I’m actually kind of skeptical of the prospect of a perfectly accurate model not being conscious, although a model that can determine some properties/approximations of the person without being conscious is probably possible)

TL;DR: even without access to coinflips beyond Omega’s predictive power, one might be able to do better in the red/green problem simply by noting that the nature of the additional information Omega provided you opens up the possibility that Omega’s simulation of you was a bit different than the actual situation you are in.
- Eliezer Yudkowsky 5 Feb 2010 6:29 UTC
  4 points
  Parent
  Omega can use the following algorithm:
  
  “Simulate telling the human that they got the answer wrong. If in this case they get the answer wrong, actually tell them that they get the answer wrong. Otherwise say nothing.”
  
  This ought to make it relatively easy for Omega to truthfully put you in a “you’re screwed” situation a fair amount of the time. Albeit, if you know that this is Omega’s procedure, the rest of the time you should figure out what you would have done if Omega said “you’re wrong” and then do that.
  
  This kind of thinking is, I think, outside the domain of current TDT, because it involves strategies that depend on actions you would have taken in counterfactual branches. I think it may even be outside the domain of current UDT for the same reason.
  What links here?
  - JGWeissman's comment on My Fundamental Question About Omega by MrHen (10 Feb 2010 18:23 UTC; 0 points)
  - Wei Dai 5 Feb 2010 11:44 UTC
    2 points
    Parent
    I don’t see why this is outside of UDT’s domain. It seems straightforward to model and solve the decision problem in UDT1. Here’s the world program:
    
    def P(color): outcome = "die" if Omega_Predict(S, "you're wrong") == color: if S("") == color: outcome = "live" else: if S("you're wrong") == color: outcome = "live"
    Assuming a preference to maximize the occurrence of outcome=”live” averaged over P(“green”) and P(“red”), UDT1 would conclude that the optimal S returns a constant, either “green” or “red”, and do that.
    
    BTW, do you find this “world program” style analysis useful? I don’t want to over-do them and get people annoyed. (I refrained from doing this for the problem described in Gary’s post, since it doesn’t mention UDT at all, and therefore I’m assuming you want to find a TDT-only solution.)
    What links here?
    Explicit Optimization of Global Strategy (Fixing a Bug in UDT1) by Wei Dai (19 Feb 2010 1:30 UTC; 55 points)
    Wei Dai's comment on UDT1.01: The Story So Far (1/10) by Diffractor (28 Mar 2024 0:52 UTC; 4 points)
    - Gary_Drescher 5 Feb 2010 15:07 UTC
      2 points
      Parent
      
      (I refrained from doing this for the problem described in Gary’s post, since it doesn’t mention UDT at all, and therefore I’m assuming you want to find a TDT-only solution.)
      
      Yes, I was focusing on a specific difficulty in TDT, But I certainly have no objection to bringing UDT into the thread too. (I myself haven’t yet gotten around to giving UDT the attention I think it deserves.)
    - JGWeissman 5 Feb 2010 18:01 UTC
      0 points
      Parent
      The world program I would use to model this scenario is:
      
      def P(color): if Omega_Predict(S, "you're wrong") == color: outcome = "die" else: outcome = "live"
      The else branch seems unreachable, given color = S(“your’e wrong) and the usual assumptions about Omega.
      
      I don’t understand what your nested if statements are modeling.
      - Wei Dai 5 Feb 2010 19:18 UTC
        1 point
        Parent
        I was modeling what Eliezer wrote in the comment that I was responding to:
        
        “Simulate telling the human that they got the answer wrong. If in this case they get the answer wrong, actually tell them that they get the answer wrong. Otherwise say nothing.”
        
        BTW, if you add a tab in front of each line of your program listing, it will get formatted correctly.
        JGWeissman 5 Feb 2010 19:28 UTC
        1 point
        Parent
        Ah, I see. Then it seems that you are really solving the problem of minimizing the probability that Omega presents this problem in the first place.
        
        What about the scenario, where Omega uses the strategy: Simulate telling the human that they got the answer wrong. Define the resulting answer as wrong, and the other as right.
        
        This is what I modeled.
        
        BTW, if you add a tab in front of each line of your program listing, it will get formatted correctly.
        
        Thanks. Is there an easier way to get a tab into the comment input box than copy paste from an outside editor?
        Wei Dai 5 Feb 2010 19:51 UTC
        1 point
        Parent
        
        What about the scenario, where Omega uses the strategy: Simulate telling the human that they got the answer wrong. Define the resulting answer as wrong, and the other as right.
        
        In that case it should be modeled like this:
        
        def P(color): wrong_color = Omega_Predict(S, "you're wrong") if S("you're wrong") == wrong_color: outcome = "die" else: outcome = "live"
        
        Thanks. Is there an easier way to get a tab into the comment input box than copy paste from an outside editor?
        
        Not that I’m aware of.
        Tyrrell_McAllister 5 Feb 2010 19:58 UTC
        3 points
        Parent
        
        Is there an easier way to get a tab into the comment input box than copy paste from an outside editor?
        
        Not that I’m aware of.
        
        Are you guys talking about getting code to indent properly? You can do that by typing four spaces in front of each line. Each quadruple of spaces produces a further indentation.
        
        http://daringfireball.net/projects/markdown/syntax#precode
        Wei Dai 6 Feb 2010 7:06 UTC
        2 points
        Parent
        
        Are you guys talking about getting code to indent properly? You can do that by typing four spaces in front of each line.
        
        Spaces? Think of the wasted negentropy! I say we make tab the official Less Wrong indention symbol, and kick out anyone who disagrees. Who’s with me? :-)
        JGWeissman 5 Feb 2010 20:06 UTC
        0 points
        Parent
        Hm, I think the difference in our model programs indicates something that I don’t understand about UDT, like a wrong assumption that justified an optimization. But it seems they both produce the same result for P(S(“you’re wrong”)), which is outcome=”die” for all S.
        
        Do you agree that this problem is, and should remain, unsolvable? (I understand “should remain unsolvable” to mean that any supposed solution must represent some sort of confusion about the problem.)
        Wei Dai 5 Feb 2010 20:23 UTC
        0 points
        Parent
        The input to P is supposed to contain the physical randomness in the problem, so P(S(“you’re wrong”)) doesn’t make sense to me. The idea is that both P(“green”) and P(“red”) get run, and we can think of them as different universes in a multiverse. Actually in this case I should have wrote “def P():” since there is no random correct color.
        
        wrong assumption that justified an optimization
        
        I’m not quite sure what you mean here, but in general I suggest just translating the decision problem directly into a world program without trying to optimize it.
        
        Do you agree that this problem is, and should remain, unsolvable? (I understand “should remain unsolvable” to mean that any supposed solution must represent some sort of confusion about the problem.)
        
        No, like I said, it seems pretty straightforward to solve in UDT. It’s just that even in the optimal solution you still die.
        JGWeissman 5 Feb 2010 21:18 UTC
        0 points
        Parent
        
        The input to P is supposed to contain the physical randomness in the problem, so P(S(“you’re wrong”)) doesn’t make sense to me. The idea is that both P(“green”) and P(“red”) get run, and we can think of them as different universes in a multiverse. Actually in this case I should have wrote “def P():” since there is no random correct color.
        
        Ok, now I understood why you wrote your program the way you did.
        
        It’s just that even in the optimal solution you still die.
        
        By solve, I meant find a way to win. I think that after getting past different word use, we agree on the nature of the problem.
  - Psy-Kosh 6 Feb 2010 6:24 UTC
    0 points
    Parent
    Fair enough.
    
    I’m not sure the algorithm you describe here is necessarily outside current TDT though. The counterfactual still corresponds to an actual thing Omega simulated. It’d be more like this: Omega did not add the “you are wrong” prefix. Therefore, conditioning on the idea that Omega always tries simulating with that prefix and only states the prefix if I (or whoever Omega is offering the challenge to) was wrong in that simulation, the simulation in question then did not produce the wrong answer.
    
    Therefore a sufficient property for a good answer (one with higher expected utility) is that it should have the same output as that simulation. Therefore determine what that output was...
    
    ie, TDT shouldn’t have much more problem (in principle) with that than with being told that it needs to guess the Nth digit of Pi. If possible, it would simply compute the Nth digit of Pi. In this case, it has to simply compute the outcome of a certain different algorithm which happens to be equivalent to its own decision algorithm when faced with a certain situation. I don’t THINK this would be inherently outside of current TDT as I understand it
    
    I may be completely wrong on this, though, but that’s the way it seems to me.
    
    As far as stuff like the problem in the OP, I suspect though that the Right Way for dealing with things analogous to counterfactual mugging (and extended to the problem in the OP) and such amounts to a very general precommitment… Or a retroactive precommitment.
    
    My thinking here is rather fuzzy. I do suspect though that the Right Way probably looks something like the the TDT, in advance, doing a very general precommitment to be the sort of being that tends to have high expected utility when faced with counterfactual muggers and whatnot… (Or retroactively deciding to be the sort of being that effectively has the logical implication of being mathematically “precommited” to be such.)