Wei Dai comments on Ingredients of Timeless Decision Theory

Wei Dai 19 Aug 2009 12:42 UTC
9 points

This is another open problem—“who acts first” in timeless negotiations.

You’re right, I failed to realize that with timeless agents, we can’t do backwards induction using the physical order of decisions. We need some notion of the logical order of decisions.

Here’s an idea. The logical order of decisions is related to simulation ability. Suppose A can simulate B, meaning it has trustworthy information about B’s source code and has sufficient computing power to fully simulate B or sufficient intelligence to analyze B using reliable shortcuts, but B can’t simulate A. Then the logical order of decisions is B followed by A, because when B makes his decision, he can treat A’s decision as conditional on his. But when A makes her decision, she has to take B’s decision as a given.

Does that make sense?
- Eliezer Yudkowsky 19 Aug 2009 15:05 UTC
  12 points
  Parent
  Moving second is a disadvantage (at least it seems to always work out that way, counterexamples requested if you can find them) and A can always use less computing power. Rational agents should not regret having more computing power (because they can always use less) or more knowledge (because they can always implement the same strategy they would use with less knowledge) - this sort of thing is a sure sign of reflective inconsistency.
  
  To see why moving logically second is a disadvantage, consider that it lets an opponent playing Chicken always toss their steering wheel out the window and get away with it.
  
  That both players desire to move “logically first” argues strongly that neither one will; that the resolution here does not involve any particular fixed global logical order of decisions.
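The Chicken argument above can be made concrete with a toy payoff table. This is only a sketch: the payoff numbers are assumptions chosen to have the usual Chicken ordering (they do not come from the thread), and `best_response` is a hypothetical helper. Once one driver has visibly committed to going straight, the other driver's best reply is to swerve, so the committed driver collects the better payoff.

```python
# Row player's payoff in Chicken, indexed as PAYOFF[my_move][their_move].
# The numbers are illustrative assumptions, not taken from the discussion.
PAYOFF = {
    "swerve": {"swerve": 0, "straight": -1},
    "straight": {"swerve": 1, "straight": -10},
}


def best_response(their_move):
    """Return the move that maximizes my payoff against a fixed opponent move."""
    return max(PAYOFF, key=lambda my_move: PAYOFF[my_move][their_move])


# The opponent has thrown the steering wheel out: their move is fixed.
committed = "straight"
reply = best_response(committed)
print(reply, PAYOFF[reply][committed], PAYOFF[committed][reply])
```

With these assumed numbers the responder swerves and ends up with -1 while the committed driver gets 1, which is the sense in which moving "logically second" is costly here.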
  
  (I should comment in the future about the possibility that bio-values-derived civs, by virtue of having evolved to be crazy, can succeed in moving logically first using crazy reasoning, but that would be a whole ’nother story, and of course also falls into the “Way the fuck too dangerous to try in real life” category relative to my present knowledge.)
  
  - rwallace 19 Aug 2009 19:33 UTC
    4 points
    Parent
    Being logically second only keeps being a disadvantage because examples keep being chosen to be of the kind that make it so.
    
    One category of counterexample comes from warfare, where if you know what the enemy will do and he doesn’t know what you will do, you have the upper hand. (The logical versus temporal distinction is clear here: being temporally the first to reach an objective can be a big advantage.)
    
    Another counterexample is in negotiation where a buyer and seller are both uncertain about fair market price; each may prefer the other to be first to suggest a price. (In practice this is often resolved by the party with more knowledge, or more at stake, or both—usually the seller—being first to suggest a price.)
    - Wei Dai 20 Aug 2009 0:18 UTC
      3 points
      Parent
      
      Being logically second only keeps being a disadvantage because examples keep being chosen to be of the kind that make it so.
      
      You’re right. Rock-paper-scissors is another counter-example. In these cases, the relationship between the logical order of moves and simulation ability seems pretty obvious and intuitive.
      - Eliezer Yudkowsky 20 Aug 2009 0:19 UTC
        4 points
        Parent
        Except that the analogy to rock-paper-scissors would be that I get to move logically first by deciding my conditional strategy “rock if you play scissors” etc., and simulating you simulating me without running into an apparently non-halting computation (that would otherwise have to be stopped by my performing counterfactual surgery on the part of you that simulates my own decision), then playing rock if I simulate you playing scissors.
        
        At least I think that’s how the analogy would work.
        Vladimir_Nesov 20 Aug 2009 0:36 UTC
        4 points
        Parent
        I suspect that this kind of problem will run into computational complexity issues, not clever decision theory issues. As with a certain variation on the St. Petersburg paradox (see the last two paragraphs), where you need to count to the greatest finite number to which you can count, and then stop.
        Wei Dai 20 Aug 2009 0:29 UTC
        1 point
        Parent
        Suppose I know that’s your strategy, and decide to play the move equal to (the first googolplex digits of pi mod 3), and I can actually compute that but you can’t. What are you going to do?
        
        If you can predict what I do, then your conditional strategy works, which just shows that move order is related to simulation ability.
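The link between move order and simulation ability in this exchange can be sketched with a toy predictor. Everything here is an assumption made for illustration: "computing power" is modelled as a tick budget, `costly_opponent` is a made-up deterministic program (not the pi-digit computation), and `predict_and_counter` is a hypothetical name. The predictor wins only if it can finish simulating the opponent within its budget.

```python
MOVES = ["rock", "paper", "scissors"]
# COUNTER[m] is the move that beats m.
COUNTER = {"rock": "paper", "paper": "scissors", "scissors": "rock"}


class BudgetExceeded(Exception):
    """Raised when a simulation uses more ticks than the simulator has."""


def costly_opponent(work, tick):
    """Deterministic opponent whose move costs `work` ticks to compute."""
    x = 0
    for i in range(work):
        tick()  # charge one tick to whoever is running this computation
        x = (x * 31 + i) % 1_000_003
    return MOVES[x % 3]


def predict_and_counter(opponent, budget):
    """Simulate `opponent` within `budget` ticks and return the winning reply.

    Returns None when the simulation cannot finish within the budget.
    """
    used = 0

    def tick():
        nonlocal used
        used += 1
        if used > budget:
            raise BudgetExceeded

    try:
        predicted = opponent(tick)
    except BudgetExceeded:
        return None
    return COUNTER[predicted]


def opponent(tick):
    return costly_opponent(1000, tick)


print(predict_and_counter(opponent, budget=10_000))  # enough power: a winning reply
print(predict_and_counter(opponent, budget=100))     # too little power: None
```

In this toy model the conditional strategy "play whatever beats your move" is available only to the side that can afford the simulation, which is the point being argued.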
        Eliezer Yudkowsky 20 Aug 2009 3:32 UTC
        4 points
        Parent
        In this zero-sum game, yes, it’s possible that whoever has the most computing power wins, if neither can access unpredictable random or private variables. But what if both sides have exactly equal computing power? We could define a Timeless Paper-Scissors-Rock Tournament this way—standard language, no random function, each program gets access to the other’s source code and exactly 100 million ticks, if you halt without outputting a move then you lose 2 points.
        Wei Dai 20 Aug 2009 9:13 UTC
        3 points
        Parent
        This game is pretty easy to solve, I think. A simple equilibrium is for each side to do something like iterate x = SHA-512(x), with a random starting value, using an optimal implementation of SHA-512, until time is just about to run out, then output x mod 3. SHA-512 is easy to optimize (in the sense of writing the absolutely fastest implementation), and it seems very unlikely that there could be shortcuts to computing (SHA-512)^n until n gets so big (around 2^256 unless SHA-512 is badly designed) that the function starts to cycle.
        
        I think I’ve answered your specific question, but the answer doesn’t seem that interesting, and I’m not sure why you asked it.
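The iterate-and-reduce strategy described above could look roughly like the following. It is a minimal sketch under stated assumptions: the tournament's tick limit is replaced by a fixed iteration count, the "random starting value" is a constant baked into the source (the tournament rules allow no random function), and `SEED`, `ITERATIONS`, and `iterated_sha512_move` are hypothetical names. It uses Python's standard `hashlib`, not an optimized SHA-512 implementation.

```python
import hashlib

# Assumed starting value, picked once by the author and hard-coded.
SEED = bytes.fromhex("00112233445566778899aabbccddeeff")
# Assumed stand-in for "until time is just about to run out".
ITERATIONS = 100_000


def iterated_sha512_move(seed: bytes, iterations: int) -> int:
    """Apply x = SHA-512(x) `iterations` times, then return x mod 3."""
    x = seed
    for _ in range(iterations):
        x = hashlib.sha512(x).digest()
    # Interpret the final 64-byte digest as a big-endian integer.
    return int.from_bytes(x, "big") % 3


move = ("rock", "paper", "scissors")[iterated_sha512_move(SEED, ITERATIONS)]
print(move)
```

An opponent with the same budget who tries to predict this move has to redo essentially the whole chain, leaving no time to act on the answer, provided iterated SHA-512 has no shortcut.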
        Paul Crowley 23 May 2012 11:51 UTC
        3 points
        Parent
        Schneier et al here prove that being able to calculate H^n(x) quickly leads to a faster way of finding collisions in H. http://www.schneier.com/paper-low-entropy.html
        Eliezer Yudkowsky 20 Aug 2009 22:11 UTC
        1 point
        Parent
        Well, it’s probably not all that interesting from a purely theoretical perspective, but if the prize money was divided up among only the top fifth of players, you’d actually have to try to win, and that would be an interesting challenge for computer programmers.
  - Wei Dai 19 Aug 2009 19:31 UTC
    1 point
    Parent
    
    Moving second is a disadvantage (at least it seems to always work out that way, counterexamples requested if you can find them) and A can always use less computing power.
    
    But if you are TDT, you can’t always use less computing power, because that might be correlated with your opponents also deciding to use less computing power, or will be distrusted by your opponent because it can’t simulate you.
    
    But if you simply don’t have that much computing power (and opponent knows this) then you seem to have the advantage of logically moving first.
    
    (I should comment in the future about the possibility that bio-values-derived civs, by virtue of having evolved to be crazy, can succeed in moving logically first using crazy reasoning, but that would be a whole ’nother story, and of course also falls into the “Way the fuck too dangerous to try in real life” category relative to my present knowledge.)
    
    Lack of computing power could be considered a form of “crazy reasoning”...
    
    Why does TDT lead to the phenomenon of “stupid winners”? If there’s a way to explain this as a reasonable outcome, I’d feel a lot better. But is that like a two-boxer asking for an explanation of why, when the stupid (from their perspective) one-boxers keep winning, that’s a reasonable outcome?
    - Eliezer Yudkowsky 19 Aug 2009 19:55 UTC
      1 point
      Parent
      
      But if you are TDT, you can’t always use less computing power, because that correlates with your opponents also deciding to use less computing power.
      
      Substitute “move logically first” for “use less computing power”? Using less computing power seems like a red herring to me. TDT on simple problems (with the causal / logical structure already given) uses skeletally small amounts of computing power. “Who moves first” is a “battle”(?) over the causal / logical structure, not over who can manage to run out of computing power first. If you’re visualizing this using lots of computing power for the core logic, rather than computing the 20th decimal place of some threshold or verifying large proofs, then we’ve got different visualizations.
      
      The idea of “if you do this, the opponent does the same” might apply to trying to move logically first, but in my world this has nothing to do with computing power, so at this point I think it’d be pretty odd if the agents were competing to be stupider.
      
      Besides, you don’t want to respond to most logical threats, because that gives your opponent an incentive to make logical threats; you only want to respond to logical offers that you want your opponent to have an incentive to make. This gets into the scary issues I was hinting at before, like determining in advance that if you see your opponent predetermine to destroy the universe in a mutual suicide unless you pay a ransom, you’ll call their bet and die with them, even if they’ve predetermined to ignore your decision, etcetera; but if they offer to trade you silver for gold at a Ricardian-advantageous rate, you’ll predetermine to cooperate, etc. The point, though, is that “If I do X, they’ll do Y” is not a blank check to decide that minds do X, because you could choose a different form of responsiveness.
      
      But anyway, I don’t see in the first place that agents should be having these sorts of contests over how little computing power to use. That doesn’t seem to me like a compelling advantage to reach for.
      
      But if you simply don’t have that much computing power then you seem to have the advantage of logically moving first.
      
      If you’ve got that little computing power then perhaps you can’t simulate your opponent’s skeletally small TDT decision, i.e., you can’t use TDT at all. If you can’t close the loop of “I simulate you simulating me”—which isn’t infinite, and actually terminates rather quickly in the simple cases I know how to analyze at all, because we perform counterfactual surgery inside the loop—then you can’t use TDT at all.
      
      Lack of computing power could be considered a form of “crazy reasoning”...
      
      No, I mean much crazier than that. Like “This doesn’t follow, but I’m going to believe it anyway!” That’s what it takes to get “unusual reasons”—the sort of madness that only strictly naturally selected biological minds would find compelling in advance of a timeless decision to be crazy. Like “I’M GOING TO THROW THE STEERING WINDOW OUT THE WHEEL AND I DON’T CARE WHAT THE OPPONENT PREDETERMINES” crazy.
      
      Why does TDT lead to the phenomenon of “stupid winners”?
      
      It has not been established to my satisfaction that it does. It is a central philosophical intuition driving my decision theory that increased computing power, knowledge, or self-control, should not harm a rational agent.
  - Eliezer Yudkowsky 20 Aug 2009 0:11 UTC
    0 points
    Parent
    
    That both players desire to move “logically first” argues strongly that neither one will; that the resolution here does not involve any particular fixed global logical order of decisions.
    
    ...possibly employing mixed strategies, by analogy to the equilibrium of games where neither agent gets to go first and both must choose simultaneously? But I haven’t done anything with this idea, yet.
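For reference, the mixed-strategy equilibrium of an ordinary simultaneous-move game can be computed directly. The sketch below does this for a symmetric game of Chicken. The payoff numbers are illustrative assumptions, and `mixed_equilibrium_prob` and `expected_payoff` are hypothetical helpers. It says nothing about whether timeless agents would actually settle on such an equilibrium.

```python
from fractions import Fraction

# Row player's payoff, indexed as PAYOFF[my_move][their_move].
# Illustrative assumptions only; the game is symmetric.
PAYOFF = {
    "swerve": {"swerve": 0, "straight": -1},
    "straight": {"swerve": 1, "straight": -10},
}


def mixed_equilibrium_prob(payoff):
    """Probability of 'straight' that makes the other driver indifferent."""
    sw, st = payoff["swerve"], payoff["straight"]
    # Solve p*sw[straight] + (1-p)*sw[swerve] == p*st[straight] + (1-p)*st[swerve].
    numerator = st["swerve"] - sw["swerve"]
    denominator = sw["straight"] - sw["swerve"] - st["straight"] + st["swerve"]
    return Fraction(numerator, denominator)


def expected_payoff(payoff, my_move, p_straight):
    """My expected payoff when the opponent goes straight with prob p_straight."""
    row = payoff[my_move]
    return p_straight * row["straight"] + (1 - p_straight) * row["swerve"]


p = mixed_equilibrium_prob(PAYOFF)
print(p)  # 1/10
print(expected_payoff(PAYOFF, "swerve", p), expected_payoff(PAYOFF, "straight", p))
```

With these assumed payoffs each driver goes straight one time in ten and both moves yield the same expected payoff, so neither player gains by "going first" inside the equilibrium.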
  - RickJS 29 Aug 2009 3:04 UTC
    −9 points
    Parent
    First of all, congratulations, Eliezer! That’s great work. When I read your 3-line description, I thought it would never be computable. I’m glad to see you can actually test it.
    
    Eliezer_Yudkowsky wrote on 19 August 2009 03:05:15PM
    
    … Moving second is a disadvantage (at least it seems to always work out that way, counterexamples requested if you can find them)
    
    Rock-paper-scissors?
    Negotiating to buy a car?
    
    I would like to begin by saying that I don’t believe my own statements are True, and I suggest you don’t either. I do request that you try thinking WITH them before attacking them. It’s really hard to think with an idea AFTER you’ve attacked it. I’ve been told my writing sounds preachy or even fanatical. I don’t say “In My Opinion” enough. Please imagine “IMO” in front of every one of my statements. Thanks!
    
    Having more information (not incorrect “information”) on the opponent’s decisions is beneficial.
    
    Let’s distinguish Secret Commit & Simultaneous Effect (SCSE) from Commit First & Simultaneous Effect (CFSE) and from Act & Effect First (AEF). That’s just a few categories from a coarse categorization of board war games.
    
    The classic gunfight at high noon is AEF (to a first approximation, not counting watching his face & guessing when his reaction time will be lengthened). The fighter who draws first has a serious advantage, the fighter who hits first has a tremendous advantage, but not certain victory. (Hollywood notwithstanding, people sometimes keep fighting after taking handgun hits, even a dozen of them.) I contend that all AEFs give advantage to the first actor. Chess is AEF.
    
    My understanding of the Prisoner’s Dilemma is that it is SCSE as presented. On this thread, it seems to have mutated into a CFSE (otherwise, there just isn’t any “first”, in the ordinary, inside-the-Box-Universe, timeful sense). If Prisoner A has managed to get information on Prisoner B’s commitment before he commits, this has to be useful. Even if PA is a near-Omega, it can be a reality check on his Visualization of the Cosmic All. In realistic July 2009 circumstances, it identifies PB as one of the 40% of humans who choose ‘cooperate’ in one-shot PD. PA now has a choice whether to be an economist or a friend.
    
    And now we get down to something fundamental. Some humans are better people than the economic definition of rationality, which “… assume that each player cares only about minimizing his or her own time in jail”. “… cooperating is strictly dominated by defecting …” even with leaked information.
    
    “I don’t care what happens to my partner in crime. I don’t and I won’t. You can’t make me care. On the advice of my economist… ” That gets both prisoners a 5-year sentence when they could have had 6 months.
    
    That is NOT wisdom! That will make us extinct. (In My Opinion)
    
    Now try on “an injury to one is an injury to all”. Or maybe “an injury to one is a (discounted) injury to ME”. We just might be able to see that the big nuclear arsenals are a BAD IDEA!
    
    Taking that on, the payoff matrix offered by Wei Dai’s Omega (19 August 2009 07:08:23AM)
    
    PA \ PB      cooperate    defect
    cooperate    5/5          0/6
    defect       6/0          1/1
    is now transformed into PA’s Internal Payoff Matrix (IPM)
    
    PA \ PB      cooperate    defect
    cooperate    5+5κ/5       0+6κ/6
    defect       6+0κ/0       1+1κ/1
    In other words, his utility function has a term for the freedom of Prisoner B. (Economists be damned! Some of us do, sometimes.)
    
    “I’ll set κ=0.3,” says PA (well, he is a thief). Now PA’s IPM is:
    
    PA \ PB      cooperate    defect
    cooperate    6.5/5        1.8/6
    defect       6/0          1.3/1
    Lo and behold! ‘cooperate’ now strictly dominates!
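The κ adjustment above is easy to check mechanically. The sketch below rebuilds PA's internal payoffs from the external matrix and tests for strict dominance. The function names `internal_payoffs` and `strictly_dominates` are hypothetical, and the reading of each cell as "PA's payoff / PB's payoff" with rows as PA's move is an assumption inferred from the numbers in the comment.

```python
from fractions import Fraction

# External payoffs as (PA, PB), keyed by (PA's move, PB's move).
EXTERNAL = {
    ("cooperate", "cooperate"): (5, 5),
    ("cooperate", "defect"): (0, 6),
    ("defect", "cooperate"): (6, 0),
    ("defect", "defect"): (1, 1),
}


def internal_payoffs(external, kappa):
    """PA's internal payoff: own payoff plus kappa times PB's payoff."""
    return {moves: pa + kappa * pb for moves, (pa, pb) in external.items()}


def strictly_dominates(ipm, a, b):
    """True if PA's move `a` beats move `b` whatever PB does."""
    return all(ipm[(a, pb)] > ipm[(b, pb)] for pb in ("cooperate", "defect"))


# Fractions keep the arithmetic exact (0.3 is not exact in binary floats).
ipm = internal_payoffs(EXTERNAL, Fraction(3, 10))
for moves, value in ipm.items():
    print(moves, float(value))
print(strictly_dominates(ipm, "cooperate", "defect"))  # True
```

Under these numbers the two dominance conditions, 5+5κ > 6 and 6κ > 1+κ, both reduce to κ > 0.2, which is why κ=0.3 is enough and κ=0 leaves 'defect' dominant.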
    
    When over 6 billion people are affected, it doesn’t take much of a κ to swing my decisions around. If I’m not working to save humanity, I must have a very low κ for each distant person unknown to me.
    
    People say, “Human life is precious!” Show it to me in results. Show it to me in how people budget their time and money. THAT is why Friendly AI is our only hope. We will ‘defect’ our way into thwarting any plan that requires a lot of people to change their beliefs or actions. That sub-microscopic κ for unknown strangers is evolved-in, it’s not going away. We need a program that can be carried out by a tiny number of people.
    
    .
    
    .
    
    .
    
    IMO.
    
    ---=
    
    Maybe I missed the point. Maybe the whole point of TDT is to derive some sort of reduced-selfishness decision norm without an ad-hoc utility function adjustment (is that what “rained down from heaven” means?). I can derive the κ needed in order to save humanity, if there were a way to propagate it through the population. I cannot derive The One True κ from absolute principles, nor have I shown a derivation of “we should save humanity”. I certainly fell short of “… looking at which agents walk away with huge heaps of money and then working out how to do it systematically …”. I would RATHER look at which agents get their species through their singularity alive. Then, and only then, can we look at something grander than survival. I don’t grok in fullness “reflective consistency”, but from extinction we won’t be doing a lot of reflecting on what went wrong.
    
    IMO.
    
    Now, back to one-shot PD and “going first”. For some values of κ and some external payoff matrices (not this one), the resulting IPM is not strictly dominated, and having knowledge of PB’s commitment actually determines whether ‘cooperate’ or ‘defect’ produces a better world in PA’s internal not-quite-so-selfish world-view. Is that a disadvantage? (That’s a serious, non-rhetorical question. I’m a neophyte and I may not see some things in the depths where Eliezer & Wei think.)
    
    Now let’s look at that game of chicken. Was “throw out the steering wheel” in the definition of the thought experiment? If not, that player just changed the universe-under-consideration, which is a fairly impressive effect in an AEF, not a CFSE.
    
    If re-engineering was included, then Driver A may complete his wheel-throwing (while in motion!) only to look up and see Driver B’s steering gear on a ballistic trajectory. Each will have a few moments to reflect on “always get away with it.”
    
    If Driver A successfully defenestrates first, is Driver B at a disadvantage? Among humans, the game may be determined more by autonomic systems than by conscious computation, and B now knows that A won’t be flinching away. However, B now has information and choices. One that occurs to me is to stop the car and get out. “Your move, A.” A truly intelligent player (in which category I do not, alas, qualify) would think up better, or funnier, choices.
    
    Hmmm… to even play Chicken you have to either be irrational or have a damned strange IPM. We should establish that before proceeding further.
    
    I challenge anyone to show me a CFSE game that gives a disadvantage to the second player.
    
    I’m not too proud to beg: I request your votes. I’ve got an article I’d like to post, and I need the karma.
    
    Thanks for your time and attention.
    
    RickJS
    Saving Humanity from Homo Sapiens
    
    08/28/2009 ~20:10 Edit: formatting … learning formatting … grumble … GDSOB tab-deleter … Fine. I’ll create the HTML for tables, but this is a LOT of work for 3 simple tables … COMMENT TOO LONG!?!? … one last try … now I can’t quit, I’m hooked! … NAILED that sucker! … ~22:40 : added one more example *YAWN*
    - Vladimir_Nesov 29 Aug 2009 9:09 UTC
      5 points
      Parent
      It’s incomprehensible. Try debugging individual ideas first, written up more carefully.
- Eliezer Yudkowsky 19 Aug 2009 15:31 UTC
  1 point
  Parent
  
  With timeless agents, we can’t do backwards induction using the physical order of decisions. We need some notion of the logical order of decisions.
  
  BTW, thanks for this compact way of putting it.
- [deleted] 13 Jun 2014 6:52 UTC
  −2 points
  Parent
  This reminds me of logical Fatalism and the Argument from Bivalence