OrphanWilde comments on The Number Choosing Game: Against the existence of perfect theoretical rationality

OrphanWilde 5 Jan 2016 18:39 UTC
9 points
The issue here isn’t that rationality is impossible. The issue here is that you’re letting an undefined abstract concept do all your heavy lifting, and taking it places it cannot meaningfully be.

Utilitarianism: Defining “Good” is hard. Math is easy, let X stand in for “Good”, and we’ll maximize X, thereby maximizing “Good”.

So let’s do some substitution. Let’s say apples are good. Would you wait forever for an apple? No? What if make it so you live forever? No, you’d get bored? What if we make it so that you don’t get bored waiting? No, you have other things that have more value to you? Well, we’ll put you in a (science fiction words) closed time loop, so that no matter how long you spend trading the apple back and forth, you’ll come out without having lost anything? And so on and so forth, until all the countless potential objections are eliminated.

Keep going until all that’s left is one extra apple, and the rational thing to do is to wait forever for an apple you’ll never end up with. One by one, you’ve eliminated every reason -not- to wait forever—why should it surprise you that waiting forever is the correct thing to do, when you’ve gone to some much trouble to make sure that it is the correct thing to do?

Your “What’s the highest number game” is, well, a “What’s the highest number game”. Let’s put this in concrete terms: Whoever names the highest number gets $1,000. There are now two variants of the game: In the first variant, you get an infinite number of turns. I think it’s obvious this is identical to the Apple swapping game. In the second variant, you get exactly one turn to name a number. Apply all the constraints of the Apple swapping game, such that there is no cost to the player for taking longer. Well, the obvious strategy now is to keep repeating the number “9” until you’ve said it more times than your opponent. And we’re back to the apple swapping game. There’s no cost to continuing.

What makes all this seem to break rationality? Because we don’t live in a universe without costs, and our brains are hardwired to consider costs. If you find yourself in a universe without costs, where you can obtain an infinite amount of utility by repeating the number “9” forever, well, keep repeating the number “9″ forever, along with everybody else in the universe. It’s not like you’ll ever get bored or have something more important to do.
- casebash 6 Jan 2016 0:05 UTC
  0 points
  Parent
  “Keep going until all that’s left is one extra apple, and the rational thing to do is to wait forever for an apple you’ll never end up with”—that doesn’t really follow. You have to get the Apple and exit the time loop at some point or you never get anything.
  
  “If you find yourself in a universe without costs, where you can obtain an infinite amount of utility by repeating the number “9” forever, well, keep repeating the number “9″ forever, along with everybody else in the universe.”—the scenario specifically requires you to terminate in order to gain any utility.
  - thakil 6 Jan 2016 10:24 UTC
    2 points
    Parent
    But apparently you are not losing utility over time? And holding utility over time isn’t of value to me, otherwise my failure to terminate early is costing me the utility I didn’t take at that point in time? If there’s a lever compensating for that loss of utility then I’m actually gaining the utility I’m turning down anyway!
    
    Basically the only reason to stop at time t1 would be that you will regret not having had the utility available at t1 until t2, when you decide to stop.
    - casebash 6 Jan 2016 10:48 UTC
      0 points
      Parent
      “Basically the only reason to stop at time t1 would be that you will regret not having had the utility available at t1 until t2, when you decide to stop.”—In this scenario, you receive the utility when you stop speaking. You can speak for an arbitrarily long amount of time and it doesn’t cost you any utility as you are compensated for any utility that it would cost, but if you never stop speaking you never gain any utility.
      - thakil 6 Jan 2016 13:51 UTC
        3 points
        Parent
        Then the “rational” thing is to never stop speaking. It’s true that by never stopping speaking I’ll never gain utility but by stopping speaking early I miss out on future utility.
        
        The behaviour of speaking forever seems irrational, but you have deliberately crafted a scenario where my only goal is to get the highest possible utility, and the only way to do that is to just keep speaking. If you suggest that someone who got some utility after 1 million years is “more rational” than someone still speaking at 1 billion years then you are adding a value judgment not apparent in the original scenario.
        Slider 6 Jan 2016 20:06 UTC
        0 points
        Parent
        Infinite utility is not a possible utility in the scenario and therefore the behaviour of not stopping is not a highest possible utility. Continue to speak is an improvement only given that you do stop at some time. If you continue by not stopping ever you get 0 utility which is lower than speaking a 2 digit number.
        thakil 7 Jan 2016 8:41 UTC
        0 points
        Parent
        But time doesn’t end. The criteria of assessment is
        
        1)I only care about getting the highest number possible
        
        2)I am utterly indifferent to how long this takes me
        
        3)The only way to generate this value is by speaking this number (or, at the very least, any other methods I might have used instead are compensated explicitly once I finish speaking).
        
        If your argument is that Bob, who stopped at Grahams number, is more rational than Jim, who is still speaking, then you’ve changed the terms. If my goal is to beat Bob, then I just need to stop at Graham’s number plus one.
        
        At any given time, t, I have no reason to stop, because I can expect to earn more by continuing. The only reason this looks irrational is we are imagining things which the scenario rules out: time costs or infinite time coming to an end.
        
        The argument “but then you never get any utility” is true, but that doesn’t matter, because I last forever. There is no end of time in this scenario.
        
        If your argument is that in a universe with infinite time, infinite life and a magic incentive button then all everyone will do is press that button forever then you are correct, but I don’t think you’re saying much.
        Slider 7 Jan 2016 9:49 UTC
        0 points
        Parent
        python code of
        
        while True: pass ohnoes=1/0
        
        doesn’t generate a runtime exception when ran
        
        similiarly
        
        utility=0 a=0 while True: a+=1 utility+=a
        
        doesn’t assign to utility more than once
        
        in contrast
        
        utility=0 while True: utility+=1
        
        does assign to utility more than once. With finite iterations these two would be quite interchangeable but with non-terminating iterations its not. The iteration doesn’t need to terminate for this to be true.
        
        Say you are in a market and you know someone who sells wheat for $5 and someone who buys it for $10 and someone who sells wine for $7 and suppose that you care about wine. If you have a strategy that only consists of buying and selling wheat you don’t get any wine. There needs to be a “cashout” move of buying wine atleast once. Now think of a situation that when you buy wine you need to hand over your wheat dealing licence. Well a wheat licence means arbitrary amounts of wine so irrational to ever trade wheat license away for a finite amount of wine right? But then you end up with a wine “maximising strategy” that does so by not ever buying wine.
        thakil 7 Jan 2016 11:54 UTC
        2 points
        Parent
        Indeed. And that’s what happens when you give a maximiser perverse incentives and infinity in which to gain them.
        
        This scenario corresponds precisely to pseudocode of the kind
        
        newval<-1
        
        oldval<-0
        
        while newval>oldval
        
        {
        
        oldval<-newval
        
        newval<-newval+1
        
        }
        
        Which never terminates. This is only irrational if you want to terminate (which you usually do), but again, the claim that the maximiser never obtains value doesn’t matter because you are essentially placing an outside judgment on the system.
        
        Basically, what I believe you (and the op) are doing is looking at two agents in the numberverse.
        
        Agent one stops at time 100 and gains X utility Agent two continues forever and never gains any utility.
        
        Clearly, you think, agent one has “won”. But how? Agent two has never failed. The numberverse is eternal, so there is no point at which you can say it has “lost” to agent one. If the numberverse had a non zero probability of collapsing at any point in time then Agent two’s strategy would instead be more complex (and possibly uncomputable if we distribute over infinity), but as we are told that agent one and two exist in a changeless universe and their only goal is to obtain the most utility then we can’t judge either to have won. In fact agent two’s strategy only prevents it from losing, and it can’t win.
        
        That is, if we imagine the numberverse full of agents, any agent which chooses to stop will lose in a contest of utility, because the remaining agents can always choose to stop and obtain their far greater utility. So the rational thing to do in this contest is to never stop.
        
        Sure, that’s a pretty bleak lookout, but as I say, if you make a situation artificial enough you get artificial outcomes.
        Slider 7 Jan 2016 21:12 UTC
        0 points
        Parent
        What you are saying would be optimising in a universe where the agent gets the utility as it says the number. Then the average utility of a ungoer would be greater than that of a idler.
        
        However if the utility is dished out after the number has been spesified then an idler and a ongoer have exactly the same amount of utility and ought to be as optimal. 0 is not a optimum of this game so an agent that results in 0 utility is not an optimiser. If you take an agent that is an optimiser in other context then it ofcourse might not be an optimiser for this game.
        
        There is also the problem that choosing the continue doesn’t yield the utilty with certainty only “almost always”. The ongoer strategy hits precicely in the hole in this certainty when no payout happens. I guess you may be able to define a game where concurrently with their actions. But this reeks of “the house” having premonition on what the agent is going to do instead of inferring its from its actions. if the rules are “first actions and THEN payout” you need to be able to do your action to get a payout.
        
        In the ongoing version I could think of rules that an agent that has said “9.9999...” to 400 digits would receive 0.000.(401 zeroes)..9 utility on the next digit. However if the agents get utility assigned only once there won’t be a “standing so far”. However this behaviour would then be the perfectly rational thing to do as there would be a uniquely determined digit to keep on saying. I am suspecting the trouble is mixing the ongoing and the dispatch version to each other inconsistently.
        thakil 8 Jan 2016 8:49 UTC
        0 points
        Parent
        “However if the utility is dished out after the number has been spesified then an idler and a ongoer have exactly the same amount of utility and ought to be as optimal. 0 is not a optimum of this game so an agent that results in 0 utility is not an optimiser. If you take an agent that is an optimiser in other context then it ofcourse might not be an optimiser for this game.”
        
        The problem with this logic is the assumption that there is a “result” of 0. While it’s certainly true that an “idler” will obtain an actual value at some point, so we can assess how they have done, there will never be a point in time that we can assess the ongoer. If we change the criteria and say that we are going to assess at a point in time then the ongoer can simply stop then and obtain the highest possible utility. But time never ends, and we never mark the ongoer’s homework, so to say he has a utility of 0 at the end is nonsense, because there is, by definition, no end to this scenario.
        
        Essentially, if you include infinity in a maximisation scenario, expect odd results.
  - OrphanWilde 7 Jan 2016 13:43 UTC
    0 points
    Parent
    
    “Keep going until all that’s left is one extra apple, and the rational thing to do is to wait forever for an apple you’ll never end up with”—that doesn’t really follow. You have to get the Apple and exit the time loop at some point or you never get anything.
    
    And the infinite time you have to spend to get that apple, multiplied by the zero cost of the time, is...?
    
    the scenario specifically requires you to terminate in order to gain any utility.
    
    Your mortality bias is showing. “You have to wait an infinite amount of time” is only a meaningful objection when that costs you something.
    - casebash 7 Jan 2016 14:48 UTC
      −2 points
      Parent
      How would you rate your maths ability?
      - OrphanWilde 7 Jan 2016 20:29 UTC
        4 points
        Parent
        Better than your philosophic ability.
        
        I can give you solutions for all your sample problems. The apple-swapping problem is a prisoner’s dilemma; agree to split the utilon and get out. The biggest-number problem can be easily resolved by stepping outside the problem framework with a simple pyramid scheme (create enough utilons to create X more entities who can create utility; each entity then creates enough utility to make X entities plus pay its creator three times its creation cost. Creator then spends two thirds of those utilons creating new entities, and the remaining third on itself. Every entity engages in this scheme, ensuring exponentially-increasing utility for everybody. Adjust costs and payouts however you want, infinite utility is infinite utility.) There are sideways solutions for just about any problem.
        
        The problem isn’t that any of your little sample problems don’t have solutions, the problem is that you’ve already carefully eliminated all the solutions you can think of, and will keep eliminating solutions until nobody can think of a solution—if I suggested the pyramid scheme, I’m sure you’d say I’m not allowed to create new entities using my utilons, because I’m breaking what your thought experiment was intended to convey and just showing off.
        
        I bypassed all of that and got to the point—you’re not criticizing rationality for its failure to function in this universe, you’re criticizing rationality for its behavior in radically difference universes and the failure of that behavior to conform to basic sanity-checks that only make sense in the universe you yourself happen to occupy.
        
        Rationality belongs to the universe. In a bizarre and insane universe, rational behavior is bizarre and insane, as it should be.
        casebash 8 Jan 2016 0:32 UTC
        0 points
        Parent
        Sorry, I was being rude then.
        
        The problem is: 1) 0 times infinity is undefined not 0 2) You are talking about infinity as something that can be reached, when it is only something that can be approached.
        
        These are both very well known mathematical properties.
        
        “If I suggested the pyramid scheme, I’m sure you’d say I’m not allowed to create new entities using my utilons”—If you read Richard Kennawy’s comment—you’ll see that utilions are not what you think that they are.
        
        “The apple-swapping problem is a prisoner’s dilemma; agree to split the utilon and get out.”—You may want to read this link. “Likewise, people who responds to the Trolley problem by saying that they would call the police are not talking about the moral intuitions that the Trolley problem intends to explore. There’s nothing wrong with you if those problems are not interesting to you. But fighting the hypothetical by challenging the premises of the scenario is exactly the same as saying, “I don’t find this topic interesting for whatever reason, and wish to talk about something I am interested in.”″
        OrphanWilde 8 Jan 2016 13:32 UTC
        0 points
        Parent
        
        1) 0 times infinity is undefined not 0
        
        Correct. Now, observe that’s you’ve created multiple problems with massive “Undefined” where any optimization is supposed to take place, and then claimed you’ve proven that optimization is impossible.
        
        You are talking about infinity as something that can be reached, when it is only something that can be approached.
        
        No, I am not. I never assume anybody ends up with the apple/utilon, for example. There’s just never a point where it makes sense to stop, so you should never stop. If this doesn’t make sense to you and offends your sensibilities, well, quit constructing nonsensical scenarios that don’t match the reality you understand.
        
        If you read Richard Kennawy’s comment—you’ll see that utilions are not what you think that they are.
        
        They’re not anything at all, which was my point about you letting abstract things do all your heavy lifting for you.
        
        “The apple-swapping problem is a prisoner’s dilemma; agree to split the utilon and get out.”—You may want to read this link. “Likewise, people who responds to the Trolley problem by saying that they would call the police are not talking about the moral intuitions that the Trolley problem intends to explore. There’s nothing wrong with you if those problems are not interesting to you. But fighting the hypothetical by challenging the premises of the scenario is exactly the same as saying, “I don’t find this topic interesting for whatever reason, and wish to talk about something I am interested in.”″
        
        I do believe I already addressed the scenarios you raised.