I actually agree with Eliezer’s argument that winning is more important than abstract conventions of thought. It’s just that it’s not always clear which option counts as winning. Indeed, here, as I’ve argued, winning seems to match more directly to not paying, and abstract conventions of thought to the arguments about the counterfactual.
It seems to me as if you’re ignoring the general thrust of my position, which is that the notion of winning that’s important is the one we have in-hand when we are thinking about what decision procedure to use. This seems to strongly favor paying up in counterfactual mugging, except for some finite set of counterfactual muggings which we already know about at the time when we consider this.
Yeah I’m not disputing pre-committing to UDT for future actions, the question is more difficult when it comes to past actions. One thought: even if you’re in a counterfactual mugging that was set up before you came into existence, before you learn about it you might have time to pre-commit to paying in any such situations.
It seems right to focus on future actions, because those are the ones which our current thoughts about which decision theory to adopt will influence.
Well, this is the part of the question I’m interested in. As I said, I have no objection to pre-committing to UDT for future actions
So is it that we have the same position with respect to future counterfactual muggings, but you are trying to figure out how to deal with present ones?
I think that since no agent can be perfect from the start, we always have to imagine that an agent will make some mistakes before it gets on the right track. So if it refuses to be counterfactually mugged a few times before settling on a be-mugged strategy, we cannot exactly say that was rational or irrational; it depends on the prior. An agent might assent or refuse to pay up on a counterfactual mugging on the 5th digit of π. We can’t absolutely call that right or wrong.
So, I think how an agent deals with a single counterfactual mugging is kind of its own business. It is only clear that it should not refuse mugging forever. (And if it refuses mugging for a really long time, this feels not so good, even if it would eventually start being mugged.)
It seems to me as if you’re ignoring the general thrust of my position, which is that the notion of winning that’s important is the one we have in-hand when we are thinking about what decision procedure to use
Why can’t I use this argument for CDT in Newcomb’s?
It seems right to focus on future actions, because those are the ones which our current thoughts about which decision theory to adopt will influence.
What I meant to say instead of future actions is that it is clear that we should commit to UDT for future muggings, but less clear if the mugging was already set up.
I think that since no agent can be perfect from the start, we always have to imagine that an agent will make some mistakes before it gets on the right track
The agent should still be able to solve such scenarios given a sufficient amount of time to think and the necessary starting information, such as reliable reports about what happened to others who encountered counterfactual muggers.
Why can’t I use this argument for CDT in Newcomb’s?
From my perspective right now, CDT does worse in Newcomb’s. So, considering between CDT and EDT as ways of thinking about Newcomb, EDT and other 1-boxing DTs are better.
What I meant to say instead of future actions is that it is clear that we should commit to UDT for future muggings, but less clear if the mugging was already set up.
Even UDT advises to not give in to muggings if it already knows, in its prior, that it is in the world where Omega asks for the $10. But you have to ask: who would be motivated to create such a UDT? Only “parents” who already knew the mugging outcome themselves, and weren’t motivated to act updatelessly about it. And where did they come from? At some point, more-rational agency comes from less-rational agency. In the model where a CDT agent self-modifies to become updateless, which counterfactual muggings the UDT agent will and won’t be mugged by gets baked in at that time. With evolved creatures, of course it is more complicated.
I’m not sure, but it seems like our disagreement might be around the magnitude of this somehow. Like, I’m saying something along the lines of “Sure, you refuse some counterfactual muggings, but only finitely many. From the outside, that looks like making a finite number of mistakes and then learning.” While you’re saying something like, “Sure, you’d rather get counterfactually mugged for all future muggings, but it still seems like you want to take the one in front of you.” (So from my perspective you’re putting yourself in the shoes of an agent who hasn’t “learned better” yet.)
The analogy is a little strained, but I am thinking about it like a Bayesian update. If you keep seeing things go a certain way, you eventually predict that. But that doesn’t make it irrational to hedge your bets for some time. So it can be rational in that sense to refuse some counterfactual muggings. But you should eventually take them.
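To make the analogy concrete, here is a minimal sketch of the kind of hedging I have in mind. The Beta prior, the threshold, and the framing of “paying up actually pays off” as an unknown Bernoulli parameter are all illustrative assumptions of mine, not part of the problem statement.

```python
# Hedging sketch (illustrative assumptions only): treat "paying up actually
# pays off" as an unknown Bernoulli parameter with a Beta(1, 1) prior, update
# on each supporting observation, and only start paying once the posterior
# mean clears a chosen threshold.
def posterior_mean(successes: int, failures: int, a: float = 1.0, b: float = 1.0) -> float:
    """Mean of the Beta posterior after the given observations."""
    return (a + successes) / (a + successes + b + failures)

THRESHOLD = 0.9  # arbitrary; encodes how long the agent hedges

for n_observed in range(50):
    if posterior_mean(n_observed, 0) >= THRESHOLD:
        print(f"start paying after {n_observed} supporting observations")
        break
# Prints "start paying after 8 supporting observations": a finite number of
# refusals followed by accepting, which is the shape described above.
```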
The agent should still be able to solve such scenarios given a sufficient amount of time to think and the necessary starting information, such as reliable reports about what happened to others who encountered counterfactual muggers.
Basically, I don’t think that way of thinking completely holds when we’re dealing with logical uncertainty. A counterlogical mugging is a situation where time to think can, in a certain sense, hurt (if you fully update on that thinking, anyway). So there isn’t such a clear distinction between thinking-from-starting-information and learning from experience.
I’m not sure, but it seems like our disagreement might be around the magnitude of this somehow
My interest is in the counterfactual mugging in front of you, as this is the hardest part to justify. Future muggings aren’t a difficult problem.
Basically, I don’t think that way of thinking completely holds when we’re dealing with logical uncertainty. A counterlogical mugging is a situation where time to think can, in a certain sense, hurt (if you fully update on that thinking, anyway)
Are you saying that it will pre-commit to something before it receives all the information?
My interest is in the counterfactual mugging in front of you, as this is the hardest part to justify. Future muggings aren’t a difficult problem.
I’m not sure exactly what you’re getting at, though. Obviously counterfactual mugging in front of you is always the one that matters, in some sense. But if I’ve considered things ahead of time already when confronted with my very first counterfactual mugging, then I may have decided to handle counterfactual mugging by paying up in general. And further, there’s the classic argument that you should always consider what you would have committed to ahead of time.
I’m kind of feeling like you’re ignoring those arguments, or something? Or they aren’t interesting for your real question?
Basically I keep talking about how “yes you can refuse a finite number of muggings” because I’m trying to say that, sure, you don’t end up concluding you should accept every mugging, but generally the argument via treat-present-cases-as-if-they-were-future-cases seems pretty strong. And the response I’m hearing from you sounds like “but what about present cases?”
“Basically I keep talking about how ‘yes you can refuse a finite number of muggings’” - given that I’m considering the case where you are only mugged once, that sounds an awful lot like saying it’s reasonable to choose not to pay.
“But if I’ve considered things ahead of time”—a key part of counterfactual mugging is that you haven’t considered things ahead of time. I think it is important to engage with this aspect or explain why this doesn’t make sense.
“And further, there’s the classic argument that you should always consider what you would have committed to ahead of time”—imagine instead of $50 it was your hand being cut off to save your life in the counterfactual. It’s going to be awfully tempting to keep your hand. Why is what you would have committed to, but didn’t, relevant?
My goal is to understand versions that haven’t been watered down or simplified.
given that I’m considering the case where you are only mugged once, that sounds an awful lot like saying it’s reasonable to choose not to pay.
The perspective I’m coming from is that you have to ask how you came to be in the epistemic situation you’re in. Setting agents up in decision problems “from nothing” doesn’t tell us much, if it doesn’t make sense for an agent to become confident that it’s in that situation.
An example of this is smoking lesion. I’ve written before about how the usual version doesn’t make very much sense as a situation that an agent can find itself in.
The best way to justify the usual “the agent finds itself in a decision problem” way of working is to have a learning-theoretic setup in which a learning agent can successfully learn that it’s in the scenario. Once we have that, it makes sense to think about the one-shot case, because we have a plausible story whereby an agent comes to believe it’s in the situation described.
This is especially important when trying to account for logical uncertainty, because now everything is learned—you can’t say a rational agent should be able to reason in a particular way, because the agent is still learning to reason.
If an agent is really in a pure one-shot case, that agent can do anything at all. Because it has not learned yet. So, yes, “it’s reasonable to choose not to pay”, BUT ALSO any behavior at all is reasonable in a one-shot scenario, because the agent hasn’t had a chance to learn yet.
This doesn’t necessarily mean you have to deal with an iterated counterfactual mugging. You can learn enough about the universe to be confident you’re now in a counterfactual mugging without ever having faced one before. But
a key part of counterfactual mugging is that you haven’t considered things ahead of time. I think it is important to engage with this aspect or explain why this doesn’t make sense.
This goes along with the idea that it’s unreasonable to consider agents as if they emerge spontaneously from a vacuum, face a single decision problem, and then disappear. An agent is evolved or built or something. This ahead-of-time work can’t, in principle, be distinguished from “thinking ahead”.
As I said above, this becomes especially clear if we’re trying to deal with logical uncertainty on top of everything else, because the agent is still learning to reason. The agent has to have experience reasoning about similar stuff in order to learn.
We can give a fresh logical inductor a bunch of time to think about one thing, but how it spends that time is by thinking about all sorts of other logical problems in order to train up its heuristic reasoning. This is why I said all games are iterated games in logical time—the logical inductor doesn’t literally play the game a bunch of times to learn, but it simulates a bunch of parallel-universe versions of itself who have played a bunch of very similar games, which is very similar.
imagine instead of $50 it was your hand being cut off to save your life in the counterfactual. It’s going to be awfully tempting to keep your hand. Why is what you would have committed to, but didn’t, relevant?
One way of appealing to human moral intuition (which I think is not vacuous) is to say, what if you know that someone is willing to risk great harm to save your life because they trust you the same, and you find yourself in a situation where you can sacrifice your own hand to prevent a fatal injury from happening to them? It’s a good deal; it could have been your life on the line.
But really my justification is more the precommitment story. Decision theory should be reflectively endorsed decision theory. That’s what decision theory basically is: thinking we do ahead of time which is supposed to help us make decisions. I’m fine with imagining hypothetically that we haven’t thought about things ahead of time, as an exercise to help us better understand how to think. But that means my take-away from the exercise is based on which ways of thinking seemed to help get better outcomes, in the hypothetical situations envisioned!
If an agent is really in a pure one-shot case, that agent can do anything at all
You can learn about a situation in ways other than by facing that exact situation yourself. For example, you may observe other agents facing that situation or receive testimony from an agent that has proven itself trustworthy. You don’t even seem to disagree with me here, as you wrote: “you can learn enough about the universe to be confident you’re now in a counterfactual mugging without ever having faced one before”
“This goes along with the idea that it’s unreasonable to consider agents as if they emerge spontaneously from a vacuum, face a single decision problem, and then disappear”—I agree with this. I asked this question because I didn’t have a good model of how to conceptualise decision theory problems, although I think I have a clearer idea now that we’ve got the Counterfactual Prisoner’s Dilemma.
One way of appealing to human moral intuition
Doesn’t work on counter-factually selfish agents
Decision theory should be reflectively endorsed decision theory. That’s what decision theory basically is: thinking we do ahead of time which is supposed to help us make decisions
Thinking about decisions before you make them != thinking about decisions timelessly
You can learn about a situation in ways other than by facing that exact situation yourself. For example, you may observe other agents facing that situation or receive testimony from an agent that has proven itself trustworthy. You don’t even seem to disagree with me here, as you wrote: “you can learn enough about the universe to be confident you’re now in a counterfactual mugging without ever having faced one before”
Right, I agree with you here. The argument is that we have to understand learning in the first place to be able to make these arguments, and iterated situations are the easiest setting to do that in. So if you’re imagining that an agent learns what situation it’s in more indirectly, but thinks about that situation differently than an agent who learned in an iterated setting, there’s a question of why that is. It’s more a priori plausible to me that a learning agent thinks about a problem by generalizing from similar situations it has been in, which I expect to act kind of like iteration.
Or, as I mentioned re: all games are iterated games in logical time, the agent figures out how to handle a situation by generalizing from similar scenarios across logic. So any game we talk about is iterated in this sense.
>One way of appealing to human moral intuition
Doesn’t work on counter-factually selfish agents
I disagree. Reciprocal altruism and true altruism are kind of hard to distinguish in human psychology, but I said “it’s a good deal” to point at the reciprocal-altruism intuition. The point being that acts of reciprocal altruism can be a good deal w/o having considered them ahead of time. It’s perfectly possible to reason “it’s a good deal to lose my hand in this situation, because I’m trading it for getting my life saved in a different situation; one which hasn’t come about, but could have.”
I kind of feel like you’re just repeatedly denying this line of reasoning. Yes, the situation in front of you is that you’re in the risk-hand world rather than the risk-life world. But this is just question-begging with respect to updateful reasoning. Why give priority to that way of thinking over the “but it could just as well have been my life at stake” world? Especially when we can see that the latter way of reasoning does better on average?
>Decision theory should be reflectively endorsed decision theory. That’s what decision theory basically is: thinking we do ahead of time which is supposed to help us make decisions
Thinking about decisions before you make them != thinking about decisions timelessly
Ah, that’s kind of the first reply from you that’s surprised me in a bit. Can you say more about that? My feeling is that in this particular case the equality seems to hold.
The argument is that we have to understand learning in the first place to be able to make these arguments, and iterated situations are the easiest setting to do that in
Iterated situations are indeed useful for understanding learning. But I’m trying to abstract out over the learning insofar as I can. I care that you get the information required for the problem, but not so much how you get it.
Especially when we can see that the latter way of reasoning does better on average?
The average includes worlds that you know you are not in. So this doesn’t help us justify taking these counterfactuals into account, indeed for us to care about the average we need to already have an independent reason to care about these counterfactuals.
I kind of feel like you’re just repeatedly denying this line of reasoning. Yes, the situation in front of you is that you’re in the risk-hand world rather than the risk-life world. But this is just question-begging with respect to updateful reasoning.
I’m not saying you should reason in this way. You should reason updatelessly. But in order to get to the point of finding the Counterfactual Prisoner’s Dilemma, which I consider a satisfactory justification, I had to rigorously question every other solution until I found one which could withstand the questioning. This seems like a better solution as it is less dependent on tricky-to-evaluate philosophical claims.
Ah, that’s kind of the first reply from you that’s surprised me in a bit
Well, thinking about a decision after you make it won’t do you much good. So you’re pretty much always thinking about decisions before you make them. But timelessness involves thinking about decisions before you end up facing them.
Iterated situations are indeed useful for understanding learning. But I’m trying to abstract out over the learning insofar as I can. I care that you get the information required for the problem, but not so much how you get it.
OK, but I don’t see how that addresses my argument.
The average includes worlds that you know you are not in. So this doesn’t help us justify taking these counterfactuals into account,
This is the exact same response again (ie the very kind of response I was talking about in my remark you’re responding to), where you beg the question of whether we should evaluate from an updateful perspective. Why is it problematic that we already know we are not in those worlds? Because you’re reasoning updatefully? My original top-level answer explained why I think this is a circular justification in a way that the updateless position isn’t.
I’m not saying you should reason in this way. You should reason updatelessly.
Ok. So what’s at stake in this discussion is the justification for updatelessness, not the whether of updatelessness.
I still don’t get why you seem to dismiss my justification for updatelessness, though. All I’m understanding of your objection is a question-begging appeal to updateful reasoning.
You feel that I’m begging the question. I guess I take only thinking about this counterfactual as the default position, as where an average person is likely to be starting from. And I was trying to see if I could find an argument strong enough to displace this. So I’ll freely admit I haven’t provided a first-principles argument for focusing just on this counterfactual.
OK, but I don’t see how that addresses my argument.
Your argument is that we need to look at iterated situations to understand learning. Sure, but that doesn’t mean that we have to interpret every problem in iterated form. If we need to understand learning better, we can look at a few iterated problems beforehand, rather than turning this one into an iterated problem.
The average includes worlds that you know you are not in. So this doesn’t help us justify taking these counterfactuals into account,
Let me explain more clearly why this is a circular argument:
a) You want to show that we should take counterfactuals into account when making decisions
b) You argue that this way of making decisions does better on average
c) The average includes the very counterfactuals whose value is in question. So b depends on a already being proven ⇒ circular argument
Let me explain more clearly why this is a circular argument:
a) You want to show that we should take counterfactuals into account when making decisions
b) You argue that this way of making decisions does better on average
c) The average includes the very counterfactuals whose value is in question. So b depends on a already being proven ⇒ circular argument
That isn’t my argument though. My argument is that we ARE thinking ahead about counterfactual mugging right now, in considering the question. We are not misunderstanding something about the situation, or missing critical information. And from our perspective right now, we can see that agreeing to be mugged is the best strategy on average.
We can see that if we update on the value of the coin flip being tails, we would change our mind about this. But the statement of the problem requires that there is also the possibility of heads. So it does not make sense to consider the tails scenario in isolation; that would be a different decision problem (one in which Omega asks us for $100 out of the blue with no other significant backstory).
So we (right now, considering how to reason about counterfactual muggings in the abstract) know that there are the two possibilities, with equal probability, and so the best strategy on average is to pay. So we see behaving updatefully as bad.
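To make the “best on average” claim concrete, here is a minimal sketch; the dollar amounts are assumptions of mine (the usual statement has a $100 payment and a $10,000 reward), and the exact figures don’t matter for the point.

```python
# Expected value of the two policies in a counterfactual mugging,
# evaluated before the coin is seen. Amounts are assumed, not from the thread.
P_HEADS = 0.5
COST = 100       # what Omega asks for if the coin lands tails
REWARD = 10_000  # what Omega pays on heads, but only to agents who would pay on tails

def expected_value(pays_on_tails: bool) -> float:
    heads_payout = REWARD if pays_on_tails else 0
    tails_payout = -COST if pays_on_tails else 0
    return P_HEADS * heads_payout + (1 - P_HEADS) * tails_payout

print(expected_value(True))   # 4950.0 -> the paying policy wins on average
print(expected_value(False))  # 0.0    -> never paying
```

Updating on tails, paying looks like a pure $100 loss; whether that conditional evaluation or this average is the right standard is exactly what we’re arguing about.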
So my argument for considering the multiple possibilities is, the role of thinking about decision theory now is to help guide the actions of my future self.
You feel that I’m begging the question. I guess I take only thinking about this counterfactual as the default position, as where an average person is likely to be starting from. And I was trying to see if I could find an argument strong enough to displace this. So I’ll freely admit I haven’t provided a first-principles argument for focusing just on this counterfactual.
I think the average person is going to be thinking about things like duty, honor, and consistency which can serve some of the purpose of updatelessness. But sure, updateful reasoning is a natural kind of starting point, particularly coming from a background of modern economics or bayesian decision theory.
But my argument is compatible with that starting point, if you accept my “the role of thinking about decision theory now is to help guide future actions” line of thinking. In that case, starting from updateful assumptions now, decision-theoretic reasoning makes you think you should behave updatelessly in the future.
Whereas the assumption you seem to be using, in your objection to my line of reasoning, is “we should think of decision-theoretic problems however we think of problems now”. So if we start out as an updateful agent, we would think about decision-theoretic problems and think “I should be updateful”. If we start out as a CDT agent, then when we think about decision-theoretic problems we would conclude that you should reason causally. EDT agents would think about problems and conclude you should reason evidentially. And so on. That’s the reasoning I’m calling circular.
Of course an agent should reason about a problem using its best current understanding. But my claim is that when doing decision theory, the way that best understanding should be applied is to figure out what decision theory does best, not to figure out what my current decision theory already does. And when we think about problems like counterfactual mugging, the description of the problem requires that there’s both the possibility of heads and tails. So “best” means best overall, not just down the one branch.
If the act of doing decision theory were generally serving the purpose of aiding in making the current decision, then my argument would not make sense, and yours would. Current-me might want to tell the me in that universe to be more updateless about things, but alternate-me would not be interested in hearing it, because alternate-me wouldn’t be interested in thinking ahead in general, and the argument wouldn’t make any sense with respect to alternate-me’s current decision.
So my argument involves a fact about the world which I claim determines which of several ways to reason, and hence, is not circular.
My argument is that we ARE thinking ahead about counterfactual mugging right now, in considering the question
When we think about counterfactual muggings, we naturally imagine the possibility of facing a counterfactual mugging in the future. I don’t dispute the value of pre-committing either to take a specific action or to acting updatelessly. However, instead of imagining a future mugging, we could also imagine a present mugging where we didn’t have time to make any pre-commitments. I don’t think it is immediately obvious that we should think updatelessly; instead, I believe that it requires further justification.
The role of thinking about decision theory now is to help guide the actions of my future self
This is effectively an attempt at proof-by-definition
I think the average person is going to be thinking about things like duty, honor, and consistency which can serve some of the purpose of updatelessness. But sure, updateful reasoning is a natural kind of starting point, particularly coming from a background of modern economics or bayesian decision theory
If someone’s default is already updateless reasoning, then there’s no need for us to talk them into it. It’s only people with an updateful default that we need to convince (until recently I had an updateful default).
And when we think about problems like counterfactual mugging, the description of the problem requires that there’s both the possibility of heads and tails
It requires a counterfactual possibility, not an actual possibility. And a counterfactual possibility isn’t actual, it’s counter to the factual. So it’s not clear this has any relevance.
It looks to me like you’re tripping yourself up with verbal arguments that aren’t at all obviously true. The reason why I believe that the Counterfactual Prisoner’s Dilemma is important is because it is a mathematical result that doesn’t require much in the way of assumptions. Sure, it still has to be interpreted, but it seems hard to find an interpretation that avoids the conclusion that the updateful perspective doesn’t quite succeed on its own terms.
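To spell out the payoff structure I have in mind (a rough sketch; the dollar amounts are purely illustrative): Omega flips a fair coin and, whichever way it lands, asks you to pay, rewarding you only if you would have paid had the coin landed the other way.

```python
# Rough sketch of the Counterfactual Prisoner's Dilemma payoffs (amounts illustrative).
# The reward in the branch you actually face depends on what you would have
# done in the branch you did not face.
COST, REWARD = 100, 10_000

def payoff(coin: str, pays_on_heads: bool, pays_on_tails: bool) -> int:
    pays_here = pays_on_heads if coin == "heads" else pays_on_tails
    pays_counterfactually = pays_on_tails if coin == "heads" else pays_on_heads
    return (REWARD if pays_counterfactually else 0) - (COST if pays_here else 0)

for policy in [(True, True), (False, False)]:
    print(policy, payoff("heads", *policy), payoff("tails", *policy))
# (True, True)   9900 9900  -> always-pay nets $9,900 whichever way the coin lands
# (False, False)    0    0  -> never-pay nets $0 whichever way the coin lands
```

If that sketch matches the setup, then even after seeing the coin, the agent in either branch is better off for being the always-pay type, which is why I say the updateful perspective doesn’t quite succeed on its own terms.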