Somehow, blackmail from the future seems less plausible to me than every single one of your examples. Not sure why exactly.
How plausible do you find TDT and related decision theories as normative accounts of decision making, or at least as work towards such accounts? They open whole new realms of situations like Pascal’s Mugging, of which Roko’s Basilisk is one. If you’re going to think in detail about such decision theories, and adopt one as normative, you need to have an answer to these situations.
Once you’ve decided to study something seriously, the plausibility heuristic is no longer available.
I find TDT to be basically bullshit except possibly when it is applied to entities which literally see each other’s code, in which case I’m not sure (I’m not even sure if the concept of “decision” even makes sense in that case).
I’d go so far as to say that anyone who advocates cooperating in a one-shot prisoners’ dilemma simply doesn’t understand the setting. By definition, defecting gives you a better outcome than cooperating. Anyone who claims otherwise is changing the definition of the prisoners’ dilemma.
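To make the dominance claim concrete, here is a minimal sketch with illustrative payoffs in the standard ordering T > R > P > S (the specific numbers are assumptions, not from the thread):

```python
# Illustrative one-shot PD payoffs (temptation > reward > punishment > sucker).
PAYOFF = {            # (my_move, their_move) -> my payoff
    ("C", "C"): 3,    # R: reward for mutual cooperation
    ("C", "D"): 0,    # S: sucker's payoff
    ("D", "C"): 5,    # T: temptation payoff
    ("D", "D"): 1,    # P: punishment for mutual defection
}

# Holding the other player's move fixed, defecting always pays strictly more.
for their_move in ("C", "D"):
    assert PAYOFF[("D", their_move)] > PAYOFF[("C", their_move)]
```

This is the sense in which defection dominates; the replies below dispute whether the other player's move can really be held fixed independently of yours.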
Defecting gives you a better outcome than cooperating if your decision is uncorrelated with the other players’. Different humans’ decisions aren’t 100% correlated, but they also aren’t 0% correlated, so the rationality of cooperating in the one-shot PD varies situationally for humans.
Part of the reason why humans often cooperate in PD-like scenarios in the real world is probably that there’s uncertainty about how iterated the PD is (and our environment of evolutionary adaptedness had a lot more iterated encounters than once-off encounters). But part of the reason for cooperation is probably also that we’ve evolved to do a very weak and probabilistic version of ‘source code sharing’: we’ve evolved to (sometimes) involuntarily display veridical evidence of our emotions, personality, etc. -- as opposed to being in complete control of the information we give others about our dispositions.
Because they’re at least partly involuntary and at least partly veridical, ‘tells’ give humans a way to trust each other even when there are no bad consequences to betrayal—which means at least some people can trust each other at least some of the time to uphold contracts in the absence of external enforcement mechanisms. See also Newcomblike Problems Are The Norm.
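One way to make "varies situationally" concrete is an expected-value calculation in which the opponent's move is statistically dependent on yours (an evidential-style sketch; the payoffs and conditional probabilities are illustrative assumptions, not from the comment):

```python
# Expected payoff when the opponent's move is correlated with mine.
R, S, T, P = 3, 0, 5, 1   # illustrative PD payoffs

def expected_payoff(my_move: str, p_coop_if_i_coop: float, p_coop_if_i_defect: float) -> float:
    p_they_coop = p_coop_if_i_coop if my_move == "C" else p_coop_if_i_defect
    if_they_coop, if_they_defect = (R, S) if my_move == "C" else (T, P)
    return p_they_coop * if_they_coop + (1 - p_they_coop) * if_they_defect

# No dependence: defecting wins, exactly as the dominance argument says.
print(expected_payoff("C", 0.5, 0.5), expected_payoff("D", 0.5, 0.5))   # 1.5 vs 3.0
# Strong dependence: cooperating wins in expectation.
print(expected_payoff("C", 0.9, 0.1), expected_payoff("D", 0.9, 0.1))   # 2.7 vs 1.4
```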
Defecting gives you a better outcome than cooperating if your decision is uncorrelated with the other players’. Different humans’ decisions aren’t 100% correlated, but they also aren’t 0% correlated, so the rationality of cooperating in the one-shot PD varies situationally for humans.
You’re confusing correlation with causation. Different players’ decisions may be correlated, but they sure as hell aren’t causative of each other (unless they literally see each other’s code, maybe).
But part of the reason for cooperation is probably also that we’ve evolved to do a very weak and probabilistic version of ‘source code sharing’: we’ve evolved to (sometimes) involuntarily display veridical evidence of our emotions, personality, etc. -- as opposed to being in complete control of the information we give others about our dispositions.
Calling this source code sharing, instead of just “signaling for the purposes of a repeated game”, seems counter-productive. Yes, I agree that in a repeated game, the situation is trickier and involves a lot of signaling. The one-shot game is much easier: just always defect. By definition, that’s the best strategy.
You’re confusing correlation with causation. Different players’ decisions may be correlated, but they sure as hell aren’t causative of each other (unless they literally see each other’s code, maybe). [...] The one-shot game is much easier: just always defect. By definition, that’s the best strategy.
Imagine you are playing against a clone of yourself. Whatever you do, the clone will do the exact same thing. If you choose to cooperate, he will choose to cooperate. If you choose to defect, he chooses to defect.
The best choice is obviously to cooperate.
So there are situations where cooperating is optimal. Despite there not being any causal influence between the players at all.
I think these kinds of situations are so exceedingly rare and unlikely they aren’t worth worrying about. For all practical purposes, the standard game theory logic is fine. But it’s interesting that they exist. And some people are so interested by that, that they’ve tried to formalize decision theories that can handle these situations. And from there you can possibly get counter-intuitive results like the basilisk.
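The clone argument can be put very compactly: whatever decision procedure you run, the identical procedure runs on the other side, so only the diagonal outcomes are reachable. A minimal sketch (illustrative payoffs):

```python
# In the clone game both players are the same decision procedure,
# so the only reachable outcomes are (C, C) and (D, D).
R, P = 3, 1   # mutual cooperation vs. mutual defection (illustrative)

def clone_game(decision_procedure) -> int:
    my_move = decision_procedure()
    clone_move = decision_procedure()   # the clone runs the identical procedure
    return R if (my_move, clone_move) == ("C", "C") else P

print(clone_game(lambda: "C"))   # 3
print(clone_game(lambda: "D"))   # 1
```

The off-diagonal payoffs never come into play here, which is why the dominance argument does not settle this particular case.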
If I’m playing my clone, it’s not clear that even saying that I’m making a choice is well-defined. After all, my choice will be what my code dictates it will be. Do I prefer that my code cause me to cooperate? Sure, but only because we stipulated that the other player shares the exact same code; it’s more accurate to say that I prefer my opponent’s code to cause him to cooperate, and it just so happens that his code is the same as mine.
In real life, my code is not the same as my opponent’s, and when I contemplate a decision, I’m only thinking about what I want my code to say. Nothing I do changes what my opponent does; therefore, defecting is correct.
Let me restate once more: the only time I’d ever want to cooperate in a one-shot prisoners’ dilemma would be if I thought my decision could affect my opponent’s decision. If the latter is the case, though, then I’m not sure if the game was even a prisoners’ dilemma to begin with; instead it’s some weird variant where the players don’t have the ability to independently make decisions.
If I’m playing my clone, it’s not clear that even saying that I’m making a choice is well-defined. After all, my choice will be what my code dictates it will be. Do I prefer that my code cause me to cooperate? Sure, but only because we stipulated that the other player shares the exact same code; it’s more accurate to say that I prefer my opponent’s code to cause him to cooperate, and it just so happens that his code is the same as mine.
I think you are making this more complicated than it needs to be. You don’t need to worry about your code. All you need to know is that it’s an exact copy of you playing. And that he will make the same decision you do. No matter how hard you think about your “code” or wish he would make a different choice, he will just do the same thing as you.
In real life, my code is not the same as my opponent’s, and when I contemplate a decision, I’m only thinking about what I want my code to say. Nothing I do changes what my opponent does; therefore, defecting is correct.
In real games with real humans, yes, usually. As I said, I don’t think these cases are common enough to worry about. But I’m just saying they exist.
But it is more general than just clones. If you know your opponent isn’t exactly the same as you, but still follows the same decision algorithm in this case, the principle is still valid. If you cooperate, he will cooperate. Because you are both following the same process to come to a decision.
the only time I’d ever want to cooperate in a one-shot prisoners’ dilemma would be if I thought my decision could affect my opponent’s decision.
Well there is no causal influence. Your opponent is deterministic. His choice may have already been made and nothing you do will change it. And yet the best decision is still to cooperate.
Well there is no causal influence. Your opponent is deterministic. His choice may have already been made and nothing you do will change it. And yet the best decision is still to cooperate.
If his choice is already made and nothing I do will change it, then by definition my choice is already made and nothing I do will change it. That’s why my “decision” in this setting is not even well-defined—I don’t really have free will if external agents already know what I will do.
Yes. The universe is deterministic. Your actions are completely predictable, in principle. That’s not unique to this thought experiment. That’s true for everything you do. You still have to make a choice. Cooperate or defect?
Yes. The universe is deterministic. Your actions are completely predictable, in principle. That’s not unique to this thought experiment. That’s true for everything you do. You still have to make a choice. Cooperate or defect?
Um, what? First of all, the universe is not deterministic—quantum mechanics means there’s inherent randomness. Secondly, as far as we know, it’s consistent with the laws of physics that my actions are fundamentally unpredictable—see here.
Third, if I’m playing against a clone of myself, I don’t think it’s even a valid PD. Can the utility functions ever differ between me and my clone? Whenever my clone gets utility, I get utility, because there’s no physical way to distinguish between us (I have no way of saying which copy “I” am). But if we always have the exact same utility—if his happiness equals my happiness—then constructing a PD game is impossible.
Finally, even if I agree to cooperate against my clone, I claim this says nothing about cooperating versus other people. Against all agents that don’t have access to my code, the correct strategy in a one-shot PD is to defect, but first do/say whatever causes my opponent to cooperate. For example, if I was playing against LWers, I might first rant on about TDT or whatever, agree with my opponent’s philosophy as much as possible, etc., etc., and then defect in the actual game. (Note again that this only applies to one-shot games).
Even if you’re playing against a clone, you can distinguish the copies by where they are in space and so on. You can see which side of the room you are on, so you know which one you are. That means one of you can get utility without the other one getting it.
People don’t actually have the same code, but they have similar code. If the code in some case is similar enough that you can’t personally tell the difference, you should follow the same rule as when you are playing against a clone.
You can see which side of the room you are on, so you know which one you are.
If I can do this, then my clone and I can do different things. In that case, I can’t be guaranteed that if I cooperate, my clone will too (because my decision might have depended on which side of the room I’m on). But I agree that the cloning situation is strange, and that I might cooperate if I’m actually faced with it (though I’m quite sure that I never will).
People don’t actually have the same code, but they have similar code. If the code in some case is similar enough that you can’t personally tell the difference, you should follow the same rule as when you are playing against a clone.
How do you know if people have “similar” code to you? See, I’m anonymous on this forum, but in real life, I might pretend to believe in TDT and pretend to have code that’s “similar” to people around me (whatever that means—code similarity is not well-defined). So you might know me in real life. If so, presumably you’d cooperate if we played a PD, because you’d believe our code is similar. But I will defect (if it’s a one-time game). My strategy seems strictly superior to yours—I always get more utility in one-shot PDs.
I would cooperate with you if I couldn’t distinguish my code from yours, even if there might be minor differences, even in a one-shot case, because the best guess I would have of what you would do is that you would do the same thing that I do.
But since you’re making it clear that your code is quite different, and in a particular way, I would defect against you.
But since you’re making it clear that your code is quite different, and in a particular way, I would defect against you.
You don’t know who I am! I’m anonymous! Whoever you’d cooperate with, I might be that person (remember, in real life I pretend to have a completely different philosophy on this matter). Unless you defect against ALL HUMANS, you risk cooperating when facing me, since you don’t know what my disguise will be.
I will take that chance into account. Fortunately it is a low one and should hardly be a reason to defect against all humans.
Cool, so in conclusion, if we met in real life and played a one-shot PD, you’d (probably) cooperate and I’d defect. My strategy seems superior.
And yet I somehow find myself more inclined to engage in PD-like interactions with entirelyuseless than with your good self.
Oh, yes, me too. I want to engage in one-shot PD games with entirelyuseless (as opposed to other people), because he or she will give me free utility if I sell myself right. I wouldn’t want to play one-shot PDs against myself, in the same way that I wouldn’t want to play chess against Kasparov.
By the way, note that I usually cooperate in repeated PD games, and most real-life PDs are repeated games. In addition, my utility function takes other people into consideration; I would not screw people over for small personal gains, because I care about their happiness. In other words, defecting in one-shot PDs is entirely consistent with being a decent human being.
You’re confusing correlation with causation. Different players’ decisions may be correlated, but they sure as hell aren’t causative of each other (unless they literally see each other’s code, maybe).
Causation isn’t necessary. You’re right that correlation isn’t quite sufficient, though!
What’s needed for rational cooperation in the prisoner’s dilemma is a two-way dependency between A and B’s decision-making. That can be because A is causally impacting B, or because B is causally impacting A; but it can also occur when there’s a common cause and neither is causing the other, like when my sister and I have similar genomes even though my sister didn’t create my genome and I didn’t create her genome. Or our decision-making processes can depend on each other because we inhabit the same laws of physics, or because we’re both bound by the same logical/mathematical laws—even if we’re on opposite sides of the universe.
(Dependence can also happen by coincidence, though if it’s completely random I’m not sure how you’d find out about it in order to act upon it!)
The most obvious example of cooperating due to acausal dependence is making two atom-by-atom-identical copies of an agent and putting them in a one-shot prisoner’s dilemma against each other. But two agents whose decision-making is 90% similar instead of 100% identical can cooperate on those grounds too, provided the utility of mutual cooperation is sufficiently large.
For the same reason, a very large utility difference can rationally mandate cooperation even if cooperating only changes the probability of the other agent’s behavior from ’100% probability of defection’ to ‘99% probability of defection’.
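A hedged numeric sketch of that last point, with payoffs of my own choosing that still respect the PD ordering T > R > P > S:

```python
# Cooperating moves the other agent from 100% defection to 99% defection,
# i.e. P(they cooperate | I cooperate) = 0.01 and P(they cooperate | I defect) = 0.
T, R, P, S = 300, 200, 1, 0         # illustrative payoffs, T > R > P > S

ev_defect    = 1.00 * P             # they defect for sure if I defect
ev_cooperate = 0.99 * S + 0.01 * R  # small chance of the large mutual-cooperation payoff
print(ev_cooperate, ev_defect)      # 2.0 > 1.0
```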
Calling this source code sharing, instead of just “signaling for the purposes of a repeated game”, seems counter-productive.
I disagree! “Code-sharing” risks confusing someone into thinking there’s something magical and privileged about looking at source code. It’s true this is an unusually rich and direct source of information (assuming you understand the code’s implications and are sure what you’re seeing is the real deal), but the difference between that and inferring someone’s embarrassment from a blush is quantitative, not qualitative.
Some sources of information are more reliable and more revealing than others; but the same underlying idea is involved whenever something is evidence about an agent’s future decisions. See: Newcomblike Problems are the Norm
Yes, I agree that in a repeated game, the situation is trickier and involves a lot of signaling. The one-shot game is much easier: just always defect. By definition, that’s the best strategy.
If you and the other player have common knowledge that you reason the same way, then the correct move is to cooperate in the one-shot game. The correct move is to defect when those conditions don’t hold strongly enough, though.
The most obvious example of cooperating due to acausal dependence is making two atom-by-atom-identical copies of an agent and putting them in a one-shot prisoner’s dilemma against each other. But two agents whose decision-making is 90% similar instead of 100% identical can cooperate on those grounds too, provided the utility of mutual cooperation is sufficiently large.
I’m not sure what “90% similar” means. Either I’m capable of making decisions independently from my opponent, or else I’m not. In real life, I am capable of doing so. The clone situation is strange, I admit, but in that case I’m not sure to what extent my “decision” even makes sense as a concept; I’ll clearly decide whatever my code says I’ll decide. As soon as you start assuming that copies of my code are out there, I stop being comfortable with assigning me free will at all.
Anyway, none of this applies to real life, not even approximately. In real life, my decision cannot change your decision at all; in real life, nothing can even come close to predicting a decision I make in advance (assuming I put even a little bit of effort into that decision).
If you’re concerned about blushing etc., then you’re just saying the best strategy in a prisoner’s dilemma involves signaling very strongly that you’re trustworthy. I agree that this is correct against most human opponents. But surely you agree that if I can control my microexpressions, it’s best to signal “I will cooperate” while actually defecting, right?
Let me just ask you the following yes or no question: do you agree that my “always defect, but first pretend to be whatever will convince my opponent to cooperate” strategy beats all other strategies for a realistic one-shot prisoners’ dilemma? By one-shot, I mean that people will not have any memory of me defecting against them, so I can suffer no ill effects from retaliation.
I’d go so far as to say that anyone who advocates cooperating in a one-shot prisoners’ dilemma simply doesn’t understand the setting. By definition, defecting gives you a better outcome than cooperating. Anyone who claims otherwise is changing the definition of the prisoners’ dilemma.
I think this is correct. I think the reason to cooperate is not to get the best personal outcome, but because you care about the other person. I think we have evolved to cooperate, or perhaps that should be stated as we have evolved to want to cooperate. We have evolved to value cooperating. Our values come from our genes and our memes, and both are subject to evolution, to natural selection. But we want to cooperate.
So if I am in a prisoner’s dilemma against another human, if I perceive that other human as “one of us,” I will choose cooperation. Essentially, I care about their outcome. But in a one-shot PD defecting is the “better” strategy. The problem is that with genetic and/or memetic evolution of cooperation, we are not playing in a one-shot PD. We are playing with a set of values that developed over many shots.
Of course we don’t always cooperate. But when we do cooperate in one-shot PDs, it is because, in some sense, there are so darn many one-shot PDs, especially in the universe of hypotheticals, that we effectively know there is no such thing as a one-shot PD. This should not be too hard to accept around here, where people semi-routinely accept simulations of themselves or clones of themselves as somehow just as important as their actual selves. I.e., we don’t even accept the “one-shottedness” of ourselves.
I think the reason to cooperate is not to get the best personal outcome, but because you care about the other person.
If you have 100% identical consequentialist values to all other humans, then that means ‘cooperation’ and ‘defection’ are both impossible for humans (because they can’t be put in PDs). Yet it will still be correct to defect (given that your decision and the other player’s decision don’t strongly depend on each other) if you ever run into an agent that doesn’t share all your values. See The True Prisoner’s Dilemma.
This shows that the iterated dilemma and the dilemma-with-common-knowledge-of-rationality allow cooperation (i.e., giving up on your goal to enable someone else to achieve a goal you genuinely don’t want them to achieve), whereas loving compassion and shared values merely change goal-content. To properly visualize the PD, you need an actual value conflict—e.g., imagine you’re playing against a serial killer in a hostage negotiation. ‘Cooperating’ is just an English-language label; the important thing is the game-theoretic structure, which allows that sometimes ‘cooperating’ looks like letting people die in order to appease a killer’s antisocial goals.
To properly visualize the PD, you need an actual value conflict
I think belief conflicts might work, even if the same values are shared. Suppose you and I are at a control panel for three remotely wired bombs in population centers. Both of us want as many people to live as possible. One bomb will go off in ten seconds unless we disarm it, but the others will stay inert unless activated. I believe that pressing the green button causes all bombs to explode, and pressing the red button defuses the time bomb. You believe the same thing, but with the colors reversed. Both of us would rather that no buttons be pressed than both buttons be pressed, but each of us would prefer that just the defuse button be pressed, and that the other person not mistakenly kill all three groups. (Here, attempting to defuse is ‘defecting’ and not attempting to defuse is ‘cooperating’.)
[Edit]: As written, in terms of lives saved, this doesn’t have the property that (D,D)>(C,D); if I press my button, you are indifferent between pressing your button or not. So it’s not true that D strictly dominates C, but the important part of the structure is preserved, and a minor change could make it so D strictly dominates C.
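One way to make the [Edit] concrete is to write out the payoffs explicitly. The construction below is my own illustrative reading of the scenario, measured in the number of population centers each player expects to survive under their own beliefs:

```python
# Pressing = press the button you believe defuses the time bomb ('defecting');
# not pressing is 'cooperating'. Payoffs are population centers *I* expect to
# survive, given my belief that my button defuses and your button detonates.

def my_expected_survivors(i_press: bool, you_press: bool,
                          your_button_also_hits_timebomb: bool = True) -> int:
    if you_press:
        if your_button_also_hits_timebomb:
            return 0                  # as I see it, everything explodes
        return 1 if i_press else 0    # variant: I can still save the time-bomb city
    return 3 if i_press else 2        # otherwise only the time bomb matters

# As written: (D,C)=3 > (C,C)=2 but (D,D)=(C,D)=0, so D only weakly dominates C.
print([my_expected_survivors(i, y) for i in (True, False) for y in (False, True)])
# Minor change (your button only activates the two inert bombs):
# (D,C)=3 > (C,C)=2 > (D,D)=1 > (C,D)=0, so D strictly dominates C.
print([my_expected_survivors(i, y, False) for i in (True, False) for y in (False, True)])
```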
I think belief conflicts might work, even if the same values are shared.
You can solve belief conflicts simply by trading in a prediction market with decision-contingent contracts (a “decision market”). Value conflicts are more general than that.
I think this is misusing the word “general.” Value conflicts are more narrow than the full class of games that have the PD preference ordering. I do agree that value conflicts are harder to resolve than belief conflicts, but that doesn’t make them more general.
If you have 100% identical consequentialist values to all other humans, then that means ‘cooperation’ and ‘defection’ are both impossible for humans (because they can’t be put in PDs). … To properly visualize the PD, you need an actual value conflict
True, but the flip side of this is that efficiency (in Coasian terms) is precisely defined as pursuing 100% identical consequentialist values, where the shared “values” are determined by a weighted sum of each agent’s utility function (and the weights are typically determined by agent endowments).
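In standard welfare-economics notation (my formulation, not the commenter's), that shared objective is a weighted utilitarian sum:

$$W(x) = \sum_i w_i \, U_i(x), \qquad w_i \ge 0,$$

where an outcome counts as efficient when it maximizes W for some choice of weights, and in the Coasian setting the weights end up reflecting the agents' initial endowments or bargaining positions.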
I think the reason to cooperate is not to get the best personal outcome, but because you care about the other person.
I just want to make it clear that by saying this, you’re changing the setting of the prisoners’ dilemma, so you shouldn’t even call it a prisoners’ dilemma anymore. The prisoners’ dilemma is defined so that you get more utility by defecting; if you say you care about your opponent’s utility enough to cooperate, it means you don’t get more utility by defecting, since cooperation gives you utility. Therefore, all you’re saying is that you can never be in a true prisoners’ dilemma game; you’re NOT saying that in a true PD, it’s correct to cooperate (again, by definition, it isn’t).
The most likely reason people are evolutionarily predisposed to cooperate in real-life PDs is that almost all real-life PDs are repeated games and not one-shot. Repeated prisoners’ dilemmas are completely different beasts, and it can definitely be correct to cooperate in them.