This is actually a parable on the boundaries of self (think a bit Buddhist here). Let me pose this another way: late last night in the pub, my past self committed to the drunken bet of $100 vs. $200 on the flip of a coin (the other guy was even more drunk than I was). My past self lost, but didn’t have the money. This morning, my present self gets a phone call from the person it lost to. Does it honor the bet? Assuming, as is typical in these hypothetical problems, that we can ignore the consequences (else we’d have to assign them a cost that might well offset the gains, so we’ll just assign 0 and not consider them), a utilitarian approach says that I should default on the bet if I can get away with it. Why should I be responsible for what I said yesterday?
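For concreteness, the utilitarian arithmetic behind that framing can be spelled out in a few lines (a minimal sketch: the $100/$200 stakes come from the bet above, and the zero cost of defaulting is the stipulation just made):

```python
# Ex ante, accepting the bet was rational: a fair coin pays $200 or loses $100.
p_win = 0.5
ev_accept = p_win * 200 + (1 - p_win) * (-100)  # +50.0, positive expected value

# Ex post, having lost, and with every consequence of defaulting stipulated
# to cost 0: paying costs $100, defaulting costs nothing.
ev_pay, ev_default = -100, 0

print(ev_accept, ev_pay, ev_default)  # 50.0 -100 0
```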
However, as usual in utilitarian dilemmas, the effect we get in real life is that we have a conscience: can I live with myself being the kind of person who doesn’t honor past commitments? So most people will, out of one consideration or another, not think twice about paying up the $100.
Of Omega it is said that I can trust it more than I would trust myself. It knows more about me than I do myself. It would be part of myself if I didn’t consider it separate from myself. If I consider my ego and Omega part of the same all-encompassing self, then honoring the commitment that Omega made on my behalf should draw the same response as if I had made it myself. Only if I perceive Omega as a separate entity to whom I am not morally obligated can I justify not paying the $100. Only with this individualist viewpoint will I see someone to whom I am in no way obligated demanding $100 of me.
If you manage to instill in your AI a sense of the “common good”, a sense of brotherhood of all intelligent creatures, then it will, given the premises of trust etc., cooperate in this brotherhood; in fact, I believe that is one of the meanings of “friendly”.
Your version of the story discards the most important ingredient: the fact that when you win the coin toss, you receive money only if you would have paid had you lost.
As for Omega, all we know about it is that somehow it can accurately predict your actions. For the purposes of Counterfactual Mugging we may as well regard Omega as a mindless robot which will burn the money you give to it and then self-destruct immediately after the game. (This makes it impossible to pay because you feel obligated to Omega. In fact, the idea is that you pay up because you feel obligated to your counterfactual self.)
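To make that structure explicit, here is a small simulation sketch. The $100 payment is from the thread; the $10,000 prize is an assumed figure borrowed from the usual statement of Counterfactual Mugging, and the function names are made up for illustration. Omega’s perfect prediction is modeled by simply letting it read the agent’s policy:

```python
import random

def play(would_pay: bool, prize: float = 10_000, cost: float = 100) -> float:
    """One round of the game as described above: Omega flips a fair coin.
    On a loss it asks the agent for `cost`; on a win it pays `prize` only
    if it predicts the agent would have paid after losing. The prediction
    is modeled as perfect: Omega just reads the `would_pay` policy."""
    if random.random() < 0.5:           # losing branch
        return -cost if would_pay else 0.0
    return prize if would_pay else 0.0  # winning branch

def average_payoff(would_pay: bool, rounds: int = 100_000) -> float:
    return sum(play(would_pay) for _ in range(rounds)) / rounds

# A committed payer averages about (prize - cost) / 2 = $4,950 per round;
# a refuser averages $0. The policy, not the individual act, carries the value.
print(average_payoff(True), average_payoff(False))
```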
I don’t see how your points apply: I would have paid had I lost. Except if my hypothetical self is so deep in debt that it can’t reasonably spend $100 on an investment such as this, in which case Omega would have known in advance and understands my nonpayment.
I do not consider the future existence of Omega as a factor at all, so it doesn’t matter whether it self-destructs or not. And it is also a given that Omega is absolutely trustworthy (more than I could say for myself).
My view is that this may well be one of the undecidable propositions that Gödel has shown must exist in any sufficiently complex formal system. The only way to make it decidable is to think outside the box, and in this case that means considering that someone else is somehow still “me” (at least in ethical respects); there are other threads on here that involve splitting myself and still remaining the same person somehow, so it’s not intrinsically irrational or anything. My reference to Buddhism was merely meant to show that the concept is mainstream enough to be part of a major world religion, though most other religions, and the UN’s Universal Declaration of Human Rights, have it as well, if less pronounced, as “brotherhood”: not a factual, but an ethical identity.
After a good night’s sleep, here are some more thoughts:
the idea is that you pay up because you feel obligated to your counterfactual self.
To feel obligated to my counterfactual self, which exists only in the “mind” of Omega, and not feel obligated to Omega doesn’t make any sense to me.
Your additional assumptions about Omega destroy the utility that the $100 had: in the original version, $100 is $100 to both me and Omega, but in your version it is nothing to Omega. Your amended version of the problem amounts to “would I throw $100 into an incinerator on the basis of some thought experiment”, and that is clearly not even a zero-sum game if you consider the whole system; the money is simply destroyed. The original problem is zero-sum, and that gives me more freedom of choice.
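The accounting point can be made explicit (a sketch under the same assumed $100 transfer):

```python
# Original version: the $100 moves from me to Omega; system wealth is conserved.
total_original = -100 + 100   # 0: a zero-sum transfer

# Amended version: Omega burns the money, so the $100 leaves the system.
total_amended = -100 + 0      # -100: negative-sum, not merely zero-sum

print(total_original, total_amended)  # 0 -100
```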