I work on AI. In particular, on decision systems stable under self-modification. Any agent who does not give the $100 in situations like this will self-modify to give $100 in situations like this. I don’t spend a whole lot of time thinking about decision theories that are unstable under reflection. QED.
Even considering situations like this and having special cases for them sounds like it would add a bit too much cruft to the system.
Do you have a working AI that I could look at to see how this would work?
If you need special cases, your decision theory is not consistent under reflection. In other words, it should simply always do the thing that it would precommit to doing, because, as MBlume put it, the decision theory is formulated in such a fashion that “What would you precommit to?” and “What will you do?” work out to be one and the same question.
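A minimal sketch of that property, in Python, with purely illustrative details: the coin, the 10,000 reward, the 100 cost, and every function name below are assumptions for the sake of the example, not taken from the thread. The point is only structural: “What will you do?” is answered by re-running the same computation as “What would you precommit to?”, evaluated from the prior, so the two questions return the same answer by construction.

```python
# Toy sketch only: the scenario shape and payoff numbers (win 10_000 on heads
# if you are the kind of agent who pays 100 on tails) are illustrative
# assumptions, not details from this thread.

def policy_value(pays_on_tails: bool, p_heads: float = 0.5,
                 reward: float = 10_000.0, cost: float = 100.0) -> float:
    """Expected value of a policy, judged from before the coin is flipped.
    The predictor only rewards you on heads if your policy pays on tails."""
    heads = reward if pays_on_tails else 0.0
    tails = -cost if pays_on_tails else 0.0
    return p_heads * heads + (1.0 - p_heads) * tails

def what_would_you_precommit_to() -> bool:
    """Pick whichever policy scores better under the prior."""
    return policy_value(True) > policy_value(False)

def what_will_you_do(coin_is_tails: bool) -> bool:
    """A reflectively consistent agent just runs the precommitment
    computation again; seeing the coin does not change the answer."""
    return what_would_you_precommit_to()

if __name__ == "__main__":
    assert what_will_you_do(coin_is_tails=True) == what_would_you_precommit_to()
    print("Pays the $100 on tails:", what_will_you_do(coin_is_tails=True))
```

Because the deciding-now function delegates to the precommitment computation, there is no special case to bolt on when the tails branch actually arrives.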
But this is precisely what humans don’t do, because we respond to a “near” situation differently than a “far” one. Your advance prediction of your decision is untrustworthy unless you can successfully simulate the real future environment in your mind with sufficient sensory detail to invoke “near” reasoning. Otherwise, you will fail to reach a consistent decision in the actual situation.
Unless, of course, in the actual situation you’re projecting back, “What would I have decided in advance to do had I thought about this in advance?”—and you successfully mitigate all priming effects and situationally-motivated reasoning.
Or to put all of the above in short, common-wisdom form: “that’s easy for you to say NOW...” ;-)