Reviewing the post with your update, I think the problem may just be that the examples are de-priming my intuition. In your reply you chose ‘the human doesn’t eat’ as the reward for a modified human to maximize, which means the gains are only the food the human would otherwise have eaten. This is compared against brain surgery, which a bit of googling suggests costs $50–150K, far more than it costs to feed a person. As a consequence, it looks like I chunked the proposition as ‘costly intervention to achieve bounded reward.’
However, none of this is actually implied by the math. Insofar as you project there are likely to be other readers like me, it may be worth changing the examples to emphasize a trivial intervention for a very high reward.
The brain surgery is an example of how the AI can transform us into the humans it wants us to be—an extreme version of wireheading.
That much I understood—my flaw was reading too much into the example.