nigerweiss comments on Inferring Values from Imperfect Optimizers

nigerweiss 30 Dec 2012 6:27 UTC
0 points
Thank you, that was interesting reading. If I’m not mistaken, though, the Nielsen-Jensen paper is about making value inference more robust in the presence of contradictory behavior. It doesn’t seem to me that this sort of procedure will reliably separate the values we’re interested in from the limitations of human rationality.

The idea (page sixteen of your second citation) of extracting a human utility function by eliminating contradictory or inconsistent features from a model of human behavior in general is interesting, but I have some reservations about it. A number of studies suggest that human moral intuition can itself be contradictory or incoherent. If that’s the case, stripping out every inconsistency risks discarding some of the very values we want to recover, and I’d prefer not to throw the baby out with the bathwater.