Agreed. But depending on exactly what’s meant I think lukeprog is still correct in the statement that “we’d be unlikely to get what we really want if the world was re-engineered in accordance with a description of what we want that came from verbal introspective access to our motivations”, simply because the descriptions that people actually produce from this are so incomplete. We’d have to compile something from asking “Would you prefer Moldova to be invaded or not? Would you prefer...”, etc., since people wouldn’t even think of that question themselves. (And we’d probably need specific scenarios, not just “Moldova is invaded vs. not”.)
And since verbal introspection is so unreliable, a better check might be to somehow actually simulate you in a world where Moldova is invaded vs. not, and see which you prefer. That may be getting a little too close to “license to be human” territory, since that would obviously be revealed preference. But given human inconsistency—specifically, the fact that our preferences over actions don’t always seem to follow from our preferences over consequences the way they should—I’m not certain it’s the sort of revealed preference that causes problems. It’s when you go by our preferences over actions that you get the real problems...