Unaligned AIs don’t necessarily have efficient idealized values. Waiting for (simulated) humans to decide is analogous to computing a complicated pivotal fact about an unaligned AI’s values. It’s not clear that “naturally occurring” unaligned AIs have simpler idealized/extrapolated values than aligned AIs with upload-based value definitions. Some unaligned AIs may even end up on the losing side; recall the encrypted-values AI example.