I like these. Can I add one?
Democratic Lock-In
Once upon a time, enough humans cooperated to make sure that AI would behave according to (something encoding a generally acceptable approximation to) the coherent extrapolated volition of the majority of humans. Unfortunately, it turned out that most humans have really lousy volition. The entire universe ended up devoted to sports and religion. The minority whose volition lay outside of that attractor were gently reprogrammed to like it.
Moral: You, personally, may not be “aligned”.
This is often overlooked here (perhaps with good reason as many examples will be controversial). Scenarios of this kind can be very, very bad, much worse than a typical unaligned AI like Clippy.
For example, I would take Clippy over an AI whose goal was to spread biological life throughout the universe any day. I expect this may be controversial even here, but see https://longtermrisk.org/the-importance-of-wild-animal-suffering/#Inadvertently_Multiplying_Suffering for why I think this way.
The following spoiler has two parts: part 1 tells you the work that is referenced, part 2 tells you the details.
1. Reminds me of Crystal Society.
2. (Specifically, how ‘the robot doesn’t care about us’ is treated with ‘let’s change its values so it does.’ There are probably other examples from that series though.)
The minority whose volition lay outside of that attractor were gently reprogrammed to like it.
Or there could be… more kinds of sports? Via novel tech? You might not like sports now, but that’s basically where video games/VR can be pointed. (Realizing a vision may not be easy, though.)
Would you want to go to the moon, and play a game involving human-powered gliders at least once?
AI was responsible for an event a while back that got way more people watching chess than normally do.
Edited to change the spoiler.
Um. This spoiler tag was not very helpful because I didn’t have any hint about what was hiding under it.
Is that better?
Yes. Though it could be improved further by elaborating “the work of fiction that is spoiled”, instead of just “the work.”
Isn’t that the same as the last one?
Just call it a “Status Quo Lock-In” or “Arbitrary Lock-In”
Well, it’s intentionally a riff on that one. I wanted one that illustrated that these “shriek” situations, where some value system takes over and gets locked in forever, don’t necessarily involve “defectors”. I felt that the last scenario was missing something by concentrating entirely on the “sneaky defector takes over” aspect, and I didn’t see any that brought out the “shared human values aren’t necessarily all that” aspect.
Ah, good point! I have a feeling this is a central issue that is hardly discussed here (or anywhere).
Will MacAskill calls this the “actual alignment problem”.
Wei Dai has written a lot about related concerns in posts like The Argument from Philosophical Difficulty.