Nornagest comments on Decision Theory FAQ

Nornagest 18 Mar 2013 3:34 UTC
4 points
We know value stability is a problem in recursive self-modification scenarios. We don’t know, to put it very mildly, that unstable values will tend towards cozy human-friendly universals, and in fact we have excellent reasons to believe they won’t, especially if they start somewhere as bizarre as paperclippism.

In discussions of a self-improving Clippy, Clippy’s values are usually presumed stable. The alternative, a Clippy whose values drift as it self-modifies, is (probably) no less dire, but it is a lot harder to visualize.