This needs a safety hatch.
It is a recurring pattern in history for determined, well-intentioned people to seize power and then do damage. Certainly we’re different because we’re rational, but they were different because they were ${virtueTheyValueMost}. See also The Outside View and The Sorting Hat’s Warning.
A conspiracy of rationalists is even more disturbing because of how closely it resembles an AI. As individuals, we balance moral logic, based on our admittedly underspecified terminal values, against moral intuition. But our intuitions do not match, nor do we communicate them easily. So collectively, moral logic dominates. Pure moral logic without really good terminal values… we’ve been over this.
Don’t worry. This is exactly what the Contrarian Conspiracy was designed to prevent.
Everything is going according to plan.
Huh. An interesting point, and one that I should have considered. So what would you suggest as a safety hatch?
I don’t know, but I’ll throw some ideas up. These aren’t all the possibilities and probably don’t include the best possibility.
Each step must be moral when taken in isolation. No it’ll-be-worth-it-in-ten-years reasoning, since that can go especially horribly wrong.
Work honestly within the existing systems. This allows existing safeguards to apply. On the other hand, it assumes it’s possible to get anything done within existing systems by being honest.
Establish some mechanism to keep moral intuition. Secret-ballot mandatory does-this-feel-right votes.
Divide into several conspiracies, which are forbidden to discuss issues with each other, to prevent groupthink.
Have an oversight conspiracy, with the power to shut us down if they believe we’ve gone evil.