You’re describing an alignment failure scenario, not a success scenario. In this case the AI has been successfully instructed to paperclip-maximize a planned utopia (however one would manage that while still failing at alignment). Successful alignment would entail the AI being able and willing to notice and correct an unwise wish.