[Question] If the alignment problem were unsolvable, would that avoid doom?
Suppose there is a useful formulation of the alignment problem that is mathematically unsolvable. Suppose that, as a corollary, it is also impossible to modify your own mind while guaranteeing that any non-trivial property holds of the resulting mind.
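For concreteness, and only as an illustration of the kind of impossibility result this would resemble (not necessarily the formulation intended here), the closest classical result is Rice's theorem: every non-trivial semantic property of programs is undecidable. A minimal sketch, treating a mind as a program:

```latex
% Hedged illustration: Rice's theorem as an analogue of the "corollary" above.
% The mapping "mind = program, desired property = non-trivial semantic property"
% is an assumption for the sake of the example, not a claim about the actual
% formalization of alignment.
\documentclass{article}
\usepackage{amsmath, amssymb, amsthm}
\newtheorem*{theorem}{Theorem}
\begin{document}
\begin{theorem}[Rice]
Let $\mathcal{P}$ be a non-trivial set of partial computable functions,
i.e.\ $\mathcal{P}$ contains some but not all partial computable functions.
Then the index set
\[
  \{\, e \in \mathbb{N} \mid \varphi_e \in \mathcal{P} \,\}
\]
is undecidable. In particular, no algorithm can take an arbitrary program
(e.g.\ a proposed self-modification) and decide whether the program it
computes has the property $\mathcal{P}$.
\end{theorem}
\end{document}
```

Note, though, that undecidability over arbitrary programs does not obviously block an agent that only ever proposes modifications drawn from a restricted, analyzable class, which is part of what the question below is asking.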
Would that prevent a new AI from trying to modify itself?
Has this direction been explored before?