[Question] If the alignment problem were unsolvable, would that avoid doom?
Suppose there is a useful formulation of the alignment problem that is mathematically unsolvable. Suppose that, as a corollary, it is also impossible to modify your own mind while guaranteeing that any non-trivial property holds of the resulting mind.
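For concreteness, and only as an illustration of the kind of impossibility result this would resemble (not necessarily the formulation intended here), the closest classical result is Rice's theorem: every non-trivial semantic property of programs is undecidable. A minimal sketch, treating a mind as a program:

```latex
% Hedged illustration: Rice's theorem as an analogue of the "corollary" above.
% The mapping "mind = program, desired property = non-trivial semantic property"
% is an assumption for the sake of the example, not a claim about the actual
% formalization of alignment.
\documentclass{article}
\usepackage{amsmath, amssymb, amsthm}
\newtheorem*{theorem}{Theorem}
\begin{document}
\begin{theorem}[Rice]
Let $\mathcal{P}$ be a non-trivial set of partial computable functions,
i.e.\ $\mathcal{P}$ contains some but not all partial computable functions.
Then the index set
\[
  \{\, e \in \mathbb{N} \mid \varphi_e \in \mathcal{P} \,\}
\]
is undecidable. In particular, no algorithm can take an arbitrary program
(e.g.\ a proposed self-modification) and decide whether the program it
computes has the property $\mathcal{P}$.
\end{theorem}
\end{document}
```

Note, though, that undecidability over arbitrary programs does not obviously block an agent that only ever proposes modifications drawn from a restricted, analyzable class, which is part of what the question below is asking.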
Would that prevent a new AI from trying to modify itself?
Has this direction been explored before?