I generally agree with Stephen Fowler, specifically that “there is no evidence that alignment is a solvable problem.”
But even if a solution can be found that provably works up to an N-level AGI, what about level N+1? Sustainable alignment is simply not possible. Our only hope is that there is some limit on N; for example, reaching N=10 might require more resources than the Universe can provide. But our ability to prove alignment will likely stop well before any such limit.
I agree that we’re not going to get any proofs of alignment, or guarantees that an N-level alignment scheme will keep working at level N+1. I also think we have reason to believe that N extends pretty far, further than we can hope to align with so little time to research. Thus, I believe our hope lies in using an aligned model to prevent anyone from building an N+1 model (including the aligned N-level model itself). If our model is both aligned and powerful, this should be possible.