I don’t think so. That’s one breaking point for alignment, but I’m saying in that post that even if we avoid a sharp left turn and make it to an aligned, superintelligent AGI, its alignment may still drift away from human values as it continues to learn. Learning may necessarily shift the meanings of existing concepts, including values.