I think you do a nice job of capturing many of the details of why I also think alignment is hard, although to be fair you are driving at a different point. I agree that most alignment research, despite the efforts of the researchers, is still not reductive enough about what sort of constructs it expects to be able to operate on in the world. In particular, it is likely to fall down because it doesn't recognize that values and beliefs are the same kind of thing, serving different purposes in different contexts and so presenting different reifications — reifications which, regardless, are not the real things that exist in humans against which AI needs to be aligned.