Sorry for the late reply, I missed your comment.

It sounds to me like the claim you are making here is “the current AI Alignment paradigm might have a major hole, but also this hole might not be real”.
I didn’t write something like that because it is not what I meant. I gave an argument whose strength depends on one’s other beliefs, and I wanted to stress that fact. I also gave two examples (reported below), so I don’t think I mentioned epistemic and moral uncertainty “in a somewhat handwavy way”.
An example: if you think that futures shaped by malevolent actors using AI are many times more likely to happen than futures shaped by uncontrolled AI, the response will strike you as very important; and vice versa if you think the opposite.
Another example: if you think that extinction is way worse than dystopic futures lasting a long time, the response won’t affect you much—assuming that bad human actors are not fans of complete extinction.
Maybe your scepticism is about my beliefs, i.e. you are saying that it is not clear from the post what my beliefs on the matter are. I think presenting the argument is more important than presenting my own beliefs: the argument can be used, or at least taken into consideration, by anyone who is interested in these topics, while my beliefs alone are useless if they are not backed up by evidence and/or arguments. In case you are curious: I do believe futures shaped by uncontrolled AI are unlikely to happen.
Now to the last part of your comment:
I’m furthermore unsure why the solution to this proposed problem is to try and design AIs to make moral progress; this seems possible but not obvious. One problem with bad actors is that they often don’t base their actions on what the philosophers think is good
I agree that bad actors won’t care. Actually, I think that even if we do manage to build some kind of AI that is considered superethical (better than humans at ethical reasoning) by a decent number of philosophers, very few people will care, especially at the beginning. But that doesn’t mean it will be useless: at some point in the past, very few people believed slavery was bad; now it is a common belief. How much would such an AI accelerate moral progress, compared to other approaches? Hard to tell, but I wouldn’t throw the idea in the bin.