Well then, isn’t the answer that we care about de re alignment, and whether or not an AI is de dicto aligned is relevant only insofar as it predicts de re alignment? We might expect the two to converge in the limit of superintelligence, and aiming for de dicto alignment might be the easier immediate target, but the moral worth would be a function of what the AI actually did.
That does clear up the seeming confusion behind the OP, though, so thanks!