I think the former is very important, but I’m quite skeptical of the latter. What would be the best post of yours for a skeptic to read?
“Formally Stating the AI Alignment Problem” is probably the nicest introduction, but if you want a more formal treatment of why I think this matters (with a couple of specific cases), you might like this preprint. Note that I’m working on getting it published (it’s currently halfway through review with a journal), and although I’ve been too time-constrained so far to make the reviewers’ suggested changes, I suspect the final version of the paper will be closer to what you’re looking for.