The formal statement of the AI Alignment problem seems to me very much like stating all possible loopholes and plugging them. This endeavor seems to be as difficult as, or even more difficult than, discovering that ultimate generalized master algorithm.
I still see augmenting ourselves as the only way to maybe keep the alignment of lesser intelligences possible. As we augment, we can simultaneously make sure our corresponding levels of artificial intelligence remain aligned.
Not to mention it’d be comparatively much easier to improve upon our existing faculties than to come up with an entire replica of our thinking machines.
AI alignment could be possible, sure, if we overcome one of the most difficult problems in research history (as you said, formally stating our end goals), but I’m not sure our current intelligence is up to the mark, the same way we’re struggling to discover the unified theory of everything.
Turing, for example, actually defined his test for general human-level intelligence. He thought that if an agent was able to hold a human-like conversation, then it must be AGI. He never expected narrow AIs to be all over the place and beat his test as soon as 2011 with meager chatbots.
Similarly, we can never foresee what kind of unexpected stuff an AGI might throw at us; the bleeding-edge theories we came up with a few hours ago could start looking as outdated as the Turing test.
I’d say we start augmenting the human brain until it’s completely replaced by a post-biological counterpart, and from there rapid improvements can start taking place. But unless we start early, I doubt we’ll be able to catch up with AI. I agree that this needs to happen in tandem with AI safety.