I don’t like how you’re eliding all mention of the actual danger AI Safety/Alignment was founded to tackle: AGI having a mind of its own, goals of its own, which seem more likely to be incompatible with or indifferent to our continued existence than not.
Everything else you’re saying is agreeable in the context you’re discussing it, that of a dangerous new technology; I’d feel much more confident if the Naval Nuclear Propulsion Program (Rickover’s people) were the dominant culture in AI development.
That said, I have strong doubts about the feasibility of the ‘Oughts’[1] you’re proposing; more critically, I reject the framing...
> Any sufficiently advanced technology is indistinguishable from ~~magic~~ ~~biology~~ life
To assume AGI is transformative and important is to assume it has a mind[2] of its own: the mind is what makes it transformative.
At the very least, assuming no superintelligence, we are dealing with a profound philosophical/ethical/social crisis, for which control-based solutions are no solution. Slavery’s problem wasn’t a lack of better chains, whether institutional or technical.
Please entertain another framing of the ‘technical’ alignment problem: midwifery—the technical problem of striving for optimal conditions during pregnancy/birth. Alignment originated as the study of how to bring into being minds that are compatible with our own.
Whether humans continue to be relevant/dominant decision-makers post-Birth is up for debate, but what I claim is not up for debate: we will no longer be the only decision-makers.
I can see this making sense in one frame, but not in another. The frame that seems most strongly to support the ‘Blindsight’ idea is Friston’s stuff: specifically, how the more successful we are at minimizing predictive error, the less conscious we are.[1]
My general intuition, in this frame, is that as intelligence increases, more behaviour becomes automatic/subconscious. It seems compatible with your view that a superintelligent system would possess consciousness, but that most/all of its interactions with us would be subconscious.
I’d like to hear more about this point; it could update my views significantly. I’m happy for you to just state ‘this because that; read X, Y, Z’ without further elaboration. I’m not asking you to defend your position so much as looking for more to read on it.
[1] This is my potentially garbled synthesis of his stuff, anyway.