In practice I do suspect humans regularly experience internal (within-brain) inner alignment failures, but given that suspicion I feel surprised by how functional humans manage to be. That is, I notice that I would expect regular inner alignment failures to cause far more mayhem than I observe, which makes me wonder whether brains are implementing some sort of alignment-relevant tech.
I don’t know why you expect an inner alignment failure to look dysfunctional. Instrumental convergence suggests that it would look functional. What the world looks like if there are inner alignment failures inside the human brain is (in part) that humans pursue a greater diversity of terminal goals than can be accounted for by genetics.