What is all of humanity if not a walking catastrophic inner alignment failure? We were optimized for one thing: inclusive genetic fitness. And only a tiny fraction of humanity could correctly define what that is!
It could be the case both that there is a catastrophic inner alignment failure between humans and evolution, and that humans don’t regularly experience catastrophic inner alignment failures internally.
In practice I do suspect humans regularly experience internal (within-brain) inner alignment failures, but given that suspicion I feel surprised by how functional humans manage to be. That is, I notice that I expect regular inner alignment failures to cause far more mayhem than I observe, which makes me wonder whether brains are implementing some sort of alignment-relevant tech.
I don’t know why you expect an inner alignment failure to look dysfunctional. Instrumental convergence suggests that it would look functional. What the world looks like if there are inner alignment failures inside the human brain is (in part) that humans pursue a greater diversity of terminal goals than can be accounted for by genetics.
What would inner alignment failures even look like? Overdosing on meth sure makes the dopamine system happy. Perhaps human values reside in the prefrontal cortex, and all of humanity is a catastrophic alignment failure of the dopamine system (except a small minority of drug addicts) on top of being a catastrophic alignment failure of natural selection.
Isn’t evolution a better analogy for deep learning anyway? All natural selection does is gradient descent (hill climbing technically), with no capacity for lookahead. And we’ve known this one for 150 years!
I think if you’re interested in the analysis and classification of optimization techniques, there are enough differences between what natural selection is doing and what deep learning is doing that it isn’t a very natural analogy. (Like, one is a population-based method and the other isn’t, the update rules are different, etc.)
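
For what it’s worth, the difference is easy to see concretely. Here’s a minimal sketch (my own toy example, not anything from this thread): gradient descent updates a single point using exact derivative information, while a simple (mu, lambda) evolution strategy maintains a population and updates it by mutation and truncation selection, using fitness evaluations alone.

```python
# Toy contrast between the two update rules on the same 1-D objective.
# Purely illustrative; parameters are arbitrary.
import random

def loss(x):
    return (x - 3.0) ** 2  # minimum at x = 3

def grad_loss(x):
    return 2.0 * (x - 3.0)  # analytic gradient: available to SGD, not to selection

# Gradient descent: one point, moved along the exact gradient.
x = 0.0
for _ in range(100):
    x -= 0.1 * grad_loss(x)

# (mu, lambda) evolution strategy: a population, Gaussian mutation, and
# truncation selection -- fitness evaluations only, no derivatives, no lookahead.
population = [random.uniform(-10, 10) for _ in range(20)]
for _ in range(100):
    offspring = [p + random.gauss(0, 0.5) for p in population for _ in range(5)]
    offspring.sort(key=loss)
    population = offspring[:20]  # keep the 20 fittest

print(f"gradient descent:   x = {x:.3f}")
print(f"evolution strategy: best x = {min(population, key=loss):.3f}")
```

Both find the same minimum here, which is part of why the analogy is tempting; the point is that the state being updated (one point with a gradient vs. a population with fitness scores) and the update rules themselves are structurally different.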