Why isn’t there a persuasive write-up of the “current alignment research efforts are doomed” theory?
EY wrote hundreds of thousands of words to show that alignment is a hard and important problem. And it worked! Lots of people listened and started researching this
But that discussion now claims these efforts are no good. And I can’t find good evidence for that claim, other than folks talking past each other
I agree with everything in your comment except on the value of showing EY’s claim to be wrong:
Believing a problem is harder than it is can stop you from finding creative solutions
False belief in your impending doom leads to all sorts of bad decisions (like misallocating resources, or making innocent researchers’ lives worse)
Belief in your impending doom is terrible for your mental health (tbh I sensed a bit of this in the EY discussion)
Insulting groups like OpenAI destroys a lot of value, especially if EY is actually wrong
If alignment were solved, then developing AGI would be the best event in human history. It’d be a shame to prevent that
In other words, if EY is right, we really need to know that. And know it in a way that makes it easy to persuade others. If EY is wrong, we need to know that too, and stop this gloom and doom
Belief in your impending doom is terrible for your mental health
I think by impending doom you mean AI doom after a few years or decades, so “impending” from a civilizational perspective, not from an individual human perspective. If I misinterpret you, please disregard this post.
I disagree on your mental health point. Main lines of argument: people who lose belief in heaven seem to be fine, cultures that believe in oblivion seem to be fine, old people seem to be fine, etc. Also, we evolved to be mortal, so we should be surprised if evolution has left us mentally ill-prepared for our mortality.
However, I discovered/remembered that depression is a common side-effect of terminal illness. See Living with a Terminal Illness. Perhaps that is where you are coming from? There is also Death row phenomenon, but that seems to be more about extended solitary confinement than impending doom.
Grief is common in people facing the end of their lives as a result of a terminal illness. It’s a feeling that can cause a terminally ill person to experience even more pain than they do from their illness. However, it’s considered a normal reaction to their situation.
But in many terminally ill people, grief evolves into depression. In fact, researchers at Baylor University Medical Center believe it affects up to 77 percent of people with a terminal illness.
Experts say the risk of depression increases as a disease advances and causes more painful or uncomfortable symptoms. The more a person’s body changes, the less control they feel over their lives.
I don’t think this is closely analogous to AI doom. A terminal illness might mean a life expectancy measured in months, whereas we probably have a few years or decades. Also, our lives will probably continue to improve in the lead-up to AI doom, whereas terminal illnesses come with a side order of pain and disability. On the other hand, a terminal illness doesn’t include the destruction of everything we value.
Overall, I think that belief in AI doom is a closer match to belief in oblivion than to belief in cancer, and I don’t expect it to cause mental health issues until it is much closer. On a personal note, I’ve placed > 50% probability on AI doom for a few years now, and my mental health has been fine as far as I can tell.
However, belief in your impending doom, when combined with belief that “Belief in your impending doom is terrible for your mental health”, is probably terrible for your mental health. Also, belief that “Belief in your impending doom is terrible for your mental health” could cause motivated reasoning that makes it harder to salvage value in the face of impending doom.
Zvi just posted EY’s model