At time of writing, I’m assigning the highest probability to “Will AGI cause an existential catastrophe?” at 85%, with the next-highest predictions at 80% and 76%. Why … why is everyone so optimistic?? Did we learn something new about the problem actually being easier, or our civilization more competent, than previously believed?
Should—should I be trying to do more x-risk-reduction-relevant stuff (somehow), or are you guys saying you’ve basically got it covered? (In 2013, I told myself it was OK for dumb little ol’ me to personally not worry about the Singularity and focus on temporal concerns in order to not have nightmares, and it turned out that I have a lot of temporal concerns which could be indirectly very relevant to the main plot, but that’s not my real reason for focusing on them.)
IMO, we decidedly do not “basically have it covered.”
That said, IMO it is generally not a good idea for a person to try to force themselves on problems that will make them crazy, desperate need or no.
I am often tempted to downplay how much catastrophe-probability I see, basically to decrease the odds that people decide to make themselves crazy in the direct vicinity of alignment research and alignment researchers.
And on the other hand, I am tempted by the HPMOR passage:
“Girls?” whispered Susan. She was slowly pushing herself to her feet, though Hermione could see her limbs swaying and quivering. “Girls, I’m sorry for what I said before. If you’ve got anything clever and heroic to try, you might as well try it.”
(To be clear, I have hope. Also, please just don’t go crazy and don’t do stupid things.)
For me, it’s because there are disjunctively many ways that AGI could not happen (a global totalitarian regime, an AI winter, a 55%-CFR avian flu escaping a BSL4 lab, unexpected difficulty building AGI plus the planning fallacy on timelines, which we totally won’t fall victim to this time...), or that alignment could be solved, or that I could be mistaken about AGI risk being a big deal, or…
Granted, I assign small probabilities to several of these events. But my credence for P(AGI extinction | no more AI alignment work from the community) is 70%, much higher than my 40% unconditional credence. I guess that means yes, I think AGI risk is huge (remember that I’m saying “40% chance we just die to AGI, unconditionally”), and that’s after incorporating the significant contributions I expect the current community to make. The current community is far from sufficient, but it’s also probably picking a good amount of low-hanging fruit, so I expect that its presence makes a significant difference.
EDIT: I’m decreasing the 70% to 60% to better match my 40% unconditional credence, since in that conditional it’s only the current alignment community that stops working on alignment.
Some reasons:
I’ve gone from roughly 2⁄3 to 1⁄2 on existential catastrophe (I’ve put 58% here; I was feeling pessimistic), based on the big projects having safety teams who I think are doing really good work. That probably falls under our civilization being more competent than previously believed.