I started with the assumption that alignment progress would have diminishing returns. Then the two other factors I took into account were the increasing relevance of alignment research over time[1] and an increasing number of alignment researchers. My model was that the diminishing returns would be canceled out by the increasing number of researchers and increasing relevance.
It seems like you’re emphasizing the importance of diminishing returns. If diminishing returns outweigh the other two factors, progress would slow down over time. I’m not sure which factor is most influential, though I may have been underestimating the importance of diminishing returns.
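For concreteness, here is a minimal sketch of how those three factors could interact in a toy model. The functional forms and numbers (square-root diminishing returns, roughly 15% annual growth in researchers, relevance rising from 0.5 to 1.0) are purely illustrative assumptions, not estimates I’m defending.

```python
import numpy as np

# All numbers below are illustrative assumptions, not figures from the post.
years = np.arange(2024, 2045)
t = years - years[0]

researchers = 300 * 1.15 ** t      # assumed ~15% annual growth in alignment researchers
relevance = 0.5 + 0.5 * t / t[-1]  # assumed relevance rising from 0.5 to 1.0 as AGI nears

# Effective research effort per year, weighted by relevance.
effort = researchers * relevance

# Diminishing returns: total progress grows with the square root of
# cumulative effort, so each extra unit of effort buys less progress.
progress = np.sqrt(np.cumsum(effort))

# Yearly progress shows whether growth in researchers and relevance
# outpaces the diminishing returns.
yearly_progress = np.diff(progress, prepend=0.0)
for year, p in zip(years, yearly_progress):
    print(f"{year}: {p:.2f}")
```

Which effect wins depends entirely on the chosen functional forms, which is really the crux of the disagreement here.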
Quote on how AI could reduce AI risk:
“An aligned ASI could reduce or eliminate natural state risks such as the risk from asteroid strikes, supervolcanoes, or stellar explosions by devising protective technologies or by colonizing space so that civilization would continue if Earth were destroyed.”
I think you’re referring to this quote:
“Total existential risk would probably then stop increasing because the ASI could prevent all further existential risks or because an existential catastrophe would have occurred.”
I think I could have explained this point better. By definition, existential risk would fall to very low levels after an aligned ASI is created:
If the AI were aligned, then the AI itself would not be a significant source of existential risk.
If it’s also superintelligent, it should be powerful enough to strongly reduce all other existential risks.
Those are some good points on cognitively enhanced humans. I don’t think I emphasized the downsides enough. Maybe I need to expand that section.
If the aligned superintelligent AGI is known about by all powerful parties (mostly governments), and some of those governments have or believe they have interests non-aligned with the AGI, then there is a large incentive for those governments to go to war against the AGI. If the AGI is only moderately superhuman and we don’t see intelligence-explosion type effects (possibly because we have a prosaic AGI), this would be a very risky situation to be in.
I agree. The world could be at a higher risk of conflict just before or after the first ASI is created. Even if there is a fast takeoff, the risk remains beforehand if it is obvious that an ASI is about to be created.
This scenario is described in quite a lot of detail in chapter 5 of Superintelligence:
“Given the extreme security implications of superintelligence, governments would likely seek to nationalize any project on their territory that they thought close to achieving a takeoff. A powerful state might also attempt to acquire projects located in other countries through espionage, theft, kidnapping, bribery, threats, military conquest, or any other available means.”
Changes:
Added explanation for why an aligned ASI would significantly reduce all existential risks:
“the ASI could prevent all further existential risks. The reason why follows from its definition: an aligned ASI would itself not be a source of existential risk and since it’s superintelligent, it would be powerful enough to eliminate all further risks.”
Updated graph to show exponentially decreasing model in addition to the linear model.
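For anyone curious what the two models look like side by side, here is a rough sketch, assuming the graph plots something like per-year existential risk declining over time. The starting risk level, horizon, and decay rate are placeholder numbers, not the ones used in the post.

```python
import numpy as np
import matplotlib.pyplot as plt

# Placeholder assumptions: starting risk, horizon, and decay rate are
# made up for illustration only.
years = np.arange(0, 51)
initial_risk = 0.01                 # assumed per-year existential risk at year 0

# Linear model: risk falls by a fixed amount each year, floored at zero.
linear_risk = np.maximum(initial_risk - 0.0002 * years, 0.0)

# Exponentially decreasing model: risk falls by a fixed fraction each year.
exponential_risk = initial_risk * 0.9 ** years

plt.plot(years, linear_risk, label="linear model")
plt.plot(years, exponential_risk, label="exponentially decreasing model")
plt.xlabel("year")
plt.ylabel("per-year existential risk (illustrative)")
plt.legend()
plt.show()
```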
[1] Toby Ord calls this decreasing nearsightedness.