So, I do think continuity buys you things, even in this case, roughly in the way outlined in the post: it is the developer's choice to continue training after passing roughly human level, and with sensible oversight in place, they should notice and stop before getting to superintelligence.
You may ask why they would have that oversight in place. My response: possibly because of some small-scale AI disaster that happened earlier, people understand the tech is dangerous.
With the 2-year scenario, again, I think there is a decent chance of stopping before or roughly at human level. One story for why that may be the case: it seems quite possible that the level at which AI gets good at convincing or manipulating humans, or at which humans freak out for other reasons, is lower than AGI. If you get enough economic benefits from CAIS at the same time, you can also get strong counter-pressures against competing to develop AGI.
How does the researcher know that it's about to pass roughly human level, when a misaligned AI may have the incentive and ability to fake a plateau at about human level until it has enough training and internal bootstrapping to become strongly superhuman? Even animals can fake being weaker, injured, or dead when it benefits them.
I don’t think this is necessarily what will happen, but it is a scenario that needs to be considered.