There are positive feedback loops between prongs:
Successfully containing & using more capable models (p1) gives you more scary demos for p2.
Success in p1 also speeds up p3 a lot, because:
1) You can empirically study AGI directly,
2) Very advanced but “narrow” AI tools accelerate research (“narrow” here may still mean more general than GPT-4), and
3) Maybe you can even have (proto-)AGIs do research for you.
You definitely need a lot of success in p2 for anything to work; otherwise people will take all the useful work we can get from proto-AGIs and pour it into capabilities research.
Better alignment research (p3) lets you do more p1-type risky stuff with SOTA models (on the margin).
If p1 is very successful, maybe we can punt most of p3 to the AIs; conversely, if p1 seems very hard, then we probably only get ‘narrow’ tools to help with p3, need to mostly do it ourselves, and hopefully can get ML researchers to delay for long enough.