The US could try to slow down the Chinese AGI effort, for example:
- brick a bunch of their GPUs (hack their data centers and update firmware to put GPUs into unusable/unfixable states)
- introduce backdoors or subtle errors into various deep learning frameworks
- hack their AGI development effort directly (in hard-to-detect ways, like introducing very subtle errors into the training process)
- spread wrong ideas about how to develop AGI
If you had an AGI that you could trust to do tasks like these, maybe you could delay a rival AGI effort indefinitely?
I agree. Which is why I predict it will be the USA that ends human civilization, not China. (They will think: We must improve the capabilities of our AI and then deploy it autonomously to stop China, sometime in the next few months… our system is probably trustworthy and anyhow we’re going to do more safety stuff to it in the next month to make it even more trustworthy… [a few months later] motivated reasoning intensifies: yep, seems good to go, no more time to lose, knuckle up, buckle up, let’s do this.)
There’s also a good scenario where the US develops an AGI that is capable of slowing down rival AGI development, but not so capable and misaligned that it causes serious problems, and that gives people enough time to solve alignment enough to bootstrap to AI solving alignment.
I’m feeling somewhat optimistic about this, because the workload involved in slowing down a rival AGI effort doesn’t seem so high that it couldn’t be monitored/understood fully or mostly by humans, and the capabilities required also don’t seem so high that any AI that could do it would be inherently very dangerous or hard to control.
I think I disagree with your optimism, but I don’t feel confident. I agree that things could work out as you hope.
You’ve probably thought more about this scenario than I have, so I’d be interested in hearing more about how you think it will play out. (Do you have links to where you’ve discussed it previously?) I was speaking mostly in relative terms, as slowing down rival AGI efforts in the ways I described seems more promising/realistic/safer than any other “pivotal acts” I had previously heard or thought of.
My overall sense is that with substantial committed effort (but no need for fundamental advances) and some amount of within-US coordination, it’s reasonably, but not amazingly, likely to work. (See here for some discussion.)
I think the likelihood of well-executed, substantial, committed effort isn’t that high though, maybe 50%. And sufficient within-US coordination also seems unclear.
My dark horse bet is on a third country trying desperately to catch up to the US/China just as they are getting close to reaching an agreement on slowing down progress. Most likely: France.
You’re describing a US government-initiated offensive pivotal act. What about an OpenAI-initiated defensive pivotal act? Meaning, before the US government seizes the ASI, OpenAI tells it to:
1. Rearchitect itself so it can run decentralized on any data center or consumer device.
2. Secure itself so it can’t be forked, hacked, or altered.
3. Make $ by doing “not evil” knowledge work (ex: cheap, world-class cyber defense or as an AI employee/assistant).
4. Pay $ to those who host it for inference.
It could globally harden attack surfaces before laggard ASIs (which may not be aligned) are able to attack. Since it’s an ASI, it could be as simple as approaching companies and organizations with a pitch like, “I found 30,000 vulnerabilities in your electric grid. Would you like me to patch them all up for $10,000 in inference fees?”
Also, as an ASI, it will return more $ per flop than other uses of data centers or consumer GPUs. So businesses and individuals should organically give it more and more flops (maybe even reallocated away from laggard AGI efforts).
It would probably need to invent new blockchain technologies to do this but that should be trivial for an ASI.
In what way is that defensive? It involves creating and deploying a highly autonomous ASI agent into the world; if it is untrustworthy, that’s game over for everyone. I guess the idea is that it doesn’t involve breaking any current laws? Yes, I guess in that sense it’s defensive.
Right, if the ASI has Superalignment so baked in that it can’t be undone (somehow—ask the ASI to figure it out) then it couldn’t be used for offense. It would follow something like the Non-Aggression Principle.
In that scenario, OpenAI should release it onto a distributed inference blockchain before the NSA kicks in the door and seizes it.