Specifically, do you agree with Eliezer that preventing existential risks requires a “pivotal act” as described here (#6 and #7)?
Eliezer did define “pivotal act” so as to be necessary. It’s an act which makes it so that nobody will build an unaligned AI; that’s pretty straightforwardly necessary for preventing existential risk, assuming that unaligned AI poses an existential risk in the first place.
However, the danger in introducing concepts via definitions is that there may be “pivotal acts” which satisfy the definition but do not match the prototypical picture of a “pivotal act”.
Yeah, I guess the answer is yes by definition. Still wondering what kind of pivotal acts people are thinking about: whether they’re closer to big power grabs like “burn all the GPUs”, or softer governance methods like “publishing papers with alignment techniques” and “encouraging safe development with industry groups and policy standards”. And whether the need for a pivotal act is the main reason why alignment researchers need to be on the cutting edge of capabilities.
I can’t see how “publishing papers with alignment techniques” or “encouraging safe development with industry groups and policy standards” could be pivotal acts. To prevent anyone from building unaligned AI, building an unaligned AI in your garage needs to be prevented. That requires preventing people who don’t read the alignment papers or policy standards and aren’t members of the industry groups from building unaligned AI.
That, in turn, appears to me to require at least one of 1) limiting access to computation resources from your garage, 2) limiting garage hackers’ knowledge of techniques to build unaligned AI, 3) somehow convincing all garage hackers not to build unaligned AI even though they could, or 4) surveillance and intervention to prevent anyone from actually building an unaligned AI even though they have the computation resources and knowledge to do it. Surveillance, under option 4, could in theory (I’m not saying all of these possibilities are practical) be by humans, by too-weak-to-be-dangerous AI, or by aligned AI.
“Publishing papers with alignment techniques” and “encouraging safe development with industry groups and policy standards” might well be useful actions. It doesn’t seem to me that anything like that can ever be pivotal. Building an actual aligned AI, of course, would be a pivotal act.
“Building an actual aligned AI, of course, would be a pivotal act.” What would an aligned AI do that would prevent anybody from ever building an unaligned AI?
I mostly agree with what you wrote. Preventing all unaligned AIs forever seems very difficult and cannot be guaranteed by soft influence and governance methods. These would only achieve a lower degree of reliability, perhaps constraining governments and corporations via access to compute and critical algorithms but remaining susceptible to bad actors who find loopholes in the system. I guess what I’m poking at is, does everyone here believe that the only way to prevent AI catastrophe is through power-grab pivotal acts that are way outside the Overton Window, e.g. burning all GPUs?
My guess is that it would implement universal surveillance and intervene, when necessary, to directly stop people from building an unaligned AI. Sorry, I should’ve been clearer that I was talking about an aligned superintelligent AI. Since unaligned AI killing everyone seems pretty obviously extremely bad according to the vast majority of humans’ preferences, preventing that would be a very high priority for any sufficiently powerful aligned AI.
Thanks, that really clarifies things. Frankly I’m not on board with any plan to “save the world” that calls for developing AGI in order to implement universal surveillance or otherwise take over the world. Global totalitarianism dictated by a small group of all-powerful individuals is just so terrible in expectation that I’d want to take my chances on other paths to AI safety.
I’m surprised that these kinds of pivotal acts are not more openly debated as a source of s-risk and x-risk. Publish your plans, open yourselves to critique, and perhaps you’ll revise your goals. If not, you’ll still be in a position to follow your original plan. Better yet, you might convince the eventual decision makers of it.