AI foom is said to happen if at some point in the future while humans are still mostly in charge [...]
Humans being in charge doesn’t seem central to foom. Like, physically these are wholly unrelated things.
mechanisms like recursive self-improvement can only cause foom if they come earlier than widespread automation from pre-superintelligent systems
Only on the humans-not-in-charge technicality introduced in this definition of foom. Something else being in charge doesn’t change what physically happens as a result of recursive self-improvement.
essentially removing humans from the picture before humans ever need to solve the problem of controlling an AI foom
This doesn’t make the problem of controlling an AI foom go away. The non-foomy systems in charge of the world would still need to solve it.
This doesn’t make the problem of controlling an AI foom go away. The non-foomy systems in charge of the world would still need to solve it.
You’re right, of course, but I don’t think it should be a priority to solve problems that our AI descendants will face, rather than us. It is better to focus on making sure our non-foomy AI descendants have the tools to solve those problems themselves, and that they are properly aligned with our interests.
As non-foomy systems grow more capable, they become the most likely source of foom, so building them causes foom by proxy. At that point, their alignment wouldn't matter, in the same way that current humanity's alignment wouldn't matter.
My point is that no system will foom until humans have already left the picture. Actually, I doubt that any system will foom even after humans have left the picture, but predicting the very long run is hard. If no system will foom until humans are already out of the picture, I fail to see why we should make it a priority to try to control a foom now.
I doubt that any system will foom even after humans have left the picture
This seems more like a crux.
Assuming eventual foom, non-foomy things that don’t set up anti-foom security in time only make the foom problem worse, so this abdication-of-direct-responsibility frame doesn’t help. Assuming no foom, there is no need to bother with abdication of direct responsibility. So I don’t see the relevance of the argument you gave in this thread, built around humanity’s direct vs. by-proxy influence over foom.
Assuming eventual foom, non-foomy things that don’t set up anti-foom security in time only make the foom problem worse, so this abdication-of-direct-responsibility frame doesn’t help.
If foom is inevitable, but it won’t happen when humans are still running anything, then what anti-foom security measures can we actually put in place that would help our future descendants handle foom? And does it look any different than ordinary prosaic alignment research?
It looks like building a minimal system that’s non-foomy by design, for the specific purpose of setting up anti-foom security and nothing else. In contrast to starting with more general hopefully-non-foomy hopefully-aligned systems that quickly increase the risk of foom.
Maybe they manage to set up anti-foom security in time. But if we didn’t attempt it at all, why would they do any better?
It looks like building a minimal system that’s non-foomy by design, for the specific purpose of setting up anti-foom security and nothing else.
Your link for anti-foom security is to the Arbital article on pivotal acts. I think pivotal acts, almost by definition, assume that foom is achievable in the way that I defined it. That’s because if foom is false, there’s no way to prevent other people from building AGI after you’ve completed any apparent pivotal act. At most you can delay timelines, for example by imposing ordinary regulations. But you can’t actually have a global indefinite moratorium, enforced by e.g. nanotech that melts the GPU of anyone who circumvents the ban, in the way implied by the pivotal act framework.
In other words, if you think we can achieve pivotal acts while humans are still running the show, then it sounds like you just disagree with my original argument.
I agree that pivotal act AI is not achievable in anything like our current world before AGI takeover, though I think it remains plausible that with ~20 more years of no-AGI status quo this can change. Even deep learning might suffice, with enough decision theory to explain what a system is optimizing, interpretability to ensure it’s optimizing the intended thing and nothing else, synthetic datasets to direct its efforts at purely technical problems, and enough compute to get there directly without a need for design-changing self-improvement.
Pivotal act AI is an answer to the question of what AI-shaped intervention would improve on the default trajectory of losing control to non-foomy general AIs (even if we assume/expect their alignment) with respect to an eventual foom. This doesn’t make the intervention feasible without more things changing significantly, like an ordinary decades-long compute moratorium somehow coming to pass.
I guess pivotal act AI, as a non-foom, again runs afoul of your definition of foom, but it’s noncentral as an example of the concerning concept. It’s not a general intelligence, given the features of the design that tell it not to dwell on the real world and ideas outside its task, perhaps leaving it unaware of the real world altogether. It’s almost certainly easy to modify its design (and datasets) to turn it into a general intelligence, but as designed it’s not one. On this reading, your argument does point to it being infeasible right now. But it’s much easier to see that directly, from how much currently unavailable deconfusion and engineering a pivotal act AI design would require.