1) It’s not even clear people are going to try to react in the first place.
I think this just depends a lot on how large-scale they are. If they are using millions of dollars of compute, and are effectively large-scale criminal organizations, then there are many different avenues by which they might get detected and suppressed.
If we don’t solve alignment and we implement a pause on AI development in labs, ARA AIs may still continue to develop.
A world which can pause AI development is one which can also easily throttle ARA AIs.
The central point is:
At some point, an ARA becomes unshutdownable unless you try hard with a pivotal cleanup act. We may be stuck with a ChaosGPT forever, which is not existential, but pretty annoying. People are going to die.
The ARA evolves over time. Maybe this evolution is very slow, maybe fast. Maybe it plateaus, maybe it doesn’t. I don’t know. This may take an indefinite number of years, but it can still become a problem.
This seems like a weak central point. “Pretty annoying” and some people dying is just incredibly small compared with the benefits of AI. And “it might be a problem in an indefinite number of years” doesn’t justify the strength of the claims you’re making in this post, like “we are approaching a point of no return” and “without a treaty, we are screwed”.
An extended analogy: suppose the US and China both think it might be possible to invent a new weapon far more destructive than nuclear weapons, and they’re both worried that the other side will invent it first. Worrying about ARAs feels like worrying about North Korea’s weapons program. It could be a problem in some possible worlds, but it is always going to be much smaller, it will increasingly be left behind as the others progress, and if there’s enough political will to solve the main problem (US and China racing) then you can also easily solve the side problem (e.g. by China putting pressure on North Korea to stop).
you can find some comments I’ve made about this by searching my twitter
Link here, and there are other comments in the same thread. Was on my laptop, which has twitter blocked, so couldn’t link it myself before.
doesn’t justify the strength of the claims you’re making in this post, like “we are approaching a point of no return” and “without a treaty, we are screwed”.
I agree that’s a bit too strong, but it seems to me that we’re not at all on track to stop open-source development, and that we need to stop it at some point. Maybe you think ARA is a bit early, but I think we need a red line before AI becomes human-level, and ARA is one of the last arbitrary red lines before everything accelerates.
But the claim of a point of no return to loss of control, because it might be very hard to stop ARA agents, still seems pretty fair to me.
I agree with your comment on twitter that evolutionary forces are very slow compared to deliberate design, but that is not what I wanted to convey (that’s my fault). I think an ARA agent would not only depend on evolutionary forces, but also on the whole open source community finding new ways to quantize, prune, distill, and run the model in a distributed and practical way. I think the main driver of this “evolution” would be the open source community and libraries that want to create good “ARA” agents, and huge economic incentives will make agentic AIs more and more common and easy to run in the future.
A world which can pause AI development is one which can also easily throttle ARA AIs.
I push back on this somewhat in a discussion thread here. (As a pointer to people reading through.)
Overall, I think this is likely to be true (maybe 60% likely), but not enough that we should feel totally fine about the situation.