Is this a plausible take?
- some types of AI can be made non-catastrophic with modest effort:
  - AI trained only to prove math theorems
  - AI trained only to produce predictive causal models of the world, by observation alone (an observer and learner, not an active agent)
  - AI trained only to optimize w.r.t. a clearly specified objective and a formal world model (not actually acting in the world and getting feedback; only being rewarded for solving formal optimization problems well)
- the last two kinds (world-learners and formal-model-optimizers) should be kept separate and trained to different objectives (a rough sketch of this separation follows the summary below)
- these are immensely useful things to have, yet, if proper care is taken, they don't lead to a conniving, power-seeking, humanity-destroying AI
- I hypothesize that if we avoid mixing RL with manipulative leverage (the ability to manipulate humans or hack computers), then we can be safe
- unsafe AI should be globally outlawed, treated as no less risky than nuclear weapons development, and we should be working now on building the political support for this
  - even if it means that access to large quantities of computation and/or energy would be strictly regulated
- every world leader must be made to personally understand how dangerous AI is
- all of the above would be extremely hard politically, but we don't have a better plan yet, nor do we have time to waste
In summary: by accepting harsh limitations on which kinds of AI it is acceptable to develop, we can have very powerful AI without ending the world. The problem then becomes political (still very hard, but solvable in principle).
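To make the intended separation concrete, here is a minimal Python sketch. All of the names in it (WorldModelLearner, FormalOptimizer, FormalModel) are illustrative placeholders rather than an actual design: one component only learns a predictive model from logged observations, the other only searches for good plans inside the frozen model it is handed, and nothing in the loop executes a plan in the real world or receives reward from it.

```python
# A minimal, hypothetical sketch of the proposed separation. Nothing here acts
# in the real world: the learner only ingests logged observations, and the
# optimizer only searches inside the frozen formal model it is handed.

from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class FormalModel:
    """An explicit, inspectable world model: the only artifact that crosses
    the boundary between the two systems."""
    states: Sequence[str]
    actions: Sequence[str]
    transition: Callable[[str, str], str]  # (state, action) -> next state


class WorldModelLearner:
    """Trained only to predict observations; it never chooses or executes actions."""

    def __init__(self) -> None:
        self._transitions: dict[tuple[str, str], str] = {}

    def observe(self, state: str, action: str, next_state: str) -> None:
        # Passive learning from logged data; no interaction with the world.
        self._transitions[(state, action)] = next_state

    def export_model(self) -> FormalModel:
        table = dict(self._transitions)  # snapshot, so the exported model is frozen
        states = sorted({s for s, _ in table})
        actions = sorted({a for _, a in table})

        def transition(state: str, action: str) -> str:
            # Unknown (state, action) pairs default to staying in place.
            return table.get((state, action), state)

        return FormalModel(states=states, actions=actions, transition=transition)


class FormalOptimizer:
    """Rewarded only for solving optimization problems over a formal model."""

    def plan(self, model: FormalModel, start: str,
             objective: Callable[[str], float], horizon: int) -> list[str]:
        # Brute-force search over action sequences, entirely in-model.
        best_score, best_plan = float("-inf"), []

        def search(state: str, plan: list[str], depth: int) -> None:
            nonlocal best_score, best_plan
            score = objective(state)
            if score > best_score:
                best_score, best_plan = score, list(plan)
            if depth == horizon:
                return
            for action in model.actions:
                search(model.transition(state, action), plan + [action], depth + 1)

        search(start, [], 0)
        return best_plan  # a proposal for humans to review, never auto-executed


if __name__ == "__main__":
    learner = WorldModelLearner()
    # Purely observational (state, action, next_state) triples from logs.
    learner.observe("low", "invest", "medium")
    learner.observe("medium", "invest", "high")
    learner.observe("low", "wait", "low")

    optimizer = FormalOptimizer()
    plan = optimizer.plan(learner.export_model(), start="low",
                          objective=lambda s: {"low": 0, "medium": 1, "high": 2}[s],
                          horizon=2)
    print(plan)  # ['invest', 'invest']: a recommendation, not an action taken
```

The point of the interface is that the only thing crossing the boundary is an explicit, inspectable model, and the optimizer's output is a recommendation for humans to evaluate rather than something that gets executed.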
What are these AIs going to do that is immensely useful but not at all dangerous? A lot of useful capabilities that people want are adjacent to danger. Tool AIs Want to be Agent AIs.
If two of your AIs would be dangerous when combined, clearly you can’t make them publicly available, or someone would combine them. If your publicly-available AI is dangerous if someone wraps it with a shell script, someone will create that shell script (see AutoGPT). If no one but a select few can use your AI, that limits its usefulness.
An AI ban that stops dangerous AI might be possible. An AI ban that allows development of extremely powerful systems but has exactly the right safeguard requirements to render those systems non-dangerous seems impossible.
Thanks for the pointer. I’ll hopefully read the linked article in a couple of days.
I start from a point of “no AI for anyone” and then ask “what can we safely allow?”. I made a couple of suggestions, where “safely” is understood to mean “safe when treated with great care”. You are correct that this definition of “safe” is incompatible with unfettered AI development. But is there any approach to powerful AI that is compatible with unfettered AI development? Every AI capability we build can be combined with other capabilities, making the whole more powerful and therefore more dangerous.
To keep things safe while still having AI, the answer may be: “an international agency holds most of the world’s compute, and all AI work is done by submitting experiment requests to the agency, which vets them for safety”. Indeed, I don’t see how we can allow people to do AI development without any oversight at all. This centralization is bad, but I don’t see how it can be avoided.
Military establishments would probably refuse to subject themselves to this restriction even if we get states to restrict the civilians. I hope I’m wrong on this and that international agreement can be reached and enforced to restrict AI development by national security organizations. Still, it’s better to restrict the civilians (and try to convince the militaries to self-regulate) than to restrict nobody.
Is it possible to reach and enforce a global political consensus of “no AI for anyone, ever, at all”? We may need thermonuclear war for that, and I’m not on board. I think “strictly regulated AI development” is a comparatively easier sell (though still terribly hard).
I agree that such a restriction is a large economic handicap, but what else can we do? It seems that the alternative is praying that someone comes up with an effectively costless and safe approach so that nobody gives up anything. Are we getting there in your opinion?
immensely useful things these AIs can do:
  - drive basic science and technology forward at an accelerated pace
  - devise excellent macroeconomic, geopolitical, and public health policy

These things are indeed risk-adjacent, I grant.