The selected counterarguments seem consistently bad to me, as if you haven’t really been talking to good critics. Here are some counterarguments I’ve come across that I’ve found to be especially resilient, in some cases seeming stronger than the arguments in favor (though it’s hard for me to tell whether they’re stronger when I’m embedded in a community that’s devoted to promoting the arguments in favor).
2. Humans won’t figure out how to make systems with goals that are compatible with human welfare and realizing human values
This argument was a lot more compelling when natural language understanding seemed hard, but today, pre-AGI AI models have a pretty sophisticated understanding of human motivations, since they’re trained on human writing to understand the world and how to move in it, and are then productized by working as assistants for humans. This is not very likely to change.
catastrophic tools argument
New technologies that are sufficiently catastrophic to pose an extinction risk may not be feasible soon, even with relatively advanced AI
I don’t have a counterargument here, but even if that’s all you could find, I don’t think it’s good practice to give energy to bad criticism that no one would really stand by. We already have protein structure predictors. The idea that AI won’t have applications in bioweapons is very easy to dismiss. Autonomous weapons and logistics are other obvious applications.
Powerful black boxes
Contemporary AI is guileless: it doesn’t have access to its own weights, so it cannot actively hide anything that’s going on in there. And interpretability research is going fairly well.
Self-improving AI will inevitably develop its own ways of making its mechanisms legible to itself, which we’re likely to be able to use to make its mechanisms legible to us, similar to the mechanisms that exist in biology: a system of modular, reusable parts has to be comprehensible in some frame just to function; a system whose parts have roles that are incomprehensible to humans is not going to have the kind of interoperability that makes evolution or intentional design possible. (I received this argument some time ago from a LessWrong post, written by someone with a background in biology, that I’m having difficulty finding again. It might not have been argued there explicitly.)
Which is to say:
- It’s pretty likely that this is solvable (not an argument against the existence of risk, but something that should be mentioned here).
- “Black box” is kind of a mischaracterization.
- There’s an argument of similar strength to the given one: nothing capable of really dangerous emergent capabilities can be a black box.
multi-agent dynamics
Influence in AI is concentrated in a few very similar companies in one state in one country, with many other choke-points in the supply chain. It’s very unlikely that competition in AI is going to be particularly intense or ongoing. Strategizing AI is equivalent to scalable, mass-producible organizational competence, so it is not vulnerable to the scaling limits that keep human organizations fragmented, and power concentration is likely to be an enduring theme.
AI is an information technology. All information technologies make coordination problems dramatically easier to solve. If I can have an AI reliably detect trucks in satellite images, then I can detect violations of nuclear treaties more easily and more reliably.
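To make that premise concrete, here’s a minimal sketch of what such a detector could look like with today’s off-the-shelf tools, assuming torchvision’s COCO-pretrained Faster R-CNN (whose label set happens to include “truck”). A real verification system would presumably use a model fine-tuned on overhead imagery, and the file name below is hypothetical; this is only meant to illustrate the shape of the task.

```python
# Minimal sketch: flag trucks in a single image tile with an off-the-shelf detector.
import torch
from torchvision.io import read_image
from torchvision.models.detection import (
    FasterRCNN_ResNet50_FPN_Weights,
    fasterrcnn_resnet50_fpn,
)

weights = FasterRCNN_ResNet50_FPN_Weights.DEFAULT
model = fasterrcnn_resnet50_fpn(weights=weights).eval()
categories = weights.meta["categories"]  # COCO class names, including "truck"
preprocess = weights.transforms()


def detect_trucks(image_path: str, score_threshold: float = 0.7):
    """Return [x1, y1, x2, y2] boxes for trucks detected in one image tile."""
    img = read_image(image_path)  # uint8 tensor, shape (C, H, W)
    with torch.no_grad():
        detections = model([preprocess(img)])[0]
    return [
        box.tolist()
        for box, label, score in zip(
            detections["boxes"], detections["labels"], detections["scores"]
        )
        if categories[int(label)] == "truck" and float(score) >= score_threshold
    ]


# Hypothetical usage: flag any tile containing vehicles for human review.
# print(detect_trucks("satellite_tile_0413.png"))
```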
Relatedly, if I can build an artificial agent whose design is open, and which can thereby prove to all parties that it is neutral and will only report treaty violations, then we can have treaties that place these agents as observers inside all weapons research programs. Previously, neutral observation without the risk of leaking military secrets was arguably an unsolvable problem. We’re maybe a year away from that changing.
The trend towards greater transparency and coordination capacity as computers become more intelligent and more widely deployed is a strong one.
It’s unclear to what extent natural selection/competition generally matters in AI technologies. Competition within advanced ecologies/economies tends to lead towards symbiosis. Fruitful interaction in knowledge economies tends to require free cooperation. Breeding for competition so far hasn’t seemed very useful in AI training processes. (Unless you count adversarial training? But should you?)