Here are my reasons for pessimism:
There are likely to be effective methods of controlling AIs that are of subhuman or even roughly human-level intelligence which do not scale up to superhuman intelligence. These include, for example, reinforcement via reward/punishment, mutually beneficial trading, and legal institutions. Controlling superhuman intelligence will likely require qualitatively different methods, such as having the superintelligence share our values. Unfortunately, the existence of effective but unscalable methods of AI control will probably lull elites into a false sense of security as we deploy increasingly smart AIs without incident, leading them both to increase investment in AI capability research and to cut back research into “higher” forms of AI control.
The only approaches I can see for creating scalable methods of AI control require solving difficult philosophical problems, which likely demand long lead times. By the time elites take the possibility of superhuman AIs seriously and realize that controlling them requires approaches very different from controlling subhuman and human-level AIs, there won’t be enough time to solve these problems even if they decide to embark upon Manhattan-style projects (because there isn’t sufficient identifiable philosophical talent in humanity to recruit for such projects to make enough of a difference).
In summary, even in a relatively optimistic scenario, one with steady progress in AI capability along with apparent progress in AI control/safety (and nobody deliberately builds a UFAI for the sake of “maximizing complexity of the universe” or what have you), it’s probably only a matter of time until some AI crosses a threshold of intelligence and manages to “throw off its shackles”. This may be accompanied by a last-minute scramble by mainstream elites to slow down AI progress and to research scalable methods of AI control, which (if it happens at all) will likely come too late to make a difference.