We explored a similar idea in “Military AI as a Convergent Goal of Self-Improving AI”. In that article we suggested that any advanced AI will have a convergent goal to take over the world and, because of this, it will have a convergent subgoal of developing weapons in the broad sense of the word “weapon”: not only tanks or drones, but any instruments to enforce its own will over others or to destroy them or their goals.
We wrote in the abstract: “We show that one of the convergent drives of AI is a militarization drive, arising from AI’s need to wage a war against its potential rivals by either physical or software means, or to increase its bargaining power. This militarization trend increases global catastrophic risk or even existential risk during AI takeoff, which includes the use of nuclear weapons against rival AIs, blackmail by the threat of creating a global catastrophe, and the consequences of a war between two AIs. As a result, even benevolent AI may evolve into potentially dangerous military AI. The type and intensity of militarization drive depend on the relative speed of the AI takeoff and the number of potential rivals.”
That paper seems quite different from this post in important ways.
In particular, the gist of the OP seems to be something like “showing that pre-formal intuitions about instrumental convergence persist under a certain natural class of formalisations”. It does so using a formalism closer to standard machine learning research.
The paper you linked seems to me to instead assume that this holds true, and then to apply that insight in the context of military strategy. Without speculating about the merits of that, it seems like a different thing which will appeal to different readers, and if it is important, it will be important for somewhat different reasons.