Sort of! This paper (of which I’m a coauthor) discusses this “unraveling” argument, and the technical conditions under which it does and doesn’t go through. Briefly:
It’s not clear how easy it is to demonstrate military strength in the context of an advanced AI civilization, in a way that can be verified / can’t be bluffed. If I see that you’ve demonstrated high strength in some small war game, but my prior on you being that strong is sufficiently low, I’ll probably think you’re bluffing and wouldn’t be that strong in the real large-scale conflict.
Supposing strength can be verified, it might be intractable to do so without also disclosing vulnerable info (irrelevant to the potential conflict). As TLW’s comment notes, the disclosure process itself might be really computationally expensive.
But if we can verifiably disclose, and I can either selectively disclose only the war-relevant info or I don’t have such a vulnerability, then yes you’re right, war can be avoided. (At least in this toy model where there’s a scalar “strength” variable; things can get more complicated in multiple dimensions, or where there isn’t an “ordering” to the war-relevant info.)
Another option (which the paper presents) is conditional disclosure—even if you could exploit me by knowing the vulnerable info, I commit to share my code if and only if you commit to share yours, play the cooperative equilibrium, and not exploit me.
As TLW’s comment notes, the disclosure process itself might be really computationally expensive.
I was actually thinking of the cost of physical demonstrations, and/or the cost of convincing others that simulations are accurate[1], not so much direct simulation costs.
That being said, this is still a valid point, just not one that I should be credited for.
Imagine trying to convince someone of atomic weapons purely with simulations, without anyone ever having detonated one[2], for instance. It may be doable; it’d be nowhere near cheap.
Now imagine trying to do so without allowing the other side to figure out how to make atomic bombs in the process...
To be clear: as in alt-history-style ‘Trinity / etc never happened’. Not just as in someone today convincing another that their particular atomic weapon works.
Sort of! This paper (of which I’m a coauthor) discusses this “unraveling” argument, and the technical conditions under which it does and doesn’t go through. Briefly:
It’s not clear how easy it is to demonstrate military strength in the context of an advanced AI civilization, in a way that can be verified / can’t be bluffed. If I see that you’ve demonstrated high strength in some small war game, but my prior on you being that strong is sufficiently low, I’ll probably think you’re bluffing and wouldn’t be that strong in the real large-scale conflict.
Supposing strength can be verified, it might be intractable to do so without also disclosing vulnerable info (irrelevant to the potential conflict). As TLW’s comment notes, the disclosure process itself might be really computationally expensive.
But if we can verifiably disclose, and I can either selectively disclose only the war-relevant info or I don’t have such a vulnerability, then yes you’re right, war can be avoided. (At least in this toy model where there’s a scalar “strength” variable; things can get more complicated in multiple dimensions, or where there isn’t an “ordering” to the war-relevant info.)
Another option (which the paper presents) is conditional disclosure—even if you could exploit me by knowing the vulnerable info, I commit to share my code if and only if you commit to share yours, play the cooperative equilibrium, and not exploit me.
I was actually thinking of the cost of physical demonstrations, and/or the cost of convincing others that simulations are accurate[1], not so much direct simulation costs.
That being said, this is still a valid point, just not one that I should be credited for.
Imagine trying to convince someone of atomic weapons purely with simulations, without anyone ever having detonated one[2], for instance. It may be doable; it’d be nowhere near cheap.
Now imagine trying to do so without allowing the other side to figure out how to make atomic bombs in the process...
To be clear: as in alt-history-style ‘Trinity / etc never happened’. Not just as in someone today convincing another that their particular atomic weapon works.