While there are often good reasons to keep some specific technical details of dangerous technology secret, keeping strategy secret is unwise.
In this comment, by “public” I mean “the specific intellectual public who would be interested in your ideas if you shared them”, not “the general public”. (I’m arguing for transparency, not mass-marketing)
Either you think the public should, in general, have better beliefs about AI strategy, or you think the public should, in general, have worse beliefs about AI strategy, or you think the public should have exactly the level of epistemics about AI strategy that it does.
If you think the public should, in general, have better beliefs about AI strategy: great, have public discussions. Maybe some specific discussions will be net-negative, but others will be net-positive, and the good will outweigh the bad.
If you think the public should, in general, have worse beliefs about AI strategy: unless you have a good argument for this, the public has reason to think you’re not acting in the public interest at this point, and are also likely acting against it.
There are strong prior reasons to think that it’s better for the public to have better beliefs about AI strategy. To the extent that “people doing stupid things” is a risk, that risk comes from people having bad strategic beliefs. Also, to the extent that “people not knowing what each other is going to do and getting scared” is a risk, the risk comes from people not sharing their strategies with each other. It’s common for multiple nations to spy on each other to reduce the kind of information asymmetries that can lead to unnecessary arms races, preemptive strikes, etc.
This doesn’t rule out that there may come a time when there are good public arguments that some strategic topics should stop being discussed publicly. But that time isn’t now.
There are strong prior reasons to think that it’s better for the public to have better beliefs about AI strategy.
That may be, but note that the word “prior” is doing basically all of the work in this sentence. (To see this, just replace “AI strategy” with practically any other subject, and notice how the modified statement sounds just as sensible as the original.) This is important because priors can easily be overwhelmed by additional evidence—and insofar as AI researcher Alice thinks a specific discussion topic in AI strategy has the potential to be dangerous, it’s worth realizing Alice probably has some specific inside view reasons to believe that’s the case. And, if those inside view arguments happen to require an understanding of the topic that Alice believes to be dangerous, then Alice’s hands are now tied: she’s both unable to share information about something, and unable to explain why she can’t share that information.
Naturally, this doesn’t just make Alice’s life more difficult: if you’re someone on the outside looking in, then you have no way of confirming if anything Alice says is true, and you’re forced to resort to just trusting Alice. If you don’t have a whole lot of trust in Alice to begin with, you might assume the worst of her: Alice is either rationalizing or lying (or possibly both) in order to gain status for herself and the field she works in.
I think, however, that these are dangerous assumptions to make. Firstly, if Alice is being honest and rational, then this policy effectively punishes her for being “in the know”—she must either divulge information she (correctly) believes to be dangerous, or else suffer an undeserved reputational hit. I’m particularly wary of imposing incentive structures of this kind around AI safety research, especially considering the relatively small number of people working on AI safety to begin with.
Secondly, beyond being unfair to Alice, such a policy may have subtler effects. In particular, if Alice feels pressured to disclose the reasons she can’t disclose things, that may end up influencing the rate and/or quality of the research she does in the first place (Ctrl+F “walls”). This could have serious consequences down the line for AI safety research, above and beyond the object-level hazards of revealing potentially dangerous ideas to the public.
Given all of this, I don’t think it’s obvious that the best move at this point involves making all of the strategic arguments around AI safety public. (And note that I say this as a member of said public: I am not affiliated with MIRI or any other AI safety institution, nor am I personally acquainted with anyone who is so affiliated. This therefore makes me a direct counter-example to your claim about the public in general having reason to think secret-keeping organizations must be doing so for self-interested reasons.)
To be clear: I think there is a possible world in which your arguments make sense. I also think there is a possible world in which your arguments not only do not make sense, but would lead to a clearly worse outcome if taken seriously. It’s not clear to me which of these worlds we actually live in, and I don’t think you’ve done a sufficient job of arguing that we live in the former world instead of the latter.
If someone’s claiming “topic X is dangerous to talk about, and I’m not even going to try to convince you of the abstract decision theory implying this, because this decision theory is dangerous to talk about”, I’m not going to believe them, because that’s frankly absurd.
It’s possible to make abstract arguments that don’t reveal particular technical details, such as by referring to historical cases, or talking about hypothetical situations.
It’s also possible for Alice to convince Bob that some info is dangerous by giving the info to Carol, who is trusted by both Alice and Bob, after which Carol tells Bob how dangerous the info is.
If Alice isn’t willing to do any of these things, fine, there’s a possible but highly unlikely world where she’s right, and she takes a reputation hit due to the “unlikely” part of that sentence.
(Note, the alternative hypothesis isn’t just direct selfishness; what’s more likely is cliquish inner ring dynamics)
I haven’t had time to write my thoughts on when strategy research should and shouldn’t be public, but I note that this recent post by Spiracular touches on many of the points that I would touch on in talking about the pros and cons of secrecy around infohazards.
The main claim that I would make about extending this to strategy is that strategy implies details. If I have a strategy that emphasizes that we need to be careful around biosecurity, that implies technical facts about the relative risks of biology and other sciences.
For example, the US developed the Space Shuttle with a justification that didn’t add up (ostensibly it would save money, but it was obvious that it wouldn’t). The Soviets, trusting in the rationality of the US government, inferred that there must be some secret application for which the Space Shuttle was useful, and so developed a clone (so that when the secret application was unveiled, they would be able to deploy it immediately instead of having to build their own shuttle from scratch at that point). If in fact an application like that had existed, it seems likely that the Soviets could have found it by reasoning through “what do they know that I don’t?” when they might not have found it by reasoning from scratch.