I guess this is sort of an agreement with the post… but I don’t think the post goes far enough.
Whoever “you guys” are, all you’ll do by adopting a lot of secrecy is slow yourselves down radically, while making sure that people who are better than you are at secrecy, who are better than you are at penetrating secrecy, who have more resources than you do, and who are better at coordinated action than you are, will know nearly everything you do, and will also know many things that you don’t know.
They will “scoop” you at every important point. And you have approximately zero chance of ever catching up with them on any of their advantages.
The best case long term outcome of an emphasis on keeping dangerous ideas secret would be that particular elements within the Chinese government (or maybe the US government, not that the corresponding elements would necessarily be much better) would get it right when they consolidated their current worldview’s permanent, unchallengeable control over all human affairs. That control could very well include making it impossible for anyone to even want to change the values being enforced. The sorts of people most likely to be ahead throughout any race, and most likely to win if there’s a hard “end”, would be completely comfortable with re-educating you to cure your disharmonious counter-revolutionary attitudes. If they couldn’t do that, they’d definitely arrange things so that you couldn’t ever communicate those attitudes or coordinate around them.
The worst case outcome is that somebody outright destroys the world in a way you might have been able to talk them out of.
Secrecy destroys your influence over people who might otherwise take warnings from you. Nobody is going to change any actions without a clear and detailed explanation of the reasons. And you can’t necessarily know who needs to be given such an explanation. In fact, people you might consider members of “your community” could end up making nasty mistakes because they don’t know something you do.
I’ve spent a lot of my career on the sorts of things where people try to keep secrets, and my overall impression of the AI risk and X-risk communities (including Nick Bostrom) is that they have a profoundly unrealistic, sometimes outright romanticized, view of what secrecy is and what it can do for them (and an unduly rosy view of their prospects for unanimous action in general).
all you’ll do by adopting a lot of secrecy is slow yourselves down radically, while making sure that people who are better than you are at secrecy, who are better than you are at penetrating secrecy, who have more resources than you do, and who are better at coordinated action than you are, will know nearly everything you do, and will also know many things that you don’t know.
So, there are a few types of secrecy. Here are three.
The sort of secrecy you have with friends when you gossip, which most of the time works fine.
The sort of secrecy where nobody really knows what is being worked on within companies like Apple and Facebook, whereas there’s way more openness about e.g. Google.
The sort of secrecy where you’re trying to protect yourself from foreign governments, which is way harder.
I’m pretty sure secrecy has been key for Apple’s ability to control its brand, and it’s not just slowed itself down, and I think that it’s plausible to achieve similar levels of secrecy, and that this has many uses. But what you’re talking about is secrecy from governmental groups actively trying to hack you.
I largely agree that when a major government wants your info, they can get it, though I’m not sure it’s impossible to keep secrets from them with a massive amount of work (I have not thought about it too much). I do question your assumption that governments will end up taking over the world; I think that with deeply revolutionary tech like nanotech, AI, and others, different groups could end up taking over the world. So I don’t view things as clearly falling toward the outcome of Chinese/US/etc. hegemony.
I don’t think Apple is a useful model here at all.
I’m pretty sure secrecy has been key for Apple’s ability to control its brand,
Well, Apple thinks so, anyway. They may or may not be right, and “control of the brand” may or may not be important in the first place. But it’s true that Apple can keep secrets to some degree.
and it’s not just slowed itself down,
Apple is a unitary organization, though. It has a boundary. It’s small enough that you can find the person whose job it is to care about any given issue, and you are unlikely to miss anybody who needs to know. It has well-defined procedures and effective enforcement. Its secrets have a relatively short lifetime of maybe as much as 2 or 3 years.
Anybody who is spying on Apple is likely to be either a lot smaller, or heavily constrained in how they can safely use any secret they get. If I’m at Google and I steal something from Apple, I can’t publicize it internally, and in fact I run a very large risk of getting fired or turned in to law enforcement if I tell it to the wrong person internally.
Apple has no adversary with a disproportionate internal communication advantage, at least not with respect to any secrets that come from Apple.
The color of the next iPhone is never going to be as interesting to any adversary as an X-risk-level AI secret. And if, say, MIRI actually has a secret that is X-risk-level, then anybody who steals it, and who’s in a position to actually use it, is not likely to feel the least bit constrained by fear of MIRI’s retaliation in using it to do whatever X-risky thing they may be doing.
There’s also the sort of secrecy you have when you signed an NDA because you consult with a company. I would expect a person like Nick Bostrom to have access to information about what happens inside DeepMind that’s protected by promises of secrecy.

I can tell you that if you just want to walk into DeepMind (i.e. past the security gate), you have to sign an NDA.
These seem like important considerations, but they aren’t really engaging with what Chatham House rules are trying to do, which is not to keep secrets, just to keep people’s identities obfuscated enough that people feel comfortable speaking freely.
I’ve spent a lot of my career on the sorts of things where people try to keep secrets, and my overall impression of the AI risk and X-risk communities (including Nick Bostrom) is that they have a profoundly unrealistic, sometimes outright romanticized, view of what secrecy is and what it can do for them (and an unduly rosy view of their prospects for unanimous action in general).
I guess I’d be interested in discussing specifics with you, but I can’t think of any good public examples of secrecy in AI-risk and X-risk related communities.
Oh, actually, yes I can: MIRI’s written about going non-disclosed by default. I expect you to think this is fine and probably good and not too relevant, because it’s not (as far as the writeup suggests) an attempt to keep secrets from the US government, and you expect they’d fail at that. Is that right?
And OpenAI is attempting to push more careful release practices into the Overton window of discussion in the ML communities (my summary is here). While I agree that if a foreign government wants that tech, they can probably get it, I still think not releasing things has major effects. For example, there are lots of great researchers in the world that aren’t paid by governments, and those people cannot get the ideas, which means that overall progress on potentially dangerous tech slows down considerably. I’m not sure what I expect you to think of this one. I’m curious whether you think this seems unrealistic/romanticised.
MIRI’s written about going non-disclosed by default. I expect you to think this is fine and probably good and not too relevant, because it’s not (as far as the writeup suggests) an attempt to keep secrets from the US government, and you expect they’d fail at that. Is that right?
No, I think it’s probably very counterproductive, depending on what it really means in practice. I wasn’t quite sure what the balance was between “We are going to actively try to keep this secret” and “It’s taking too much of our time to write all of this up”.
On the secrecy side of that, the problem isn’t whether or not MIRI’s secrecy works (although it probably won’t)[1]. The problem is with the cost and impact on their own community from their trying to do it. I’m going to go into that further down this tome.
And OpenAI is attempting to push more careful release practices into the Overton window of discussion in the ML communities (my summary is here). [...]
For example, there are lots of great researchers in the world that aren’t paid by governments, and those people cannot get the ideas [...]
That whole GPT thing was just strange.
OpenAI didn’t conceal any of the ideas at all. They held back the full version of the actual trained network, but as I recall they published all of the methods they used to create it. Although a big data blob like the network is relatively easy to keep secret, if your goal is to slow down other research, controlling the network isn’t going to be effective at all.
… and I don’t think that slowing down follow-on research was their goal. If I remember right, they seemed to be worried that people would abuse the actual network they’d trained. That was indeed unrealistic. I’ve seen the text from the full network, and played with giving it prompts and seeing what comes out. Frankly, the thing is useless for fooling anybody and wouldn’t be worth anybody’s time. You could do better by driving a manually created grammar with random numbers, and people already do that.
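For what it’s worth, here’s a minimal sketch of the kind of thing I mean by “driving a manually created grammar with random numbers”. It’s purely illustrative (the grammar and vocabulary are made up for this comment, not taken from anything OpenAI released): you write the templates by hand and let a random number generator pick among them.

```python
import random

# A tiny hand-written grammar: each nonterminal expands to a randomly chosen
# alternative. Everything here (symbols, vocabulary) is invented for the example.
GRAMMAR = {
    "SENTENCE": [
        "NP VP .",
        "NP VP , and NP VP .",
    ],
    "NP": ["the ADJ NOUN", "the NOUN"],
    "VP": ["VERB NP", "VERB NP", "VERB that NP VP"],
    "ADJ": ["unexpected", "distributed", "careful", "dangerous"],
    "NOUN": ["researcher", "network", "policy", "announcement"],
    "VERB": ["released", "criticized", "predicted", "ignored"],
}

def expand(symbol):
    """Recursively expand a symbol; tokens not in the grammar are literals."""
    if symbol not in GRAMMAR:
        return symbol
    production = random.choice(GRAMMAR[symbol])
    return " ".join(expand(token) for token in production.split())

if __name__ == "__main__":
    for _ in range(3):
        # Crude cleanup of the spaces left in front of punctuation.
        print(expand("SENTENCE").replace(" ,", ",").replace(" .", "."))
```

Nobody would mistake that output for human writing either, but that’s roughly the point: pattern-shaped filler is already cheap.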
Treating it like a Big Deal just made OpenAI look grossly out of touch. I wonder how long it took them to get the cherry-picked examples they published when they made their announcement...
So, yes, I thought OpenAI was being unrealistic, although it’s not the kind of “romanticization” I had in mind. I just can’t figure out what they could have stood to gain by that particular move.
All that said, I don’t think I object to “more careful release practices”, in the sense of giving a little thought to what you hand out. My objections are more to things like:
Secrecy-by-default, or treating it as cost-free to make something secret. It’s impractical to have too many secrets, and doing so tends to dilute your protection for any secrets you truly do need. In the specific case of AI risk, I think it also changes the balance of speed between you and your adversaries… for the worse. I’ll explain more about that below when I talk about MIRI.
The idea that you can just “not release things”, without very strict formal controls and institutional boundaries, and have that actually work in any meaningful way. There seems to be a lot of “illusion of control” thinking going on. Real secrecy is hard, and it gets harder fast if it has to last a long time.
To set the frame for the rest, I’m going to bloviate a bit about how I’ve seen secrecy work in general.
One of the “secrets of secrecy” is that, at any scale beyond two or three people, it’s more about controlling diffusion rates than about creating absolute barriers. Information interesting enough to care about will leak eventually.
You have some amount of control over the diffusion rate within some specific domains, and at their boundaries. Once information breaks out into a domain you do not control, it will spread according to the conditions in that new domain regardless of what you do. When information hits a new community, there’s a step change in how fast it propagates.
Which brings up the next not-very-secret secret: I’m wrong to talk about a “diffusion rate”. The numbers aren’t big enough to smooth out random fluctuations the way they are for molecules. Information tends to move in jumps, for lots of reasons. Something may stay “secret” for a really long time just because nobody notices it… and then become big news when it gets to somebody who actively propagates it, or to somebody who sees an implication others didn’t. A big part of propagation is the framing and setting; if you pair some information with an explanation of why it matters, and release it into a community with a lot of members who care, it will move much, much faster than if you don’t.[2]
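If it helps to visualize, here’s a crude toy simulation of that dynamic (entirely my own illustration, with made-up numbers): slow person-to-person spread inside the originating group, plus a small per-step chance of a leak into a bigger community with different conditions, which shows up as a jump rather than a smooth curve.

```python
import random

random.seed(1)  # fixed seed so the toy run is reproducible

# Two "domains" with different conditions. All numbers are invented.
communities = [
    {"name": "origin", "size": 50, "spread_rate": 0.05, "knowers": 1},
    {"name": "outside", "size": 5000, "spread_rate": 0.30, "knowers": 0},
]
LEAK_PROBABILITY = 0.02  # chance per step that the secret jumps the boundary

for step in range(80):
    for c in communities:
        if c["knowers"] == 0:
            continue
        # Each person who knows has a chance of telling one more person.
        newly_told = sum(
            1 for _ in range(c["knowers"]) if random.random() < c["spread_rate"]
        )
        c["knowers"] = min(c["size"], c["knowers"] + newly_told)
    # The boundary crossing: rare, but once it happens the second community's
    # own conditions take over, regardless of what the first one does.
    if communities[1]["knowers"] == 0 and random.random() < LEAK_PROBABILITY:
        communities[1]["knowers"] = 1
    if step % 10 == 0:
        print(step, {c["name"]: c["knowers"] for c in communities})
```

The exact numbers are meaningless; the point is the shape: long flat stretches, then a step, then growth governed by the new community’s conditions rather than yours.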
So, now, MIRI’s approach...
The problem with what MIRI seems to be doing is that it disproportionately slows the movement of information within their own community and among their allies. In most cases, they will probably hurt themselves more than they hurt their “adversaries”.
Ideas will still spread among the “good guys”, but unreliably, slowly, through an unpredictable rumor mill, with much negotiation and everybody worrying at every turn about what to tell everybody else [3]. That keeps problems from getting solved. It can’t be fixed by telling the people who “need to know”, because MIRI (or whoever) won’t know who those people are, especially-but-not-only if they’re also being secretive.
Meanwhile, MIRI can’t rely on keeping absolute secrets from anybody for any meaningful amount of time. And they’ll probably have a relatively small effect on institutions that could actually do dangerous development. Assuming it’s actually interesting, once one of MIRI’s secrets gets to somebody who happens to be part of some “adversary” institution, it will be propagated throughout that institution, possibly very quickly. It may even get formally announced in the internal newsletter. It even has a chance of moving on from there into that first institution’s own institutional adversaries, because they spy on each other.
But the “adversaries” are still relatively good at secrecy, especially from non-peers, so any follow-on ideas they produce will be slower to propagate back out into the public where MIRI et al can benefit from them.
The advantage the AI risk and X-risk communities have is, if you will, flexibility: they can get their heads around new ideas relatively quickly, adapt, act on implications, build one idea on another, and change their course relatively rapidly. The corresponding, closely related disadvantage is weakness in coordinating work on a large scale toward specific, agreed-upon goals (like say big scary AI development projects).
Worrying too much about secrecy throws away the advantage, but doesn’t cure the disadvantage. Curing the disadvantage requires a culture and a set of material resources that I don’t believe MIRI and friends can ever develop… and that would probably torpedo their effectiveness if they did develop them.
By their nature, they are going to be the people who are arguing against some development program that everybody else is for. Maybe against programs that have already got a lot of investment behind them before some problem becomes clear. That makes them intrinsically less acceptable as “team players”. And they can’t easily focus on doing a single project; they have to worry about any possible way of doing it wrong. The structures that are good at building dangerous projects aren’t necessarily the same as the structures that are good at stopping them.
If the AI safety community loses its agility advantage, it’s not gonna have much left.
MIRI will probably also lose some donors and collaborators, and have more trouble recruiting new ones as time goes on. People will forget they exist because they’re not talking, and there’s a certain reluctance to give people money or attention in exchange for “pigs in pokes”… or even to spend the effort to engage and find out what’s in the poke.
A couple of other notes:
Sometimes people talk about spreading defensive ideas without spreading the corresponding offensive ideas. In AI, that comes out as wanting to talk about safety measures without saying anything about how to increase capability.
In computer security, it comes out as cryptic announcements to “protect this port from this type of traffic until you apply this patch”… and it almost never works for long. The mere fact that you’re talking about some specific subject is enough to get people interested and make them figure out the offensive side. It can work for a couple of weeks for a security bug announcement, but beyond that it will almost always just backfire by drawing attention. And it’s very rare to be able to improve a defense without understanding the actual threat.
[1] As for keeping secrets from any major government…

First, I still prefer to talk about the Chinese government. The US government seems less likely to be a player here. Probably the most important reason is that most parts of the US government apparatus see things like AI development as a job for “industry”, which they tend to believe should be a very clearly separate sphere from “government”. That’s kind of different from the Chinese attitude, and it matters. Another reason is that the US government tends to have certain legal constraints and certain scruples that limit its effectiveness in penetrating secrecy.

I threw the US in as a reminder that China is far from the only issue, and I chose them because they used to be more interesting back during the Cold War, and perhaps could be again if they got worried enough about “national security”.

But if any government, including the US, decides that MIRI has a lot of important “national security” information, and decides to look hard at them, then, yes, MIRI will largely fail to keep secrets. They may not fail completely. They may be able to keep some things off the radar, for a while. But that’s less likely for the most important things, and it will get harder the more people they convince that they may have information that’s worth looking at. Which they need to do.

They’ll probably even have information leaking into institutions that aren’t actively spying on them, and aren’t governments, either.

But all that just leaves them where they started anyway. If there were no cost to it, it wouldn’t be a problem.

[2] You can also get independent discoveries creating new, unpredictable starting points for diffusion. Often independent discoveries get easier as time goes on and the general “background” information improves. If you thought of something, even something really new, that can be a signal that conditions are making it easier for the next person to think of the same thing. I’ve seen security bugs with many independent discoveries.

Not to mention pathologies like one community thinking something is a big secret, and then seeing it break out from some other, sometimes much larger community that has treated it as common knowledge for ages.

[3] If you ever get to the point where mostly-unaffiliated individuals are having to make complicated decisions about what should be shared, or having to think hard about what they have and have not committed themselves not to share, you are 95 percent of the way to fully hosed.

That sort of thing kind of works for industrial NDAs, but the reason it works is that, regardless of what people have convinced themselves to believe, most industrial “secret sauce” is pretty boring, and the rest tends to be so specific and detailed that it’s obviously covered by any NDA. And you usually only care about relatively few competitors, most of whose employees don’t get paid enough to get sued. That’s very different from some really non-obvious, world-shaking insight that makes the difference between low-power “safe” AI and high-power “unsafe” AI.