I see that many people are commenting how it’s crazy to try to keep things secret between coworkers, or to not allow people to even mention certain projects, or that this kind of secrecy is psychologically damaging, or the like.
Now, I imagine this is heavily dependent on exactly how it’s implemented, and I have no idea how it’s implemented at MIRI. But just as a relevant data point: this kind of secrecy is totally par for the course for anybody who works for certain government and especially military-related organizations or contractors. You need extensive background checks to get a security clearance, and even then you can’t mention anything classified to someone else unless they have a valid need to know, you’re in a secure classified area that meets a lot of very detailed guidelines, etc. Even within small groups, there are certain projects that you simply are not allowed to discuss with other group members, since they do not necessarily have a valid need to know. If you’re not sure whether something is classified, you should be talking to someone higher up who does know. There are projects whose existence you cannot even admit, and there are even words that you cannot mention in connection with each other even though each word on its own is totally normal and unclassified. In some places like the CIA or the NSA, you’re usually not even supposed to admit that you work there.
Again, this is probably all very dependent on exactly how the security guidelines are implemented. I am also not commenting at all on whether or not the information that MIRI tries to keep secret should in fact be kept secret. I am just pointing out that if some organization thinks that certain pieces of information really do need to be kept secret, and if they implement secrecy guidelines in the proper way, then, as far as I can tell, everything that’s been described as MIRI policy seems pretty reasonable to me.
Some secrecy between coworkers could be reasonable, including secrecy about what secret projects exist (e.g. “we’re combining AI techniques X and Y and applying them to application Z first as a test”).
What seemed off is that the only information concealed by the policy in question (that researchers shouldn’t ask each other what they’re working on) is who is and isn’t recently working on a secret project. That isn’t remotely enough information to derive AI insights to any significant degree. Doing detective work on “who started saying they had secrets at the same time” to derive AI insights is a worse use of time than just reading more AI papers.
The policy in question is strictly dominated by an alternative policy: revealing that you are working on a secret project, but not which one. When I see a policy that is this clearly suboptimal for the stated goal, I have to infer alternative motives, such as maintaining domination of people by isolating them from each other. (Such a motive could be memetic/collective, partially constituted by people copying each other, rather than serving anyone’s individual interest, although personal motives are relevant too.)
Mainstream organizations being secretive at the level MIRI was isn’t a particularly strong argument in favor of such secrecy. As we learned with COVID, many mainstream organizations actively oppose their stated missions. Zack Davis points out that controlling people into acting against their interests is a common function of mainstream policies (this is especially obvious in the military). Such control is especially counterproductive for FAI research, where a large part of the problem is to make AI act on human values rather than false approximations of them. Revealing actual human value requires freedom to act according to revealed preferences, not just pre-specified models of goals. (In other words: if everything in an organization is organized around pursuing a legible goal that is only an instrumental goal of human value, that org is either a UFAI or is not a general intelligence)
If mainstream policies were sufficient, there wouldn’t be any need for MIRI, since other AI orgs already use mainstream policies.
There are a few parts in here that seem fishy enough to me that I want to flag them.
Mainstream organizations being secretive at the level MIRI was isn’t a particularly strong argument in favor of such secrecy. As we learned with COVID, many mainstream organizations actively oppose their stated missions.
This is fair as pushback against the sort-of appeal to authority it’s replying to, but it’s also not a very good proof that secrecy is a bad idea. To boil it down: the argument went “Secrecy works well for many existing organizations” and you replied “Many existing organizations did a bad job during COVID.” Strictly speaking, doing a bad job during COVID means that not everything is going well, but this is still a pretty weird and weak argument.
This whole paragraph:
Zack Davis points out that controlling people into acting against their interests is a common function of mainstream policies (this is especially obvious in the military). Such control is especially counterproductive for FAI research, where a large part of the problem is to make AI act on human values rather than false approximations of them. Revealing actual human value requires freedom to act according to revealed preferences, not just pre-specified models of goals. (In other words: if everything in an organization is organized around pursuing a legible goal that is only an instrumental goal of human value, that org is either a UFAI or is not a general intelligence)
also makes next to no sense to me. Please correct me if I’m wrong (which I kinda think I might be), but I read this as
1. Mainstream organizations make people act against their own values
2. We want AI to act on human values
3. Only agents acting on human values can develop an AI that acts on human values
4. By 1, 3: Mainstream organizations act against human values
5. By 3, 4: Mainstream organizations cannot develop FAI

Which seems to not follow in any way to me.
There’s a big difference between “optimizing poorly” and “pessimizing”, i.e. making the problem worse in ways that require some amount of cleverness. Mainstream institutions’ handling of COVID was a case of pessimizing, not just optimizing poorly, e.g. banning tests, telling people masks don’t work, and seizing mask shipments.
I don’t think you’re mis-stating the argument here; it really is a thing I’m arguing, that institutions that make people act against their values can’t build FAI. As an example, you could imagine an institution that optimized for some utility function U that was designed by committee. That U wouldn’t be the human utility function (unless the design-by-committee process is a reliable value loader), so forcing everyone to optimize U means you aren’t optimizing the human utility function; it has the same issues as a paperclip maximizer.
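To make the proxy-versus-true-utility point concrete, here is a minimal toy sketch (my own illustration with arbitrary, assumed functions and numbers, not anything from MIRI or this thread): an optimizer pointed at a committee-designed proxy U can score perfectly on U while doing badly on the utility U was meant to approximate.

```python
# Toy sketch only: hypothetical utility functions chosen to illustrate that
# hard-optimizing a committee-designed proxy U is not the same as optimizing
# the human utility it was meant to approximate.

def human_utility(x: float) -> float:
    # Stand-in for what people actually want: best outcome at x = 1.0.
    return -(x - 1.0) ** 2

def committee_proxy_u(x: float) -> float:
    # A legible approximation agreed on by committee: best outcome at x = 3.0.
    return -(x - 3.0) ** 2

def best_action(utility, candidates):
    # Pick whichever candidate action the given utility function scores highest.
    return max(candidates, key=utility)

candidates = [i / 10 for i in range(51)]  # candidate actions in [0.0, 5.0]

x_proxy = best_action(committee_proxy_u, candidates)
x_human = best_action(human_utility, candidates)

print(f"optimizing U picks x = {x_proxy}, human utility there = {human_utility(x_proxy):.2f}")
print(f"optimizing human utility picks x = {x_human}, human utility there = {human_utility(x_human):.2f}")
# The U-optimizer scores perfectly on U while landing far from what humans
# actually value, which is the sense in which it shares the paperclip
# maximizer's problem.
```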
What if you try setting U = “get FAI”? Too bad: “FAI” is a Lisp token; for it to have semantics, it has to connect with human value somehow, i.e. someone actually wanting a thing and being assisted in getting it.
Maybe you can have a research org where some people are slaves and some aren’t, but for this to work you’d need a legible distinction between the two classes, so you don’t get confused into thinking you’re optimizing the slaves’ utility function by enslaving them.

With a bit more meat, I can see what you’re referring to better.

I still don’t agree, I think, but I can now see why you would come to that belief much better than I could before. I appreciate the clarification, thank you.
You have by far more information than me about what it’s like on the ground as a MIRI researcher.
But one thing missing so far is that my sense was that a lot of researchers preferred the described level of secretiveness as a simplifying move?
e.g. “It seems like I could say more without violating any norms, but I have a hard time tracking where the norms are and it’s easier for me to just be quiet as a general principle. I’m going to just be quiet as a general principle rather than being the-maximum-cooperative-amount-of-open, which would be a burden on me to track with the level of conscientiousness I would want to apply.”
The policy described was mandated; it wasn’t just voluntary. Anyway, I don’t really trust something optimizing this badly to have a non-negligible shot at FAI, so the point is kind of moot.