There are a few parts in here that seem fishy enough to me to try to red flag them.
Mainstream organizations being secretive at the level MIRI was isn’t a particularly strong argument. As we learned with COVID, many mainstream organizations are opposing their stated mission.
This is fair as a rebuttal to the sort-of appeal to authority it is replying to, but it is also not a very good argument that secrecy is a bad idea. To boil it down: the argument went “Secrecy works well for many existing organizations” and you replied “Many existing organizations did a bad job during COVID”. Strictly speaking, doing a bad job during COVID means that not everything at those organizations is going well, but that is still a pretty weird and weak argument against secrecy in particular.
This whole paragraph:
Zack Davis points out that controlling people into acting against their interests is a common function of mainstream policies (this is especially obvious in the military). Such control is especially counterproductive for FAI research, where a large part of the problem is to make AI act on human values rather than false approximations of them. Revealing actual human value requires freedom to act according to revealed preferences, not just pre-specified models of goals. (In other words: if everything in an organization is organized around pursuing a legible goal that is only an instrumental goal of human value, that org is either a UFAI or is not a general intelligence)
also makes next to no sense to me. Please correct me if I’m wrong (which I kinda think I might be), but I read this as
1. Mainstream organizations make people act against their own values
2. We want AI to act on human values
3. Only agents acting on human values can develop an AI that acts on human values
4. By 1,3, Mainstream organizations act against human values
5. By 3,4, Mainstream organizations cannot develop FAI
Which seems to not follow in any way to me.
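For easier reference, here is the same reconstruction in rough logical shorthand. This is purely my illustrative notation, not anything from the original post: M(x) for “x is a mainstream organization”, P(x) for “x makes people act against their own values”, H(x) for “x acts on human values”, and F(x) for “x can develop FAI”.

```latex
% Rough rendering of steps 1-5 above; predicate names are illustrative only.
\begin{align*}
&(1)\ \forall x.\ M(x) \rightarrow P(x)        && \text{mainstream orgs make people act against their values} \\
&(2)\ \text{We want AI that acts on human values} \\
&(3)\ \forall x.\ F(x) \rightarrow H(x)        && \text{only agents acting on human values can develop FAI} \\
&(4)\ \forall x.\ M(x) \rightarrow \lnot H(x)  && \text{claimed from 1, 3} \\
&(5)\ \forall x.\ M(x) \rightarrow \lnot F(x)  && \text{from 3, 4}
\end{align*}
```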
There’s a big difference between “optimizing poorly” and “pessimizing”, i.e. making the problem worse in ways that require some amount of cleverness. Mainstream institutions’ handling of COVID was a case of pessimizing, not just optimizing poorly, e.g. banning tests, telling people masks don’t work, and seizing mask shipments.
I don’t think you’re mis-stating the argument here; I really am arguing that institutions that make people act against their values can’t build FAI. As an example, you could imagine an institution that optimized for some utility function U that was designed by committee. That U wouldn’t be the human utility function (unless the design-by-committee process is a reliable value loader), so forcing everyone to optimize U means you aren’t optimizing the human utility function; it has the same issues as a paperclip maximizer.
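To make that concrete, here’s a toy numerical sketch (my own made-up numbers, purely illustrative, not part of the actual argument): the true human utility cares about two features, the committee’s U only measures one, and an org whose members are all forced to optimize U does worse by the human utility than people splitting resources according to what they actually value.

```python
import numpy as np

# Toy sketch (made-up numbers): the "true" human utility cares about two
# features with diminishing returns, but the committee-designed proxy U
# only rewards the first, legible one.
def human_utility(x):
    return np.sqrt(x[0]) + np.sqrt(x[1])

def committee_proxy_U(x):
    return x[0]  # easy to measure, and not what humans actually want

budget = 100.0  # fixed resources the org can allocate

# An org whose members are all forced to optimize U pours everything into
# feature 0; people acting on their own values would split the budget.
forced_to_optimize_U = np.array([budget, 0.0])
acting_on_own_values = np.array([budget / 2, budget / 2])

print(committee_proxy_U(forced_to_optimize_U), human_utility(forced_to_optimize_U))    # 100.0, 10.0
print(committee_proxy_U(acting_on_own_values), human_utility(acting_on_own_values))    # 50.0, ~14.14
```

The specific numbers don’t matter; the point is just that U going up while human utility goes down is exactly the paperclip-maximizer failure mode.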
What if you try setting U = “get FAI”? Too bad: “FAI” is a LISP token; for it to have semantics it has to connect with human value somehow, i.e. someone actually wanting a thing and being assisted in getting it.
Maybe you can have a research org where some people are slaves and some aren’t, but for this to work you’d need a legible distinction between the two classes, so you don’t get confused into thinking you’re optimizing the slaves’ utility function by enslaving them.
With a bit more meat, I can see what you’re referring to better.
I still don’t think I agree, but I can see much better than before why you would arrive at that belief. I appreciate the clarification, thank you.