First and foremost: Jessica, I’m sad you had a bad late/post-MIRI experience. I found your contributions to MIRI valuable (Quantilizers and Reflective Solomonoff Induction spring to mind as some cool stuff), and I personally wish you well.
A bit of meta before I say anything else: I’m leery of busting in here with critical commentary, and thereby causing people to think they can’t air dirty laundry without their former employer busting in with critical commentary. I’m going to say a thing or two anyway, in the name of honest communication. I’m open to suggestions for alternative ways to handle this tradeoff.
Now, some quick notes: I think Jessica is truthfully reporting her experiences as she recalls them. I endorse orthonormal’s comment as more-or-less matching my own recollections. That said, in a few of Jessica’s specific claims, I believe I recognize the conversations she’s referring to, and I feel misunderstood and/or misconstrued. I don’t want to go through old conversations blow-by-blow, but for a sense of the flavor, I note that in this comment Jessica seems to me to misconstrue some of Eliezer’s tweets in a way that feels similar to me. Also, as one example from the text, looking at the part of the text that names me specifically:
Nate Soares frequently [...] [said] that we must create a human emulation using nanotechnology that is designed by a “genie” AI [...]
I wouldn’t personally use a phrase like “we [at MIRI] must create a human emulation using nanotech designed by a genie AI”. I’d phrase that claim more like “my current best concrete idea is to solve narrow alignment sufficient for a limited science/engineering AGI to safely design nanotech capable of, eg, uploading a human”. This difference is important to me. In contrast with the connotations I read into Jessica’s account, I didn’t/don’t have particularly high confidence in that specific plan (and I wrote contemporaneously about how plans don’t have truth values, and that the point of having a concrete plan isn’t to think it will work). Also, my views in this vicinity are not particularly MIRI-centric (I have been a regular advocate of all AGI research teams thinking concretely and specifically about pivotal acts and how their tech could be used to end the acute risk period). Jessica was perhaps confused by my use of we-as-in-humanity instead of we-as-in-MIRI. I recall attempting to clarify that during our conversation, but perhaps it didn’t stick.
My experience conversing with Jessica, in the time period before she departed MIRI, was one of regular miscommunications, similar in flavor to the above two examples.
(NB: I’m not currently planning to engage in much back-and-forth.)
Thanks, I appreciate you saying that you’re sorry my experience was bad towards the end (I notice it actually makes me feel better about the situation), that you’re aware of how criticizing people the wrong way can discourage speech and are correcting for that, and that you’re still concerned enough about misconstruals to correct them where you see fit. I’ve edited the relevant section of the OP to link to this comment. I’m glad I had a chance to work with you even if things got really confusing towards the end.

With regard to the specific misconstruals:
I don’t think the OP asserted that this specific plan was fixed; it was an example of a back-chaining plan. But I see how “a world-saving plan” could imply that it was this specific plan, which it wasn’t.
I didn’t specify which small group was taking over the world, and I didn’t mean to imply that it had to be MIRI specifically; maybe the comparison with Leverage made it seem like that was implied?
I still don’t understand how I’m misconstruing Eliezer’s tweets. It seems very clear to me that he’s saying that something about how neural nets work would be very upsetting to learn about, and I don’t see what else he could be saying.
Regarding Eliezer’s tweets, I think the issue is that he is joking about the “never stop screaming”. He is using humor to point at a true fact, that it’s really unfortunate how unreliable neural nets are, but he’s not actually saying that if you study neural nets until you understand them then you will have a psychotic break and never stop screaming.
I’m not sure I agree with Jessica’s interpretation of Eliezer’s tweets, but I do think they illustrate an important point about MIRI: MIRI can’t seem to decide if it’s an advocacy org or a research org.
“if you actually knew how deep neural networks were solving your important mission-critical problems, you’d never stop screaming” is frankly evidence-free hyperbole, of the same sort activist groups use (e.g. “taxation is theft”). People like Chris Olah have spent a lot of time studying how neural nets solve problems, and I’ve never heard of them screaming about what they discovered.
Suppose there was a libertarian advocacy group with a bombastic leader who liked to go tweeting things like “if you realized how bad taxation is for the economy, you’d never stop screaming”. After a few years of advocacy, the group decides they want to switch to being a think tank. Suppose they hire some unusually honest economists, who study taxation and notice things in the data that kinda suggest taxation might actually be good for the economy sometimes. Imagine you’re one of those economists and you’re gonna ask your boss about looking into this more. You might have second thoughts like: Will my boss scream at me? Will they fire me? The organizational incentives don’t seem to favor truthseeking.
Another issue with advocacy is you can get so caught up in convincing people that the problem needs to be solved that you forget to solve it, or even take actions that are counterproductive for solving it. For AI safety advocacy, you want to convince everyone that the problem is super difficult and requires more attention and resources. But for AI safety research, you want to make the problem easy, and solve it with the attention and resources you have.
In The Algorithm Design Manual, Steven Skiena writes:
In any group brainstorming session, the most useful person in the room is the one who keeps asking “Why can’t we do it this way?”; not the nitpicker who keeps telling them why. Because he or she will eventually stumble on an approach that can’t be shot down… The correct answer to “Can I do it this way?” is never “no,” but “no, because. . . .” By clearly articulating your reasoning as to why something doesn’t work, you can check whether you have glossed over a possibility that you didn’t think hard enough about. It is amazing how often the reason you can’t find a convincing explanation for something is because your conclusion is wrong.
Being an advocacy org means you’re less likely to hire people who continually ask “Why can’t we do it this way?”, and those who are hired will be discouraged from asking it if it’s implied that a leader might scream when they dislike the proposed solution. The activist mindset tends to favor evidence-free hyperbole over carefully checking whether you glossed over a possibility, or wondering whether an inability to convince others means your conclusion is wrong.
I dunno if there’s an easy solution to this—I would like to see both advocacy work and research work regarding AI safety. But having them in the same org seems potentially suboptimal.
MIRI can’t seem to decide if it’s an advocacy org or a research org.
MIRI is a research org. It is not an advocacy org. It is not even close. You can tell by the fact that it basically hasn’t said anything for the last 4 years. Eliezer’s personal twitter account does not make MIRI an advocacy org.
(I recognize this isn’t addressing your actual point. I just found the frame frustrating.)
As a tiny, mostly-uninformed data point, I read “if you realized how bad taxation is for the economy, you’d never stop screaming” as having a very different vibe from Eliezer’s tweet, because he didn’t use the word “bad”. I know it’s a small difference, but it hits different. Something in his tweet was amusing because it felt like it was pointing at a presumably neutral thing and making it scary, whereas saying the same thing about a clearly moralistic point seems like it’s doing a different thing.
Again—a very minor point here, just wanted to throw it in.