johnswentworth comments on Challenges with Breaking into MIRI-Style Research

johnswentworth 17 Jan 2022 17:12 UTC
LW: 31 AF: 14
AF
The object-level claims here seem straightforwardly true, but I think “challenges with breaking into MIRI-style research” is a misleading way to characterize it. The post makes it sound like these are problems with the pipeline for new researchers, but really these problems are all driven by challenges of the kind of research involved.
The central feature of MIRI-style research which drives all this is that MIRI-style research is preparadigmatic. The whole point of preparadigmatic research is that:
- We don’t know the right frames to apply (and if we just picked some, they’d probably be wrong)
- We don’t know the right skills or knowledge to train (and if we just picked some, they’d probably be wrong)
- We don’t have shared foundations for communicating work (and if we just picked some, they’d probably be wrong)
- We don’t have shared standards for evaluating work (and if we just picked some, they’d probably be wrong)
Here’s how the challenges of preparadigmicity apply to the points in the post.
- MIRI doesn’t seem to be running internships^[3] or running their AI safety for computer scientists workshops
MIRI does not know how to efficiently produce new theoretical researchers. They’ve done internships, they’ve done workshops, and the yields just weren’t that great, at least for producing new theorists.
- You can park in a standard industry job for a while in order to earn career capital for ML-style safety. Not so for MIRI-style research.
- There are well-crafted materials for learning a lot of the prerequisites for ML-style safety.
- There seems to be a natural pathway of studying a masters then pursuing a PhD to break into ML-style safety. There are a large number of scholarships available and many countries offer loans or income support
- General AI safety programs and support (i.e. AI Safety Fundamentals Course, AI Safety Support, AI Safety Camp, Alignment Newsletter, etc.) are naturally going to strongly focus on ML-style research and might not even have the capability to vet MIRI-style research.
There is no standardized field of knowledge with the tools we need. We can’t just go look up study materials to learn the right skills or knowledge, because we don’t know what skills or knowledge those are. There’s no standard set of alignment skills or knowledge which an employer could recognize as probably useful for their own problems, so there are no standardized industry jobs. Similarly, there’s no PhD for alignment; we don’t know what would go into it.
- There’s no equivalent to submitting a paper^[4]. If a paper passes review, then it gains a certain level of credibility. There are upvotes, but this signaling mechanism is more distorted by popularity or accessibility. Further, unlike writing an academic paper, writing alignment forum posts won’t provide credibility outside of the field.
We don’t have clear shared standards for evaluating work. Most people doing MIRI-style research think most other people doing MIRI-style research are going about it all wrong. Whatever perception of credibility might be generated by something paper-like would likely be fake.
- It is much harder to find people with similar interests to collaborate with or mentor you. Compare to how easy it is to meet a bunch of people interested in ML-style research by attending EA meetups or EAGx.
We don’t have standard frames shared by everyone doing MIRI-style research, and if we just picked some frames they would probably be wrong, and the result would probably be worse than having a wide mix of frames and knowing that we don’t know which ones are right.
Main takeaway of all that: most of the post’s challenges of breaking into MIRI-style research accurately reflect the challenges involved in doing MIRI-style research. Figuring out new paths, new frames, applying new skills and knowledge, explaining your own ways of evaluating outputs… these are all central pieces of doing this kind of research. If the pipeline did not force people to figure this sort of stuff out, then it would not select for researchers well-suited to this kind of work.
Now, I do still think the pipeline could be better, in principle. But the challenge is to train people to build their own paradigms, and that’s a major problem in its own right. I don’t know of anyone ever having done it before at scale; there’s no template to copy for this. I have been working on it, though.
- Chris_Leong 17 Jan 2022 18:18 UTC
  LW: 14 AF: 5
  AF Parent
  The object-level claims here seem straightforwardly true, but I think “challenges with breaking into MIRI-style research” is a misleading way to characterize it. The post makes it sound like these are problems with the pipeline for new researchers, but really these problems are all driven by challenges of the kind of research involved.
  
  There’s definitely some truth to this, but I guess I’m skeptical that there isn’t anything that we can do about some of these challenges. Actually, rereading, I can see that you’ve conceded this towards the end of your post. I agree that there might be a limit to how much progress we can make on these issues, but I think we shouldn’t be too quick to rule out making progress.
  Figuring out new paths, new frames, applying new skills and knowledge, explaining your own ways of evaluating outputs… these are all central pieces of doing this kind of research. If the pipeline did not force people to figure this sort of stuff out, then it would not select for researchers well-suited to this kind of work.
  
  Some of these aspects don’t really select for people with the ability to figure this kind of stuff out, but rather strongly select for people who have either saved up money to fund themselves or who happen to be located in the Bay Area, etc.
  We don’t know the right frames to apply (and if we just picked some, they’d probably be wrong)
  Philosophy often has this problem, and it addresses it by covering a wide range of perspectives in the hope that you’re inspired by the readings even if none of them are correct.
  We don’t have clear shared standards for evaluating work. Most people doing MIRI-style research think most other people doing MIRI-style research are going about it all wrong. Whatever perception of credibility might be generated by something paper-like would likely be fake.
  This is a hugely difficult problem, but maybe it’s better to try rather than not try at all?
  - johnswentworth 18 Jan 2022 1:51 UTC
    LW: 6 AF: 4
    AF Parent
    There’s definitely some truth to this, but I guess I’m skeptical that there isn’t anything that we can do about some of these challenges. Actually, rereading, I can see that you’ve conceded this towards the end of your post. I agree that there might be a limit to how much progress we can make on these issues, but I think we shouldn’t be too quick to rule out making progress.
    To be clear, I don’t intend to argue that the problem is too hard or not worthwhile or whatever. Rather, my main point is that solutions need to grapple with the problems of teaching people to create new paradigms, and working with people who don’t share standard frames. I expect that attempts to mimic the traditional pipelines of paradigmatic fields will not solve those problems. That’s not an argument against working on it, it’s just an argument that we need fundamentally different strategies than the standard education and career paths in other fields.
- Koen.Holtman 19 Jan 2022 17:13 UTC
  LW: 5 AF: 3
  AF Parent
  I like your summary of the situation:
  
  Most people doing MIRI-style research think most other people doing MIRI-style research are going about it all wrong.
  
  This has also been my experience, at least on this forum. Much less so in academic-style papers about alignment. This has certain consequences for the problem of breaking into preparadigmatic alignment research.
  
  Here are two ways to do preparadigmatic research:
  1. Find something that is all wrong with somebody else’s paradigm, then write about it.
  2. Find a new useful paradigm and write about it.
  MIRI-style preparadigmatic research, to the extent that it is published, read, and discussed on this forum, is almost all about the first of the above. Even on a forum as generally polite and thoughtful as this one, social media dynamics promote and reward the first activity much more than the second.
  
  In science and engineering, people will usually try very hard to make progress by standing on the shoulders of others. The discourse on this forum, on the other hand, more often resembles that of a bunch of crabs in a bucket.
  
  My conclusion is of course that if you want to break into preparadigmatic research, then you are going about it all wrong if your approach is to try to engage more with MIRI, or to maximise engagement scores on this forum.
  - Chris_Leong 19 Jan 2022 19:07 UTC
    LW: 2 AF: 1
    AF Parent
    In science and engineering, people will usually try very hard to make progress by standing on the shoulders of others. The discourse on this forum, on the other hand, more often resembles that of a bunch of crabs in a bucket.
    
    Hmm… Yeah, I certainly don’t think that there’s enough collaboration or appreciation of the insights that other approaches may provide.
    
    Any thoughts on how to encourage a healthier dynamic?
    - Koen.Holtman 20 Jan 2022 12:19 UTC
      LW: 3 AF: 2
      AF Parent
      
      Any thoughts on how to encourage a healthier dynamic?
      
      I have no easy solution to offer, except for the obvious comment that the world is bigger than this forum.
      
      My own stance is to treat the over-production of posts of type 1 above as just one of those inevitable things that will happen in the modern media landscape. There is some value to these posts, but after you have read about 20 of them, you can be pretty sure about how the next one will go.
      
      So I try to focus my energy, as a reader and writer, on work of type 2 instead. I treat arXiv as my main publication venue, but I do spend some energy cross-posting my work of type 2 here. I hope that it will inspire others, or at least counter-balance some of the type 1 work.