The main thing missing here is academic groups (like mine at Cambridge https://www.davidscottkrueger.com/). This is a pretty glaring oversight, although I’m not that surprised since it’s LW.
Some other noteworthy groups in academia led by people who are somewhat connected to this community:
- Jacob Steinhardt (Berkeley)
- Dylan Hadfield-Menell (MIT)
- Sam Bowman (NYU)
- Roger Grosse (UofT)
More at https://futureoflife.org/team/ai-existential-safety-community/ (although I think the level of focus on x-safety and engagement with this community varies substantially among these people).
BTW, FLI is itself worth a mention, as is FHI, maybe in particular https://www.fhi.ox.ac.uk/causal-incentives-working-group/ if you want to focus on technical stuff.
Some other noteworthy groups in academia led by people who are perhaps less connected to this community:
- Aleksander Madry (MIT)
- Percy Liang (Stanford)
- Scott Niekum (UMass Amherst)
These are just examples.
(speaking just for myself, not Thomas, but I think it’s likely he’d endorse most of this)
I agree it would be great to include many of these academic groups; the exclusion wasn’t out of any sort of malice. Personally I don’t know very much about what most of these groups are doing or their motivations; if any of them want to submit brief write-ups I’d be happy to add them! :)
edit: lol, Thomas responded with a similar tone while I was typing
The causal incentives working group should get mentioned; its work is directly on AI safety. Though it’s a bit older, I gained a lot of clarity about AI safety concepts via “Modeling AGI Safety Frameworks with Causal Influence Diagrams”, which is quite accessible even if you don’t have a ton of training in causality.
Sorry about that, and thank you for pointing this out.
For now I’ve added a disclaimer (footnote 2 right now; I might make this more visible/clear, but I’m not sure what the best way of doing that is). I will try to add summaries of some of these groups when I have read some of their papers; currently I have not read much of their research.
Edit: agree with Eli’s comment.
Can you provide some links to these groups?
These professors all have many papers published at academic conferences. It’s probably a bit frustrating to not have their work summarized, and then be asked to explain their own work, when all of their work is published already. I would start by looking at their Google Scholar pages, followed by personal websites and maybe Twitter. One caveat is that papers probably don’t have full explanations of the x-risk motivation or applications of the work, but that’s reading between the lines that AI safety people should be able to do themselves.
Agree with both aogara and Eli’s comment.
For me this reading between the lines is hard: I spent ~2 hours reading academic papers/websites yesterday, and while I could quite quickly summarize the work itself, it was quite hard for me to figure out the motivations.
There’s a lot of work that could be relevant for x-risk but is not motivated by it. Some of it is more relevant than work that is motivated by it. An important challenge for this community (to facilitate scaling of research funding, etc.) is to move away from evaluating work based on motivations, and towards evaluating work based on technical content.
See “The academic contribution to AI safety seems large” and its comments for some existing discussion related to this point.
PAIS #5 might be helpful here. It explains how a variety of empirical directions are related to x-risk and probably includes many of the ones that academics are working on.
Agreed it’s really difficult for a lot of the work. You’ve probably seen it already but Dan Hendrycks has done a lot of work explaining academic research areas in terms of x-risk (e.g. this and this paper). Jacob Steinhardt’s blog and field overview and Sam Bowman’s Twitter are also good for context.
I second this, that it’s difficult to summarize AI-safety-relevant academic work for LW audiences. I want to highlight the symmetric difficulty of trying to summarize the mountain of blog-post-style work on the AF for academics.
In short, both groups have steep reading/learning curves that are under-appreciated when you’re already familiar with it all.
It’s probably a bit frustrating to not have their work summarized, and then be asked to explain their own work, when all of their work is published already.
Fair, I see why this would be frustrating, and I apologize for any frustration caused. In an ideal world we would have read many of these papers and summarized them ourselves, but that would have taken a lot of time, and I think the post was valuable to get out ASAP.
ETA: Probably it would have been better to include more of a disclaimer on the “everyone” point from the get-go; I think not doing this was a mistake.
(Also, this is an incredibly helpful writeup and it’s only to be expected that some stuff would be missing. Thank you for sharing it!)
I don’t think the onus should be on the reader to infer x-risk motivations. In academic ML, it’s the author’s job to explain why the reader should care about the paper. I don’t see why this should be different in safety. If it’s hard to do that in the paper itself, you can always e.g. write a blog post explaining safety relevance (as mentioned by aogara, people are already doing this, which is great!).
There are often many different ways in which a paper might be intended to be useful for x-risk (and ways in which it might not be). Often the motivation for a paper (even in the groups mentioned above) may be some combination of it being an interesting ML problem, the interests of the particular student, and various possible thoughts around AI safety. It’s hard to disentangle this from the outside by reading between the lines.
On the other hand, there are a lot of reasons to believe the authors may be delusional about the promise of their research and its theory of impact. What I personally get most out of posts like this is having a third-party perspective that I can compare with my own.
It’s probably a bit frustrating to not have their work summarized, and then be asked to explain their own work, when all of their work is published already.
On the one hand, yeah, probably frustrating. On the other hand, that’s the norm in academia: people publish work and then nobody reads it.
Anecdotally, I’ve heard the same said of Less Wrong / Alignment Forum posts among AI safety / EA academics: that they amount to an echo chamber that no one else reads.
I suspect both communities are taking their collective lack of familiarity with the other as evidence that the other community isn’t doing its part to disseminate its ideas properly. Of course, neither community seems particularly interested in taking the time to read up on the other, and each seems to think that the other should simply mimic its example (LWers want more LW synopses of academic papers; academics want AF work to be published in journals).
Personally I think this is symptomatic of a larger camp-ish divide between the two, which is worth trying to bridge.
All of these academics are widely read and cited. Looking at their Google Scholar profiles, every one of them has more than 1,000 citations, and half have more than 10,000. Outside of LessWrong, lots of people in academia and industry labs already read and understand their work. We shouldn’t disparage people who are successfully bringing AI safety into the mainstream ML community.