The main thing missing here is academic groups (like mine at Cambridge https://www.davidscottkrueger.com/). This is a pretty glaring oversight, although I’m not that surprised since it’s LW.
Some other noteworthy groups in academia led by people who are somewhat connected to this community:
- Jacob Steinhardt (Berkeley)
- Dylan Hadfield-Menell (MIT)
- Sam Bowman (NYU)
- Roger Grosse (UofT)
More at https://futureoflife.org/team/ai-existential-safety-community/ (although I think the level of focus on x-safety and engagement with this community varies substantially among these people).
BTW, FLI is itself worth a mention, as is FHI, maybe in particular https://www.fhi.ox.ac.uk/causal-incentives-working-group/ if you want to focus on technical stuff.
Some other noteworthy groups in academia led by people who are perhaps less connected to this community:
- Aleksander Madry (MIT)
- Percy Liang (Stanford)
- Scott Niekum (UMass Amherst)
These are just examples.
(speaking just for myself, not Thomas, but I think it’s likely he’d endorse most of this)
I agree it would be great to include many of these academic groups; the exclusion wasn’t out of any sort of malice. Personally I don’t know very much about what most of these groups are doing or their motivations; if any of them want to submit brief write-ups I’d be happy to add them! :)
edit: lol, Thomas responded with a similar tone while I was typing
The causal incentives working group should get mentioned; its work is directly on AI safety. Though it’s a bit older, I gained a lot of clarity about AI safety concepts via “Modeling AGI Safety Frameworks with Causal Influence Diagrams”, which is quite accessible even if you don’t have a ton of training in causality.
Sorry about that, and thank you for pointing this out.
For now I’ve added a disclaimer (footnote 2 right now; I might make this more visible/clear, but I’m not sure what the best way of doing that is). I will try to add summaries of some of these groups when I have read some of their papers; currently I have not read much of their research.
Edit: agree with Eli’s comment.
Can you provide some links to these groups?
These professors all have many papers published at academic conferences. It’s probably a bit frustrating to not have their work summarized, and then be asked to explain their own work, when all of their work is published already. I would start by looking at their Google Scholar pages, followed by personal websites and maybe Twitter. One caveat is that papers probably don’t have full explanations of the x-risk motivation or applications of the work, but that’s reading between the lines that AI safety people should be able to do themselves.
Agree with both aogara and Eli’s comment.
For me this reading between the lines is hard: I spent ~2 hours reading academic papers/websites yesterday, and while I could quite quickly summarize the work itself, it was quite hard for me to figure out the motivations.
There’s a lot of work that could be relevant for x-risk but is not motivated by it. Some of it is more relevant than work that is motivated by it. An important challenge for this community (to facilitate scaling of research funding, etc.) is to move away from evaluating work based on motivations, and towards evaluating work based on technical content.
See “The academic contribution to AI safety seems large” and its comments for some existing discussion related to this point.
PAIS #5 might be helpful here. It explains how a variety of empirical directions are related to x-risk and probably includes many of the ones that academics are working on.
Agreed it’s really difficult for a lot of the work. You’ve probably seen it already but Dan Hendrycks has done a lot of work explaining academic research areas in terms of x-risk (e.g. this and this paper). Jacob Steinhardt’s blog and field overview and Sam Bowman’s Twitter are also good for context.
I second this, that it’s difficult to summarize AI-safety-relevant academic work for LW audiences. I want to highlight the symmetric difficulty of trying to summarize the mountain of blog-post-style work on the AF for academics.
In short, both groups have steep reading/learning curves that are under-appreciated when you’re already familiar with it all.
It’s probably a bit frustrating to not have their work summarized, and then be asked to explain their own work, when all of their work is published already.
Fair, I see why this would be frustrating, and I apologize for any frustration caused. In an ideal world we would have read many of these papers and summarized them ourselves, but that would have taken a lot of time, and I think the post was valuable to get out ASAP.
ETA: Probably it would have been better to include more of a disclaimer on the “everyone” point from the get-go; I think not doing this was a mistake.
(Also, this is an incredibly helpful writeup and it’s only to be expected that some stuff would be missing. Thank you for sharing it!)
I don’t think the onus should be on the reader to infer x-risk motivations. In academic ML, it’s the author’s job to explain why the reader should care about the paper. I don’t see why this should be different in safety. If it’s hard to do that in the paper itself, you can always e.g. write a blog post explaining safety relevance (as mentioned by aogara, people are already doing this, which is great!).
There are often many different ways in which a paper might be intended to be useful for x-risk (and ways in which it might not be). Often the motivation for a paper (even in the groups mentioned above) may be some combination of it being an interesting ML problem, the interests of the particular student, and various possible thoughts around AI safety. It’s hard to disentangle this from the outside by reading between the lines.
On the other hand, there are a lot of reasons to believe the authors may be delusional about the promise of their research and its theory of impact. What I personally get most out of posts like this is having a third-party perspective that I can compare with my own.
It’s probably a bit frustrating to not have their work summarized, and then be asked to explain their own work, when all of their work is published already.
On the one hand, yeah, probably frustrating. On the other hand, that’s the norm in academia: people publish work and then nobody reads it.
Anecdotally, I’ve heard the same said of Less Wrong / Alignment Forum posts among AI safety / EA academics: that they amount to an echo chamber that no one else reads.
I suspect both communities are taking their collective lack of familiarity with the other as evidence that the other community isn’t doing its part to disseminate its ideas properly. Of course, neither community seems particularly interested in taking the time to read up on the other, and each seems to think that the other should simply mimic its example (LWers want more LW synopses of academic papers; academics want AF work to be published in journals).
Personally I think this is symptomatic of a larger camp-ish divide between the two, which is worth trying to bridge.
All of these academics are widely read and cited. Looking at their Google Scholar profiles, every one of them has more than 1,000 citations, and half have more than 10,000. Outside of LessWrong, lots of people in academia and industry labs already read and understand their work. We shouldn’t disparage people who are successfully bringing AI safety into the mainstream ML community.