Resources I send to AI researchers about AI safety
If you’re interested in seeing my up-to-date recommendations, please see the Arkose Resource Center!
To reduce the number of pages where I keep updated recommendations, I’m now retiring this post. However, you’re welcome to look at the 2023 web archive version.
I notice that Eliezer and MIRI are missing. Why is this? Low prestige amongst the academic community? Harsh writing style?
I don’t mean to open a can of worms or anything. It just seems worth engaging with reality and not shying away from it.
A great point, thanks! I’ve just edited the “There’s also a growing community working on AI alignment” section to include MIRI, and also edited some of the academics’ names and links.
I don’t think it makes sense for me to list Eliezer’s name in the part of that section where I’m listing names, since I’m only listing some subset of academics who (vaguely gesturing at a cluster) are sort of actively publishing in academia, mostly tenure track and actively recruiting students, and interested in academic field-building. I’m not currently listing names of researchers in industry or non-profits (e.g. I don’t list Paul Christiano, or Chris Olah), though that might be a thing to do.
Note that I didn’t choose this list of names very carefully, so I’m happy to take suggestions! This doc came about because I had an email draft that I was haphazardly adding things to as I talked to researchers and needed to promptly send them resources, getting gradually refined when I spotted issues. I thus consider it a work-in-progress and appreciate suggestions.
With respect to the fact that I don’t immediately point people at LessWrong or the Alignment Forum (I actually only very rarely include the “Rationalist” section in the email—not unless I’ve decided to bring it up in person, and they’ve reacted positively), there are different philosophies on AI alignment field-building. One of the active disagreements right now is how much we want new people coming into AI alignment to be the type of person who enjoys LessWrong, versus whether it’s good to be targeting a broader audience.
I’m personally currently of the opinion that we should be targeting a broader audience, where there’s a place for people who want to work in academia or industry separate from the main Rationalist sphere, and the people who are drawn towards the Rationalists will find their way there either on their own (I find people tend to do this pretty easily when they start Googling), or with my nudging if they seem to be that kind of person.
I don’t think this is much “shying away from reality”—it feels more like engaging with it, trying to figure out if and how we want AI alignment research to grow, and how to best make that happen given the different types of people with different motivations involved.
Is the implication that, in order to target a broader audience, you think it would be wise to avoid mentions of LessWrong? Is that because you fear such mentions would turn them off?
If so, that seems like an important thing to take note of. Such a perception seems like a bad thing that we should try to fix. On the other hand, it is also possible that it is a net positive because it keeps the community from being “diluted”.
I didn’t mean to imply that you personally were. What I meant when I used that phrase is that this feels like a touchy subject that I myself wanted to flinch away from, but I don’t actually think I should flinch away from.
There’s a mention of the rationalist community.
True, but despite that fact, it still feels like Eliezer and MIRI are purposefully left out.
How it feels depends on how much prominence you want them to have.
Don’t sleep on this stuff Vael Gates keeps putting out. They’re doing the lord’s work.
Love this! Added it to our list of AI safety curricula, reading lists, and courses.
Thanks for sharing this.
Thanks for doing that Kat!
Amazing! Would you be happy for some of the content here to be used as a basis for Stampy answers?
Sure! This isn’t novel content; the vast majority of it is drawn from existing lists, so it’s not even particularly mine. I think just make sure the things within are referenced correctly, and you should be good to go!