Documenting Journey Into AI Safety
Summary
Recently, I have noticed that many individuals have become discouraged by a seeming lack of progress in their attempts to break into a career in AI safety. It seems to me that a key bottleneck in the pipeline for beginning a career in AI safety is access to mentors.
One method that may help address this need for mentorship would be for an individual going through this process to document their path, with a focus on peer and mentor feedback.[1] Although this content would not be tailored to each specific listener, my intuition is that much of the feedback provided to one individual will still be valuable to others who have not yet been able to find mentors themselves. A series like this could then serve as a resource for newcomers, and could significantly increase the number of people who can get involved in the field.
If you have the time, I would greatly appreciate any feedback that you have regarding my thought process and this endeavor. Check out the Feedback Request section for some more information.
Feedback Request
I have prioritized getting this out quickly over making it my final plan, as I believe we can reach a much better result together than I could on my own. So, if something seems incredibly obvious and I didn't mention it or take it into account, please let me know! Here are some questions to consider as you read through this post, aimed at a few different demographics:
If you are new to AI safety, would a series like this be helpful? Is there anything you would like to make sure is covered early on (if possible)?
Field builders, am I making a field-building-faux-pas, or does something like this make sense to you? Is there any way I can improve it?
AI safety veterans, am I missing an obvious opportunity or angle? Is there anything that you think would be highly valuable to talk about in a series like this?
Brief Intro
My name is Jacob Haimes, and I live just outside of Boulder, Colorado, USA. I have an MS in computational modeling. In my free time I love playing all kinds of games, especially TTRPGs.
Examining the Situation
When I began thinking about making a career change into AI safety, I saw two options: up-skill and apply for positions, or apply for a PhD (with the intent to focus on AI safety, and enter the field once finished with the program). After discussing with a number of individuals, I decided on the former for the following reasons:
Consensus seemed to be that a PhD is most effective when the individual knows precisely what niche/problem they plan on studying—I did not.
The field of AI safety is currently relatively small, but is growing, meaning that work which is done now could have an increased impact (as it is likely to influence future work and/or practices).
Timelines for TAI/AGI may be short enough that we need as many individuals as possible working on AI safety as soon as possible.
In these same conversations, I would ask what resources and methods others had found to be the most helpful. The most common piece of advice went something like this: “Courses are helpful to some extent, but I found actually contributing towards research and/or working on projects to be much more effective.” When asking about where to go to contribute to these kinds of efforts I was provided some Discord servers, opportunities boards, research groups, grants, organizations, and even some start-ups and companies that would be more for career capital than for the actual value add to AI safety that I would be generating.
Unfortunately, a majority of the job postings aren't looking for newcomers; instead, they include statements such as "must have 5+ years of work in a relevant field" or "Master's required, PhD preferred." To limit the pool even further, a very significant portion of the beginner-accepting positions require relocation to Washington, D.C., the San Francisco Bay Area, or London, and all are extremely competitive. It seems that the frequently parroted idea that "we need more people interested in AI safety" isn't wholly accurate; a more authentic version would be "we need more sufficiently experienced and credentialed people interested in AI safety."[2]
The Inconsistency
As far as I can tell, my story up to this point is a relatively common one, meaning that we have found a routinely occurring inconsistency: as people begin thinking about entering the AI safety field, they are discouraged from pursuing a PhD, but once they know enough to theoretically be able to contribute, there are very few opportunities for individuals with their level of experience.
The value of a PhD, in addition to a relatively stable salary for 3-5 years, is getting experience doing research (in your field of study) under the guidance and oversight of someone who has significantly more knowledge and practice. Theoretically, at least, this kind of relationship is not monopolized by higher education; instead, mentorship is constrained by the availability of mentors. Since the field is rapidly growing (nice work, field-builders), there is no way that our current AI safety experts have the capacity to mentor all newcomers, continue pushing forward their own work, and maintain a healthy work-life balance.
Something That May Help
To resolve this, we need a way to amplify the mentorship that is getting done. I think I might be able to assist in this endeavor by recording/documenting my transition to a career in AI safety, including conversations that I have with peers/mentors, my progress through courses I am able to get into (with the facilitator's permission), and how my perspectives change as I learn more. This won't be as good as talking directly with one's own mentor, but it should provide an additional resource for individuals who are finding it difficult to know what to do next, and how to do it.
Regarding Conflicts of Interest
As I was writing this out, I realized that this post could be perceived as primarily self-serving. This perspective would consider me as an individual who hadn’t yet found mentors, and was trying to secure the best ones under the guise of helping others. Perhaps you weren’t thinking this, but if I noticed it myself, I think it is probably worth addressing.
I am committed to working in the AI safety space, and I currently have a stable part-time job. Because of this, I am confident that I will, at some point, find fantastic mentors. I do not want to poach mentors from others, so I have come to a two-part solution:
I commit to mentoring others once I have the skills and experience to do so.
The proposed series should not be taken into account when weighing whether or not you would reach out to me and/or offer mentorship.
If you have the availability to be a mentor, please do so.
Series Structure
After receiving the resources that I mentioned earlier, I put myself out there on Discord, messaging individuals and channels about opportunities I could contribute to, and I applied for all of the other opportunities that were accessible to me. Even so, I found no traction, so I began trying to make my own opportunities.
Through networking with others in my AI Governance cohort, I was eventually able to connect with another researcher who was willing to collaborate on mechanistic interpretability research, and I began working with him. I have also been keeping a close eye on the forums and job boards for new opportunities, which led me to Linda Linsefors's post about applying to be a research lead for the upcoming AI Safety Camp. Although at first I was not sure whether I would meet the requirements for such a position, I reached out and verified with her that I could be a good fit. Since then, I have gone through multiple iterations of my formal research proposal (effectively the application for AISC), and in doing just that I have learned a lot.
With this in mind, the series would most likely be primarily a podcast, with the intent to produce additional content (videos, documents) when a situation is particularly conducive to it. The first episode would be a summary of my path thus far, including the resources and experiences that have been most valuable. After this (with the possibility of revisiting anything that others are particularly interested in), the podcast would turn into an audio journal about the two projects I am currently working on (the AISC research plan and the mechanistic interpretability research), as well as any that I pick up over time. I have been having multiple meetings a week with connections I have made throughout the courses, and I would (with permission) record the audio of those sessions as well. Afterwards, I will edit these down to focus on the most important parts, which should help hold listener attention. In addition, I would be happy to create posts with my thoughts on why and how specific resources were helpful to me.
Moving forward, I will continue to apply to programs/courses (e.g., SERI-MATS), with the intent of sharing as much content as is safely possible (I could see there being some edge cases where research requires more safety measures, although I doubt I will be in a position to contribute to those kinds of projects for some time).
Acknowledgements
Special thanks to Linda Linsefors, Peter Gebauer, and Chris Lonsberry for their feedback on this post and our discussions. I also would not have made this post without the help of AI Safety Quest's navigation call, 80,000 Hours' 1-1 advice, or BlueDot Impact's AI courses.
I would also like to thank those of you who took the time to read through this post and provide feedback.
Interesting idea. Looking forward to seeing how this goes!
The timing of this post is quite serendipitous for me. Much of what you wrote resonates heavily. First comment on LW, by the way!
I’m deeply interested in the technical problems of alignment and have recently read through the AI Safety Fundamentals Course. I’m looking for any opportunity to discuss these ideas with others. I’ve been adjacent to the rationalist community for a few years (a few friends, EA, ACX, rationalism, etc.), but the need to sanity check my own thoughts on alignment has made engaging with the LW community seem invaluable.
I have found that the barrier to entry seems high from a career perspective. Despite how I'd like to spend my time, my day job limits the number of focused hours I can commit to upskilling in this domain, and a community of people in similar positions would be invaluable. I'm more than willing to self-study and do independent research, but I'm eager for some guidance so I can set goals appropriately.
Provided the members have publicly shared the link, would you mind passing along any resources you find where groups may be working on particular problems?
Glad to hear that my post is resonating with some people!
I definitely understand the difficulty regarding time allocation when also working a full-time job. As I gather resources and connections, I will be sure to spread awareness of them.
One thing to note, though, is that I found the more passive approach of waiting until an opportunity appeared to be much less effective than forging opportunities myself (even though I was spending a significant amount of time looking for those opportunities).
A specific and more detailed recommendation for how to do this will depend heavily on your level of experience with ML and your time availability. My more general recommendation would be to apply to join a cohort of BlueDot Impact's AI Governance or AI Safety Fundamentals courses (I believe the application for the early 2024 session of the AI Safety Fundamentals course is currently open). Taking a course like this provides opportunities to make connections, which can be leveraged into independent projects/efforts. I found the AI Governance session very doable alongside a full-time position (when I started it, I was still full-time at my current job). Although I cannot definitively say the same for the AI Safety Fundamentals course, as I did not complete it through a formal session (and instead just did the readings independently), it seems to be a similar time commitment. I think that taking the course with a cohort would definitely be valuable, even for those who have completed the readings independently.
Thanks so much for the thoughtful response. I’ll certainly reach out and try to participate in BlueDot Impact’s course now that I’m more familiar with the content, and will stay on the lookout for anything you document as you go through your own journey! Even just a few of the names and resources so far have been incredibly valuable pointers to the right corners of the internet.
I don’t have karma yet, but if I did, I’d gladly open my wallet :)