notes (from a very jr researcher) on alignment training pipeline
Training for alignment research is one part competence (at math, CS, philosophy) and one part having an inside view / gears-level model of the actual problem. Competence can be outsourced to universities and independent study, but an inside view / gears-level model of the actual problem requires community support.
A background assumption I’m working with is that training as a longtermist is not always synchronized with legible-to-academia training. It might be the case that jr researchers ought to publication-maximize for a period of time even at the expense of their training. This does not mean that training as a longtermist is always or even often orthogonal to legible-to-academia training; the two can be highly synchronized, but it depends on the occasion.
It’s common to ask what ratio should be assigned to competence building (textbooks, exercises) vs. understanding the literature (reading papers and the Alignment Forum), but perhaps there is a third category: honing your threat model and theory of change.
I spoke with a sr researcher recently who said, roughly, that a threat model plus a theory of change is almost sufficient for an inside view / gears-level model. I’m working from the theory that a honed threat model and theory of change are important for calculating which interventions to pursue. See Alice and Bob in Rohin’s FAQ.
I’ve been doing weekly exercises with a group of peers to hone my inside view / gears-level model of the actual problem. But the sr researcher I spoke to said that mentorship trees of 1:1 time, not exercises that jrs can just do independently or in groups, are the only way it can happen. This is troublesome to me, because the bottleneck becomes mentors’ time. I’m not so much worried about the (hopefully merit-based) process of mentors figuring out who’s worth their time as I am about the overall throughput. It gets worse, though: what if the process is credentialist?
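To make the throughput worry concrete, here’s a toy model. This is a minimal sketch; the mentees-per-mentor and years-to-train parameters and the starting numbers are my own illustrative assumptions, not anything the sr researcher said.

```python
# Toy model of mentorship-tree throughput (illustrative assumptions only):
# each mentor can supervise a fixed number of mentees at once, and a mentee
# needs a fixed number of years of 1:1 time before becoming a mentor.

MENTEES_PER_MENTOR = 3   # assumed concurrent mentee slots per mentor
YEARS_TO_TRAIN = 3       # assumed years of 1:1 time before a mentee graduates

def simulate(initial_mentors: int, years: int) -> list[int]:
    """Return the total mentor count at the end of each year."""
    mentors = initial_mentors
    # pipeline[i] = number of mentees with i full years of training so far
    pipeline = [0] * YEARS_TO_TRAIN
    history = []
    for _ in range(years):
        graduates = pipeline[-1]          # finished mentees become mentors
        pipeline = [0] + pipeline[:-1]    # everyone else advances one year
        mentors += graduates
        # fill open mentee slots, assuming an unlimited pool of applicants
        slots = mentors * MENTEES_PER_MENTOR - sum(pipeline)
        pipeline[0] = max(slots, 0)
        history.append(mentors)
    return history

print(simulate(initial_mentors=10, years=10))
# -> [10, 10, 10, 40, 40, 40, 160, 160, 160, 640]
```

In this model the field grows exponentially, but the rate is set entirely by mentor count and training time, no matter how many capable applicants there are, which is exactly the bottleneck I’m worried about.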
Take a look at the Critch quote from the top of Rohin’s FAQ:
I get a lot of emails from folks with strong math backgrounds (mostly, PhD students in math at top schools) who are looking to transition to working on AI alignment / AI x-risk.
Is he implicitly saying that he offloads some of the filtering work to admissions people at top schools? Presumably people from non-top schools are also emailing him, but he doesn’t mention them.
I’d like to see an argument that admissions committees at top schools are trustworthy filters; no one has made one to my knowledge. Unless there is some intrinsic benefit to “top schools” (besides building social power/capital) that everyone is aware of, I think the movement sometimes falls back on status games. (If someone’s argument is that they identified a lever that requires a lot of social power/capital, then they can maybe put that top school on their resume to use; but if the lever is strictly high-quality useful research (instead of, say, steering a federal government), this doesn’t seem to apply.)
Is he implicitly saying that he offloads some of the filtering work to admissions people at top schools?
I don’t think Critch is saying that the best way to get his attention is through cold emails backed up by credentials. The whole post is about him not using that as a filter to decide who’s worth his time, and about how people should instead produce good technical writing to get attention.
Critch has written somewhere that if you can get into UC Berkeley, he’ll automatically allow you to become his student, because getting into UC Berkeley is a good enough filter.
Where did he say that? Given that he works at UC Berkeley, I would expect him to treat UC Berkeley students preferentially for reasons that aren’t just about UC Berkeley being able to filter.
It’s natural that you can sign up for one of the classes he teaches at UC Berkeley by being a UC Berkeley student.
Getting into MIT might be just as hard as getting into UC Berkeley, but it doesn’t give you the same access to courses taught at UC Berkeley by its faculty.
http://acritch.com/ai-berkeley/
If you get into one of the following programs at Berkeley:
a PhD program in computer science, mathematics, logic, or statistics, or
a postdoc specializing in cognitive science, cybersecurity, economics, evolutionary biology, mechanism design, neuroscience, or moral philosophy,
… then I will personally help you find an advisor who is supportive of you researching AI alignment, and introduce you to other researchers in Berkeley with related interests.
and also
While my time is fairly limited, I care a lot about this field, and you getting into Berkeley is a reasonable filter for taking time away from my own research to help you kickstart yours.
Okay, he does speak about using Berkeley as a filter, but he doesn’t speak about taking people on as his students. It seems to be about helping people at UC Berkeley connect with other people at UC Berkeley.