There’s a sense in which specific assumptions shaped the selection of advisors listed for the Astra Fellowship. I may be wrong, but it seems to me that the majority of researchers listed work on interpretability or evals-and-demonstrations, or have models of the alignment problem (or research taste and agendas) that are strongly Paul-Christiano-like.
I assume Chipmonk was gesturing at the absence of advisors whose work isn’t downstream of Paul Christiano’s models, research agenda, and mentorship. Agent foundations (John Wentworth, Scott Garrabrant, Abram Demski) and formal world-models (Davidad) are two examples that come to mind.
Note that I don’t entirely share this belief (I notice there are advisors who seem interested in s-risk-focused research), but I get the sentiment. Also, as far as I can tell, there are very few researchers like the ones I listed, and they may not be in a position to advise for this program.
Yes this. And more agent foundations, especially. Thanks mesa