I don’t see this distribution of research projects as “Goodharting” or “overfocusing” on projects with clear feedback loops. As MATS is principally a program for prosaic AI alignment at the moment, most research conducted within the program should be within this paradigm. We believe projects that frequently “touch reality” often offer the highest expected value for reducing AI catastrophic risk. We principally support non-prosaic, “speculative,” and emerging research agendas for their “exploration value,” which might aid potential paradigm shifts, and to round out our portfolio (i.e., “hedge our bets”).
However, even with the focus on prosaic AI alignment research agendas, our Summer 2023 Program supported many emerging or neglected research agendas, including projects in agent foundations, simulator theory, cooperative/multipolar AI (including s-risks), the nascent “activation engineering” approach our program helped pioneer, and the emerging “cyborgism” research agenda.
Additionally, our mentor portfolio is somewhat conditioned on the preferences of our funders. While we largely endorse our funders’ priorities, we are seeking additional funding diversification so that we can support further speculative “research bets.” If you are aware of large funders willing to support our program, please let me know!