On 3.

You say:

> Another problem is whether the “winning” approach might come from deeper searching along the existing paths, rather than broader searching in weirder areas. In that case, it could maybe still make sense to proliferate sub-approaches under the existing paths. The rest of the points (especially (4) below) would still apply, and this still relies on the existing paths being… broken enough to call “doom”, but not broken enough to try anything too different. This is possible.

This seems pretty plausible to me, and it makes me think that people probably shouldn’t be too worried about pursuing “diverse” approaches just for the sake of trying them.

In an AXRP interview, John Wentworth suggests that convergence in research directions may generally be viewed as a positive sign:

> You have to play down through a few layers of the game tree before you start to realize what the main bottlenecks to solving the problem are.
>
> [...]
>
> And I do think the longer people are in the field, the more they tend to converge to a similar view. So, as an example of that, right now, myself, Scott Garrabrant, and Paul Christiano are all working on basically the same problem. We’re all basically working on: what is abstraction, and where does the human ontology come from? That sort of thing. And that was very much a case of convergent evolution. We all came from extremely different directions.

If a bunch of researchers have ended up on somewhat similar research agendas because they each find those approaches the most promising, I think I feel better about them all sticking with their similar approaches than I would about them going for more diverse approaches simply to “change things up” or “diversify our bets.”

On 4.

What are your thoughts on programs like AGISF and SERI MATS, which allow people to learn about alignment research and try out their fit for it in a more structured environment? Do you think people should generally be scaling programs like this up further, or trying something pretty different?

Also, you say:

> By the time you get good enough to get a grant, you have to have spent a lot of time studying this stuff.

My impression is that many funders may be somewhat willing to give grants (especially relatively small ones) to people who haven’t spent a ton of time learning about alignment already and who have relatively little in the way of existing “accomplishments,” so that they can try their hand at alignment work. Have you personally gotten to apply for funding to work on alignment full-time yet?
I’ve heard of (and worked through some of) AGISF, but I haven’t heard of SERI MATS. Scaling these programs up would likely work well.