I noticed that you began this post by saying:

Occasionally people say “hey, alignment research has lots of money behind it now, why not fund basically everyone who wants to try it?”.
But then the rest of the post did not directly address this question. You point out that most of “everyone who wants to try it” will start out with some (probably) flawed ideas, but that doesn’t seem like a good argument for not funding them. After all, you yourself experienced the growth you want others to go through while being funded. I’d expect that someone who can spend more of their time doing research (by focusing on it full-time) will reach intellectual maturity on the topic faster than someone who also has to make a living in another area.
Mostly what I’m arguing for here is a whole different model, where newcomers are funded with a goal of getting them through the Path (probably with resources designed for that purpose), rather than relying on Alignment Maturity coming about accidentally as a side-effect of research.
(Also, minor point, I think I was most of the way through the Path by the time I got my first grant, so I actually did go through that growth before I had funding. But I don’t think that’s particularly relevant here.)
At the moment there’s a plan to create The Berlin Hub as a coliving space for new AI safety researchers. What lessons do you think should be drawn from the thesis you laid out for that project? Do you believe that the peer review that happens through that environment will push people forward on the Path, or would you fear that a lot of people at the Hub would do work that doesn’t matter?
This is extremely difficult. There is some good literature on cooperative living worth reading, because there are countless common pitfalls. Being a research org at the same time is also quite ambitious. Good luck!
There are many practicalities: admitting good members, dealing with problematic members, keeping the kitchen sink clean, keeping the floors clean, keeping track of rent, doing repairs, etc. Some of this is alleviated if you have a big budget. Culture is extremely tricky. It is extremely rewarding when it works.
Visiting a coop for even a week reveals quite a bit about how it works, if you haven’t done that already.
The main immediate advice I’d give is to look at people switching projects/problems/ideas as a key metric. Obviously that’s not a super-robust proxy and will break down if people start optimizing for it directly. But insofar as changes in which projects people work on are driven by updates to their underlying models, it’s a pretty good metric of progress down the Path.
At this point, I still have a lot of uncertainty about things which will work well or not work well for accelerating people down the Path; it looks tractable, but that doesn’t mean that it’s clear yet what the best methods are. Trying things and seeing what causes people to update a lot seems like a generally good approach.
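For concreteness, here is a minimal sketch of what tracking that metric might look like, assuming (hypothetically) that the Hub keeps a simple chronological log of which project each person is working on. The names, log format, and `count_switches` function are all illustrative, not anything from the post:

```python
from collections import defaultdict

def count_switches(project_log):
    """Count how many times each person changed projects.

    project_log: list of (person, project) tuples in chronological order.
    Returns a dict mapping person -> number of project switches.
    """
    last_project = {}          # most recent project seen per person
    switches = defaultdict(int)
    for person, project in project_log:
        if person in last_project and last_project[person] != project:
            switches[person] += 1
        last_project[person] = project
    return dict(switches)

# Hypothetical log: alice switches once, bob never does.
log = [
    ("alice", "agent foundations"),
    ("bob", "interpretability"),
    ("alice", "agent foundations"),
    ("alice", "interpretability"),
    ("bob", "interpretability"),
]
print(count_switches(log))  # {'alice': 1}
```

As the comment above notes, raw switch counts are Goodhart-able; this kind of tally is only informative insofar as switches are driven by genuine updates to people’s underlying models.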
To clarify, basically anyone who actually wants to try to work on alignment full time, and who is at all promising and willing to learn, is already getting funded to upskill and investigate for at least a few months. The question here is “why not fund them to do X, if they suggest it,” and my answer is that if the only thing they are interested in is X, and X is one of the things John listed above, they aren’t going to get funded unless they have a really great argument. And most don’t. Either they take feedback and come up with another idea, I suggest they upskill and learn more, or they decide to do something else.
Do you happen to have additional references, beyond those keywords, for finding literature on cooperative living?
We had some books at a previous coop; it might have been these:
https://www.ic.org/community-bookstore/product/wisdom-of-communities-complete-set/
How do you define whether or not someone is “at all promising”?