Corrupting influences
The EA AI safety strategy has had a large focus on placing EA-aligned people in A(G)I labs. The thinking was that having enough aligned insiders would make a difference on crucial deployment decisions & longer-term alignment strategy. We could say that the strategy is an attempt to corrupt the goal of pure capability advance & making money towards the goal of alignment. This fits into a larger theme that EA needs to get close to power to have real influence.
[See also the large donations EA has made to OpenAI & Anthropic. ]
Whether this strategy paid off… too early to tell.
What has become apparent is that the large AI labs & being close to power have had a strong corrupting influence on EA epistemics and culture.
Many people in EA now think nothing of being paid Bay Area programmer salaries for research or nonprofit jobs.
There has been a huge influx of MBA blabber being thrown around. Bizarrely, EA funds are often giving huge grants to for-profit organizations for which it is very unclear whether they’re really EA-aligned in the long term or just paying lip service. It is highly questionable that EA should be trying to do venture capitalism in the first place.
There is a questionable trend to equate prestige within capabilities work with the ability to do alignment work.
EDIT: I haven’t looked at it deeply yet, but I am superficially impressed by CAIS’s recent work. It seems like an eminently reasonable approach. Hendrycks’s deep expertise in capabilities work and scientific track record seem to have been key. In general, EA-adjacent AI safety work has suffered from youth, inexpertise & amateurism, so it makes sense to have more world-class expertise.
EDITEDIT: I should be careful in promoting work I haven’t looked at. I have been told by a source I trust that almost nothing is new in this paper and that Hendrycks engages in a lot of very questionable self-promotion tactics.
For various political reasons there has been an attempt to put x-risk AI safety on a continuum with more mundane AI concerns like it saying bad words. This means there is lots of ‘alignment research’ that is at best irrelevant, at worst a form of insidious safetywashing.
The influx of money and professionalization has not been entirely bad. Early EA suffered much more from virtue signalling spirals and analysis paralysis. Current EA is much more professional, largely for the better.
As a supervisor of numerous MSc and PhD students in mathematics, I find that when someone finishes a math degree and considers a job, the tradeoffs are usually between meaning, income, freedom, evil, etc., with some of the obvious choices being high/low along (relatively?) obvious axes. It’s extremely striking to see young talented people with math or physics (or CS) backgrounds going into technical AI alignment roles in big labs, apparently maximising along many (or all) of these axes!
Especially in light of recent events I suspect that this phenomenon, which appears too good to be true, actually is.
Yes!
I’m not too concerned about this. ML skills are not sufficient to do good alignment work, but they seem to be very important for like 80% of alignment work and make a big difference in the impact of research (although I’d guess still a smaller difference than whether the application to alignment is good).
Primary criticisms of Redwood involve their lack of experience in ML.
The explosion of research in the last ~year is partially due to an increase in the number of people in the community who work with ML. Maybe you would argue that lots of current research is useless, but it seems a lot better than only having MIRI around.
The field of machine learning at large is in many cases solving easier versions of problems we have in alignment, and therefore it makes a ton of sense to have ML research experience in those areas. E.g. safe RL is how to get safe policies when you can optimize over policies and know which states/actions are safe; alignment can be stated as a harder version of this where we also need to deal with value specification, self-modification, instrumental convergence etc.
I mostly agree with this.
I should have said ‘prestige within capabilities research’ rather than ML skills, which seem straightforwardly useful. The former seems highly corrupting.
I’d argue this is good, primarily because I think EA’s AI safety wing was already in danger of becoming unmoored from reality by ignoring key constraints. This is similar to how early LessWrong work, before the deep learning era of around 2012-2018, turned out to be mostly useless because so much of it was stated in a mathematical way without realizing how many constraints and conjectured constraints applied to things like formal provability.