Do you still think that people interested in alignment research should apply to work at OpenAI?
I think alignment is a lot better off if there are strong teams trying to apply best practices to align state-of-the-art models, who have been learning about what it actually takes to do that in practice and building social capital. Basically that seems good because (i) I think there’s a reasonable chance that we fail not because alignment is super-hard but because we just don’t do a very good job during crunch time, and I think such teams are the best intervention for doing a better job, and (ii) even if alignment is very hard and we need big new ideas, I think that such teams will be important for empirically characterizing and ultimately adopting those big new ideas. It’s also an unusually unambiguous good thing.
I spent a lot of time at OpenAI largely because I wanted to help get that kind of alignment effort going. For some color see this post; that team still exists (under Jan Leike) and there are now some other similar efforts at the organization.
I’m not as in the loop as I was a few months ago, so you might want to defer to folks at OpenAI, but from the outside I still tentatively feel pretty enthusiastic about this kind of work happening at OpenAI. If you’re excited about this kind of work then OpenAI still seems like a good place to go to me. (It also seems reasonable to think about DeepMind and Google, and of course I’m a fan of ARC for people who are a good fit, and I suspect that there will be more groups doing good applied alignment work in the future.)