Given that he is going to be doing this at literal OpenAI, how confident are we that this is on net a good idea? I’m especially interested in Christiano’s opinion here, since he was Aaronson’s student and he himself worked at OpenAI before leaving.
Here’s a 1-year-old answer from Christiano to the question “Do you still think that people interested in alignment research should apply to work at OpenAI?”. He’s generally pretty positive about people going there to “apply best practices to align state of the art models”. That’s not exactly what Aaronson will be doing, but alignment theory seems even less likely than that kind of work to differentially accelerate capabilities.
He says he will be doing alignment work; the worst thing I can realistically see happening is that he gives OpenAI unwarranted confidence in how aligned their AIs are. Working at OpenAI isn’t intrinsically bad, publishing capabilities research is.
The field of alignment has historically been pretty divorced (IMO) from how the technology of machine learning actually works, so it would benefit the field to be closer to the ground reality. Also, any eventual solution to alignment is going to need to be integrated with capabilities when the time comes. (Again, IMO.)
However good an idea this is, it would be even better for Aaronson to just take a year off and do the work on his own time, collaborating and sharing whatever he deems appropriate with the broader community. That might be financially inconvenient, but it’s definitely something he could swing.