jacquesthibs comments on Preparing for AI-assisted alignment research: we need data!

jacquesthibs 17 Jan 2023 16:56 UTC
10 points
Heads up, we are starting to work on stuff like this in a Discord server (DM for link), and I’ll be working on it full-time from February to the end of April (if not longer). We’ve talked about data collection a bit over the past year, but have yet to take the time to do anything serious (besides the alignment text dataset). To make this work, we’ll have to make it insanely easy for the people generating the data; it’s just not going to happen by default. Some people might take the time to set this up for themselves, but very few will.
Glad to see others taking an interest in this idea! I think this kind of work has a very low barrier to entry for software engineers who want to contribute to alignment but would rather use their software engineering skills than try to become full-on researchers. It opens the door to engineering work that is useful for independent researchers, not just the orgs.
And as I said in the survey results post:
We are looking to build tools now rather than later because it allows us to learn what’s useful before we have access to even more powerful models. Once GPT-(N-1) arrives, we want to be able to use it to generate extremely high-quality alignment work right out of the gate. This work involves both augmenting alignment researchers and using AI to generate alignment research. Both of these approaches fall under the “accelerating alignment” umbrella.
Ideally, we want these kinds of tools to be used disproportionately for alignment work in the first six months of GPT-(N-1)’s release. We hope that the tools are useful before that time but, at the very least, we hope to have pre-existing code for interfaces, a data pipeline, and engineers already set to hit the ground running.