(I can’t speak to any details of Copilot or Codex and don’t know much about computer security; this is me speaking as an outside alignment researcher.)
A first pass at improving the situation would be to fine-tune the model for quality, with particular attention to security (both by using higher-quality demonstrations and, eventually, by RL fine-tuning). This is an interesting domain from an alignment perspective because (i) I think it’s reasonable to aim at narrowly superhuman performance even with existing models, and (ii) security is a domain where you care a lot about rare failures and could aim for error rates close to 0 for many kinds of severe failures.
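To make that concrete, here is a minimal sketch of what I have in mind, entirely my own illustration rather than anything from Copilot or Codex: `security_audit`, `quality_score`, and `severe_failure` are hypothetical stand-ins for whatever evaluators you actually have. The first function filters demonstrations before supervised fine-tuning; the second is an RL reward shaped so that severe security failures aren't traded off against ordinary quality.

```python
# Hypothetical sketch; security_audit, quality_score, and severe_failure are
# placeholder evaluators invented for illustration, not real Copilot/Codex APIs.
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Demonstration:
    prompt: str
    completion: str


def filter_demonstrations(
    demos: List[Demonstration],
    security_audit: Callable[[str], bool],   # True if the code passes audit
    quality_score: Callable[[str], float],   # higher is better
    min_quality: float = 0.8,
) -> List[Demonstration]:
    """Keep only demonstrations that pass a security audit and clear a
    quality bar, so supervised fine-tuning imitates the best examples."""
    return [
        d for d in demos
        if security_audit(d.completion) and quality_score(d.completion) >= min_quality
    ]


def rl_reward(
    completion: str,
    severe_failure: Callable[[str], bool],   # detects a severe security bug
    quality_score: Callable[[str], float],
    severe_penalty: float = 1_000.0,
) -> float:
    """Asymmetric reward: ordinary quality trades off smoothly, but a severe
    security failure dominates everything else, so the optimal policy drives
    the rate of such failures toward zero instead of averaging it away."""
    if severe_failure(completion):
        return -severe_penalty
    return quality_score(completion)
```

The point of the asymmetry in `rl_reward` is exactly claim (ii): a policy that accepts a nonzero rate of severe failures in exchange for better typical-case quality should score worse than one that pushes that rate to zero.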