Adam Jermyn answers Has private AGI research made independent safety research ineffective already? What should we do about this?

Adam Jermyn 24 Jan 2023 9:54 UTC
3 points
0
I think there’s tons of low-hanging fruit in toy model interpretability, and I expect at least some lessons from at least some such projects to generalize. A lot of the questions I’m excited about in interpretability are fundamentally accessible in toy models, like “how do models trade off interference and representational capacity?”, “what priors do MLPs have over different hypotheses about the data distribution?”, etc.
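
To give a sense of how small a toy experiment on the interference question can be, here is a minimal sketch. It is an illustration under stated assumptions, not a method from any particular project: it uses random unit vectors as stand-ins for feature directions (a trained model would learn its directions instead), and the function names `random_unit_vector` and `mean_squared_interference` are hypothetical. It packs a fixed number of features into spaces of different dimension and measures how much the feature directions overlap on average.

```python
import math
import random


def random_unit_vector(dim, rng):
    """Draw a direction uniformly at random on the unit sphere in `dim` dimensions."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def mean_squared_interference(n_features, n_dims, seed=0):
    """Average squared dot product between distinct feature directions.

    Assumption: features are random unit vectors, not learned ones.
    A value of 0 would mean all features are mutually orthogonal.
    """
    rng = random.Random(seed)
    feats = [random_unit_vector(n_dims, rng) for _ in range(n_features)]
    total = 0.0
    pairs = 0
    for i in range(n_features):
        for j in range(i + 1, n_features):
            dot = sum(a * b for a, b in zip(feats[i], feats[j]))
            total += dot * dot
            pairs += 1
    return total / pairs


if __name__ == "__main__":
    # Hold the number of features fixed and shrink the space they live in.
    for n_dims in (64, 16, 4):
        score = mean_squared_interference(n_features=64, n_dims=n_dims)
        print(f"dims={n_dims:3d}  mean squared interference={score:.4f}")
```

With random directions, the measured overlap should rise as the same number of features is squeezed into fewer dimensions, which is the capacity side of the trade-off. A real toy-model study would replace the random vectors with weights from a small trained network and ask how training chooses which features to represent.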