Ansh Radhakrishnan comments on Evaluations (of new AI Safety researchers) can be noisy

Ansh Radhakrishnan 5 Feb 2023 16:30 UTC
LW: 8 AF: 5
Thanks for this post, Lawrence! I agree with it substantially, perhaps entirely.

One other thing that I think interacts with the difficulty of evaluation is the fact that many AI safety researchers think that most of the work done by some other researchers is approximately useless, or even net-negative in terms of reducing existential risk. I think it’s pretty easy to conflate an evaluation of a research direction or agenda with an evaluation of a particular researcher. I think this is actually pretty justified for more senior researchers, since presumably an important skill is “research taste”, but I think it’s also important to acknowledge that this is pretty subjective and that there’s substantial disagreement about the utility of different research directions among senior safety researchers. It seems probably good to try to disentangle the two when evaluating junior researchers, as much as possible, and instead focus on “core competencies” that are likely to be valuable across a wide range of safety research directions, though even then evaluating these can be difficult and noisy, as the OP argues.