I specifically avoided claiming that adversarial robustness is the best altruistic option for a particular person. Instead, I’d like to establish that progress on adversarial robustness would have significant benefits, and therefore should be included in the set of research directions that “count” as useful AI safety research.
Over the next few years, I expect AI safety funding and research will (and should) dramatically expand. Research directions that would not make the cut at a small organization with a dozen researchers should still be part of a field of 10,000 people working on AI safety later this decade. Currently, I'm concerned that the field focuses on a small handful of research directions (mainly mechinterp and scalable oversight), which will not be able to absorb such a large influx of interest. If we can lay the groundwork for many valuable research directions, we can multiply the impact of this large population of future researchers.
I don’t think adversarial robustness should be more than 5% or 10% of the research produced by AI safety-focused researchers today. But some research (e.g. 1, 2) from safety-minded folks seems very valuable for raising the number of people working on this problem and refocusing them on more useful subproblems. I think robustness should also be included in curricula that educate people about safety, and in research agendas for the field.
I agree with basically all of this, and I apologize for writing a comment that doesn’t directly respond to your post (though it is a relevant part of my views on the topic).
That’s cool, and I appreciate the prompt to discuss what is a relevant question.