This could be facilitated by employing AI systems in red-team roles: systems that plan malign behaviors as hypothetical challenges to detection and defense strategies. In this way, worst-case misaligned plans can contribute to achieving aligned outcomes.
I doubt we should even try this. Our experience so far with gain-of-function research suggests it has been a net negative rather than a net positive.