That’s a great question. I am not quite sure but WMDP-cyber does look relevant. If you are interested in working on a new benchmark for unlearning and secret collusion, do reach out to us!
That’s a great question. I am not quite sure but WMDP-cyber does look relevant. If you are interested in working on a new benchmark for unlearning and secret collusion, do reach out to us!