Bogdan Ionut Cirstea comments on Bogdan Ionut Cirstea’s Shortform

Bogdan Ionut Cirstea 26 Sep 2024 14:08 UTC
It might be interesting to develop, or put out RFPs for, benchmarks and datasets for unlearning ML/AI knowledge (and maybe also ARA-relevant knowledge), analogous to WMDP for CBRN. This could be somewhat useful, for example, if we want to use powerful (e.g. human-level) AIs for cybersecurity but don’t fully trust them.