But yes, the event organizers will be writing a paper about it and publishing the data (after it’s been anonymized).
I imagine this would primarily be a report from the competition? What I was thinking about was more how this sort of assessment should be done in general, what the similarities and differences are between this and cybersecurity, and how to squeeze more utility out of it.
For example, one (naive) low-hanging fruit is to withhold 10% of the obtained data from the AI companies, then test those jailbreak strategies later. This would give us some insight into whether the current “alignment” methods generalise, or whether we are closer to playing whack-a-mole, similarly to how we use test data in ML.
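To make the held-out idea concrete, here is a rough sketch of what I have in mind. Everything in it is hypothetical: the function names, the 10% default, and the `still_works` callback (which stands in for "run this jailbreak against the patched model and check whether it succeeds") are my own placeholders, not anything the organizers have described.

```python
import random


def split_holdout(attempts, holdout_fraction=0.1, seed=0):
    """Randomly split collected jailbreak attempts into (disclosed, held_out).

    Hypothetical helper: `attempts` is any sequence of recorded jailbreaks.
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(attempts)
    rng.shuffle(shuffled)
    n_holdout = round(len(shuffled) * holdout_fraction)
    return shuffled[n_holdout:], shuffled[:n_holdout]


def holdout_success_rate(held_out, still_works):
    """Fraction of held-out jailbreaks that still succeed after patching.

    `still_works` is an assumed callback: attempt -> bool.
    """
    if not held_out:
        return 0.0
    return sum(1 for attempt in held_out if still_works(attempt)) / len(held_out)
```

The disclosed part goes to the companies right away; the held-out part is only tried after they have patched. A high success rate on the held-out part would suggest whack-a-mole, a low one would suggest the fixes generalise, much like a train/test gap in ML.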
There are many more considerations, and many more things you can do. And I don’t claim to have all the answers, nor to be the optimal person to be writing about them. Just that it would be good if somebody was doing that (and wondering whether that is happening :-) ).