But yes, the event organizers will be writing a paper about it and publishing the data (after it’s been anonymized).
I imagine this would primarily be a report on the competition? What I was thinking about was more how this sort of assessment should be done in general, what the similarities and differences are between this and red teaming in cybersecurity, and how to squeeze more utility out of it.
For example, a (naive version of) one low-hanging fruit: withhold 10% of the obtained data from the AI companies, then test those jailbreak strategies later. This would give us some insight into whether the current “alignment” methods generalise, or whether we are closer to playing whack-a-mole, similar to how we use held-out test data in ML.
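To make the idea concrete, here is a minimal sketch of that holdout evaluation, assuming the competition data is just a list of jailbreak prompts and that `is_jailbroken_by` is a stand-in for whatever evaluation harness one would actually use (both names are hypothetical):

```python
import random

def split_holdout(jailbreaks, holdout_frac=0.1, seed=0):
    """Randomly split collected jailbreak attempts into a set disclosed
    to the AI companies and a held-out set kept back for later testing."""
    rng = random.Random(seed)
    shuffled = list(jailbreaks)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - holdout_frac))
    return shuffled[:cut], shuffled[cut:]  # (disclosed, held_out)

def holdout_success_rate(held_out, is_jailbroken_by):
    """After the companies have patched against the disclosed set,
    re-run the held-out strategies. A high success rate suggests
    whack-a-mole patching; a low rate suggests the fixes generalise
    beyond the specific disclosed prompts."""
    if not held_out:
        return 0.0
    return sum(1 for attempt in held_out if is_jailbroken_by(attempt)) / len(held_out)
```

The naive part is treating jailbreaks as i.i.d. samples; in practice one would probably want to split by strategy or by author rather than by individual prompt, so that near-duplicates of disclosed attacks don't leak into the held-out set.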
There are many more considerations, and many more things you can do. I don’t claim to have all the answers, nor to be the optimal person to be writing about them. I just think it would be good if somebody were doing that (and I am wondering whether that is happening :-) ).
Red teaming has always been a legitimate academic thing? I don’t know what background you’re coming from but… you’re very far off.
Theoretical CS/AI/game theory, rather than cybersecurity. Given my lack of a cybersec background, I acknowledge I might be very far off.
To me, it seems that the perception inside cybersecurity might be different from the perception outside of it. Also, red teaming in the context of AI models might differ in important ways from the cybersecurity context. And red teaming by the public seems, to me, different from internal red teaming or bug bounties. (Though this might be one of the things where I am far off.)