As outlined in the Gemini 1.0 Technical Report (Gemini-Team et al., 2023), we began working with a small set of independent external groups to help identify areas for improvement in our model safety work by undertaking structured evaluations, qualitative probing, and unstructured red teaming.
For Gemini 1.5 Pro, our external testing groups were given black-box testing access to a February 2024 Gemini 1.5 Pro API model checkpoint for a number of weeks. They had access to a chat interface and a programmatic API, and had the ability to turn down or turn off safety filters. Groups selected for participation regularly checked in with internal teams to present their work and receive feedback on future directions for evaluations.
These groups were selected based on their expertise across a range of domain areas, such as societal risks, cyber risks, and chemical, biological, radiological, and nuclear (CBRN) risks, and included academia, civil society, and commercial organizations. The groups testing the February 2024 Gemini 1.5 Pro API model checkpoint were compensated for their time.
External groups designed their own methodology to test topics within a particular domain area. The time dedicated to testing also varied per group: some groups worked full-time on executing testing processes, while others dedicated one to three days per week. Some groups pursued manual red teaming and reported qualitative findings from their exploration of model behavior, while others developed bespoke automatic testing strategies and produced quantitative reports of their results.
[The report goes on to discuss some results from external testing.]
Yay DeepMind! I apologize for missing this. I will edit the post.
Longer quote for reference: