Terminology question—does adversarial filtering mean the same thing as decontamination?
In order to submit a question to the benchmark, people had to run it against the listed LLMs; the question would only advance to the next stage once the LLMs used for this testing got it wrong.
Terminology question—does adversarial filtering mean the same thing as decontamination?
In order to submit a question to the benchmark, people had to run it against the listed LLMs; the question would only advance to the next stage once the LLMs used for this testing got it wrong.