Terminology question—does adversarial filtering mean the same thing as decontamination?
In order to submit a question to the benchmark, people had to run it against the listed LLMs; the question would only advance to the next stage once the LLMs used for this testing got it wrong.
Current theme: default
Less Wrong (text)
Less Wrong (link)
Terminology question—does adversarial filtering mean the same thing as decontamination?
In order to submit a question to the benchmark, people had to run it against the listed LLMs; the question would only advance to the next stage once the LLMs used for this testing got it wrong.