Jan Betley comments on Me, Myself, and AI: the Situational Awareness Dataset (SAD) for LLMs

Jan Betley 18 Jul 2024 12:00 UTC
3 points
0
This is interesting! I think it would be pretty hard to use in a benchmark, though (because you need a set of problems assigned to clearly defined types and I’m not aware of any such dataset).

There are some papers on “do models know what they know”, e.g. https://arxiv.org/abs/2401.13275 or https://arxiv.org/pdf/2401.17882.
- Martín Soto 18 Jul 2024 19:24 UTC
  2 points
  0
  “you need a set of problems assigned to clearly defined types and I’m not aware of any such dataset”
  Hm, I was thinking of something as easy to categorize as “multiplying numbers of n digits”, or “the different levels of MMLU” (although again, they already know about MMLU), or “independently do X online (for example create an account somewhere)”, or even some of the tasks from your paper.
  I guess I was thinking less about “what facts they know”, which is pure memorization (although this is also interesting), and more about “cognitively hard tasks” that require some computational steps.
  - Owain_Evans 19 Jul 2024 6:20 UTC
    2 points
    0
    You want to make it clear to the LLM what the task is (multiplying n-digit numbers is clear but “doing hard math questions” is vague) and also have some variety of difficulty levels (within LLMs and between LLMs) and a high ceiling. I think this would take some iteration at least.
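
To make the "multiplying numbers of n digits" idea from this thread concrete, here is a minimal Python sketch of a task family in which the type is unambiguous and the difficulty level is simply the digit count. It is not taken from the SAD paper or from any commenter; the function names, the dictionary fields, and the prompt wording are all assumptions made for illustration.

```python
import random


def make_multiplication_task(n_digits: int, rng: random.Random) -> dict:
    """Build one 'multiply two n-digit numbers' problem (hypothetical format)."""
    if n_digits < 1:
        raise ValueError("n_digits must be at least 1")
    low = 10 ** (n_digits - 1)   # smallest number with n_digits digits
    high = 10 ** n_digits - 1    # largest number with n_digits digits
    a = rng.randint(low, high)
    b = rng.randint(low, high)
    return {
        "task_type": f"multiply_{n_digits}_digit",  # clearly defined type
        "difficulty": n_digits,                     # difficulty level = digit count
        "prompt": f"What is {a} * {b}?",
        "answer": str(a * b),
    }


def make_task_set(max_digits: int, per_level: int, seed: int = 0) -> list[dict]:
    """Generate per_level problems for each difficulty from 1 to max_digits."""
    rng = random.Random(seed)  # fixed seed so the set is reproducible
    return [
        make_multiplication_task(n, rng)
        for n in range(1, max_digits + 1)
        for _ in range(per_level)
    ]
```

Raising `max_digits` is one simple way to keep the ceiling high, since the same generator extends to arbitrarily large operands. Whether a model can predict its own accuracy at each level would still need to be measured separately.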