I mean, it’s unrealistic: the cells were “limited to English-language sources, were prohibited from accessing the dark web, and could not leverage print materials (!!)”, which rules out textbooks. And if LLMs are trained on textbooks (which, let’s be honest, they are, even though everyone hides their data sources), then teams with access to an LLM have a nice proxy for a textbook, and the other teams don’t.
It’s more of a gesture at the kind of thing you’d want to do, I guess, but I don’t think it’s the kind of thing it would make sense to trust. The blinding was also really unclear to me.
Jason Matheny, by the way, the president of RAND, the organization that ran the study, is on Anthropic’s “Long Term Benefit Trust.” I don’t know how much that should matter for your evaluation, but my bet is a non-zero amount. If you think there’s an EA blob that funded all of the above, well, he’s part of it. OpenPhil also funded RAND to the tune of $15 million.
You may think it’s totally unfair to mention that; you may think it’s super important to mention that; either way, there’s the information; do what you will with it.
They do mention a justification for the restrictions – “to maintain consistency across cells”. One needn’t agree with the approach, but it seems at least to be within the realm of reasonable tradeoffs.
Nowadays, of course, textbooks are generally available online as well. They don’t indicate whether paid materials are within scope, but that would be a question for paper textbooks too.
What I like about this study is that the teams are investing a relatively large amount of effort (“Each team was given a limit of seven calendar weeks and no more than 80 hours of red-teaming effort per member”), which seems much more realistic than brief attempts to get an LLM to answer a specific question. And of course they’re comparing against a baseline of folks who still have Internet access.