You note that the RSP says we will do a comprehensive assessment at least every 6 months—and then you say it would be better to do a comprehensive assessment at least every 6 months.
I thought the whole point of this update was to specify when you start your comprehensive evals, rather than when you complete them. The old RSP implied that evals must complete at most 3 months after the last evals were completed, which is awkward if you don’t know how long comprehensive evals will take, and is presumably what led to the 3-day violation in the most recent round of evals.
(I think this is very reasonable, but I do think it means you can’t quite say “we will do a comprehensive assessment at least every 6 months”.)
There’s also the point Zach makes below: “routinely” isn’t specified, which implies that the comprehensive evals may not even start by the 6-month mark. I assumed that was just an unfortunate side effect of how the section was written, and that the intention was that evals will start at the 6-month mark.
(I agree that the intention is surely no more than 6 months; I’m mostly annoyed for legibility—things like this make it harder for me to say “Anthropic has clearly committed to X” for lab-comparison purposes—and LeCun-test reasons)