I don’t think the evaluations we’re describing here are about measuring capabilities. They’re more about measuring whether our oversight (and other aspects) suffice for avoiding misalignment failures.
Measuring capabilities should be easy.
Yeah, I don’t know where my reading comprehension skills were that evening, but they weren’t with me :P
Oh well, I’ll just leave it as is as a monument to bad comments.