Good question. I guess I’m at 30%, so 2x higher? Low confidence, though: I haven’t thought about it much, there’s a lot of uncertainty about what METR/ARC will classify as success, and I haven’t reread ARC/METR’s ARA eval to remind myself of how hard it is.