Well, that went quite well. I think there are two main differences I’d like to see. First, a shift in attention from ‘AGI when’ to more specific benchmarks/capabilities. Like: ‘ability to replace 90% of the work of an AI researcher, when?’ (Can you say ‘SWE-bench saturated’? Maybe only in conversation with Arvind.)
And second, trying to explicitly connect those benchmarks/capabilities directly to danger. Like, make the ol’ King Midas analogy, maybe? Or just argue that high capabilities → instability and risk inherently?