No idea. Be really worried, I guess—I tend a bit towards doomer. There’s something to be said for not leaving capabilities overhangs lying around, though. Maybe contact Anthropic?
The thing is, the confidence the top labs have in short-term AGI makes me think there's a reasonable chance they already have the solution to this problem. I made the mistake of thinking they didn't once before: when Situational Awareness came out, I was pretty skeptical that "more test-time compute" would really unhobble LLMs in a meaningful way, since the essay didn't elaborate at all on how that would work. But it turned out that at least OpenAI, and probably Anthropic too, already had the answer at the time.