I think you may have missed, or at least not taken literally, at least one of these things in the post:
The expansion of “superhuman strategic agent” is not “agent that’s better than humans at strategic reasoning”, it’s “agent that is better than the best groups of humans at taking (situated) strategic action”
Strategic action is explicitly context-dependent, e.g. an AI system that’s inside a mathematically perfect simulated world that can have no effect on the rest of the physical world and vice versa, has zero strategic power in this sense. Also e.g. in the FAQ, “Capabilities and controls are relevant to existential risks from agentic AI insofar as they provide or limit situated strategic power.” So, yes, an agent that lives on your laptop is only strategically superhuman if it has the resources to actually take strategic action rivaling the most strategically capable groups of humans.
“increasingly accurately” is meant to point out that we don’t need to understand or limit the capabilities of things that are obviously much strategically worse than us.
I was confused here, so I had Claude try to explain this to me:
Let me break down Ben’s response carefully.
He says you may have missed three key points from his original post:
His definition of “superhuman strategic agent” isn’t just about being better at strategic thinking/reasoning—it’s about being better than the best human teams at actually taking real-world strategic actions. This is a higher bar that includes implementation, not just planning.
Strategic power is context-dependent. He gives two examples to illustrate this:
An AI in a perfect simulation with no connection to the real world has zero strategic power, because it can’t affect anything real
An AI on your laptop (referring to your example) only counts as strategically superhuman if it has enough resources to actually compete with top human strategic actors
The phrase “increasingly accurately” in his original post was meant to convey that we only need to carefully control and understand AI systems that are getting close to human-level strategic capability. We don’t need to worry as much about obviously weaker systems.
His response is essentially saying: “Your example of a small agent on a laptop doesn’t contradict my point, because I’m specifically talking about agents that have both the capability AND the resources to outcompete top human strategic actors in the real world.”
This addresses your concern about limited money/power by saying that an agent without sufficient resources wouldn’t qualify as “strategically superhuman” under his definition, even if it had excellent strategic reasoning capabilities.
I think I’m still confused. My guess is that the “most strategically capable groups of humans” are still not all that powerful, especially without that many resources. If you do give it a lot of resources, then sure, I agree that an LLM system with human-outperforming strategy and, say, $10B could do a fair bit of damage.
Not sure if it’s worth much more, just wanted to flag that.