I’ve built very basic agents where (if I’m understanding correctly) my laptop is the Scaffold Server and there is no separate Execution Server; the agent executes Python code and bash commands locally. You mention that it seems bizarre to not set up a separate Execution Server (at least for more sophisticated agents) because the agent can break things on the Scaffold Server. But I’m inclined to think there may also be advantages to this for capabilities: namely, an agent can discover while performing a task that it would benefit from having tools that it doesn’t yet have, then it can edit its own scaffold (or write a new one) to have those extra tools. (Similarly for other edits to the scaffold.) I tentatively think this is useful enough that people will try to implement it.
Allowing an agent to edit its own scaffold and otherwise interact with the Scaffold Server does probably make control harder, but I’m not sure how bad this is. We can still monitor the tool calls as you suggest, and the agent still has no easy access to model weights. Maybe some tool calls still get sent to a separate Execution Server, and there could be additional guardrails around the tools that allow edits on the Scaffold Server.
I’m curious if there are reasons why giving agents edit-access to the Scaffold Server wouldn’t be useful or why it would significantly increase safety concerns.
I’ve built very basic agents where (if I’m understanding correctly) my laptop is the Scaffold Server and there is no separate Execution Server; the agent executes Python code and bash commands locally. You mention that it seems bizarre to not set up a separate Execution Server (at least for more sophisticated agents) because the agent can break things on the Scaffold Server. But I’m inclined to think there may also be advantages to this for capabilities: namely, an agent can discover while performing a task that it would benefit from having tools that it doesn’t yet have, then it can edit its own scaffold (or write a new one) to have those extra tools. (Similarly for other edits to the scaffold.) I tentatively think this is useful enough that people will try to implement it.
Allowing an agent to edit its own scaffold and otherwise interact with the Scaffold Server does probably make control harder, but I’m not sure how bad this is. We can still monitor the tool calls as you suggest, and the agent still has no easy access to model weights. Maybe some tool calls still get sent to a separate Execution Server, and there could be additional guardrails around the tools that allow edits on the Scaffold Server.
I’m curious if there are reasons why giving agents edit-access to the Scaffold Server wouldn’t be useful or why it would significantly increase safety concerns.