Could it be useful to have a shutdown-by-default process as follows?
When starting the agent, include a time value (n seconds) after which it will pause itself.
After it pauses, deliberate, and then either stop or continue with some new time value.
This would let you trade power for safety: you can take shorter steps forward as the agents become more dangerous, and you don’t need to do everything in the first time period.
Yes—assuming that the pause interrupts any anticipatory gradient flows from the continuing agent back to the agent which is considering whether to pause.
This pattern is instantiated in the Open Agency Architecture twice:
Step 2 generates top-level agents which are time-bounded at a moderate timescale (~days), with the deliberation about whether to redeploy a top-level agent being carried out by human operators.
In Step 4, the top-level agent dispatches most tasks by deploying narrower low-level agents with much tighter time bounds, with the deliberation about whether to redeploy a low-level agent being automated by the top-level model.
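To make the pattern concrete, here is a minimal sketch of a shutdown-by-default loop. All names in it (`run_time_bounded`, `deliberate`, `FakeClock`, `demo`) are hypothetical and chosen for illustration; none come from the Open Agency Architecture itself. It also assumes the agent's work can be cut into discrete steps, so the time check is cooperative. A real deployment would presumably enforce the bound from outside the agent, and nothing here addresses the caveat about anticipatory gradient flows.

```python
import time
from typing import Callable, Iterator, List, Optional


def run_time_bounded(
    steps: Iterator[str],
    first_budget: float,
    deliberate: Callable[[List[str]], Optional[float]],
    clock: Callable[[], float] = time.monotonic,
) -> List[str]:
    """Run `steps` in time-bounded segments.

    After each segment the agent pauses and `deliberate` is asked for a new
    budget in seconds. Returning None (the default outcome) means shutdown.
    """
    log: List[str] = []
    budget: Optional[float] = first_budget
    while budget is not None:
        deadline = clock() + budget
        while clock() < deadline:
            step = next(steps, None)
            if step is None:
                return log  # the task finished inside its budget
            log.append(step)
        # Budget exhausted: pause, then continue only if explicitly redeployed.
        budget = deliberate(log)
    return log


class FakeClock:
    """Deterministic clock so the example does not depend on wall time."""

    def __init__(self) -> None:
        self.now = 0.0

    def __call__(self) -> float:
        return self.now


def demo() -> List[str]:
    clock = FakeClock()

    def agent_steps() -> Iterator[str]:
        for i in range(10):
            clock.now += 1.0  # pretend each step takes one second
            yield f"step {i}"

    grants = iter([2.0])  # the reviewer approves one further 2-second segment

    def reviewer(log: List[str]) -> Optional[float]:
        return next(grants, None)  # None means: stay shut down

    return run_time_bounded(agent_steps(), 3.0, reviewer, clock)
```

In this sketch, `demo()` runs three steps, pauses, is granted two more seconds, runs two more steps, and then stays shut down because the reviewer grants nothing further. For the two-level instantiation, the same loop would be used twice under these assumptions: at the top level `deliberate` stands in for a human operator's decision with budgets on the order of days, and at the low level it is a function supplied by the top-level model with much smaller budgets.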