FWIW, this reminds me of Holden Karnofsky’s formulation of Tool AI (from his 2012 post, Thoughts on the Singularity Institute):

> Another way of putting this is that a “tool” has an underlying instruction set that conceptually looks like: “(1) Calculate which action A would maximize parameter P, based on existing data set D. (2) Summarize this calculation in a user-friendly manner, including what Action A is, what likely intermediate outcomes it would cause, what other actions would result in high values of P, etc.” An “agent,” by contrast, has an underlying instruction set that conceptually looks like: “(1) Calculate which action, A, would maximize parameter P, based on existing data set D. (2) Execute Action A.” In any AI where (1) is separable (by the programmers) as a distinct step, (2) can be set to the “tool” version rather than the “agent” version, and this separability is in fact present with most/all modern software. Note that in the “tool” version, neither step (1) nor step (2) (nor the combination) constitutes an instruction to maximize a parameter—to describe a program of this kind as “wanting” something is a category error, and there is no reason to expect its step (2) to be deceptive.
If I understand correctly, his “agent” is your Consequentialist AI, and his “tool” is your Decoupled AI 1.
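To make the tool/agent split concrete, here is a minimal runnable sketch in Python. Everything in it is illustrative, not from Karnofsky’s post: the toy data set, the scores, and the names (`Plan`, `choose_action`, `run_tool`, `run_agent`) are my own. The one thing the sketch is meant to capture is that step (1) is a single shared function, and only the step-(2) wrapper differs.

```python
# Illustrative sketch of Karnofsky's tool/agent distinction.
# All names and data here are hypothetical, not from his post.

from dataclasses import dataclass

# Toy "data set D": candidate actions and the value of parameter P
# each is predicted to achieve.
DATASET = {"ship_feature": 0.82, "refactor": 0.64, "do_nothing": 0.10}

@dataclass
class Plan:
    action: str                           # the action A that maximizes P
    expected_p: float                     # predicted value of P for A
    runners_up: list[tuple[str, float]]   # other actions with high P

def choose_action(dataset: dict[str, float]) -> Plan:
    """Step (1), shared by both versions: calculate which action A
    would maximize parameter P, based on data set D."""
    ranked = sorted(dataset.items(), key=lambda kv: kv[1], reverse=True)
    (best, p), rest = ranked[0], ranked[1:]
    return Plan(action=best, expected_p=p, runners_up=rest)

def run_tool(dataset: dict[str, float]) -> str:
    """'Tool' step (2): summarize the calculation for the user.
    Nothing is executed; the program only reports what it computed."""
    plan = choose_action(dataset)
    return (f"Recommended action: {plan.action} "
            f"(expected P = {plan.expected_p}); "
            f"other options: {plan.runners_up}")

def run_agent(dataset: dict[str, float]) -> None:
    """'Agent' step (2): execute action A directly."""
    plan = choose_action(dataset)
    print(f"Executing {plan.action}")  # stands in for a real effector

print(run_tool(DATASET))   # tool mode: report only
run_agent(DATASET)         # agent mode: act
```

The “separability” Karnofsky points to corresponds here to `choose_action` being a distinct function that both step-(2) wrappers call, so switching between the two modes changes only the wrapper, not the planning step.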