Strong agree with long-horizon sequential decision-making success being very tied to wantingness.
I kinda want to point at things like the Good and Gooder Regulator theorems here as theoretical reasons to expect this, besides the analogies you give. But I don’t find them entirely satisfactory. I have recently wondered whether there’s something like a Good Regulator theorem for planner-simulators: a Planner Simulator conjecture, something like ‘every (simplest) simulator of a planner contains (something homomorphic to) a planner’. Potentially a stepping-stone toward the agent-like structure problem.

I also have some more specific thoughts about long-horizon tasks and the closed loop of deliberation for R&D-like tasks. But I’ve struggled to articulate these, in part because I flinch when they start to seem too capabilities-laden.
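For what it’s worth, here’s one very rough way the conjecture might be written down, mostly to make the shape of the claim visible; the specific notions of ‘planner’, ‘simulator’, and ‘minimal’ below are placeholders I’m assuming, not anything established:

```latex
% A rough, non-authoritative sketch of how the conjecture might be phrased;
% "planner", "simulator", and "minimal" are placeholder notions assumed here,
% not established definitions.
\documentclass{article}
\usepackage{amsmath, amssymb, amsthm}
\newtheorem{conjecture}{Conjecture}
\begin{document}
\begin{conjecture}[Planner Simulator, informal]
Let $P \colon \mathcal{O} \to \mathcal{A}$ be a planner, i.e.\ a policy obtained by
(approximately) solving $\max_{\pi} \mathbb{E}\,[\,U \mid \pi, o\,]$ over some horizon $T$
for a utility $U$. Let $S$ be a simulator of $P$, meaning $S(o) = P(o)$ for all
observations $o \in \mathcal{O}$, and suppose $S$ is minimal (say, of lowest description
length among such simulators). Then $S$ contains a subcomputation admitting a
homomorphism onto a planning computation: something that represents (a coarse-graining
of) $U$, rolls candidate actions forward, and selects among them.
\end{conjecture}
\end{document}
```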
Any tips?