The outcome is that the agent performs the “best” action (according to the utility function) - and then the rest of the world responds to it according to physical law. The agent can only control its actions. Outcomes are determined from them by physics and the rest of the world.
This is backwards. Agents control their perceptions, not their actions. They vary their actions in such a manner as to produce the perceptions they desire. There is a causal path from action to perception outside the agent, and another from perception (and desired perception) to action inside the agent.
It is only by mistakenly looking at those paths separately and ignoring their connection that one can maintain the stimulus-response model of an organism (whether of the behaviourist or cognitive type), whereby perceptions control actions. But the two are bound together in a loop, whose properties are completely different: actions control perceptions. The loop as a whole operates in such a way that the perception takes on whatever value the agent intends it to. The action varies all over the place, while the perception hardly changes. The agent controls its perceptions by means of its actions; the environment does not control the agent’s actions by means of the perceptions it supplies.
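A minimal sketch of the loop being described, in Python (my own toy example; the gain, the disturbance, and all names are assumptions, not anything from the thread): a simple proportional controller keeps the perceived variable near the reference even as an outside disturbance pushes on it, so the action swings widely while the perception barely moves.

```python
# Toy negative-feedback loop: the agent varies its action so that its
# perception tracks the value it intends, despite an outside disturbance.
# Illustrative sketch only; numbers and names are invented.
import math

reference = 10.0   # the perception the agent intends to have
action = 0.0       # the only thing the agent can vary
gain = 0.5         # how strongly the agent corrects its error

for t in range(100):
    disturbance = 5.0 * math.sin(t / 10.0)   # slowly varying push from the environment
    perception = action + disturbance        # environment: action -> perception
    error = reference - perception           # agent: compare perception to intention
    action += gain * error                   # agent: adjust the action, not the perception directly
    if t % 20 == 19:
        print(f"t={t:3d}  action={action:6.2f}  perception={perception:6.2f}")
```

In the printed trace the action roughly mirrors the disturbance, varying over its whole range, while the perception stays close to 10.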
Agents control their perceptions, not their actions.
“Control” is being used in two different senses in the above two quotes. In control theory parlance, timtyler is saying that actions are the manipulated variable, and you’re saying that perceptions are the process variable.
I am well aware of the perception-action feedback—but what does it have to do with this discussion?
It renders wrong the passage that I quoted above. You have described agents as choosing an outcome (from utility calculations, which I’d dispute, but that’s not the point at issue here), deciding on an action which will produce that outcome, and emitting that action, whereupon the world then produces the chosen outcome. Agents, that is, in the grip of the planning fallacy.
Planning plays a fairly limited role in human activity. An artificial agent designed to plan everything will do nothing useful. “No plan of battle survives contact with the enemy.” “What you do changes who you are.” “Life is what happens when you’re making other plans.” Etc.
I don’t know what you are thinking—but it seems fairly probable that you are still misinterpreting me—since your first paragraph contains:
You have described agents as choosing an outcome [...] deciding on an action which will produce that outcome, and emitting that action
...which appears to me to have rather little to do with what I originally wrote.
Rather, agents pick an action to execute as follows: they enumerate their possible actions, have a utility (1 or 0) assigned to each action by the I/O wrapper I described, select the highest-utility action, and then pass that on to the associated actuators.
Notice the lack of mention of outcomes here—in contrast to your description.
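A sketch of that wrapper as it reads here, in Python (the code and names below, including the thermostat stand-in for an arbitrary non-utility agent, are illustrative assumptions, not timtyler's implementation):

```python
# Sketch of the "I/O wrapper" construction: take any agent that maps
# observations to actions, assign utility 1 to the action it would emit
# this timestep and 0 to every other available action, then pick the
# highest-utility action. All names here are invented for illustration.

def wrap_with_utility(agent_policy, possible_actions):
    """Return (utility_fn, act_fn) built around an arbitrary non-utility agent."""
    def utility(observation, action):
        # Utility 1 for the action the wrapped agent would actually take, 0 otherwise.
        return 1 if action == agent_policy(observation) else 0

    def act(observation):
        # Select the highest-utility action and pass it on to the actuators.
        return max(possible_actions, key=lambda a: utility(observation, a))

    return utility, act

# Example with a trivial non-utility agent (hypothetical names throughout):
thermostat = lambda temp: "heat_on" if temp < 20 else "heat_off"
utility, act = wrap_with_utility(thermostat, ["heat_on", "heat_off"])
print(act(15))   # heat_on, identical to what the wrapped agent would have done
```

Nothing in the construction refers to outcomes; the utilities are attached directly to actions.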
I stand by the passage that you quoted above, which you claim is wrong.
In that case, I disagree even more. The perceived outcome is what matters to an agent. The actions it takes to get there have no utility attached to them; if utility is involved, it attaches to the perceived outcomes.
I continue to be perplexed that you take seriously the epiphenomenal utility function you described in these words:
Simply wrap the I/O of the non-utility model, and then assign the (possibly compound) action the agent will actually take in each timestep utility 1 and assign all other actions a utility 0 - and then take the highest utility action in each timestep.
and previously here. These functions require you to know what action the agent will take in order to assign it a utility. The agent is not using the utility to choose its action. The utility function plays no role in the agent’s decision process.
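Stated compactly (writing π for whatever decision procedure the wrapped agent already implements and o_t for its input at time t, both of which are my notation rather than anything in the thread):

u_t(a) = 1 if a = π(o_t), else 0; hence argmax_a u_t(a) = π(o_t).

The utility on the left is defined in terms of the action π(o_t) the agent was already going to take, which is the sense in which it does no work in selecting that action.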
Agents control their perceptions, not their actions.
Um. Agents do control their actions.
The utility function determines what the agent does. It is the agent’s utility function.
Utilities are numbers. They are associated with actions—that association is what allows utility-based agents to choose between their possible actions.
The actions produce outcomes—so, the utilities are also associated with the relevant outcomes.
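One toy way to exhibit that association, continuing the Python sketches above (the utilities, outcomes, and prediction table are all invented for illustration, not anyone's stated model): assign utilities to outcomes, let each action inherit the utility of the outcome it is predicted to produce, and choose the best-scoring action; the same numbers then sit on both the actions and the outcomes.

```python
# Toy utility-based chooser; all names and numbers here are assumptions.
# Utilities attach to outcomes; each action inherits the utility of the
# outcome it is predicted to produce, and the agent picks the best action.

outcome_utility = {"warm_room": 1.0, "cold_room": 0.0}

# The agent's (assumed) model of what the world does with each action.
predicted_outcome = {"heat_on": "warm_room", "heat_off": "cold_room"}

def action_utility(action):
    return outcome_utility[predicted_outcome[action]]

best_action = max(predicted_outcome, key=action_utility)
print(best_action, action_utility(best_action))   # heat_on 1.0
```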