Informally: a system has (immutable) terminal goals.
Semiformally: a system’s decision making is well described as (an approximation of) argmax over actions (or higher level mappings thereof) to maximise (the expected value of) a simple unitary utility function.
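For concreteness, here is one way the semiformal version could be written out, reading the parenthesized qualifiers as included; the notation is mine, not the original post’s:

```latex
% Sketch, assuming the parenthesized qualifiers are included: a single fixed
% utility function U over outcomes o; the agent's choice in state s is
% (approximately) the expected-utility-maximising action.
a^{\ast}(s) \;\approx\;
  \operatorname*{arg\,max}_{a \in \mathcal{A}}
  \mathbb{E}\big[\, U(o) \mid s, a \,\big]
```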
Are the (parenthesized) words part of your operationalization or not? If so, I would recommend removing the parentheses, to make it clear that they are not optional.
Also, what do you mean by “a simple unitary utility function”? I suspect other people will also be confused/thrown off by that description.
The “or higher level mappings thereof” is there to accommodate agents that choose state → action policies directly, and agents that choose policies over … over policies, so I’ll keep it.
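A rough sketch of what that higher-level case looks like, again in my own notation: the argmax ranges over policies (state → action maps) rather than individual actions, and the construction can be iterated.

```latex
% Sketch: the same argmax, but taken over state -> action policies rather than
% over individual actions; the construction can be iterated to policies over
% policies, and so on.
\pi^{\ast} \;\approx\;
  \operatorname*{arg\,max}_{\pi : \mathcal{S} \to \mathcal{A}}
  \mathbb{E}\big[\, U(o) \mid \pi \,\big]
```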
I don’t actually know if my critique applies well to systems that have non-immutable terminal goals.
I guess if you have sufficiently malleable terminal goals, you recover values almost exactly.
Are the (parenthesized) words part of your operationalization or not? If so, I would recommend removing the parentheses, to make it clear that they are not optional.
Will do.
Also, what do you mean by “a simple unitary utility function”? I suspect other people will also be confused/thrown off by that description.
If you define your utility function in a sufficiently convoluted manner, then everything is a utility maximiser.
Less contrived, I was thinking of things like Wentworth’s subagents model, which identifies decision making with Pareto optimality over a set of utility functions.
I think the subagents model comes very close to being an ideal model of agency and could probably be adapted into a complete one.
I don’t want to include subagents in my critique at this point.
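For contrast, a rough sketch of the multi-utility picture being set aside here; this is a generic statement of the Pareto condition, not necessarily Wentworth’s exact formulation:

```latex
% Sketch of the multi-utility (subagents-style) picture being set aside:
% decision making as Pareto optimality over utility functions U_1, ..., U_n
% rather than maximisation of a single U. An option x is admissible iff:
\neg\, \exists y \;
  \Big[ \big( \forall i \;\; U_i(y) \ge U_i(x) \big)
        \,\wedge\, \big( \exists j \;\; U_j(y) > U_j(x) \big) \Big]
```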
I think what you want might be “a single fixed utility function over states” or something similar (see the sketch after this list). That captures that you’re excluding from the critique:
Agents with multiple internal “utility functions” (subagents)
Agents whose “utility function” is malleably defined
Agents that have trivial utility functions, like over universe-histories
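A sketch of the distinction the last item points at, in my own notation; the second part just spells out the earlier remark that a sufficiently convoluted utility function makes everything a maximiser:

```latex
% Sketch: utility over states vs. over universe-histories, where S is the set
% of world-states and H the set of complete histories.
U_{\mathrm{state}} : \mathcal{S} \to \mathbb{R}
\qquad\qquad
U_{\mathrm{hist}} : \mathcal{H} \to \mathbb{R}

% Trivialising construction: let h^* be the history the system in fact
% produces, and define
U_{\mathrm{hist}}(h) =
  \begin{cases}
    1 & \text{if } h = h^{\ast} \\
    0 & \text{otherwise}
  \end{cases}
% Every system maximises this U_hist, which is why such "trivial" utility
% functions are excluded.
```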