The “or higher mappings thereof” is to accommodate agents that choose state —> action policies directly, and agent that choose policies over … over policies, so I’ll keep it.
I don’t actually know if my critique applies well to systems that have non immutable terminal goals.
I guess if you have sufficiently malleable terminal goals, you get values near exactly.
The “or higher mappings thereof” is to accommodate agents that choose state —> action policies directly, and agent that choose policies over … over policies, so I’ll keep it.
I don’t actually know if my critique applies well to systems that have non immutable terminal goals.
I guess if you have sufficiently malleable terminal goals, you get values near exactly.