The core of most of my disagreements with this article finds its most concentrated expression in:
“Happiness” is an idiom of policy reinforcement learning, not expected utility maximization.
Under Omohundro’s model of intelligent systems, these two approaches converge. As they do, the reward signal of reinforcement learning and the concept of expected utility converge as well. In other words, it is rather inappropriate to emphasize the differences between these two systems as though they were fundamental.
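One way to see the connection (my notation, not from the article or from Omohundro): the standard objective of policy reinforcement learning is already an expected utility, where the utility of a trajectory τ is its discounted return under a discount factor γ.

```latex
U(\tau) \;=\; \sum_{t \ge 0} \gamma^{t} r_t ,
\qquad
J(\pi) \;=\; \mathbb{E}_{\tau \sim \pi}\bigl[\, U(\tau) \,\bigr]
        \;=\; \mathbb{E}_{\tau \sim \pi}\Bigl[\, \textstyle\sum_{t \ge 0} \gamma^{t} r_t \,\Bigr] .
```

A policy that maximises expected discounted reward is, in that sense, an expected-utility maximiser whose utility function happens to be defined over trajectories via the reward signal.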
There are differences, but they are rather superficial. For example, there is often a happiness “set point”, whereas that concept is typically more elusive for an expected utility maximizer. The analogies between the concepts, however, are deep and fundamental: an agent maximising its happiness is doing something fundamentally similar to an agent maximising its expected utility. That becomes obvious if you substitute “happiness” for “expected utility”.
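As a toy sketch of that substitution point (entirely my own example; the action names and payoffs are made up, and outcomes are reduced to scalar payoffs): an agent ranking actions by expected “happiness” and one ranking them by expected utility pick the same action whenever happiness is a positive affine transform of utility, e.g. the same payoff measured against a shifted baseline.

```python
# Toy sketch: expected-"happiness" ranking vs expected-utility ranking.
ACTIONS = {
    # action -> list of (probability, payoff) outcomes (assumed numbers)
    "forage":  [(0.7, 1.0), (0.3, -0.5)],
    "rest":    [(1.0, 0.2)],
    "explore": [(0.5, 2.0), (0.5, -1.5)],
}

def utility(payoff):
    return payoff

def happiness(payoff):
    # The same quantity re-expressed against a baseline; loosely analogous
    # to a set point, although a real set point also adapts over time.
    return 2.0 * payoff - 0.3

def expected(action, value_fn):
    return sum(p * value_fn(x) for p, x in ACTIONS[action])

eu_choice = max(ACTIONS, key=lambda a: expected(a, utility))
rl_choice = max(ACTIONS, key=lambda a: expected(a, happiness))

print(eu_choice, rl_choice)  # both maximisers pick the same action
```

Because expectation is linear, any positive affine change of scale leaves the ranking of actions untouched; only the labels change.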
In the case of real organisms, that substitution is doubly appropriate, because of evolution. The “happiness” function is not arbitrarily chosen: it is shaped so that it converges closely on a function that favours behaviour resulting in increased expected ancestral representation. So happiness gets an “expectation” of future events built into it automatically by the evolutionary process.
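To make the evolutionary point concrete, here is a minimal sketch (entirely my own construction; the cues, weights and selection scheme are assumptions): candidate “happiness” functions are linear scores over observable cues, selection keeps the ones whose induced choices yield the highest reproductive payoff, and the surviving happiness weights end up roughly tracking the hidden fitness weights.

```python
# Toy sketch: evolution selecting a "happiness" function so that behaviour
# which maximises happiness also tends to maximise reproductive success.
import random

random.seed(0)

N_FEATURES = 3                      # observable cues, e.g. food, warmth, danger
TRUE_FITNESS_W = [1.0, 0.5, -2.0]   # hidden fitness payoff of each cue (assumed)

def fitness(outcome):
    """Reproductive payoff of an outcome (what selection actually 'sees')."""
    return sum(w * x for w, x in zip(TRUE_FITNESS_W, outcome))

def happiness(weights, outcome):
    """Candidate happiness function: a linear score over the same cues."""
    return sum(w * x for w, x in zip(weights, outcome))

def lifetime_fitness(weights, n_choices=50):
    """The agent repeatedly picks the option its happiness function likes best;
    selection scores it by the fitness of what it actually chose."""
    total = 0.0
    for _ in range(n_choices):
        options = [[random.uniform(-1, 1) for _ in range(N_FEATURES)]
                   for _ in range(4)]
        chosen = max(options, key=lambda o: happiness(weights, o))
        total += fitness(chosen)
    return total

def evolve(pop_size=40, generations=60):
    pop = [[random.uniform(-1, 1) for _ in range(N_FEATURES)]
           for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(pop, key=lifetime_fitness, reverse=True)
        parents = scored[: pop_size // 4]
        # Offspring are mutated copies of the fittest parents.
        pop = [[w + random.gauss(0, 0.1) for w in random.choice(parents)]
               for _ in range(pop_size)]
    return max(pop, key=lifetime_fitness)

best = evolve()
print("evolved happiness weights:", [round(w, 2) for w in best])
print("true fitness weights:     ", TRUE_FITNESS_W)
# The evolved weights typically point in roughly the same direction as the
# fitness weights, so maximising this "happiness" approximates maximising
# expected ancestral representation.
```

The evolved weights matter only up to positive scale, since the agent’s choices depend on which option scores highest, not on the absolute score; what selection fixes is the direction of the happiness function, which is exactly the sense in which it converges on expected fitness.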