So, we already have the underlying mathematical structure of the first half of cognition (determining the state of the world). What about the second half: influencing the world in a predictable way to achieve goals?
= creating mutual information between your utility function and the world. This is left as an exercise for the reader.
Isn’t correlation the proper term here, not mutual information?
“Correlation” is a big old fuzzy mess, usually defined only in terms of what uncorrelated means: in practice it boils down to E[X]E[Y] ≠ E[XY], or sometimes p(x|y) ≠ p(x). It can only really be made quantitative (i.e., as a correlation coefficient) for numerical variables with a linear relationship, not for categories. Mutual information captures, quantitatively, how well you can predict one variable from the other.
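For a concrete illustration of that last point, here is a minimal sketch (the quadratic example and the scikit-learn estimator are my choices, not from the discussion above): a variable can be almost perfectly predictable from another while their linear correlation is essentially zero.

```python
# Minimal sketch: near-zero linear correlation, but high mutual information.
import numpy as np
from sklearn.feature_selection import mutual_info_regression

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 10_000)
y = x**2 + 0.01 * rng.normal(size=x.size)  # y is (almost) a function of x, but not a linear one

corr = np.corrcoef(x, y)[0, 1]  # ~0: x and y are (nearly) uncorrelated
mi = mutual_info_regression(x.reshape(-1, 1), y, random_state=0)[0]  # clearly > 0 nats

print(f"Pearson correlation: {corr:.3f}")
print(f"Estimated mutual information (nats): {mi:.3f}")
```

A correlation coefficient only sees the linear part of the relationship, which here averages out to nothing; the mutual-information estimate sees that knowing x pins down y almost exactly.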
That said, they’re both bad terms because a utility function is not a probability distribution.
But you can have a probability distribution over utility functions. That only works in certain circumstances, but there is a simple model in which you can make a very nice probabilistic statement.
If the state of the world is a vector of N real variables, a utility function is another vector, utility is their dot product (so all utility functions are linear in those variables), and each coefficient has expected value 0,
then expected utility equals the sum of the covariances between the utility coefficients and the corresponding world variables, and rational behavior maximizes that covariance. So that’s something.
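To spell out the step from dot product to covariance (a sketch under the assumptions above; the symbols w for the world vector and u for the utility coefficients are mine):

```latex
% World state w \in \mathbb{R}^N, utility coefficients u \in \mathbb{R}^N,
% realized utility U = u \cdot w, with E[u_i] = 0 for every i.
\begin{aligned}
E[U] &= E\!\left[\sum_{i=1}^{N} u_i w_i\right]
      = \sum_{i=1}^{N} E[u_i w_i] \\
     &= \sum_{i=1}^{N} \bigl(\operatorname{Cov}(u_i, w_i) + E[u_i]\,E[w_i]\bigr)
      = \sum_{i=1}^{N} \operatorname{Cov}(u_i, w_i).
\end{aligned}
```

So, in this model, maximizing expected utility is exactly maximizing the total covariance between the utility coefficients and the corresponding world variables.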