One argument is that U() should be computable because the agent has to be able to use it in computations. This perspective is especially appealing if you think of U() as a black-box function which you can only optimize through search. If you can’t evaluate U(), how are you supposed to use it? If U() exists as an actual module somewhere in the brain, how is it supposed to be implemented?
This seems like a weak argument. If I think about a human trying to achieve some goal in practice, “think of U() as a black-box function which you can only optimize through search” doesn’t really describe how we typically reason. I would say that we optimize for things we can’t evaluate all the time—it’s our default mode of thought. We don’t need to evaluate U() in order to decide which of two options yields higher U().
Example: suppose I’m a general trying to maximize my side’s chance of winning a war. Can I evaluate the probability that we win, given all of the information available to me? No—fully accounting for every little piece of info I have is way beyond my computational capabilities. Even reasoning through an entire end-to-end plan for winning takes far more effort than I usually make for day-to-day decisions. Yet I can say that some actions are likely to increase our chances of victory, and I can prioritize actions which are more likely to increase our chances of victory by a larger amount.
Suppose I’m running a company, trying to maximize profits. I don’t make decisions by looking at the available options, and then estimating how profitable I expect the company to be under each choice. Rather, I reason locally: at a cost of X I can gain Y, I’ve cached an intuitive valuation of X and Y based on their first-order effects, and I make the choice based on that without reasoning through all the second-, third-, and higher-order effects of the choice. I don’t calculate all the way through to an expected utility or anything comparable to it.
If I see a $100 bill on the ground, I don’t need to reason through exactly what I’ll spend it on in order to decide to pick it up.
In general, I think humans usually make decisions directionally and locally: we try to decide which of two actions is more likely to better achieve our goals, based on local considerations, without actually simulating all the way to the possible outcomes.
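To make the “local considerations” point concrete, here’s a minimal sketch (with made-up numbers and a hypothetical cached_value table standing in for intuition) of comparing two plans purely by their cached first-order effects, the way I described the company case above—no full utility function ever gets evaluated:

```python
# Hypothetical cached valuations of first-order effects (stand-ins for intuition).
cached_value = {
    "hire_engineer":   -120_000,  # salary cost
    "ship_feature":    +200_000,  # expected extra revenue
    "run_ad_campaign":  -30_000,
    "extra_signups":    +50_000,
}

def local_score(effects):
    """Sum cached first-order effects; never simulate out to final outcomes."""
    return sum(cached_value[e] for e in effects)

# Two candidate plans, described only by their immediate effects.
plan_a = ["hire_engineer", "ship_feature"]
plan_b = ["run_ad_campaign", "extra_signups"]

# Pick the directionally better option; U() itself is never evaluated.
print(max([plan_a, plan_b], key=local_score))  # -> ['hire_engineer', 'ship_feature']
```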
Taking a more theoretical perspective… how would a human or other agent work with an uncomputable U()? Well, we’d consider specific choices available to us, and then try to guess which of those is more likely to give higher U(). We might look for proofs that one specific choice or another is better; we might leverage logical induction; we might do something else entirely. None of that necessarily requires evaluating U().
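As a toy illustration of that last paragraph—deciding which option is more likely to give higher U() without ever evaluating U()—here’s one way it could look, assuming we can get cheap interval bounds on U under each option (my own framing, not something from the post):

```python
from typing import Callable, Tuple

Bounds = Tuple[float, float]  # (lower, upper) bound on U under an option

def prefer(bound_a: Callable[[], Bounds], bound_b: Callable[[], Bounds]) -> str:
    """Return 'a' or 'b' if cheap bounds already separate the options,
    otherwise admit more work is needed (proofs, logical induction, ...)."""
    lo_a, hi_a = bound_a()
    lo_b, hi_b = bound_b()
    if lo_a > hi_b:
        return "a"
    if lo_b > hi_a:
        return "b"
    return "undecided"  # bounds overlap; note U(a) and U(b) were never computed

# Toy bounds, e.g. "option a wins in 60-90% of the scenarios I can foresee".
print(prefer(lambda: (0.6, 0.9), lambda: (0.2, 0.5)))  # -> a
print(prefer(lambda: (0.4, 0.7), lambda: (0.5, 0.8)))  # -> undecided
```

When the bounds overlap, you fall back on something more expensive—searching for a proof, consulting a logical inductor, or just gathering more information.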
Yeah, a didactic problem with this post is that when I write everything out, the “reductive utility” position does not sound that tempting.
I still think it’s a really easy trap to fall into, though, because before you think about it too much, the assumption of a computable utility function sounds extremely reasonable.
Suppose I’m running a company, trying to maximize profits. I don’t make decisions by looking at the available options, and then estimating how profitable I expect the company to be under each choice. Rather, I reason locally: at a cost of X I can gain Y, I’ve cached an intuitive valuation of X and Y based on their first-order effects, and I make the choice based on that without reasoning through all the second-, third-, and higher-order effects of the choice. I don’t calculate all the way through to an expected utility or anything comparable to it.
With dynamic-programming-inspired algorithms such as AlphaGo, “cached an intuitive valuation of X and Y” is modeled as a kind of approximate evaluation which is learned based on feedback—but feedback requires the ability to compute U() at some point. (So you don’t start out knowing how to evaluate uncertain situations, but you do start out knowing how to evaluate utility on completely specified worlds.)
So one might still reasonably assume you need to be able to compute U() despite this.
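For concreteness, here’s roughly the shape of that setup—much simpler than AlphaGo, and with invented names like U_terminal and V—where the learned evaluator is trained from feedback that bottoms out in computing utility on completely specified end states:

```python
import random

def U_terminal(outcome: int) -> float:
    """Utility is only assumed computable on completely specified end states."""
    return 1.0 if outcome == 1 else 0.0

V = {}  # learned approximate evaluation of uncertain, mid-game situations

def update(state: str, outcome: int, lr: float = 0.1) -> None:
    """Nudge the cached evaluation of `state` toward the utility of the terminal
    outcome eventually reached from it (Monte-Carlo-style feedback)."""
    v = V.get(state, 0.5)
    V[state] = v + lr * (U_terminal(outcome) - v)

# Fake experience: situation "ahead" usually leads to a win, "behind" usually not.
random.seed(0)
for _ in range(1000):
    update("ahead",  outcome=int(random.random() < 0.8))
    update("behind", outcome=int(random.random() < 0.3))

print(round(V["ahead"], 2), round(V["behind"], 2))  # roughly 0.8 and 0.3
```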
Suppose I’m a general trying to maximize my side’s chance of winning a war. Can I evaluate the probability that we win, given all of the information available to me? No—fully accounting for every little piece of info I have is way beyond my computational capabilities. Even reasoning through an entire end-to-end plan for winning takes far more effort than I usually make for day-to-day decisions. Yet I can say that some actions are likely to increase our chances of victory, and I can prioritize actions which are more likely to increase our chances of victory by a larger amount.
So, when and why are we able to get away with doing that?
AFAICT, the agent formalisms I’m aware of (Bayesian inference, AIXI, etc.) set things up by assuming logical omniscience and that the true process generating our observations is in the hypothesis space; from there you can show that the agent maximises expected utility, doesn’t get Dutch-booked, or whatever. But humans, and ML algorithms for that matter, don’t work that way: we get “good enough” results even when we know our models are wrong and miss a good deal of the underlying process generating our observations. Furthermore, it seems that, empirically, the more expressive the model class and the more compute thrown at the problem, the better these bounded inference algorithms work. I haven’t found a good explanation of why this is, beyond the hand-wavy “we approach logical omniscience as compute goes to infinity and our hypothesis space grows to encompass all computable hypotheses, so eventually our approximation should behave like the ideal Bayesian one”.
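Here’s a toy version of the “good enough despite being wrong” phenomenon (my own construction, nothing to do with AIXI specifically): Bayesian updating over a small class of i.i.d. coin-bias hypotheses, when the data actually come from a sticky process that no hypothesis in the class captures. The posterior still piles up on the hypothesis with the best marginal predictions:

```python
import math, random

random.seed(0)

# Hypothesis class: the data are i.i.d. coin flips with one of these biases.
hypotheses = [0.1, 0.3, 0.5, 0.7, 0.9]
log_weights = {h: 0.0 for h in hypotheses}

def true_process(n):
    """Actual generator (not in the class): a sticky chain that repeats the
    previous bit 90% of the time, with 1s showing up about 70% of the time."""
    bit, out = 1, []
    for _ in range(n):
        if random.random() >= 0.9:          # only occasionally resample
            bit = int(random.random() < 0.7)
        out.append(bit)
    return out

# Bayesian updating (in log space) despite the misspecification.
for bit in true_process(5000):
    for h in hypotheses:
        log_weights[h] += math.log(h if bit == 1 else 1 - h)

print(max(hypotheses, key=log_weights.get))  # -> 0.7, the best i.i.d. proxy
```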
I think in part we can get away with it because it’s possible to optimize for things that are only usually decidable.
Take winning the war for example. There may be no computer program that could look at any state of the world and tell you who won the war—there are lots of weird edge cases that could cause a Turing machine to not return a decision. But if we expect to be able to tell who won the war with very high probability (or have a model that we think matches who wins the war with high probability), then we can just sort of ignore the weird edge cases and model failures when calculating an expected utility.
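A sketch of what “only usually decidable” could mean operationally (again my own toy framing): give the judge of who won a step budget, and when estimating expected utility just drop the rare scenarios where it fails to return an answer:

```python
import random

def who_won(scenario: float, budget: int = 1000):
    """Toy judge: usually decides quickly, but on weird edge cases it would run
    forever -- modeled here as blowing its step budget and returning None."""
    steps = 10 if scenario < 0.99 else budget + 1   # ~1% of scenarios are weird
    if steps > budget:
        return None                                 # undecided within budget
    return 1 if scenario < 0.6 else 0               # 1 = we win

def expected_utility(n_samples: int = 10_000) -> float:
    random.seed(0)
    results = [who_won(random.random()) for _ in range(n_samples)]
    decided = [r for r in results if r is not None]
    # Ignore the undecidable edge cases; they are rare enough not to matter much.
    return sum(decided) / len(decided)

print(round(expected_utility(), 2))  # ~0.61: win probability over decided cases
```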
I actually found the position very tempting until I got to the subjective utility section.
Specifically, discontinuous utility functions have always seemed basically irrational to me, for reasons related to incomputability.
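To spell out (in my own notation, not anything quoted from the post) why that connection feels forceful to me: think of a world as an infinite observation sequence ω ∈ {0,1}^ℕ. A utility function that is continuous in the product topology is pinned down to any precision by finitely many observations, whereas a discontinuous one can hinge on tail behaviour that no finite amount of evidence settles:

```latex
% Continuous: the first N observations determine the value to within 2^{-N}.
U_{\text{cont}}(\omega) = \sum_{n=1}^{\infty} 2^{-n}\,\omega_n,
\qquad
\bigl|U_{\text{cont}}(\omega) - U_{\text{cont}}(\omega')\bigr| \le 2^{-N}
\quad \text{whenever } \omega_{1:N} = \omega'_{1:N}.

% Discontinuous: a tail event; every finite prefix of observations is
% consistent with both values, so no evidence ever narrows it down.
U_{\text{disc}}(\omega) =
\begin{cases}
  1 & \text{if } \omega_n = 1 \text{ for infinitely many } n,\\
  0 & \text{otherwise.}
\end{cases}
```

A preference that genuinely hinges on something like U_disc can never be settled by any finite amount of observation, which is a big part of why it strikes me as irrational.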
Perhaps, as the approximation gets closer to the ideal, the results do as well. (The Less Wrong quote seems relevant.)