I do not think you are selling a strawman, but the notion that a utility function should be computable seems to me to be completely absurd. It seems like a confusion born from not understanding what computability means in practice.
Say I have a computer that will simulate an arbitrary Turing machine T, and will award me one utilon when that machine halts, and do nothing for me until that happens. With some clever cryptocurrency scheme, this is a scenario I could actually build today. My utility function ought plausibly to have a term in it that assigns a positive value to the computer simulating a halting Turing machine, and zero to the computer simulating a non-halting Turing machine. Yet the assumption of utility function computability would rule out this very sensible desire structure.
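To make that concrete, here is a minimal sketch in Python (using the Collatz map purely as an illustrative stand-in for “some program whose halting I cannot decide in advance”). The point is that any finite simulation only ever gives a lower bound on this utility term; an exact evaluator would have to decide the halting problem.

```python
def halts_within(n: int, max_steps: int) -> bool:
    """Iterate the Collatz map from n for at most max_steps steps and
    report whether it reached 1 (our stand-in for 'the machine halted')."""
    for _ in range(max_steps):
        if n == 1:
            return True
        n = 3 * n + 1 if n % 2 else n // 2
    return False


def utility_term_lower_bound(n: int, budget: int) -> int:
    """Best any finite procedure can do for the utility term above:
    return 1 once the machine is observed to halt, else 0, where 0 means
    'not observed yet', not 'never halts'."""
    return 1 if halts_within(n, budget) else 0


print(utility_term_lower_bound(27, 1000))  # 1: this particular instance halts
```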
If I live in a Conway’s Game of Life universe, there may be some chunk of universe somewhere that will eventually end up destroying all life (in the biological sense, not the Game of Life sense) in my universe. I assign lower utility to universes where this is the case than to those where it is not. Is that computable? No.
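For concreteness, here is a minimal Life stepper (a sketch; the glider is just an arbitrary starting pattern). Simulating forward one generation at a time is trivial, but because Life can emulate arbitrary Turing machines, there is no general procedure that decides the long-run question I actually care about.

```python
from itertools import product

def life_step(live):
    """One generation of Conway's Game of Life on an unbounded grid,
    with the state represented as the set of live-cell coordinates."""
    counts = {}
    for (x, y) in live:
        for dx, dy in product((-1, 0, 1), repeat=2):
            if (dx, dy) != (0, 0):
                cell = (x + dx, y + dy)
                counts[cell] = counts.get(cell, 0) + 1
    return {c for c, n in counts.items() if n == 3 or (n == 2 and c in live)}


# A glider: easy to step forward, but "does this chunk of universe
# eventually destroy everything I value?" has no general decision procedure.
glider = {(1, 0), (2, 1), (0, 2), (1, 2), (2, 2)}
print(life_step(glider))
```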
More prosaically, as far as I currently understand, the universe we actually live in seems to be continuous in nature, and its state may not be describable even in principle with a finite number of bits. And even if it is, I do not actually know this, which means my utility function is also over potential universes (any of which, as far as I know, might be the one I live in) that require an infinite number of state bits. Why in the world would one expect a utility function over an uncountable domain to be computable?
As far as I can see, the motivation for requiring a utility function to be computable is that this would make optimization for said utility function a great deal easier. Certainly this is true; there are powerful optimization techniques that apply only to computable utility functions, and an optimizer with an uncomputable utility function does not have access to them in their full form. But the utility function is not up for grabs; the fact that life will be easier for me if I want a certain thing should not be taken as an indication that that is what I want! This seems to me like the cart-before-horse error of trying to interpret the problem as one that is easier to solve, rather than the problem one actually wants solved.
One argument is that U() should be computable because the agent has to be able to use it in computations. If you can’t evaluate U(), how are you supposed to use it? If U() exists as an actual module somewhere in the brain, how is it supposed to be implemented?
This line of thought illustrates very well the (I claim) grossly mistaken intuition behind assuming computability. If you can’t evaluate U() perfectly, then perhaps what your brain is doing is only an approximation of what you really want, and perhaps the same constraint will hold for any greater mind that you can devise. But that does not mean that what your brain is optimizing for is necessarily what you actually want! There is no requirement at all that your brain be a perfect judge of the desirability of the world it’s looking at, after all (and we know for a fact that it does a far from perfect job at this).
Say I have a computer that will simulate an arbitrary Turing machine T, and will award me one utilon when that machine halts, and do nothing for me until that happens. With some clever cryptocurrency scheme, this is a scenario I could actually build today.
No, you can’t do that today. You could produce a contraption that will deposit 1 BTC into a certain bitcoin wallet if and when some computer program halts, but this won’t do the wallet’s owner much good if they die before the program halts. If you reflect on what it means to award someone a utilon, rather than a bitcoin, I maintain that it isn’t obvious that this is even possible in theory.
Why in the world would one expect a utility function over an uncountable domain to be computable?
There is a notion of computability in the continuous setting.
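For instance, in computable analysis a real number (or a function on the reals) counts as computable when some algorithm produces rational approximations to any requested precision. A minimal sketch for the number e, using only the standard library:

```python
from fractions import Fraction

def e_approx(n: int) -> Fraction:
    """Return a rational q with |q - e| < 2**-n.

    This is the sense of 'computable' in the continuous setting: for any
    requested precision, output a rational approximation that good."""
    target = Fraction(1, 2 ** n)
    total, term, k = Fraction(0), Fraction(1), 0  # term = 1/k!
    # Tail bound: the sum of 1/j! over j > k is less than 2/(k+1)!
    while 2 * term / (k + 1) >= target:
        total += term
        k += 1
        term /= k
    return total + term


print(e_approx(50))  # a fraction within 2**-50 of e
```

The same idea extends to functions: a function on the reals is computable if arbitrarily good rational approximations of the output can be produced from sufficiently good approximations of the input.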
As far as I can see, the motivation for requiring a utility function to be computable is that this would make optimization for said utility function to be a great deal easier.
This seems like a strawman to me. A better motivation would be that agents that actually exist are computable, and a utility function is determined by judgements rendered by the agent, which is incapable of thinking uncomputable thoughts.
Clearly, there is a kind of utility function that is computable. Clearly the kind of UF that is defined in terms of preferences over fine-grained world-states isn’t computable. So, clearly, “utility function” is being used to mean different things.
That seems to conflate two different things: whether you can compute the occurrence of event E, as opposed to whether you could compute your preference for E over not E.