That mind would have some associated behaviour and that behaviour could be expressed by a utility function (assuming computability—which follows from the Church–Turing–Deutsch principle).
Navel gazing, rushing around in circles, burning money, whatever—all have corresponding utility functions.
Dewey explains why in more detail—if you are prepared to follow the previously-provided link from here.
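Roughly, the construction I have in mind is the trivial one: reward exactly those interaction histories in which the agent acts as it would have acted anyway. A minimal sketch in Python (my own names, and an observe-then-act ordering for simplicity; the paper’s formalism may differ):

    def make_mimicking_utility(agent):
        """Sketch: given any computable agent (a function from the observations
        seen so far to an action), build a utility over whole interaction
        histories that an expected-utility maximizer maximizes precisely by
        behaving as that agent would."""
        def U(history):
            # history: a list of (observation, action) pairs, observe-then-act
            observations = []
            for observation, action in history:
                observations.append(observation)
                if action != agent(observations):
                    return 0.0  # this history deviates from the agent's behaviour
            return 1.0          # every action is the one the agent would have taken
        return U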
I’ve taken a look at the paper. If “outcomes” are things like “chose A”, “chose B” or “chose C”, the above mind is simply not an O-maximizer: consider a world with observations “I can choose between A and B/B and C/C and A” (equally likely, independent of any past actions or observations) and actions “take the first offered option” or “take the second offered option” (played for one round, for simplicity, but the argument works fine with multiple rounds); there is no definition of U that yields the described behaviour. (I’m aware that the paper asserts that “any agents [sic] can be written in O-maximizer form”, but note that the paper may simply be wrong. It’s clearly an unfinished draft, and no argument or proof is given.)
If outcomes are things like “chose A given a choice between A and B”, which is not clear to me from the paper, then my mind is indeed an O-maximizer (that is, there is a definition of U such that an O-maximizer produces the same outputs as my mind). However, as I understand it, you have also encoded any cognitive errors in the utility function: if a mind can be Dutch-booked into an undesirable state, the associated O-maximizer will have to act on a U function that values this undesirable state highly if it comes about as a result of being Dutch-booked. (Remember, the O-maximizer maximizes U and behaves like the original mind.) As an additional consideration, most decision/choice theory seems to assume a ranking of outcomes, not (path, outcome) pairs.
I’ve taken a look at the paper. If “outcomes” are things like “chose A”, “chose B” or “chose C”, the above mind is simply not an O-maximizer: consider a world with observations “I can choose between A and B/B and C/C and A” (equally likely, independent of any past actions or observations) and actions “take the first offered option” or “take the second offered option” (played for one round, for simplicity, but the argument works fine with multiple rounds); there is no definition of U that yields the described behaviour.
What?!? You haven’t clearly specified the behaviour of the machine. If you are invoking an uncomputable random number generator to produce an “equally likely” result then you have an uncomputable agent. However, there’s no such thing as an uncomputable random number generator in the real world. So: how is this decision actually being made?
I’m aware that the paper asserts that “any agents [sic] can be written in O-maximizer form”, but note that the paper may simply be wrong. It’s clearly an unfinished draft, and no argument or proof is given.
It applies to any computable agent. That is any agent—assuming that the Church–Turing–Deutsch principle is true.
The argument given is pretty trivial. If you doubt the result, check it—and you should be able to see if it is correct or not fairly easily.
The world is as follows: each observation x_i is one of “the mind can choose between A and B”, “the mind can choose between B and C” or “the mind can choose between C and A” (conveniently encoded as 1, 2 and 3). Independently of any past observations (x_1 and the like) and actions (y_1 and the like), each of these three options is equally likely. This fully specifies a possible world, no?
The mind, then, is as follows: if the last observation is 1 (“A and B”), output “A”; if the last observation is 2 (“B and C”), output “B”; if the last observation is 3 (“C and A”), output “C”. This fully specifies a possible (deterministic, computable) decision procedure, no? (1)
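For concreteness, the whole setup fits in a few lines of Python (the names are mine, purely for illustration):

    import random

    # The world: each round the mind is offered one of three menus,
    # each with probability 1/3, independently of all past actions and observations.
    MENUS = {1: ("A", "B"), 2: ("B", "C"), 3: ("C", "A")}

    def observe():
        return random.choice(list(MENUS))

    # The mind: a deterministic, computable policy on the last observation alone.
    def mind(last_observation):
        return {1: "A", 2: "B", 3: "C"}[last_observation]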
I argue that there is no assignment to U(“A”), U(“B”) and U(“C”) that causes an O-maximizer to produce the same output as the algorithm above: matching the mind’s choices would require U(“A”) > U(“B”), U(“B”) > U(“C”) and U(“C”) > U(“A”) simultaneously, which no assignment satisfies. On the other hand, there are assignments to U(“1A”), U(“1B”), …, U(“3C”) that cause the O-maximizer to output the same decisions as the above algorithm, but then we have encoded our decision algorithm into the U function used by the O-maximizer (which has its own issues; see my previous post).
(1) Actually, the definition requires the mind to output something before receiving input. That is a technical detail that can be safely ignored; alternatively, just always output “A” before receiving input.
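A quick sanity check of that claim (a sketch; only the ordering of the three values matters for strict comparisons, and ties are handled in the comment):

    from itertools import permutations

    # Matching the mind needs U("A") > U("B") (menu 1), U("B") > U("C") (menu 2)
    # and U("C") > U("A") (menu 3): a cycle.  Ties do not help, since with a tie
    # the O-maximizer may pick either option and so is not guaranteed to reproduce
    # the mind's deterministic choices.  It therefore suffices to check strict rankings.
    def forces_minds_choices(u):
        return u["A"] > u["B"] and u["B"] > u["C"] and u["C"] > u["A"]

    for ranking in permutations((1, 2, 3)):
        u = dict(zip("ABC", ranking))
        assert not forces_minds_choices(u)
    print("no ranking of A, B and C reproduces the mind's behaviour")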
I argue that there is no assignment to U(“A”), U(“B”) and U(“C”) that causes an O-maximizer to produce the same output as the algorithm above.
...but the domain of a utility function surely includes sensory inputs and remembered past experiences (the state of the agent). You are trying to assign utilities to outputs.
If you try to do that, you can’t even encode absolutely elementary preferences with a utility function, such as: I’ve just eaten a peanut butter sandwich, so I would prefer a jam one next.
If that is the only type of utility function you are considering, it is no surprise that you can’t get the theory to work.
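As a toy illustration of what I mean, a utility whose domain includes the agent’s state as well as its output (a sketch; the names and numbers are purely illustrative):

    def sandwich_utility(last_meal, next_meal):
        """Toy utility over (state, choice) pairs: having just eaten a peanut
        butter sandwich, the agent prefers a jam one next.  A utility over
        outputs alone cannot express this, because the preferred sandwich
        depends on the remembered state."""
        if last_meal == "peanut butter" and next_meal == "jam":
            return 1.0
        return 0.0  # all other combinations are (arbitrarily) ranked lower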