Anja comments on Universal agents and utility functions

Anja 17 Nov 2012 22:03 UTC
0 points

First, replace the action-perception sequence with an action-perception-utility sequence u1,y1,x1,u2,y2,x2,etc.

This seems unnecessary. The information u_i is already contained in x_i.

modeled_action(n, k) = argmax(y_k) uk(yx\<k, yx_k:n)*M(uyx_<k, uyx_k:n)

This completely breaks the expectimax principle. I assume you actually mean something like
=\textrm{arg}\max_{y_k}\sum_{x_k}u_k(\.{y}\.{x}_{%3Ck}y\underline{x}_{k:n})M(\.{y}\.{x}_{%3Ck}y\underline{x}_{k:n}))

which is just Agent 2 in disguise.
- AlexMennen 18 Nov 2012 2:10 UTC
  0 points
  Parent
  Oops. Yes, that’s what I meant. But it is not the same as Agent 2, because this (Agent 4?) uses its current utility function to evaluate the desirability of future observations and actions, even though it knows that it will use a different utility function to choose between them later. For example, Agent 4 will not take the Simpleton’s Gambit because it cares about its current utility function getting satisfied in the future, not about its future utility function getting satisfied in the future.
  
  Agent 4 can be seen as a set of agents, one for each possible utility function, that are using game theory with each other.
  - Anja 18 Nov 2012 3:06 UTC
    0 points
    Parent
    I second the general sentiment that it would be good for an agent to have these traits, but if I follow your equations I end up with Agent 2.
    - AlexMennen 18 Nov 2012 20:11 UTC
      3 points
      Parent
      No, you don’t. If you tried to represent Agent 2 in that notation, you would get
      
      modeled_action(n, k) = argmax(y_k) sum(x_k) [u_k(yx_k.
      
      You were using u_k to represent the utility of the last step of its input, so that total utility is the sum of the utilities of its prefixes, while I was using u_k to represent the utility of the whole sequence. If I adapt Agent 4 to your use of u_k, I get
      
      modeled_action(n, k) = argmax(y_k) sum(x_k) [u_k(yx_k.
      - Anja 19 Nov 2012 4:26 UTC
        4 points
        Parent
        I am starting to see what you mean. Let’s stick with utility functions over histories of length m_k (whole sequences) like you proposed and denote them with a capital U to distinguish them from the prefix utilities. I think your Agent 4 runs into the following problem: modeled_action(n,m) actually depends on the actions and observations yx_{k:m-1} and needs to be calculated for each combination, so y_m is actually
        
        $y\_m\(\\\.\{y\}\\\.\{x\}\_\{<k\}y\\underline\{x\}\_\{k:m\-1\}\$
        )
        
        which clutters up the notation so much that I don’t want to write it down anymore.
        
        We also get into trouble with taking the expectation, the observations x_{k+1:n} are only considered in modeling the actions of the future agents, but not now. What is M(yx_<k,yx_k:n) even supposed to mean, where do the x’s come from?
        
        So let’s torture some indices:
        
        =\textrm{arg}\max_{y_n}\sum_{x_{n:m_k}}U_n(yx_{1:n}\hat{y}_{n+1,k}(yx_{1:n})x_{n+1}\dots)
        
        x_{m_k})M(\.{y}\.{x}_{%3Ck}yx_{k:n-1}\hat{y}\underline{x}_{n:m_k}))
        
        where n>=k and
        $\\\.\{y\}\_k=\\hat\{y\}\_\{k,k\}\.$
        
        This is not really AIXI anymore and I am not sure what to do with it, but I like it.
        AlexMennen 19 Nov 2012 5:03 UTC
        2 points
        Parent
        
        so y_m is actually [...] which clutters up the notation so much that I don’t want to write it down anymore.
        
        Yes.
        
        We also get into trouble with taking the expectation, the observations x{k+1:n} are only considered in modeling the actions of the future agents, but not now. What is M(yx<k,yx_k:n) even supposed to mean, where do the x’s come from?
        
        Oops, you are right. The sum should have been over x_{k:n}, not just over x_k.
        
        So let’s torture some indices: [...]
        
        Yes, that is a cleaner and actually correct version what I was trying to describe. Thanks.