Anja comments on Universal agents and utility functions

Anja 18 Nov 2012 3:06 UTC
0 points
I second the general sentiment that it would be good for an agent to have these traits, but if I follow your equations I end up with Agent 2.
- AlexMennen 18 Nov 2012 20:11 UTC
  3 points
  Parent
  No, you don’t. If you tried to represent Agent 2 in that notation, you would get
  
  modeled_action(n, k) = argmax(y_k) sum(x_k) [u_k(yx_k.
  
  You were using u_k to represent the utility of the last step of its input, so that total utility is the sum of the utilities of its prefixes, while I was using u_k to represent the utility of the whole sequence. If I adapt Agent 4 to your use of u_k, I get
  
  modeled_action(n, k) = argmax(y_k) sum(x_k) [u_k(yx_k.
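
  The difference between the two conventions can be made concrete with a minimal Python sketch. The names `step_utility`, `total_from_step_utilities`, and `whole_sequence_utility`, and the toy encoding of a history as (action, observation) pairs in which the observation doubles as a reward, are illustrative assumptions and not part of either agent's definition.

  ```python
  def step_utility(prefix):
      # Per-step convention: u looks only at the last step of its input.
      # Assumption for illustration: the observation is itself the reward.
      _action, observation = prefix[-1]
      return observation


  def total_from_step_utilities(u, history):
      # Total utility is the sum of u over every non-empty prefix of the history.
      return sum(u(history[:i]) for i in range(1, len(history) + 1))


  def whole_sequence_utility(history):
      # Whole-sequence convention: one function scores the complete history.
      # Here it is defined so that it agrees with the per-step convention.
      return total_from_step_utilities(step_utility, history)


  history = [("a", 1.0), ("b", 0.5), ("a", 2.0)]
  total = whole_sequence_utility(history)
  ```

  Under these assumptions a whole-sequence utility can always be built from per-step utilities by summing over prefixes, which is why a formula has to be rewritten when moving from one reading of u_k to the other.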
  - Anja 19 Nov 2012 4:26 UTC
    4 points
    Parent
    I am starting to see what you mean. Let’s stick with utility functions over histories of length m_k (whole sequences) like you proposed and denote them with a capital U to distinguish them from the prefix utilities. I think your Agent 4 runs into the following problem: modeled_action(n,m) actually depends on the actions and observations yx_{k:m-1} and needs to be calculated for each combination, so y_m is actually
    
    $y_m(\dot{y}\dot{x}_{<k}\,y\underline{x}_{k:m-1})$
    
    which clutters up the notation so much that I don’t want to write it down anymore.
    
    We also get into trouble with taking the expectation, the observations x_{k+1:n} are only considered in modeling the actions of the future agents, but not now. What is M(yx_<k,yx_k:n) even supposed to mean, where do the x’s come from?
    
    So let’s torture some indices:
    
    $$= \arg\max_{y_n} \sum_{x_{n:m_k}} U_n\big(yx_{1:n}\,\hat{y}_{n+1,k}(yx_{1:n})\,x_{n+1} \dots x_{m_k}\big)\, M\big(\dot{y}\dot{x}_{<k}\,yx_{k:n-1}\,\hat{y}\underline{x}_{n:m_k}\big)$$
    
    where n>=k and
    $\dot{y}_k=\hat{y}_{k,k}$.
    
    This is not really AIXI anymore and I am not sure what to do with it, but I like it.
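
    As a rough illustration of the recursion in the expression above, here is a brute-force Python sketch for tiny finite action and observation sets. Everything in it is an assumption made for the example: the function name `modeled_action`, the calling convention `U(n, history)` for the utility used at step n, a measure `M(history)` that returns the probability of the observations given the actions, and the toy `toy_M` and `toy_U`. It is not AIXI and uses no universal prior.

    ```python
    from itertools import product


    def modeled_action(history, n, m, actions, observations, U, M):
        """Action modelled for step n, given the (action, observation) pairs of steps < n."""
        best_action, best_value = None, float("-inf")
        for y in actions:
            value = 0.0
            # Sum over every observation sequence x_n, ..., x_m.
            for xs in product(observations, repeat=m - n + 1):
                h = list(history) + [(y, xs[0])]
                # Later actions are not free variables: they come from the
                # modelled policy, recomputed for each extended history.
                for step, x in enumerate(xs[1:], start=n + 1):
                    y_next = modeled_action(h, step, m, actions, observations, U, M)
                    h.append((y_next, x))
                value += U(n, h) * M(h)
            if value > best_value:
                best_action, best_value = y, value
        return best_action


    def toy_M(history):
        # Toy environment: the observation copies the action with probability 0.9.
        p = 1.0
        for y, x in history:
            p *= 0.9 if x == y else 0.1
        return p


    def toy_U(n, history):
        # Toy whole-sequence utility, identical at every step: number of 1-observations.
        return sum(x for _, x in history)


    first_action = modeled_action([], 1, 2, (0, 1), (0, 1), toy_U, toy_M)
    ```

    Because each future action is recomputed for every combination of earlier actions and observations, the cost grows exponentially with the horizon, which mirrors the notational clutter noted earlier in this comment.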
    - AlexMennen 19 Nov 2012 5:03 UTC
      2 points
      Parent
      
      so y_m is actually [...] which clutters up the notation so much that I don’t want to write it down anymore.
      
      Yes.
      
      We also get into trouble with taking the expectation, the observations x_{k+1:n} are only considered in modeling the actions of the future agents, but not now. What is M(yx_<k,yx_k:n) even supposed to mean, where do the x’s come from?
      
      Oops, you are right. The sum should have been over x_{k:n}, not just over x_k.
      
      So let’s torture some indices: [...]
      
      Yes, that is a cleaner and actually correct version of what I was trying to describe. Thanks.