I am starting to see what you mean. Let’s stick with utility functions over histories of length m_k (whole sequences) like you proposed and denote them with a capital U to distinguish them from the prefix utilities.
I think your Agent 4 runs into the following problem:
modeled_action(n,m) actually depends on the actions and observations yx_{k:m-1} and needs to be calculated for each combination, so y_m is actually [...], which clutters up the notation so much that I don’t want to write it down anymore.
We also get into trouble with taking the expectation: the observations x_{k+1:n} are only considered in modeling the actions of the future agents, but not now. What is M(yx_{<k}, yx_{k:n}) even supposed to mean? Where do the x’s come from?
so y_m is actually [...] which clutters up the notation so much that I don’t want to write it down anymore.
Yes.
We also get into trouble with taking the expectation: the observations x_{k+1:n} are only considered in modeling the actions of the future agents, but not now. What is M(yx_{<k}, yx_{k:n}) even supposed to mean? Where do the x’s come from?
Oops, you are right. The sum should have been over x_{k:n}, not just over x_k.
So let’s torture some indices: [...]
Yes, that is a cleaner and actually correct version of what I was trying to describe. Thanks.
So let’s torture some indices:
[...] = \arg\max_{y_n} \sum_{x_{n:m_k}} U_n(yx_{1:n}\,\hat{y}_{n+1,k}(yx_{1:n})\,x_{n+1} \dots x_{m_k}) \, M(\dot{y}\dot{x}_{<k}\,yx_{k:n-1}\,\hat{y}\underline{x}_{n:m_k})
where n \geq k and [...]
This is not really AIXI anymore and I am not sure what to do with it, but I like it.
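For what it is worth, here is a minimal Python sketch of the kind of recursion the formula describes: the agent at step n picks y_n to maximize its own whole-history utility U_n, the later actions are filled in by the modeled later agents (each maximizing its own utility), and the observation sequences are summed over with their probabilities. Everything concrete here is an assumption made for illustration and not from the thread: binary actions and observations, a horizon of 2 standing in for m_k, k = 1 with an empty fixed past, a hypothetical `env_prob` that plays the role of M factored into per-step conditionals, and the two toy utilities.

```python
from itertools import product

ACTIONS = (0, 1)
OBSERVATIONS = (0, 1)
HORIZON = 2  # plays the role of m_k; toy value


def env_prob(history, action, obs):
    """Hypothetical stand-in for M: P(obs | history, action).

    Toy assumption: the observation echoes the action with probability 0.9.
    """
    return 0.9 if obs == action else 0.1


# Toy whole-history utilities U_n, one per agent (assumptions, not from the thread).
# A history is a tuple of (action, observation) pairs of length HORIZON.
UTILITIES = {
    1: lambda h: 1.0 if h[0][0] != h[1][0] else 0.0,  # agent 1 wants y_1 != y_2
    2: lambda h: 1.0 if h[1][0] == 0 else 0.0,        # agent 2 wants y_2 == 0
}


def modeled_action(n, history):
    """Action of the agent at step n, given the history of the first n-1 steps."""
    best_action, best_value = None, float("-inf")
    for y_n in ACTIONS:
        value = 0.0
        # Sum over all observation sequences x_{n:HORIZON}.
        for xs in product(OBSERVATIONS, repeat=HORIZON - n + 1):
            h, prob, action = list(history), 1.0, y_n
            for step, x in zip(range(n, HORIZON + 1), xs):
                prob *= env_prob(tuple(h), action, x)
                h.append((action, x))
                if step < HORIZON:
                    # Later actions come from the modeled later agent, which
                    # maximizes its own utility U_{step+1}, not U_n.
                    action = modeled_action(step + 1, tuple(h))
            value += prob * UTILITIES[n](tuple(h))
        if value > best_value:
            best_action, best_value = y_n, value
    return best_action


first_action = modeled_action(1, ())
```

In this toy setup `first_action` is 1: agent 1 anticipates that agent 2 will choose 0 whatever it observes, so the only way to make the two actions differ is to choose 1 now. The cost is exponential in the horizon, which is fine for a sketch but nothing more.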