I expect (but again, this is a gut feeling) that the version that works is: as the complexity of the environment increases, the set of non-dominated strategies concentrates around utility maximization.
By the complexity of the environment I mean how restricted the relationship is between actions and outcomes, and between actions and rewards. The number of possible actions and outcomes could matter in principle, but it is always possible to encode them in binary and reduce the question back to restrictions on the relationship.
Concentration around utility maximization would, I think, be quantified as regret if you are willing to fix a global utility, or else via some scoring based on rankings in the games.
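To pin down the regret option: the definition I have in mind is the usual gap to the best strategy under the fixed utility (taking expectations, and a sup over strategies, is my choice of normalization):

```latex
% Regret of strategy s in game G, relative to a fixed global utility U.
% (U is given from outside; this is the standard best-response-gap definition.)
\mathrm{Regret}_G(s) \;=\; \sup_{s'} \mathbb{E}\left[U(s')\right] \;-\; \mathbb{E}\left[U(s)\right]
```

Concentration would then mean that every non-dominated strategy has small regret.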
I see. If we were to make this formal it would depend on the notion of “complexity” we use.
Notably, it seems intuitive that there should be counterexample games that pump complexity out of thin air by adding rules and restrictions that do not really matter. So “just” adding a simple complexity threshold would certainly not work, for most notions of complexity.
Maybe it is true that “the higher the complexity, the larger the portion of non-dominated strategies that are utility maximisations”. But:
The set of strategies is often infinite, so the notion of “portion” depends on a measure function.
And that kind of result is already much weaker than the “coherence result” I had come to expect from reading various sources.
Interesting idea anyway; it seems to require quite a bit more work.
Notably, it seems intuitive that there should be counterexample games that pump complexity out of thin air by adding rules and restrictions that do not really matter. So “just” adding a simple complexity threshold would certainly not work, for most notions of complexity.
I have the suspicion that you read “more complexity” as meaning “more restrictions”, while I meant the opposite (I realize I didn’t express myself clearly). Is that the case?
My intuition is that you can find strategies that work in some restricted, simple games and are not utility maximization; but if you can’t assume much about what’s going on, if anything could happen, then the utility maximizer will find some way to squeeze out more points; or, if your strategy is about not being adversarially pumped, the adversary will find a way to bypass your simple anti-pumping scheme.
In EJT’s post, I remember that the main concrete point was that being stubborn, in this sense:
if I previously turned down some option X, I will not choose any option that I strictly disprefer to X
protects you from being pumped. There is real-life evidence that this strategy has merit (humans seem to do it), but it only takes you so far, right? Someone smart enough to predict how you think may find a way to turn your stubbornness against you, or you may miss out on opportunities.
Concrete example: you turn down the first suitor, all the subsequent ones look worse, you refuse them because you read EJT’s post, and you end up a spinster. Now, if due to incompleteness you don’t compare spinsterhood with possible marriages, then you are not being dominated. So I realize I missed an important point in my idea: you can’t just increase the environmental complexity in the sense of lifting restrictions on actions and outcomes; you also need to put restrictions on preference complexity, otherwise you can always add branches as needed so that you are not comparing the outcomes that could make you look worse off.
So my pre-formal statement becomes: a non-dominated strategy for a preference tree compact enough compared to the world it applies to will be approximately a utility maximizer.
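If I had to gesture at a formal shape for this (notation mine, with K(·) a placeholder for whatever complexity measure we settle on):

```latex
% Pre-formal sketch only: \preceq is the (partial) preference relation,
% W the world/game it applies to, s a strategy. "Approximate utility
% maximizer" is deliberately left vague at this stage.
K(\preceq) \ll K(W) \;\text{ and }\; s \text{ non-dominated in } W
\;\Longrightarrow\; s \text{ is an approximate utility maximizer}
```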
In EJT’s post, I remember that the main concrete point was that being stubborn, in this sense:
if I previously turned down some option X, I will not choose any option that I strictly disprefer to X
To my understanding, that was a good counter to the idea that anything that is not a utility maximisation is vulnerable to money pumps in a specific kind of game.
But that is restricted to “decision tree” games in which, on every turn but the first, you have an “active outcome” which you know you can keep until the end if you wish. Each turn you can decide to change that active outcome or to keep it. These games are interesting for discussing Dutch book vulnerability, but they are still quite specific. Most games are not like that.
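To make the setup concrete, here is a toy sketch of such a game together with the stubbornness rule (a minimal construction of my own, not anything formal from EJT’s post):

```python
# Toy "decision tree" game: each turn you hold an "active outcome" and may
# swap it for the offered one. Incomplete preferences are an explicit set of
# strict-preference pairs; all other pairs are incomparable.
STRICT_PREF = {("A", "A-")}  # "A" and "B" are deliberately incomparable

def strictly_prefers(a, b):
    return (a, b) in STRICT_PREF

def stubborn_play(initial, offers):
    """Swap the active outcome for each offer, unless the offer is strictly
    dispreferred to some previously turned-down option (the stubbornness rule)."""
    active, turned_down = initial, []
    for offer in offers:
        if any(strictly_prefers(x, offer) for x in turned_down):
            turned_down.append(offer)   # refuse: the stubbornness rule fires
        else:
            turned_down.append(active)  # toy tie-break: accept whenever allowed
            active = offer
    return active

# Attempted pump: start at A, trade to the incomparable B, then get offered
# A-, strictly worse than the A we already turned down. The rule refuses it.
print(stubborn_play("A", ["B", "A-"]))  # -> "B": the cycle A -> B -> A- fails
```

Without the rule, an agent willing to trade among incomparables could be walked from A to B to A- and end up strictly worse off than where it started; the rule breaks exactly that cycle, while leaving every other choice arbitrary.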
On a related note:
a non-dominated strategy for a preference tree compact enough compared to the world it applies to will be approximately a utility maximizer
I think I didn’t understand what you mean by “preference tree” here. Is it just a partial order relation (preference) on outcomes?
If you mean “for a case in which the complexity of the preference ordering is small compared to that of the rest of the game”, then I disagree.
The counterexample games above could certainly scale to a high complexity of rules without any change to the (very simple) preference ordering.
The closest I could come to your statement in my vocabulary above is:
For some value k, if the ratio “complexity of the outcome preference” / “complexity of the total game” is less than k, then any non-dominated strategy is (approximately) a utility maximiser.
Is this faithful enough?
Yes, I mean that.
I have the suspicion that you read “more complexity” as meaning “more restrictions”, while I meant the opposite (I realize I didn’t express myself clearly). Is that the case?
Yup.
My intuition for the idea of complexity is something like “the minimal number of characters it takes to implement this game in Python”. The flaw is that this assumes computable games, which is not in line with the general formulation of the conjecture I used, so that definition does not work. But that’s roughly what I think of when I read “complexity”. Is that compatible with your meaning?
Note that this is the notion of complexity of a given game. If you mean the complexity of a class of games, then I am less certain how to define it. However, if we change the category of games we are talking about, the only clear ways I see of doing so involve weakening the conjecture by restricting it to strictly fewer games.
My intuition for the idea of complexity is something like “the minimal number of characters it takes to implement this game in Python”. The flaw is that this assumes computable games, which is not in line with the general formulation of the conjecture I used, so that definition does not work. But that’s roughly what I think of when I read “complexity”. Is that compatible with your meaning?
I’m not sure, but I think not.
Maybe phrases that are more representative of what I have in mind are “size of the game world” or “complexity of the possible strategies”. I’m trying to point at something like the real world: compact fundamental laws, but where even pursuing a simple goal can entail complicated strategies. “Minimal number of characters it takes to implement this game in Python” would be small because the “game code” part is just the laws and the reward.
Example: think about a variation of AIXI with non-complete preferences, where the reward at each time step is the additional energy accumulated in some region of the universe, and the non-completeness is due to the rewards at different times not being comparable. I expect a standard AIXI with some completed version of these preferences, such as utility = area under the energy-time curve, would reach very high incoming energy on each step, and it would be difficult to write a non-dominated strategy which tries to be non-dominated just by not accruing that much energy overall and instead moving a lot of energy at some time step. Yet it’s probably possible. But this kind of trick becomes more difficult if I restrict the number of branches you can make in the preference tree; it’s possible to be non-dominated “just” because you have many non-comparable branches and it suffices to do OK on just one branch. As I restrict the number of branches, you’ll have to do better overall.
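As a cartoon of the dominance structure here (toy numbers mine, not AIXI itself): with per-step rewards incomparable, one reward stream dominates another only if it is at least as good at every step and strictly better at some step.

```python
# Pareto dominance over per-timestep energy rewards (toy numbers).
def dominates(r1, r2):
    """r1 dominates r2 iff r1 >= r2 at every step and r1 > r2 at some step."""
    return all(a >= b for a, b in zip(r1, r2)) and \
           any(a > b for a, b in zip(r1, r2))

maximizer = [9, 9, 9, 9]   # pumps high energy at every step
spiker    = [0, 0, 20, 0]  # does well at a single step only

print(dominates(maximizer, spiker))  # False: step 3 saves the spiker
print(dominates(spiker, maximizer))  # False: so the spiker is non-dominated
```

Restricting the number of incomparable branches shrinks the room for this kind of escape.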
Weird. I didn’t expect this to be wrong and I did not expect the other one to be right. Glad I asked.
“Minimal number of characters it takes to implement this game in Python” would be small because the “game code” part is just the laws and the reward.
Not so sure about that. The game has to describe and model “everything” about the situation. So if you want to describe interaction with the details of a “real world”, then you also need a complete simulation of said real world.
While everything is “contained” in the reward function, it is not as if the reward function can be computed independently of “what happens” in the game.
It is, however, true that you only need to compute the most minimal version of the game relevant to the outcome. So if your game contains a lot of “pointless” rules that do nothing, they can be safely ignored when computing the complexity of the game. I think that’s normal.
In the case of the real world, even restricting it to a bounded precision, the program would need to be very long. It is not just a description of the sentence “you win if there are a lot of diamonds in the world” (or whatever the goal is). It is also a complete “simulation” of the world.
Btw, the notion I was alluding to is Kolmogorov complexity.
and it would be difficult to write a non-dominated strategy which tries to be non-dominated just by not accruing that much energy overall and instead moving a lot of energy at some time step. Yet it’s probably possible [...]
Depending on the exact parameters, an intuitive strategy that is not dominated, but not optimal long term either, could be “never invest anything”, in which you value the present so much that you never move, because moving would cost energy.
Note that this strategy is still “a” utility maximisation (just value each step sufficiently more than the next). But it is very bad for the utility you described.
But this kind of trick becomes more difficult if I restrict the number of branches you can make in the preference tree; it’s possible to be non-dominated “just” because you have many non-comparable branches and it suffices to do OK on just one branch. As I restrict the number of branches, you’ll have to do better overall.
I still get the feeling that your notion of preference tree is not equivalent to my own concept of a partial order on the set of outcomes. Could you clarify?
I think simulating the real world requires a lot of memory and computation, not a large program. (Not that I know the program.) Kolmogorov complexity does not put restrictions on the computation. Think about Conway’s Game of Life.
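(For reference, the definition I’m leaning on: with U a fixed universal machine,

```latex
K(x) \;=\; \min \{\, |p| \;:\; U(p) = x \,\}
```

which charges for program length only, not for time or memory used.)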
Depending on the exact parameters, an intuitive strategy that is not dominated, but not optimal long term either, could be “never invest anything”
It’s not obvious to me that “do nothing” is not dominated. Even if an active agent didn’t start pumping energy into the designated region right away, working on some master plan instead, the energy in the region would, while the active agent is somewhere else, be the same as if the agent were the null agent, until the active agent started doing something. Then, as long as the active agent never passes through some small phase where energy is pumped out of the region for some reason, it dominates the null agent.
You seem to interpret my example as if the energy “in the battery of the agent” counted, such that not moving can’t be dominated. I said “energy accumulated in some region of the universe” to avoid this kind of thing. Anyway, the point of the example is not to show a completely general property, but to point at things which have the property, so I expect you to patch up yourself any specification loopholes that would make the example fail, unless of course you thought the example was broken in a fundamental way.
I still get the feeling that your notion of preference tree is not equivalent to my own concept of a partial order on the set of outcomes. Could you clarify?
Sorry, my choice of expression was confusing. I was thinking of a directed acyclic graph representing the order, and called that a “tree”, but indeed the standard definition of a tree is acyclic without orientation, so the skeleton of a DAG does not qualify in general. A minimal representation of a total order would be a chain, while a partial order has to contain “parallel branches”.
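In code, the picture I had in mind looks something like this (example relation mine): edges mean “strictly better than”, and two outcomes are comparable iff one is reachable from the other.

```python
# A partial order as a DAG of "strictly better than" edges (example mine).
better_than = {"top": {"left", "right"}, "left": {"bottom"}, "right": {"bottom"}}

def reachable(g, a, b, seen=frozenset()):
    """True iff b can be reached from a by following edges of g."""
    return any(c == b or (c not in seen and reachable(g, c, b, seen | {c}))
               for c in g.get(a, ()))

def comparable(a, b):
    return reachable(better_than, a, b) or reachable(better_than, b, a)

print(comparable("top", "bottom"))  # True: a chain runs through the DAG
print(comparable("left", "right"))  # False: parallel, incomparable branches
```

A total order would make the DAG a single chain; incompleteness shows up as the parallel branches.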
You seem to interpret my example as if the energy “in the battery of the agent” counted, such that not moving can’t be dominated. I said “energy accumulated in some region of the universe” to avoid this kind of thing. Anyway, the point of the example is not to show a completely general property, but to point at things which have the property, so I expect you to patch up yourself any specification loopholes that would make the example fail, unless of course you thought the example was broken in a fundamental way.
Sure. I agree counterexamples that rely on a small specification flaw are not relevant to your point.
I don’t know if that class of examples works.
My intuition is somewhat that there will be non-dominated strategies that are not utility maximisation “by default” in that sort of game. At least if we only look at utilities that are weighted sums of the energy at various points in time.
On the whole and in general, it is still not intuitive to me whether utility maximisation becomes ubiquitous when the “complexity” ratio you define goes down.
Sorry for only answering now; I was quite busy these last few days.
I think simulating the real world requires a lot of memory and computation, not a large program. (Not that I know the program.) Kolmogorov complexity does not put restrictions on the computation. Think about Conway’s Game of Life.
You also need a specification of the initial state, which dramatically increases the size of the program!
Because the formalism only requires Turing machines (or any equivalent computation formalism), there is no distinction between the representation of the “rules” and the rest of the necessary data.
So even if the rules of physics themselves are very simple (like in the Game of Life), the program that simulates the world is very big. It probably requires something like “position of every atom at step t”.
Sorry, my choice of expression was confusing. I was thinking of a directed acyclic graph representing the order, and called that a “tree”, but indeed the standard definition of a tree is acyclic without orientation, so the skeleton of a DAG does not qualify in general. A minimal representation of a total order would be a chain, while a partial order has to contain “parallel branches”.
OK, thank you. I will keep reading those as order relations.
You also need a specification of the initial state, which dramatically increases the size of the program! Because the formalism only requires Turing machines (or any equivalent computation formalism), there is no distinction between the representation of the “rules” and the rest of the necessary data. So even if the rules of physics themselves are very simple (like in the Game of Life), the program that simulates the world is very big. It probably requires something like “position of every atom at step t”.
The initial conditions can be simple. Do you expect that the real universe requires very specific boundary conditions?
In the Game of Life it is necessary to set up specific initial configurations for something interesting to happen, but those configurations can be computed by a program much smaller than the number of cells to set (e.g., repetitive patterns). It is not necessary to explicitly lay out the values of all the cells.
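For instance, a sketch (pattern choice and numbers mine) of how a short program can emit an initial configuration far larger than itself:

```python
# A few lines of code expanding into a large Game of Life initial state.
GLIDER = [(1, 0), (2, 1), (0, 2), (1, 2), (2, 2)]  # the classic 5-cell glider

def tiled_gliders(copies, spacing=10):
    """Tile `copies` x `copies` gliders over an otherwise empty grid,
    returning the set of live-cell coordinates."""
    return {(x + i * spacing, y + j * spacing)
            for i in range(copies)
            for j in range(copies)
            for (x, y) in GLIDER}

print(len(tiled_gliders(100)))  # 50000 live cells from a tiny description
```

The description length scales with the code, not with the number of live cells.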
Let’s put aside the distinction between initial conditions and rules; I think it is just a distraction at this point.
In general, I would even expect a complete simulation of the universe to be non-computable, i.e., I expect that the universe contains an infinite amount of information.
If we bound the problem to some finite region of time and space, then I expect, just as an intuition, that a complete simulation would require a lot of information. That is, the minimal Turing machine / Python code that consistently outputs the same result as the simulation for each input is very long.
I do not have a good number to give that translates this intuition of “very long”.
Let’s say that simulating the Earth during the last 10 days would take multiple millions of terabits of data?
Of course the problem is underspecified. We also need to specify what the legal inputs are.
Anyway, do you agree with this intuition?
I think that simulating a specific slice of spacetime is more Kolmogorov-complex than simulating the entire universe.