Can you describe this class of games precisely? And can you define the solution precisely, without referring to our algorithms?
I can describe a class of such games and its solution, but I’d hope that if a good decision-theoretic agent exists that solves this class, it might also solve some wider class of problems in an intuitively correct way. -- That said, the class is symmetric games, and the solution is the strategy profile that yields the highest utility among those strategy profiles in which all players choose the same strategy. [More precisely, a good algorithm should play this strategy when playing against itself or a similar algorithm, should maximize its utility when playing against an opponent that unconditionally plays some given strategy, and should “do something sensible” in other cases.]
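A minimal sketch of this solution concept, assuming the game is given as a payoff function over pure strategy pairs (the Stag Hunt payoffs below are illustrative, not from the discussion):

```python
# Proposed solution concept for symmetric games: among profiles where all
# players use the same strategy, pick the one with the highest common payoff.
def symmetric_solution(strategies, u):
    """Return the strategy s maximizing u(s, s)."""
    return max(strategies, key=lambda s: u(s, s))

# Illustrative symmetric 2x2 game (a Stag Hunt).
payoffs = {
    ("stag", "stag"): 4, ("stag", "hare"): 0,
    ("hare", "stag"): 3, ("hare", "hare"): 3,
}
u = lambda mine, theirs: payoffs[(mine, theirs)]

assert symmetric_solution(["stag", "hare"], u) == "stag"
```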
The quining-cooperator algorithm in the Prisoner’s Dilemma forms a “Nash equilibrium in algorithms” against itself. This is a very desirable property to have, which roughly corresponds to “reflective consistency” in single-player decision theory. However, the algorithm you’re trying to develop (which includes maximizing utility against an opponent that plays a strategy unconditionally) will not form a “Nash equilibrium in algorithms” against itself, even in symmetric games.
To see this, consider a simple bargaining game, a variant of “dividing the dollar”. Each player sees the other’s source code and outputs an integer. If the sum of the two integers is less than or equal to 10, both players get the amount in dollars that they asked for. If the sum is greater than 10, both get nothing. Your algorithm will output 5 when playing against itself. But this means it’s not a best reply to itself: if your opponent is replaced with the simple program “return 9”, your algorithm maximizes utility by outputting 1, so the opponent collects $9 by submitting “return 9” rather than the $5 it would get by submitting your algorithm.
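The failure can be checked directly; here is a minimal sketch (restricting demands to the integers 0 through 10 is an assumption for brevity):

```python
# Demand game: each player gets their demand if the demands sum to at most 10.
def payoff(my_demand, their_demand):
    return my_demand if my_demand + their_demand <= 10 else 0

# Utility maximization against a known, unconditional demand.
def best_response(their_demand):
    return max(range(11), key=lambda d: payoff(d, their_demand))

# The proposed algorithm outputs 5 against itself: $5 each.
assert payoff(5, 5) == 5
# Against a "return 9" rock it best-responds with 1 and gets only $1,
# so the rock earns $9 > $5: in the space of programs, submitting the
# algorithm is not a best reply to the algorithm.
assert best_response(9) == 1
assert payoff(9, 1) == 9
```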
On reflection: I agree that “Nash equilibrium in algorithms” would be desirable, but you seem to find it a more knock-out argument than I do. If we’re discussing what kinds of algorithms to enter into something like Axelrod’s tournament, then clearly, Nash equilibrium in algorithms would be a very compelling property. But if world() really is (a utility function over) the whole universe, then it’s less clear to me that “take 1” is not the right thing to do when encountering a “take 9” rock.
Intuitively, the reason you wouldn’t want to “take 1” in this case is that you would not want to give someone else—Clippy, say—a motive to leave “take 9” rocks around for you to find. But there’s the counter-intuition that:
(a) If Clippy does this as a result of reasoning about how you behave, then that’s in fact a different situation (the assumption being that you’re reasoning about the source of the whole world, not just about the source of the “take 9” rock in front of you). What you do in this situation influences whether Clippy will have left behind a “take 9” rock, so your decision algorithm should not be able to conclude that you maximize utility by “taking 1.” [I’m assuming a UDT-like agent, which one can think of as taking the source of the world and returning the actions to perform in different contingencies; so it does not make sense to say that we cannot influence what Clippy does just because our sensory input tells us that Clippy has “already” left behind a “take 9” rock.]
(b) If Clippy does not do this as a result of actually reasoning about how you behave, then if you “take 5” (say), Clippy will still have left a “take 9” rock behind—by assumption, there’s no logical causality from what you do to what Clippy does.
I can’t formalize this argument, and I’m not sure that (b) especially is the right way of thinking about this problem, but this counter-intuition seems at least as compelling to me as the original intuition. Am I missing something?
-- All this said, I no longer think that the idea behind my proposed algorithm is useful. The problem, as far as I can see, is that if agent 1 runs “do X unless I can prove it’s better to do Y” and agent 2 runs “do A unless I can prove it’s better to do B,” then the agents won’t play (X,B) or (Y,A) even if one of those would be the most desirable outcome for both of them—or at least I can’t prove that these outcomes are possible.
“Play minimax unless I can prove the other strategy is better” doesn’t exactly run into this problem in symmetric games as long as the two available strategies have different minimax values, but what I’m really after is a decision theory that would act correctly if it finds such a game embedded in the world, and this does seem to run into the problem described above.
Hello again.
User Perplexed pointed me to Nash’s 1953 paper “Two-Person Cooperative Games”, which seems to give a unique “fair” solution for all two-person competitive games in our setting. I’ve been thinking about how to spin the algorithm in the paper into a flavor of utility maximization, but failing so far.
The key is to understand that when two players form a coalition they create a new entity distinct from its members which has its own revealed preferences and its own utility to maximize. That is, you need to step up a level and imagine each member of a coalition as making a deal with the coalition itself, rather than simply dealing with the other members. In joining a coalition, a player gives up some or all of his decision-making power in exchange for a promise of security.
Define the coalescence utility for player A as the amount of extra utility A gets because the pair plays the cooperation game rather than the threat game. If you wish, think of this as compensation paid to A by the collective in exchange for his cooperation in the objectives of the collective. Define the utility of cooperation to the ‘collective’ (i.e. to the union of A and B, considered as an entity at a different level than A and B themselves) as the sum of the logarithms of the coalescence utilities of A and B. The collective then maximizes its own utility by maximizing the product of the coalescence utilities (which is the same as maximizing the sum of the logs).
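A toy numeric sketch of this product-of-coalescence-utilities rule (the split-the-dollar numbers and the integer grid are illustrative assumptions, not from Nash’s paper):

```python
# Split `total` dollars between A and B, given the payoff each would get in
# the threat game. Coalescence utility = gain over the threat outcome; the
# collective maximizes the product of those gains.
def nash_split(total, threat_a, threat_b):
    candidates = [
        (a, total - a)
        for a in range(total + 1)
        if a > threat_a and total - a > threat_b  # both must strictly gain
    ]
    return max(candidates, key=lambda s: (s[0] - threat_a) * (s[1] - threat_b))

assert nash_split(10, 0, 0) == (5, 5)  # symmetric threats: even split
assert nash_split(10, 2, 0) == (6, 4)  # a better threat point shifts the split
```

Maximizing the sum of logs picks the same split, since log is monotone and the logs are defined exactly when both gains are positive.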
This approach of taking the logarithm of individual utilities to get components of corporate or collective utility extends nicely to coalitions with more than 2 members. For some analogies in other (biological) kinds of optimization processes where taking the logarithm is the trick that makes it all work, see my favorite paper ever or this classic paper from Dick Lewontin:
R. C. Lewontin and D. Cohen. On population growth in a randomly varying environment. Proceedings of the National Academy of Sciences, 62:1056–1060, 1969.
True; thanks for pointing that out.