The question, abstracted and rearranged a little, is:
“For what games is there a strategy which dominates other strategies in the same computability class?”
That is potentially a profound question of pure mathematics, and maybe even a question which has been answered for important cases, but it all depends on how you define your terms. For a sufficiently liberal set of definitions, any finite game with a winning strategy would count, but I doubt that this is what you’re looking for. At the other extreme, you could stick to a definition of game, dominance, and computability class tailored specifically to your earlier post. If that’s really what you want, the best thing you can do is supply the exact definitions, and maybe someone can prove a theorem for you.
However, if the concepts aren’t quite settled in your head, and you’re interested in fruitful analogies—that is, more expansive definitions that will lead to novel results—there’s a body of existing theory you could push towards that’s potentially very interesting. Mathematically, it’s a set of correspondences between large cardinals, axiomatic proof systems, two-player games, and fast-growing functions.
In set theory, there are numerous “large cardinal axioms” (all of the form “there exists a transfinite number with a certain property”) which can be ranked hierarchically by the deductive power which they supply. The biggest large cardinal axioms are the logically inconsistent ones—their “power” is so great that they can prove X and not(X) at the same time—but they’re not very useful. However, there are many levels to the consistent part of the large cardinal hierarchy—with fun names like uncountable cardinals, inaccessible cardinals, and extendible cardinals, and there are many others—and not only do they offer an exact hierarchy of proof strength, but the theorems that become accessible at higher levels of the hierarchy can include propositions about quite practical areas of mathematics, like combinatorics and game theory. This tells you that we’re not just talking about platonically defined distinctions, but that it must be possible to define these levels in a finitary and operationally meaningful way. Of course, from a formal system perspective, the presence or absence of an axiom can make an enormous difference to the set of strings which can be validly derived from your axiom set, so in principle one can see why the addition of a semantically mysterious or dubious axiom might have big consequences for the power of a theorem prover, but one would prefer to have an exact conceptual understanding of the nature of these hierarchies.
It turns out that another manifestation of this hierarchy pertains to fast-growing functions. Apparently, one can also rank theorem provers according to the asymptotic behavior of functions that they can prove to be computable. Here we see the first tenuous hint of a connection to your question: an ordering of functions (by asymptotic growth rate), a hierarchy of notions of computability… I believe that this connection actually goes through ordinals rather than cardinals. You probably already know that the abstract theory of computability has deep connections to the theory of ordinals. I think in this case that classes of fast-growing functions can be paired up with ordinals somehow, or that the ordering of ordinals resembles the ordering of these functions. How this exactly works, how the ordinal hierarchy relates to the cardinal hierarchy relates to the function hierarchy relates to the logical strength hierarchy—I can’t tell you the details, but I know that a lot of details have been worked out, and are out there in the literature.
As for games, the connection comes partly through the ordinals. You can have a set of ordinals—defined by its upper bound (another ordinal which may not be in the set, if it’s an infinite set) - and then you can define a game using that set. For example, you might take turns trying to name the highest number from the set. For some sets, there will be a winning strategy. This is called determinacy, there are a number of different axioms of determinacy, and their proof strength can be ranked according to the large cardinal hierarchy—see Wikipedia.
I know this stuff only impressionistically, but perhaps you can see why I would think it’s potentially relevant. We can go even further and try to establish links between axiom systems and computational complexity classes, especially circuit complexity, which seems very apt for a crossover with games expressed in terms of Turing machines, because it’s an explicit and constructive model of computation; there are links with logic and the evaluation of nested quantifiers; and heading in a different direction again, you’ll end up in category theory and highly abstract algebraic notions like motives.
Part of what I’m saying is that if you choose your definitions carefully—make the right choices when you have to specify exactly what sort of analogies matter for your question—you may be able to tap into some of this huge body of concepts and results. I’m happy to expand on any aspect of this if you want; I’d be delighted to see a crossover between this mathematics and the mathematics of Solomonoff, Hutter, and their colleagues. On the other hand, if you can sense that it’s all irrelevant to your purpose and can pose your question in a rigorous way that just doesn’t go there… I might still have a go at answering it, but probably one of LW’s professional decision-theory hackers will do a better job.
Mitchell, thanks. Actually I already know pretty much all this stuff, and have long been interested in some questions about large cardinals and axiom systems. Nice writeup though!
But right now I’m just interested in “the mathematics of Hutter”, that is, how far we can extend Levin’s original theorem. If you’re asking for a rigorous formulation, here’s a pretty concise one I just thought up, not sure how much sense it makes. Let’s define a “supermartingale with bounded increments” (SWBI) as a function M on finite bit sequences S that satisfies the following conditions:
1) M(empty sequence) = 0
2) for any S, (M(S#0) + M(S#1))/2 ⇐ M(S)
3) for any S, M(S#0) - M(S) ⇐ 1 and M(S#1) - M(S) ⇐ 1
The question: does the set of all lower-semicomputable SWBIs contain an element X that dominates any other element Y up to an additive constant (which may depend on Y but doesn’t depend on S)?
Let F be the SWBI that just says F(S) = number of 0s in S—number of 1s in S. Let M be an arbitrary semicomputable SWBI. Then I don’t think M can dominate both F and -F up to an additive constant.
I have a vague, handwavy argument:
Suppose the ‘data’ is an algorithmically random sequence, which basically gives us a discrete random walk. Now a random walk must return infinitely many times to the ‘center’ (where F = 0). And in order to dominate both F and -F, when the random walk is far away from the center, M is going to have to be increasing at the rate of 1 per step away from the center. But M can’t know exactly when the random walk is going to turn around and head back home, so when it gets home, M is going to have lost about 1. (i.e. M will be about 1 less than it was last time it was at the center.) And this is going to happen infinitely many times.
But it’s still possible that some M could dominate all semicomputable SWBIs in the weaker sense of losing less than epsilon per step, for all epsilon > 0.
But it’s still possible that some M could dominate all semicomputable SWBIs in the weaker sense of losing less than epsilon per step, for all epsilon > 0.
You may already know that, but for the benefit of onlookers: multiplicative weights over F and -F is computable and kinda sorta dominates both F and -F in the weaker sense that you specified. You can make epsilon arbitrarily close to 0 by choosing a multiplier close to 1, at the cost of also incurring an additive constant loss that goes to infinity as epsilon->0. Multiplicative weights over all lower-semicomputable SWBIs dominates them all in the same sense, but it doesn’t seem to be lower-semicomputable itself, only approximable (or maybe I just failed to find a proof).
Actually, I’m Very Suspicious of that argument now because of the fact that it takes such a long time for a random walk to ‘return home’. Perhaps it’s just wrong.
Thanks! I agree it’s hard to imagine a lower-semicomputable M that would dominate both F and -F when S is algorithmically random, but a more solid proof would be nice because about half the steps in your argument aren’t intuitively obvious to me.
The question, abstracted and rearranged a little, is:
“For what games is there a strategy which dominates other strategies in the same computability class?”
That is potentially a profound question of pure mathematics, and maybe even a question which has been answered for important cases, but it all depends on how you define your terms. For a sufficiently liberal set of definitions, any finite game with a winning strategy would count, but I doubt that this is what you’re looking for. At the other extreme, you could stick to a definition of game, dominance, and computability class tailored specifically to your earlier post. If that’s really what you want, the best thing you can do is supply the exact definitions, and maybe someone can prove a theorem for you.
However, if the concepts aren’t quite settled in your head, and you’re interested in fruitful analogies—that is, more expansive definitions that will lead to novel results—there’s a body of existing theory you could push towards that’s potentially very interesting. Mathematically, it’s a set of correspondences between large cardinals, axiomatic proof systems, two-player games, and fast-growing functions.
In set theory, there are numerous “large cardinal axioms” (all of the form “there exists a transfinite number with a certain property”) which can be ranked hierarchically by the deductive power which they supply. The biggest large cardinal axioms are the logically inconsistent ones—their “power” is so great that they can prove X and not(X) at the same time—but they’re not very useful. However, there are many levels to the consistent part of the large cardinal hierarchy—with fun names like uncountable cardinals, inaccessible cardinals, and extendible cardinals, and there are many others—and not only do they offer an exact hierarchy of proof strength, but the theorems that become accessible at higher levels of the hierarchy can include propositions about quite practical areas of mathematics, like combinatorics and game theory. This tells you that we’re not just talking about platonically defined distinctions, but that it must be possible to define these levels in a finitary and operationally meaningful way. Of course, from a formal system perspective, the presence or absence of an axiom can make an enormous difference to the set of strings which can be validly derived from your axiom set, so in principle one can see why the addition of a semantically mysterious or dubious axiom might have big consequences for the power of a theorem prover, but one would prefer to have an exact conceptual understanding of the nature of these hierarchies.
It turns out that another manifestation of this hierarchy pertains to fast-growing functions. Apparently, one can also rank theorem provers according to the asymptotic behavior of functions that they can prove to be computable. Here we see the first tenuous hint of a connection to your question: an ordering of functions (by asymptotic growth rate), a hierarchy of notions of computability… I believe that this connection actually goes through ordinals rather than cardinals. You probably already know that the abstract theory of computability has deep connections to the theory of ordinals. I think in this case that classes of fast-growing functions can be paired up with ordinals somehow, or that the ordering of ordinals resembles the ordering of these functions. How this exactly works, how the ordinal hierarchy relates to the cardinal hierarchy relates to the function hierarchy relates to the logical strength hierarchy—I can’t tell you the details, but I know that a lot of details have been worked out, and are out there in the literature.
As for games, the connection comes partly through the ordinals. You can have a set of ordinals—defined by its upper bound (another ordinal which may not be in the set, if it’s an infinite set) - and then you can define a game using that set. For example, you might take turns trying to name the highest number from the set. For some sets, there will be a winning strategy. This is called determinacy, there are a number of different axioms of determinacy, and their proof strength can be ranked according to the large cardinal hierarchy—see Wikipedia.
I know this stuff only impressionistically, but perhaps you can see why I would think it’s potentially relevant. We can go even further and try to establish links between axiom systems and computational complexity classes, especially circuit complexity, which seems very apt for a crossover with games expressed in terms of Turing machines, because it’s an explicit and constructive model of computation; there are links with logic and the evaluation of nested quantifiers; and heading in a different direction again, you’ll end up in category theory and highly abstract algebraic notions like motives.
Part of what I’m saying is that if you choose your definitions carefully—make the right choices when you have to specify exactly what sort of analogies matter for your question—you may be able to tap into some of this huge body of concepts and results. I’m happy to expand on any aspect of this if you want; I’d be delighted to see a crossover between this mathematics and the mathematics of Solomonoff, Hutter, and their colleagues. On the other hand, if you can sense that it’s all irrelevant to your purpose and can pose your question in a rigorous way that just doesn’t go there… I might still have a go at answering it, but probably one of LW’s professional decision-theory hackers will do a better job.
Mitchell, thanks. Actually I already know pretty much all this stuff, and have long been interested in some questions about large cardinals and axiom systems. Nice writeup though!
But right now I’m just interested in “the mathematics of Hutter”, that is, how far we can extend Levin’s original theorem. If you’re asking for a rigorous formulation, here’s a pretty concise one I just thought up, not sure how much sense it makes. Let’s define a “supermartingale with bounded increments” (SWBI) as a function M on finite bit sequences S that satisfies the following conditions:
1) M(empty sequence) = 0
2) for any S, (M(S#0) + M(S#1))/2 ⇐ M(S)
3) for any S, M(S#0) - M(S) ⇐ 1 and M(S#1) - M(S) ⇐ 1
The question: does the set of all lower-semicomputable SWBIs contain an element X that dominates any other element Y up to an additive constant (which may depend on Y but doesn’t depend on S)?
Let F be the SWBI that just says F(S) = number of 0s in S—number of 1s in S. Let M be an arbitrary semicomputable SWBI. Then I don’t think M can dominate both F and -F up to an additive constant.
I have a vague, handwavy argument:
Suppose the ‘data’ is an algorithmically random sequence, which basically gives us a discrete random walk. Now a random walk must return infinitely many times to the ‘center’ (where F = 0). And in order to dominate both F and -F, when the random walk is far away from the center, M is going to have to be increasing at the rate of 1 per step away from the center. But M can’t know exactly when the random walk is going to turn around and head back home, so when it gets home, M is going to have lost about 1. (i.e. M will be about 1 less than it was last time it was at the center.) And this is going to happen infinitely many times.
But it’s still possible that some M could dominate all semicomputable SWBIs in the weaker sense of losing less than epsilon per step, for all epsilon > 0.
You may already know that, but for the benefit of onlookers: multiplicative weights over F and -F is computable and kinda sorta dominates both F and -F in the weaker sense that you specified. You can make epsilon arbitrarily close to 0 by choosing a multiplier close to 1, at the cost of also incurring an additive constant loss that goes to infinity as epsilon->0. Multiplicative weights over all lower-semicomputable SWBIs dominates them all in the same sense, but it doesn’t seem to be lower-semicomputable itself, only approximable (or maybe I just failed to find a proof).
Thanks! Your counterexample looks like it might work, but I’m having trouble working out a more rigorous proof.
Me too. :-/
Actually, I’m Very Suspicious of that argument now because of the fact that it takes such a long time for a random walk to ‘return home’. Perhaps it’s just wrong.
MathOverflow ROCKS! When your first experience with an online community looks like this, you know you’re gonna stay there.
Thanks! I agree it’s hard to imagine a lower-semicomputable M that would dominate both F and -F when S is algorithmically random, but a more solid proof would be nice because about half the steps in your argument aren’t intuitively obvious to me.