If we assume the strategies have a measure biased towards short strategies or use a prefix-free encoding or a subset of the strategies are implicitly something like that
Can you discuss more your assumptions on the quantilizer’s base distribution? I take it that it’s uniform?
Really all I need is that a strategy that takes n bits to specify will be performed by roughly 1 in 2^n of all random strategies. Maybe a random strategy consists of a bunch of random motions that cancel each other out, and in roughly 1 in 2^n of strategies the directed actions interleaved between these random motions add up to performing this n-bit strategy. Maybe roughly 1 in 2^n of strategies start off by typing this strategy to another computer and end with shutting yourself off, so that the remaining bits of the strategy will be ignored. A prefix-free encoding is basically like the latter situation, except that ignoring the bits after a certain point is built into the encoding rather than being an outcome of the agent's interaction with the environment.
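To make the counting concrete, here is a minimal sketch of that argument under an illustrative model (the fixed-length bit-string strategies and the fraction_matching_prefix helper are my own assumptions, not anything specified by the quantilizer setup): sample uniformly random bit strings and estimate how many begin with a given n-bit prefix, treating the remaining bits as ignored. The fraction should come out near 2^-n.

```python
import random

def fraction_matching_prefix(target_prefix, total_len=32, trials=200_000, seed=0):
    """Estimate the fraction of uniformly random bit strings of length
    `total_len` whose first len(target_prefix) bits equal `target_prefix`
    (modeling 'the remaining bits are ignored')."""
    rng = random.Random(seed)
    n = len(target_prefix)
    hits = 0
    for _ in range(trials):
        sampled = [rng.randint(0, 1) for _ in range(total_len)]
        if sampled[:n] == target_prefix:
            hits += 1
    return hits / trials

if __name__ == "__main__":
    n_bit_strategy = [1, 0, 1, 1, 0]        # an arbitrary 5-bit strategy
    est = fraction_matching_prefix(n_bit_strategy)
    print(f"estimated fraction: {est:.5f}")  # close to 2**-5 = 0.03125
```

The same estimate applies whichever mechanism makes the later bits irrelevant (cancelling motions, shutting off, or a prefix-free code); all that matters is that specifying the n-bit strategy costs a factor of about 2^n in probability.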