Five minutes’ thought allowed me to prove the following stupid “theorem”:
Consider any game (haha). The only restriction is that the game must never let total winnings drop below zero, as in my game 2. Then there’s a general-purpose winning agent: choose one strategy at the outset, sampled from the space of all computable strategies according to some distribution that gives every strategy positive weight, and then follow it for all eternity. Obviously, at all times this agent’s expected accumulated utility is at least any individual strategy’s accumulated utility times a multiplicative constant, which is equal to that strategy’s weight in the initial distribution.
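The argument is one line: if strategy s has weight w(s) and accumulated utility U_t(s) >= 0 at time t, then the agent's expected utility is the sum over all strategies of w(s') * U_t(s'), and dropping every nonnegative term except the one for s leaves w(s) * U_t(s). Here is a minimal sketch that checks this on a toy example. The three strategies, their weights, and their utility trajectories are made-up numbers for illustration, and the finite dictionary stands in for the space of all computable strategies.

```python
from fractions import Fraction


def mixture_expected_utility(weights, utilities):
    """Expected accumulated utility, per time step, of the agent that
    samples one strategy up front and follows it forever."""
    horizon = len(next(iter(utilities.values())))
    return [
        sum(weights[s] * utilities[s][t] for s in weights)
        for t in range(horizon)
    ]


# Hypothetical prior over three strategies (weights sum to 1).
weights = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 4)}

# Hypothetical accumulated utilities at times 0..3; never below zero.
utilities = {
    "a": [0, 1, 1, 2],
    "b": [3, 3, 8, 8],
    "c": [0, 0, 0, 20],
}

expected = mixture_expected_utility(weights, utilities)

# The claimed bound: E[U_t] >= w(s) * U_t(s) for every strategy s and time t.
bound_holds = all(
    expected[t] >= weights[s] * utilities[s][t]
    for s in weights
    for t in range(len(expected))
)

print(expected, bound_holds)
```

Exact fractions are used so the comparison is not affected by floating-point rounding.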
Perhaps this result is easy in retrospect. Now I’d like to know what happens if utility can become negative (exponentiating it doesn’t seem to work), and also how to improve the agent, because it looks kinda stupid (even though it solves game 2 about as well as Solomonoff does). Sorry if this all sounds obvious; I’ve only been studying the topic for several days.
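To see concretely where nonnegativity is used, here is a small sketch with two made-up strategies, one of which has negative accumulated utility. The multiplicative bound fails for the raw utilities. It does hold for the exponentiated utilities, but a guess at why that doesn't help (an assumption on my part, not something worked out above) is Jensen's inequality: the log of the expected exponential sits above the expected utility, so a lower bound on the former says nothing about the latter.

```python
import math

# Hypothetical two-strategy prior and accumulated utilities at some time t.
weights = {"good": 0.5, "bad": 0.5}
utility = {"good": 1.0, "bad": -10.0}

# Expected utility of the sample-once agent.
expected = sum(weights[s] * utility[s] for s in weights)

# With a negative term in the sum, E[U] >= w(s) * U(s) can fail.
naive_bound_fails = expected < weights["good"] * utility["good"]

# Exponentiated utilities are positive, so the bound holds for them...
expected_exp = sum(weights[s] * math.exp(utility[s]) for s in weights)
exp_bound_holds = all(
    expected_exp >= weights[s] * math.exp(utility[s]) for s in weights
)

# ...but by Jensen, log E[exp(U)] >= E[U], so this is a bound on the
# wrong quantity: the gap below is positive.
jensen_gap = math.log(expected_exp) - expected

print(naive_bound_fails, exp_bound_holds, jensen_gap > 0)
```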