Deterministic Strategies Can Be Suboptimal
Yesterday I posted the following problem
You are playing a game against ROB, the robber. In this game, ROB will choose two arbitrary (real/computable) numbers, and . ROB will send both numbers to TABI the arbiter, who will secretly flip a fair coin and then hand you either or depending on the outcome of the flip. Your task is to come up with an algorithm which does better than 50% accuracy to determine if you have the larger number.
I will now insert line breaks, then some hints, then more line breaks, then the solution. Edit: This sentence is meant to remind you that you can only discover the answer once, and it is sweet victory to solve puzzles.
.
.
.
.
.
.
.
.
.
Note that ROB is obviously not uniformly choosing any two arbitrary real numbers—that is incoherent! Much worse, ROB is stealing your algorithm and choosing his numbers from a probability distribution which minimizes your changes of winning.
.
.
.
.
.
.
.
.
.
You can probably solve this if you restrict ROB to only picking two of with different probability distributions, and then extrapolate.
.
.
.
.
.
.
.
.
.
The solution is to choose any strictly increasing function whose range is (a subset of) , and guess that your number is the larger number with probability . For any given pair selected by ROB, you will guess “larger” correctly with probability , and you will guess “larger” incorrectly with probability ; since is increasing, it is clear you will guess “larger” correctly more often than you guess “larger” incorrectly. I hope the remaining details are clear.
Shout-out to jacob.wood who posted a solution first, and dxu who posted the entire solution first.
This is naively surprising to those who know that strategies should often not be probabilistic, even when the games you are playing are. Yudkowsky’s newest novel demonstrates other areas where probabilistic strategies dominate. I will cover an example in my next post, as well as illustrate real world use cases where I have immediately applied the lesson.
The link to Yudkowsky’s newest novel is here: https://www.glowfic.com/posts/4582
My understanding is this generalizes to using a deterministic strategy when your “opponent” is non-strategic. Probabilistic approaches are mostly reserved for adversarial conditions, to prevent exploitation if your opponent can model your behavior.
I want to register my conviction that ROB can mitigate effectiveness by keeping you arbitrialy close to 50% success rate by selecting numbers arbitrarily close to one another.
Yes, or just arbitrarily big numbers.
Note that Eliezer believes the opposite: https://www.lesswrong.com/posts/GYuKqAL95eaWTDje5/worse-than-random.
From near the end of the post you linked:
In the case of Robert’s problem, since ROB is specified to have access to your algorithm, it effectively “moves second”, which does indeed place it in a position to “use its intelligence on you”. So it should be unsurprising that a mixed strategy can beat out a pure strategy. Indeed, this is why mixed strategies are sometimes optimal in game-theoretic situations: when adversaries are involved, being unpredictable has its advantages.
Of course, this does presume that you have access to a source of randomness that even ROB cannot predict. If you do not have access to such a source, then you are in fact screwed. But then the problem becomes inherently unfair, and thus of less interest.
(From a certain perspective, even this is compatible with Eliezer’s main message: for an adversary to be looking over your shoulder, “moving second” relative to you, and anti-optimizing your objective function, is indeed a special case of things being “worse than random”. The problem is that producing a “superior derandomized algorithm” in such a case requires inverting the “move order” of yourself and your adversary, which is not possible in many scenarios.)
The original problem doesn’t say that ROB has access to your algorithm, or that ROB wants you to lose.
In such problems, it is usually assumed that your solution have to work (in this case work = better than 50% accuracy) always, even in the worst case, when all unknowns are against you.
https://www.lesswrong.com/posts/AAqTP6Q5aeWnoAYr4/?commentId=WJ5hegYjp98C4hcRt
The problem statement says “arbitrary real numbers”, so the domain of your function P is -infinity to +infinity. P represents a probability distribution, so the area under the curve is equal to 1. P is strictly increasing, so… I’m having trouble visualizing a function that meets all these conditions.
You say “any” such function… perhaps you could give just one example.
In this case P is the cumulative distribution function, so it has to approach 1 at infinity, rather than the area under the curve being 1. An example would be 1/(1+exp(-x)).
Actually, for any given P which works, P’(x)=P(x)/10 is also a valid algorithm.