Let’s figure it out! Suppose the payoffs are as used here.
Now, the correct answer is to go straight with p = 2/3. How is that figured out? Because it maximizes the expected value p · (1-p) · 4 + p · p, which simplifies to 4p - 3p²; its derivative 4 - 6p is zero at p = 2/3.
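For anyone who wants to check that number, here is a minimal sketch. The name expected_value and the grid search are my own illustrative choices, and the payoffs (4 for turning at the second crossing, 1 for going straight through both, 0 for turning at the first) are just read off the formula.

```python
def expected_value(p):
    """Expected payoff when going straight with probability p at each crossing."""
    # Payoffs assumed from the formula: 4 for turning at the second crossing,
    # 1 for going straight through both, 0 for turning at the first.
    return p * (1 - p) * 4 + p * p * 1


# Coarse grid search over p in [0, 1]; a sketch, not an exact optimizer.
grid = [i / 3000 for i in range(3001)]
best_p = max(grid, key=expected_value)

print(best_p, expected_value(best_p))
```

The grid happens to contain 2/3, so the search should land on it, with an expected value of about 4/3.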
Since the probabilities depend on your strategy, we will leave them as equations: if you go straight with probability p, then P(first crossing) is 1/(1+p), and P(second crossing) is p/(1+p). So the question is, what is the correct mixed strategy to take, in terms of these probabilities and the utilities?
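A small sketch of where those two expressions come from, assuming you always reach the first crossing and reach the second with probability p, so the expected number of crossings visited is 1 + p. The function name crossing_probabilities is hypothetical.

```python
def crossing_probabilities(p):
    """Return (P(first crossing), P(second crossing)) for go-straight probability p."""
    # Assumption: the first crossing is always reached, the second with
    # probability p, so the expected number of crossings visited is 1 + p.
    expected_visits = 1 + p
    return 1 / expected_visits, p / expected_visits


first, second = crossing_probabilities(2 / 3)
print(first, second, second / first)
```

At p = 2/3 this should give roughly 0.6 and 0.4, and the ratio of the two recovers p.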
Well, there’s a rather dumb way: you just extract p from the probabilities and plug it into the expected value.
That is, you look at the ratio P(second crossing)/P(first crossing), which equals p, and then choose the strategy for which that ratio maximizes the expected utility equation. That’s the function.
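Spelled out as a sketch, the "dumb" function might look like this. All names are hypothetical, the utilities default to the 4 and 1 from the expected value formula, and the grid search stands in for whatever maximization you prefer.

```python
def strategy_from_probabilities(p_first, p_second):
    """Recover the go-straight probability p as P(second) / P(first)."""
    return p_second / p_first


def expected_utility(p_first, p_second, u_second_exit=4, u_straight=1):
    """Expected utility written in terms of the supplied crossing probabilities."""
    p = strategy_from_probabilities(p_first, p_second)
    return p * (1 - p) * u_second_exit + p * p * u_straight


def best_probabilities(steps=3000):
    """Grid-search the (P(first), P(second)) pairs induced by each strategy p."""
    candidates = []
    for i in range(steps + 1):
        p = i / steps
        candidates.append((1 / (1 + p), p / (1 + p)))
    return max(candidates, key=lambda pair: expected_utility(*pair))


p_first, p_second = best_probabilities()
print(p_first, p_second, strategy_from_probabilities(p_first, p_second))
```

It only ever sees the probabilities, but since their ratio is p, it is the same maximization as before in different clothing.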
I see. For any math problem where I can figure out the right answer, I can write a function that receives the phase of the moon as argument and returns the right answer. Okay, I guess you’re right, I do want the function to look like expected utility maximization based on the supplied probabilities, rather than some random formula.
Hardly random—it makes perfect sense that the ratio of probabilities is equal to the parameter of the strategy, and so any maximization of the parameter can be rewritten as a maximization of the ratio of probabilities.