That is, look at the space of all possible outcomes and select the point where exactly 50% of them are better and exactly 50% are worse. Then choose actions so that this median future is as good as possible.
This seems vulnerable to the following bet: I roll a d6. If I roll 3+, I give you a dollar. Otherwise I shoot you.
I mention that vulnerability further down. Obviously it doesn’t fit human decision making either, but I think it’s qualitatively closer.
An example of an algorithm that’s closer to the desired behavior would be to sample n counterfactuals from your probability distribution, take the average of these n outcomes, and then take the median of that average over the entire setup. That is, 50% of the time the average of the n outcomes is higher, and 50% of the time it’s lower.
As n approaches infinity it becomes equivalent to expected utility, and at n = 1 it becomes median expected utility. A reasonable value is probably a few hundred, so that you select outcomes where you come out ahead the vast majority of the time, but still take low-probability risks or ignore low-probability rewards.
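To make that concrete, here is a rough Python sketch of the sampling idea, applied to the d6 bet from above. The function names, the number of trials, the utility assigned to being shot, and the one-in-a-million "rare risk" bet are all assumptions made up for illustration, and the median is only estimated by Monte Carlo.

```python
import random
import statistics


def median_of_means(sample_utility, n, trials=2001, seed=0):
    """Estimate the median of the average utility of n sampled outcomes.

    sample_utility: function taking a random.Random and returning one utility.
    trials: how many averages to draw (odd, so the median is an actual sample).
    """
    rng = random.Random(seed)
    means = [
        statistics.fmean([sample_utility(rng) for _ in range(n)])
        for _ in range(trials)
    ]
    return statistics.median(means)


SHOT = -1_000_000.0  # assumed utility of being shot; any large negative works


def d6_bet(rng):
    # Roll a d6: 3 or higher pays one dollar, otherwise you get shot.
    return 1.0 if rng.randint(1, 6) >= 3 else SHOT


def rare_risk(rng):
    # Hypothetical bet: one dollar, with a one-in-a-million chance of being shot.
    return SHOT if rng.random() < 1e-6 else 1.0


# Declining either bet is worth 0, so a bet is accepted when its score is > 0.
for n in (1, 200):
    print(n, median_of_means(d6_bet, n), median_of_means(rare_risk, n))
```

With n = 1 the d6 bet scores as a one dollar gain, since the median roll wins, which is exactly the vulnerability. With n = 200 almost every average includes at least one shooting, so the score is hugely negative and the bet is declined. The rare risk still scores as a one dollar gain at n = 200, even though its expected utility under these assumed numbers is slightly negative.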