Look, we’re arguing past each other here. My logical response here would be to add more options to the system, which would remove the problem you identified (and I don’t understand your house insurance example—this is just the seat-belt decision again as a one-shot, and I would address it by looking at all the financial decisions you make in your life—and if that’s not enough, all the decisions, including all the “don’t do something clearly stupid and pointless” ones).
What I think is clear is:
a) Median maximalisation makes bad decisions in isolated problems.
b) If we combine all the likely decisions that a median maximiser will have to make, the quality of the decisions increases.
If you want to argue against it, either say that a) is bad enough that we should reject the approach anyway, even if it decides well in practice, or find examples where a median maximiser will make bad decisions in the real world (if you would pay Pascal’s mugger, then you could use that as an example).
I don’t understand your house insurance example—this is just the seat-belt decision again as a one-shot
We were modeling the seat-belt decision as something that makes the difference between being dead and being completely fine in the event of an accident (which I suppose is not very realistic, but whatever). I was trying to point to a situation where an event can happen which is bad enough to put in the bottom half of outcomes either way, so that nothing that happens conditional on the event can affect the median outcome, but a decision you can make ahead of time would make the difference between bad and worse.
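To make the isolated seat-belt case concrete, here is a minimal sketch of it as modeled above. The specific numbers (a 1% accident probability, death at −1000 utility, a belt inconvenience of 1) are illustrative assumptions, not figures from this discussion, and `median` uses the lower-median convention.

```python
from fractions import Fraction

def median(dist):
    """Lower median of a discrete distribution of (utility, probability) pairs."""
    cumulative = Fraction(0)
    for utility, prob in sorted(dist):
        cumulative += prob
        if cumulative >= Fraction(1, 2):
            return utility
    raise ValueError("probabilities must sum to 1")

def mean(dist):
    """Expected utility of a discrete distribution."""
    return sum(utility * prob for utility, prob in dist)

# Hypothetical numbers, chosen only for illustration.
P_ACCIDENT = Fraction(1, 100)
DEATH = -1000      # accident without a belt
BELT_COST = -1     # inconvenience of wearing the belt; an accident is then harmless

no_belt = [(0, 1 - P_ACCIDENT), (DEATH, P_ACCIDENT)]
belt = [(BELT_COST, Fraction(1))]

for name, dist in [("no belt", no_belt), ("belt", belt)]:
    print(name, "median:", median(dist), "mean:", mean(dist))
```

Under these assumptions the median prefers going without the belt (0 versus −1), while the mean prefers wearing it (−1 versus −10). The accident sits entirely in the bottom half of outcomes, so the median never sees it.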
I do think that a) is bad enough, because a decision procedure that does poorly in isolated problems is wrong, and thus cannot be expected to do well in realistic situations, as I mentioned previously. I guess b) is probably technically true, but it is not enough for the quality of the decisions to increase when the number increases; it should actually increase towards a limit that isn’t still awful, and come close to achieving that limit (I’m pretty sure it fails on at least one of those, though which step it fails on might depend on how you make things precise). I’ve given examples where median maximizers make bad decisions in the real world, but you’ve dismissed them with vague appeals to “everything will be fine when you consider it in the context of all the other decisions it has to make”.
I’ve given examples where median maximizers make bad decisions in the real world, but you’ve dismissed them with vague appeals to “everything will be fine when you consider it in the context of all the other decisions it has to make”.
And I’ve added in the specific other decisions needed to achieve this effect. I agree it’s not clear what exactly median maximalisation converges on in the real world, but the examples you’ve produced are not sufficient to show it’s bad.
I do think that a) is bad enough, because a decision procedure that does poorly in isolated problems is wrong
My take on this is that counterfactual decisions count as well, i.e. if humans look not only at the decisions they face, but also at the ones they can imagine facing, then median maximalisation is improved. My justification for this line of thought is: how do you know that one chocolate cake is +10 utility while one coffee is +2 (and two coffees is +3, three is +2, and four is −1)? Not just the ordinal ranking, but the cardinal values. I’d argue that you get this by either experiencing circumstances where you choose a 20% chance of a cake over coffee, or imagining yourself in that circumstance. And if imagination and past experiences are valid for the purpose of constructing your utility function, they should be valid for the purpose of median-maximalisation.
And I’ve added in the specific other decisions needed to achieve this effect.
That you claim achieve that effect. But as I said, unless the choices you can make that would protect you from light injury involve less inconvenience per % reduction in risk than the choices you can make that would protect you from death, it doesn’t work.
However, I did think of something which seems to sort of achieve what you want: if you have high uncertainty about what the value of your utility function will be, then adding something to it with some probability will have a significant effect on the median value, even if the probability is significantly less than 50%. For instance, a 49% chance of death is very bad because if there’s a 49% chance you die, then the median outcome is one in which you’re alive but in a worse situation than all but 1⁄51 of the scenarios in which you survive. It may be that this is what you had in mind, and adding future decisions that involve uncertainty was merely a mechanism by which large uncertainty about the outcome was introduced, in which case future-you actually getting to make any choices about them was a red herring. I still don’t find this argument convincing either, though, both because it still undervalues protection from risks of losses that are large relative to the rest of your uncertainty about the value of the outcome (for instance, note that when valuing reductions in risk of death, there is still a weird discontinuity around 50%), and because it assumes that you can’t make decisions that selectively have significant consequences only in very good or very bad outcomes (this is what I was getting at with the house insurance example).
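The 49% example can be worked through with a small sketch. The assumptions here are mine, for illustration only: utility conditional on survival is uniform on [0, 100], death is worth −1000, and the lower median is used. With death probability q below 1/2, the median of the mixture is the (1/2 − q)/(1 − q) quantile of the survival outcomes, which is 1⁄51 when q is 49%.

```python
from fractions import Fraction

DEATH = Fraction(-1000)     # hypothetical utility of dying
ALIVE_MAX = Fraction(100)   # hypothetical: utility if alive is uniform on [0, ALIVE_MAX]

def median_utility(p_death):
    """Lower median of the death/survival mixture."""
    p_death = Fraction(p_death)
    if p_death >= Fraction(1, 2):
        # Death covers at least half the probability mass.
        return DEATH
    # Quantile of the survival distribution at which the overall median falls.
    quantile = (Fraction(1, 2) - p_death) / (1 - p_death)
    return quantile * ALIVE_MAX

def mean_utility(p_death):
    """Expected utility of the same mixture, for comparison."""
    p_death = Fraction(p_death)
    return p_death * DEATH + (1 - p_death) * ALIVE_MAX / 2

for p in [Fraction(0), Fraction(1, 10), Fraction(49, 100), Fraction(1, 2)]:
    print(float(p), float(median_utility(p)), float(mean_utility(p)))
```

In this toy model the median does respond to sub-50% risks, falling from 50 at no risk to 100⁄51 at 49%, but it then jumps straight to −1000 at 50%, whatever the size of the loss. The mean falls linearly throughout.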
My take on this is that counterfactual decisions count as well. … And if imagination and past experiences are valid for the purpose of constructing your utility function, they should be valid for the purpose of median-maximalisation.
I don’t understand what you’re saying here. Is it that you can maximize the median value of the mean of the values of your utility function in a bunch of hypothetical scenarios? If so, that sounds kind of like Houshalter’s median of means proposal, which approaches mean maximization as the number of samples considered approaches infinity.
The observation I have is that when facing many decisions, median maximalisation tends to move close to mean maximalisation (since the central limit theorem gives “convergence in distribution”, the median will converge to the mean in the case of averaging repeated independent processes; but there are many other examples of this). Therefore I’m considering what happens if you add “all the decisions you can imagine making” to the set of actual decisions you expect to make. This feels like it should move the two even closer together.
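A rough sketch of this convergence follows, with loud caveats. It compares only two fixed policies over n independent, identical decisions: always take a sure +1, or always take a 20% chance of +10 (mean +2). These payoffs are hypothetical, and real decisions are neither identical nor independent. The median of the risky total is computed exactly from the binomial distribution.

```python
from fractions import Fraction
from math import comb

P_WIN = Fraction(1, 5)   # hypothetical: 20% chance of the prize
WIN = 10                 # hypothetical prize utility (risky option has mean 2)
SAFE = 1                 # hypothetical sure payoff

def binomial_median(n, p):
    """Smallest k with P(Binomial(n, p) <= k) >= 1/2 (lower median)."""
    cumulative = Fraction(0)
    for k in range(n + 1):
        cumulative += comb(n, k) * p**k * (1 - p)**(n - k)
        if cumulative >= Fraction(1, 2):
            return k
    raise AssertionError("unreachable: probabilities sum to 1")

def median_total_risky(n):
    """Median total utility from taking the risky option n times."""
    return WIN * binomial_median(n, P_WIN)

def median_choice(n):
    """Policy a median maximiser picks when choosing once for all n decisions."""
    return "risky" if median_total_risky(n) > SAFE * n else "safe"

for n in [1, 3, 4, 7, 100]:
    per_decision = Fraction(median_total_risky(n), n)
    print(n, median_choice(n), float(per_decision))
```

With these numbers the median maximiser takes the safe option for a single decision but switches to the risky, higher-mean policy once it faces four decisions together, and the median payoff per decision reaches the mean of 2 at n = 100. This says nothing about how fast the two agree for rare, extreme outcomes, which is the case in dispute.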
Ah, are you saying you should use your prior to choose a policy that maximizes your median utility, and then implement that policy, rather than updating your prior with your observations and then choosing a policy that maximizes the median? So like UDT but with medians?
It seems difficult to analyze how it would actually behave, but it seems likely that it acts much more similarly to mean utility maximization than it would if you updated before choosing the policy. Both of these properties (difficulty of analysis, and similarity to mean maximization) make it difficult to identify problems that it would perform poorly on. But this also makes it difficult to defend its alleged advantages (for instance, if it ends up being too similar to mean maximization, and if you use an unbounded utility function as you seem to insist, perhaps it pays Pascal’s mugger).
Ah, are you saying you should use your prior to choose a policy that maximizes your median utility, and then implement that policy, rather than updating your prior with your observations and then choosing a policy that maximizes the median? So like UDT but with medians?
Ouch! Sorry for not being clear. If you missed that, then you can’t have understood much of what I was saying!