I’m not sure what the difference between these two is; could you spell it out for me?
“Guaranteed performance” typically cashes out as “replace the value of an action with the probability that its outcome is better than some threshold L, then pick the best,” whereas “optimizing for the worst case” typically cashes out as “replace the value of an action with the value of its worst outcome, then pick the best.”
The latter is often referred to as “robustness” and the former as “partial robustness,” and which one is applicable depends on the situation. Generally, the latter is used in problems with severe probabilistic uncertainty, since it only needs the set of possible outcomes, whereas the former requires enough probabilistic certainty to actually compute the probability of clearing the threshold.
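In symbols (my own notation, not anything from the thread: U(a, o) is the utility of action a under outcome o, O is the set of outcomes treated as possible, and L is the required threshold), the two rules might be written as:

$$a^{*}_{\text{guaranteed}} = \arg\max_{a} \Pr_{o}\big[\,U(a, o) > L\,\big] \qquad \text{vs.} \qquad a^{*}_{\text{worst case}} = \arg\max_{a} \min_{o \in O} U(a, o)$$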
Suppose that there are two possible policies A and B, and in the worst case A gives utility 1 and B gives utility 2, but the specific problem we care about requires a utility of 3. Then an algorithm that optimizes for the worst case will choose B. On the other hand, no algorithm that only chooses between policies A and B can guarantee a utility of 3. If you absolutely need a utility of 3, then you’d better come up with a new policy C, or find an additional valid assumption that you can make. The subtlety here is that “optimizing for the worst case” implicitly means “with respect to the current set of assumptions I have encoded into my algorithm, which is probably a subset of the full set of assumptions that I as a human make about the world.”
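A minimal sketch of that example in code (the worst-case utilities of 1 and 2 come from the example above; every other number, and the outcome lists themselves, are invented for illustration):

```python
# Utility of each policy across the outcomes the algorithm knows about.
# Worst cases match the example: min(A) = 1, min(B) = 2; other values are made up.
policies = {
    "A": [1, 5, 9],
    "B": [2, 4, 6],
}

REQUIRED_UTILITY = 3

# Worst-case optimization: score each policy by its minimum utility, pick the max.
best = max(policies, key=lambda name: min(policies[name]))
print(best)  # "B", since B's worst outcome (2) beats A's worst outcome (1)

# But neither available choice *guarantees* the required utility:
if min(policies[best]) < REQUIRED_UTILITY:
    print("No available policy guarantees utility >= 3: "
          "find a new policy C, or an assumption that removes the bad outcomes.")
```

Adding a valid assumption corresponds to deleting outcomes from these lists; if some assumption ruled out B’s utility-2 outcome, the same worst-case rule could then certify a guarantee of 4.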
The notion of guaranteed performance is important because it tells you when you need to do more work and design a better algorithm (for instance, by finding additional regularities of the environment that you can exploit).