In order to fix “Bayesian Decision Theory” so that it works in multiplayer games, have it search through strategies for the one that leads to maximum utility, rather than just choosing action by action. I guess this may be a non-mainstream thing?
Nope, that’s definitely a standard thing to do. It’s what I was referring to when I said:
We can incorporate notions like planning and value of information by defining U(θ; a) recursively in terms of an identical agent to ourselves who has seen one additional observation (or, if we are planning against an adversary, in terms of the adversary).
However, this doesn’t actually work that well, since recursively modeling other agents is expensive, and if the adversary is more complicated than our model can capture, we will do poorly. It is often much better to just not assume that we have a model of the adversary in the first place.
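For concreteness, here is a minimal sketch (in Python, with hypothetical `policies`, `outcome_model`, and `utility` stand-ins that are not from the original discussion) of what choosing a whole strategy by expected utility looks like, as opposed to greedily picking one action at a time:

```python
def expected_utility(policy, outcome_model, utility, n_samples=10_000):
    """Monte Carlo estimate of E[utility] when we commit to `policy`,
    under our probabilistic model of how the game plays out."""
    total = 0.0
    for _ in range(n_samples):
        outcome = outcome_model(policy)  # sample one complete play-through
        total += utility(outcome)
    return total / n_samples

def best_policy(policies, outcome_model, utility):
    """Search over whole strategies for the one with maximum expected utility,
    rather than choosing the best-looking action at each individual step."""
    return max(policies, key=lambda p: expected_utility(p, outcome_model, utility))
```

Recursively modeling an adversary would amount to `outcome_model` itself calling something like `best_policy` on the opponent’s behalf, which is where the expense mentioned above comes from.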
If you “need guaranteed performance,” just include that information in the utility function.
There is a difference between “guaranteed performance” and “optimizing for the worst case”. Guaranteed performance means that we can be confident, before the algorithm gets run, that it will hit some performance threshold. I don’t see how you can do that with a Bayesian method, except by performing a frequentist analysis on it.
Given the rest of your article, it looks like “conflated” could be replaced by “correlated” here. Calling the relationship between ideals and algorithms “conflation” already judges the issue a bit :)
When terminology fails to carve reality at its joints then I think it is fair to refer to it as conflation. If it was indeed the case that ideals mapped one-to-one onto algorithms then I would reconsider my word choice.
There is a difference between “guaranteed performance” and “optimizing for the worst case”. Guaranteed performance means that we can be confident, before the algorithm gets run, that it will hit some performance threshold.
Ah, okay. Whoops.
I don’t see how you can do that with a Bayesian method, except by performing a frequentist analysis on it.
How about a deliberate approximation to an ideal use of the evidence? Or do any approximations with limited ranges of validity (i.e. all approximations) count as “frequentist”? Though then we might have to divide computer-programming frequentists into “Bayesian frequentists” and “frequentist frequentists” depending on whether they made approximations or applied a toolbox of methods.
How about a deliberate approximation to an ideal use of the evidence?
I’m confused by what you are suggesting here. Even a Bayesian method making no approximations at all doesn’t necessarily have guaranteed performance (see my response to Oscar_Cunningham).
I’m referring to using an approximation in order to guarantee performance. E.g. replacing the sum of a bunch of independent, well-behaved random variables with a Gaussian, and using Monte Carlo methods to get approximate properties of the individual random variables with known resources if necessary.
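A minimal sketch of that kind of deliberate approximation, with made-up component distributions; the point is just that both the Gaussian replacement (via the central limit theorem) and the Monte Carlo moment estimates come with a known, budgeted amount of computation:

```python
import random
import statistics
from math import sqrt

def mc_mean_var(sampler, n=10_000):
    """Estimate the mean and variance of one random variable by Monte Carlo,
    using a fixed sample budget."""
    xs = [sampler() for _ in range(n)]
    return statistics.fmean(xs), statistics.pvariance(xs)

def gaussian_approx_of_sum(samplers, n=10_000):
    """Approximate the sum of independent, well-behaved random variables by a
    Gaussian with the matching mean and variance."""
    moments = [mc_mean_var(s, n) for s in samplers]
    mean = sum(m for m, _ in moments)
    std = sqrt(sum(v for _, v in moments))
    return mean, std  # parameters of the approximating normal distribution

# Hypothetical example: thirty exponential components whose exact sum is awkward.
components = [lambda: random.expovariate(1.0) for _ in range(30)]
mu, sigma = gaussian_approx_of_sum(components)
```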
I’m not sure what the difference between these two is, could you spell it out for me?
“Guaranteed performance” typically cashes out as “replace the value of an action with the probability that its outcome is better than some threshold L, then pick the best,” whereas “optimizing for the worst case” typically cashes out as “replace the value of an action with the value of its worst outcome, then pick the best.”
The latter is often referred to as “robustness” and the former as “partial robustness,” and which one is applicable depends on the situation. Generally, the latter is used in problems with severe probabilistic uncertainty, whereas the former requires some probabilistic certainty.
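Here is a toy illustration of the two rules, with hypothetical actions and outcome distributions (nothing here comes from the original comment):

```python
def pick_partially_robust(actions, outcomes, threshold):
    """'Guaranteed performance' rule: score each action by the probability
    that its outcome is better than the threshold L, then pick the best."""
    def p_above(action):
        dist = outcomes[action]  # list of (probability, utility) pairs
        return sum(p for p, u in dist if u > threshold)
    return max(actions, key=p_above)

def pick_worst_case(actions, outcomes):
    """'Optimizing for the worst case' rule: score each action by the value
    of its worst outcome, then pick the best."""
    return max(actions, key=lambda a: min(u for _, u in outcomes[a]))

# Hypothetical numbers: B is better in the worst case, A more often clears L = 3.
outcomes = {
    "A": [(0.9, 4.0), (0.1, 1.0)],
    "B": [(0.5, 2.5), (0.5, 2.0)],
}
print(pick_partially_robust(["A", "B"], outcomes, threshold=3.0))  # -> A
print(pick_worst_case(["A", "B"], outcomes))                       # -> B
```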
Suppose that there are two possible policies A and B, and in the worst case A gives utility 1 and B gives utility 2, but for the specific problem we care about we require a utility of 3. Then an algorithm that optimizes for the worst case will choose B. On the other hand, there is no algorithm (that only chooses between policies A and B) that can guarantee a utility of 3. If you absolutely need a utility of 3 then you’d better come up with a new policy C, or find an additional valid assumption that you can make. The subtlety here is that “optimizing for the worst case” implicitly means “with respect to the current set of assumptions I have encoded into my algorithm, which is probably a subset of the full set of assumptions that I as a human make about the world”.
The notion of guaranteed performance is important because it tells you when you need to do more work and design a better algorithm (for instance, by finding additional regularities of the environment that you can exploit).
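As a sketch of how that check might look in code, using the worst-case numbers from the example above (the `required=3` threshold is the one from the example; everything else is made up):

```python
def choose_with_guarantee_check(policies, worst_case, required):
    """Optimize for the worst case, but also report whether the resulting
    guarantee actually meets the performance threshold we need."""
    best = max(policies, key=worst_case)
    guarantee = worst_case(best)
    return best, guarantee, guarantee >= required

worst_case = {"A": 1, "B": 2}.get
best, guarantee, ok = choose_with_guarantee_check(["A", "B"], worst_case, required=3)
# best == "B", guarantee == 2, ok == False: time to design a new policy C,
# or find an additional valid assumption about the environment to exploit.
```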