But to determine how much weight to give Bob as a judge for the second decision, you need to know whether or not Plan A was best for the first decision.
You don’t need certainty. And you don’t necessarily need that particular evidence. It would still work if you used calibration tests to weight the judges.
The only evidence you really have access to from last time is who voted for what, and whether everyone thinks it was a good idea in hindsight. I think that would be enough.
What I believe is that a Bayesian Judge is a tool that operates on probability estimates from other agents, and so if you want to reason this way, then you need that data.
Ok, we are talking about different things. I’m talking about using Bayesian methods to integrate evidence like votes, voting records, and hindsight estimates of optimality to determine the best probability distribution over which plan is best (or some other output).
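As a minimal sketch of what that kind of integration could look like, assuming just two candidate plans, votes as the evidence, and per-member hit rates estimated from how often past votes matched the group’s hindsight judgments (all names and numbers below are purely illustrative):

```python
# Hypothetical sketch: posterior over "which plan is best", combining each
# member's vote with a per-member hit rate (how often their past votes
# matched the group's hindsight judgment). Names and numbers are made up.

def posterior_plan_a(votes, hit_rates, prior=0.5):
    """Return P(Plan A is best | everyone's votes).

    votes     -- dict: member -> "A" or "B"
    hit_rates -- dict: member -> P(member votes for whichever plan is best)
    prior     -- P(Plan A is best) before seeing any votes
    """
    p_a, p_b = prior, 1.0 - prior
    for member, vote in votes.items():
        hit = hit_rates[member]
        # Likelihood of this vote under each hypothesis, treating votes as
        # independent given which plan is actually best (naive Bayes).
        p_a *= hit if vote == "A" else 1.0 - hit
        p_b *= hit if vote == "B" else 1.0 - hit
    return p_a / (p_a + p_b)

votes = {"Alice": "A", "Bob": "B", "Charlie": "A"}
hit_rates = {"Alice": 0.6, "Bob": 0.8, "Charlie": 0.55}
print(posterior_plan_a(votes, hit_rates))  # ~0.31: Bob's record outweighs the majority
```

Treating votes as independent given the true best plan is the simplest possible assumption; the per-member calibration is what lets a reliable minority voter outweigh an unreliable majority.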
I have no idea how this “Bayesian Judge” thing that uses probability estimates directly would even work.
Here’s an article on Bayesian aggregation of forecasts. Essentially, you look at past forecasts to get P(Bob: “rain”|rain) and P(Bob: “rain”|~rain). (You could also just elicit those likelihoods from the experts, but if you want this to be a formula rather than a person’s judgment, those numbers need to be the data you’re looking for, not merely suggestive of it.) From just that, plus a base rate, you can calibrate Bob to find out what P(rain|Bob: “rain”) and P(rain|Bob: “~rain”) are. When you also have data on past predictions from Alice, Charlie, and David, you can combine them and get a more sophisticated estimate than any individual expert. The combined model can notice things like “when Alice and Bob agree, they’re both wrong,” which you couldn’t find by just computing individual calibrations.
That is, this thing you’ve been talking about is a procedure that’s already been worked out and that I’ve personally performed. It’s typically only done for forecasters of mutually exclusive possibilities and is inappropriate for decision-makers for reasons I’ve already mentioned.
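A minimal sketch of that procedure, assuming a binary rain/dry outcome and two forecasters; the base rate and every likelihood value below are made up for illustration. The second half uses a joint likelihood table over the pair of forecasts, which is what lets the aggregate catch patterns like “when Alice and Bob agree, they’re both wrong” that individual calibrations miss:

```python
# Hypothetical sketch of the aggregation described above. Step 1 calibrates a
# single forecaster from their track record; step 2 combines two forecasters
# using a joint likelihood table. All numbers are illustrative.

P_RAIN = 0.3  # assumed base rate of rain

def calibrate(p_say_rain_given_rain, p_say_rain_given_dry, p_rain=P_RAIN):
    """Bayes: P(rain | forecaster says "rain") from their track record."""
    num = p_say_rain_given_rain * p_rain
    den = num + p_say_rain_given_dry * (1.0 - p_rain)
    return num / den

# Individual calibration of Bob: P(rain | Bob: "rain") ~= 0.63 here.
print(calibrate(p_say_rain_given_rain=0.8, p_say_rain_given_dry=0.2))

# Joint likelihoods P((Alice says, Bob says) | outcome), estimated from past
# (forecasts, outcome) records. Using the joint table rather than assuming the
# forecasters are independent is what catches correlated errors.
L_GIVEN_RAIN = {("rain", "rain"): 0.15, ("rain", "dry"): 0.45,
                ("dry", "rain"): 0.30, ("dry", "dry"): 0.10}
L_GIVEN_DRY  = {("rain", "rain"): 0.40, ("rain", "dry"): 0.10,
                ("dry", "rain"): 0.15, ("dry", "dry"): 0.35}

def aggregate(alice_says, bob_says, p_rain=P_RAIN):
    """P(rain | both forecasts), via Bayes on the joint likelihood tables."""
    key = (alice_says, bob_says)
    num = L_GIVEN_RAIN[key] * p_rain
    den = num + L_GIVEN_DRY[key] * (1.0 - p_rain)
    return num / den

# With these (made-up) tables, agreement on "rain" is evidence against rain:
print(aggregate("rain", "rain"))  # ~0.14, below the 0.3 base rate
```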
neat!