“The basic idea underlying most uses of Bayes’ theorem is that a hypothesis is supported by any evidence which is rendered (either sufficiently or simply) probable by the truth of that hypothesis.”
— Bayesian Argumentation: The Practical Side of Probability
[Question] How do Bayesians tell what does and doesn’t count as evidence (which, e.g., hypotheses may render more or less probable if true)? Is it possible for something to fuzzily-be evidence?
It’s possible for something to be evidence without your being sure of which way it points. Suppose there is a room containing a liar, who always lies, and another person who always tells the truth. Once you figure out who is who, past statements from each can be interpreted.
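For concreteness, here’s a toy sketch of that situation (the 50/50 prior and the person labeled A are my own assumptions, not part of the original setup):

```python
# Hypothetical numbers: two people in the room, one always lies,
# one always tells the truth; person A asserts some claim C.
p_a_truthteller = 0.5          # assumed prior that A is the truth-teller

# If A is the truth-teller, C is certainly true; if A is the liar,
# C is certainly false.
p_c_given_truthteller = 1.0
p_c_given_liar = 0.0

# Before identifying A, the assertion tells us nothing about C:
p_c = (p_a_truthteller * p_c_given_truthteller
       + (1 - p_a_truthteller) * p_c_given_liar)
print(p_c)  # 0.5 -- evidence received, but its direction is unknown

# After you learn A is the liar, the same past assertion becomes
# decisive evidence that C is false:
print(p_c_given_liar)  # 0.0
```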
Seems like “evidence” is a terrible word for the concept! “Data” is better, though “sensory data” is even less misleading while a bit clunkier, and “the set of propositions safely taken for granted” is the least misleading and the clunkiest.
Additionally: imagine the evidence appeared very quickly and concerned an emotionally charged subject. People might misremember the evidence as one thing when it was actually something similar but different, perhaps critically so. Shouldn’t correctly interpreting and remembering your experiences be regarded as an extremely important Bayesian skill, since those experiences are what you use to set the correct amount of confidence in explanations?
“The basic idea underlying most uses of Bayes’ theorem …”
In line with that, the obvious answer is, sort of:
Information/occurrences which make the hypothesis more or less likely.
For the pure, ideal Bayesian, everything is “evidence”. Given the probabilities that you currently assign to all possible statements about the world, when you observe that some statement P is true, you update all your probabilities in accordance with the mathematical rules.
If I then ask, “suppose I don’t observe that P is true, only something suggesting that P is likely true?” the answer is that in that case I did not observe P. I observed something else Q. It is then the truth of Q that I should use to update my probabilities for P and everything else.
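A minimal sketch of that kind of update, with invented numbers (the hypothesis H and all the probabilities below are assumptions for illustration): we never condition on P itself, only on the Q we actually observed.

```python
# Assumed toy numbers. H is some hypothesis; Q is what we actually
# observed (something that merely suggests P, not P itself).
p_h = 0.3              # prior for hypothesis H
p_q_given_h = 0.8      # likelihood of observing Q if H is true
p_q_given_not_h = 0.2  # likelihood of observing Q if H is false

# Bayes' theorem: P(H|Q) = P(Q|H) * P(H) / P(Q)
p_q = p_q_given_h * p_h + p_q_given_not_h * (1 - p_h)
p_h_given_q = p_q_given_h * p_h / p_q
print(round(p_h_given_q, 3))  # ~0.632: observing Q raised our confidence in H
```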
To elaborate rather than adding another answer:
For more human purposes, we can’t treat everything as evidence because it’s really hard to know what the implications are of every piece of raw data—even if those implications are perfectly deterministic, we haven’t got the brain power to figure them all out. And we can sort of turn this argument around to see that when we can figure out some implications of a piece of raw data, then we can treat it as evidence.
And so in everyday life we do the same thing we do when solving all other sorts of problems—we use heuristics to judge when we think we can make interesting use of some information, and then we apply our limited brain power to the task. This can lend some fuzziness to our interpretations, but it’s probably better to be careful and say that it’s how we’re treating things as evidence that’s fuzzy, it’s not an inherent property of the data (this means that different people might see different levels of fuzz in the same situation).
It’s not just a case of any two agents having fuzzy approximations to the same world view. In the least convenient case, agents will start off with radically different beliefs, and those beliefs will affect what they consider to be evidence, and how they interpret evidence. So there is no reason for agents to ever converge in the least convenient case.
Aumann’s theorem assumes the most convenient case
Aumann’s theorem assumes rational agents. Such agents consider every observation to be evidence, and update the probability of every hypothesis in the distribution appropriately. That includes agents who start with radically different beliefs, because for rational agents “belief” is just a distribution over possible hypotheses.
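As a toy sketch of what that ideal demands (the hypotheses h1–h3 and all the numbers are invented for illustration), updating means reweighting every hypothesis at once:

```python
# A sketch of Aumann-style rationality: maintain a weight for every
# hypothesis and renormalize the whole distribution on each observation.
priors = {"h1": 0.5, "h2": 0.3, "h3": 0.2}      # agent's current beliefs
likelihood = {"h1": 0.9, "h2": 0.1, "h3": 0.5}  # P(observation | hypothesis)

unnormalized = {h: priors[h] * likelihood[h] for h in priors}
total = sum(unnormalized.values())
posteriors = {h: round(w / total, 3) for h, w in unnormalized.items()}
print(posteriors)  # {'h1': 0.776, 'h2': 0.052, 'h3': 0.172}

# Two agents with different priors who share all observations move
# toward each other in exactly this way -- but only if they can
# actually enumerate and reweight the full hypothesis space.
```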
The problem is that each hypothesis is a massively multidimensional model, and no real person can even properly fit one in their mind. There is no hope whatsoever that anyone can accurately update weightings over an enormous number of hypotheses on every observation.
So we live in an even less convenient world than the “least convenient case” that was proposed. Nobody in the real world is rational in the sense of Aumann’s theorem. Not even a superintelligent AGI ever will be, because the space of all possible hypotheses about the world is always enormously more complex than the actual world, and the actual world is more complex than any given agent in it.
Realistic Bayesians can’t treat “everything” as evidence, any more than they can consider every hypothesis.
Evaluating information as evidence is a skill, and it’s learned with practice. Being good at most skills isn’t about simply following a set of explicit rules; it’s about learning to execute the skill through real-world practice.
One good exercise for that is making forecasts about how likely certain future events are.
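One common way to grade such forecasts (my addition, not something the answer specifies) is the Brier score: the mean squared error between your stated probabilities and the actual outcomes, where lower is better.

```python
# Toy calibration check with invented forecasts and outcomes.
forecasts = [0.9, 0.2, 0.6]   # your stated probabilities
outcomes = [1, 0, 0]          # 1 = event happened, 0 = it didn't

# Brier score: mean squared error between forecast and outcome.
brier = sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)
print(round(brier, 3))  # 0.137
```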