Crush Your Uncertainty

Bayesian epistemology and decision theory provide a rigorous foundation for dealing with mixed or ambiguous evidence, uncertainty, and risky decisions. You can’t always get the epistemic conditions that classical techniques like logic or maximum likelihood require, so this is seriously valuable. However, having internalized this new set of tools, it is easy to fall into the bad habit of no longer avoiding the situations that make them necessary.

When I first saw the light of an epistemology based on probability theory, I tried to convince my father that the Bayesian answer to problems involving an unknown process (e.g. Laplace’s rule of succession) was superior to the classical (e.g. maximum likelihood) answer. He resisted, with the following argument:

  • The maximum likelihood estimator plus some measure of significance is easier to compute.

  • In the limit of lots of evidence, this agrees with Bayesian methods.

  • When you don’t have enough evidence for statistical significance, the correct course of action is to collect more evidence, not to take action based on your current knowledge.
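
To make the disagreement concrete, here is a minimal sketch of the two estimators for an unknown Bernoulli process; the code and numbers are mine, not part of the original exchange.

```python
def max_likelihood(successes, trials):
    """Classical point estimate: the observed frequency."""
    return successes / trials

def rule_of_succession(successes, trials):
    """Bayesian estimate under a uniform prior: (s + 1) / (n + 2)."""
    return (successes + 1) / (trials + 2)

# With scarce evidence the two disagree sharply...
print(max_likelihood(3, 3))           # 1.0  -- certain the next trial succeeds
print(rule_of_succession(3, 3))       # 0.8  -- hedges against the tiny sample
# ...and with lots of evidence they converge, as the second point says.
print(max_likelihood(900, 1000))      # 0.9
print(rule_of_succession(900, 1000))  # ~0.8992
```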

I added conditions (e.g. what if there is no more evidence and you have to make a decision now?) until he grudgingly stopped fighting the hypothetical and agreed that the Bayesian framework was superior in some situations (months later, mind you).

I now realize that he was right to fight that hypothetical, and he was right that you should prefer classical maximum likelihood plus significance in most situations. But of course I had to learn this the hard way.

It is not always, or even often, possible to get overwhelming evidence. Sometimes you only have visibility into one part of a system. Sometimes further tests are expensive, and you need to decide now. Sometimes the decision is clear even without further information. The advanced methods can get you through such situations, so it’s critical to know them, but that doesn’t mean you can laugh in the face of uncertainty in general.

At work, I used to do a lot of what you might call “cowboy epistemology”. I quite enjoyed drawing useful conclusions from minimal evidence and careful probability-literate analysis. Juggling multiple hypotheses and visualizing probability flows between them is just fun. This seems harmless, or even helpful, but it meant I didn’t take gathering redundant data seriously enough. I now think you should systematically and completely crush your uncertainty at all opportunities. You should not be satisfied until exactly one hypothesis has non-negligible probability.

Why? If I’m investigating a system, and even though we’re not completely clear on what’s going on, the current data is enough to suggest a course of action, and value-of-information calculations say the decision is unlikely enough to change that further investigation isn’t worth it, why then should I go and pin down the details?
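
That value-of-information check, spelled out, looks something like the following sketch; the hypotheses, actions, and payoffs are invented purely for illustration.

```python
# Beliefs after the initial investigation (made-up numbers).
belief = {"H1": 0.9, "H2": 0.1}

# Payoff of each action under each hypothesis (also made up).
payoff = {
    "act_now": {"H1": 10.0, "H2": -2.0},
    "act_alt": {"H1": 2.0, "H2": 5.0},
}

def expected_payoff(action):
    return sum(belief[h] * payoff[action][h] for h in belief)

# Best expected payoff acting on current beliefs.
best_now = max(expected_payoff(a) for a in payoff)

# Best expected payoff if a further test revealed the true hypothesis
# (expected value with perfect information).
best_informed = sum(belief[h] * max(payoff[a][h] for a in payoff) for h in belief)

value_of_information = best_informed - best_now
print(best_now, best_informed, value_of_information)  # 8.8 9.5 0.7
```

If that difference is smaller than the cost of gathering more evidence, the naive answer is that further investigation isn’t worth it. So why crush the uncertainty anyway?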

The first reason is the obvious one: stronger evidence can make up for human mistakes. While a lot can be said for its power, the human brain is not a precise instrument; sometimes you’ll feel a little more confident, sometimes a little less. As you gather evidence toward the point where you feel you have enough, that random fluctuation can cause you to stop early. But this only suggests that you should have a small bias towards gathering a bit more evidence.

The second reason is that even if you make the correct immediate decision, that residual uncertainty will come back to bite you. Eventually your habits and heuristics derived from the initial investigation will diverge from what’s actually going on. You would not expect this in a perfect reasoner, who would always use their full uncertainty in every calculation, but again, the human brain is a blunt instrument, and it likes to simplify things. What was once a nuanced probability distribution like 95% X, 5% Y might slip to just X when you’re not quite looking, and then, 5% of the time, something comes back from the grave to haunt you.

The third reason is computational complexity. Inference with very high certainty is easy; it’s often just simple direct math or a clear intuitive visualization. With a lot of uncertainty, on the other hand, you need to repeat your computation for each of (or some sample of) the probable worlds, or you need to find a shortcut (e.g. analytic methods), which is only sometimes possible. This is an unavoidable problem for any bounded reasoner.
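
In code, that blow-up looks something like the following toy sketch (entirely my own construction): one hypothesis means one evaluation of whatever downstream reasoning you do, while residual uncertainty means one evaluation per probable world, and the number of worlds multiplies across independent unknowns.

```python
from itertools import product

def conclusion(world):
    # Stand-in for whatever downstream reasoning you do with the "facts".
    return sum(world)

# Certain case: exactly one world, one evaluation.
certain_answer = conclusion((1, 1, 0))

# Uncertain case: three independent binary unknowns -> 2**3 weighted worlds.
unknowns = [{1: 0.95, 0: 0.05}] * 3
expected_answer = 0.0
for world in product(*(d.keys() for d in unknowns)):
    weight = 1.0
    for value, dist in zip(world, unknowns):
        weight *= dist[value]
    expected_answer += weight * conclusion(world)

print(certain_answer, expected_answer)  # 2 vs ~2.85
```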

For example, you simply would not be able to design chips or computer programs if you could not treat transistors as infallible logic gates, and if you really really had to do so, the first thing you would do would be to build an error-correcting base system on top of which you could treat computation as approximately deterministic.

It is possible in small problems to manage uncertainty with advanced methods (e.g. Bayes), and this is very much necessary while you decide how to get more certainty, but for unavoidable computational reasons it is not sustainable in the long term, and must remain a temporary condition.

If you make a habit of crushing your uncertainty, your model of situations can be much simpler, and you won’t have to deal with residual uncertainty from previous related investigations. Instead of many possible worlds and nuanced probability distributions to remember and gum up your thoughts, you can deal with simple, clear, unambiguous facts.

My previous cowboy-epistemologist self might have agreed with everything written here, but failed to really get that uncertainty is bad. Having just been empowered to deal with uncertainty properly, there was a tendency not just to be unafraid of uncertainty, but to think that it was OK, or even to glorify it. What I’m trying to convey here is that that aesthetic is mistaken, and as silly as it feels to have to repeat something so elementary, uncertainty is to be avoided. More viscerally, uncertainty is uncool (unjustified confidence is even less cool, though).

So what’s this all got to do with my father’s classical methods? I still very much recommend thinking in terms of probability theory when working on a problem; it is, after all, the best basis for epistemology that we know of, and is perfectly adequate as an intuitive framework. It’s just that it’s expensive, and in the epistemic state you really want to be in, that expense is redundant in the sense that you can just use some simpler method that converges to the Bayesian answer.

I could leave you with an overwhelming pile of examples, but I have no particular incentive to crush your uncertainty, so I’ll just remind you to treat hypotheses like zombies: always double tap.