I mean “theoretical evidence” as something that is in contrast to empirical evidence. Alternative phrases include “inside view evidence” and “gears-level evidence”.
I personally really like the phrase “gears-level evidence”. What I’m trying to refer to is something like, “our knowledge of how the gears turn would imply X”. However, I can’t recall ever hearing someone use the phrase “gears-level evidence”. On the other hand, I think I recall hearing “theoretical evidence” used before.
Here are some examples that try to illuminate what I am referring to.
Effectiveness of masks
Iirc, earlier on in the coronavirus pandemic there was empirical evidence saying that masks are not effective. However, as Zvi talked about, “belief in the physical world” would imply that they are effective.
Foxes vs hedgehogs
Consider Isaiah Berlin’s distinction between “hedgehogs” (who rely more on theories, models, global beliefs) and “foxes” (who rely more on data, observations, local beliefs).
- Blind Empiricism
Foxes place more weight on empirical evidence, hedgehogs on theoretical evidence.
Harry’s dark side
Then I won’t do that again! I’ll be extra careful not to turn evil!
“Heard it.”
Frustration was building up inside Harry. He wasn’t used to being outgunned in arguments, at all, ever, let alone by a Hat that could borrow all of his own knowledge and intelligence to argue with him and could watch his thoughts as they formed. Just what kind of statistical summary do your ‘feelings’ come from, anyway? Do they take into account that I come from an Enlightenment culture, or were these other potential Dark Lords the children of spoiled Dark Age nobility, who didn’t know squat about the historical lessons of how Lenin and Hitler actually turned out, or about the evolutionary psychology of self-delusion, or the value of self-awareness and rationality, or -
“No, of course they were not in this new reference class which you have just now constructed in such a way as to contain only yourself. And of course others have pleaded their own exceptionalism, just as you are doing now. But why is it necessary? Do you think that you are the last potential wizard of Light in the world? Why must you be the one to try for greatness, when I have advised you that you are riskier than average? Let some other, safer candidate try!”
The Sorting Hat has empirical evidence that Harry is at risk of going dark. Harry’s understanding of how the gears turn in his brain makes him think that he is not actually at risk of going dark.
Instincts vs A/B tests
Imagine that you are working on a product. A/B tests are showing that option A is better, but your instincts, based on your understanding of how the gears turn, suggest that B is better.
Posting up in basketball
Over the past 5-10 years in basketball, there has been a big push to use analytics more. Analytics people hate post-ups (an approach to scoring). The data says that they are low-efficiency.
I agree with that in a broad sense, but I believe that a specific type of posting up is very high efficiency. Namely, trying to get deep-position post seals when you have a good height-weight advantage. My knowledge of how the gears turn strongly indicates to me that this would be high efficiency offense. However, analytics people still seem to advise against this sort of offense.
First, I love this question.
Second, this might seem way out of left field, but I think this might help you answer it —
https://en.wikipedia.org/wiki/B%C3%BCrgerliches_Gesetzbuch#Abstract_system_of_alienation
I have an idea of what might be going on here with your question.
It might be the case that there are two fairly-tightly-bound, yet slightly distinct, components in your conception of "theoretical evidence."
I’m having a hard time finding the precise words, but something around evidence, which behaves more-or-less similarly to how we typically use the phrase, and something around… implication, perhaps… inference, perhaps… something to do with causality or prediction… I’m having a hard time finding the right words here, but something like that.
I think it might be the case that these components are quite tightly bound together, but can be profitably broken up into two related concepts — and thus, being able to separate them BGB-style might be a sort of solution.
Maybe I’m mistaken here — my confidence isn’t super high, but when I thought through this question the German Civil Law concept came to mind quickly.
It’s profitable reading, anyways — BGB I think can be informative around abstract thinking, logic, and order-of-operations. Maybe intellectually fruitful towards your question or maybe not, but interesting and recommended either way.
What makes the thing you’re pointing at different than just “deduction” or “logic”?
You have empirical evidence.
You use the empirical evidence to generate a theory edifice, and further evidence has so far supported it. (induction)
You use the theory to make a prediction (deduction), but that is not itself evidence, it only feels like it because we aren’t logically omniscient and didn’t already know what our theory implied. Whatever probability our prediction has comes from the theory, which gets its predictive value from the empirical evidence that went into creating and testing it.
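To make that induction-then-deduction split concrete, here is a minimal sketch (my own illustration with made-up numbers, not anything from the comment) using a Beta-Binomial model: the fitted posterior plays the role of the "theory," and its prediction inherits all of its probability from the observed data.

```python
# Minimal sketch of the induction -> deduction pipeline (illustrative numbers only).

def posterior_predictive(successes, failures, alpha=1.0, beta=1.0):
    """P(next trial succeeds | data) under a Beta(alpha, beta) prior."""
    # Induction: fit the "theory" (a Beta posterior) to the empirical evidence.
    post_alpha = alpha + successes
    post_beta = beta + failures
    # Deduction: the prediction is computed from the theory, so it adds no
    # evidence beyond what the data already provided.
    return post_alpha / (post_alpha + post_beta)

print(posterior_predictive(successes=8, failures=2))  # 0.75
```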
The early discussions about mask effectiveness during COVID were often between people not trained in physics at all; physics just wasn't part of their thinking process, so a physics-based response was new evidence to them because of the empirical evidence behind the relevant physics. Also, there were lots of people talking past each other, because "mask," "use," and "effective" are all underspecified terms that don't allow for simple yes/no answers at the level of discourse we seem able to publicly support as a society, and institutions don't usually bother trying to make subtler points to the public for historical, legal, and psychological reasons (that we may or may not agree with in specific cases or in general).
Good question. Maybe one of those is the correct term for what I am pointing at.
I may be misinterpreting what you’re saying, but it sounds to me like you are saying that evidence is only in the territory, not in our maps. Consider the example of how the existence of gravity would imply that aerosol particles containing covid will eventually fall towards the ground, and so the concentration of such particles will decrease as you get further from the source. My understanding of what you’re saying is that gravity, the theory, isn’t evidence. Apples falling from a tree, the empirical observations that allowed us to construct the theory of gravity, that is the actual evidence.
But this would violate how the term is currently used. It seems normal to me to say that gravity is evidence that aerosol particles will dissipate as they get further from their source. In the sense that it feels correct, and in the sense that I recall hearing other people use the term that way.
Then maybe I’m mixing up terms and should make a better mental separation between “evidence” and “data.” In that case “data” is in the territory (and the term I should have used in my previous post), while “evidence” can mean different things in different contexts. Logical evidence, empirical evidence, legal evidence, and so on, all have different standards. In that case I don’t know if there is necessarily a consistent definition beyond “what someone will accept as a convincing reason to reach a conclusion to a certain kind of question,” but I’m not at all confident in that.
Can you cite someone else using the word evidence to refer to a theory or explanation? I can't recall ever seeing that, but it might be a translation or regional thing. As a Southern Californian Jewish native speaker of American English, saying "gravity is evidence that" just sounds wrong, like saying "a red, fast, clever fox".
“Armchair evidence”.
Could be "framing conditions". I mean, it's one thing to say "masks should help to not spread or receive viral particles", but it's another thing to say "masks can't not limit convection". Even if you are interested in the first, you have to separate it into the second and similar statements. Things should resemble pieces of an empirical model, not just intuitive guesses, in order to be updateable.
I mean, it’s fine to stick to the intuition, but it doesn’t help with modifying the model.
There are such things as “theorem”, “finding” and “understanding”.
However, the word evidence is heavily reserved for theory-distant pieces of data that are not open to negotiation. There is a sense that "evidence" is something that shifts beliefs, but this comes from the connection that a brain should be informed by the outside world. We don't call all persuasive things evidence.
If you are doing theoretical stuff and think in a way where "evidence" factors heavily, you are somewhat likely to be doing things a bit backwards. Weighting evidence is connected to cogent arguments, which are in the realm of inductive reasoning. In the realm of theory we can use proper deductive methods and definitely say stuff about things. A proof either carries or it does not—there is no "we can kinda say".
This seems to me like something that is important to change, and a big part of why I am asking this question.
I’ve always been a believer that having a word/phrase for something makes it a lot easier to incorporate it into your thinking. For example, since coming across the term “slack”, I’ve noticed that it is something I incorporate into my thinking a lot more, despite the fact that the concept is something that wasn’t new to me.
I also share the worry that Eliezer expresses in Blind Empiricism.
Having an easily accessible term for theoretical evidence would make it easier to combine the ways of the Fox with the ways of the Hedgehog. To say “I shift my beliefs this way according to the empirical evidence X. And then I shift my beliefs that way according to the theoretical evidence Y.” Even if you aren’t as bullish about inside view thinking as me or Eliezer, combining the two seems like an undoubtedly good thing, but one that is currently a little difficult to do given the lack of terminology for “theoretical evidence”.
I understand the need to have a usable word for the concept. However, trying to hijack the meanings of existing words just seems like a recipe for conflicting meanings.
In a court, for example, a medical examiner can be asked what the cause of death was. The act of doing this is "opining" and the result is "an opinion". Only experts can opine, and the standing of an expert to be an expert on the issue can be challenged. Asking a non-expert to opine can be objected to; eye-witnesses can be taken to be credible about their experience, but far-disconnected conclusions are not allowed (it is a separate job of the lawyer to argue those inferences, or of the fact finder to decide whether it is sufficiently shown).
Just as "theory" in folk language can mean a guess but in scientific terms means a very regimented and organised set of hypotheses, the term "expert opinion" is sometimes used for findings that people are willing to back up even under pressure, to distinguish them from "mere" "personal opinion".
It is true that expert witness testimony is "among the evidence". "Word against word" kinds of cases might feel tricky because it is pretty easy to lie, that is, to fabricate that kind of evidence.
I agree. However, in the rationality community the term evidence is assumed to refer to Bayesian evidence (ie. as opposed to scientific or legal evidence). And I’ve always figured that this is also the case in various technical domains (AI research, data science). So then, at least within the context of these communities there wouldn’t be any hijacking or conflict. Furthermore, more and more people/domains are adopting Bayesian thinking/techniques, and so the context where it would be appropriate to have a term like “theoretical evidence” is expanding.
I am not worried that evidence is too broad. However, on that short definition I have a real hard time identifying what the "event" is that happens or not and that alters the probabilities.
I get that, for example, somebody might be worried about whether stars will collide when this galaxy and a neighbouring galaxy merge. Understanding of scales means this will essentially not happen, even without knowing any positions of stars. Sure, it is cognitively prudent. But I have a hard time phrasing it in terms of taking into account evidence. What is the evidence I am factoring in when I come to the realization that 2+2=4? To me it seems that it is a core property of evidence that it is not theoretical; that is the oomph that drives towards truth.
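For what it's worth, here is a rough back-of-the-envelope sketch of the "understanding of scales" point, using standard astronomical figures of my own choosing rather than anything from the comment:

```python
# Stars are tiny compared to the distances between them, so a galaxy merger
# involves essentially no stellar collisions, regardless of individual positions.

sun_radius_m = 7.0e8            # roughly the Sun's radius
typical_separation_m = 4.0e16   # roughly the distance to the nearest star (~4 light-years)

ratio = sun_radius_m / typical_separation_m
print(f"stellar radius / typical separation ~ {ratio:.1e}")  # ~1.8e-8
```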
Check out How to Convince Me That 2 + 2 = 3 :)
The connection to the link is not evident, and even there the association is with the external situation rather than with thought-happenings.
I’m a bit late to the game here, but you may be thinking of a facet of “logical induction”. Basically, logical induction is changing your hypotheses based on putting more thought into an issue, without necessarily getting more Bayesian evidence.
The simplest example is deciding whether a mathematical claim is true. Technically, you already have a hypothesis that perfectly predicts your data—ZFC set theory—but deriving the claim from this hypothesis is highly computationally expensive, so if you want a probability estimate of whether the claim is true you need some other prediction mechanism.
See the Consequences of Logical Induction sequence for more information.
I’m basing this answer on a clarifying example from the comments section:
When put like this, this "evidence" sounds a lot like priors. The order should be different, though:
First you deduce from the theory that masks are, say, 90% effective. These are the priors.
Then you run the experiments that show that masks are only effective 20% of the time.
Finally you update your beliefs downward and say that masks are 75% effective. These are the posteriors.
To a perfect Bayesian the order shouldn't matter, but we are not perfect Bayesians, and if we try to do it the other way around and apply the theory to update the probabilities we got from the experiments, we can convince ourselves the probability is 75% no matter how much empirical evidence saying otherwise we have accumulated.
If this were true, I would agree with you. I am very much on board with the idea that we are flawed and that we should take steps to minimize the impact of these flaws, even if those steps wouldn’t be necessary for a perfect Bayesian.
However, it isn’t at all apparent to me that your assumption is true. My intuition is that it wouldn’t make much of a difference. But this sounds like a great idea for a psychology/behavioral economics experiment!
The difference can be quite large. If we get the results first, we can come up with Fake Explanations why the masks were only 20% effective in the experiments where in reality they are 75% effective. If we do the prediction first, we wouldn’t predict 20% effectiveness. We wouldn’t predict that our experiment will “fail”. Our theory says masks are effective so we would predict 75% to begin with, and when we get the results it’ll put a big dent in our theory. As it should.
If the order doesn't matter, then it seems a kind of "accumulation of priors" should be possible. It is not obviously evident to me how the perfectness of the Bayesian would protect it from this. That is, for a given posterior and constant evidence there exists a prior that would give that conclusion. Normally we think of the limit where the amount and weight of the observations dominates, but there might at least be a calculation where we keep the observation constant and reflect on it more and more, changing or adding new priors.
Then the result that a Bayesian will converge on the truth with additional evidence flips to mean that any evidence can be made to fit a sufficiently complex hypothesis, i.e. that with enough reflection there is asymptotic freedom of belief that evidence can't restrain.
In the face of a very old and experienced Bayesian, almost all things it encounters will shift its beliefs very little. If the beliefs were of unknown origin, one might be tempted to assume that it would be stubbornness or stupidity to not be open to evidence. If you know that you know, it seems such stubbornness might be justifiable. But how do you know whether you know? And what kind of error is being committed when you are understubborn?
I think you may be underestimating the impact of falsifying evidence. A single observation that violates general relativity—assuming we can perfectly trust its accuracy and rule out any interference from unknown unknowns—would shake our understanding of physics if it came tomorrow, but had we encountered the very same evidence a century ago, our understanding of physics would have already been shaken (assuming the falsified theory wouldn't be replaced with a better one). To a perfect Bayesian, the confidence in general relativity in both cases should be equal—and very low. Because physics is lawful—it doesn't make "mistakes"; we are the ones who are mistaken at understanding it—a single violation is enough to make a huge dent no matter how much confirming evidence we have managed to pile up.
Of course, in real life we can't just say "assuming we can perfectly trust its accuracy and rule out any interference from unknown unknowns". The accuracy of our observations is not perfect, and we can't rule out unknown unknowns, so we must assign some probability to our observation being wrong. Because of that, a single piece of violating evidence is not enough to completely destroy the theory. And because of that, newer evidence should have more weight—our instruments keep getting better, so our observations today are more accurate. And if you go far enough back you can also question the credibility of the observations.
Another issue, which may not apply to physics but applies to many other fields, is that the world does change. A sociology experiment from 200 years ago is evidence about society from 200 years ago, so the results of an otherwise identical experiment from recent years should have more weight when forming a theory of modern society, because society does change—certainly much more than physics does.
But to the hypothetical perfect Bayesian the chronology itself shouldn't matter—all they have to do is take all that into account when calculating how much they need to update their beliefs, and if they succeed in doing so, it doesn't matter in which order they apply the evidence.
The idea that a single falsification shatters the whole theory seems like a calculation where the prior just gets tossed. However, in most calculations the prior still affects things. If you start from somewhere and then either don't see or do see relativistic patterns for 100 years and then see a relativity violation, a perfect Bayesian would not end with the same end belief in both cases. Using the updated prior or the ignorant prior makes a difference, and the outcome is genuinely a different degree of belief. Or, I guess, another way of saying that is that if you suddenly gain access to the middle-time evidence that you missed, it still impacts a perfect reasoner. Gaining 100 years' worth of relativity-confirming patterns increases credence for relativity even if it is already falsified.
Maybe "destroying the theory" was not a good choice of words—the theory will more likely be "demoted" to the status of "very good approximation". Like Newtonian gravity. But the distinction I'm trying to make here is between super-accurate sciences like physics, which give exact predictions, and fields that are still accurate but not as accurate as physics. If medicine says masks are 99% effective, and they were not effective for 100 out of 100 patients, the theory still assigned a probability of 10^-200 (that is, 0.01^100) that this would happen. You need to update it, but you don't have to "throw it out". But if physics says a photon should fire and it didn't fire—then the theory is wrong. Your model did not assign any probability at all to the possibility of the photon not firing.
And before anyone brings up 0 And 1 Are Not Probabilities, remember that in the real world:
There is a probability photon could have fired and our instruments have missed it.
There is a probability that we unknowingly failed to set up or confirm the conditions that our theory required in order for the photon to fire.
We do not assign 100% probability to our theory being correct, so we can just throw it out without Laplace throwing us to hell for a negative infinite score.
This means that the falsifying evidence, on its own, does not destroy the theory. But it can still weaken it severely. And my point (which I’ve detoured too far from) is that the perfect Bayesian should achieve the same final posterior no matter at which stage they apply it.
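As a sanity check of that order-independence claim, here is a minimal sketch with made-up likelihoods (mine, loosely modeled on the masks numbers above, not anything from the thread): because Bayes' rule just multiplies each piece of evidence's likelihoods into the prior, the theory-first and experiment-first orders land on the same posterior.

```python
# Order of updates doesn't matter to an ideal Bayesian: likelihood
# multiplication is commutative. Numbers are purely illustrative.

def update(prior, likelihood_vectors):
    """Posterior over hypotheses after applying each likelihood vector in turn."""
    post = list(prior)
    for lik in likelihood_vectors:
        post = [p * l for p, l in zip(post, lik)]
        total = sum(post)
        post = [p / total for p in post]
    return post

prior = [0.5, 0.5]       # "masks work" vs. "masks don't work"
theory = [0.9, 0.1]      # likelihood of the physical argument under each hypothesis
experiment = [0.2, 0.8]  # likelihood of the trial data under each hypothesis

print(update(prior, [theory, experiment]))   # [0.692..., 0.307...]
print(update(prior, [experiment, theory]))   # same posterior, different order
```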
I think the word you are looking for is analysis. Consider the toy scenario: You observe two pieces of evidence:
A = B
B = C
Now, without gathering any additional evidence, you can figure out (given certain assumptions about the gears-level workings of A, B, and C) that A = C. Because it takes finite time for your brain to realize this, it feels like a new piece of information. However, it is merely the result of analyzing the existing evidence to generate additional equivalent statements. Of course, those new ways of describing the territory can be useful, but they shouldn't result in Bayesian updates. Just like getting redundant evidence (e.g. 1. A = B, 2. B = A) shouldn't move your estimate further than just getting one piece of evidence.
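Here is a minimal sketch (my own, with purely illustrative numbers) of the failure mode that redundancy point guards against: treating a restatement of the same fact as a second independent update inflates the posterior.

```python
# Applying the same likelihood ratio twice, as if "A = B" and "B = A"
# were independent observations, overstates the evidence.

def posterior_probability(prior_odds, likelihood_ratio, times):
    """Naively apply the same likelihood ratio `times` times to the prior odds."""
    odds = prior_odds * likelihood_ratio ** times
    return odds / (1 + odds)

print(posterior_probability(1.0, 4.0, times=1))  # 0.8   -- correct single update
print(posterior_probability(1.0, 4.0, times=2))  # ~0.94 -- double-counted
```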
I see what you mean. However, I don’t see how that would fit in a sentence like “The theoretical evidence made me update slightly towards X.”
Ah, but your brain is not a Bayes net! If it were a Bayes net your beliefs would always be in perfect synchrony with the data you’ve observed over time. Every time you observe a new piece of data, the information gets propagated and all of your beliefs get updated accordingly. The only way to update a belief would be to observe a new piece of data.
However, our brains are far from perfect at doing this. For example, I recently realized that the value side of the expected value equation of voting is crazy large. Ie. the probability side of the equation is the chances of your vote being decisive (well, for argument’s sake) and the value side is how valuable it is for your vote to be decisive. At $100/citizen and 300M citizens, that’s $30B in value. Probably much more IMO. So then, in a lot of states the EV of voting is pretty large.
This realization of mine didn’t come from any new data, per se. I already knew that there were roughly 300M people in the US and that the impact of my candidate being elected is somewhere in the ballpark of $100/citizen. I just hadn’t… “connected the dots” until recently. If my brain were a perfect Bayes net the dots would get connected immediately every time I observe a new piece of data, but in reality there are a huge amount of “unconnected dots”.
(What an interesting phenomenon, having a lot of "unconnected dots" in your head. That makes it sound like a fun playground to explore.
And it’s interesting that there is a lot of intellectual work you can do without “going out into the world”. Not that you shouldn’t “go out into the world”, just that there is a lot you can do without it. I think I recall hearing that the ancient Greek philosophers thought that it was low-status to “go out into the world”. That was the job for lower class people. High class philosophers were supposed to sit in a chair and think.)
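For what it's worth, here is the voting arithmetic from a couple of paragraphs up as a tiny sketch; the $100/citizen and 300M figures are from the comment, while the probability of being decisive is a placeholder I made up, not a real estimate.

```python
# Expected value of voting: (chance your vote is decisive) x (value of it being decisive).

citizens = 300_000_000
value_per_citizen = 100        # dollars, the comment's illustrative figure
p_decisive = 1e-7              # hypothetical placeholder; varies enormously by state

value_if_decisive = citizens * value_per_citizen   # $30,000,000,000
expected_value = p_decisive * value_if_decisive    # $3,000 with this placeholder

print(f"value if decisive: ${value_if_decisive:,}")
print(f"expected value of voting: ${expected_value:,.0f}")
```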
Another phrase for Theoretical Evidence or Instincts is No Evidence At All. What you're describing is an under-specified rationalization made in an attempt to disregard which way the evidence is pointing and let one cling to beliefs for which they don't have sufficient support. Zvi's response wrt masks, where his intuition that they are effective butted up against the evidence that they aren't, has no evidentiary weight. He was not acting as a curious inquirer; he was a clever arguer.
The point of Sabermetrics is that the “analysis” that baseball scouts used to do (and still do for the losing teams) is worthless when put up against hard statistics taken from actual games. As to your example, even the most expert basketball player’s opinion can’t hold a candle to the massive computational power required to test these different techniques in actual basketball games.
Theoretical evidence can be used that way, but it can also be used appropriately.