Many Weak Arguments vs. One Relatively Strong Argument
My epistemic framework has recently undergone some major shifts, and I believe that my current epistemic framework is better than my previous one. In the past, I tended to try to discover and rely on a single relatively strong argument in favor or against a position. Since then, I’ve come to the conclusion that I should shift my focus toward discovering and relying on many independent weak arguments. In this post, I attempt to explain why. After I posted this article, I got lots of comments in response, and responded to them in this discussion post.
My previous reliance on an individual relatively strong argument
I’m a mathematician by training, and by inclination. In the past, I tried to achieve as much certainty as possible when I’d evaluate an important question.
An example: Something that I’ve thought a lot about is AI risk reduction effort as a target for effective philanthropy. In the past, I attempted to discover a single relatively strong argument for, or against, focus on AI risk reduction. Such an argument requires a number of inputs. An example of an input is an argument as to what kind of AI one should expect to be built by default. I spent a lot of time thinking about this and talking with people about it. What I found was that my views on the question were quite unstable, altering frequently and substantially in response to incoming evidence.
The phenomenon of [my position altering frequently and substantially in response to incoming evidence] was not limited to AI risk. It was characteristic of much of my thinking about important questions that could not be answered with clear-cut evidence. I recognized this as bad, but felt that I had no choice in the matter — I didn’t see another way to think about such questions, and I thought that some such questions are sufficiently important so as to warrant focus. My hope was that my views on these questions would gradually stabilize, but this didn’t happen with the passage of time.
An alternative — reliance on many weak independent arguments
While my views on various questions were bouncing around, I started to notice that some people seemed to be systematically better at answering questions that could not be answered with clear-cut evidence, in the sense that new data supported their prior views more often than new data supported my own prior views.
This puzzled me, as I hadn’t thought that it was possible to form such reliable views on these sorts of questions with the evidence that was available. I noticed that these people didn’t seem to be using my epistemic framework, and I was unclear on what epistemic framework they were using.They didn’t seem to be trying to discover a relatively strong argument.
They sometimes gave weak arguments that seemed to me to be a product of the fundamental cognitive bias described in Eliezer’s article The Halo Effect and Yvain’s articles The Trouble with “Good” and Missing the Trees for the Forest. When a member of a reference class has a given feature, by default, we tend to assume that all members of the reference class have the same feature. Some of the arguments seemed to me sufficiently weak so that they should be ignored, and I didn’t understand why they were being mentioned at all.
What I gradually came to realize is that these people were relying on many independent weak arguments. If the weak arguments collectively supported a position, that’s the position that they would take. They were using the principle of consilience to good effect, obtaining a better predictive model than my own.
Many independent weak arguments: a case study
For concreteness, I’ll give an example of a claim that I believe to be true with high probability, despite the fact each individual argument that supports it is weak.
Claim: At the current margin, on average, majoring in a quantitative subject increases people’s expected earnings relative to majoring in other subjects.
The following weak arguments support this claim:
Weak argument 1: Historically, there’s been a correlation between majoring in a quantitative subject and making more money. Examining the table in a blog post by Bryan Caplan reveals that the common majors that are most strongly associated with high earnings are electrical engineering, computer science, mechanical engineering, finance, economics, accounting, and mathematics, each of which is a quantitative major.
Weak argument 2: Outside of medicine, law, and management, the most salient jobs that offer the high earnings are finance and software engineering, both of which require quantitative skills. Majoring in a quantitative major builds quantitative skills, and so qualifies one for these jobs.
Weak argument 3: Majoring in a subject with an abundance of intelligent people signals to employers that one is intelligent. IQ estimates by college major suggest that the majors with highest average IQ are physics, philosophy, math, economics, and engineering, most of which are quantitative majors. So majoring in a quantitative field signals intelligence. And employers want intelligent employees, so majoring in a quantitative subject increases earnings.
Weak argument 4: Studying a quantitative subject offers better opportunities to test one’s beliefs against the world than studying the humanities and social sciences does, because the measures of performance in quantitative subjects are more objective than those in humanities and social sciences. Thus, studying a quantitative subject raises one’s general human capital relative to what it would have been if one studied a softer subject.
Weak argument 5: Conventional wisdom is that majoring in a quantitative subject increases one’s expected earnings. If there were strong arguments against the claim, one might expect them to percolate into conventional wisdom, which they haven’t. In absence of evidence to the contrary, one should default to conventional wisdom.
Weak argument 6: I know many smart people who enjoy thinking, and who themselves know other many smart people who enjoy thinking. As Yvain discussed in Intellectual Hipsters and Meta-Contrarianism, smart people who enjoy thinking are often motivated to adopt and argue for positions opposed to conventional wisdom, in order to counter-signal intelligence. If the conventional wisdom concerning the subject at hand were wrong, one might expect some of the people who I know to have argued against it, and I’ve never heard them do so.
To verify that these arguments are in fact weak, I’ll give counterarguments against them:
Counterarguments to 1: Correlation is not causation. The people who major in quantitative subjects may make more money later on because they have higher innate ability, or because they have better connections on account of having grown up in households with higher socio-economic status, or for some other nonobvious reason.
Counterarguments to 2: It could be that one only needs to have high school level quantitative knowledge in order to succeed in these jobs.
Majoring in a quantitative field could reduce one’s ability to go to medical school or law school later on (e.g. on account of grading being more strict in quantitative subjects, and medical and law schools selecting students by GPA).
Counterarguments to 3: Potential employees may have other ways of signaling intelligence, so that college major is not so important. As above, majoring in a quantitative subject may lower GPA, resulting in sending a signal of low quality.
Counterarguments to 4: It could be that earnings don’t depend very much on one’s intellectual caliber. For example, maybe social connections matter more than intellectual caliber, so that one should focus on developing social connections. The heavy workload of a quantitative major could hinder this.
Counterarguments to 5: Conventional wisdom is often wrong. Conventional wisdom on this subject is likely rooted in the correlation between majoring in a quantitative subject and having higher earnings, and as discussed in the counterarguments to 1, correlational evidence is weak.
Counterarguments to 6: There are many, many issues on which one can adopt a meta-contrarian position, and meta-contrarians only discuss a few of these, because there are so many of them. Also, “Smart people who like to think” could, for some unknown reason, collectively be motivated to believe the claim.
In view of these counterarguments, how can one be confident in the claim?
First off, I’ll remark that the counterarguments don’t suffice to refute the individual arguments, because the counterarguments aren’t strong, and there are counterarguments against them.
But there are counterarguments to the counterarguments as well. In view of this, one might resign oneself to a position of the type “it may or may not be the case that the claim is true, and it’s hopeless to decide whether or not it is.” Eight years ago, this was how I viewed most claims concerning the human world. In Yvain’s words, I was experiencing epistemic learned helplessness.
It’s not uncommon for mathematicians to hold this position on claims concerning the human world. Of course there are instances of mathematicians using several lines of evidence to arrive at a conclusion in absence of a rigorous proof. But the human world is much messier and more ambiguous than the mathematical world. The great mathematician Carl Friedrich Gauss wrote
There are problems to whose solution I would attach infinitely greater importance than to those of mathematics, for example touching ethics, or our relation to God, or concerning our destiny and our future; but their solution lies wholly beyond us and completely outside the province of science.
Gauss’s quotation doesn’t directly refer to prosaic epistemic questions about the human world, but one could imagine him having such a view toward these questions, and even if not, I’ve heard a number of mathematicians express such a view on questions that cannot be answered with clear-cut evidence.
This not withstanding, my current position is that one can be confident in the claim, not with extremely high confidence (say, the level of confidence that Euler had in the truth of the product formula for the sine function), but with confidence at the ~90% level, which is high enough to be actionable.
Why? The point is that the arguments in favor of the claim are, like Euler’s arguments, largely independent of one another. This corresponds to the fact that the counterarguments are ad hoc and un-unified. The situation is analogous to Carl Sagan’s “Dragon in My Garage” parable. In order to refute all of the arguments via the counterarguments, one needs to assume that all the counterarguments succeed (or other counterarguments succeed), and the counterarguments are pretty independent. If one assumes that for each argument, the counterarguments overpower the argument with probability 50%, and the counterarguments’ successes are independent, the probability that they all succeed is ~1.5%.
The counterarguments are not independent — for example, the point about majoring in a quantitative subject lowering GPA appears twice. So I don’t think that one can be too confident in the conclusion. But the existence of many independent weak arguments suffices to rescue us from epistemic paralysis, and yield an actionable conclusion.
The “single relatively strong argument” approach to the claim in the case study above
The “single relatively strong argument” approach to assessing the above claim is to try to synthesize as many of the above weak arguments and counterarguments as possible, into a single relatively strong argument.
[Added: Kawoomba’s comment realize that the above sentence wasn’t clear. The point is that in focusing on a single strong argument to the exclusion of other arguments, one is implicitly rejecting the weak arguments, and so doing so constitutes an implicit attempt to synthesize the evidence. The sort of thing that I have in mind here is to say “Correlation is not causation. Conventional wisdom is probably rooted in mistaking correlation for causation. Therefore we should ignore conventional wisdom in formulating our relatively strong argument.”]
If I were to try to do this, it would look something like this:
Based on what people and employers say, it appears that many of the high paying jobs in our society require some quantitative skills. It’s unclear how much quantitative skill one needs to do these jobs. But presumably one needs some.
People who are below this threshold may be able to surpass it by majoring in a quantitative subject, and thereby get higher earnings.
Even if one does surpass this threshold, majoring in a quantitative subject may not suffice to signal to employers that one is above that threshold, if the noise to signal ratio is high. But it may not be necessary to get a job that requires quantitative skills right out of college, in order to get high earnings from building quantitative skills in college — it might be possible for an employee to “work his or her way up” to a position that uses quantitative skills, and profit as a result.
It might appear as though people who are already above this threshold wouldn’t get higher earnings from majoring in a quantitative subject. But employers may not be able to tell that potential employees have quantitative skills unless they major a quantitative subject. (Note that if this is true, it suggests that the concern in the previous paragraph is less of an issue. However, it could still be an issue, because different levels of quantitative skills are required to get different jobs, so that the level that employees need to signal is not homogenous). This pushes in favor of majoring in a quantitative subject. People above the threshold may also benefit in majoring in a quantitative subject because it signals intelligence, which is considered to be desirable, independently of the specific quantitative skills that a potential employee has acquired.
It’s necessary to weigh these considerations against the fact that quantitative majors tend to be demanding, leaving less time for other activities, and are harder to get good grades in. Thus, majoring in a quantitative subject involves a tradeoff, the value of which will vary from individual to individual, depending on his or her skills, potential areas of work, and the criteria that graduate schools and employers use to select employees.
Major weaknesses of the “single relatively strong argument” approach
The above argument has some value, and I imagine that a college freshman would find it somewhat useful. But it seems less helpful than the list of weak arguments, together with the most important counterarguments, given earlier in this post. The argument in the previous section doesn’t clearly demarcate the different lines of evidence, and inadvertently leaves out some of the lines of evidence (because some of the lines of evidence don’t easily fit into a single framework).
These problems with using the “single relatively strong argument” approach are closely related to my past unstable epistemology. Because the “single relatively strong argument” approach doesn’t clearly demarcate the different lines of evidence, when a user of the approach gets new counter-evidence that’s orthogonal to the argument, he or she has to rethink the entire argument. Because the “single relatively strong argument” approach leaves out some lines of evidence, it’s less robust than it could be.
A priori, one could imagine that these things wouldn’t be a problem in practice: if the relatively strong argument were true with sufficiently high probability, then it would be unlikely that one would have to completely rethink things in the face of incoming evidence, and it wouldn’t be so important that the argument doesn’t incorporate all of the evidence.
My experience is that this situation does not prevail in practice. One theoretical explanation for this is analogous to a point that I made in my post Robustness of Cost-Effectiveness Estimates and Philanthropy:
A key point that I had missed when I thought about these things earlier in my life is that there are many small probability failure modes, which are not significant individually, but which collectively substantially reduce [the probability that the argument is correct]. When I encountered such a potential failure mode, my reaction was to think “this is very unlikely to be an issue” and then to forget about it. I didn’t notice that I was doing this many times in a row.
This applies not only to cost-effectiveness, but also to the accuracy of individual relatively strong arguments. Relatively strong arguments in domains outside of math and the hard scientists are often much weaker than they appear. The phenomenon of model uncertainty is pronounced.
The points in this section of the post are in consonance with a claim of Philip Tetlock’s in Expert Political Judgment: How Good Is It? How Can We Know?:
Tetlock contends that the fox — the thinker who knows many little things, draws from an eclectic array of traditions, and is better able to improvise in response to changing events — is more successful in predicting the future than the hedgehog, who knows one big thing, toils devotedly within one tradition, and imposes formulaic solutions on ill-defined problems.
A sample implication: a change in my attitude toward Penrose’s beliefs about consciousness
An example that highlights my shift in epistemology is the shift in my attitude concerning Roger Penrose’s beliefs about consciousness.
[Edit: Eliezer’s comment and Vaniver’s comment made me realize that the connection between this example and the rest of my post is unclear. The shift in my attitude toward Penrose’s beliefs about consciousness isn’t coming from my shift toward using the principle of consilience. I agree that the arrow of consilience points against Penrose’s beliefs. The shift in my attitude is coming from the shift from “give weight to arguments that stand up to scrutiny” to “give weight to all arguments with a nontrivial chance of being right, even the ones that don’t seem to hold up to scrutiny.”]
In The Emperor’s New Mind (1989), he argues that known laws of physics are inadequate to explain the phenomenon of consciousness. Penrose proposes the characteristics this new physics may have and specifies the requirements for a bridge between classical and quantum mechanics (what he calls correct quantum gravity). […] Penrose believes that such deterministic yet non-algorithmic processes may come into play in the quantum mechanical wave function reduction, and may be harnessed by the brain. He argues that the present computer is unable to have intelligence because it is an algorithmically deterministic system. He argues against the viewpoint that the rational processes of the mind are completely algorithmic and can thus be duplicated by a sufficiently complex computer. This contrasts with supporters of strong artificial intelligence, who contend that thought can be simulated algorithmically. He bases this on claims that consciousness transcends formal logic because things such as the insolubility of the halting problem and Gödel’s incompleteness theorem prevent an algorithmically based system of logic from reproducing such traits of human intelligence as mathematical insight.
I believe that Penrose’s views about consciousness are very unlikely to be true:
I subscribe to reductionism, and I don’t think that a present computer is unable to have intelligence, according to any reasonable definition of intelligence.
This invocation of Godel’s incompleteness theorem seems to be a non-sequitur, and has been criticized by many mathematicians.
Max Tegmark did a calculation calling into question the physics part of Penrose’s argument.
I don’t know anybody who shares Penrose’s view on consciousness, and the fraction of all scientists who agree with Penrose’s view appears to be tiny.
But Penrose isn’t a random crank. Penrose is one of the greatest physicists of the second half of the 20th century. He’s a far deeper thinker than me, and for that matter, a far deeper thinker than anybody who I’ve ever met.
I have several relatively strong arguments against Penrose’s views on consciousness. Collectively, they’re significantly stronger than the moderately strong argument “great physicists are often right.” In the past, I would have concluded “…therefore Penrose is wrong.”
But it’s not rational to ignore the moderately strong argument that supports Penrose’s views. The chance of the argument being right is non-negligible. I should give nontrivial credence to Penrose’s views on consciousness having substance. Maybe at least some of Penrose’s ideas about consciousness are sound, and that the reason that they seem tenuous is that he’s expressed his ideas poorly, or they’ve been misquoted. Maybe there’s some other way to reconcile the hypothesis that his views are sound, with the evidence against this, that I haven’t thought of.
If I were using my previous epistemic framework, my world view could be turned upside down by a single conversation with Penrose. If I were using my previous epistemological framework, I would be subject to confirmation bias, using my conclusion “…therefore Penrose is wrong” as overly strong evidence against the claim “great physicists are often right,” which I was unwarrantedly ignoring from the outset.
End notes
Retrospectively, it makes sense that there are people who are substantially better than I had been at reasoning about questions that I thought inherently near-impossible to think about.
Acknowledgements: I thank Luke Muehlhauser, Vipul Naik, Nick Beckstead, and Laurens Gunnarsen for useful suggestions for what to include in the post, as well as helpful comments on an earlier draft. I’m indebted to and grateful to Holden Karnofsky at GiveWell for his insights, as well as GiveWell, which offered me the opportunity to think about hard epistemic questions that can’t be answered with clear-cut evidence. Both of these helped me recognize the core thesis of this post.
Note: I formerly worked as a research analyst at GiveWell. All views expressed here are my own.
- A practical guide to long-term planning – and suggestions for longtermism by 10 Oct 2021 15:37 UTC; 140 points) (EA Forum;
- A case for the effectiveness of protest by 29 Nov 2021 11:50 UTC; 123 points) (EA Forum;
- Model Combination and Adjustment by 17 Jul 2013 20:31 UTC; 102 points) (
- Common sense as a prior by 11 Aug 2013 18:18 UTC; 56 points) (
- Rationality is about pattern recognition, not reasoning by 26 May 2015 19:23 UTC; 45 points) (
- Five Ways to Handle Flow-Through Effects by 28 Jul 2016 3:39 UTC; 43 points) (EA Forum;
- Important fact about how people evaluate sets of arguments by 14 Feb 2023 5:27 UTC; 33 points) (
- What are words, phrases, or topics that you think most EAs don’t know about but should? by 21 Jan 2020 20:15 UTC; 30 points) (EA Forum;
- 1 Nov 2019 4:12 UTC; 30 points) 's comment on EA Hotel Fundraiser 5: Out of runway! by (EA Forum;
- “Can we know what to do about AI?”: An Introduction by 9 Jul 2013 18:22 UTC; 28 points) (
- Is there a good place to find the “what we know so far” of the EA movement? by 29 Sep 2019 8:42 UTC; 25 points) (EA Forum;
- A personal history of involvement with effective altruism by 11 Jun 2013 4:49 UTC; 25 points) (
- An epistemology for effective altruism? by 21 Sep 2014 21:46 UTC; 22 points) (EA Forum;
- Sequence thinking vs. cluster thinking by 25 Jul 2016 10:43 UTC; 17 points) (EA Forum;
- Many Weak Arguments and the Typical Mind by 6 Jun 2013 18:52 UTC; 14 points) (
- Advanced Placement exam cutoffs and superficial knowledge over deep knowledge by 1 Sep 2013 21:01 UTC; 12 points) (
- A Case Against Strong Longtermism by 2 Sep 2022 16:40 UTC; 10 points) (EA Forum;
- 22 May 2022 17:04 UTC; 10 points) 's comment on Impact is very complicated by (EA Forum;
- Model Stability in Intervention Assessment by 6 Jun 2013 23:24 UTC; 10 points) (
- Some clarifications concerning my “many weak arguments” post by 7 Jun 2013 19:34 UTC; 8 points) (
- 8 Sep 2013 22:34 UTC; 7 points) 's comment on High School, Human Capital, Signaling and College Admissions by (
- 10 Aug 2013 20:07 UTC; 7 points) 's comment on Common sense as a prior by (
- 2 Dec 2021 12:36 UTC; 6 points) 's comment on A case for the effectiveness of protest by (EA Forum;
- A personal history of involvement with effective altruism by 12 Jun 2013 4:00 UTC; 6 points) (EA Forum;
- 27 Jul 2013 17:01 UTC; 6 points) 's comment on Making Rationality General-Interest by (
- 26 Jun 2013 1:07 UTC; 4 points) 's comment on A personal history of involvement with effective altruism by (
- 27 Jul 2022 23:53 UTC; 4 points) 's comment on Daniel Kokotajlo’s Shortform by (
- Carbon dioxide, climate sensitivity, feedbacks, and the historical record: a cursory examination of the Anthropogenic Global Warming (AGW) hypothesis by 8 Jul 2014 1:58 UTC; 4 points) (
- 14 Feb 2023 19:34 UTC; 4 points) 's comment on Important fact about how people evaluate sets of arguments by (
- 20 Feb 2015 18:38 UTC; 3 points) 's comment on Innate Mathematical Ability by (
- 27 Jun 2013 22:47 UTC; 3 points) 's comment on Tiling Agents for Self-Modifying AI (OPFAI #2) by (
- 19 Aug 2020 3:21 UTC; 2 points) 's comment on MichaelStJules’s Quick takes by (EA Forum;
- 11 Aug 2013 20:19 UTC; 2 points) 's comment on Common sense as a prior by (
- 8 Jun 2013 1:32 UTC; 0 points) 's comment on Many Weak Arguments and the Typical Mind by (
- 6 Jun 2013 20:07 UTC; 0 points) 's comment on Will the world’s elites navigate the creation of AI just fine? by (
- 11 Jul 2013 22:03 UTC; 0 points) 's comment on Svante Arrhenius’s Prediction of Climate Change by (
- 11 Sep 2013 17:47 UTC; 0 points) 's comment on High School, Human Capital, Signaling and College Admissions by (
- 11 Sep 2013 7:19 UTC; 0 points) 's comment on High School, Human Capital, Signaling and College Admissions by (
- 6 Jun 2013 22:06 UTC; -1 points) 's comment on Tiling Agents for Self-Modifying AI (OPFAI #2) by (
- 7 Jun 2013 6:14 UTC; -1 points) 's comment on Tiling Agents for Self-Modifying AI (OPFAI #2) by (
Let’s give this a try.
Claim: Relying on few strong arguments is more reliable than relying on many weak arguments.
Motivated reasoning is a bigger risk when dealing with weak arguments, since it is relatively easy to come up with weak arguments on the side that you favor, but it is hard to make an argument rigorous just because you want it to be true. It also seems easier to ignore various weak arguments on the other side (or dismiss them as not even worth considering) than to dismiss a strong argument on the other side.
Selection effects will tend to expose you to more weak arguments on one side of an issue; e.g. if you are surrounded by Blues then you will be exposed to lots of weak arguments in favor of Blue positions, and few arguments in favor of Green positions. A person in this Blue-slanted situation has a better chance of finding their way into the pro-Green camp on an issue if they ignore the argument count and instead only compare the strongest pro-Blue argument that they have seen with the strongest pro-Green argument that they have seen (or, even better, the steel-manned version).
The 80⁄20 rule: in many domains, a small fraction of the things carry a large portion of the weight, and a useful heuristic is to focus on that small fraction (e.g., the 20% of effort that produces 80% of the results). Which suggests that, in this domain, the strongest few arguments will carry most of the evidential weight on an issue, and the long tail of weak arguments will not matter much.
Nonindependence: a set of arguments on a given issue are rarely independent; arguments which share a conclusion often have strong (and perhaps hidden) dependencies and interrelationships. For example, a large fraction of the set of arguments may all rely on the same methodology, or come from the same group of people, or be (perhaps indirect) consequences of a single piece of evidence, or share a single auxiliary assumption. So a set of seemingly independent arguments often provides less evidence than it appears.
Argument structure: the structure of a complex argument is often important but neglected, and it is not accounted for by listing simple points in favor of each side. To take one example, the claim IF (A or B or C or D or E) THEN Z has a very different structure from the claim IFF (A & B & C & D & E) THEN Z, but moderate evidence against D would appear similarly as “a weak argument against the claim” in both cases. Making a strong argument requires engaging with the structure of the argument.
I can already see some counters to these arguments (and some counters to those counters), but I suspect it would be more useful to have a list of arguments on the other side in the same format to compare these with.
Thanks for these thoughts.
Broadly, my reaction is that there’s no royal road to rationality: one has to make judgments on a case by case basis. I haven’t shifted over to using many weak arguments rather than a few strong ones in all instances.
If nothing else, my post shows that:
It’s possible to justifiably have high confidence in a position based on many weak arguments, when there are no strong arguments on the other side.
I was making the mistake of completely ignoring certain pieces of weak evidence when I should have been giving them some weight.
It’s very difficult to come up with truly independent arguments. The examples given aren’t even close. WA1 is that people majoring in a quantitative subject in general have higher earnings. WA2 is that a specific subsets of jobs that require a quantitative subject major have high earnings. Clearly, these are not independent.
I replied here.
That is not nonindependent, i.e., if that were nonindependent, a human being would be incapable of ever giving independent arguments.
The kinds of examples I had in mind with that phrase: 1) a bunch studies have been published which each provide some support for claim X, from a variety of different angles, but they were almost all conducted by the same group of 4 researchers. 2) You don’t know much about nutrition and then read a book by Gary Taubes; now you have a lot of arguments in favor of low carb diets.
The general pattern here is that the object-level evidence (e.g., the findings of each particular study, or the content of each particular Taubes argument) does not entirely screen off the source. There are various pieces of information which you could potentially learn about the 4 researchers or about Taubes which would weaken your confidence in the whole set of arguments.
Better claim: “In the absence of a coherent strong argument, the consideration of many weak arguments is expected to tend toward accurate conclusions.”
Wrong. Moderate evidence against D is moderate evidence against (A & B & C & D & E).
This is a messy subject, and one that’s difficult write about, and I appreciate you tackling the topic. I think there are some important qualifications to make about this post, as others have noted. But I know that when writing about messy subjects, it’s hard to avoid “death by a thousand qualifications.” Lately, I’ve been trying to solve the problem by putting most qualifications in footnotes, e.g. here. You might want to try that, as it mitigates criticisms of the “but you didn’t make qualifications X and Y!” form while still leaving the body text in a readable condition.
Below, I’ll refer to MWA (“many weak arguments”) and ORSA (“one relatively strong argument”), for convenience.
Here’s my guess at what’s going on:
Probability theory doesn’t “intrinsically favor” MWA over ORSA. Both have their uses, their limits, and their “gotchas” when applied to bounded rationality. If MWA is in some important sense “reliably better” than ORSA, I’d need stronger evidence/argument than is provided in this post. (That’s not necessarily a criticism of the post; putting together strong evidence+argument about messy subjects is difficult and time-consuming.)
For historical reasons, Less Wrong tended to attract people accustomed to ORSA thinking — which is common in e.g. mathematics and philosophy. Hence, LWers tend to make “too much reliance on ORSA”-type mistakes more often than they make the “too much reliance on MWA”-type mistakes.
GiveWell tends to emphasize the MWA approach, and has been remarkably successful at figuring out the parts of the world they’re trying to understand. This impressed you, and helped you to realize that, like many mathematicians, you had been placing too much emphasis on ORSA thinking.
Thanks for the feedback.
My hunch is that the most significant problem with the MWA approach is the assumption of (weak) independence, in the sense that in practice, when sophisticated use of MWA fails, it’s usually because the weak lines of evidence are all being driven by the same selection effect. A hypothetical example that jumps to mind:
A VC is evaluating a startup. He or she reasons:
1. The sector is growing.
2. My colleagues think that the sector is good to invest in.
3. On an object level, their plan looks good.
4. The people are impressive.
and the situation is:
Re: #1 — The reason that the sector is growing is because there’s a bubble
Re: #2 — The reason that the VC’s colleagues think that the sector is good to invest in is because, like the VC, they don’t recognize that there’s a bubble.
Re: #3 — The VC’s views on the object level merit of the project are colored by the memes that have been spreading around that are causing the bubble
Re: #4 — The reason that impressive people are going into the sector is because there’s a bubble, so everyone’s going into the sector – the people’s impressiveness isn’t manifesting itself in their choosing a good focus.
I don’t know whether this situation occurs in practice, but it seems very possible.
GiveWell is an interesting case, insofar as it’s done more ORSA work than I’ve seen in most contexts. The page on long-lasting insecticide-treated nets provides examples. Part of why I’m favoring MWA is that GiveWell has done both and, of the two, leans toward MWA.
This is a great example. It’s often very hard to tell whether MWA are independent or not. They could all derive from the same factors. Or they could all be made up by the same type of motivated reasoning.
I think that’s the judgment of being a good “Fox” ala Tetlock’s Hedgehog vs the Fox.
Do you know ORSA that gets you out of this situation?
Which situation? The VC startup thing? The ORSA style is to scrutinize arguments carefully to see if they break down. If you can recognize that the arguments all break down in the same way, then you can conclude that the arguments are dependent and that even collectively, they don’t constitute much evidence.
But that still doesn’t tell you whether to invest in the startup. If an ORSA-er is just paralyzed by indecision here and decides to leave VC and go into theoretical math or whatever, he or she is not really winning.
Unrelatedly, a fun example of MWA triumphing over ORSA could be geologists vs. physicists on the age of the Earth.
I would guess that ORSA doesn’t suffice to be a successful VC. The claim is that it could help, in conjunction with MWAs.
If you scrutinize the weak arguments and find that they break down in different ways, then that suggests that the arguments are independent, and that you should invest in the start-up. If you find that they break down in the same way, then that suggests that you shouldn’t invest in the start-up.
I’d definitely be interested to learn (in more detail, with more examples) why this is. They may very well have good reasons for it.
Could you (or someone else) possibly give some examples of this? This seems like it’s probably true but I’m having trouble thinking of concrete examples. I want to know the nature of the bias I should be compensating for.
It’s true that both MWA and ORSA break down badly in certain, non-overlapping contexts.
I currently believe that MWA generally produces better predictive models about the human world than ORSA does. The context of the human world is a special one, and I would expect ORSA to systematically be better in some natural (not necessarily mathematical) contexts.
I believe that ORSA does outperform MWA in certain situations involving the human world (c.f. the remarks in my other comment about bubbles).
Penrose is a worrisome case to bring as an example, since he is in fact wrong, and therefore you’re giving an example where your reasoning leads to the wrong conclusion. If you can’t easily find examples where your reasoning led you to a new correct conclusion instead of new sympathy toward a wrong conclusion, this is worrisome. In general, I tend to flag recounts of epistemological innovations which lead to new sympathy toward a wrong conclusion, as though the one were displaying compassion for a previously hated enemy, for in epistemology this is not virtue.
The Penrose example worries me for other reasons as well, namely that it seems like it would be possible to generate hordes and hordes of weak arguments against Penrose; so it’s as if, because the argument against Penrose is strong, you aren’t bothering to try to generate weak arguments. Reading this, it feels like you now prefer weak arguments to strong arguments and don’t try to find the many weak arguments once you see a strong argument, which is not good Bayesianism.
You also claim there’s a strong argument for Penrose, namely his authority (? wasn’t this the kind of reasoning you were arguing against trusting?) but either we have very different domain models here, or you’re not using the Bayesian definition of strong evidence as “an argument you would be very unlikely to observe, in a world where the theory is false”. What do you think is the probability of at least one famous physicist writing a widely panned book about the noncomputability of human consciousness, in a world where consciousness is computable? I should not call it very low, and that means that the pure argument from authority, if you don’t believe the actual specifics of that argument, is Bayesian evidence with a low likelihood ratio or as it would be commonly termed a ‘weak argument’.
JonahSinick is not saying that Penrose is right, only that based on his heuristic he adjusted the probability of that upwards. To judge this wrong, it’s not enough to know that Penrose is wrong, you must also know the probability estimates JonahSinick gave before and after. To give an absurd example, if JonahSinick used to believe the probability was 10^(-15), he would be wise to adjust upwards.
By the way, this isn’t the first time I see you use the meta-heuristic that when a heuristic adds support to a wrong conclusion it should be taken less seriously. While it is valid to some extent, I think you are overusing it.
Responses below. As a meta-remark, your comment doesn’t steelman my argument, and I think that steelmanning arguments helps keep the conversation on track, so I’d appreciate it if you were to do so in the future.
The point of the example is that one shouldn’t decisively conclude that Penrose is wrong — one should instead hedge.
Perhaps a relevant analogy is that of the using seat belts to guard against car accidents — one shouldn’t say “The claim that I’m going to get into a potentially fatal car accident is in fact wrong, so I’m not going to wear seat belts.” You may argue that the relevant probabilities are sufficiently different so that the analogy isn’t a good one. If so, I disagree.
There are many such examples. My post extended to a length of eight pages without my going into them, and I wanted to keep the post to a reasonable length. I’m open to the possibility of writing another post with other examples. The reason that I chose the Penrose example is to vividly illustrate the shift in my epistemology.
One would expect this sort of thing to sometimes happen by chance in the course of updating based on incoming evidence. So I don’t share your concern.
I can see how the example might seem disconsonant with my post, and will consider revising the post to clarify. [Edit: I did this.] The point that I intended to make is that I was previously unknowingly ignoring certain nontrivial weak lines of evidence, on the grounds that they weren’t strong enough, and that I’ve recognized this, and have been working on modifying my epistemological framework accordingly.
I don’t think that the hordes and hordes of weak arguments that you refer to are collectively strong enough to nullify the argument that one should trust Penrose because he’s one of the greatest physicists of the second half of the 20th century.
I don’t remember arguing against trusting authority above – elaborate if you’d like.
I wasn’t saying that one should give nontrivial credence to Penrose’s views based on his authority. I was saying that one should give nontrivial credence to Penrose’s views based on the fact that he’s a deeper thinker than everybody who I know (in the sense that his accomplishments are deeper than anything that anyone who I know has ever accomplished).
Something has gone severely wrong with the ‘steelman’ concept if it is now being used offensively, to force social obligations onto others. This ‘meta-remark’ amounts to a demand that if JonahSinick says something stupid then it is up to others to search related concept space to find the nearest possible good argument for a better conclusion and act as if Jonah had said that instead of what he actually said. That is an entirely unreasonable expectation of his audience and expecting all readers to come up with what amounts to superior content than the post author whenever they make a reply is just ridiculously computationally inefficient.
I have a known problem with this (Anna Salamon told me so, therefore it is true) so Jonah’s remark above is a priori plausible. I don’t know if I can do so successfully, but will make an effort in this direction.
(It’s true that what Jonah means is technically ‘principle of charity’ used to interpret original intent, not ‘steelman’ used to repair original intent, but the principle of charity says we should interpret the request above as if he had said ‘principle of charity’.)
:-)
No offense intended :-)
Request, not force
My remark that steelmanning keeps the discussion on track is genuine in intention. I agree that norms for steelmanning could conceivably become too strong for efficient discourse, but I think that at the margin, it would be better if people were doing much more steelmanning.
I think the concept you’re looking for is the principle of charity. Steel man is what you do to someone else’s argument in order to make sure yours is good, after you’ve defeated their actual argument. Principle of charity is what you do in discourse to make sure you’re having the best possible discussion.
If you think Eliezer should have steelmanned your argument then you think he has already defeated it—before he even commented!
I guess I didn’t mean that he didn’t steelman my argument, I meant that he didn’t steelman the things that he was objecting to. For example, he could have noted that I did give an example of the type that he seems to have been looking for, rather than focusing on the fact that the Penrose example isn’t of the type that he was looking for. I agree that there’s substantial overlap between this and the principle of charity.
It does make for higher quality discussions, especially when posters who command a larger audience are involved. Let’s also assume that Jonah knows his shizzle, and that if he wrote something which seems stupid at first glance, he may have merely used an unfortunate phraseology. Where’s the fun in shooting down the obvious targets, most readers can do so themselves. Rather skip to the subtle disagreements deep down, where true domina… where more refined and non-obvious counters may be revealed for the readers’ benefit.
As one of those readers I would prefer not to have to. I appreciate the effort others put into keeping the garden well tended and saving me the trouble of reading low quality material myself.
Eliezer’s reply is the kind of reply that I want to see more of. I strongly oppose shaming ‘requests’ used to discourage such replies.
Personally I found the quantitative majors example a very vivid introduction to this style of argument, and much more vivid than the Penrose example. I think the quantitative majors does a very good job of illustrating the kind of reasoning you are supporting, and why it is helpful. I don’t understand the relevance of many weak arguments to the Penrose debate—it seems like a case of some strong and some weak arguments vs. one weak argument or something. If others are like me, a different example might be more helpful.
In hindsight, my presentation in this article was suboptimal. I clarify in a number of comments on this thread.
The common thread that ties together the quantitative majors example and the Penrose example is “rather than dismissing arguments that appear to break down upon examination, one should recognize that such arguments often have a nontrivial chance of succeeding owing to model uncertainty, and one should count such arguments as evidence.”
In the case of the quantitative majors example, the point is that you can amass a large number of such arguments to reach a confident conclusion. In the Penrose example, the point is that one should hedge rather than concluding that Penrose is virtually certain to be wrong.
I can give more examples of the use of MWAs to reach a confident conclusion. They’re not sufficiently polished to post, so if you’re interested in hearing them, shoot me at email at jsinick@gmail.com.
Perhaps “hedging” is another term that also needs expanding here. One can reasonably assume that Penrose’s analysis has some definite flaws in it, given the number of probable flaws identified, while still suspecting (for the reasons you’ve explained) that it contains insights that may one day contribute to sounder analysis. Perhaps the main implication of your argument is that we need to keep arguments in our mind in more categories than just a spectrum from “strong” to “weak”. Some apparently weak arguments may be worth periodic re-examination, whereas many probably aren’t.
It’s not at all clear to me why this is the case. The argument you give, as I understand it, is “weak arguments, if independent, add nonlinearly instead of linearly, and so we can’t safely ignore weak arguments.”* But in the case of Penrose, you have a weak argument in his favor (he’s really clever), and many strong arguments against him, of which several are independent. The arrow of consilience points against Penrose, and so you should update against Penrose if you’ve gained a new respect for consilience.
*The argument that we shouldn’t ignore arguments because they are below some evidence threshold, to me, falls under “proper epistemic hygiene” and so doesn’t seem novel or need to be justified.
It appears that I didn’t express myself clearly as well as I would have liked. Thanks for pointing this issue out.
My current epistemological framework is “give weight to all arguments, even the (non-negligibly) weak ones.” My prior epistemological framework had been “give weight to all arguments that stand up to scrutiny.” I agree that the arrow of consilience points against Penrose. My update is coming from the change “give weight to arguments that don’t stand up to scrutiny.”
I added an edit to my post explaining this.
I don’t think that “Penrose is really clever” is an accurate description of my argument. Lots of people are really clever. I know hundreds of mathematicians who are really clever. Penrose is on a much higher level.
I’m not sure we’re using ‘scrutiny’ in the same way. One potential usage is “if I can think of a counterargument, I can exclude that argument from my analysis,” which is one I don’t endorse and it sounds like you no longer endorse.
What I think scrutiny is useful for is determining the likelihood ratio of an argument. To use the first argument given in support for the quantitative major, you might estimate the likelihood ratio to be, say, 2:1 in support, and then after correcting for the counterargument of native ability, estimate the effect to be 3:2 in support. (Previously, this would look like revising the 2:1 estimate down to a 1:1 estimate.)
And so in the Penrose example, his suggestion that quantum effects might have something to do with consciousness is, say, 10:1 evidence in favor, because of your esteem for Penrose’s ability to think. But when Tegmark comes along and runs the numbers, and finds that it doesn’t pan out, I would revise that down to the neighborhood of 101:100. Lots of smart people speculate things could be the case, and then the math doesn’t work out.
And so if you have a precise mathematical model of scrutiny, you can incorporate this evidence together without having to deal with rules of thumb like “give weight to arguments that don’t stand up to scrutiny,” which Eliezer is rightly complaining will often lead you astray.
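The likelihood-ratio bookkeeping described above can be sketched in a few lines of Python. The specific ratios (3:2 after scrutiny, 10:1 for Penrose’s intuition, the Tegmark revision down to roughly 101:100) are the made-up figures from the comment, and the sketch assumes the pieces of evidence are independent:

```python
# A minimal sketch of "scrutiny shrinks an argument's likelihood ratio
# instead of zeroing it out": keep every argument, but let
# counterarguments revise its ratio, then multiply ratios into the odds.

def update_odds(prior_odds, likelihood_ratios):
    """Posterior odds = prior odds times the product of the ratios
    (valid when the pieces of evidence are independent)."""
    odds = prior_odds
    for r in likelihood_ratios:
        odds *= r
    return odds

# Quantitative-majors argument: 2:1 in support, revised to 3:2 after
# the native-ability counterargument (rather than down to 1:1).
print(update_odds(1.0, [1.5]))

# Penrose case: 10:1 from esteem for Penrose's intuition, then
# Tegmark's calculation revises the combined weight to about 101:100.
print(update_odds(1.0, [10.0, 0.101]))
```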
We’re using different standards for cleverness, but the reason I worded things that way is because everyone has access to the same logic. Penrose’s intuitions are much more honed than yours in particular areas, and so it’s reasonable to use his intuitions as evidence in those areas. But the degree that his intuitions are evidence depends on his skill in that particular area, and if he’s able to articulate the argument, then you can evaluate the argument on its own, and then it doesn’t matter who made it. I’m reminded of the student who wrote to Feynman complaining that she got a test question wrong because she followed his book, which contained a mistake. Feynman responded with “yep, I goofed, and you goofed by trusting me. You should have believed your teacher’s argument, because it’s correct.”
Yes. I wasn’t literally discarding arguments whenever I thought of counterarguments, but I strongly tended in that direction, and I don’t endorse this.
I think that these likelihood ratios are too hard to determine with such high precision.
Metaphorically, I agree with this, my skepticism about determining precise numerical estimates not withstanding.
The confidence level in the range of ~ 0.5% sounds about right, up to an order of magnitude in either direction. The issue was that I was implicitly discarding that probability entirely, as if it was sufficiently small that it should play no role whatsoever in my thinking.
As far as I know, Penrose hasn’t fully retracted his position. If so, this should be given some weight.
I don’t think that it’s fruitful to numerically quantify things in this way, because I think that the initial estimates are poor, and that making up a number makes epistemology worse rather than better, because of anchoring biases. Certainly when I myself have tried to do this in the past, I’ve had this experience. But maybe I just haven’t seen it done right.
My impression from Eliezer’s comment is that he’s implicitly reasoning in the same way that I was (discarding arguments that have ~ 1% probability of being true, as if they were too unlikely for it to be worthwhile to give any weight to.)
I think that the difference is significant. There’s a dearth of public knowledge concerning the depth of the achievements of the best mathematicians and physicists (as well as a dearth of public knowledge as to who the best mathematicians and physicists are). I think that the benefits to people’s epistemology if they appreciated this would be nonnegligible.
Here again lies the key point of contention. The point is that there’s a small but non-negligible probability that Penrose isn’t able to articulate the argument despite attempting to do so, or that he communicates under bad implicit assumptions about the language that his readers think in, or there’s another possibility that I haven’t thought of that’s consistent with his views being sound.
I’m certainly not saying that one should believe Penrose’s views with 50+% probability (the level of confidence that the student in the story seems to have had). I’m saying that one should give the possibility enough credence so that one’s world view isn’t turned upside down if one learns that one of the hypotheticals that I give above prevails.
My claim is that “the chance that classical computers aren’t capable of intelligence is negligible” is an inferior epistemic position to “it seems extremely likely that classical computers are capable of intelligence, but Roger Penrose is one of the greatest scientists of the 20th century, has thought about these things, and disagrees, so one could imagine believing otherwise in the future.”
Aside from what I say in my other comment, I’ll also remark that the “majoring in a quantitative subject increases expected earnings” case study provides an example of the type that you seem to be looking for. It doesn’t provide an example of a shift from a belief to its negation, but it provides an example of a shift from a state of high epistemic uncertainty to reasonably high confidence. This is significant.
Strong evidence against Penrose’s conclusion would necessarily have to be strong (or at least moderate) counterarguments to each of the weak arguments which support his position; it appears to me that faith in reductionism and an observation of an algorithmic consciousness qualify as strong counterarguments, and I think that there’s a very high probability that Tegmark’s work showing that the brain mechanisms that we know about function well in classical physics qualifies as a moderate counterargument to the key points. (This would become strong if combined with proof that what we know is sufficient to model consciousness.)
The evolutionary counterargument (it is unlikely that small changes iterating over time would result in quantum behavior, even if quantum behavior were useful) I reject as false, because evolutionary processes aren’t goal-oriented and aren’t smart enough to avoid using quantum behavior.
This post reminds me of my own experiences with the “smartphone question.” For years, I had derided smartphones for lacking a killer app. However, a recent conversation with several LW community members ultimately changed my mind, and now I consider my previous view to have been very misguided.
What I overlooked was that while there was no one “killer app” that provided huge value, there were lots and lots of smaller apps that provided small individual value. When taken together, they constituted a substantial overall benefit. My focus on the “one big thing” was causing me to miss out on a lot of potential value.
Very well said! I would say it’s a better example than the one listed in the above post...
In machine learning, the approach you’re advocating is called ensemble learning, or more narrowly, “Bayesian model combination.” Polikar (2006) is a nice overview article.
And for a more recent overview, see Rokach (2010).
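For readers who want the flavor of the ensemble idea without digging into the surveys, here is a toy simulation (not taken from either paper): several classifiers that are each only 60% accurate, combined by majority vote under the assumption that their errors are independent, land near the binomial prediction of roughly 73% accuracy for 9 voters.

```python
import random

random.seed(0)  # deterministic toy run

def weak_classifier(truth):
    """A 'weak argument': correct only 60% of the time."""
    return truth if random.random() < 0.6 else 1 - truth

def majority_vote(truth, n_voters):
    """Combine independent weak classifiers by majority vote."""
    votes = sum(weak_classifier(truth) for _ in range(n_voters))
    return 1 if votes > n_voters / 2 else 0

trials = 10_000
single = sum(weak_classifier(1) == 1 for _ in range(trials)) / trials
ensemble = sum(majority_vote(1, 9) == 1 for _ in range(trials)) / trials

# With 9 independent 60%-accurate voters, the majority vote is right
# noticeably more often than any single voter.
print(single, ensemble)
```

The caveat from the bubble discussion above applies directly: if the voters’ errors are correlated, the majority vote gains little or nothing over a single voter.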
Sounds a lot like Lord Kelvin saying that biology’s vital force was infinitely beyond the reach of science, and equally wrong in the light of history. Flag: Getting a kick out of not knowing something; motivated uncertainty.
“Science” changed, and will continue to change to incorporate things once not considered possible.
For a while, transmutation of one element into another was impossible (Aristotle), then it was theoretically possible but never accomplished (classical alchemy), then it was shown to be impossible (Lavoisier), then it was observed.
Likewise it has become possible to observe, then predict, and now explain astronomical events. “What is the Sun made of?” is a scientific question now. Likewise “what is our destiny?” may be a scientific question in the future.
Robin Hanson argues for the “many weak arguments” approach in this post. Describing a difference between himself and Bryan Caplan, he writes,
The adage “let the evidence guide me wherever it may” applies. We don’t get to impose on reality what kind of evidence for various positions it provides for us to find. If there is one single strong piece of evidence, then we should update on that single strong piece of evidence, being neither happy nor unhappy about its singular nature. Similarly, falsifying an established theory necessitates only a single contradictory (from within the confines of the theory) experimental result, as long as we can rely on that result being repeatable / done methodologically right.
I’m not sure if your post is mostly concerned with argument presentation, since you state that “the ‘single relatively strong argument’ approach to assessing the above claim is to try to synthesize as many of the above weak arguments and counterarguments as possible, into a single relatively strong argument.”—That’s not so much one strong argument, as simply taking all the weak arguments then presenting them in an intermixed and baked-together fashion. I still see the component arguments, it’s just that their presentation has become less clean.
Similar to how you can present a syllogism hidden in long and complicated sentences, or you can present its premises and conclusion clearly and numbered. If your post is solely concerned with labelling your arguments concisely and such that they can easily be separately evaluated, you’ll find little argument, but then again that’s a bit of a trivial message (“Clearly delineate your arguments to make it easier for someone to evaluate them”).
Doesn’t “one strong argument” mean something other than “many weak arguments intermixed”? Depending on the subject domain, I’d expect smart people to be able to come up with weak arguments for any position. However, for positions that correctly correspond to a feature of the territory, there should be stronger (not in your sense, in the sense of causing Bayesian reasoners with different priors to strongly update) arguments to be found as well, overall.
So one single monolithic strong argument (not an amalgam, but one distinct point to be made) should actually in generality be more evidence to shift a decision than a string of weaker arguments.
Thanks for the thoughtful comment. I modified my post to address it.
The question is: given a single strong piece of evidence in favor of A and several weak pieces of evidence against A, how should one weigh the evidence on both sides? Obviously this will depend on the particulars.
Broadly, my earlier approach was to focus on the single strong piece of evidence in favor of A, and ignore the weak pieces of evidence against A, whereas now my approach is to give some weight to the several weak pieces of evidence, and to allow them to overwhelm the single strong piece of evidence in favor of A in some situations.
This reflects poor presentation on my part. When I said “synthesize,” the sort of thing that I had in mind is to say “Correlation is not causation. Conventional wisdom is probably rooted in mistaking correlation for causation. Therefore we should ignore conventional wisdom.” This implicitly throws out the possibility that conventional wisdom isn’t rooted in mistaking correlation for causation.
In focusing on a single strong argument to the exclusion of other arguments, one is implicitly rejecting the weak arguments, and doing so constitutes an implicit attempt to synthesize the evidence.
That seems very commonsensical and should be very uncontroversial.
Well, it’s hard to be controversial when saying “you should give some weight even to weak evidence”. As opposed to evidence in favor of a position which would not shift your prior at all? That would be a contradictio in terminis, since evidence in favor is defined as shifting your prior to some degree. “In favor, even weakly” = P(A|evidence) > P(A).
What about the second part, which—because of the “in some situations” qualifier—is equivalent to “there exist situations such that a number of small belief shifts can overcome one large belief shift”? Common sense as well, since it just restates that adding enough epsilons (non-diminishing, and ergo terms of a diverging series) will overcome any fixed number, however large.
As a general rule, we should not expect there to be strong pieces of evidence in favor of a false position. Not strong in the sense of “well presented”, or “given by an authority figure”, but strong as evaluated by a person knowledgeable in the field. In physics, you won’t get to 5 sigma using many small updates, you’ll have a couple of strong results. There may be the occasional strange circumstance in which there appears to be actual strong evidence against a position which (the position) later turns out to be correct. However, there always will be a lot of weak evidence (at various values of ‘weak’) for or against anything, it’s just too easy to come up with using motivated cognition.
So we should take note if there’s strong evidence involved in any issue, but with “you should not outright ignore weak evidence”, we can all be friends.
I knew this in the abstract, but wasn’t adhering to it properly in practice. See my remarks about the shift in my beliefs about Penrose’s views on consciousness.
But there are often apparently strong pieces of evidence in favor of a false position. That’s the point of the latter half of the “Major weaknesses of the ‘single relatively strong argument’ approach” section of my post.
In practice, it’s often the case that all we have is weak evidence — the situation is just that some evidence is weaker than other evidence. It can be easy to deceive oneself into thinking that the relatively strong evidence is stronger than it is.
I agree if you’re talking about “sufficiently weak” evidence. But consider the example of quantitative careers and earnings in my post. I believe that the individual arguments supporting it are genuinely weak, but that there are fewer arguments of the same strength against the claim, so that there’s not much of a risk of motivated cognition skewing the conclusion. Do you disagree?
Jonah, I agree with what you say at least in principle, even if you would claim I don’t follow it in practice. A big advantage of being Bayesian is that you retain probability mass on all the options rather than picking just one. (I recall many times being dismayed with hacky approximations like MAP that let you get rid of the less likely options. Similarly when people conflate the Solomonoff probability of a bitstring with the shortest program that outputs it, even though I guess in that case, the shortest program necessarily has at least as much probability as all the others combined.)
My main comment on your post is that it’s hard to keep track of all of these things computationally. Probably you should try, but it can get messy. It’s also possible that in keeping track of too many details, you introduce more errors than if you had kept the analysis simple. On many questions in physics, ecology, etc., there’s a single factor that dominates all the rest. Maybe this is less true in human domains because rational agents tend to produce efficiencies due to eating up the free lunches.
So, I’m in favor of this approach if you can do it and make it work, but don’t let the best be the enemy of the good. Focus on the strong arguments first, and only if you have the bandwidth go on to think about the weak ones too.
I raise (at least) two different related points in my post:
“When an argument seems very likely to be wrong but could be right with non-negligible probability, classify it as such, rather than classifying it as false.” I think that you’re pretty good on this point, and better than I had been.
The other is one that you didn’t mention in your comment, and one that I believe that you and I have both largely missed in the past. This is that one doesn’t need a relatively strong argument to be confident in a subtle judgment call — all that one needs is ~4-8 independent weak arguments. (Note that generating and keeping track of these isn’t computationally infeasible.) This is a very crucial point, as it opens up the possibility of no longer needing to rely on single relatively strong arguments that aren’t actually too strong.
I believe that the point in #2 is closely related to what people call “common sense” or “horse sense” or “physicists’ intuition.” In the past, I had thought that “common sense” meant, specifically, “don’t deviate too much from conventional wisdom, because views that are far from mainstream are usually wrong.” Now I realize that it refers to something quite a bit deeper, and not specifically about conventional wisdom.
I’d suggest talking about these things with miruto.
Our chauffeur from last weekend has recently been telling me that physicists generally use the “many weak arguments” approach.
For example, the math used in quantum field theory remains without a rigorous foundation, and its discovery was analogous to Euler’s heuristic reasoning concerning the product formula for the sine function.
He also referred to scenarios in which (roughly speaking) you have a physical system with many undetermined parameters, you have ways of bounding different collections of them, and by looking at all of the resulting bounds together, you can bound the individual parameters tightly enough that the whole model is accurate.
Cool. Yes, many examples of #1 come to mind. As far as #2, I don’t believe I had thought of this as a principle specifically.
What I meant about a single factor dominating in physics was that in most cases, even when multiple factors are at play, one of those factors matters more than all the rest, such that you can ignore the rest. For example, an electron has gravitational attraction to the atomic nucleus, but this is trivial compared with the electromagnetic attraction. Similarly, the electromagnetic repulsion of the protons in the nucleus is trivial compared with the strong force holding them together. It’s rare in nature to have a close competition between forces, at least until you get to higher-level domains like inter-agent competition.
Yes, I agree with this. My comments were about the sort of work that physicists do, as opposed to the relative significance of different physical forces in analyzing physical systems.
Do you have/are you planning an argument demonstrating in detail use of MWA in physics?
I don’t know physics, but I think that my post on Euler and the Basel Problem gives a taste of it.
I found this unhelpful, because I do math and frequently engage in non-rigorous mathematical reasoning, which seems to me to have much more of a “one relatively strong argument” (ORSA) flavor. Or maybe it’s like two arguments: “here is a beautiful picture; it is correct in one trivial case; therefore it is correct everywhere.” My previous understanding was that physicists’ non-rigorous mathematical reasoning was much like mathematicians’ non-rigorous mathematical reasoning, and that my own non-rigorous reasoning is typical. So to accept this claim I have to change some belief.
Do you think that the specific example of Euler and the Basel Problem doesn’t count as an example of the use of MWAs? If so, I don’t necessarily disagree, but I think it’s closer to MWAs than most mathematical work is, and may be representative of the sort of reasoning that physicists use.
There might just be a terminological distinction here. When I think of the reasoning used by mathematicians/physicists, I think of the reasoning used to guess what is true—in particular to produce a theory with >50% confidence. I don’t think as much of the reasoning used to get you from >50% to >99%, because this is relatively superfluous for a mathematician’s utility function—at best, it doubles your efficiency in proving theorems. Whereas you are concerned more with getting >99%.
This is sort of a stupid point but Euler’s argument does not have very many parts, and the parts themselves are relatively strong. Note that if you take away the first, conceptual point, the argument is not very convincing at all—although this depends on how much calculation of how many even zeta values Euler does. It’s still a pretty far cry from the arguments frequently used in the human world.
Finally, while I can see why Euler’s reasoning may be representative of the sort of reasoning that physicists use, I would like to see more evidence that it is representative. If all you have is the advice of this chauffeur, that’s perfectly alright and I will go do something else.
I don’t have much more evidence, but I think that it’s significant that:
Physicists developed quantum field theory in the 1950s, and it still hasn’t been made mathematically rigorous, despite the fact that, e.g., Richard Borcherds appears to have spent 15 years (!!) trying.
The mathematicians who I know who have studied quantum field theory have indicated that they don’t understand how physicists came up with the methods that they did.
These suggest that the physicists who invented this theory reasoned in a very different way from how mathematicians usually do.
That part seems obvious: physicists treat math as a tool; it does not need to be perfect to get the job done. It can be inconsistent, self-contradictory, use techniques way outside their original realm of applicability, remove infinities by fiat; anything goes, as long as it works. Of course, physicists do prefer fine tools polished to perfection, and complain when they aren’t, but will use them anyway. And they invent and build new crude ones when there is nothing available off the shelf.
What I was highlighting is the effect size.
As a mathematician, it’s possible to get an impression of the type “physicists’ reasoning isn’t rigorous, because they don’t use, e.g., epsilon-delta proofs of theorems involving limits, and their reasoning is like a sloppier version of mathematical reasoning.”
The real situation is closer to “physicists dream up highly nontrivial things that are true, that virtually no mathematicians would have been able to come up with without knowledge of physics, and that mathematicians don’t understand sufficiently well to be able to prove even after dozens of years of reflection.”
But mathematicians also frequently dream up highly nontrivial things that are true, that mathematicians (and physicists) don’t understand sufficiently well to be able to prove even after dozens of years of reflection. The Riemann hypothesis is almost three times as old as quantum field theory. There are also the Langlands conjectures, Hodge conjecture, etc., etc. So it’s not clear that something fundamentally different is going on here.
I agree that the sort of reasoning that physicists use sometimes shows up in math.
I don’t think that the Riemann hypothesis counts as an example: as you know, its truth is suggested by surface heuristic considerations, so there’s a sense in which it’s clear why it should be true.
I think that the Langlands program is an example: it constitutes a synthesis of many known number-theoretic phenomena that collectively hinted at some general structure; they can be thought of as “many weak arguments” for the general conjectures.
But the work of Langlands, Shimura, Grothendieck and Deligne should be distinguished from the sort of work that most mathematicians do most of the time, which tends to be significantly more skewed toward deduction.
From what I’ve heard, quantum field theory allows one to accurately predict certain physical constants to 8 decimal places, with the reasons why the computations work remaining very unclear. But I know essentially nothing about this. As I said, I can connect you with my friend for details.
Most physicists most of the time aren’t Dirac, Pauli, Yang, Mills, Feynman, Witten, etc.
No, but my impression is that the physics culture has been more influenced by the MWA style than mathematical culture has. In particular, my impression is that most physicists understand “the big picture” (which has been figured out by using MWAs) whereas in my experience, most mathematicians are pretty focused on individual research problems.
As a tangent, I think it’s relatively clear both how physicists tend to think differently from mathematicians, and how they came up with path-integration-like techniques in QFT. In both math and physics, researchers will come up with an idea based on intuition, and then verify the idea appropriately. In math the correct notion of verification is proof; in physics it’s experimentation (with proof an acceptable second). This method of verification feeds back into how the researcher’s intuition works. In particular, physicists’ intuition is grounded in physical intuition and (generally) a thoroughly imprecise understanding of math, so from this perspective, using integral-like techniques without any established mathematical underpinnings is intuitively completely plausible. Mathematicians would shy away from this almost immediately, as their intuition would hit the brick wall of “no theoretical foundation”.
If you’re really curious, you can talk with my chauffeur (who has deep knowledge on this point).
Experts are apparently known to be not much better than amateurs outside of their area of expertise, so whatever Penrose claims about something other than General Relativity and High-energy Physics should have the same weight as that of, say, a grad student in the relevant area, at best. Especially given that he does not have a track record of being proven right in unrelated fields. Thus the argument that
should have no bearing on taking his claims in neuroscience seriously.
This is the core point of disagreement. The point is that “Experts are apparently known to be not much better than amateurs outside of their area of expertise” might be wrong (just as all of the other arguments against the truth of his beliefs might be wrong).
That could be wrong but it is overwhelmingly unlikely to be. If the “core point of disagreement” is that one can freely ignore established science if convenient then the core point is somewhat lacking.
Why do you think this?
People did science. I read textbooks.
This is not a complicated or ambiguous question.
Note that there’s some evidence that in some fields, experts are actually better outside their own field. This is discussed with relevant studies in Tetlock’s “Expert Political Judgement.” However, this is for predictions of the future, not judgments about correctness of basic models in their field.
That’s interesting, because:
Sorry, bad phrasing on my part. Tetlock’s work shows that experts do better outside their own field than they do within it, not necessarily better than other people.
I suspect your phrasing is still bad, because I don’t recall anything of the sort in the book.
I’m pretty sure that’s what I meant to say. Unfortunately, I don’t have my copy of the book on hand. I’ll have to get back to you.
Suppose:
P(penrose says | consciousness & expertise is general) = 0.5
P(penrose says | ~consciousness & expertise is general) = 0
i.e., if expertise is general, Penrose’s opinion is infinitely strong evidence.
P(penrose says | consc & expertise is narrow) = 0.5
P(penrose says | ~consc & expertise is narrow) = 0.25
i.e., if expertise is narrow, Penrose’s opinion is weak evidence.
Then suppose a 90% chance that expertise is general, i.e., we are somehow quite sure that experts are never wrong (quite sure that the argument that expertise is narrow is wrong).
Then:
P(penrose says | consc) = 0.5
P(penrose says | ~consc) ≈ 0.025
i.e., even under these extremely charitable assumptions, Penrose’s opinion is only 20x (medium) evidence.
Your assumptions are much more reasonable than I have used here, and will give you correspondingly weaker evidence.
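The arithmetic above can be checked mechanically; everything in this sketch just restates the stipulated numbers (none of them are empirical):

```python
# Checking the arithmetic above; all numbers are the comment's
# stipulated assumptions, not empirical estimates.
p_general = 0.9                  # assumed P(expertise is general)
p_says_given_consc = 0.5         # same under both expertise hypotheses
p_says_given_not_consc = (
    p_general * 0.0              # expertise general: Penrose never wrong
    + (1 - p_general) * 0.25     # expertise narrow: weak evidence
)
likelihood_ratio = p_says_given_consc / p_says_given_not_consc
print(round(p_says_given_not_consc, 3), round(likelihood_ratio, 3))  # 0.025 20.0
```

So even granting a 90% chance that expertise is fully general, Penrose’s say-so moves the odds by a factor of about 20, not infinitely.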
And then, why privilege penrose over the ~1e3 experts who disagree with him?
EDIT: If I were charitable, then I’d note that this argument applies to all arguments, including those on the other side; that strong evidence is in general quite hard to find unless you are very sure of your experimental apparatus.
It is valuable to have a list of “people who believe X” and “people who believe ~X,” and one might suspect that, if X, the majority position, is incorrect, the people on the ~X list will disproportionately be of higher quality. I’m not a good enough historian of science to know whether that’s been the case historically, especially because you would want to use contemporary measurements of quality: many people who believed ~X when it turned out to be right are estimated more highly in hindsight.
(More broadly, there may be systematic patterns to public support on scientific controversies, such that 1) one shouldn’t compare length of lists or treat positions of individuals as giving completely independent evidence and 2) there may be patterns that suggest known kinds of events.)
“Consilience” is a word that’s had little favour of late, given the importance of the concept; it deserves to be used more.
I haven’t read the comments yet, so apologies if this has already been said or addressed:
If I am watching others debate, and my attention is restricted to the arguments the opponents are presenting, then my using the “one strong argument” approach may not be a bad thing.
I’m assuming that weak arguments are easy to come by and can be constructed for any position, but strong arguments are rare.
In this situation I would expect anybody who has a strong argument to use it, to the exclusion of weaker ones: if A and B both have access to 50 weak arguments, and A also has access to 1 strong argument, then I would expect the debate to come out looking like (50 weak) vs. (1 strong), even though the underlying balance would be more like (50 weak) vs. (50 weak + 1 strong).
(By “having access to” an argument, I mean to include both someone’s knowing an argument, and someone’s having the potential to construct or come across an argument with relatively little effort.)
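A toy calculation of this selection effect, with purely illustrative likelihood ratios:

```python
strong_lr = 10.0   # assumed strength of the one strong argument (10:1)
weak_lr = 1.2      # assumed strength of each weak argument (1.2:1)
n_weak = 50        # both sides have access to 50 weak arguments

# What the audience sees: A presents only the strong argument,
# while B presents all 50 weak ones.
visible_for_A = strong_lr
visible_for_B = weak_lr ** n_weak
print(f"visible: {visible_for_A:.0f}:1 for A vs {visible_for_B:.0f}:1 for B")

# Underlying balance: the two sides' weak arguments cancel,
# leaving only the strong argument in A's favor.
underlying_for_A = strong_lr * weak_lr ** n_weak / weak_lr ** n_weak
print(f"underlying: {underlying_for_A:.0f}:1 for A")
```

Taken at face value, B’s 50 weak arguments look overwhelming (roughly 9100:1 under these assumptions), even though the underlying balance favors A; the visible debate badly misrepresents the actual evidence.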
My main reaction to this is that understanding a subtle situation requires much more careful reflection than occurs in the course of a debate, or in the course of watching one. It often takes 500+ hours. So I concede your point, but in practice I don’t think that it’s so relevant: if one is confining oneself to a few hours of attention, then one’s prospects of coming to an epistemically sound position aren’t very good in any case.
I think that another problem in the context of a debate is that people often throw down a lot of arguments at once. If the weak arguments all come from a single source within a short period of time, I tend to discount them (perhaps too much).
I would like to flag that the link about Euler’s argument is broken. I believe the correct link should be https://www.lesswrong.com/posts/WsmnfWTP28dXCKEy8/the-use-of-many-independent-lines-of-evidence-the-basel
Look what I found!
https://en.wikipedia.org/wiki/Boosting_(machine_learning) https://en.wikipedia.org/wiki/Bootstrap_aggregating
Yay prior literature! Always there several months later than when you initially tried to find it.
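Boosting and bagging formalize exactly this: an ensemble of weak, mostly independent learners can vote its way to high accuracy. A minimal Condorcet-style sketch, assuming each classifier is independently correct with probability 0.6 (a purely illustrative figure):

```python
import math

def majority_accuracy(p, n):
    """Probability that a majority vote of n independent classifiers,
    each correct with probability p, is correct (n odd)."""
    return sum(
        math.comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(n // 2 + 1, n + 1)
    )

# Accuracy of the ensemble grows with the number of weak voters.
for n in (1, 5, 25, 101):
    print(n, round(majority_accuracy(0.6, n), 3))
```

A single 60%-accurate classifier is barely better than a coin flip, but a hundred independent ones voting together are right almost all the time; correlation between the voters is, again, what limits this in practice.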
I don’t buy the assumption that seems to be implied that many arguments have to be weak and a single argument has to be strong.
Why not have many strong reasons instead of one weak reason?
Certainly for complex questions I find multi-threaded answers more convincing than single-threaded ones.
Fox over hedgehog for me.
In terms of picking a major, do something you enjoy that you can conceivably use to get a job. You can actually get a job with a philosophy degree. I did… after I quit accounting because it was too darn boring…
There are contexts in which one doesn’t have access to many strong arguments.
I’ll clarify that I wasn’t arguing that one should major in a quantitative subject — my discussion was restricted to earnings, not to the holistic impact of majoring in a given subject.
Not sure if I’m disagreeing with you, but I’d bet heavily against Penrose’s view given a way to do so.
I’d also like to go on record as giving any Friendly AI permission to advance me towards godhood faster than Mitchell Porter, by functionalist means.