I think your position is going to turn out to be unfalsifiable on the point of whether relationships involving honesty, equality and mutual support actually exist. If your response to claims that they exist is to say “Well in my experience they don’t exist, the people who think they do are just deluded” I can’t provide any evidence that will change your views. After all, I could just be deluded.
As for whether I’m engaging with, and have read, the “real” PUA literature or the “good” PUA literature, I’m not sure whether or not this is an instance of the No True Scotsman argument. There’s no question that a large part of the PUA literature and community is misogynist and committed to an ideology that positions PUAs as high-status and women and non-PUA men as low-status. As such, that part of PUA culture is antithetical to the goals of LW as I understand them, since those goals include maximising everyone’s utility.
If there’s a subset of positive-utility PUA thinking then that criticism does not apply and it’s at least possible that if they have scientific data to back up their claims then there is something useful to be found there.
I think it’s the PUA advocates’ burden of proof to show us that data, though, if there really is an elephant of good data pertinent to pursuing high net-utility outcomes in the room, as opposed to some truisms which predate PUA culture by a very long time, hidden under an encrustation of placebo superstitions.
I think your position is going to turn out to be unfalsifiable on the point of whether relationships involving honesty, equality and mutual support actually exist.
Huh? I didn’t say those things didn’t exist. I said I was not searching for a lack of those things (I even bolded the word “lack” so you wouldn’t miss it), and that I don’t see why you think that PUA requires such a lack.
No True Scotsman argument
Authentic Man Program and Johnny Soporno are the two schools I’m aware of that are strongly in the honesty and empowerment camps, AFAICT, and would constitute the closest things to “true scotsmen” for me. Most other things that I’ve seen have been a bit of a mixed bag, in that both empathetic and judgmental material (or honest and dishonest) can be found in the same set of teachings.
Of notable interest to LW-ers, those two schools don’t advocate even the token dishonesty of false premises for starting a conversation, let alone dishonesty regarding anything more important than that.
(Now, if you want to say that these schools aren’t really PUA, then you’re going to be the one making a No True Scotsman argument. ;-) )
and it’s at least possible that if they have scientific data to back up their claims then there is something useful to be found there.
As I said, I’m less interested in “scientific” evidence than Bayesian evidence. The latter can be disappointingly orthogonal to the former, in that what’s generally good scientific evidence isn’t always good Bayesian evidence, and good Bayesian evidence isn’t always considered scientific.
More to the point, if your goals are more instrumental than epistemic, the reason why a particular thing works is of far less interest than whether it works and how it can be utilized.
I took a quick look at AMP and Soporno’s web sites and I’m more than happy to accept them as non-misogynistic dating advice sources aiming for mutually beneficial relationships. I wasn’t previously aware of them but I unconditionally accept them as True Scotsmen.
I’m now interested in how useful their advice is, either in instrumental or epistemic terms. Either would be significant, but if there is no hard evidence then the fact that their intentions are in step with those of LW doesn’t get them a free pass if they don’t have sound methodology behind their claims.
I’m aware Eliezer thinks there’s a difference between scientific evidence and Bayesian evidence, but it’s my view that this is because he has a slightly unsophisticated understanding of what science is. My own view is that the sole difference between the two is that science commands you to suspend judgment until the null hypothesis can be rejected at p < 0.05, at least for the purposes of what is allowed into the scientific canon as provisional fact, while Bayesians are more comfortable making bets with greater degrees of uncertainty.
Regardless, if your goals are genuinely instrumental you very much want to figure out what parts of the effect are due to placebo effects and what parts are due to real effects, so you can maximise your beneficial outcomes with a minimum of effort. If PUA is effective to some extent but solely due to placebo effects then it only merits a tiny footnote in a rationalist approach to relationships. If it has effects beyond placebo effects then and only then is there something interesting for rationalists to look at.
Regardless, if your goals are genuinely instrumental you very much want to figure out what parts of the effect are due to placebo effects and what parts are due to real effects, so you can maximise your beneficial outcomes with a minimum of effort.
There is a word for the problem that results from this way of thinking about instrumental advice. It’s called “akrasia”. ;-)
Again, if you could get people to do things without taking into consideration the various quirks and design flaws of the human brain (from our perspective), then self-help books would be little more than to-do lists.
In general, when I see somebody worrying about placebo effects in instrumental fields affected by motivation, I tend to assume that they are either:
Inhumanly successful and akrasia-free at all their chosen goals (not bloody likely),
Not actually interested in the goal being discussed, having already solved it to their satisfaction (à la skinny people accusing fat people of lacking willpower), or
Very interested in the goal, but not actually doing anything about it, and thus very much in need of a reason to discount their lack of action by pointing to the lack of “scientifically” validated advice as their excuse for why they’re not doing that much.
I’d prefer not to discuss this at the ad hominem level. You can assume for the sake of argument whichever of those three assumptions you prefer is correct, if it suits you. I’m indifferent to your choice—it makes no difference to my utility. I make no assumptions about why you hold the views you do.
My view is that the rationalist approach is to take it apart to see how it works, and then maybe afterwards put the bits that actually work back together with a dollop of motivating placebo effect on top.
The best way to approach research into helping overweight people lose weight is to study human biochemistry and motivation, and see what combinations of each work best. Not to leave the two areas thoroughly entangled and dismiss those interested in disentangling them as having the wrong motivations. I think the same goes for forming and maintaining romantic relationships.
I’d prefer not to discuss this at the ad hominem level.
Me either. I was asking you for a fourth alternative on the presumption that you might have one.
FWIW, I don’t consider any of those alternatives somehow bad, nor is my intention to use the classification to score some sort of points. People who fall into category 3 are of particular interest to me, however, because they’re people who can potentially be helped by understanding what it is they’re doing.
To put it another way, it wasn’t a rhetorical question, but one of information. If you fall in category 1 or 2, we have little further to discuss, but that’s okay. If you fall in category 3, I’d like to help you out of it. If you fall in an as-yet-to-be-seen category 4, then I get to learn something.
So, win, win, win, win, in all four cases.
The best way to approach research into helping overweight people lose weight is to study human biochemistry and motivation, and see what combinations of each work best.
This is conflating things a bit: my reference to weight loss was pointing out that “universal” weight-loss advice doesn’t really exist, so a rationalist seeking to lose weight must personally test alternatives, if he or she cannot afford to wait for science to figure out the One True Theory of Weight Loss.
My view is that the rationalist approach is to take it apart to see how it works
This presupposes that you already have something that works, which you will not have unless you first test something. Even if you are only testing scientifically-validated principles, you must still find which are applicable to your individual situation and goals!
Heck, medical science uses different treatments for different kinds of cancer, and occasionally different treatments for the same kind of cancer, depending on the situation or the actual results on an individual - does this mean that medical science is irrational? If not, then pointing a finger at the variety of situation-specific PUA advice is just rhetoric, masquerading as reasoning.
I imagine you’d put me in category #2, as I’m currently in a happy long-term relationship. However, my self-model says that three years ago, when I was single and looking for a partner, I would still have wanted to know what the actual facts about the universe were, so I’d put myself in category #4: the category of people for whom it’s reflexive to ask what the suitably blinded, suitably controlled evidence says, whether or not they personally have a problem at that point in their lives with achieving relevant goals.
I think we should worry about placebo effects everywhere they get in the way of finding out how the universe actually works, whether they happen to be in instrumental fields affected by motivation or somewhere else entirely.
That didn’t mean that I chose celibacy until the peer-reviewed literature could show me an optimised mate-finding strategy, of course, but it does mean that I don’t pretend that guesswork based on my experience is a substitute for proper science.
The difference between your PUA example and medicine is that medicine usually has relevant evidence for every single one of those medical decisions. (Evidence-based medicine has not yet driven the folklore out of the hospital by a long chalk but the remaining pockets of irrationality are a Very Bad Thing). Engineers use different materials for different jobs, and photographers use different lenses for different shots too. I don’t see how the fact that these people do situation-specific things gets you to the conclusion that because PUAs are doing situation-specific things too they must be right.
I don’t see how the fact that these people do situation-specific things gets you to the conclusion that because PUAs are doing situation-specific things too they must be right.
It doesn’t. It just refutes your earlier rhetorical conflation of PUA with alternative medicine on the same grounds.
At this point, I’m rather tired of you continually reframing my positions to stronger positions, which you can then show are fallacies.
I’m not saying you’re doing it on purpose (you could just be misunderstanding me, after all), but you’ve been doing it a lot, and it’s really lowering the signal-to-noise ratio. Also, you appear to disagree with some of LW’s premises about what “rationality” is. So, I don’t think continued discussion along these lines is likely to be very productive.
It doesn’t. It just refutes your earlier rhetorical conflation of PUA with alternative medicine on the same grounds.
My intent was to show that in the absence of hard evidence PUA has the same epistemic claim on us as any other genre of folklore or folk-psychology, which is to say not much.
At this point, I’m rather tired of you continually reframing my positions to stronger positions, which you can then show are fallacies.
I admit I’m struggling to understand what your positions actually are, since you are asking me questions about my motivations and accusing me of “rhetoric, not reasoning” but not telling me what you believe to be true and why you believe it to be true. Or to put it another way, I don’t believe you have given me much actual signal to work with, and hence there is a very distinct limit to how much relevant signal I can send back to you.
Maybe we should reboot this conversation and start with you telling me what you believe about PUA and why you believe it?
Maybe we should reboot this conversation and start with you telling me what you believe about PUA and why you believe it?
Ok. I’ll hang in here for a bit, since you seem sincere.
Here’s one belief: PUA literature contains a fairly large number of useful, verifiable, observational predictions about the nonverbal aspects of interactions occurring between men and women while they are becoming acquainted and/or attracted.
Why do I believe this? Because their observational predictions match personal experiences I had prior to encountering the PUA literature. This suggests to me that when it comes to concrete behavioral observations, PUAs are reasonably well-calibrated.
For that reason, I view such PUA literature—where and only where it focuses on such concrete behavioral observations—as being relatively high quality sources of raw observational data.
In this, I find PUA literature to be actually better than the majority of general self-help and personal development material, as there is often nowhere near enough in the way of raw data or experiential-level observation in self-help books.
Of course, the limitation on my statements is the precise definition of “PUA literature”, as there’s definitely a selection effect going on. I tend to ignore PUA material that is excessively misogynistic on its face, simply because extracting the underlying raw data is too… tedious, let’s say. ;-) I also tend to ignore stuff that doesn’t seem to have any connection to concrete observations.
So, my definition of “PUA literature” is thus somewhat circular: I believe good stuff is good, having carefully selected which bits to label “good”. ;-)
Another aspect of my possible selection bias is that I don’t actually read PUA literature in order to do PUA!
I read PUA literature because of its relevance to topics such as confidence, fear, perceptions of self-worth, and other more common “self-help” topics that are of interest to me or to my customers. By comparison, PUA literature (again using my self-selected subset) contains much better raw data than traditional self-help books, because it comes from people who’ve relentlessly calibrated their observations against a harder goal than just, say, “feeling confident”.
Here’s one belief: PUA literature contains a fairly large number of useful, verifiable, observational predictions about the nonverbal aspects of interactions occurring between men and women while they are becoming acquainted and/or attracted.
Why do I believe this? Because their observational predictions match personal experiences I had prior to encountering the PUA literature. This suggests to me that when it comes to concrete behavioral observations, PUAs are reasonably well-calibrated.
The problem with this line of reasoning is that there are people who believe they have relentlessly calibrated their observations against reality using high quality sources of raw observational data and that as a result they have a system that lets them win at Roulette. (Barring high-tech means to track the ball’s vector or identifying an unbalanced wheel).
Roulette seems to be an apt comparison because, based on the figures someone else quoted or linked to earlier about a celebrated PUAist hitting on 10,000 women and getting 300 of them into bed, the odds of a celebrated PUAist getting laid on a single approach, even according to their own claims, are not far off the odds of correctly predicting exactly which hole a Roulette ball will land in.
So when these people say “I tried a new approach where I flip flopped, be-bopped, body rocked, negged, nugged and nogged, then went for the Dutch Rudder and I believe this worked well”, unless they tried this on a really large number of women, so that they could detect changes in a base rate of 3% success, I really don’t think they have any meaningful evidence. Did their success rate go up from 3% to 4% or what, and what are their error bars?
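(To make the error-bar point concrete, here is a rough back-of-the-envelope sketch in Python. The 300-in-10,000 figure is the one quoted above; the 3%-vs-4% comparison and the confidence-interval and sample-size formulas are just standard illustrative approximations, not anyone’s actual data.)

```python
import math

def wilson_ci(successes, n, z=1.96):
    """Approximate 95% confidence interval for a binomial proportion (Wilson score)."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half

# The figure quoted above: 300 successes in 10,000 approaches, i.e. a 3% hit rate.
lo, hi = wilson_ci(300, 10_000)
print(f"95% CI on the 3% hit rate: {lo:.2%} to {hi:.2%}")   # roughly 2.7% to 3.4%

def n_per_group(p1, p2, z_alpha=1.96, z_beta=0.84):
    """Rough sample size per condition to detect p1 vs p2 (two-sided alpha=0.05, power=0.8)."""
    p_bar = (p1 + p2) / 2
    num = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
           + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return math.ceil(num / (p1 - p2) ** 2)

# Approaches needed per condition to tell a 3% baseline from a 4% "improved" technique.
print(n_per_group(0.03, 0.04))   # on the order of 5,000 approaches per condition
```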
What’s the base rate for people not using PUA techniques anyway? People other than PUAs are presumably getting laid, so it’s got to be non-zero. The closer it is to 3% the less effect PUA techniques are likely to have.
I’ve already heard the response “Look, we don’t get just one bit of data as feedback. We PUAs get all sorts of nuanced feedback about what works and does not”. If that’s so and this feedback is doing some good this should be reflected in your hit rate for getting laid. If picking up women and getting them in to bed is an unfair metric for PUA effectiveness I really think it should be called something other than PUA.
My thinking is that you don’t have enough data to distinguish whether you are in a world where PUA training has a measurable effect, from a world where PUA have an unfalsifiable mythology that allows them to explain their hits and misses to themselves, and a collection of superstitions about what works and does not, but no actual knowledge that separates them in terms of success rate from those who simply scrub up, dress up and ask a bunch of women out.
I want to see that null hypothesis satisfactorily falsified before I allow that there is an elephant in the room.
Notice that nowhere in my post did I say pickup artists get laid, let alone that they get laid more often!
Nowhere did I state anything about their predictions of what behavior works to get laid!
I even explicitly pointed out that the information I’m most interested in obtaining from PUA literature has nothing to do with getting laid!
So just by talking about the subject of getting laid, you demonstrate a complete failure to address what I actually wrote, vs. what you appear to have imagined I wrote.
So, please re-read what I actually wrote and respond only to what I actually wrote, if you’d like me to continue to engage in this discussion.
Okay. What observable outcomes do you think you can obtain at better-than-base-rate frequencies employing these supposed insights, and why do you think you can obtain them?
As I said earlier I think that if PUA insights cannot be cashed out in a demonstrable improvement in the one statistic which you would think would matter most to them, rate of getting laid, then there is grounds to question whether these supposed insights are of any use to anyone.
But if you would prefer to use some other metric I’m willing to look at the evidence.
That didn’t mean that I chose celibacy until the peer-reviewed literature could show me an optimised mate-finding strategy, of course, but it does mean that I don’t pretend that guesswork based on my experience is a substitute for proper science.
Guesswork based on your experience isn’t supposed to be a substitute for science. It’s the part of science that you do when choosing which phenomena you want to test, well before you get to the blinding and peer review.
The flip side is that proper science isn’t a substitute for either instrumental rationality or epistemic rationality. Limiting your understanding of the world entirely to what is already published in journals gives you a model of the world that is subjectively objectively wrong.
I don’t disagree but a potentially interesting research area isn’t an elephant in the room that demands attention in a literature review, and limiting yourself to proper science is no sin in a literature review either. Only when the lessons we can learn from proper science are exhausted should we start casting about in the folklore for interesting research areas, and we certainly shouldn’t put much weight on anecdotes from this folklore. In Bayesian terms such anecdotes should shift our prior probability very, very slightly if at all.
My own view is that the sole difference between the two is that science commands you to suspend judgment until the null hypothesis can be rejected at p < 0.05, at least for the purposes of what is allowed into the scientific canon as provisional fact, while Bayesians are more comfortable making bets with greater degrees of uncertainty.
Why don’t you first describe one, then the other, then contrast them? Then, describe Eliezer’s view and contrast that with your position.
I’ll try to do it briefly, but it will be a bit tight. Let’s see how we go.
Bayes’ Theorem is part of the scientific toolbox. Pick up a first year statistics textbook and it will be in there, although not always under that name (look for “conditional probability” or similar constructs). Most of scientific methodology is about ensuring that you do your Bayesian updating right, by correctly establishing the base rate and the probability of your observations given the null hypothesis. (Scientists don’t state their P(A), but they certainly have an informal sense of what P(A) is likely to be and are more inclined to question a conclusion if it is unlikely than if it is likely).
If you’re doing Bayes right it’s the same as doing science, but I think some of the LW groupthink holds that you can do a valid Bayesian update in the absence of a rigorously established base rate, and so they think this is a difference between being a good Bayesian and being a good scientist. I think they are just being bad Bayesians since updating is no better than guesswork in the absence of a rigorously obtained P(B).
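(A toy illustration of what I mean, with entirely made-up numbers: the posterior Bayes’ rule gives you is only as good as the P(B|~A) you feed into it.)

```python
def posterior(prior, p_b_given_a, p_b_given_not_a):
    """Bayes' rule: P(A|B) = P(B|A)P(A) / [P(B|A)P(A) + P(B|~A)P(~A)]."""
    evidence = p_b_given_a * prior + p_b_given_not_a * (1 - prior)
    return p_b_given_a * prior / evidence

prior = 0.01          # P(A): prior that the technique really works
p_b_given_a = 0.9     # P(B|A): chance of seeing the impressive-looking result if it works

# Naive base rate: you assume the result would be rare if the technique were useless.
print(posterior(prior, p_b_given_a, p_b_given_not_a=0.05))   # ~0.15: a big update

# Rigorously established base rate: the same result is common even when nothing works.
print(posterior(prior, p_b_given_a, p_b_given_not_a=0.85))   # ~0.011: barely moves
```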
Eliezer (based on The Dilemma: Science or Bayes? ) doesn’t quite carve up science-culture from ideal-science-methodology the way I do, and infers that there is something wrong with Science because the culture doesn’t care about revising instrumentally-indistinguishable models to make them more Eliezer-intuitive. I think this has more to do with trying to win a status war with Science than with any differences in predicted observations that matter.
That doesn’t mean it doesn’t underlie the entire structure. As an analogy, to get from New York to Miami, one must generally go south. But the instructions on how to get there will be a hodgepodge: walk north out of the building, west to the car, drive due east, then turn south...the plane takes off headed east...and turns south...etc. Showing that going south is one of several ways to turn while walking doesn’t mean it’s no different conceptually from north for getting from New York to Miami. Similarly:
they think this is a difference between being a good Bayesian and being a good scientist.
If one is paid to do plumbing, then there is no difference between being a good plumber and a “good Bayesian”, and in that sense there is no difference between being a “good Bayesian” and a “good scientist”.
In the sense in which it is intended, there is a difference between being a “good Bayesian” and a “good scientist”. To continue the analogy, if one must go from Ramsey to JFK airport across the Tappan Zee Bridge, one’s route will be on a convoluted path to a bridge that’s in a monstrously inconvenient location. It was built there—at great additional expense as that is where the river is widest—to be just outside of the NY/NJ Port Authority’s jurisdiction. The best route from Ramsey to Miami may be that way, but that accommodates human failings, and is not the direct route. Likewise for every movement that is made in a direction not as the crow flies. Bayesian laws are the standard by which the crow flies, against which it makes sense to compare the inferior standards that better suit our personal and organizational deficiencies.
infers that there is something wrong with Science
Well, yes and no. It’s adequately suited for the accumulation of not-false beliefs, but it both could be better instrumentally designed for humans and is not the bedrock of thinking by which anything works. The essential thing in the method you described is the part where “Scientists...have an informal sense of what P(A) is likely to be and are more inclined to question a conclusion if it is unlikely than if it is likely”. What abstraction describes the scientist’s thought process, the engine within the scientific method? I suggest it is Bayesian reasoning, but even if it is not, one thing it cannot be is more of the Scientific method, as that would lead to recursion. If it is not Bayesian reasoning, then there are some things I am wrong about, and Bayesianism is a failed complete explanation, and the Scientific method is half of a quite adequate method—but they are still different from each other.
the probability of your observations given the null hypothesis.
Other things being equal, a smaller P(B|~A) means a larger P(A|B) by Bayes’ Rule, so the direction is right; that’s why we can make planes that don’t fall out of the sky. But just using P(B|~A) isn’t what’s done, because scientists interject their subjective expectations here and pretend they do not. P(B|~A) doesn’t capture whether or not a researcher would have published something had she found a two-tailed rather than a one-tailed test (a complaint about a paper I read just a few hours ago). What goes into p-values necessarily involves the arbitrary classes the scientist has decided evidence would fit in, and then measures his or her surprise at the class of evidence that is found. That’s not P(B|~A), it’s P(C|~A).
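(A minimal toy example of that class-of-evidence point, with invented data: the same observation yields a different reported p-value depending on which outcomes the researcher decided in advance would count as at least as surprising.)

```python
from math import comb

def upper_tail(n, k, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

n, k = 100, 60                    # invented data: 60 "hits" in 100 trials under a 50% null
one_tailed = upper_tail(n, k)     # surprise class: "at least this many hits"
two_tailed = 2 * one_tailed       # surprise class: "at least this extreme in either direction"
print(f"one-tailed p = {one_tailed:.3f}, two-tailed p = {two_tailed:.3f}")
# Same observation B, different class C of outcomes counted as "as surprising as B",
# hence a different number gets reported as if it were P(B|~A).
```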
you can do a valid Bayesian update in the absence of a rigorously established base rate...updating is no better than guesswork in the absence of a rigorously obtained P(B)
Do you have examples of boundary cases that distinguish a rigorously established one with one that isn’t?
I think this has more to do with trying to win a status war with Science than with any differences in predicted observations that matter.
If one believes in qualitatively different beliefs, the rigorous and the non-rigorous, one falls into paradoxes such as the lottery paradox. It’s important to establish the actual nature of knowledge as probabilistic, and not be tricked into thinking science is a separate non-overlapping magisteria with other things.
With such actually correct understanding of how beliefs should work, we can think about improving our thinking rather than eternally and in vain trying to smooth out a ripple in a rug that has a table on each of its corners, hoping our mistaken view of the world has few harmful implications like “Jesus Christ is God’s only son” and not “life begins at conception”.
Or, we could not act on our most coherent world-views, only acting according to whatever fragment of thought our non-coherent attention presents to us. Not appealing.
It’s important to establish the actual nature of knowledge as probabilistic, and not be tricked into thinking science is a separate non-overlapping magisteria with other things.
Thank you for saying my point better than I was able to.
What abstraction describes the scientist’s thought process, the engine within the scientific method? I suggest it is Bayesian reasoning but even if it is not, one thing it cannot be is more of the Scientific method, as that would lead to recursion. If it is not Bayesian reasoning, no matter, Bayesianism is a failed complete explanation and the Scientific method is half an adequate method—they are still different from each other.
I don’t think scientists think about it much. That’s more the sort of thing philosophers of science think about. The smarter scientists do what is essentially Bayesian updating, although very few of them would actually put a number on their prior and calculate their posterior based on a surprising p value. They just know that it takes a lot of very good evidence to overturn a well-established theory, and not so much evidence to establish a new claim consistent with the existing scientific knowledge.
What goes into p-values necessarily involves the arbitrary classes the scientist has decided evidence would fit in, and then measures his or her surprise at the class of evidence that is found. That’s not P(B|~A), it’s P(C|~A).
Stating your hypothesis beforehand and specifying exactly what will and will not count as evidence before you collect your data is a very good way of minimising the effect of your own biases, but naughty scientists can and do take the opportunity to cook the experiment by strategically choosing what will count as evidence. Still, overall it’s better than letting scientists pore over the entrails of their experimental results and make up a hypothesis after the fact. If a great new hypothesis comes out of the data then you have to do your legwork and do a whole new experiment to test the new hypothesis, and that’s how it should be. If the effect is real it will keep. The universe won’t change on you.
Do you have examples of boundary cases that distinguish a rigorously established one with one that isn’t?
It’s not a binary distinction. Rather, if you’re unaware of the ways that people’s P(B) estimates can be wildly inaccurate and think that your naive P(B) estimates are likely to be accurate then you can update into all sorts of stupid and factually false beliefs even if you’re an otherwise perfect Bayesian.
The people who think that John Edward can talk to dead people might well be perfect Bayesians who just haven’t checked to see what the probability is that John Edward could produce the effects he produces in a world where he can’t talk to dead people. If you think the things he does are improbable then it’s technically correct to update to a greater belief in the hypothesis that he can channel dead people. It’s only if you know that his results are exactly what you’d expect in a world where he’s a fake that you can do the correct thing, which is not update your prior belief that the probability that he’s a fake is 99.99...9%.
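(In odds form, with made-up numbers: the update is driven entirely by the likelihood ratio, and when the fake-medium hypothesis predicts the observed hits just as well, that ratio is about 1 and the prior barely moves.)

```python
def update_odds(prior_odds, likelihood_ratio):
    """Odds form of Bayes' rule: posterior odds = prior odds * P(B|A) / P(B|~A)."""
    return prior_odds * likelihood_ratio

prior_odds = 1 / 10_000   # prior odds that the medium really talks to the dead

# If you think a fake could almost never produce those hits, the evidence looks strong:
print(update_odds(prior_odds, likelihood_ratio=0.9 / 0.01))   # odds multiplied by 90

# If cold reading produces exactly those hits, the evidence is worth nothing:
print(update_odds(prior_odds, likelihood_ratio=0.9 / 0.9))    # odds unchanged
```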
If someone’s done some actual work to see if they can falsify the null hypothesis that PUA techniques are indistinguishable from a change of clothes, a comb, a shower and asking some women out, I’d be interested in seeing it. In the absence of such work I think good Bayesians have to recognise that they don’t have a P(B) with small enough error bars to be very useful.
Stating your hypothesis beforehand and specifying exactly what will and will not count as evidence before you collect your data is a very good way of minimising the effect of your own biases
Exactly, it’s a cost and a deviation from ideal thinking to minimize the influence of scientists who receive no training in debiasing. So not “If you’re doing Bayes right it’s the same as doing science”, where “science” is an imperfect human construct designed to accommodate the more biased of scientists.
If a great new hypothesis comes out of the data then you have to do your legwork and do a whole new experiment to test the new hypothesis, and that’s how it should be. If the effect is real it will keep. The universe won’t change on you.
These are costs. It’s important, and in some contexts cheap, to know why and how things work instead of saying “I’ll ignore that since enough replication always solves such problems,” when one doesn’t know in which cases one is doing nearly pointless extra work and in which one isn’t doing enough replication. It’s an obviously sub-optimal solution along the lines of “thinking isn’t important; assume infinite resources.”
you can update into all sorts of stupid and factually false beliefs even if you’re an otherwise perfect Bayesian.
It’s praise through faint damnation of the laws of logic that they don’t prevent one from shooting one’s own foot off. Handcuffs are even better at that task, but they are less useful for figuring out what is true.
It’s not a binary distinction.
Exactly, so in “some of the LW groupthink holds that you can do a valid Bayesian update in the absence of a rigorously established base rate,” they are right, and “updating is no better than guesswork in the absence of a rigorously obtained P(B),” is not always true, such as when the following condition doesn’t apply, and it doesn’t here:
if you’re unaware of the ways that people’s P(B) estimates can be wildly inaccurate and think that your naive P(B) estimates are likely to be accurate
What do you think this site is for? People are reading and sharing research papers about biases in their free time. One could likewise criticize jet fuel for being inappropriate for an old fashioned coal powered locomotive. Yes, jet fuel will explode a train...this is not a flaw of jet fuel, and it does not mean that the coal-train is better at transporting things.
If someone’s done some actual work to see if they can falsify the null hypothesis that PUA techniques
That’s not the claim in question.
In any case, there are better ways to think about this subject than with null hypotheses. Those are social constructs focused (decently) on preventing belief in untrue things rather than on determining what’s most likely true; here false beliefs have relatively less cost than in most of science, and will in any case only be held probabilistically.
Exactly, it’s a cost and a deviation from ideal thinking to minimize the influence of scientists who receive no training in debiasing. So not “If you’re doing Bayes right it’s the same as doing science”, where “science” is an imperfect human construct designed to accommodate the more biased of scientists.
There’s a very good reason why we do double-blind, placebo-controlled trials rather than just recruiting a bunch of people who browse LW to do experiments with, on the basis that since LWers are “trained in debiasing” they are immune to wishful thinking, confirmation bias, the experimenter effect, the placebo effect and so on.
I have a great deal more faith in methodological constructs that make it impossible for bias to have an effect than in people’s claims to “debiased” status.
Don’t get me wrong, I think that training in avoiding cognitive biases is very important because there are lots of important things we do where we don’t have the luxury of specifying our hypotheses in strictly instrumental terms beforehand, collecting data via suitably blinded proxies and analysing it just in terms of our initial hypothesis.
However my view is that if you think that scientific methodology is just a set of training wheels for people who haven’t clicked on all the sequences yet and that browsing LW makes you immune to the problems that scientific methodology exists specifically to prevent then it’s highly likely you overestimate your resistance to bias.
These are costs. It’s important, and in some contexts cheap, to know why and how things work instead of saying “I’ll ignore that since enough replication always solves such problems,” when one doesn’t know in which cases one is doing nearly pointless extra work and in which one isn’t doing enough replication. It’s an obviously sub-optimal solution along the lines of “thinking isn’t important; assume infinite resources.”
There’s also a cost to acting on the assumption that every correlation is meaningful in a world where we have so much data available to us that we can find arbitrarily large numbers of spurious correlations at P<0.01 if we try hard enough. Either way you’re spending resources, but spending resources in the cause of epistemological purity is okay with me. Spending resources on junk because you are not practising the correct purification rituals is not.
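(A quick simulation of that point, using pure noise with no real effects anywhere; the sample sizes and the 0.256 critical value for n = 100 are illustrative, not from any real dataset.)

```python
import random

def pearson_r(xs, ys):
    """Sample correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (sx * sy)

random.seed(0)
n_obs, n_tests = 100, 1000
critical_r = 0.256          # |r| above this corresponds roughly to p < 0.01 for n = 100

false_discoveries = 0
for _ in range(n_tests):
    xs = [random.gauss(0, 1) for _ in range(n_obs)]
    ys = [random.gauss(0, 1) for _ in range(n_obs)]
    if abs(pearson_r(xs, ys)) > critical_r:
        false_discoveries += 1

print(false_discoveries)    # around 10 of 1000 pure-noise pairs look "significant" at p < 0.01
```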
It’s praise through faint damnation of the laws of logic that they don’t prevent one from shooting one’s own foot off. Handcuffs are even better at that task, but they are less useful for figuring out what is true.
The accepted scientific methodology is more like a safety rope or seat belt. Sometimes annoying, almost always rational.
What do you think this site is for? People are reading and sharing research papers about biases in their free time. One could likewise criticize jet fuel for being inappropriate for an old fashioned coal powered locomotive. Yes, jet fuel will explode a train...this is not a flaw of jet fuel, and it does not mean that the coal-train is better at transporting things.
Rather than what a site is for I focus on what a site is.
In many, many ways this site has higher quality discourse than, say, the JREF forums and a population who on average are better versed in cognitive biases. However this discussion has made it obvious to me that on average the JREF forumites are far more aware than the LWers of the various ways that people’s estimates of P(B) can be wrong and can be manipulated.
They would never put it in those terms since Bayes is a closed book to them, but they are very well aware that you can work yourself into completely wrong positions if you aren’t sophisticated enough to correctly estimate the actual base rate at which one would expect to observe things like homeopathy apparently working, people apparently talking to the dead, people apparently having psychic powers, NLP apparently letting you seduce people and so on in worlds where none of these things did anything except act as placebos (at best).
If your P(B) is off then using Bayes’ Theorem is just being a mathematically precise idiot instead of an imprecise idiot. You’ll get to exactly the right degree of misguided belief, based on the degree to which you’re mistaken about the correct value of P(B), but that’s still far worse than being someone who wouldn’t know Bayes from a bar of soap but who intuitively perceives something closer to the correct P(B).
The idea that LW browsers think they are liquid-fuelled jets while the scientists who do the actual work of moving society forward are boring old coal trains worries me. I think of LW’s “researchers” as a bunch of enthusiastic amateurs with cheap compasses and hand-drawn maps running around in the bushes in a mildly organised fashion, while scientists are painstakingly and one inch at a time building a gigantic sixteen-lane highway for us all to drive down.
There’s a very good reason why we do double-blind, placebo-controlled trials rather than just recruiting a bunch of people who browse LW to do experiments with
Yes, and people who actually understand the tradeoffs in using formal scientific reasoning and its deviations from the laws of reasoning are the only people in position to intelligently determine that. Those who say “always use the scientific method for important things” or, though I don’t know that there ever has been or ever will be such a person, “always recruit a bunch of people who browse LW,” are not thinking any more than a broken clock is ticking. As an analogy, coal trains are superior to jet planes for transporting millions of bushels of wheat from Alberta to Toronto. It would be inane and disingenuous for broken records always calling for the use of coal trains to either proclaim their greater efficiency in determining which vehicle to use to transport things because they got the wheat case right or pretend that they have a monopoly on calling for the use of trains.
With reasoning, one can intelligently assess a situation’s particulars and spend to eliminate a bias (for example by making a study double-blind), rather than doing that every time or relying on raw skill in every case, and without relying on intuition to determine when. One can see that in an area the costs of thinking something true when it isn’t exceed the costs of thinking it’s false when it’s true, and set up correspondingly strict protocols, rather than blindly always paying, in true things not believed, time, and money, for the same, sometimes inadequate and sometimes excessive, amount of skepticism.
However my view is that if you think that scientific methodology is just a set of training wheels for people who haven’t clicked on all the sequences yet and that browsing LW makes you immune to the problems that scientific methodology exists specifically to prevent
My view is that if you think anyone who has interacted with you in this thread has that view you have poor reading comprehension skills.
There’s also a cost to acting on the assumption that every correlation is meaningful
So one can simply...not do that. And be a perfectly good Bayesian.
spending resources in the cause of epistemological purity is okay with me. Spending resources on junk because you are not practising the correct purification rituals is not.
It is not the case that every expenditure reducing the likelihood that something is wrong is optimal, as one could instead spend a bit on determining which areas ought to have extra expenditure reducing the likelihood that something is wrong there.
In any case, science has enshrined a particular few levels of spending on junk that it declares perfectly fine because the “correct” purification rituals have been done. I do not think that such spending on junk is justified because in those cases no, science is not strict enough. One can declare a set of arbitrary standards and declare spending according to them correct and ideologically pure or similar, but as one is spending fungible resources towards research goals this is spurious morality.
You’ll get to exactly the right degree of misguided belief...far worse than being someone who wouldn’t know Bayes from a bar of soap but who intuitively
Amazing, let me try one. If a Bayesian reasoner is hit by a meteor and put into a coma, he is worse off than a non-Bayesian who stayed indoors playing Xbox games and was not hit by a meteor. So we see that Bayesian reasoning is not sufficient to confer immortality and transcendence into a godlike being made of pure energy.
People on this site are well aware that if scientific studies following the same rules as the rest of science indicate that people have psychic powers, there’s something wrong with the scientific method and the scientists’ understanding of it, because the notion that people have psychic powers is bullshit.
The idea that LW browsers think they are liquid-fuelled jets while the scientists who do the actual work of moving society forward are boring old coal trains worries me.
People here know that there is not some ineffable magic making science the right method in the laboratory and faith the right method in church, or science the right method in the laboratory and love the right method everywhere else, science the right method everywhere and always, etc., as would have been in accordance with people’s intuitions.
How unsurprising it is that actually understanding the benefits and drawbacks of science leads one to conclude that often science is not strict enough, and often too strict, and sometimes but rarely entirely inappropriate when used, and sometimes but rarely unused when it should be used, when heretofore everything was decided by boggling intuition.
Yes, and people who actually understand the tradeoffs in using formal scientific reasoning and its deviations from the laws of reasoning are the only people in position to intelligently determine that.
I’m not going to get into a status competition with you over who is in a position to determine what.
My view is that if you think anyone who has interacted with you in this thread has that view you have poor reading comprehension skills.
The most obvious interpretation of your statement that science is “an imperfect human construct designed to accommodate the more biased of scientists” and that “it’s a cost and a deviation from ideal thinking to minimize the influence of scientists who receive no training in debiasing” is that you think your LW expertise means that you wouldn’t need those safeguards. If I misinterpreted you I think it’s forgivable given your wording, but if I misinterpreted you then please help me out in understanding what you actually meant.
People on this site are well aware that if scientific studies following the same rules as the rest of science indicate that people have psychic powers, there’s something wrong with the scientific method and the scientists’ understanding of it, because the notion that people have psychic powers is bullshit.
My point was not that people here didn’t get that; I imagine they all do. My point is that the evidence on the table to support PUA theories is vulnerable to all the same problems as the evidence supporting claimed psychic powers, and that when it came to this slightly harder problem some people here seemed to think that the evidence on the table for PUA was actually evidence we would not expect to see in a world where PUA was placebo plus superstition.
I think the JREF community would take one sniff of PUA and say “Looks like a scam based on a placebo”, and that they would be better Bayesians when they did so than anyone who looks at the same evidence and says “Seems legit!”.
(I suspect that the truth is that PUA has a small non-placebo effect, since we live in a universe with ample evidence that advertising and salesmanship have small non-placebo effects that are statistically significant if you get a big enough sample size. However I also suspect that PUAs have no idea which bits of PUA are the efficacious bits and which are superstition, and that they could achieve the modest gains possible much faster if they knew which was which).
I’m not going to get into a status competition with you over who is in a position to determine what.
OK, I will phrase it in different terms that make it explicit that I am making several claims here (one about what Bayesianism can determine, and one about what science can determine). It’s much like I said above:
It’s adequately suited for the accumulation of not-false beliefs, but it both could be better instrumentally designed for humans and is not the bedrock of thinking by which anything works. The essential thing in the method you described is the part where “Scientists...have an informal sense of what P(A) is likely to be and are more inclined to question a conclusion if it is unlikely than if it is likely”. What abstraction describes the scientist’s thought process, the engine within the scientific method? I suggest it is Bayesian reasoning, but even if it is not, one thing it cannot be is more of the Scientific method, as that would lead to recursion. If it is not Bayesian reasoning, then there are some things I am wrong about, and Bayesianism is a failed complete explanation, and the Scientific method is half of a quite adequate method—but they are still different from each other.
Some people claim Bayesian reasoning models intelligent agents’ learning about their environments, and agents’ deviations from it is failure to learn optimally. This model encompasses choosing when to use something like the scientific method and deciding when it is optimal to label beliefs not as “X% likely to be true, 1-X% likely to be untrue,” but rather “Good enough to rely on by virtue of being satisfactorily likely to be true,” and “Not good enough to rely on by virtue of being satisfactorily likely to be true”. If Bayesianism is wrong, and it may be, it’s wrong.
The scientific method is a somewhat diverse set of particular labeling systems declaring ideas “Good enough to rely on by virtue of being satisfactorily likely to be true,” and “Not good enough to rely on by virtue of being satisfactorily likely to be true.” Not only is the scientific method incomplete by virtue of using a black-box reasoning method inside of it, it doesn’t even claim to be able to adjudicate between circumstances in which it is to be used and in which it is not to be used. It is necessarily incomplete. Scientists’ reliance on intuition to decide when to use it and when not to may well be better than using Bayesian reasoning, particularly if Bayesianism is false, I grant that. But the scientific method doesn’t, correct me if I am wrong, purport to be able to formally decide whether or not a person should subject his or her religious beliefs to it.
The most obvious interpretation of your statement that science is “an imperfect human construct designed to accommodate the more biased of scientists” and that “it’s a cost and a deviation from ideal thinking to minimize the influence of scientists who receive no training in debiasing” is that you think your LW expertise means that you wouldn’t need those safeguards.
I disagree but here is a good example of where Bayesians can apply heuristics that aren’t first-order applications of Bayes rule. The failure mode of the heuristic is also easier to see than where science is accused of being too strict (though that’s really only a part of the total claim, the other parts are that science isn’t strict enough, that it isn’t near Pareto optimal according to its own tradeoffs in which it sacrifices truth, and that it is unfortunately taken as magical by its practitioners).
In those circumstances in which the Bayesian objection to science is that it is too strict, science can reply by ignoring that money is the unit of caring and declare its ideological purity and willingness to always sacrifice resources for greater certainty (such as when the sacrifice is withholding FDA approval of a drug already approved in Europe), “Either way you’re spending resources, but spending resources in the cause of epistemological purity is okay with me. Spending resources on junk because you are not practising the correct purification rituals is not.”
Here, however, the heuristic is “reading charitably”, in which the dangers of excess are really, really obvious. Nonetheless, even if I am wrong about what the best interpretation is, the extra-Bayesian ritual of reading (more) charitably would have had you thinking it more likely than you did that I had meant something more reasonable (and even more so, responding as if I did). It is logically possible that you were reading charitably ideally and my wording was simply terrible. This is a good example of how one can use heuristics other than Bayes’ rule once one discovers one is a human and therefore subject to bias. One can weigh the costs and benefits of it just like each feature of scientific testing.
For “an imperfect human construct designed to accommodate the more biased of scientists”, it would hardly do to assume scientists are all equally biased, and likewise for assuming the construct is optimal no matter the extent of bias in scientists. So the present situation could be improved upon by matching the social restrictions to the bias of scientists and also by decreasing that bias. If mostly science isn’t strict enough, then perhaps it should be stricter in general (in many ways it should be), but the last thing to expect is that it is perfectly calibrated. It’s “imperfect”: I wouldn’t describe a rain dance as an “imperfect” method of getting rain; it would be an “entirely useless” method. Science is “imperfect”, and it does very well to the extent thinking is warped to accommodate the more biased of scientists, and so something slightly different would be more optimal for the less biased ones.
“...it’s a cost and a deviation from ideal thinking to minimize the influence of scientists who receive no training in debiasing,” and less cost would be called for if they received such training, but not zero. Also, it is important to know that costs are incurred, lest evangelical pastors everywhere be correct when they declare science a “faith”. Science is roughly designed to prevent false things from being called “true” at the expense of true things not being called “true”. This currently occurs to different degrees in different sciences, as it should; some of those areas should be stricter and some less strict, and in all cases people shouldn’t be misled about belief such that they think there is a qualitative difference between a rigorously established base rate and one not so established, or between science and predicting one’s child’s sickness when it vomits a certain color in the middle of the night.
My point is that the evidence on the table to support PUA theories is vulnerable to all the same problems as the evidence supporting claimed psychic powers
It’s not too similar, since psychic powers have been found in controlled scientific studies, and they are (less than infinitely, but nearly) certainly not real. PUA theories were formed from people’s observations; then people developed ideas they thought followed from the theories, and then tested those ideas insufficiently rigorously. Each such idea is barely more likely than the base rate to be correct, because of all the failure nodes, but each is more likely, the way barely enriched uranium’s particles are more likely to be U-235 than natural uranium’s are. This is in line with “However I also suspect that PUAs have no idea which bits of PUA are the efficacious bits and which are superstition, and that they could achieve the modest gains possible much faster if they knew which was which”.
When it comes to action, as in psychological experiments in which one is paid a fixed amount for correctly guessing whether something is red or blue, and one determines that 60% of the things are red, one should always guess red; one should act upon the ideas most likely to be true if one must act, all else equal.
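(The arithmetic behind always guessing red, assuming a toy payoff of 1 per correct guess: maximizing beats probability matching.)

```python
p_red = 0.6

# Always guessing the majority colour:
always_red = p_red                                         # expected score per guess: 0.60

# "Probability matching" (guessing red 60% of the time, blue 40%):
matching = p_red * p_red + (1 - p_red) * (1 - p_red)       # expected score per guess: 0.52

print(always_red, matching)
```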
Any chance of turning this (and some of your other comments) into a top-level post? (perhaps something like, “When You Can (And Can’t) Do Better Than Science”?)
I think the first section should ignore the philosophy of science and cover the science of science, the sociology of it, and concede the sharpshooter’s fallacy, assuming that whatever science does it is trying to do. The task of improving upon the method is then not too normative, since one can simply achieve the same results with fewer resources/better results with the same resources. Also, that way science can’t blame perceived deficiencies on the methods of philosophy, as it could were one to evaluate science according to philosophy’s methods and standards. This section would be the biggest added piece of value that isn’t tying together things already on this site.
A section should look for edges with only one labeled node in the scientific methods where science requires input from a mystery method, such as how scientists generate hypotheses or how scientific revolutions occur. These show the incompleteness of the scientific method as a means to acquire knowledge, even if it is perfect at what it does. Formalization and improvement of the mystery methods would contribute to the scientific method, even if nothing formal within the model changes.
A section should discuss how science isn’t a single method (according to just about everybody), but instead a family of similar methods varying especially among fields. This weakens any claim idealizing science in general, as at most one could claim that a particular field’s method is ideal for human thought and discovery. Assuming each (or most) fields’ methods are ideal (this is the least convenient possible world for the critic of the scientific method as practiced), the costs and benefits of using that method rather than a related scientific method can be speculated upon. I expect to find, as policy debates should not be one sided, that were a field to use other fields’ methods it would have advantages and disadvantages; the simple case is choice of stricter p-value modulating wrong things believed at the expense of true things not believed.
Sections should discuss abuses of statistics, one covering violations of the law (failing to actually test P(B|~A) and instead testing P((B + some random stuff - some other random stuff)|~A)) and another covering systemic failures such as publication bias and failure to publish replications. This would be a good place to introduce intra-scientific debates about such things to show both that science isn’t a monolithic outlook that can be supported and how one side in the civil war is aligned with Bayesian critiques. To the extent science is not settled on what the sociology of science is, that is a mark of weakness—it may be perfectly calibrated, but it isn’t too discriminatory here.
A concession I imagine pro-science people might make is to concede the weakness of soft science, such as sociology. Nonetheless, sociology’s scientific method is deeply related to hard sciences’, and its shortcomings somewhat implicate them. What’s more, if sociology is so weak, one wonders whence the pro-science person gets their strong pro-science view. One possibility is that they get it purely from philosophy of science, (a school of which) they wholly endorse, but if that is the case they don’t have an objection in kind to those who also predict science as is works decently but have severe criticisms of it and ideas on how to improve upon it, i.e. Bayesians.
I think it’s fair to contrast the scientific view of science with a philosophical view of Bayesianism to see if they are of the same scope. If science has no position on whether or not science is an approximation of Bayesian reasoning, and Bayesianism does, that is at least one question addressed by the one and not the other. It would be easy to invent a method that’s not useful for finding truth that has a broader scope than science, e.g. answering “yes” to every yes or no question unless it would contradict a previous response. This alone would show they are not synonymous.
A problem with the title “When You Can (And Can’t) Do Better Than Science” is that it is binary, but I really want three things explicitly expressed: 1) When you can do better than science by being stricter than science, 2) when you can do better than science by being more lenient than science, 3) when you can’t do better than science. The equivocation and slipperiness surrounding what it is reasonable to do is a significant part of the last category, e.g. one doesn’t drive where the Tappan Zee Bridge should have been built. The other part is near-perfect ways science operates now according to a reasonable use of “can’t”; I wouldn’t expect science to be absolutely and exactly perfect anywhere, any more than I can be absolutely sure with a probability of 1 that the Flying Spaghetti Monster doesn’t exist.
Second order Bayesianism deserves mention as the thing being advocated. A “good Bayesian” may use heuristics to counteract bias other than just Bayes’ rule, such as the principle of charity, or pretending things are magic to counteract the effort heuristic, or reciting a large number of variably sized numbers to counteract the anchoring effect, etc.
Is there a better analogy than the driving to the airport one for why Bayes’ Rule being part of the scientific toolbox doesn’t show the scientific toolbox isn’t a rough approximation of how to apply Bayes’ Rule? The other one I thought of is light’s exhibiting quantum behavior directly, it being a subset of all that is physical, but all that is physical actually embodying quantum behavior.
A significant confusion is discussing beliefs as if they weren’t probabilistic and actions in some domains as if they ought not be influenced by anything not in a category of true belief “scientifically established”. Bayesianism explains why this is a useful approximation of how one should actually act and thereby permits one to deviate from it without having to claim something like “science doesn’t work”.
Not necessarily to reopen anything, but some notes:
the placebo effect
I’m not sure it’s at all possible to debias against this.
The accepted scientific methodology is more like a safety rope or seat belt.
I agree that those are better metaphors than handcuffs all else equal, but those things would not prevent one from shooting one’s foot, and so it didn’t fit the broader metaphor.
A better analogy would be a law that no medical treatment can be received until a second opinion is obtained, or something like that.
My own view is that the sole difference between the two is that science commands you to suspend judgment until the null hypothesis is under p=0.05, at least for the purposes of what is allowed into the scientific canon as provisional fact, and Bayesians are more comfortable making bets with greater degrees of uncertainty.
His view is only slightly more strict, yet he arrives at some very different conclusions. For example, under your framework Rhine’s ESP experiments are scientific hypothesis tests, and under his they are illogical. I am not convinced by Polanyi, but it is far from clear to me how you could show he is wrong. If you know how to show he is wrong and could explain that in a couple paragraphs (or point me to such a document) I would be very interested in reading it.
Are you familiar with Michael Polanyi Personal Knowledge?
I’m not familiar with his work, unfortunately.
However a quote from one of the reviews concerns me. The reviewer says:
The author furnishes a thought provoking analysis that demonstrates the sufficiency (perhaps not the necessity) of a pseudo-kantian mindset that makes intelligibility possible. Reductionists, various materialists, physicalists, and sundry naturalists will recoil at the prospect that universal immutable immaterial concepts, forms, and laws are essential epistemic conditions for human experience.
If that’s Polanyi’s position it seems both kooky and not immediately relevant to the topic, so unless you can take a shot at explaining what you think Polanyi’s insights are that are relevant to the topic at hand I think we should drop this and take it up elsewhere or by other means if you want to talk about it further.
As I said, I’m less interested in “scientific” evidence than Bayesian evidence. The latter can be disappointingly orthogonal to the former, in that what’s generally good scientific evidence isn’t always good Bayesian evidence, and good Bayesian evidence isn’t always considered scientific.
What are some examples of good scientific evidence that isn’t good bayesian evidence?
What are some examples of good scientific evidence that isn’t good bayesian evidence?
Uh, how about all of parapsychology, aka “the control group for the scientific method”. ;-) Psi experiments can reach p .05 under conventional methods without being good Bayesian evidence, as we’ve seen recently with that “future priming” psi experiment.
(Note that I said “scientific” not Scientific. ;-) )
Ok, I wouldn’t have necessarily classed that as ‘good scientific evidence’ but it seems to be useful Bayesian evidence so we must be looking at it from different angles.
I'm aware Eliezer thinks there's a difference between scientific evidence and Bayesian evidence, but it's my view that this is because he has a slightly unsophisticated understanding of what science is. My own view is that the sole difference between the two is that science commands you to suspend judgment until the p-value against the null hypothesis is under 0.05, at least for the purposes of what is allowed into the scientific canon as provisional fact, and Bayesians are more comfortable making bets with greater degrees of uncertainty.
Regardless, if your goals are genuinely instrumental you very much want to figure out what parts of the effect are due to placebo effects and what parts are due to real effects, so you can maximise your beneficial outcomes with a minimum of effort. If PUA is effective to some extent but solely due to placebo effects then it only merits a tiny footnote in a rationalist approach to relationships. If it has effects beyond placebo effects then and only then is there something interesting for rationalists to look at.
There is a word for the problem that results from this way of thinking about instrumental advice. It’s called “akrasia”. ;-)
Again, if you could get people to do things without taking into consideration the various quirks and design flaws of the human brain (from our perspective), then self-help books would be little more than to-do lists.
In general, when I see somebody worrying about placebo effects in instrumental fields affected by motivation, I tend to assume that they are either:
1) Inhumanly successful and akrasia-free at all their chosen goals (not bloody likely),
2) Not actually interested in the goal being discussed, having already solved it to their satisfaction (a la skinny people accusing fat people of lacking willpower), or
3) Very interested in the goal, but not actually doing anything about it, and thus very much in need of a reason to discount their lack of action by pointing to the lack of "scientifically" validated advice as their excuse for why they're not doing that much.
Perhaps you can suggest a fourth alternative? ;-)
I’d prefer not to discuss this at the ad hominem level. You can assume for the sake of argument whichever of those three assumptions you prefer is correct, if it suits you. I’m indifferent to your choice—it makes no difference to my utility. I make no assumptions about why you hold the views you do.
My view is that the rationalist approach is to take it apart to see how it works, and then maybe afterwards put the bits that actually work back together with a dollop of motivating placebo effect on top.
The best way to approach research into helping overweight people lose weight is to study human biochemistry and motivation, and see what combinations of each work best. Not to leave the two areas thoroughly entangled and dismiss those interested in disentangling them as having the wrong motivations. I think the same goes for forming and maintaining romantic relationships.
Me either. I was asking you for a fourth alternative on the presumption that you might have one.
FWIW, I don’t consider any of those alternatives somehow bad, nor is my intention to use the classification to score some sort of points. People who fall into category 3 are of particular interest to me, however, because they’re people who can potentially be helped by understanding what it is they’re doing.
To put it another way, it wasn’t a rhetorical question, but one of information. If you fall in category 1 or 2, we have little further to discuss, but that’s okay. If you fall in category 3, I’d like to help you out of it. If you fall in an as-yet-to-be-seen category 4, then I get to learn something.
So, win, win, win, win, in all four cases.
This is conflating things a bit: my reference to weight loss was pointing out that “universal” weight-loss advice doesn’t really exist, so a rationalist seeking to lose weight must personally test alternatives, if he or she cannot afford to wait for science to figure out the One True Theory of Weight Loss.
This presupposes that you already have something that works, which you will not have unless you first test something. Even if you are only testing scientifically-validated principles, you must still find which are applicable to your individual situation and goals!
Heck, medical science uses different treatments for different kinds of cancer, and occasionally different treatments for the same kind of cancer, depending on the situation or the actual results on an individual - does this mean that medical science is irrational? If not, then pointing a finger at the variety of situation-specific PUA advice is just rhetoric, masquerading as reasoning.
I imagine you'd put me in category #2 as I'm currently in a happy long-term relationship. However my self-model says that three years ago, when I was single and looking for a partner, I would still have wanted to know what the actual facts about the universe were, so I'd put myself in category #4, the category of people for whom it's reflexive to ask what the suitably blinded, suitably controlled evidence says whether or not they personally have a problem at that point in their lives with achieving relevant goals.
I think we should worry about placebo effects everywhere they get in the way of finding out how the universe actually works, whether they happen to be in instrumental fields affected by motivation or somewhere else entirely.
That didn’t mean that I chose celibacy until the peer-reviewed literature could show me an optimised mate-finding strategy, of course, but it does mean that I don’t pretend that guesswork based on my experience is a substitute for proper science.
The difference between your PUA example and medicine is that medicine usually has relevant evidence for every single one of those medical decisions. (Evidence-based medicine has not yet driven the folklore out of the hospital by a long chalk but the remaining pockets of irrationality are a Very Bad Thing). Engineers use different materials for different jobs, and photographers use different lenses for different shots too. I don’t see how the fact that these people do situation-specific things gets you to the conclusion that because PUAs are doing situation-specific things too they must be right.
It doesn’t. It just refutes your earlier rhetorical conflation of PUA with alternative medicine on the same grounds.
At this point, I’m rather tired of you continually reframing my positions to stronger positions, which you can then show are fallacies.
I’m not saying you’re doing it on purpose (you could just be misunderstanding me, after all), but you’ve been doing it a lot, and it’s really lowering the signal-to-noise ratio. Also, you appear to disagree with some of LW’s premises about what “rationality” is. So, I don’t think continued discussion along these lines is likely to be very productive.
My intent was to show that in the absence of hard evidence PUA has the same epistemic claim on us as any other genre of folklore or folk-psychology, which is to say not much.
I admit I’m struggling to understand what your positions actually are, since you are asking me questions about my motivations and accusing me of “rhetoric, not reasoning” but not telling me what you believe to be true and why you believe it to be true. Or to put it another way, I don’t believe you have given me much actual signal to work with, and hence there is a very distinct limit to how much relevant signal I can send back to you.
Maybe we should reboot this conversation and start with you telling me what you believe about PUA and why you believe it?
Ok. I’ll hang in here for a bit, since you seem sincere.
Here’s one belief: PUA literature contains a fairly large number of useful, verifiable, observational predictions about the nonverbal aspects of interactions occurring between men and women while they are becoming acquainted and/or attracted.
Why do I believe this? Because their observational predictions match personal experiences I had prior to encountering the PUA literature. This suggests to me that when it comes to concrete behavioral observations, PUAs are reasonably well-calibrated.
For that reason, I view such PUA literature—where and only where it focuses on such concrete behavioral observations—as being relatively high quality sources of raw observational data.
In this, I find PUA literature to be actually better than the majority of general self-help and personal development material, as there is often nowhere near enough in the way of raw data or experiential-level observation in self-help books.
Of course, the limitation on my statements is the precise definition of “PUA literature”, as there’s definitely a selection effect going on. I tend to ignore PUA material that is excessively misogynistic on its face, simply because extracting the underlying raw data is too… tedious, let’s say. ;-) I also tend to ignore stuff that doesn’t seem to have any connection to concrete observations.
So, my definition of “PUA literature” is thus somewhat circular: I believe good stuff is good, having carefully selected which bits to label “good”. ;-)
Another aspect of my possible selection bias is that I don’t actually read PUA literature in order to do PUA!
I read PUA literature because of its relevance to topics such as confidence, fear, perceptions of self-worth, and other more common “self-help” topics that are of interest to me or to my customers. By comparison, PUA literature (again using my self-selected subset) contains much better raw data than traditional self-help books, because it comes from people who’ve relentlessly calibrated their observations against a harder goal than just, say, “feeling confident”.
The problem with this line of reasoning is that there are people who believe they have relentlessly calibrated their observations against reality using high quality sources of raw observational data and that as a result they have a system that lets them win at Roulette. (Barring high-tech means to track the ball’s vector or identifying an unbalanced wheel).
Roulette seems to be an apt comparison because based on the figures someone else quoted or linked to earlier about a celebrated PUAist hitting on 10 000 women and getting 300 of them into bed, the odds of a celebrated PUAist getting laid on a single approach even according to their own claims is not far off the odds of correctly predicting exactly which hole a Roulette ball will land in.
So when these people say “I tried a new approach where I flip flopped, be-bopped, body rocked, negged, nigged, nugged and nogged, then went for the Dutch Rudder and I believe this worked well” unless they tried this on a really large number of women so that they could detect changes in a base rate of 3% success I really don’t think they have any meaningful evidence. Did their success rate go up from 3% to 4% or what, and what are their error bars?
What’s the base rate for people not using PUA techniques anyway? People other than PUAs are presumably getting laid, so it’s got to be non-zero. The closer it is to 3% the less effect PUA techniques are likely to have.
I’ve already heard the response “Look, we don’t get just one bit of data as feedback. We PUAs get all sorts of nuanced feedback about what works and does not”. If that’s so and this feedback is doing some good this should be reflected in your hit rate for getting laid. If picking up women and getting them in to bed is an unfair metric for PUA effectiveness I really think it should be called something other than PUA.
My thinking is that you don’t have enough data to distinguish whether you are in a world where PUA training has a measurable effect, from a world where PUA have an unfalsifiable mythology that allows them to explain their hits and misses to themselves, and a collection of superstitions about what works and does not, but no actual knowledge that separates them in terms of success rate from those who simply scrub up, dress up and ask a bunch of women out.
I want to see that null hypothesis satisfactorily falsified before I allow that there is an elephant in the room.
Once again, you are misstating my claims.
Notice that nowhere in my post did I say pickup artists get laid, let alone that they get laid more often!
Nowhere did I state anything about their predictions of what behavior works to get laid!
I even explicitly pointed out that the information I'm most interested in obtaining from PUA literature has nothing to do with getting laid!
So just by talking about the subject of getting laid, you demonstrate a complete failure to address what I actually wrote, vs. what you appear to have imagined I wrote.
So, please re-read what I actually wrote and respond only to what I actually wrote, if you’d like me to continue to engage in this discussion.
Okay. What observable outcomes do you think you can obtain at better-than-base-rate frequencies employing these supposed insights, and why do you think you can obtain them?
As I said earlier I think that if PUA insights cannot be cashed out in a demonstrable improvement in the one statistic which you would think would matter most to them, rate of getting laid, then there is grounds to question whether these supposed insights are of any use to anyone.
But if you would prefer to use some other metric I’m willing to look at the evidence.
Guesswork based on your experience isn’t supposed to be a substitute for science. It’s the part of science that you do when choosing which phenomena you want to test, well before you get to the blinding and peer review.
The flip side is that proper science isn’t a substitute for either instrumental rationality or epistemic rationality. Limiting your understanding of the world entirely to what is already published in journals gives you a model of the world that is subjectively objectively wrong.
I don’t disagree but a potentially interesting research area isn’t an elephant in the room that demands attention in a literature review, and limiting yourself to proper science is no sin in a literature review either. Only when the lessons we can learn from proper science are exhausted should we start casting about in the folklore for interesting research areas, and we certainly shouldn’t put much weight on anecdotes from this folklore. In Bayesian terms such anecdotes should shift our prior probability very, very slightly if at all.
No ad hominem fallacy present in grandparent.
Why don’t you first describe one, then the other, then contrast them? Then, describe Eliezer’s view and contrast that with your position.
I’ll try to do it briefly, but it will be a bit tight. Let’s see how we go.
Bayes’ Theorem is part of the scientific toolbox. Pick up a first year statistics textbook and it will be in there, although not always under that name (look for “conditional probability” or similar constructs). Most of scientific methodology is about ensuring that you do your Bayesian updating right, by correctly establishing the base rate and the probability of your observations given the null hypothesis. (Scientists don’t state their P(A), but they certainly have an informal sense of what P(A) is likely to be and are more inclined to question a conclusion if it is unlikely than if it is likely).
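As a minimal sketch of the updating described here (my own illustration; all numbers are invented), the hypothetical posterior() helper below just applies Bayes' theorem, and the second call shows how much the answer depends on the base-rate term P(B|~A):

```python
def posterior(prior_a, p_b_given_a, p_b_given_not_a):
    """P(A|B) by Bayes' theorem: P(B|A)P(A) / P(B)."""
    p_b = p_b_given_a * prior_a + p_b_given_not_a * (1 - prior_a)
    return p_b_given_a * prior_a / p_b

# An observation that is rare under the null moves a 50/50 prior a long way...
print(posterior(0.5, 0.9, 0.05))   # ~0.95
# ...but the same observation with a sloppily estimated base rate moves it far less.
print(posterior(0.5, 0.9, 0.5))    # ~0.64
```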
If you’re doing Bayes right it’s the same as doing science, but I think some of the LW groupthink holds that you can do a valid Bayesian update in the absence of a rigorously established base rate, and so they think this is a difference between being a good Bayesian and being a good scientist. I think they are just being bad Bayesians since updating is no better than guesswork in the absence of a rigorously obtained P(B).
Eliezer (based on The Dilemma: Science or Bayes?) doesn't quite carve up science-culture from ideal-science-methodology the way I do, and infers that there is something wrong with Science because the culture doesn't care about revising instrumentally-indistinguishable models to make them more Eliezer-intuitive. I think this has more to do with trying to win a status war with Science than with any differences in predicted observations that matter.
That doesn't mean it doesn't underlie the entire structure. As an analogy, to get from New York to Miami, one must generally go south. But instructions on how to get there will be a hodgepodge of: walk north out of the building, west to the car, drive due east, then turn south... the plane takes off headed east... and turns south... etc. Showing that going south is one of several ways to turn while walking doesn't mean it's conceptually no different from north for getting from New York to Miami. Similarly:
If one is paid to do plumbing, then there is no difference between being a good plumber and a “good Bayesian”, and in that sense there is no difference between being a “good Bayesian” and a “good scientist”.
In the sense in which it is intended, there is a difference between being a “good Bayesian” and a “good scientist”. To continue the analogy, if one must go from Ramsey to JFK airport across the Tappan Zee Bridge, one’s route will be on a convoluted path to a bridge that’s in a monstrously inconvenient location. It was built there—at great additional expense as that is where the river is widest—to be just outside of the NY/NJ Port Authority’s jurisdiction. The best route from Ramsey to Miami may be that way, but that accommodates human failings, and is not the direct route. Likewise for every movement that is made in a direction not as the crow flies. Bayesian laws are the standard by which the crow flies, against which it makes sense to compare the inferior standards that better suit our personal and organizational deficiencies.
Well, yes and no. It's adequately suited for the accumulation of not-false beliefs, but it both could be better instrumentally designed for humans and is not the bedrock of thinking by which anything works. The essential thing in the method you described is that "Scientists...have an informal sense of what P(A) is likely to be and are more inclined to question a conclusion if it is unlikely than if it is likely". What abstraction describes the scientist's thought process, the engine within the scientific method? I suggest it is Bayesian reasoning, but even if it is not, one thing it cannot be is more of the Scientific method, as that would lead to recursion. If it is not Bayesian reasoning, then there are some things I am wrong about, and Bayesianism is a failed complete explanation, and the Scientific method is half of a quite adequate method—but they are still different from each other.
By Bayes' Rule, P(A|B) falls as P(B|~A) rises, so the direction is right—that's why we can make planes that don't fall out of the sky. But just using P(B|~A) isn't what's done, because scientists interject their subjective expectations here and pretend they do not. P(B|~A) doesn't contain whether or not a researcher would have published something had she found a two-tailed rather than a one-tailed test—a complaint about a paper I read just a few hours ago. What goes into p-values necessarily involves the arbitrary classes the scientist has decided evidence would fit in, and then measures his or her surprise at the class of evidence that is found. That's not P(B|~A), it's P(C|~A).
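A small illustration of the P(C|~A) point (my own sketch; the numbers are arbitrary): the same observed statistic earns a different p-value depending on which class of outcomes the researcher decided to count as "at least as extreme".

```python
import math

def upper_tail(z):
    """P(Z >= z) for a standard normal."""
    return 0.5 * math.erfc(z / math.sqrt(2))

z_observed = 1.8                                  # hypothetical test statistic
p_one_tailed = upper_tail(z_observed)             # ~0.036: passes p < .05
p_two_tailed = 2 * upper_tail(abs(z_observed))    # ~0.072: does not pass
print(p_one_tailed, p_two_tailed)
# Same data, different verdicts: the p-value reports the probability of a
# researcher-chosen class of outcomes under the null, not of the observation itself.
```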
Do you have examples of boundary cases that distinguish a rigorously established base rate from one that isn't?
If one believes in qualitatively different beliefs, the rigorous and the non-rigorous, one falls into paradoxes such as the lottery paradox: it seems rational to believe of each individual ticket that it will lose, and yet also to believe that some ticket will win. It's important to establish the actual nature of knowledge as probabilistic, and not be tricked into thinking science is a separate magisterium, non-overlapping with other things.
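For concreteness, here is the lottery paradox in a few lines (a sketch with an assumed million-ticket lottery that is guaranteed to have a winner, and an assumed belief threshold):

```python
n_tickets = 1_000_000
threshold = 0.99                              # suppose "belief" means p > 0.99

p_this_ticket_loses = 1 - 1 / n_tickets       # 0.999999 for every single ticket
believe_each_ticket_loses = p_this_ticket_loses > threshold  # True, a million times over
p_every_ticket_loses = 0.0                    # yet some ticket certainly wins
print(believe_each_ticket_loses, p_every_ticket_loses)
# A qualitative "rigorous belief" category endorses an inconsistent set of claims;
# probabilistic beliefs don't run into this.
```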
With such actually correct understanding of how beliefs should work, we can think about improving our thinking rather than eternally and in vain trying to smooth out a ripple in a rug that has a table on each of its corners, hoping our mistaken view of the world has few harmful implications like “Jesus Christ is God’s only son” and not “life begins at conception”.
Or, we could not act on our most coherent world-views, only acting according to whatever fragment of thought our non-coherent attention presents to us. Not appealing.
Thank you for saying my point better than I was able to.
I don’t think scientists think about it much. That’s more the sort of thing philosophers of science think about. The smarter scientists do what is essentially Bayesian updating, although very few of them would actually put a number on their prior and calculate their posterior based on a surprising p value. They just know that it takes a lot of very good evidence to overturn a well-established theory, and not so much evidence to establish a new claim consistent with the existing scientific knowledge.
Stating your hypothesis beforehand and specifying exactly what will and will not count as evidence before you collect your data is a very good way of minimising the effect of your own biases, but naughty scientists can and do take the opportunity to cook the experiment by strategically choosing what will count as evidence. Still, overall it's better than letting scientists pore over the entrails of their experimental results and make up a hypothesis after the fact. If a great new hypothesis comes out of the data then you have to do your legwork and do a whole new experiment to test the new hypothesis, and that's how it should be. If the effect is real it will keep. The universe won't change on you.
It’s not a binary distinction. Rather, if you’re unaware of the ways that people’s P(B) estimates can be wildly inaccurate and think that your naive P(B) estimates are likely to be accurate then you can update into all sorts of stupid and factually false beliefs even if you’re an otherwise perfect Bayesian.
The people who think that John Edward can talk to dead people might well be perfect Bayesians who just haven’t checked to see what the probability is that John Edward could produce the effects he produces in a world where he can’t talk to dead people. If you think the things he does are improbable then it’s technically correct to update to a greater belief in the hypothesis that he can channel dead people. It’s only if you know that his results are exactly what you’d expect in a world where he’s a fake that you can do the correct thing, which is not update your prior belief that the probability that he’s a fake is 99.99...9%.
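A sketch of that John Edward calculation (numbers invented for illustration): the same performance, judged with two different estimates of how likely it is in a world where he is a fake.

```python
def posterior(prior_real, p_hits_if_real, p_hits_if_fake):
    """P(he's real | his hits), by Bayes' theorem."""
    p_hits = p_hits_if_real * prior_real + p_hits_if_fake * (1 - prior_real)
    return p_hits_if_real * prior_real / p_hits

prior_real = 1e-6   # even granting a generous prior that he talks to the dead

# Naive viewer: "a fake could never hit this often" (badly wrong P(B|~A)).
print(posterior(prior_real, 0.9, 0.001))   # ~9e-4: belief multiplied ~900-fold
# Informed viewer: cold reading produces exactly these hits (P(B|~A) ~ P(B|A)).
print(posterior(prior_real, 0.9, 0.9))     # ~1e-6: no update at all
```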
If someone's done some actual work to see if they can falsify the null hypothesis that PUA techniques are indistinguishable from a change of clothes, a comb, a shower and asking some women out I'd be interested in seeing it. In the absence of such work I think good Bayesians have to recognise that they don't have a P(B) with small enough error bars to be very useful.
Exactly, it’s a cost and a deviation from ideal thinking to minimize the influence of scientists who receive no training in debiasing. So not “If you’re doing Bayes right it’s the same as doing science”, where “science” is an imperfect human construct designed to accommodate the more biased of scientists.
These are costs. It’s important, and in some contexts cheap, to know why and how things work instead of saying “I’ll ignore that since enough replication always solves such problems,” when one doesn’t know in which cases one is doing nearly pointless extra work and in which one isn’t doing enough replication. It’s an obviously sub-optimal solution along the lines of “thinking isn’t important; assume infinite resources.”
It’s praise through faint damnation of the laws of logic that they don’t prevent one from shooting one’s own foot off. Handcuffs are even better at that task, but they are less useful for figuring out what is true.
Exactly, so in “some of the LW groupthink holds that you can do a valid Bayesian update in the absence of a rigorously established base rate,” they are right, and “updating is no better than guesswork in the absence of a rigorously obtained P(B),” is not always true, such as when the following condition doesn’t apply, and it doesn’t here:
What do you think this site is for? People are reading and sharing research papers about biases in their free time. One could likewise criticize jet fuel for being inappropriate for an old-fashioned coal-powered locomotive. Yes, jet fuel will explode a train...this is not a flaw of jet fuel, and it does not mean that the coal train is better at transporting things.
That’s not the claim in question.
In any case, there are better ways to think about this subject than with null hypotheses. Those are social constructs focusing (decently) on optimizing for the prevention of belief in untrue things, rather than on determining what's most likely true; here false beliefs have relatively less cost than in most of science, and will in any case only be held probabilistically.
There’s a very good reason why we do double-blind, placebo-controlled trials rather than just recruiting a bunch of people who browse LW to do experiments with, on the basis that since LWers are “trained in debiasing” they are immune to wishful thinking, confirmation bias, the experimenter effect, the placebo effect and so on.
I have a great deal more faith in methodological constructs that make it impossible for bias to have an effect than in people’s claims to “debiased” status.
Don’t get me wrong, I think that training in avoiding cognitive biases is very important because there are lots of important things we do where we don’t have the luxury of specifying our hypotheses in strictly instrumental terms beforehand, collecting data via suitably blinded proxies and analysing it just in terms of our initial hypothesis.
However my view is that if you think that scientific methodology is just a set of training wheels for people who haven’t clicked on all the sequences yet and that browsing LW makes you immune to the problems that scientific methodology exists specifically to prevent then it’s highly likely you overestimate your resistance to bias.
There’s also a cost to acting on the assumption that every correlation is meaningful in a world where we have so much data available to us that we can find arbitrarily large numbers of spurious correlations at P<0.01 if we try hard enough. Either way you’re spending resources, but spending resources in the cause of epistemological purity is okay with me. Spending resources on junk because you are not practising the correct purification rituals is not.
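A quick simulation of that point (my own sketch, not the poster's): a thousand studies of a null effect, scored with an exact binomial test, still yield a handful of "significant" results at p < 0.01.

```python
import math, random

def two_sided_p(heads, n=100):
    """Exact two-sided binomial p-value against a fair coin (doubled upper tail)."""
    k = max(heads, n - heads)
    upper_tail = sum(math.comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * upper_tail)

random.seed(0)
false_positives = 0
for _ in range(1000):                                       # 1000 studies, null true in all
    heads = sum(random.random() < 0.5 for _ in range(100))  # 100 fair coin flips each
    if two_sided_p(heads) < 0.01:
        false_positives += 1
print(false_positives)   # a few studies "detect" an effect that does not exist
```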
The accepted scientific methodology is more like a safety rope or seat belt. Sometimes annoying, almost always rational.
Rather than what a site is for I focus on what a site is.
In many, many ways this site has higher quality discourse than, say, the JREF forums and a population who on average are better versed in cognitive biases. However this discussion has made it obvious to me that on average the JREF forumites are far more aware than the LWers of the various ways that people’s estimates of P(B) can be wrong and can be manipulated.
They would never put it in those terms since Bayes is a closed book to them, but they are very well aware that you can work yourself into completely wrong positions if you aren’t sophisticated enough to correctly estimate the actual base rate at which one would expect to observe things like homeopathy apparently working, people apparently talking to the dead, people apparently having psychic powers, NLP apparently letting you seduce people and so on in worlds where none of these things did anything except act as placebos (at best).
If your P(B) is off then using Bayes' Theorem is just being a mathematically precise idiot instead of an imprecise idiot. You'll get to exactly the right degree of misguided belief, based on the degree to which you're mistaken about the correct value of P(B), but that's still far worse than being someone who wouldn't know Bayes from a bar of soap but who intuitively perceives something closer to the correct P(B).
The idea that LW browsers think they are liquid-fuelled jets while the scientists who do the actual work of moving society forward are boring old coal trains worries me. I think of LW’s “researchers” as a bunch of enthusiastic amateurs with cheap compasses and hand-drawn maps running around in the bushes in a mildly organised fashion, while scientists are painstakingly and one inch at a time building a gigantic sixteen-lane highway for us all to drive down.
Yes, and people who actually understand the tradeoffs in using formal scientific reasoning and its deviations from the laws of reasoning are the only people in position to intelligently determine that. Those who say “always use the scientific method for important things” or, though I don’t know that there ever has been or ever will be such a person, “always recruit a bunch of people who browse LW,” are not thinking any more than a broken clock is ticking. As an analogy, coal trains are superior to jet planes for transporting millions of bushels of wheat from Alberta to Toronto. It would be inane and disingenuous for broken records always calling for the use of coal trains to either proclaim their greater efficiency in determining which vehicle to use to transport things because they got the wheat case right or pretend that they have a monopoly on calling for the use of trains.
With reasoning, one can intelligently determine a situation's particulars and spend to eliminate a bias (for example by making a study double-blind) rather than doing that all the time or relying on skill in this case, and without relying on intuition to determine when. One can see that in an area the costs of thinking something true when it isn't exceed the costs of thinking it's false when it's true, and set up correspondingly strict protocols, rather than blindly always paying in true things not believed, time, and money for the same, sometimes inadequate and sometimes excessive, amount of skepticism.
My view is that if you think anyone who has interacted with you in this thread has that view you have poor reading comprehension skills.
So one can simply...not do that. And be a perfectly good Bayesian.
It is not the case that every expenditure reducing the likelihood that something is wrong is optimal, as one could instead spend a bit on determining which areas ought to have extra expenditure reducing the likelihood that something is wrong there.
In any case, science has enshrined a particular few levels of spending on junk that it declares perfectly fine because the “correct” purification rituals have been done. I do not think that such spending on junk is justified because in those cases no, science is not strict enough. One can declare a set of arbitrary standards and declare spending according to them correct and ideologically pure or similar, but as one is spending fungible resources towards research goals this is spurious morality.
Amazing, let me try one. If a Bayesian reasoner is hit by a meteor and put into a coma, he is worse off than a non-Bayesian who stayed indoors playing Xbox games and was not hit by a meteor. So we see that Bayesian reasoning is not sufficient to confer immortality and transcendence into a godlike being made of pure energy.
People on this site are well aware that if scientific studies following the same rules as the rest of science indicate that people have psychic powers, there’s something wrong with the scientific method and the scientists’ understanding of it because the notion that people have psychic powers are bullshit.
People here know that there is not some ineffable magic making science the right method in the laboratory and faith the right method in church, or science the right method in the laboratory and love the right method everywhere else, science the right method everywhere and always, etc., as would have been in accordance with people’s intuitions.
How unsurprising it is that actually understanding the benefits and drawbacks of science leads one to conclude that often science is not strict enough, and often too strict, and sometimes but rarely entirely inappropriate when used, and sometimes but rarely unused when it should be used, when heretofore everything was decided by boggling intuition.
Grammar nitpick: should be “is bullshit,” referring to the singular “notion.”
I’m not going to get into a status competition with you over who is in a position to determine what.
The most obvious interpretation of your statement that science is “an imperfect human construct designed to accommodate the more biased of scientists” and that “it’s a cost and a deviation from ideal thinking to minimize the influence of scientists who receive no training in debiasing” is that you think your LW expertise means that you wouldn’t need those safeguards. If I misinterpreted you I think it’s forgivable given your wording, but if I misinterpreted you then please help me out in understanding what you actually meant.
I’m responding under the assumption that the second “scientific” should read “psychic”. My point was not that people here didn’t get that—I imagine they all do. My point is that the evidence on the table to support PUA theories is vulnerable to all the same problems as the evidence supporting claimed psychic powers, and that when it came to this slightly harder problem some people here seemed to think that the evidence on the table for PUA was actually evidence we would not expect to see in a world where PUA was placebo plus superstition.
I think the JREF community would take one sniff of PUA and say “Looks like a scam based on a placebo”, and that they would be better Bayesians when they did so than anyone who looks at the same evidence and says “Seems legit!”.
(I suspect that the truth is that PUA has a small non-placebo effect, since we live in a universe with ample evidence that advertising and salesmanship have small non-placebo effects that are statistically significant if you get a big enough sample size. However I also suspect that PUAs have no idea which bits of PUA are the efficacious bits and which are superstition, and that they could achieve the modest gains possible much faster if they knew which was which).
OK, I will phrase it in different terms that make it explicit that I am making several claims here (one about what Bayesianism can determine, and one about what science can determine). It’s much like I said above:
Some people claim Bayesian reasoning models intelligent agents’ learning about their environments, and agents’ deviations from it is failure to learn optimally. This model encompasses choosing when to use something like the scientific method and deciding when it is optimal to label beliefs not as “X% likely to be true, 1-X% likely to be untrue,” but rather “Good enough to rely on by virtue of being satisfactorily likely to be true,” and “Not good enough to rely on by virtue of being satisfactorily likely to be true”. If Bayesianism is wrong, and it may be, it’s wrong.
The scientific method is a somewhat diverse set of particular labeling systems declaring ideas “Good enough to rely on by virtue of being satisfactorily likely to be true,” and “Not good enough to rely on by virtue of being satisfactorily likely to be true.” Not only is the scientific method incomplete by virtue of using a black-box reasoning method inside of it, it doesn’t even claim to be able to adjudicate between circumstances in which it is to be used and in which it is not to be used. It is necessarily incomplete. Scientists’ reliance on intuition to decide when to use it and when not to may well be better than using Bayesian reasoning, particularly if Bayesianism is false, I grant that. But the scientific method doesn’t, correct me if I am wrong, purport to be able to formally decide whether or not a person should subject his or her religious beliefs to it.
I disagree but here is a good example of where Bayesians can apply heuristics that aren’t first-order applications of Bayes rule. The failure mode of the heuristic is also easier to see than where science is accused of being too strict (though that’s really only a part of the total claim, the other parts are that science isn’t strict enough, that it isn’t near Pareto optimal according to its own tradeoffs in which it sacrifices truth, and that it is unfortunately taken as magical by its practitioners).
In those circumstances in which the Bayesian objection to science is that it is too strict, science can reply by ignoring that money is the unit of caring and declaring its ideological purity and willingness to always sacrifice resources for greater certainty (such as when the sacrifice is withholding FDA approval of a drug already approved in Europe): "Either way you're spending resources, but spending resources in the cause of epistemological purity is okay with me. Spending resources on junk because you are not practising the correct purification rituals is not."
Here, however, the heuristic is “reading charitably”, in which the dangers of excess are really, really obvious. Nonetheless, even if I am wrong about what the best interpretation is, the extra-Bayesian ritual of reading (more) charitably would have had you thinking it more likely than you did that I had meant something more reasonable (and even more so, responding as if I did). It is logically possible that you were reading charitably ideally and my wording was simply terrible. This is a good example of how one can use heuristics other than Bayes’ rule once one discovers one is a human and therefore subject to bias. One can weigh the costs and benefits of it just like each feature of scientific testing.
For "an imperfect human construct designed to accommodate the more biased of scientists", it would hardly do to assume scientists are all equally biased, and likewise for assuming the construct is optimal no matter the extent of bias in scientists. So the present situation could be improved upon by matching the social restrictions to the bias of scientists and also decreasing that bias. If mostly science isn't strict enough, then perhaps it should be stricter in general (in many ways it should be), but the last thing to expect is that it is perfectly calibrated. It's "imperfect": I wouldn't describe a rain dance as an "imperfect" method to get rain; it would be an "entirely useless" method. Science is "imperfect", and it does very well to the extent thinking is warped to accommodate the more biased of scientists, and so something slightly different would be more optimal for the less biased ones.
"...it's a cost and a deviation from ideal thinking to minimize the influence of scientists who receive no training in debiasing," and less cost would be called for if they received such training, but not zero. Also, it is important to know that costs are incurred, lest evangelical pastors everywhere be correct when they declare science a "faith". Science is roughly designed to prevent false things from being called "true" at the expense of true things not being called "true". This currently occurs to different degrees in different sciences, and it should, and some of those areas should be stricter, and some should be less strict, and in all cases people shouldn't be misled about belief such that they think there is a qualitative difference between a rigorously established base rate and one not so established, or between science and predicting one's child's sickness when it vomits a certain color in the middle of the night.
It's not too similar, since psychic powers have been found in controlled scientific studies, and they are (less than infinitely, but nearly) certainly not real. PUA theories were formed from people's observations, then people developed ideas they thought were based on the theories, then tested what they thought were the ideas, but tested them insufficiently rigorously. Each such idea is barely more likely than the base rate for being correct due to all the failure nodes, but each is more likely, the way barely enriched uranium's particles are more likely to be U-235 than natural uranium's are. This is in line with "However I also suspect that PUAs have no idea which bits of PUA are the efficacious bits and which are superstition, and that they could achieve the modest gains possible much faster if they knew which was which".
When it comes to action, as in psychological experiments in which one is paid a fixed amount for each correct guess of whether something is red or blue, and one determines that 60% of the things are red, one should always guess red; one should act upon the ideas most likely to be true if one must act, all else equal.
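The arithmetic of that guessing example, for concreteness:

```python
p_red = 0.6
always_guess_red = p_red                                   # 0.60 expected hit rate
probability_matching = p_red ** 2 + (1 - p_red) ** 2       # 0.52 expected hit rate
print(always_guess_red, probability_matching)
# Acting on the single most probable belief beats splitting your actions
# in proportion to your uncertainty, all else equal.
```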
Any chance of turning this (and some of your other comments) into a top-level post? (perhaps something like, “When You Can (And Can’t) Do Better Than Science”?)
Yes.
I think the first section should ignore the philosophy of science and cover the science of science, the sociology of it, and concede the sharpshooter’s fallacy, assuming that whatever science does it is trying to do. The task of improving upon the method is then not too normative, since one can simply achieve the same results with fewer resources/better results with the same resources. Also, that way science can’t blame perceived deficiencies on the methods of philosophy, as it could were one to evaluate science according to philosophy’s methods and standards. This section would be the biggest added piece of value that isn’t tying together things already on this site.
A section should look for edges with only one labeled node in the scientific methods where science requires input from a mystery method, such as how scientists generate hypotheses or how scientific revolutions occur. These show the incompleteness of the scientific method as a means to acquire knowledge, even if it is perfect at what it does. Formalization and improvement of the mystery methods would contribute to the scientific method, even if nothing formal within the model changes.
A section should discuss how science isn't a single method (according to just about everybody), but instead a family of similar methods varying especially among fields. This weakens any claim idealizing science in general, as at most one could claim that a particular field's method is ideal for human thought and discovery. Assuming each (or most) fields' methods are ideal (this is the least convenient possible world for the critic of the scientific method as practiced), the costs and benefits of using that method rather than a related scientific method can be speculated upon. I expect to find, as policy debates should not be one-sided, that were a field to use other fields' methods it would have advantages and disadvantages; the simple case is the choice of a stricter p-value, which modulates wrong things believed at the expense of true things not believed (a small sketch of this tradeoff appears at the end of this outline).
Sections should discuss abuses of statistics, one covering violations of the law (failing to actually test P(B|~A) and instead testing P((B + (some random stuff) - (some other random stuff)) | ~A)), and another covering systemic failures such as publication bias and failure to publish replications. This would be a good place to introduce intra-scientific debates about such things, to show both that science isn't a monolithic outlook that can be supported and how one side in the civil war is aligned with Bayesian critiques. To the extent science is not settled on what the sociology of science is, that is a mark of weakness—it may be perfectly calibrated, but it isn't very discriminating here.
A concession I imagine pro-science people might make is to concede the weakness of soft sciences such as sociology. Nonetheless, sociology's scientific method is deeply related to the hard sciences', and its shortcomings somewhat implicate them. What's more, if sociology is so weak, one wonders whence the pro-science person gets their strong pro-science view. One possibility is that they get it purely from the philosophy of science (a school of which they wholly endorse), but if that is the case they don't have an objection in kind to those who also predict that science as it is works decently but have severe criticisms of it and ideas on how to improve upon it, i.e. Bayesians.
I think it’s fair to contrast the scientific view of science with a philosophical view of Bayesianism to see if they are of the same scope. If science has no position on whether or not science is an approximation of Bayesian reasoning, and Bayesianism does, that is at least one question addressed by the one and not the other. It would be easy to invent a method that’s not useful for finding truth that has a broader scope than science, e.g. answering “yes” to every yes or no question unless it would contradict a previous response. This alone would show they are not synonymous.
A problem with the title “When You Can (And Can’t) Do Better Than Science” is that it is binary, but I really want three things explicitly expressed: 1) When you can do better than science by being stricter than science, 2) when you can do better than science by being more lenient than science, 3) when you can’t do better than science. The equivocation and slipperiness surrounding what it is reasonable to do is a significant part of the last category, e.g. one doesn’t drive where the Tappan Zee Bridge should have been built. The other part is near-perfect ways science operates now according to a reasonable use of “can’t”; I wouldn’t expect science to be absolutely and exactly perfect anywhere, any more than I can be absolutely sure with a probability of 1 that the Flying Spaghetti Monster doesn’t exist.
Second order Bayesianism deserves mention as the thing being advocated. A “good Bayesian” may use heuristics to counteract bias other than just Bayes’ rule, such as the principle of charity, or pretending things are magic to counteract the effort heuristic, or reciting a large number of variably sized numbers to counteract the anchoring effect, etc.
Is there a better analogy than the driving to the airport one for why Bayes’ Rule being part of the scientific toolbox doesn’t show the scientific toolbox isn’t a rough approximation of how to apply Bayes’ Rule? The other one I thought of is light’s exhibiting quantum behavior directly, it being a subset of all that is physical, but all that is physical actually embodying quantum behavior.
A significant confusion is discussing beliefs as if they weren’t probabilistic and actions in some domains as if they ought not be influenced by anything not in a category of true belief “scientifically established”. Bayesianism explains why this is a useful approximation of how one should actually act and thereby permits one to deviate from it without having to claim something like “science doesn’t work”.
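As a small sketch of the stricter-p-value tradeoff mentioned above (my own illustration; the effect size and the one-sided z-test are assumptions):

```python
from statistics import NormalDist

norm = NormalDist()
true_effect_z = 2.5   # assume real effects produce a test statistic of z ~ 2.5 on average

for alpha in (0.05, 0.01, 0.001):
    z_crit = norm.inv_cdf(1 - alpha)                 # one-sided significance cutoff
    power = 1 - norm.cdf(z_crit - true_effect_z)     # chance a real effect clears it
    print(f"alpha={alpha}: false-positive rate {alpha:.3f}, power ~{power:.2f}")
# Stricter thresholds buy fewer wrong things believed at the cost of more true things not believed.
```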
Thoughts?
Not necessarily to reopen anything, but some notes:
the placebo effect
I'm not sure it's at all possible to debias against this.
The accepted scientific methodology is more like a safety rope or seat belt.
I agree that those are better metaphors than handcuffs, all else equal, but those things would not prevent one from shooting one's foot, and so they didn't fit the broader metaphor.
A better analogy would be a law that no medical treatment can be received until a second opinion is obtained, or something like that.
Are you familiar with Michael Polanyi's Personal Knowledge?
His view is only slightly stricter than yours, yet he arrives at some very different conclusions. For example, under your framework Rhine's ESP experiments are scientific hypothesis tests, while under his they are illogical. I am not convinced by Polanyi, but it is far from clear to me how you could show he is wrong. If you know how to show he is wrong and could explain that in a couple of paragraphs (or point me to such a document) I would be very interested in reading it.
I’m not familiar with his work, unfortunately.
However a quote from one of the reviews concerns me. The reviewer says:
The author furnishes a thought provoking analysis that demonstrates the sufficiency (perhaps not the necessity) of a pseudo-kantian mindset that makes intelligibility possible. Reductionists, various materialists, physicalists, and sundry naturalists will recoil at the prospect that universal immutable immaterial concepts, forms, and laws are essential epistemic conditions for human experience.
If that’s Polanyi’s position it seems both kooky and not immediately relevant to the topic, so unless you can take a shot at explaining what you think Polanyi’s insights are that are relevant to the topic at hand I think we should drop this and take it up elsewhere or by other means if you want to talk about it further.
What are some examples of good scientific evidence that isn’t good bayesian evidence?
Uh, how about all of parapsychology, aka "the control group for the scientific method". ;-) Psi experiments can reach p < .05 under conventional methods without being good Bayesian evidence, as we've seen recently with that "future priming" psi experiment.
(Note that I said “scientific” not Scientific. ;-) )
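To make the parapsychology point concrete (a sketch; the prior and the likelihood ratio are deliberately generous assumptions): a result at p < .05 corresponds to only a modest likelihood ratio, which barely dents a suitably tiny prior.

```python
prior_psi = 1e-20                        # assumed prior probability that psi is real
likelihood_ratio = 20                    # generous: data taken as 20x likelier if psi is real

prior_odds = prior_psi / (1 - prior_psi)
posterior_odds = prior_odds * likelihood_ratio
posterior_psi = posterior_odds / (1 + posterior_odds)
print(posterior_psi)                     # ~2e-19: "significant", yet still negligible
```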
Ok, I wouldn’t have necessarily classed that as ‘good scientific evidence’ but it seems to be useful Bayesian evidence so we must be looking at it from different angles.