This is an essay I wrote in 2017 as coursework for the final year of my Psychology undergrad degree. (That was a year before I learned about EA and the rationalist movement.)
I’m posting this as a shortform comment, rather than as a full post, because it’s now a little outdated, it’s just one of many things that people have written on this topic, and I don’t think the topic is of central interest to a massive portion of LessWrong readers. But I do think it holds up well, is pretty clear, and makes some points that generalise decently beyond psychology (e.g., about drawing boundaries between science and pseudoscience, evaluating research fields, and good research practice).
I put the references in a “reply” to this.
Psychology’s scientific status has been denied or questioned by some (e.g., Berezow, 2012; Campbell, 2012). Evaluating such critiques and their rebuttals requires defining “science”, considering what counts as psychology, and exploring how unscientific elements within a field influence the scientific standing of that field as a whole. This essay presents a conception of “science” that consolidates features commonly seen as important into a family resemblance model. Using this model, I argue psychology is indeed a science, despite unscientific individuals, papers, and practices within it. However, these unscientific practices make psychology less scientific than it could be. Thus, I outline their nature and effects, and how psychologists are correcting these issues.
Addressing whether psychology is a science requires specifying what is meant by “science”. This is more difficult than some writers seem to recognise. For example, Berezow (2012) states we can “definitively” say psychology is non-science “[b]ecause psychology often does not meet the five basic requirements for a field to be considered scientifically rigorous: clearly defined terminology, quantifiability, highly controlled experimental conditions, reproducibility and, finally, predictability and testability.” However, there are fields that do not meet those criteria whose scientific status is generally unquestioned. For example, astronomy and earthquake science do not utilise experiments (Irzik & Nola, 2014). Furthermore, Berezow leaves unmentioned other features associated with science, such as data-collection and inference-making (Irzik & Nola, 2011). Many such features have been noted by various writers, though some are contested by others, or are only present or applicable in certain sciences. For example, direct observation of the matters of interest has been rightly noted as helping make fields scientific, as it reduces issues like the gap between self-reported intentions and the behaviours researchers seek to predict (Godin, Conner, & Sheeran, 2005; Rhodes & de Bruijn, 2013; Sheeran, 2002; Skinner, 1987). However, self-reported intentions are still useful predictors of behaviour and levers for manipulating it (Godin et al., 2005; Rhodes & de Bruijn, 2013; Sheeran, 2002), and science often productively investigates constructs such as gravity that are not directly observable (Bringmann & Eronen, 2016; Chomsky, 1971; Fanelli, 2010; Michell, 2013). Thus, definitions of science would benefit from noting the value of direct observation, but cannot exclude indirect measures or unobservable constructs. This highlights the difficulty – or perhaps impossibility – of defining science by way of a list of necessary and sufficient conditions for scientific status (Mahner, 2013).
An attractive solution is instead constructing a family resemblance model of science (Dagher & Erduran, 2016; Irzik & Nola, 2011, 2014; Pigliucci, 2013). Family resemblance models are sets of features shared by many but not all examples of something. To demonstrate, three characteristics common in science are experiments, double-blind trials, and the hypothetico-deductive method (Irzik & Nola, 2014). A definition of science omitting these would be missing something important. However, calling these “necessary” excludes many sciences; for example, particle physics would be rendered unscientific for lack of double-blind trials (Cleland & Brindell, 2013; Irzik & Nola, 2014). Thus, a family resemblance model of science only requires a field to have enough scientific features, rather than requiring the field to have all such features. The full list of features this model should include, the relative importance of each feature, and what number or combination is required for something to be a “science” could all be debated. However, for showing that psychology is a science, it will suffice to provide a rough family resemblance model incorporating some particularly important features, which I shall now outline.
Firstly, Berezow’s (2012) “requirements”, while not actually necessary for scientific status, do belong in a family resemblance model of science. That is, when these features can be achieved, they make a field more scientific. The importance of reproducibility is also highlighted by Kahneman (2014) and Klein et al. (2014a, 2014b), and that of testability or falsifiability is also mentioned by Popper (1957) and Ferguson and Heene (2012). These features are related to the more fundamental idea that science should be empirical; claims should be required to be supported by evidence (Irzik & Nola, 2011; Pigliucci, 2013). Together, these features allow science to be self-correcting, incrementally progressing towards truth by accumulation of evidence and peer-review of ideas and findings (Open Science Collaboration, 2015). This is further supported by scientists’ methods and results being made public and transparent (Anderson, Martinson, & De Vries, 2007; Anderson, Ronning, De Vries, & Martinson, 2010; Nosek et al., 2015; Stricker, 1997). Additionally, findings and predictions should logically cohere with established theories, including those from other sciences (Lilienfeld, 2011; Mahner, 2013). These features all support science’s ultimate aims to benefit humanity by explaining, predicting, and controlling phenomena (Hansson, 2013; Irzik & Nola, 2014; Skinner, cited in Delprato & Midgley, 1992). No single feature is necessary for scientific status, and many other features could be added, but the point is that each feature a field possesses makes that field more scientific. Thus, armed with this model, we are nearly ready to productively evaluate the scientific status of psychology.
However, two further questions must first be addressed: What is psychology, and how do unscientific occurrences within psychology affect the scientific status of the field as a whole? For example, it can generally be argued parapsychology is not truly part of psychology, for reasons such as its lack of support from mainstream psychologists. However, there are certain more challenging instances, such as the case of a paper by Bem (2011) claiming to find evidence for precognition. This used accepted methodological and analytical techniques, was published in a leading psychology journal, and was written by a prominent, mainstream psychologist. Thus, one must accept that this paper is, to a substantial extent, part of psychology. It therefore appears important to determine whether Bem’s paper exemplifies science. It certainly has many scientific features, such as use of experiments and evidence. However, it lacks other features, such as logical coherence with the established principle of causation only proceeding forwards in time.
But it is unnecessary here to determine whether the paper is non-science, insufficiently scientific, or bad science, because, regardless, this episode shows psychology as a field being scientific. This is because scientific features such as self-correction and reproducibility are most applicable to a field as a whole, rather than to an individual scientist or article, and these features are visible in psychology’s response to Bem’s (2011) paper. Replication attempts were conducted and supported the null hypothesis, namely that precognition does not occur (Galak, LeBoeuf, Nelson, & Simmons, 2012; Ritchie, Wiseman, & French, 2012; Wagenmakers, Wetzels, Borsboom, van der Maas, & Kievit, 2012). Furthermore, publicity, peer-review, and self-correction of findings and ideas were apparent in those failed replications and in commentary on Bem’s paper (Wagenmakers, Wetzels, Borsboom, & van der Maas, 2011; Francis, 2012; LeBel & Peters, 2011). Peers discussed many issues with Bem’s article, such as several variables having been recorded by Bem’s experimental program yet not mentioned in the study (Galak et al., 2012; Ritchie et al., 2012), suggesting that the positive results reported may have been false positives emerging by chance from many, mostly unreported analyses. Wagenmakers et al. (2011) similarly noted other irregularities and unexplained choices in data transformation and analysis, and highlighted that Bem had previously recommended to psychologists: “If you see dim traces of interesting patterns, try to reorganize the data to bring them into bolder relief. […] Go on a fishing expedition for something—anything—interesting” (Bem, cited in Wagenmakers et al., 2011). These responses to Bem’s study highlight that, while its scientific status is highly questionable, isolated events like this need not greatly affect the scientific status of the entire field of psychology.
Indeed, psychology’s response to Bem’s (2011) paper exemplifies ways in which the field in general fits the family resemblance model of science outlined earlier. This model captures how different parts of psychology can each be scientific, despite showing different combinations of scientific features. For example, behaviourists may use more direct observation and clearly defined terminology (see Delprato & Midgley, 1992; Skinner, 1987), while evolutionary psychologists better integrate their theories and findings with established theories from other sciences (see Burke, 2014; Confer et al., 2010). These features make subfields that have them more scientific, but lacking one feature does not make a subfield non-science. Similarly, while much of psychology utilises controlled experiments, those parts that do not, like longitudinal studies of the etiology of mental disorders, can still be scientific if they have enough other scientific features, such as accumulation of evidence to increase our capacity for prediction and intervention.
Meanwhile, other scientific features are essentially universal in psychology. For example, all psychological claims and theories are expected to be based on or confirmed by evidence, and are rejected or modified if found not to be. Additionally, psychological methods and findings are made public by publication, with papers being peer-reviewed before this and open to critique afterwards, facilitating self-correction. Such self-correction can be seen in the response to Bem’s (2011) paper, as well as in how most psychological researchers now reject the untestable ideas of early psychoanalysis (see Cioffi, 2013; Pigliucci, 2013). Parts of psychology vary in their emphasis on basic versus applied research; for example, some psychologists investigate the processes underlying sadness while others conduct trials of specific cognitive therapy techniques for depression. However, these various branches can support each other, and all psychological research ultimately pursues benefitting humanity by explaining, predicting, and controlling phenomena. Indeed, while there is much work to be done and precision is rarely achieved, psychology can already make predictions much more accurate than chance or intuition in many areas, and thus provides benefits as diverse as anxiety-reduction via exposure therapy and HIV-prevention via soap operas informed by social-cognitive theories (Bandura, 2002; Lilienfeld, Ritschel, Lynn, Cautin, & Latzman, 2013; Zimbardo, 2004). All considered, most of psychology exemplifies most important scientific features, and thus psychology should certainly be considered a science.
However, psychology is not as scientific as it could be. Earlier I noted that isolated papers reporting inaccurate findings and utilising unscientific practices, as Bem (2011) seems highly likely to have done, should not significantly affect psychology’s scientific status, as long as the field self-corrects adequately. However, as several commentators on Bem’s paper noted, more worrying is what that paper reflects regarding psychology more broadly, given that it largely met or exceeded psychology’s methodological, analytical, and reporting standards (Francis, 2012; LeBel & Peters, 2011; Wagenmakers et al., 2011). The fact that Bem met these standards, yet still “discovered” and published results that seem to violate fundamental principles of causation, highlights the potential prevalence of spurious findings in the psychological literature. These findings could result from various flaws and biases, yet might fail to be recognised or countered in the way Bem’s report was if they are not as clearly false; indeed, they may be entirely plausible, yet inaccurate (LeBel & Peters, 2011). Thus, I will now discuss how critiques regarding Bem’s paper apply to much of mainstream psychology.
Firstly, the kind of “fishing expedition” recommended by Bem (cited in Wagenmakers et al., 2011) is common in psychology. Researchers often record many variables, and have flexibility in which variables, interactions, participants, data transformations, and statistics they use in their analyses (John, Loewenstein, & Prelec, 2012). Wagenmakers et al. (2012) note that such practices are not inherently problematic, and indeed such explorations are useful for suggesting hypotheses to test in a confirmatory manner. The issue is that these explorations are often inadequately reported and presented as confirmatory themselves, despite the increased risk of false positives when conducting multiple comparisons (Asendorpf et al., 2013; Wagenmakers et al., 2012). Neuropsychological studies can be particularly affected by failures to control for multiple comparisons, even if all analyses are reported, because analysis of brain activity makes huge numbers of comparisons the norm; without statistical controls, false positives are almost guaranteed (Bennett, Baird, Miller, & Wolford, 2009). The false positives produced by uncorrected multiple comparisons, whether reported or not, are compounded by hindsight bias, which makes results seem plausible and predictable in retrospect (Wagenmakers et al., 2012). This can cause overconfidence in findings and make researchers feel comfortable writing articles as if these findings were hypothesised beforehand (Kerr, 1998). These practices inflate the number of false discoveries and spurious confirmations of theories in the psychological literature.
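To make the scale of this problem concrete, here is a minimal simulation, in Python, of how uncorrected multiple comparisons inflate false positives. All numbers (twenty recorded variables, thirty participants per group, a thousand simulated studies) are illustrative assumptions, not figures from any of the studies cited above:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical setup: each study records 20 unrelated outcome variables
# for two groups of 30 participants, and NO variable has a true effect.
n_studies, n_variables, n_per_group = 1000, 20, 30

false_positive_studies = 0
for _ in range(n_studies):
    # Every variable is pure noise in both groups, so the null hypothesis
    # is true for all 20 comparisons.
    a = rng.normal(size=(n_variables, n_per_group))
    b = rng.normal(size=(n_variables, n_per_group))
    p_values = stats.ttest_ind(a, b, axis=1).pvalue
    # Without correction, the study "finds" an effect if ANY p < .05.
    if (p_values < 0.05).any():
        false_positive_studies += 1

# Expect roughly 1 - 0.95**20 ≈ 0.64: about two thirds of these purely
# null studies report at least one "significant" result. Applying a
# Bonferroni threshold of 0.05 / 20 instead restores the rate to ~5%.
print(false_positive_studies / n_studies)
```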
This is compounded by publication bias. Journals are more likely to publish novel and positive results than replications or negative results (Ferguson & Heene, 2012; Francis, 2012; Ioannidis, Munafò, Fusar-Poli, Nosek, & David, 2014; Kerr, 1998). One reason for this is that, despite the importance of self-correction and incremental progress, replications and negative results are often treated as not showing anything substantially interesting (Klein et al., 2014b). Another reason is the idea that null results are hard to interpret or overly likely to be false negatives (Ferguson & Heene, 2012; Kerr, 1998). Psychological studies regularly have insufficient power; their sample sizes mean that, even if an effect of the expected size does exist, the chance of not finding it is substantial (Asendorpf et al., 2013; Bakker, Hartgerink, Wicherts, & van der Maas, 2016). Further, the frequentist statistics typically used by psychologists cannot clearly quantify the support data provides for null hypotheses; these statistics have difficulty distinguishing between powerful evidence for no effect and simply a failure to find evidence for an effect (Dienes, 2011). While concerns about the interpretability of null results are thus often reasonable, the resulting suppression of those results distorts the psychological literature’s representation of reality (see Fanelli, 2010; Kerr, 1998). Publication bias also takes the form of researchers being more likely to submit for publication those studies that revealed positive results (John et al., 2012). This can occur because researchers themselves often find negative results difficult to interpret, and know they are less likely to be published or to lead to incentives like grants or prestige (Kerr, 1998; Open Science Collaboration, 2015). Thus, flexibility in analysis, failure to control for or report multiple comparisons, presentation of exploratory results as confirmatory, publication bias, low power, and difficulty interpreting null results are interrelated issues. These issues in turn make psychology less scientific by reducing the transparency of methods and findings.
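The power problem can also be made concrete with a short simulation. In this sketch the effect genuinely exists, yet most studies miss it; the effect size (d = 0.4) and sample size (25 per group) are assumptions chosen to be broadly typical of the literature discussed above, not values from any particular study:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Hypothetical study design: a real effect of d = 0.4 exists, but each
# group contains only 25 participants.
effect_size_d, n_per_group, n_simulations = 0.4, 25, 10000

significant = 0
for _ in range(n_simulations):
    control = rng.normal(0.0, 1.0, n_per_group)
    treatment = rng.normal(effect_size_d, 1.0, n_per_group)
    if stats.ttest_ind(treatment, control).pvalue < 0.05:
        significant += 1

# Power comes out near 0.29: even though the effect is real, roughly 70%
# of such studies end as "null results" that journals are unlikely to
# publish and researchers may struggle to interpret.
print(significant / n_simulations)
```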
These issues also undermine other scientific features. The Open Science Collaboration (2015) conducted replications of 100 studies from leading psychological journals, finding that less than half replicated successfully. This low level of reproducibility in itself makes psychology less scientific, and provides further evidence of the likely high prevalence and impact of the issues noted above (Asendorpf et al., 2013; Open Science Collaboration, 2015). Together, these problems impede self-correction, and make psychology’s use of evidence and testability of theories less meaningful, as replications and negative tests are often unreported (Ferguson & Heene, 2012). This undermines psychology’s ability to benefit humanity by explaining, predicting, and controlling phenomena.
However, while these issues make psychology less scientific, they do not make it non-science. Other sciences, including “hard sciences” like physics and biology, also suffer from issues like publication bias and low reproducibility and transparency (Alatalo, Mappes, & Elgar, 1997; Anderson, Burnham, Gould, & Cherry, 2001; McNutt, 2014; Miguel et al., 2014; Sarewitz, 2012; Service, 2002). These issues may be more pronounced in psychology than in “harder” sciences, and they demand a response wherever they occur, but their presence is not necessarily damning (see Fanelli, 2010). For example, the Open Science Collaboration (2015) did find a large portion of effects replicated, particularly effects whose initial evidence was stronger. Meanwhile, Klein et al. (2014a) found a much higher rate of replication for more established effects, compared to the Open Science Collaboration’s quasi-random sample of recent findings. Both results highlight that, while psychology certainly has work to do to become more reliable, the field also has the capacity to scientifically progress towards truth and is already doing so to a meaningful extent.
Furthermore, psychologists themselves are highlighting these issues and researching and implementing solutions for them. Bakker et al. (2016) discuss the problem of low power and how to overcome it with larger sample sizes, reinforced by researchers habitually running power analyses before conducting studies and by reviewers checking that these analyses have been conducted. Nosek et al. (2015) proposed guidelines for promoting transparency by changing what journals encourage or require, such as replications, better reporting and sharing of materials and data, and pre-registration of studies and analysis plans. Pre-registration side-steps confirmation bias, hindsight bias, and unreported or uncorrected multiple comparisons, as expectations and analysis plans are on record before data is gathered (Wagenmakers et al., 2012). Journals can also conditionally accept studies for publication based on pre-registered plans, minimising bias against null results by both journals and researchers. Such proposals still welcome exploratory analyses, but prevent these analyses being presented as confirmatory (Miguel et al., 2014). Finally, psychologists have argued for, outlined how to use, and adopted Bayesian statistics as an alternative to frequentist statistics (Ecker, Lewandowsky, & Apai, 2011; Wagenmakers et al., 2011). Bayesian statistics provide clear quantification of evidence for null hypotheses, combatting one source of publication bias and making testability of psychological claims more meaningful (Dienes, 2011; Francis, 2012). These proposals are beginning to take effect. For example, many journals and organisations are signatories to Nosek et al.’s guidelines. Additionally, the Center for Open Science, led by the psychologist Brian Nosek, has set up online tools for researchers to routinely make their data, code, and pre-registered plans public (Miguel et al., 2014). This shows psychology self-correcting its practices, not just individual findings, to become more scientific.
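To illustrate the Bayesian point, here is a minimal sketch of a Bayes factor comparing a point null hypothesis against an alternative. The model (a normal likelihood with known variance) and the prior width tau are simplifying assumptions chosen for clarity; they illustrate the general idea rather than the specific default tests recommended by Dienes (2011) or Wagenmakers et al. (2011):

```python
import numpy as np
from scipy import stats

def bayes_factor_01(sample_mean, n, sigma=1.0, tau=0.5):
    """BF01 for H0: mu = 0 versus H1: mu ~ Normal(0, tau**2),
    assuming n observations with known standard deviation sigma."""
    # Marginal likelihood of the observed sample mean under each hypothesis.
    like_h0 = stats.norm.pdf(sample_mean, loc=0.0, scale=sigma / np.sqrt(n))
    like_h1 = stats.norm.pdf(sample_mean, loc=0.0,
                             scale=np.sqrt(tau**2 + sigma**2 / n))
    return like_h0 / like_h1

# A near-zero mean from a large sample yields BF01 around 10: substantial
# evidence FOR the null, something a non-significant p-value alone cannot
# express. A clearly non-zero mean yields BF01 far below 1 instead.
print(bayes_factor_01(sample_mean=0.01, n=400))  # ≈ 10, supports H0
print(bayes_factor_01(sample_mean=0.30, n=400))  # << 1, supports H1
```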
I have argued here that claims that psychology is non-scientific may often reflect unworkable definitions of science and ignorance of what psychology actually involves. A family resemblance model of science overcomes the former issue by outlining features that sciences do not have to possess to be science, but do become more scientific by possessing. This model suggests psychology is a science because it generally exemplifies most scientific features; most importantly, it accumulates evidence publicly, incrementally, and self-critically to benefit humanity by explaining, predicting, and controlling phenomena. However, psychology is not as scientific as it could be. A variety of interrelated issues with researchers’ and journals’ practices and incentive structures impede the effectiveness and meaningfulness of psychology’s scientific features. But failure to be perfectly scientific is not unique to psychology; it is universal among sciences. Science has achieved what it has because of its constant commitment to incremental improvement and self-correction of its own practices. In keeping with this, psychologists are researching and discussing psychology’s issues and their potential solutions, and such solutions are being put into action. More work must be done, and more researchers and journals must act on and push for these discussions and solutions, but already it is clear both that psychology is a science and that it is actively working to become more scientific.
References
Alatalo, R. V., Mappes, J., & Elgar, M. A. (1997). Heritabilities and paradigm shifts. Nature, 385(6615), 402-403. doi:10.1038/385402a0
Anderson, D. R., Burnham, K. P., Gould, W. R., & Cherry, S. (2001). Concerns about finding effects that are actually spurious. Wildlife Society Bulletin, 29(1), 311-316.
Anderson, M. S., Martinson, B. C., & De Vries, R. (2007). Normative dissonance in science: Results from a national survey of U.S. scientists. Journal of Empirical Research on Human Research Ethics: An International Journal, 2(4), 3-14. doi:10.1525/jer.2007.2.4.3
Anderson, M. S., Ronning, E. A., De Vries, R., & Martinson, B. C. (2010). Extending the Mertonian norms: Scientists’ subscription to norms of research. The Journal of Higher Education, 81(3), 366-393. doi:10.1353/jhe.0.0095
Asendorpf, J. B., Conner, M., De Fruyt, F., De Houwer, J., Denissen, J. J. A., Fiedler, K., … Wicherts, J. M. (2013). Recommendations for increasing replicability in psychology. European Journal of Personality, 27(2), 108-119. doi:10.1002/per.1919
Bakker, M., Hartgerink, C. H. J., Wicherts, J. M., & van der Maas, H. L. J. (2016). Researchers’ intuitions about power in psychological research. Psychological Science, 27(8), 1069-1077. doi:10.1177/0956797616647519
Bandura, A. (2002). Environmental sustainability by sociocognitive deceleration of population growth. In P. Schmuck & W. P. Schultz (Eds.), Psychology of sustainable development (pp. 209-238). New York, NY: Springer.
Bem, D. J. (2011). Feeling the future: Experimental evidence for anomalous retroactive influences on cognition and affect. Journal of Personality and Social Psychology, 100(3), 407-425. doi:10.1037/a0021524
Bennett, C. M., Baird, A. A., Miller, M. B., & Wolford, G. L. (2009). Neural correlates of interspecies perspective taking in the post-mortem Atlantic salmon: An argument for multiple comparisons correction. NeuroImage, 47(Suppl 1), S125. doi:10.1016/s1053-8119(09)71202-9
Berezow, A. B. (2012, July 13). Why psychology isn’t science. Los Angeles Times. Retrieved from http://latimes.com
Bringmann, L. F., & Eronen, M. I. (2016). Heating up the measurement debate: What psychologists can learn from the history of physics. Theory & Psychology, 26(1), 27-43. doi:10.1177/0959354315617253
Burke, D. (2014). Why isn’t everyone an evolutionary psychologist? Frontiers in Psychology, 5. doi:10.3389/fpsyg.2014.00910
Campbell, H. (2012, July 17). A biologist and a psychologist square off over the definition of science. Science 2.0. Retrieved from http://www.science20.com
Chomsky, N. (1971). The case against B. F. Skinner. The New York Review of Books, 17(11), 18-24.
Cleland, C. E., & Brindell, S. (2013). Science and the messy, uncontrollable world of nature. In M. Pigliucci & M. Boudry (Eds.), The philosophy of pseudoscience (pp. 183-202). Chicago, IL: University of Chicago Press.
Confer, J. C., Easton, J. A., Fleischman, D. S., Goetz, C. D., Lewis, D. M., Perilloux, C., & Buss, D. M. (2010). Evolutionary psychology: Controversies, questions, prospects, and limitations. American Psychologist, 65(2), 110-126. doi:10.1037/a0018413
Dagher, Z. R., & Erduran, S. (2016). Reconceptualizing nature of science for science education: Why does it matter? Science & Education, 25, 147-164. doi:10.1007/s11191-015-9800-8
Delprato, D. J., & Midgley, B. D. (1992). Some fundamentals of B. F. Skinner’s behaviorism. American Psychologist, 47(11), 1507-1520. doi:10.1037//0003-066x.47.11.1507
Dienes, Z. (2011). Bayesian versus orthodox statistics: Which side are you on? Perspectives on Psychological Science, 6(3), 274-290. doi:10.1177/1745691611406920
Ecker, U. K., Lewandowsky, S., & Apai, J. (2011). Terrorists brought down the plane!—No, actually it was a technical fault: Processing corrections of emotive information. The Quarterly Journal of Experimental Psychology, 64(2), 283-310. doi:10.1080/17470218.2010.497927
Fanelli, D. (2010). “Positive” results increase down the hierarchy of the sciences. PLoS ONE, 5(4). doi:10.1371/journal.pone.0010068
Ferguson, C. J., & Heene, M. (2012). A vast graveyard of undead theories: Publication bias and psychological science’s aversion to the null. Perspectives on Psychological Science, 7(6), 555-561. doi:10.1177/1745691612459059
Francis, G. (2012). Too good to be true: Publication bias in two prominent studies from experimental psychology. Psychonomic Bulletin & Review, 19(2), 151-156. doi:10.3758/s13423-012-0227-9
Galak, J., LeBoeuf, R. A., Nelson, L. D., & Simmons, J. P. (2012). Correcting the past: Failures to replicate psi. Journal of Personality and Social Psychology, 103(6), 933-948. doi:10.1037/a0029709
Godin, G., Conner, M., & Sheeran, P. (2005). Bridging the intention-behaviour gap: The role of moral norm. British Journal of Social Psychology, 44(4), 497-512. doi:10.1348/014466604x17452
Hansson, S. O. (2013). Defining pseudoscience and science. In M. Pigliucci & M. Boudry (Eds.), The philosophy of pseudoscience (pp. 61-77). Chicago, IL: University of Chicago Press.
Ioannidis, J. P., Munafò, M. R., Fusar-Poli, P., Nosek, B. A., & David, S. P. (2014). Publication and other reporting biases in cognitive sciences: Detection, prevalence, and prevention. Trends in Cognitive Sciences, 18(5), 235-241. doi:10.1016/j.tics.2014.02.010
Irzik, G., & Nola, R. (2011). A family resemblance approach to the nature of science for science education. Science & Education, 20(7), 591-607. doi:10.1007/s11191-010-9293-4
Irzik, G., & Nola, R. (2014). New directions for nature of science research. In M. R. Matthews (Ed.), International handbook of research in history, philosophy and science teaching (pp. 999-1021). Dordrecht: Springer.
John, L., Loewenstein, G., & Prelec, D. (2012). Measuring the prevalence of questionable research practices with incentives for truth-telling. Psychological Science, 23(5), 524-532. doi:10.1177/0956797611430953
Kerr, N. L. (1998). HARKing: Hypothesizing after the results are known. Personality and Social Psychology Review, 2(3), 196-217. doi:10.1207/s15327957pspr0203_4
Kahneman, D. (2014). A new etiquette for replication. Social Psychology, 45(4), 310-311.
Klein, R. A., Ratliff, K. A., Vianello, M., Adams, R. B., Bahník, S., Bernstein, M. J., Bocian, K., … Nosek, B. A. (2014a). Investigating variation in replicability: A “many labs” replication project. Social Psychology, 45(3), 142-152. doi:10.1027/1864-9335/a000178
Klein, R. A., Ratliff, K. A., Vianello, M., Adams, R. B., Bahník, S., Bernstein, M. J., Bocian, K., … Nosek, B. A. (2014b). Theory building through replication: Response to commentaries on the “many labs” replication project. Social Psychology, 45(4), 299-311. doi:10.1027/1864-9335/a000202
LeBel, E. P., & Peters, K. R. (2011). Fearing the future of empirical psychology: Bem’s (2011) evidence of psi as a case study of deficiencies in modal research practice. Review of General Psychology, 15(4), 371-379. doi:10.1037/a0025172
Lilienfeld, S. O. (2011). Distinguishing scientific from pseudoscientific psychotherapies: Evaluating the role of theoretical plausibility, with a little help from Reverend Bayes. Clinical Psychology: Science and Practice, 18(2), 105-112. doi:10.1111/j.1468-2850.2011.01241.x
Lilienfeld, S. O., Ritschel, L. A., Lynn, S. J., Cautin, R. L., & Latzman, R. D. (2013). Why many clinical psychologists are resistant to evidence-based practice: Root causes and constructive remedies. Clinical Psychology Review, 33(7), 883-900. doi:10.1016/j.cpr.2012.09.008
Mahner, M. (2013). Science and pseudoscience: How to demarcate after the (alleged) demise of the demarcation problem. In M. Pigliucci & M. Boudry (Eds.), The philosophy of pseudoscience (pp. 29-43). Chicago, IL: University of Chicago Press.
McNutt, M. (2014). Reproducibility. Science, 343(6168), 229. doi:10.1126/science.1250475
Michell, J. (2013). Constructs, inferences, and mental measurement. New Ideas in Psychology, 31(1), 13-21. doi:10.1016/j.newideapsych.2011.02.004
Miguel, E., Camerer, C., Casey, K., Cohen, J., Esterling, K. M., Gerber, A., … Laan, M. V. (2014). Promoting transparency in social science research. Science, 343(6166), 30-31. doi:10.1126/science.1245317
Nosek, B. A., Alter, G., Banks, G. C., Borsboom, D., Bowman, S. D., Breckler, S. J., … Contestabile, M. (2015). Promoting an open research culture. Science, 348(6242), 1422-1425. doi:10.1126/science.aab2374
Open Science Collaboration (2015). Estimating the reproducibility of psychological science. Science, 349(6251), aac4716.
Popper, K. (1957). Philosophy of science: A personal report. In C. A. Mace (Ed.), British philosophy in mid-century (pp. 155-160). London: Allen and Unwin.
Pigliucci, M. (2013). The demarcation problem: A (belated) response to Laudan. In M. Pigliucci & M. Boudry (Eds.), The philosophy of pseudoscience (pp. 9-28). Chicago, IL: University of Chicago Press.
Rhodes, R. E., & de Bruijn, G.-J. (2013). How big is the physical activity intention-behaviour gap? A meta-analysis using the action control framework. British Journal of Health Psychology, 18(2), 296-309. doi:10.1111/bjhp.12032
Ritchie, S. J., Wiseman, R., & French, C. C. (2012). Failing the future: Three unsuccessful attempts to replicate Bem’s “retroactive facilitation of recall” effect. PLoS ONE, 7(3), e33423. doi:10.1371/journal.pone.0033423
Sarewitz, D. (2012). Beware the creeping cracks of bias. Nature, 485(7397), 149.
Service, R. F. (2002). Scientific misconduct: Bell Labs fires star physicist found guilty of forging data. Science, 298(5591), 30-31. doi:10.1126/science.298.5591.30
Sheeran, P. (2002). Intention—behavior relations: A conceptual and empirical review. European Review of Social Psychology, 12(1), 1-36. doi:10.1080/14792772143000003
Skinner, B. F. (1987). Whatever happened to psychology as the science of behavior? American Psychologist, 42(8), 780-786. doi:10.1037/0003-066x.42.8.780
Stricker, G. (1997). Are science and practice commensurable? American Psychologist, 52(4), 442-448. doi:10.1037//0003-066x.52.4.442
Wagenmakers, E., Wetzels, R., Borsboom, D., & van der Maas, H. L. J. (2011). Why psychologists must change the way they analyze their data: The case of psi: Comment on Bem (2011). Journal of Personality and Social Psychology, 100(3), 426-432. doi:10.1037/a0022790
Wagenmakers, E., Wetzels, R., Borsboom, D., van der Maas, H. L. J., & Kievit, R. A. (2012). An agenda for purely confirmatory research. Perspectives on Psychological Science, 7(6), 632-638. doi:10.1177/1745691612463078
Zimbardo, P. G. (2004). Does psychology make a significant difference in our lives? American Psychologist, 59(5), 339-351. doi:10.1037/0003-066x.59.5.339
Psychology: An Imperfect and Improving Science
This is an essay I wrote in 2017 as coursework for the final year of my Psychology undergrad degree. (That was a year before I learned about EA and the rationalist movement.)
I’m posting this as a shortform comment, rather than as a full post, because it’s now a little outdated, it’s just one of many things that people have written on this topic, and I don’t think the topic is of central interest to a massive portion of LessWrong readers. But I do think it holds up well, is pretty clear, and makes some points that generalise decently beyond psychology (e.g., about drawing boundaries between science and pseudoscience, evaluating research fields, and good research practice).
I put the references in a “reply” to this.
Psychology’s scientific status has been denied or questioned by some (e.g., Berezow, 2012; Campbell, 2012). Evaluating such critiques and their rebuttals requires defining “science”, considering what counts as psychology, and exploring how unscientific elements within a field influence the scientific standing of that field as a whole. This essay presents a conception of “science” that consolidates features commonly seen as important into a family resemblance model. Using this model, I argue psychology is indeed a science, despite unscientific individuals, papers, and practices within it. However, these unscientific practices make psychology less scientific than it could be. Thus, I outline their nature and effects, and how psychologists are correcting these issues.
Addressing whether psychology is a science requires specifying what is meant by “science”. This is more difficult than some writers seem to recognise. For example, Berezow (2012) states we can “definitively” say psychology is non-science “[b]ecause psychology often does not meet the five basic requirements for a field to be considered scientifically rigorous: clearly defined terminology, quantifiability, highly controlled experimental conditions, reproducibility and, finally, predictability and testability.” However, there are fields that do not meet those criteria whose scientific status is generally unquestioned. For example, astronomy and earthquake science do not utilise experiments (Irzik & Nola, 2014). Furthermore, Berezow leaves unmentioned other features associated with science, such as data-collection and inference-making (Irzik & Nola, 2011). Many such features have been noted by various writers, though some are contested by others or only present or logical in certain sciences. For example, direct observation of the matters of interest has been rightly noted as helping make fields scientific, as it reduces issues like the gap between self-reported intentions and the behaviours researchers seek to predict (Godin, Conner, & Sheeran, 2005; Rhodes & de Bruijn, 2013; Sheeran, 2002; Skinner, 1987). However, self-reported intentions are still useful predictors of behaviour and levers for manipulating it (Godin et al., 2005; Rhodes & de Bruijn, 2013; Sheeran, 2002), and science often productively investigates constructs such as gravity that are not directly observable (Bringmann & Eronen, 2016; Chomsky, 1971; Fanelli, 2010; Michell, 2013). Thus, definitions of science would benefit from noting the value of direct observation, but cannot exclude indirect measures or unobservable constructs. This highlights the difficulty – or perhaps impossibility – of defining science by way of a list of necessary and sufficient conditions for scientific status (Mahner, 2013).
An attractive solution is instead constructing a family resemblance model of science (Dagher & Erduran, 2016; Irzik & Nola, 2011, 2014; Pigliucci, 2013). Family resemblance models are sets of features shared by many but not all examples of something. To demonstrate, three characteristics common in science are experiments, double-blind trials, and the hypothetico-deductive method (Irzik & Nola, 2014). A definition of science omitting these would be missing something important. However, calling these “necessary” excludes many sciences; for example, particle physics would be rendered unscientific for lack of double-blind trials (Cleland & Brindell, 2013; Irzik & Nola, 2014). Thus, a family resemblance model of science only requires a field to have enough scientific features, rather than requiring the field to have all such features. The full list of features this model should include, the relative importance of each feature, and what number or combination is required for something to be a “science” could all be debated. However, for showing that psychology is a science, it will suffice to provide a rough family resemblance model incorporating some particularly important features, which I shall now outline.
Firstly, Berezow’s (2012) “requirements”, while not actually necessary for scientific status, do belong in a family resemblance model of science. That is, when these features can be achieved, they make a field more scientific. The importance of reproducibility is highlighted also by Kahneman (2014) and Klein et al. (2014a, 2014b), and that of testability or falsifiability is also mentioned by Popper (1957) and Ferguson and Heene (2012). These features are related to the more fundamental idea that science should be empirical; claims should be required to be supported by evidence (Irzik & Nola, 2011; Pigliucci, 2013). Together, these features allow science to be self-correcting, incrementally progressing towards truth by accumulation of evidence and peer-review of ideas and findings (Open Science Collaboration, 2015). This is further supported by scientists’ methods and results being made public and transparent (Anderson, Martinson, & De Vries, 2007, 2010; Nosek et al., 2015; Stricker, 1997). Additionally, findings and predictions should logically cohere with established theories, including those from other sciences (Lilienfeld, 2011; Mahner, 2013). These features all support science’s ultimate aims to benefit humanity by explaining, predicting, and controlling phenomena (Hansson, 2013; Irzik & Nola, 2014; Skinner, cited in Delprato & Midgley, 1992). Each feature may not be necessary for scientific status, and many other features could be added, but the point is that each feature a field possesses makes that field more scientific. Thus, armed with this model, we are nearly ready to productively evaluate the scientific status of psychology.
However, two further questions must first be addressed: What is psychology, and how do unscientific occurrences within psychology affect the scientific status of the field as a whole? For example, it can generally be argued parapsychology is not truly part of psychology, for reasons such as its lack of support from mainstream psychologists. However, there are certain more challenging instances, such as the case of a paper by Bem (2011) claiming to find evidence for precognition. This used accepted methodological and analytical techniques, was published in a leading psychology journal, and was written by a prominent, mainstream psychologist. Thus, one must accept that this paper is, to a substantial extent, part of psychology. It therefore appears important to determine whether Bem’s paper exemplifies science. It certainly has many scientific features, such as use of experiments and evidence. However, it lacks other features, such as logical coherence with the established principle of causation only proceeding forwards in time.
But it is unnecessary here to determine whether the paper is non-science, insufficiently scientific, or bad science, because, regardless, this episode shows psychology as a field being scientific. This is because scientific features such as self-correction and reproducibility are most applicable to a field as a whole, rather than to an individual scientist or article, and these features are visible in psychology’s response to Bem’s (2011) paper. Replication attempts were produced and supported the null hypothesis; namely, that precognition does not occur (Galak, LeBoeuf, Nelson, Simmons, 2012; Ritchie, Wiseman, & French, 2012; Wagenmakers, Wetzels, Borsboom, van der Maas, & Kievit, 2012). Furthermore, publicity, peer-review, and self-correction of findings and ideas were apparent in those failed replications and in commentary on Bem’s paper (Wagenmakers, Wetzels, Borsboom, & van der Maas, 2011; Francis, 2012; LeBel & Peters, 2011). Peers discussed many issues with Bem’s article, such as several variables having been recorded by Bem’s experimental program yet not mentioned in the study (Galak et al., 2012; Ritchie et al., 2012), suggesting that the positive results reported may have been false positives emerging by chance from many, mostly unreported analyses. Wagenmakers et al. (2011) similarly noted other irregularities and unexplained choices in data transformation and analysis, and highlighted that Bem had previously recommended to psychologists: “If you see dim traces of interesting patterns, try to reorganize the data to bring them into bolder relief. […] Go on a fishing expedition for something—anything—interesting” (Bem, cited in Wagenmakers et al., 2011). These responses to Bem’s study by psychologists highlight that, while the scientific status of that study is highly questionable, isolated events such as that need not overly affect the scientific status of the entire field of psychology.
Indeed, psychology’s response to Bem’s (2011) paper exemplifies ways in which the field in general fits the family resemblance model of science outlined earlier. This model captures how different parts of psychology can each be scientific, despite showing different combinations of scientific features. For example, behaviourists may use more direct observation and clearly defined terminology (see Delprato & Midgley, 1992; Skinner, 1987), while evolutionary psychologists better integrate their theories and findings with established theories from other sciences (see Burke, 2014; Confer et al., 2010). These features make subfields that have them more scientific, but lacking one feature does not make a subfield non-science. Similarly, while much of psychology utilises controlled experiments, those parts that do not, like longitudinal studies of the etiology of mental disorders, can still be scientific if they have enough other scientific features, such as accumulation of evidence to increase our capacity for prediction and intervention.
Meanwhile, other scientific features are essentially universal in psychology. For example, all psychological claims and theories are expected to be based on or confirmed by evidence, and are rejected or modified if found not to be. Additionally, psychological methods and findings are made public by publication, with papers being peer-reviewed before this and open to critique afterwards, facilitating self-correction. Such self-correction can be seen in the response to Bem’s (2011) paper, as well as in how most psychological researchers now reject the untestable ideas of early psychoanalysis (see Cioffi, 2013; Pigliucci, 2013). Parts of psychology vary in their emphasis on basic versus applied research; for example, some psychologists investigate the processes underlying sadness while others conduct trials of specific cognitive therapy techniques for depression. However, these various branches can support each other, and all psychological research ultimately pursues benefitting humanity by explaining, predicting, and controlling phenomena. Indeed, while there is much work to be done and precision is rarely achieved, psychology can already make predictions much more accurate than chance or intuition in many areas, and thus provides benefits as diverse as anxiety-reduction via exposure therapy and HIV-prevention via soap operas informed by social-cognitive theories (Bandura, 2002; Lilienfeld, Ritschel, Lynn, Cautin, & Latzman, 2013; Zimbardo, 2004). All considered, most of psychology exemplifies most important scientific features, and thus psychology should certainly be considered a science.
However, psychology is not as scientific as it could be. Earlier I noted that isolated papers reporting inaccurate findings and utilising unscientific practices, as Bem (2011) seems highly likely to have, should not significantly affect psychology’s scientific status, as long as the field self-corrects adequately. However, as several commentators on Bem’s paper noted, more worrying is what that paper reflects regarding psychology more broadly, given that it largely met or exceeded psychology’s methodological, analytical, and reporting standards (Francis, 2012; LeBel & Peters, 2011; Wagenmakers et al., 2011). The fact Bem met these standards, yet still “discovered” and got published results that seem to violate fundamental principles about how causation works, highlights the potential prevalence of spurious findings in psychological literature. These findings could result from various flaws and biases, yet might fail to be recognised or countered in the way Bem’s report was if they are not as clearly false; indeed, they may be entirely plausible, yet inaccurate (LeBel & Peters, 2011). Thus, I will now discuss how critiques regarding Bem’s paper apply to much of mainstream psychology.
Firstly, the kind of “fishing expedition” recommended by Bem (cited in Wagenmakers et al., 2011) is common in psychology. Researchers often record many variables, and have flexibility in which variables, interactions, participants, data transformations, and statistics they use in their analyses (John, Loewenstein, & Prelec, 2012). Wagenmakers et al. (2012) note that such practices are not inherently problematic, and indeed such explorations are useful for suggesting hypotheses to test in a confirmatory manner. The issue is that often these explorations are inadequately reported and are presented as confirmatory themselves, despite the increased risk of false positives when conducting multiple comparisons (Asendorpf et al., 2013; Wagenmakers et al., 2012). Neuropsychological studies can be particularly affected by failures to control for multiple comparisons, even if all analyses are reported, because analysis of brain activity makes huge numbers of comparisons the norm. Thus, without statistical controls, false positives are almost guaranteed (Bennett, Baird, Miller, & Wolford, 2009). The issue of uncontrolled multiple comparisons, whether reported or not, causing false positives can be compounded by hindsight bias making results seem plausible and predictable in retrospect (Wagenmakers et al., 2012). This can cause overconfidence in findings and make researchers feel comfortable writing articles as if these findings were hypothesised beforehand (Kerr, 1998). These practices inflate the number of false discoveries and spurious confirmations of theories in psychological literature.
This is compounded by publication bias. Journals are more likely to publish novel and positive results than replications or negative results (Ferguson & Heene, 2012; Francis, 2012; Ioannidis, Munafò, Fusar-Poli, Nosek, & David, 2014; Kerr, 1998). One reason for this is that, despite the importance of self-correction and incremental progress, replications or negative results are often treated as not show anything substantially interesting (Klein et al., 2014b). Another reason is the idea that null results are hard to interpret or overly likely to be false negatives (Ferguson & Heene, 2012; Kerr, 1998). Psychological studies regularly have insufficient power; their sample sizes mean that, even if an effect of the expected size does exist, the chance of not finding it is substantial (Asendorpf et al., 2013; Bakker, Hartgerink, Wicherts, & van der Maas, 2016). Further, the frequentist statistics typically used by psychologists cannot clearly quantify the support data provides for null hypotheses; these statistics have difficulty distinguishing between powerful evidence for no effect and simply a failure to find evidence for an effect (Dienes, 2011). While concerns about the interpretability of null results are thus often reasonable, they distort the psychological literature’s representation of reality (see Fanelli, 2010; Kerr, 1998). Publication bias also takes the form of researchers being more likely to submit for publication those studies that revealed positive results (John et al., 2012). This can occur because researchers themselves also often find negative results difficult to interpret, and know they are less likely to be published or to lead to incentives like grants or prestige (Kerr, 1998; Open Science Collaboration, 2015). Thus, flexibility in analysis, failure to control for or report multiple comparisons, presentation of exploratory results as confirmatory, publication bias, low power, and difficulty interpreting null results are interrelated issues. These issues in turn make psychology less scientific by reducing the transparency of methods and findings.
These issues also undermine other scientific features. The Open Science Collaboration (2015) conducted replications of 100 studies from leading psychological journals, finding that less than half replicated successfully. This low level of reproducibility in itself makes psychology less scientific, and provides further evidence of the likely high prevalence and impact of the issues noted above (Asendorpf et al., 2013; Open Science Collaboration, 2015). Together, these problems impede self-correction, and make psychology’s use of evidence and testability of theories less meaningful, as replications and negative tests are often unreported (Ferguson & Heene, 2012). This undermines psychology’s ability to benefit humanity by explaining, predicting, and controlling phenomena.
However, while these issues make psychology less scientific, they do not make it non-science. Other sciences, including “hard sciences” like physics and biology, also suffer from issues like publication bias and low reproducibility and transparency (Alatalo, Mappes, & Edgar, 1997; Anderson, Burnham, Gould, & Cherry, 2001; McNutt, 2014; Miguel et al., 2014; Sarewitz, 2012; Service, 2002). Their presence is problematic and demands a response in any case, and may be more pronounced in psychology than in “harder” sciences, but it is not necessarily damning (see Fanelli, 2010). For example, the Open Science Collaboration (2015) did find a large portion of effects replicated, particularly effects whose initial evidence was stronger. Meanwhile, Klein et al. (2014a) found a much higher rate of replication for more established effects, compared to the Open Science Collaboration’s quasi-random sample of recent findings. Both results highlight that, while psychology certainly has work to do to become more reliable, the field also has the capacity to scientifically progress towards truth and is already doing so to a meaningful extent.
Furthermore, psychologists themselves are highlighting these issues and researching and implementing solutions for them. Bakker et al. (2016) discuss the problem of low power and how to overcome it with larger sample sizes, reinforced by researchers habitually running power analyses prior to conducting studies and reviewers checking these analyses have been conducted. Nosek et al. (2015) proposed guidelines for promoting transparency by changing what journals encourage or require, such as replications, better reporting and sharing of materials and data, and pre-registration of studies and analysis plans. Pre-registration side-steps confirmation and hindsight bias and unreported, uncorrected multiple comparisons, as expectations and analysis plans are on record before data is gathered (Wagenmakers et al., 2012). Journals can also conditionally accept studies for publication based on pre-registered plans, minimising bias against null results by both journals and researchers. Such proposals still welcome exploratory analyses, but prevent these analyses being presented as confirmatory (Miguel et al., 2014). Finally, psychologists have argued for, outlined how to use, and adopted Bayesian statistics as an alternative to frequentist statistics (Ecker, Lewandowsky, & Apai, 2011; Wagenmakers et al., 2011). Bayesian statistics provide clear quantification of evidence for null hypotheses, combatting one source of publication bias and making testability of psychological claims more meaningful (Dienes, 2011; Francis, 2012). These proposals are beginning to take effect. For example, many journals and organisations are signatories to Nosek et al.’s guidelines. Additionally, the Centre for Open Science, led by the psychologist Brian Nosek, has set up online tools for researchers to routinely make their data, code, and pre-registered plans public (Miguel et al., 2014). This shows psychology self-correcting its practices, not just individual findings, to become more scientific.
I have argued here that claims that psychology is non-scientific may often reflect unworkable definitions of science and ignorance of what psychology actually involves. A family resemblance model of science overcomes the former issue by outlining features that sciences do not have to possess to be science, but do become more scientific by possessing. This model suggests psychology is a science because it generally exemplifies most scientific features; most importantly, it accumulates evidence publicly, incrementally, and self-critically to benefit humanity by explaining, predicting, and controlling phenomena. However, psychology is not as scientific as it could be. A variety of interrelated issues with researchers’ and journals’ practices and incentive structures impede the effectiveness and meaningfulness of psychology’s scientific features. But failure to be perfectly scientific is not unique to psychology; it is universal among sciences. Science has achieved what it has because of its constant commitment to incremental improvement and self-correction of its own practices. In keeping with this, psychologists are researching and discussing psychology’s issues and their potential solutions, and such solutions are being put into action. More work must be done, and more researchers and journals must act on and push for these discussions and solutions, but already it is clear both that psychology is a science and that it is actively working to become more scientific.
References
Alatalo, R. V., Mappes, J., & Elgar, M. A. (1997). Heritabilities and paradigm shifts. Nature, 385(6615), 402-403. doi:10.1038/385402a0
Anderson, D. R., Burnham, K. P., Gould, W. R., & Cherry, S. (2001). Concerns about finding effects that are actually spurious. Wildlife Society Bulletin, 29(1), 311-316.
Anderson, M. S., Martinson, B. C., & Vries, R. D. (2007). Normative dissonance in science: Results from a national survey of U.S. scientists. Journal of Empirical Research on Human Research Ethics: An International Journal, 2(4), 3-14. doi:10.1525/jer.2007.2.4.3
Anderson, M. S., Ronning, E. A., Vries, R. D., & Martinson, B. C. (2010). Extending the Mertonian norms: Scientists’ subscription to norms of research. The Journal of Higher Education, 81(3), 366-393. doi:10.1353/jhe.0.0095
Asendorpf, J. B., Conner, M., Fruyt, F. D., Houwer, J. D., Denissen, J. J., Fiedler, K., … Wicherts, J. M. (2013). Recommendations for increasing replicability in psychology. European Journal of Personality, 27(2), 108-119. doi:10.1002/per.1919
Bakker, M., Hartgerink, C. H., Wicherts, J. M., & Han L. J. Van Der Maas. (2016). Researchers’ intuitions about power in psychological research. Psychological Science, 27(8), 1069-1077. doi:10.1177/0956797616647519
Bandura, A. (2002). Environmental sustainability by sociocognitive deceleration of population growth. In P. Shmuck & W. P. Schultz (Eds.), Psychology of sustainable development (pp. 209-238). New York, NY: Springer.
Bem, D. J. (2011). Feeling the future: Experimental evidence for anomalous retroactive influences on cognition and affect. Journal of Personality and Social Psychology, 100(3), 407-425. doi:10.1037/a0021524
Bennett, C. M., Miller, M. B., & Wolford, G. L. (2009). Neural correlates of interspecies perspective taking in the post-mortem Atlantic Salmon: An argument for multiple comparisons correction. Neuroimage, 47(Suppl 1), S125. doi:10.1016/s1053-8119(09)71202-9
Berezow, A. B. (2012, July 13). Why psychology isn’t science. Los Angeles Times. Retrieved from http://latimes.com
Bringmann, L. F., & Eronen, M. I. (2016). Heating up the measurement debate: What psychologists can learn from the history of physics. Theory & Psychology, 26(1), 27-43. doi:10.1177/0959354315617253
Burke, D. (2014). Why isn’t everyone an evolutionary psychologist? Frontiers in Psychology, 5, 910. doi:10.3389/fpsyg.2014.00910
Campbell, H. (2012, July 17). A biologist and a psychologist square off over the definition of science. Science 2.0. Retrieved from http://www.science20.com
Chomsky, N. (1971). The case against B. F. Skinner. The New York Review of Books, 17(11), 18-24.
Cleland, C. E., & Brindell, S. (2013). Science and the messy, uncontrollable world of nature. In M. Pigliucci & M. Boudry (Eds.), The philosophy of pseudoscience (pp. 183-202). Chicago, IL: University of Chicago Press.
Confer, J. C., Easton, J. A., Fleischman, D. S., Goetz, C. D., Lewis, D. M., Perilloux, C., & Buss, D. M. (2010). Evolutionary psychology: Controversies, questions, prospects, and limitations. American Psychologist, 65(2), 110-126. doi:10.1037/a0018413
Dagher, Z. R., & Erduran, S. (2016). Reconceptualizing nature of science for science education: Why does it matter? Science & Education, 25, 147-164. doi:10.1007/s11191-015-9800-8
Delprato, D. J., & Midgley, B. D. (1992). Some fundamentals of B. F. Skinner’s behaviorism. American Psychologist, 47(11), 1507-1520. doi:10.1037/0003-066x.47.11.1507
Dienes, Z. (2011). Bayesian versus orthodox statistics: Which side are you on? Perspectives on Psychological Science, 6(3), 274-290. doi:10.1177/1745691611406920
Ecker, U. K., Lewandowsky, S., & Apai, J. (2011). Terrorists brought down the plane!—No, actually it was a technical fault: Processing corrections of emotive information. The Quarterly Journal of Experimental Psychology, 64(2), 283-310. doi:10.1080/17470218.2010.497927
Fanelli, D. (2010). “Positive” results increase down the hierarchy of the sciences. PLoS ONE, 5(4), e10068. doi:10.1371/journal.pone.0010068
Ferguson, C. J., & Heene, M. (2012). A vast graveyard of undead theories: Publication bias and psychological science’s aversion to the null. Perspectives on Psychological Science, 7(6), 555-561. doi:10.1177/1745691612459059
Francis, G. (2012). Too good to be true: Publication bias in two prominent studies from experimental psychology. Psychonomic Bulletin & Review, 19(2), 151-156. doi:10.3758/s13423-012-0227-9
Galak, J., LeBoeuf, R. A., Nelson, L. D., & Simmons, J. P. (2012). Correcting the past: Failures to replicate psi. Journal of Personality and Social Psychology, 103(6), 933-948. doi:10.1037/a0029709
Godin, G., Conner, M., & Sheeran, P. (2005). Bridging the intention-behaviour gap: The role of moral norm. British Journal of Social Psychology, 44(4), 497-512. doi:10.1348/014466604x17452
Hansson, S. O. (2013). Defining pseudoscience and science. In M. Pigliucci & M. Boudry (Eds.), The philosophy of pseudoscience (pp. 61-77). Chicago, IL: University of Chicago Press.
Ioannidis, J. P., Munafò, M. R., Fusar-Poli, P., Nosek, B. A., & David, S. P. (2014). Publication and other reporting biases in cognitive sciences: Detection, prevalence, and prevention. Trends in Cognitive Sciences, 18(5), 235-241. doi:10.1016/j.tics.2014.02.010
Irzik, G., & Nola, R. (2011). A family resemblance approach to the nature of science for science education. Science & Education, 20(7), 591-607. doi:10.1007/s11191-010-9293-4
Irzik, G., & Nola, R. (2014). New directions for nature of science research. In M. R. Matthews (Ed.), International handbook of research in history, philosophy and science teaching (pp. 999-1021). Dordrecht: Springer.
John, L., Loewenstein, G., & Prelec, D. (2012). Measuring the prevalence of questionable research practices with incentives for truth-telling. Psychological Science, 23(5), 524-532. doi:10.1177/0956797611430953
Kahneman, D. (2014). A new etiquette for replication. Social Psychology, 45(4), 310-311.
Kerr, N. L. (1998). HARKing: Hypothesizing after the results are known. Personality and Social Psychology Review, 2(3), 196-217. doi:10.1207/s15327957pspr0203_4
Klein, R. A., Ratliff, K. A., Vianello, M., Adams, R. B., Bahník, S., Bernstein, M. J., Bocian, K., … Nosek, B. (2014a). Investigating variation in replicability: A “many labs” replication project. Social Psychology, 45(3), 142-152. doi:10.1027/1864-9335/a000178
Klein, R. A., Ratliff, K. A., Vianello, M., Adams, R. B., Bahník, S., Bernstein, M. J., Bocian, K., … Nosek, B. (2014b). Theory building through replication: Response to commentaries on the “many labs” replication project. Social Psychology, 45(4), 299-311. doi:10.1027/1864-9335/a000202
LeBel, E. P., & Peters, K. R. (2011). Fearing the future of empirical psychology: Bem’s (2011) evidence of psi as a case study of deficiencies in modal research practice. Review of General Psychology, 15(4), 371-379. doi:10.1037/a0025172
Lilienfeld, S. O. (2011). Distinguishing scientific from pseudoscientific psychotherapies: Evaluating the role of theoretical plausibility, with a little help from Reverend Bayes. Clinical Psychology: Science and Practice, 18(2), 105-112. doi:10.1111/j.1468-2850.2011.01241.x
Lilienfeld, S. O., Ritschel, L. A., Lynn, S. J., Cautin, R. L., & Latzman, R. D. (2013). Why many clinical psychologists are resistant to evidence-based practice: Root causes and constructive remedies. Clinical Psychology Review, 33(7), 883-900. doi:10.1016/j.cpr.2012.09.008
Mahner, M. (2013). Science and pseudoscience: How to demarcate after the (alleged) demise of the demarcation problem. In M. Pigliucci & M. Boudry (Eds.), The philosophy of pseudoscience (pp. 29-43). Chicago, IL: University of Chicago Press.
McNutt, M. (2014). Reproducibility. Science, 343(6168), 229. doi:10.1126/science.1250475
Michell, J. (2013). Constructs, inferences, and mental measurement. New Ideas in Psychology, 31(1), 13-21. doi:10.1016/j.newideapsych.2011.02.004
Miguel, E., Camerer, C., Casey, K., Cohen, J., Esterling, K. M., Gerber, A., … van der Laan, M. (2014). Promoting transparency in social science research. Science, 343(6166), 30-31. doi:10.1126/science.1245317
Nosek, B. A., Alter, G., Banks, G. C., Borsboom, D., Bowman, S. D., Breckler, S. J., … Contestabile, M. (2015). Promoting an open research culture. Science, 348(6242), 1422-1425. doi:10.1126/science.aab2374
Open Science Collaboration. (2015). Estimating the reproducibility of psychological science. Science, 349(6251), aac4716. doi:10.1126/science.aac4716
Pigliucci, M. (2013). The demarcation problem: A (belated) response to Laudan. In M. Pigliucci & M. Boudry (Eds.), The philosophy of pseudoscience (pp. 9-28). Chicago, IL: University of Chicago Press.
Popper, K. (1957). Philosophy of science: A personal report. In C. A. Mace (Ed.), British philosophy in mid-century (pp. 155-160). London: Allen and Unwin.
Rhodes, R. E., & de Bruijn, G. J. (2013). How big is the physical activity intention-behaviour gap? A meta-analysis using the action control framework. British Journal of Health Psychology, 18(2), 296-309. doi:10.1111/bjhp.12032
Ritchie, S. J., Wiseman, R., & French, C. C. (2012). Failing the future: Three unsuccessful attempts to replicate Bem’s “retroactive facilitation of recall” effect. PLoS ONE, 7(3), e33423. doi:10.1371/journal.pone.0033423
Sarewitz, D. (2012). Beware the creeping cracks of bias. Nature, 485(7397), 149.
Service, R. F. (2002). Scientific misconduct: Bell Labs fires star physicist found guilty of forging data. Science, 298(5591), 30-31. doi:10.1126/science.298.5591.30
Sheeran, P. (2002). Intention—behavior relations: A conceptual and empirical review. European Review of Social Psychology, 12(1), 1-36. doi:10.1080/14792772143000003
Skinner, B. F. (1987). Whatever happened to psychology as the science of behavior? American Psychologist, 42(8), 780-786. doi:10.1037/0003-066x.42.8.780
Stricker, G. (1997). Are science and practice commensurable? American Psychologist, 52(4), 442-448. doi:10.1037/0003-066x.52.4.442
Wagenmakers, E., Wetzels, R., Borsboom, D., & van der Maas, H. L. J. (2011). Why psychologists must change the way they analyze their data: The case of psi: Comment on Bem (2011). Journal of Personality and Social Psychology, 100(3), 426-432. doi:10.1037/a0022790
Wagenmakers, E., Wetzels, R., Borsboom, D., van der Maas, H. L. J., & Kievit, R. A. (2012). An agenda for purely confirmatory research. Perspectives on Psychological Science, 7(6), 632-638. doi:10.1177/1745691612463078
Zimbardo, P. G. (2012). Does psychology make a significant difference in our lives? In Applied psychology (pp. 39-64). Psychology Press.