What Is Replication in Psychology Research?

Kendra Cherry, MS, is a psychosocial rehabilitation specialist, psychology educator, and author of the "Everything Psychology Book."

Emily is a board-certified science editor who has worked with top digital publishing brands like Voices for Biodiversity, Study.com, GoodTherapy, Vox, and Verywell.

  • Examples of Replication in Psychology
  • Why Replication Matters
  • How It Works
  • What If Replication Fails?
  • The Replication Crisis
  • How Replication Can Be Strengthened

Replication refers to the repetition of a research study, generally with different situations and subjects, to determine if the basic findings of the original study can be applied to other participants and circumstances.

In other words, when researchers replicate a study, it means they reproduce the experiment to see if they can obtain the same outcomes.

Once a study has been conducted, researchers might be interested in determining if the results hold true in other settings or for other populations. In other cases, scientists may want to replicate the experiment to further demonstrate the results.

At a Glance

In psychology, replication is defined as reproducing a study to see if you get the same results. It's an important part of the research process that strengthens our understanding of human behavior. It's not always a perfect process, however, and extraneous variables and other factors can interfere with results.

For example, imagine that health psychologists perform an experiment showing that hypnosis can be effective in helping middle-aged smokers kick their nicotine habit. Other researchers might want to replicate the same study with younger smokers to see if they reach the same result.

Exact replication is not always possible. Ethical standards may prevent modern researchers from replicating studies that were conducted in the past, such as Stanley Milgram's infamous obedience experiments.

That doesn't mean that researchers don't perform replications; it just means they have to adapt their methods and procedures. For example, researchers have replicated Milgram's study using lower shock thresholds and improved informed consent and debriefing procedures.

Why Replication Is Important in Psychology

When studies are replicated and achieve the same or similar results as the original study, it gives greater validity to the findings. If a researcher can replicate a study’s results, it is more likely that those results can be generalized to the larger population.

Human behavior can be inconsistent and difficult to study. Even when researchers are cautious about their methods, extraneous variables can still create bias and affect results. 

That's why replication is so essential in psychology. It strengthens findings, helps detect potential problems, and improves our understanding of human behavior.

How Do Scientists Replicate an Experiment?

When conducting a study or experiment, it is essential to have clearly defined operational definitions. In other words, what is the study attempting to measure?

When replicating earlier research, experimenters will follow the same procedures but with a different group of participants. If the researcher obtains the same or similar results in follow-up experiments, it means that the original results are less likely to be a fluke.

The steps involved in replicating a psychology experiment often include the following:

  • Review the original experiment: The goal of replication is to use the exact methods and procedures the researchers used in the original experiment. Reviewing the original study to learn more about the hypothesis, participants, techniques, and methodology is important.
  • Conduct a literature review: Review the existing literature on the subject, including any other replications or previous research. Considering these findings can provide insights into your own research.
  • Perform the experiment: The next step is to conduct the experiment. During this step, keeping your conditions as close as possible to the original experiment is essential. This includes how you select participants, the equipment you use, and the procedures you follow as you collect your data.
  • Analyze the data: As you analyze the data from your experiment, you can better understand how your results compare to the original results.
  • Communicate the results: Finally, you will document your processes and communicate your findings. This is typically done by writing a paper for publication in a professional psychology journal. Be sure to carefully describe your procedures and methods, describe your findings, and discuss how your results compare to the original research.

So what happens if the original results cannot be reproduced? Does that mean that the experimenters conducted bad research or that, even worse, they lied or fabricated their data?

In many cases, non-replicated research is caused by differences in the participants or in other extraneous variables that might influence the results of an experiment. Sometimes the differences might not be immediately clear, but other researchers might be able to discern which variables could have impacted the results.

For example, minor differences in things like the way questions are presented, the weather, or even the time of day the study is conducted might have an unexpected impact on the results of an experiment. Researchers might strive to perfectly reproduce the original study, but variations are expected and often impossible to avoid.

Are the Results of Psychology Experiments Hard to Replicate?

In 2015, a group of 271 researchers published the results of their five-year effort to replicate 100 different experimental studies previously published in three top psychology journals. The replicators worked closely with the original researchers of each study in order to replicate the experiments as closely as possible.

The results were less than stellar. Of the 100 experiments in question, 61% could not be replicated with the original results. Of the original studies, 97% of the findings were deemed statistically significant. Only 36% of the replicated studies were able to obtain statistically significant results.

As one might expect, these dismal findings caused quite a stir. You may have heard this referred to as the "replication crisis" in psychology.

Similar replication attempts have produced similar results. Another study published in 2018 replicated 21 social and behavioral science studies. In these studies, the researchers were only able to successfully reproduce the original results about 62% of the time.

So why are psychology results so difficult to replicate? Writing for The Guardian, John Ioannidis suggested that there are a number of reasons why this might happen, including competition for research funds and the powerful pressure to obtain significant results. There is little incentive to retest, so many results obtained purely by chance are simply accepted without further research or scrutiny.

The American Psychological Association suggests that the problem stems partly from the research culture. Academic journals are more likely to publish novel, innovative studies rather than replication research, creating less of an incentive to conduct that type of research.

Reasons Why Research Cannot Be Replicated

The project authors suggest that there are three potential reasons why the original findings could not be replicated.  

  • The original results were a false positive.
  • The replicated results were a false negative.
  • Both studies were correct but differed due to unknown differences in experimental conditions or methodologies.
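
The first two possibilities can be made concrete with a small simulation. The sketch below is illustrative only and is not taken from the replication project: the function names, the effect size of 0.4, the sample size of 30 per group, and the simplified two-sided z-test with a known standard deviation of 1 are all assumptions chosen for the example. It estimates how often a study reaches p < .05 when there is no real effect (a false positive) and how often it fails to when a modest real effect exists (a false negative).

```python
import math
import random
import statistics


def simulate_study(effect_size, n_per_group, rng):
    """Simulate one two-group study; return True if it reaches p < .05.

    Assumption: scores are normal with a known SD of 1, so a two-sided
    z-test on the difference in group means is used for simplicity.
    """
    control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
    treated = [rng.gauss(effect_size, 1.0) for _ in range(n_per_group)]
    diff = statistics.fmean(treated) - statistics.fmean(control)
    standard_error = math.sqrt(2.0 / n_per_group)
    return abs(diff / standard_error) > 1.96


def significance_rate(effect_size, n_per_group, n_studies=2000, seed=0):
    """Share of simulated studies that come out statistically significant."""
    rng = random.Random(seed)
    hits = sum(simulate_study(effect_size, n_per_group, rng)
               for _ in range(n_studies))
    return hits / n_studies


# No true effect: every significant result is a false positive.
false_positive_rate = significance_rate(effect_size=0.0, n_per_group=30)

# Hypothetical modest true effect: every non-significant result is a false negative.
false_negative_rate = 1.0 - significance_rate(effect_size=0.4, n_per_group=30)

print(f"false positives with no effect: {false_positive_rate:.1%}")
print(f"false negatives with a real effect: {false_negative_rate:.1%}")
```

Under these assumed numbers, a small study of a real but modest effect misses it more often than it finds it, so a single failed replication does not by itself show that the original result was wrong.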

The Nobel Prize-winning psychologist Daniel Kahneman has suggested that because published studies are often too vague in describing methods used, replications should involve the authors of the original studies to more carefully mirror the methods and procedures used in the original research.

In fact, one investigation found that replication rates are much higher when original researchers are involved.

While some might be tempted to look at the results of such replication projects and assume that psychology is more art than science, many suggest that such findings actually help make psychology a stronger science. Human thought and behavior is a remarkably subtle and ever-changing subject to study.

In other words, it's normal and expected for variations to exist when observing diverse populations and participants.

Some research findings might be wrong, but digging deeper, pointing out the flaws, and designing better experiments helps strengthen the field. The APA notes that replication research represents a great opportunity for students. It can help strengthen research skills and contribute to science in a meaningful way.

Nosek BA, Errington TM. What is replication? PLoS Biol. 2020;18(3):e3000691. doi:10.1371/journal.pbio.3000691

Burger JM. Replicating Milgram: Would people still obey today? Am Psychol. 2009;64(1):1-11. doi:10.1037/a0010932

Makel MC, Plucker JA, Hegarty B. Replications in psychology research: How often do they really occur? Perspectives on Psychological Science. 2012;7(6):537-542. doi:10.1177/1745691612460688

Aarts AA, Anderson JE, Anderson CJ, et al. Estimating the reproducibility of psychological science. Science. 2015;349(6251). doi:10.1126/science.aac4716

Camerer CF, Dreber A, Holzmeister F, et al. Evaluating the replicability of social science experiments in Nature and Science between 2010 and 2015. Nat Hum Behav. 2018;2(9):637-644. doi:10.1038/s41562-018-0399-z

American Psychological Association. Leaning into the replication crisis: Why you should consider conducting replication research.

Kahneman D. A new etiquette for replication. Social Psychology. 2014;45(4):310-311.

The role of replication in psychological science

  • Paper in Philosophy of Science in Practice
  • Published: 08 January 2021
  • Volume 11, article number 23 (2021)

  • Samuel C. Fletcher (ORCID: orcid.org/0000-0002-9061-8976)

The replication or reproducibility crisis in psychological science has renewed attention to philosophical aspects of its methodology. I provide herein a new, functional account of the role of replication in a scientific discipline: to undercut the underdetermination of scientific hypotheses from data, typically by hypotheses that connect data with phenomena. These include hypotheses that concern sampling error, experimental control, and operationalization. How a scientific hypothesis could be underdetermined in one of these ways depends on a scientific discipline’s epistemic goals, theoretical development, material constraints, institutional context, and their interconnections. I illustrate how these apply to the case of psychological science. I then contrast this “bottom-up” account with “top-down” accounts, which assume that the role of replication in a particular science, such as psychology, must follow from a uniform role that it plays in science generally. Aside from avoiding unaddressed problems with top-down accounts, my bottom-up account also better explains the variability of importance of replication of various types across different scientific disciplines.

These related events include Daryl Bem’s use of techniques standard in psychology to show evidence for extra-sensory perception (2011), the revelations of high-profile scientific fraud by Diederik Stapel (Callaway 2011) and Marc Hauser (Carpenter 2012), and related replication failures involving prominent effects such as ego depletion (Hagger et al. 2016).

The quotation reads: “the scientifically significant physical effect may be defined as that which can be regularly reproduced by anyone who carries out the appropriate experiment in the way prescribed.” See also Popper (1959, p. 45): “Only when certain events recur in accordance with rules or regularities, as in the case of repeatable experiments, can our observations be tested – in principle – by anyone. … Only by such repetition can we convince ourselves that we are not dealing with a mere isolated ‘coincidence,’ but with events which, on account of their regularity and reproducibility, are in principle inter-subjectively testable.” Zwaan et al. (2018, pp. 1, 2, 4) also quote Dunlap (1926) (published earlier as Dunlap (1925)) for the same point.

Schmidt (2009, pp. 90–2), citing much the same passages of Popper (1959, p. 45) as the others mentioned, also provides a similar explanation of replication’s importance, appealing to general virtues such as objectivity and reliability. (See the first paragraphs of Schmidt (2009, p. 90; 2017, p. 236) for especially clear statements, and Machery (2020) for an account of replication based on its ability to buttress reliability in particular.) But for him, that explanation only motivates why establishing a definition of replication is important in the first place; it plays no role in his definition itself. Thus, by drawing on Schmidt’s account of what replication is, I am not committing to his and others’ stated explanations of why it is important.

For example, it is compatible with modifications or clarifications of how interpretation plays an essential role in determining what data models are or what they represent, either for Suppes’ hierarchy (Leonelli 2019) or Bogen and Woodward’s (Harris 2003). It is also compatible with interactions between the levels of data and phenomena (or experiment) in the course of a scientific investigation (Bailer-Jones 2009, Ch. 7).

That’s not to say there is no interesting relationship between low-level underdetermination and the question of scientific realism, only that it is much more indirect. See Laymon (1982) for a discussion thereof and Brewer and Chinn (1994) for historical examples from psychology as they bear on the motivation for theory change.

The first function, concerning mistakes in data analysis, does not appear in Schmidt (2009, 2017). That said, neither he nor I claim that our lists are exhaustive, but they do seem to enumerate the most common types of low-level underdetermination that arise in the interpretation of the results of psychological studies. One type that occurs more often in the physical sciences concerns the accuracy, precision, and systematic error of an experiment or measurement technique; I hope in future work to address this other function in more detail. It would also be interesting to compare the present perspective to that of Feest (2019), who, focusing on the “epistemic uncertainty” regarding the third and sixth functions, arrives at a more pessimistic and limiting conclusion about the role of replication in psychological science.

For examples from economics, see Cartwright (1991, pp. 145–6); for examples from gravitational and particle physics, see Franklin and Howson (1984, pp. 56–8).

This is also analogous to the case of the demarcation problem, on which progress might be possible if one helps oneself to discipline-specific information (Hansson 2013).

Of course, there is a variety of quantitative and qualitative methods in psychological research, and qualitative methods are not always a good target for statistical analysis. But the question of whether the data are representative of the population of interest is important regardless of whether that data is quantitative or qualitative.

Meehl (1967) wanted to distinguish this lack of precise predictions from the situation in physics, but perhaps overstated his case: there are many experimental situations in physics in which theory predicts the existence of an effect determined by an unknown parameter, too. Meehl (1967) was absolutely right, though, that one cannot rest simply with evidence against a non-zero effect size; doing so abdicates responsibility to find just what the aforementioned patterns of human behavior and mental life are.

Online participant services such as Amazon Mechanical Turk and other crowdsourced methods offer a potentially more diverse participant pool at a more modest cost (Uhlmann et al. 2019), but come with their own challenges.

“Big science” is a historiographical cluster concept referring to science with one or more of the following characteristics: large budgets, large staff sizes, large or particularly expensive equipment, and complex and expansive laboratories (Galison and Hevly 1992).

For secondary sources on MSRP, see Musgrave and Pigden (2016, §§2.2, 3.4).

For more on this, see Musgrave and Pigden (2016, §4).

In what follows, I use my own examples rather than Guttinger’s, with the exception of some overlap in discussion of Leonelli (2018).

Leonelli (2018) has argued that this possibility is realized in certain sciences that focus on qualitative data collection, but it is yet unclear whether this is really due to pragmatic limitations on the possibility of replications, rather than a lack of underdetermination, low-level or otherwise.

Bailer-Jones, D.M. (2009). Scientific models in philosophy of science. Pittsburgh: University of Pittsburgh Press.

Baker, M. (2016). 1,500 scientists lift the lid on reproducibility. Nature, 533(7604), 452–454.

Begley, C.G., & Ellis, L.M. (2012). Raise standards for preclinical cancer research: drug development. Nature, 483(7391), 531–533.

Bem, D.J. (2011). Feeling the future: experimental evidence for anomalous retroactive influences on cognition and affect. Journal of Personality and Social Psychology, 100(3), 407.

Benjamin, D.J., Berger, J.O., Johannesson, M., Nosek, B.A., Wagenmakers, E.-J., Berk, R., Bollen, K.A., Brembs, B., Brown, L., Camerer, C., et al. (2018). Redefine statistical significance. Nature Human Behaviour, 2(1), 6.

Bird, A. (2018). Understanding the replication crisis as a base rate fallacy. The British Journal for the Philosophy of Science, forthcoming.

Bogen, J., & Woodward, J. (1988). Saving the phenomena. The Philosophical Review, 97(3), 303–352.

Brewer, W.F., & Chinn, C.A. (1994). Scientists’ responses to anomalous data: Evidence from psychology, history, and philosophy of science. In PSA: Proceedings of the biennial meeting of the philosophy of science association (Vol. 1, pp. 304–313). Philosophy of Science Association.

Button, K.S., Ioannidis, J.P., Mokrysz, C., Nosek, B.A., Flint, J., Robinson, E.S., & Munafò, M.R. (2013). Power failure: why small sample size undermines the reliability of neuroscience. Nature Reviews Neuroscience, 14(5), 365–376.

Callaway, E. (2011). Report finds massive fraud at Dutch universities. Nature, 479(7371), 15.

Camerer, C.F., Dreber, A., Forsell, E., Ho, T.-H., Huber, J., Johannesson, M., Kirchler, M., Almenberg, J., Altmejd, A., Chan, T., et al. (2016). Evaluating replicability of laboratory experiments in economics. Science, 351(6280), 1433–1436.

Carpenter, S. (2012). Government sanctions Harvard psychologist. Science, 337(6100), 1283–1283.

Cartwright, N. (1991). Replicability, reproducibility, and robustness: comments on Harry Collins. History of Political Economy, 23(1), 143–155.

Chen, X. (1994). The rule of reproducibility and its applications in experiment appraisal. Synthese, 99, 87–109.

Dunlap, K. (1925). The experimental methods of psychology. The Pedagogical Seminary and Journal of Genetic Psychology, 32(3), 502–522.

Dunlap, K. (1926). The experimental methods of psychology. In Murchison, C. (Ed.) Psychologies of 1925: Powell lectures in psychological theory (pp. 331–351). Worcester: Clark University Press.

Feest, U. (2019). Why replication is overrated. Philosophy of Science, 86(5), 895–905.

Feyerabend, P. (1970). Consolation for the specialist. In Lakatos, I., & Musgrave, A. (Eds.) Criticism and the growth of knowledge (pp. 197–230). Cambridge: Cambridge University Press.

Feyerabend, P. (1975). Against method. London: New Left Books.

Fidler, F., & Wilcox, J. (2018). Reproducibility of scientific results. In Zalta, E.N. (Ed.) The Stanford encyclopedia of philosophy. Metaphysics Research Lab, Stanford University, winter 2018 edition.

Franklin, A., & Howson, C. (1984). Why do scientists prefer to vary their experiments? Studies in History and Philosophy of Science Part A, 15(1), 51–62.

Galison, P., & Hevly, B.W. (Eds.). (1992). Big science: the growth of large-scale research. Stanford: Stanford University Press.

Gelman, A. (2018). Don’t characterize replications as successes or failures. Behavioral and Brain Sciences, 41, e128.

Gillies, D.A. (1971). A falsifying rule for probability statements. The British Journal for the Philosophy of Science, 22(3), 231–261.

Gómez, O.S., Juristo, N., & Vegas, S. (2010). Replications types in experimental disciplines. In Proceedings of the 2010 ACM-IEEE international symposium on empirical software engineering and measurement, ESEM ’10. New York: Association for Computing Machinery.

Greenwald, A.G., Pratkanis, A.R., Leippe, M.R., & Baumgardner, M.H. (1986). Under what conditions does theory obstruct research progress? Psychological Review, 93(2), 216–229.

Guttinger, S. (2020). The limits of replicability. European Journal for Philosophy of Science, 10(10), 1–17.

Hagger, M.S., Chatzisarantis, N.L., Alberts, H., Anggono, C.O., Batailler, C., Birt, A.R., Brand, R., Brandt, M.J., Brewer, G., Bruyneel, S., et al. (2016). A multilab preregistered replication of the ego-depletion effect. Perspectives on Psychological Science, 11(4), 546–573.

Hansson, S.O. (2013). Defining pseudoscience and science. In Pigliucci, M., & Boudry, M. (Eds.) Philosophy of pseudoscience: reconsidering the demarcation problem (pp. 61–77). Chicago: University of Chicago Press.

Harris, T. (2003). Data models and the acquisition and manipulation of data. Philosophy of Science, 70(5), 1508–1517.

Lakatos, I. (1970). Falsification and the methodology of scientific research programmes. In Lakatos, I., & Musgrave, A. (Eds.) Criticism and the growth of knowledge (pp. 91–196). Cambridge: Cambridge University Press.

Lakens, D., Adolfi, F.G., Albers, C.J., Anvari, F., Apps, M.A., Argamon, S.E., Baguley, T., Becker, R.B., Benning, S.D., Bradford, D.E., et al. (2018). Justify your alpha. Nature Human Behaviour, 2(3), 168.

Laudan, L. (1983). The demise of the demarcation problem. In Cohan, R., & Laudan, L. (Eds.) Physics, philosophy, and psychoanalysis (pp. 111–127). Dordrecht: Reidel.

Lawrence, M.S., Stojanov, P., Polak, P., Kryukov, G.V., Cibulskis, K., Sivachenko, A., Carter, S.L., Stewart, C., Mermel, C.H., Roberts, S.A., et al. (2013). Mutational heterogeneity in cancer and the search for new cancer-associated genes. Nature, 499(7457), 214–218.

Laymon, R. (1982). Scientific realism and the hierarchical counterfactual path from data to theory. In PSA: Proceedings of the biennial meeting of the philosophy of science association (Vol. 1, pp. 107–121). Philosophy of Science Association.

LeBel, E.P., Berger, D., Campbell, L., & Loving, T.J. (2017). Falsifiability is not optional. Journal of Personality and Social Psychology, 113(2), 254–261.

Leonelli, S. (2018). Rethinking reproducibility as a criterion for research quality. In Boumans, M., & Chao, H.-K. (Eds.) Including a symposium on Mary Morgan: curiosity, imagination, and surprise, volume 36B of Research in the History of Economic Thought and Methodology (pp. 129–146). Emerald Publishing Ltd.

Leonelli, S. (2019). What distinguishes data from models? European Journal for Philosophy of Science, 9(2), 22.

Machery, E. (2020). What is a replication? Philosophy of Science, forthcoming.

Meehl, P.E. (1967). Theory-testing in psychology and physics: a methodological paradox. Philosophy of Science, 34(2), 103–115.

Meehl, P.E. (1990). Appraising and amending theories: the strategy of Lakatosian defense and two principles that warrant it. Psychological Inquiry, 1(2), 108–141.

Musgrave, A., & Pigden, C. (2016). Imre Lakatos. In Zalta, E.N. (Ed.) The Stanford encyclopedia of philosophy. Metaphysics Research Lab, Stanford University, winter 2016 edition.

Muthukrishna, M., & Henrich, J. (2019). A problem in theory. Nature Human Behaviour, 3(3), 221–229.

Norton, J.D. (2015). Replicability of experiment. THEORIA. Revista de Teoría Historia y Fundamentos de la Ciencia, 30(2), 229–248.

Nosek, B.A., & Errington, T.M. (2017). Reproducibility in cancer biology: making sense of replications. Elife, 6, e23383.

Nosek, B.A., & Errington, T.M. (2020). What is replication? PLoS Biology, 18(3), e3000691.

Nuijten, M.B., Bakker, M., Maassen, E., & Wicherts, J.M. (2018). Verify original results through reanalysis before replicating. Behavioral and Brain Sciences, 41, e143.

Open Science Collaboration (OSC). (2015). Estimating the reproducibility of psychological science. Science, 349(6251), aac4716.

Popper, K.R. (1959). The logic of scientific discovery. Oxford: Routledge.

Radder, H. (1992). Experimental reproducibility and the experimenters’ regress. PSA: Proceedings of the biennial meeting of the philosophy of science association (Vol. 1, pp. 63–73). Philosophy of Science Association.

Rosenthal, R. (1990). Replication in behavioral research. In Neuliep, J.W. (Ed.) Handbook of replication research in the behavioral and social sciences, volume 5 of Journal of Social Behavior and Personality (pp. 1–30). Corte Madera: Select Press.

Schmidt, S. (2009). Shall we really do it again? The powerful concept of replication is neglected in the social sciences. Review of General Psychology, 13(2), 90–100.

Schmidt, S. (2017). Replication. In Makel, M.C., & Plucker, J.A. (Eds.) Toward a more perfect psychology: improving trust, accuracy, and transparency in research (pp. 233–253). American Psychological Association.

Simons, D.J. (2014). The value of direct replication. Perspectives on Psychological Science, 9(1), 76–80.

Simons, D.J., Shoda, Y., & Lindsay, D.S. (2017). Constraints on generality (COG): a proposed addition to all empirical papers. Perspectives on Psychological Science, 12(6), 1123–1128.

Stanford, K. (2017). Underdetermination of scientific theory. In Zalta, E.N. (Ed.) The Stanford encyclopedia of philosophy. Metaphysics Research Lab, Stanford University, winter 2017 edition.

Suppes, P. (1962). Models of data. In Nagel, E., Suppes, P., & Tarski, A. (Eds.) Logic, methodology and philosophy of science: proceedings of the 1960 international congress (pp. 252–261). Stanford: Stanford University Press.

Suppes, P. (2007). Statistical concepts in philosophy of science. Synthese, 154(3), 485–496.

Uhlmann, E.L., Ebersole, C.R., Chartier, C.R., Errington, T.M., Kidwell, M.C., Lai, C.K., McCarthy, R.J., Riegelman, A., Silberzahn, R., & Nosek, B.A. (2019). Scientific Utopia III: crowdsourcing science. Perspectives on Psychological Science, 14(5), 711–733.

Zwaan, R.A., Etz, A., Lucas, R.E., & Donnellan, M.B. (2018). Making replication mainstream. Behavioral and Brain Sciences, 41, e120.

Acknowledgments

Thanks to audiences in London (UK XPhi 2018), Burlington (Social Science Roundtable 2019), and Geneva (EPSA2019) for their comments on an earlier version, and especially to the Pitt Center for Philosophy of Science Reading Group in Spring 2020: Jean Baccelli, Andrew Buskell, Christian Feldbacher-Escamilla, Marie Gueguen, Paola Hernandez-Chavez, Edouard Machery, Adina Roskies, and Sander Verhaegh.

This research was partially supported by a Single Semester Leave from the University of Minnesota, and a Visiting Fellowship at the Center for Philosophy of Science at the University of Pittsburgh.

Author information

Authors and affiliations.

Department of Philosophy, University of Minnesota, Twin Cities, Minneapolis, MN, USA

Samuel C. Fletcher

Corresponding author

Correspondence to Samuel C. Fletcher.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article belongs to the Topical Collection: EPSA2019: Selected papers from the biennial conference in Geneva

Guest Editors: Anouk Barberousse, Richard Dawid, Marcel Weber

About this article

Fletcher, S.C. The role of replication in psychological science. Euro Jnl Phil Sci 11, 23 (2021). https://doi.org/10.1007/s13194-020-00329-2

Received: 16 June 2020

Accepted: 30 October 2020

Published: 08 January 2021

DOI: https://doi.org/10.1007/s13194-020-00329-2

  • Replication
  • Underdetermination
  • Confirmation
  • Reproducibility

BMC Psychol

Psychology, replication & beyond

Keith R. Laws

School of Life and Medical Sciences, University of Hertfordshire, Hatfield, UK

Associated Data

Not applicable.

Modern psychology is apparently in crisis and the prevailing view is that this partly reflects an inability to replicate past findings. If a crisis does exist, then it is some kind of ‘chronic’ crisis, as psychologists have been censuring themselves over replicability for decades. While the debate in psychology is not new, the lack of progress across the decades is disappointing. Recently though, we have seen a veritable surfeit of debate alongside multiple orchestrated and well-publicised replication initiatives. The spotlight is being shone on certain areas and although not everyone agrees on how we should interpret the outcomes, the debate is happening and impassioned. The issue of reproducibility occupies a central place in our whig history of psychology.

In the parlance of Karl Popper, the notion of falsification is seductive – some seem to imagine that it identifies an act as opposed to a process . It often carries the misleading implication that hypotheses can be readily discarded in the face of something called a ‘failed’ replication. Popper [ 46 ] was quite transparent when he declared “… a few stray basic statements contradicting a theory will hardly induce us to reject it as falsified. We shall take it as falsified only if we discover a reproducible effect which refutes the theory . In other words, we only accept the falsification if a low level empirical hypothesis which describes such an effect is proposed and corroborated.” (p.203: my italics). Popper’s view might reassure those whose psychological models have recently come under scrutiny through replication initiatives. We cannot, nor should we, close the door on a hypothesis because a study fails to be replicated. The hypothesis is not nullified and ‘nay-saying’ alone is an insufficient response from scientists. Like Popper, we might expect a testable alternative hypothesis that attempts to account for the discrepancy across studies; and one that itself may be subject to testing rather than merely being ad hoc . In other words, a ‘failed’ replication is not, in itself, the answer to a question, but a further question.

Replication, replication, replication

At least two key types of replication exist: direct and conceptual. Conceptual replication generally refers to cases where researchers ‘tweak’ the methods of previous studies [ 43 ] and when successful, may be informative with regard to the boundaries and possible moderators of an effect. When a conceptual replication fails, however, fewer clear implications exist for the original study because of likely differences in procedure or stimuli and so on. For this reason, we have seen an increased weight given to direct replications.

How often do direct and conceptual replications occur in psychology? Screening 100 of the most-cited psychology journals since 1900, Makel, Plucker & Hegarty [ 40 ] found that approximately 1.6 % of all psychology articles used the term replication in the text. A further, more detailed analysis of 500 randomly selected articles revealed that only 68 % of those using the term replication were actual replications. They calculated an overall replication rate of 1.07 %, and found that only 18 % of those were direct rather than conceptual replications.

The lack of replication in psychology is systemic and widespread, as is the bias against publishing direct replications in particular. In their survey of social science journal editors, Neuliep & Crandall [ 42 ] found almost three quarters preferred to publish novel findings rather than replications. In a parallel survey of reviewers for social science journals, Neuliep & Crandall [ 43 ] found over half (54 %) stated a preference for new findings over replications. Indeed, reviewers stated that replications were “Not newsworthy” or even a “Waste of space”. By contrast, comments from natural science journal editors present a more varied picture, with comments ranging from “Replication without some novelty is not accepted” to “Replication is rarely an issue for us…since we publish them.” [ 39 ].

Despite an enduring historical abandonment of replication, the tide appears to be turning. Makel et al. [ 40 ] found that the replication rate after the year 2000 was 1.84 times higher than for the period between 1950 and 1999. In a more recent evolution, several large-scale direct replication projects have emerged during the past 2 years including: the Many Labs project [ 33 ]; a set of preregistered replications published in a special issue of Social Psychology (Edited by [ 44 ]); the Reproducibility Project of the Open Science Collaboration [ 45 ]; and the Pipeline Project by Schweinsberg et al. [ 50 ]. In two of these projects (Many Labs by [ 33 ]; Pipeline Project by [ 50 ]), several groups of researchers replicated a sample of studies, with each group replicating all of the studies. In the two remaining projects, a number of research groups each replicated one study, selected from a sample of studies (Registered Reports by [ 44 ]; Open Science Collaboration, [ 45 ]). Each project ensured that replications were sufficiently powered (typically in excess of 90 %, thus offering a very good probability of detecting true effects) and, where possible, used the original materials and stimuli as provided by the original authors. It is worth considering each in more detail.
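To make the phrase "sufficiently powered" concrete, the sketch below estimates how many participants per group a two-group comparison would need to reach a given power. It is an illustration only, not the procedure used by any of the projects: it assumes a two-sided test, equal group sizes and a normal approximation, and the function names and the effect size of 0.5 are hypothetical choices.

```python
from math import ceil
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def n_per_group(d, power=0.90, alpha=0.05):
    """Approximate participants per group needed to detect a standardized
    mean difference d with a two-sided, two-sample test (normal approximation)."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = _STD_NORMAL.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) / d) ** 2)


def achieved_power(d, n, alpha=0.05):
    """Approximate power with n participants per group (same approximation,
    ignoring the negligible chance of rejecting in the wrong direction)."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    return _STD_NORMAL.cdf(d * (n / 2) ** 0.5 - z_alpha)


if __name__ == "__main__":
    # Hypothetical effect size, chosen only for illustration.
    for target in (0.80, 0.90):
        print(target, n_per_group(0.5, power=target))
```

Under these assumptions, moving from 80 % to 90 % power for the same effect size requires roughly a third more participants per group, which is one reason well-powered replications are costly.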

Many Labs involved 36 research groups across 12 countries who replicated 13 psychological studies in over 6,000 participants. Studies of classic and newer effects were selected partly because they had simple designs that could be adapted for online administration. Reassuringly perhaps, 10 of the 13 effects replicated consistently across 36 different samples with, of course, some variability in the effect size reported compared to the original studies – some smaller but also some larger. One effect received weak support. Only two studies consistently failed to replicate and both involved what are described as ‘social priming’ phenomena. In the first, ‘accidental’ exposure to a US flag had resulted in increased conservatism amongst Americans [ 11 ]. Participants viewed four photos and were asked simply to estimate the time of day in each photo – the US flag appeared in two photos. Following this, they completed an 8-item questionnaire assessing their views toward various political issues (e.g., abortion, gun control). In the second priming study, exposure to ‘money’ had resulted in endorsement of the current social system [ 12 ]. In this study, participants completed demographic questions against a background that showed a faint picture of US $100 bills or the same background but blurred. Each of these two priming experiments had a single significant p-value (out of 36 replications) and for flag priming, it was in the opposite direction to that expected.

The special issue of Social Psychology edited by Nosek & Lakens [ 44 ] contained a series of articles replicating important results in social psychology. Important was broadly defined as “…often cited, a topic of intense scholarly or public interest, a challenge to established theories), but should also have uncertain truth value (e.g., few confirmations, imprecise estimates of effect sizes).” One might euphemistically describe the studies as curios. The articles were first submitted as Registered Reports and reviewed prior to data collection, with authors being assured their findings would be published regardless of outcome, as long as they adhered to the registered protocol. Attempted replications included the “Romeo and Juliet effect” – does parental interference lead to increases in love and commitment (Original: [ 17 ]; Replication: Sinclair, Hood, & Wright, [ 53 ]); does experiencing physical warmth (warm therapeutic packs) increase judgments of interpersonal warmth (Original: [ 58 ]; Replication: Lynott, Corker, Wortman, Connell, Donnellan, Lucas, & O’Brien, [ 38 ]); does recalling unethical behavior lead participants to see the room as darker (Original: [ 3 ]; Replication: [ 10 ]); and does physical cleanliness reduce the severity of moral judgments (Original: [ 49 ]; Replication: [ 28 ]). In contrast to the high replication rate of Many Labs, the Registered Reports replications failed to confirm the results in 10 of 13 studies.

In the largest crowdsourced effort to date, the OSC Reproducibility project involved 270 collaborators attempting to replicate 100 findings from 3 major psychology journals: Psychological Science (PSCI), Journal of Personality and Social Psychology (JPSP), and Journal of Experimental Psychology: Learning, Memory, and Cognition (JEP: LMC). While 97 of 100 studies originally reported statistically significant results, only 36 % of the replications did so, with a mean effect size of around half of that reported in the original studies.

All of the journals exhibited a large reduction of around 50 % in effect sizes, with replications from JPSP particularly affected – shrinking by 75 % from 0.29 to 0.07. The replicability in one domain of psychology (good or poor) in no way guarantees what will happen in another domain. One thing we know from this project is that “…reproducibility was stronger in studies and journals representing cognitive psychology than social psychology topics. For example, combining across journals, 14 of 55 (25 %) of social psychology effects replicated by the P < 0.05 criterion, whereas 21 of 42 (50 %) of cognitive psychology effects did so.” The reasons for such a difference are debatable, but provide no licence to either congratulate cognitive psychologists or berate social psychologists. Indeed, the authors paint a considered and faithful picture of what their findings mean when they conclude “…how many of the effects have we established are true? Zero. And how many of the effects have we established are false? Zero. Is this a limitation of the project design? No. It is the reality of doing science”. (Open Science Collaboration p.4716-7)
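The figures quoted above can be checked with a few lines of arithmetic. The sketch below recomputes the proportional shrinkage of the JPSP effect size and the two replication rates from the counts given in the text; the function name is a hypothetical helper, and only numbers stated above are used.

```python
def shrinkage(original, replication):
    """Proportional reduction in effect size from original to replication."""
    return (original - replication) / original


# Effect sizes and counts as quoted in the text above.
jpsp_shrinkage = shrinkage(0.29, 0.07)
social_rate = 14 / 55
cognitive_rate = 21 / 42

if __name__ == "__main__":
    print(f"JPSP shrinkage: {jpsp_shrinkage:.0%}")
    print(f"Social psychology replication rate: {social_rate:.0%}")
    print(f"Cognitive psychology replication rate: {cognitive_rate:.0%}")
```

The recomputed shrinkage comes out marginally above the rounded 75 % quoted, which is consistent with the effect sizes themselves being reported to two decimal places.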

The studies that were not selected for replication are informative – they were described as “…deemed infeasible to replicate because of time, resources, instrumentation, dependence on historical events, or hard-to-access samples… [and some] required specialized samples (such as macaques or people with autism), resources (such as eye tracking machines or functional magnetic resonance imaging), or knowledge making them difficult to match with teams”. Thus, the main drivers of replication are often economic in terms of time, money and human investment. High cost studies are likely to remain castles in the air, leaving us with little insight about replicability rates in some areas such as functional imaging (e.g. [ 9 ]), clinical and health psychology (see Coyne, this issue), and neuropsychology.

The ‘ Pipeline project’ by Schweinsberg et al. [ 50 ] intentionally used a non-adversarial approach. They crowdsourced 25 research teams across various countries to replicate a series of 10 unpublished moral-judgment experiments from the lead author’s (Uhlmann) lab i.e., in the pipeline. This speaks directly to Lykken’s [ 37 ] proposal from nearly 50 years ago that “…ideally all experiments would be replicated before publication” although at that time, he deemed it ‘impractical’.

Pipeline replications included: the Bigot–misanthrope effect – whether participants judge a manager who selectively mistreats racial minorities as a more blameworthy person than a manager who mistreats all of his employees; the Bad tipper effect – are people who leave a full tip, but entirely in pennies, judged more negatively than someone who leaves less money, but in notes; and the Burn-in-hell effect – do people perceive corporate executives as more likely to burn in hell than members of social categories defined by antisocial behaviour, such as vandals. Six of ten findings replicated across all of their replication criteria, one further finding replicated but with a significantly smaller effect size than the original, one finding replicated consistently in the original culture but not outside of it (bad tipper replicated in the US and not outside), and two effects were unsupported.

The headline replication rates differed considerably across projects – occurring more frequently for Many Labs (77 %) and the Pipeline Project (60 %) than Registered Reports (30 %) and the Open Science Collaboration (36 %). Why are replication rates lower in the latter two projects? Possible explanations include the choice of likely versus unlikely replication candidates. Amongst the Many Labs studies, some had already previously been replicated and were selected knowing this fact. By contrast, the studies in the Pipeline project had not been previously replicated (indeed, not even previously published). Also important from a different perspective is whether each study was replicated only once by one group or multiple times by many groups.

In the Many Labs and Pipeline projects, 36 and 25 separate research groups were replicating each of 13 and 10 studies respectively. Multiple analyses lend themselves to meta-analytic techniques and analysis of the heterogeneity across research groups examining the same effect – the extent to which they accord in their effect sizes or not. The Many Labs project reported I² values, which estimate the proportion of variation due to heterogeneity rather than chance. In the majority of cases, heterogeneity was small to moderate or even non-existent (e.g. across the 36 replications for both of the social priming studies: flag and money). Indeed, heterogeneity of effect sizes was greater between studies than within studies. When heterogeneity was greater, it was – perhaps surprisingly – where mean effect sizes were largest. Nonetheless, Many Labs reassuringly shows that some effects are highly replicable across research groups, countries, and presentational differences (online versus face to face).
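As an illustration of the heterogeneity statistic mentioned above, the sketch below computes Cochran's Q and the conventional I² = max(0, (Q − df) / Q) for a set of effect sizes with known sampling variances. This is a minimal textbook-style sketch, not the Many Labs analysis code, and the effect sizes and variances are invented for illustration.

```python
def cochran_q(effects, variances):
    """Return (Q, pooled) using inverse-variance (fixed-effect) weights."""
    weights = [1 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    return q, pooled


def i_squared(effects, variances):
    """Proportion of total variation attributed to heterogeneity rather than chance."""
    q, _ = cochran_q(effects, variances)
    df = len(effects) - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q)


if __name__ == "__main__":
    # Hypothetical labs: identical estimates versus widely scattered estimates.
    print(i_squared([0.2, 0.2, 0.2], [0.01, 0.01, 0.01]))
    print(i_squared([0.1, 0.5, 0.9], [0.01, 0.01, 0.01]))
```

When every lab reports the same estimate the statistic is zero; when estimates scatter far more than their sampling variances would predict, it approaches one.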

Counter-intuitive and even fanciful psychological hypotheses are not necessarily more likely to be false, but believing them to be so may influence researchers– even implicitly – in terms of how replications are conducted. In their extensive literature search, Makel et al. [ 40 ] reported that most direct replications are conducted by authors who proposed the original findings. This raises the thorny question of who should replicate? Almost 50 years ago Bakan [ 2 ] sagely warned that “If an investigator attempts to replicate his own investigation at another time, he will inevitably be under the influence of what he has already done…He should challenge, for example, his personal identification with the results he has already obtained, and prepare himself for finding both novelty and contradiction with respect to his earlier investigation” and that “…If one investigator is interested in replicating the investigation of another investigator, he should carefully take into account the possibility of suggestion, or his willingness to accept the results of the earlier investigator. …He should take careful cognizance of possible motivation for showing the earlier investigator to be in error, etc. [p. 110].” The irony is that as psychologists, we should be acutely aware of such biases - we cannot ignore the psychology of replication in the replication of psychology.

What are we replicating and why?

The cheap and easy

Few areas of psychology have fallen under the replication lens and where they have, they are psychology’s equivalent to take-away meals – easy to prepare studies (e.g. often using online links to questionnaires). Hence, the focus has tended to be on studies from social and cognitive psychology, and not for example developmental or clinical studies, which are more prohibitive. Other notable examples exist such as cognitive neuropsychology, where the single case study has been predominant for decades – how can anyone recreate the brain injury and subsequent cognitive testing in a second patient?

The contentious

We cannot assert that the totality – or even a representative sample – of psychology has been scrutinised for replication. We can also see why some may feel targeted – replication does not (and probably cannot) occur in a random fashion. The vast majority of psychological studies are overlooked. To date, psychologists have targeted the unexpected, the curious, and newsworthy findings; and largely within a narrow range of areas (cognitive and social primarily). As psychologists, the need to sample more widely ought to go without saying; and one corollary of this is that it makes no sense to claim that psychology is in crisis.

Too often perhaps, psychologists have been attracted to replicating contentious topics such as social priming, ego-depletion, psychic ability and so on. Some high impact journals have become repositories for the attention-grabbing, strange, unexpected and unbelievable findings. This goes to the systemic heart of the matter. Hartshorne & Schachner [ 27 ] amongst many others have noted “…replicability is not systematically considered in measuring paper, researcher, and journal quality. As a result, the current incentive structure rewards the publication of non-replicable findings …” (p.3 my italics). This is nothing new in science, as the quest for scientific prestige has historically resulted in a conflict between the goals of science and the personal goals of the scientist (see [ 47 ]).

The preposterous

“If there is no ESP, then we want to be able to carry out null experiments and get no effect, otherwise we cannot put much belief in work on small effects in non-ESP situations. If there is ESP, that is exciting. However, thus far it does not look as if it will replace the telephone” (Mosteller [ 41 ], p 396)

From the opposite perspective, Jim Coyne (this issue) maintains that psychology would benefit from some “…provision for screening out candidates for replication for which a consensus could be reached that the research hypotheses were improbable and not warranting the effort and resources required for a replication to establish this.” The frustration of some psychologists is palpable as they peruse apparently improbable hypotheses. Coyne’s concern echoes that of Edwards [ 18 ] who half a century ago similarly remarked, “If a hypothesis is preposterous to start with, no amount of bias against it can be too great. On the other hand, if it is preposterous to start with, why test it?” (p 402). How preposterous can we get? According to Simmons et al. [ 51 ], it is “…unacceptably easy to publish “statistically significant” evidence consistent with any hypothesis.” (p. 1359). Indeed, they managed to show by manipulating what they describe as researcher degrees of freedom (e.g. ‘data-peeking’, deciding when to stop testing participants, whether to exclude outlying data points), that people appear to forget their age and claim to be 1.5 years younger after listening to the Beatles song “When I’m 64”.
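One of the researcher degrees of freedom named above, data-peeking, can be illustrated with a small simulation. The sketch below is not Simmons et al.'s procedure; it assumes two groups drawn from the same normal distribution (so any ‘significant’ difference is a false positive), a z-test with a known standard deviation of 1, and hypothetical settings of a look after every 10 participants per group up to 50.

```python
import random
from statistics import NormalDist, fmean

CRITICAL_Z = NormalDist().inv_cdf(0.975)  # two-sided test at alpha = 0.05


def significant(a, b):
    """Two-sided z-test for a difference in means, assuming SD = 1 in both groups."""
    n = len(a)
    z = (fmean(a) - fmean(b)) / (2 / n) ** 0.5
    return abs(z) > CRITICAL_Z


def false_positive_rate(peek, n_sims=2000, step=10, max_n=50, seed=1):
    """Share of null experiments declared significant.

    With peek=True the test is run after every `step` participants per group
    and the experiment counts as a 'finding' if any look is significant;
    with peek=False only the final sample is tested.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        a = [rng.gauss(0, 1) for _ in range(max_n)]
        b = [rng.gauss(0, 1) for _ in range(max_n)]
        looks = range(step, max_n + 1, step) if peek else [max_n]
        if any(significant(a[:n], b[:n]) for n in looks):
            hits += 1
    return hits / n_sims


if __name__ == "__main__":
    print("fixed sample size:", false_positive_rate(peek=False))
    print("with data-peeking:", false_positive_rate(peek=True))
```

Because both calls use the same seed, they analyse identical simulated data, so the gap between the two rates is attributable to the stopping rule alone.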

The fact that seemingly incredible findings can be published raises disquiet about the methods normally employed by psychologists and in some circles, this has inflated to concerns about psychology more generally. Within the methodological and statistical frameworks that psychologists normally operate, we have to face the unpalatable possibility that the wriggle room for researchers is unacceptably large. Further, it is implicitly reinforced, as Coyne notes, by the actions of some journals as well as media outlets – and until that is adequately addressed, little will change.

The negative

Interestingly, the four replication projects outlined above almost wholly neglected null findings. To date, replication efforts are invariably aimed at positive findings. Should we not also try to replicate null findings? Given the propensity for positive findings to become nulls, what is the likelihood of reverse effects in more adequately powered studies? The emphasis on replicating positive outcomes betrays the wider bias that psychologists have against null findings per se (Laws [ 36 ]). The overwhelming majority of published findings in psychology are positive (93.5 %: [ 54 ]) and the aversion to null findings may well be worse in psychology than other sciences [ 20 ]. Intriguingly, we can see a hint of this issue in the OSC reproducibility project, which did include 3 % of sampled findings that were null initially – and while two were confirmed as nulls, one did indeed become significant. As psychologists, we might ponder how the bias against publishing null findings finds a clear echo in the bias against replicating null findings.

A conflict between belief and evidence

The wriggle room is fertile ground for psychologists to exploit the disjunction between belief and evidence that seems quite pervasive in psychology. As remarked upon by Francis “Contrary to its central role in other sciences, it appears that successful replication is sometimes not related to belief about an effect in experimental psychology. A high rate of successful replication is not sufficient to induce belief in an effect [ 8 ], nor is a high rate of successful replication necessary for belief [ 22 ].” The Bem [ 8 ] study documented “experimental evidence for anomalous retroactive influences on cognition and affect” or in plain language…precognition. Using multiple tasks, and nine experiments involving over 1,000 participants, Bem had implausibly demonstrated that the performance of participants reflected what happened after they had made their decision. For example, on a memory test, participants were more likely to remember words that they were later asked to practise i.e. memory rehearsal seemingly worked back in time. In another task, participants had to select which of two curtains on a computer screen hid an erotic image, and they did so at a level significantly greater than chance, but not when the hidden images were less titillating. Furthermore, Bem and colleagues [ 7 ] later meta-analysed 90 previous studies to establish a significant effect size of 0.22.

Bem presents nine replications of a phenomenon and a large meta-analysis, yet we do not believe it, while other phenomena do not so readily replicate (e.g. bystander apathy [ 22 ]) but we do believe in them. Francis [ 23 ] bleakly concludes “ The scientific method is supposed to be able to reveal truths about the world, and the reliability of empirical findings is supposed to be the final arbiter of science; but this method does not seem to work in experimental psychology as it is currently practiced .” Whether we believe in Bem’s precognition, social priming, or indeed, any published psychological finding – researchers are operating within the methodological and statistical wriggle room . The task for psychologists is to view these phenomena like any other scientific question i.e. in need of explanation. If they can close-down the wriggle room, then we might expect such curios and anomalies to evaporate in a cloud of nonsignificant results.

While some might view the disjunction between belief and evidence as ‘healthy skepticism’, others might also describe it as resistance to evidence or even anti-science. A pertinent example comes from Lykken [ 37 ] who described a study in which people who see frogs in a Rorschach test – ‘frog responders’ – were more likely to have an eating disorder [ 48 ] – a finding interpreted as evidence of harboring oral impregnation fantasies and an unconscious belief in anal birth. Lykken asked 20 clinician colleagues to estimate the likelihood of this ‘cloacal theory of birth’ before and after seeing Sapolsky’s evidence. Beforehand, they reported a “…median value of 0.01, which can be interpreted to mean, roughly, ‘I don't believe it’” and after being shown the confirmatory evidence “… the median unchanged at 0.01. I interpret this consensus to mean, roughly, ‘I still don’t believe it.’” (p. 151–152). Lykken remarked that normally when a prediction is confirmed by experiment, we might expect “…a nontrivial increment in one’s confidence in that theory should result, especially when one’s prior confidence is low… [but that] this rule is wrong not only in a few exceptional instances but as it is routinely applied to the majority of experimental reports in the psychological literature” (p. 152). Often such claims give rise to a version of the maxim, popularised by Carl Sagan, that “Extraordinary claims require extraordinary evidence”. The remarkableness of a claim, however, is not necessarily relevant to either the type or the scale of evidence required. Instead of setting different criteria for the ordinary and extraordinary, we need to continue to close the wriggle room.
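Lykken's observation can be restated in terms of Bayes' rule in odds form: posterior odds equal prior odds multiplied by the likelihood ratio of the evidence. The sketch below is only an illustration of that rule; the function name and the likelihood ratios are hypothetical, and only the prior of 0.01 comes from the text.

```python
def posterior(prior, likelihood_ratio):
    """Update a prior probability using Bayes' rule in odds form."""
    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)


if __name__ == "__main__":
    prior = 0.01  # median prior reported by Lykken's colleagues
    # Hypothetical likelihood ratios: 1 means the evidence is treated as uninformative.
    for lr in (1, 5, 20):
        print(lr, round(posterior(prior, lr), 3))
```

On this reading, a median that stays at 0.01 after a confirmed prediction corresponds to a likelihood ratio of about 1: the clinicians treated the confirmatory result as carrying essentially no evidential weight.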

Beliefs and the failure to self-correct

“Scientists should not be in the business of simply ignoring literature that they do not like because it contests their view.” [ 30 ]

Taking this to the opposite extreme, some researchers may choose to ignore the findings of meta-analyses at the expense of selected individual studies that accord more with their view. Giner-Sorolla [ 24 ] maintained that “…meta-analytic validation is not seen as necessary to proclaim an effect reliable. Textbooks, press reports, and narrative reviews often rest conclusions on single influential articles rather than insisting on a replication across independent labs and multiple contexts ” (p 564, my italics).

Stroebe & Strack rightly point out, “Even multiple failures to replicate an established finding would not result in a rejection of the original hypothesis, if there are also multiple studies that supported that hypothesis.” [and] ‘believers’ “…will keep on believing, pointing at the successful replications and derogating the unsuccessful ones, whereas the nonbelievers will maintain their belief system drawing on the failed replications for support of their rejection of the original hypothesis.” (p.64). Psychology rarely – if ever – proceeds with an unequivocal knock-out blow delivered by a negative finding or even a meta-analysis. Indeed, psychology often has more of the feel of trench warfare, where models and hypotheses are ultimately abandoned largely because researchers lose interest [ 26 ].

Jussim et al. [ 30 ] provide some interesting examples of precisely how social psychology doesn’t seem to correct itself when big findings fail to replicate. If doubts are raised about an original finding then, as Jussim et al. point out, we might expect citations to reflect this debate and uncertainty, and as such the original and the unsuccessful replications would be expected to be fairly equally cited.

In a classic study, Darley & Gross [ 15 ] found people applied a stereotype about social class when they saw a young girl taking a maths test after seeing her playing against either an affluent or a poor background. After obtaining the original materials and following the procedure carefully, Baron et al. [ 6 ] published two failed replications using more than twice as many participants. Not only did they fail to replicate, the evidence was in the opposite direction. Such findings ought to encourage debate with relatively equal attention to the pro and con studies in the literature – alas no. Jussim et al. reported that “…since 1996, the original study has been cited 852 times, while the failed replications have been cited just 38 times (according to Google Scholar searches conducted on 9/11/15).”

This is not an unusual case, as Jussim et al. report several examples of failed replications not being cited, while original studies continue to be readily cited. The infamous and seminal study by Bargh and colleagues [ 5 ] showed that unconsciously priming people with an ‘elderly stereotype’ (unscrambling jumbled sentences that contained words like: old, lonely, bingo, wrinkle) makes them subsequently walk more slowly. However, Doyen et al. [ 16 ] failed to replicate the finding using more accurate measures of walking speed. Since 2013, Bargh et al. has been cited 900 times and Doyen et al. 192 times. Similarly, a meta-analysis of 88 studies by Jost et al. [ 29 ], showing that conservatism is a syndrome characterized by rigidity, dogmatism, prejudice, and fear, was not replicated by a larger, better-controlled meta-analysis conducted by Van Hiel and colleagues [ 57 ]. Since 2010, the former has been cited 1030 times while the latter a mere 60 by comparison. Jussim et al. suggest “This pattern of ignoring correctives likely leads social psychology to overstate the extent to which evidence supports the original study’s conclusions…[] it behooves researchers to grapple with the full literature, not just the studies conducive to their preferred arguments”.
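The size of the citation imbalance is easier to compare across cases as a ratio. The short sketch below uses only the citation counts quoted from Jussim et al. in the text; the dictionary labels are shorthand introduced here for illustration.

```python
# (original citations, failed-replication citations), as quoted in the text.
citation_counts = {
    "Darley & Gross vs Baron et al.": (852, 38),
    "Bargh et al. vs Doyen et al.": (900, 192),
    "Jost et al. vs Van Hiel et al.": (1030, 60),
}


def citation_ratio(original, replication):
    """How many times more often the original is cited than the failed replication."""
    return original / replication


if __name__ == "__main__":
    for label, (orig, rep) in citation_counts.items():
        print(f"{label}: {citation_ratio(orig, rep):.1f} to 1")
```

If citations tracked the state of the debate, as Jussim et al. suggest they should, these ratios would sit much closer to 1.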

Meta-analysis: rescue remedy or statistical alchemy?

Some view meta-analysis as the closest thing we have to a definitive approach for establishing the veracity and reliability of an effect. In the context of discussing social priming experiments, John Bargh [ 4 ] declared that “… In science the way to answer questions about replicability of effects is through statistical techniques such as meta-analysis”. Others are more skeptical: “Meta-analysis is a reasonable way to search for patterns in previously published research. It has serious limitations, however, as a method for confirming hypotheses and for establishing the replicability of experiments” (p. 486 Hyman, 2010). Meta-analysis is not a magic dust that we can sprinkle over primary literatures to elucidate necessary truths. Likewise, totemically accumulating replicated findings does not, in itself, necessarily prove anything (pace Popper). Does it matter if we replicate a finding once, twice, or 20 times? What ratio of positive to negative outcomes do we find acceptable? Answers or rules of thumb do not exist – it often comes down to our beliefs in psychology.

This special issue of BMC Psychology contains 4 articles (Taylor & Munafo, [ 56 ]; Lakens, Hilgard & Staaks [ 34 ]; Coppens, Verkoeijen, Bouwmeester & Rikers, [ 13 ]; Coyne [ 14 ]) and in each, meta-analysis occupies a pivotal place. As shown by Taylor & Munafo (current issue), meta-analyses have proliferated, are highly cited and “…most worryingly, the perceived authority of the conclusions of a meta-analysis means that it has become possible to use a meta-analysis in the hope of having the final word in an academic debate.” As with all methods, meta-analysis has its own limitations, and retrospective validation via meta-analysis is not a substitute for prospective replication using adequately powered trials, but it does have a substantive role to play in the reproducibility question.

Judging the weight of evidence is never straightforward and whether a finding sustains in psychology often reflects our beliefs almost as much as the evidence. Indeed, meta-analysis rightly or wrongly enables some ideas to persist despite a lack of support at the level of individual study or trial. This has certainly been argued in the use of meta-analyses to establish a case for psychic abilities, where Storm, Tressoldi & Di Risio [ 55 ] identify how “It distorts what scientists mean by confirmatory evidence. It confuses retrospective sanctification with prospective replicability.” (p.489)

This is a kind of ‘free-lunch’ notion of meta-analysis. Feinstein [ 21 ] even stated that “meta-analysis is analogous to statistical alchemy for the 21st century … the main appeal is that it can convert existing things into something better. “Significance” can be attained statistically when small group sizes are pooled into big ones” (p. 71). Undoubtedly, the conclusions of meta-analyses may prove unreliable where small numbers of nonsignificant trials are pooled to produce significant effects [ 19 ]. Nonetheless, it is also quite feasible for a literature to contain a majority of negative outcomes and still produce a reliable overall significant effect size (e.g. streptokinase: [ 35 ]).
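The mechanism behind both points above, pooling individually nonsignificant studies into a significant overall effect, can be seen in a minimal fixed-effect (inverse-variance) sketch. The five study estimates and standard errors below are invented for illustration, and the function names are hypothetical; nothing here reproduces the analyses cited.

```python
from statistics import NormalDist


def two_sided_p(z):
    """Two-sided p-value for a z statistic."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


def pool_fixed_effect(effects, std_errors):
    """Return (pooled estimate, pooled standard error) using inverse-variance weights."""
    weights = [1 / se ** 2 for se in std_errors]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    pooled_se = (1 / sum(weights)) ** 0.5
    return pooled, pooled_se


if __name__ == "__main__":
    # Five hypothetical small studies, none significant on its own.
    effects = [0.30, 0.30, 0.30, 0.30, 0.30]
    std_errors = [0.20, 0.20, 0.20, 0.20, 0.20]
    for e, se in zip(effects, std_errors):
        print("single study p =", round(two_sided_p(e / se), 3))
    pooled, pooled_se = pool_fixed_effect(effects, std_errors)
    print("pooled p =", round(two_sided_p(pooled / pooled_se), 4))
```

Whether the pooled result should be trusted depends on what the sketch leaves out, such as selective publication and the heterogeneity of the studies being combined, which is precisely the disagreement described above.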

Two of the papers presented here (Lakens et al. this issue; Taylor & Munafo this issue) offer extremely good suggestions relating to some of these conflicts in meta-analytic findings. Lakens and colleagues offer 6 recommendations, including permitting others to “re-analyze the data to examine how sensitive the results are to subjective choices such as inclusion criteria” and enabling this by providing links to data files that permit such analysis. Currently, we also need to address data sharing in regular papers. Sampling papers published in one year in the top 50 high-impact journals, Alsheikh-Ali et al. [ 1 ] reported that a substantial proportion of papers published in high-impact journals “…are either not subject to any data availability policies, or do not adhere to the data availability instructions in their respective journals”. Such efforts for transparency are extremely welcome and indeed, echo the posting online of our interactive CBT for schizophrenia meta-analysis database ( http://www.cbtinschizophrenia.com/ ), which has been used by others to test new hypotheses (e.g. [ 25 ]).

Taylor & Munafo (this issue) advise greater triangulation of evidence and, in this particular instance, supplementing traditional meta-analysis with P-curve analysis [ 52 ]. In passing, Taylor & Munafo also mention “…adversarial collaboration, where primary study authors on both sides of a particular debate contribute to an agreed protocol and work together to interpret the results”. The version of adversarial collaboration proposed by Kahneman [ 31 ] urged scientists to engage in a “good-faith effort to conduct debates by carrying out joint research” (p. 729). More recently, he elaborated on this in the context of the furore over failed replications (Kahneman [ 32 ]). Coyne covers some aspects of this latest paper on replication etiquette and finds some of it wanting. It may, however, be possible to find some new adversarial middle ground, but this depends crucially upon psychologists being more open. Indeed, some aspects of adversarial collaboration could dovetail with Lakens et al.’s proposal regarding hosting relevant data on web platforms. In such a scenario, those holding opposing views could test their hypotheses in a public arena using a shared database.

In the context of adversarial collaboration, some uncertainty and difference of opinion exist about how we might accommodate the views of those being replicated. One possibility again requires openness: those who are replicated are asked to submit a review and, crucially, the review and the replicators’ responses are then published alongside the paper. Indeed, this happened with the paper of Coppens et al. (this issue). They replicated the ‘testing effect’ reported by Carpenter (2009): that information which has been retrieved from memory is better recalled than that which has simply been studied. Their replications and meta-analysis partially replicate the original findings, and Carpenter was one of the reviewers whose review is available alongside the paper (along with the author responses). Indeed, from its initiation, BMC Psychology has published all reviews and responses to reviewers alongside published papers. This degree of openness is unusual in psychology journals, but it offers readers a glimpse into the process behind a replication (or any paper), and it allows the person being replicated to comment on the replication and to have their reply published in the same journal at the same time.

Ultimately, the issues that psychologists face over replication are as much about our beliefs, biases and openness as anything else. We are not dispassionate about the outcomes that we measure. Maybe because the substance of our spotlight is people, cognition and brains, we sometimes care too much about the ‘truths’ we choose to declare. They have implications. Similarly, we should not ignore the incentive structures and conflicts between the personal goals of psychologists and the goals of science. They have implications. Finally, the attitudes of psychologists to the transparency of our science need to change. They have implications.

Competing interests

Keith R Laws is a Section Editor for BMC Psychology, who declares no competing interests.

The Psychology of Replication and Replication in Psychology

Affiliation.

  • 1 Department of Psychological Sciences, Purdue University.
  • PMID: 26168115
  • DOI: 10.1177/1745691612459520

Like other scientists, psychologists believe experimental replication to be the final arbiter for determining the validity of an empirical finding. Reports in psychology journals often attempt to prove the validity of a hypothesis or theory with multiple experiments that replicate a finding. Unfortunately, these efforts are sometimes misguided because in a field like experimental psychology, ever more successful replication does not necessarily ensure the validity of an empirical finding. When psychological experiments are analyzed with statistics, the rules of probability dictate that random samples should sometimes be selected that do not reject the null hypothesis, even if an effect is real. As a result, it is possible for a set of experiments to have too many successful replications. When there are too many successful replications for a given set of experiments, a skeptical scientist should be suspicious that null or negative findings have been suppressed, the experiments were run improperly, or the experiments were analyzed improperly. This article describes the implications of this observation and demonstrates how to test for too much successful replication by using a set of experiments from a recent research paper.
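The logic of "too many successful replications" can be sketched in a few lines. This is a minimal illustration, not the article's own procedure. It assumes the experiments are independent and that a power estimate is available for each one. The power values are hypothetical, and the 0.1 cutoff is an assumed convention for flagging a set as suspicious. If every experiment in a set rejected the null hypothesis, the probability of that uniform success is the product of the individual powers.

```python
import math

def prob_all_significant(powers):
    """Probability that every independent experiment rejects the null,
    given each experiment's (estimated) statistical power."""
    return math.prod(powers)

def looks_too_successful(powers, threshold=0.1):
    """Flag a set of uniformly significant experiments as suspicious when
    such uniform success would be rare. The threshold is an assumption."""
    return prob_all_significant(powers) < threshold

# Hypothetical set: five experiments, all reported significant, each with power 0.6.
moderate_power = [0.6] * 5
# Hypothetical set: five experiments, all reported significant, each with power 0.9.
high_power = [0.9] * 5

print(prob_all_significant(moderate_power), looks_too_successful(moderate_power))
print(prob_all_significant(high_power), looks_too_successful(high_power))
```

With moderate power, five successes out of five would be expected less than 8% of the time, so an unbroken run of significant results is itself a warning sign. With high power the same run is unremarkable.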

Keywords: aversion; effect size; memory; power; publication bias; replication; scientific method.

© The Author(s) 2012.

Annual Review of Psychology

Volume 73, 2022. Review Article

Replicability, Robustness, and Reproducibility in Psychological Science

  • Brian A. Nosek 1,2 , Tom E. Hardwicke 3 , Hannah Moshontz 4 , Aurélien Allard 5 , Katherine S. Corker 6 , Anna Dreber 7 , Fiona Fidler 8 , Joe Hilgard 9 , Melissa Kline Struhl 2 , Michèle B. Nuijten 10 , Julia M. Rohrer 11 , Felipe Romero 12 , Anne M. Scheel 13 , Laura D. Scherer 14 , Felix D. Schönbrodt 15 , and Simine Vazire 16
  • Affiliations: 1 Department of Psychology, University of Virginia, Charlottesville, Virginia 22904, USA; 2 Center for Open Science, Charlottesville, Virginia 22903, USA; 3 Department of Psychology, University of Amsterdam, 1012 ZA Amsterdam, The Netherlands; 4 Addiction Research Center, University of Wisconsin–Madison, Madison, Wisconsin 53706, USA; 5 Department of Psychology, University of California, Davis, California 95616, USA; 6 Psychology Department, Grand Valley State University, Allendale, Michigan 49401, USA; 7 Department of Economics, Stockholm School of Economics, 113 83 Stockholm, Sweden; 8 School of Biosciences, University of Melbourne, Parkville VIC 3010, Australia; 9 Department of Psychology, Illinois State University, Normal, Illinois 61790, USA; 10 Meta-Research Center, Tilburg University, 5037 AB Tilburg, The Netherlands; 11 Department of Psychology, Leipzig University, 04109 Leipzig, Germany; 12 Department of Theoretical Philosophy, University of Groningen, 9712 CP Groningen, The Netherlands; 13 Department of Industrial Engineering and Innovation Sciences, Eindhoven University of Technology, 5612 AZ Eindhoven, The Netherlands; 14 University of Colorado Anschutz Medical Campus, Aurora, Colorado 80045, USA; 15 Department of Psychology, Ludwig Maximilian University of Munich, 80539 Munich, Germany; 16 School of Psychological Sciences, University of Melbourne, Parkville VIC 3052, Australia
  • Vol. 73:719-748 (Volume publication date January 2022) https://doi.org/10.1146/annurev-psych-020821-114157
  • First published as a Review in Advance on October 19, 2021
  • Copyright © 2022 by Annual Reviews. All rights reserved

Replication—an important, uncommon, and misunderstood practice—is gaining appreciation in psychology. Achieving replicability is important for making research progress. If findings are not replicable, then prediction and theory development are stifled. If findings are replicable, then interrogation of their meaning and validity can advance knowledge. Assessing replicability can be productive for generating and testing hypotheses by actively confronting current understandings to identify weaknesses and spur innovation. For psychology, the 2010s might be characterized as a decade of active confrontation. Systematic and multi-site replication projects assessed current understandings and observed surprising failures to replicate many published findings. Replication efforts highlighted sociocultural challenges such as disincentives to conduct replications and a tendency to frame replication as a personal attack rather than a healthy scientific practice, and they raised awareness that replication contributes to self-correction. Nevertheless, innovation in doing and understanding replication and its cousins, reproducibility and robustness, has positioned psychology to improve research practices and accelerate progress.

Literature Cited

  • Alogna VK , Attaya MK , Aucoin P , Bahník Š , Birch S et al. 2014 . Registered Replication Report: Schooler and Engstler-Schooler (1990). Perspect. Psychol. Sci. 9 : 5 556– 78 [Google Scholar]
  • Altmejd A , Dreber A , Forsell E , Huber J , Imai T et al. 2019 . Predicting the replicability of social science lab experiments. PLOS ONE 14 : 12 e0225826 [Google Scholar]
  • Anderson CJ , Bahník Š , Barnett-Cowan M , Bosco FA , Chandler J et al. 2016 . Response to Comment on “Estimating the reproducibility of psychological science. Science 351 : 6277 1037 [Google Scholar]
  • Anderson MS , Martinson BC , De Vries R. 2007 . Normative dissonance in science: results from a national survey of U.S. scientists. J. Empir. Res. Hum. Res. Ethics 2 : 4 3– 14 [Google Scholar]
  • Appelbaum M , Cooper H , Kline RB , Mayo-Wilson E , Nezu AM , Rao SM. 2018 . Journal article reporting standards for quantitative research in psychology: the APA Publications and Communications Board task force report. Am. Psychol. 73 : 1 3 – 25 Corrigendum 2018 . Am. Psychol 73 : 7 947 [Google Scholar]
  • Armeni K , Brinkman L , Carlsson R , Eerland A , Fijten R et al. 2020 . Towards wide-scale adoption of open science practices: the role of open science communities. MetaArXiv, Oct. 6 https://doi.org/10.31222/osf.io/7gct9 [Crossref]
  • Artner R , Verliefde T , Steegen S , Gomes S , Traets F et al. 2020 . The reproducibility of statistical results in psychological research: an investigation using unpublished raw data. Psychol. Methods. In press. https://doi.org/10.1037/met0000365 [Crossref] [Google Scholar]
  • Baker M. 2016 . Dutch agency launches first grants programme dedicated to replication. Nat. News. https://doi.org/10.1038/nature.2016.20287 [Crossref] [Google Scholar]
  • Bakker M , van Dijk A , Wicherts JM. 2012 . The rules of the game called psychological science. Perspect. Psychol. Sci. 7 : 6 543– 54 [Google Scholar]
  • Bakker M , Wicherts JM. 2011 . The (mis)reporting of statistical results in psychology journals. Behav. Res. Methods 43 : 3 666– 78 [Google Scholar]
  • Baribault B , Donkin C , Little DR , Trueblood JS , Oravecz Z et al. 2018 . Metastudies for robust tests of theory. PNAS 115 : 11 2607– 12 [Google Scholar]
  • Baron J , Hershey JC. 1988 . Outcome bias in decision evaluation. J. Pers. Soc. Psychol. 54 : 4 569– 79 [Google Scholar]
  • Baumeister RF. 2016 . Charting the future of social psychology on stormy seas: winners, losers, and recommendations. J. Exp. Soc. Psychol. 66 : 153– 58 [Google Scholar]
  • Baumeister RF , Vohs KD. 2016 . Misguided effort with elusive implications. Perspect. Psychol. Sci. 11 : 4 574– 75 [Google Scholar]
  • Benjamin DJ , Berger JO , Johannesson M , Nosek BA , Wagenmakers E-J et al. 2018 . Redefine statistical significance. Nat. Hum. Behav. 2 : 1 6– 10 [Google Scholar]
  • Botvinik-Nezer R , Holzmeister F , Camerer CF , Dreber A , Huber J et al. 2020 . Variability in the analysis of a single neuroimaging dataset by many teams. Nature 582 : 7810 84– 88 [Google Scholar]
  • Bouwmeester S , Verkoeijen PPJL , Aczel B , Barbosa F , Bègue L et al. 2017 . Registered Replication Report: Rand, Greene, and Nowak (2012). Perspect. Psychol. Sci. 12 : 3 527– 42 [Google Scholar]
  • Brown NJL , Heathers JAJ. 2017 . The GRIM test: A simple technique detects numerous anomalies in the reporting of results in psychology. Soc. Psychol. Pers. Sci. 8 : 4 363– 69 [Google Scholar]
  • Button KS , Ioannidis JPA , Mokrysz C , Nosek BA , Flint J et al. 2013 . Power failure: why small sample size undermines the reliability of neuroscience. Nat. Rev. Neurosci. 14 : 5 365– 76 [Google Scholar]
  • Byers-Heinlein K , Bergmann C , Davies C , Frank M , Hamlin JK et al. 2020 . Building a collaborative psychological science: lessons learned from ManyBabies 1. Can. Psychol. Psychol. Can. 61 : 4 349– 63 [Google Scholar]
  • Camerer CF , Dreber A , Forsell E , Ho T-H , Huber J et al. 2016 . Evaluating replicability of laboratory experiments in economics. Science 351 : 6280 1433– 36 [Google Scholar]
  • Camerer CF , Dreber A , Holzmeister F , Ho T-H , Huber J et al. 2018 . Evaluating the replicability of social science experiments in Nature and Science between 2010 and 2015. Nat. Hum. Behav. 2 : 9 637– 44 [Google Scholar]
  • Carter EC , Schönbrodt FD , Gervais WM , Hilgard J. 2019 . Correcting for bias in psychology: a comparison of meta-analytic methods. Adv. Methods Pract. Psychol. Sci. 2 : 2 115– 44 [Google Scholar]
  • Cent. Open Sci 2020 . APA joins as new signatory to TOP guidelines. Center for Open Science Nov. 10. https://www.cos.io/about/news/apa-joins-as-new-signatory-to-top-guidelines [Google Scholar]
  • Cesario J. 2014 . Priming, replication, and the hardest science. Perspect. Psychol. Sci. 9 : 1 40– 48 [Google Scholar]
  • Chambers C. 2019 . What's next for Registered Reports?. Nature 573 : 7773 187– 89 [Google Scholar]
  • Cheung I , Campbell L , LeBel EP , Ackerman RA , Aykutoğlu B et al. 2016 . Registered Replication Report: Study 1 from Finkel, Rusbult, Kumashiro, & Hannon (2002). Perspect. Psychol. Sci. 11 : 5 750– 64 [Google Scholar]
  • Christensen G , Wang Z , Paluck EL , Swanson N , Birke DJ , Miguel E , Littman R. 2019 . Open science practices are on the rise: the State of Social Science (3S) Survey. MetaArXiv, Oct. 18. https://doi.org/10.31222/osf.io/5rksu [Crossref]
  • Christensen-Szalanski JJ , Willham CF. 1991 . The hindsight bias: a meta-analysis. Organ. Behav. Hum. Decis. Process. 48 : 1 147– 68 [Google Scholar]
  • Cohen J. 1962 . The statistical power of abnormal-social psychological research: a review. J. Abnorm. Soc. Psychol. 65 : 3 145– 53 [Google Scholar]
  • Cohen J. 1973 . Statistical power analysis and research results. Am. Educ. Res. J. 10 : 3 225– 29 [Google Scholar]
  • Cohen J. 1990 . Things I have learned (so far). Am. Psychol. 45 : 1304– 12 [Google Scholar]
  • Cohen J. 1992 . A power primer. Psychol. Bull. 112 : 1 155– 59 [Google Scholar]
  • Cohen J. 1994 . The earth is round (p < .05). Am. Psychol. 49 : 12 997– 1003 [Google Scholar]
  • Colling LJ , Szücs D , De Marco D , Cipora K , Ulrich R et al. 2020 . Registered Replication Report on Fischer, Castel, Dodd, and Pratt (2003). Adv. Methods Pract. Psychol. Sci 3 : 2 143– 62 [Google Scholar]
  • Cook FL. 2016 . Dear Colleague Letter: robust and reliable research in the social, behavioral, and economic sciences. National Science Foundation Sept. 20. https://www.nsf.gov/pubs/2016/nsf16137/nsf16137.jsp [Google Scholar]
  • Crandall CS , Sherman JW. 2016 . On the scientific superiority of conceptual replications for scientific progress. J. Exp. Soc. Psychol. 66 : 93– 99 [Google Scholar]
  • Crisp RJ , Miles E , Husnu S 2014 . Support for the replicability of imagined contact effects. Soc. Psychol. 45 : 4 303– 4 [Google Scholar]
  • Cronbach LJ , Meehl PE. 1955 . Construct validity in psychological tests. Psychol. Bull. 52 : 4 281– 302 [Google Scholar]
  • Dang J , Barker P , Baumert A , Bentvelzen M , Berkman E et al. 2021 . A multilab replication of the ego depletion effect. Soc. Psychol. Pers. Sci. 12 : 1 14– 24 [Google Scholar]
  • Devezer B , Nardin LG , Baumgaertner B , Buzbas EO. 2019 . Scientific discovery in a model-centric framework: reproducibility, innovation, and epistemic diversity. PLOS ONE 14 : 5 e0216125 [Google Scholar]
  • Dijksterhuis A. 2018 . Reflection on the professor-priming replication report. Perspect. Psychol. Sci. 13 : 2 295– 96 [Google Scholar]
  • Dreber A , Pfeiffer T , Almenberg J , Isaksson S , Wilson B et al. 2015 . Using prediction markets to estimate the reproducibility of scientific research. PNAS 112 : 50 15343– 47 [Google Scholar]
  • Duhem PMM. 1954 . The Aim and Structure of Physical Theory Princeton, NJ: Princeton Univ. Press [Google Scholar]
  • Ebersole CR , Alaei R , Atherton OE , Bernstein MJ , Brown M et al. 2017 . Observe, hypothesize, test, repeat: Luttrell, Petty and Xu (2017) demonstrate good science. J. Exp. Soc. Psychol. 69 : 184– 86 [Google Scholar]
  • Ebersole CR , Atherton OE , Belanger AL , Skulborstad HM , Allen JM et al. 2016a . Many Labs 3: evaluating participant pool quality across the academic semester via replication. J. Exp. Soc. Psychol. 67 : 68– 82 [Google Scholar]
  • Ebersole CR , Axt JR , Nosek BA 2016b . Scientists’ reputations are based on getting it right, not being right. PLOS Biol . 14 : 5 e1002460 [Google Scholar]
  • Ebersole CR , Mathur MB , Baranski E , Bart-Plange D-J , Buttrick NR et al. 2020 . Many Labs 5: testing pre-data-collection peer review as an intervention to increase replicability. Adv. Methods Pract. Psychol. Sci. 3 : 3 309– 31 [Google Scholar]
  • Eerland A , Sherrill AM , Magliano JP , Zwaan RA , Arnal JD et al. 2016 . Registered Replication Report: Hart & Albarracín (2011). Perspect. Psychol. Sci. 11 : 1 158– 71 [Google Scholar]
  • Ellemers N , Fiske ST , Abele AE , Koch A , Yzerbyt V. 2020 . Adversarial alignment enables competing models to engage in cooperative theory building toward cumulative science. PNAS 117 : 14 7561– 67 [Google Scholar]
  • Epskamp S , Nuijten MB. 2018 . Statcheck: extract statistics from articles and recompute p values. Statistical Software https://CRAN.R-project.org/package=statcheck [Google Scholar]
  • Errington TM , Denis A , Perfito N , Iorns E , Nosek BA 2021 . Challenges for assessing reproducibility and replicability in preclinical cancer biology. eLife In press [Google Scholar]
  • Etz A , Vandekerckhove J. 2016 . A Bayesian perspective on the reproducibility project: psychology. PLOS ONE 11 : 2 e0149794 [Google Scholar]
  • Fanelli D. 2010 .. “ Positive” results increase down the hierarchy of the sciences. PLOS ONE 5 : 4 e10068 [Google Scholar]
  • Fanelli D. 2012 . Negative results are disappearing from most disciplines and countries. Scientometrics 90 : 3 891– 904 [Google Scholar]
  • Feest U. 2019 . Why replication is overrated. Philos. Sci. 86 : 5 895– 905 [Google Scholar]
  • Ferguson MJ , Carter TJ , Hassin RR. 2014 . Commentary on the attempt to replicate the effect of the American flag on increased Republican attitudes. Soc. Psychol. 45 : 4 301– 2 [Google Scholar]
  • Fetterman AK , Sassenberg K. 2015 . The reputational consequences of failed replications and wrongness admission among scientists. PLOS ONE 10 : 12 e0143723 [Google Scholar]
  • Forsell E , Viganola D , Pfeiffer T , Almenberg J , Wilson B et al. 2019 . Predicting replication outcomes in the Many Labs 2 study. J. Econ. Psychol. 75 : 102117 [Google Scholar]
  • Franco A , Malhotra N , Simonovits G. 2014 . Publication bias in the social sciences: unlocking the file drawer. Science 345 : 6203 1502– 5 [Google Scholar]
  • Franco A , Malhotra N , Simonovits G. 2016 . Underreporting in psychology experiments: evidence from a study registry. Soc. Psychol. Pers. Sci. 7 : 1 8– 12 [Google Scholar]
  • Frank MC , Bergelson E , Bergmann C , Cristia A , Floccia C et al. 2017 . A collaborative approach to infant research: promoting reproducibility, best practices, and theory-building. Infancy 22 : 4 421– 35 [Google Scholar]
  • Funder DC , Ozer DJ. 2019 . Evaluating effect size in psychological research: sense and nonsense. Adv. Methods Pract. Psychol. Sci. 2 : 2 156– 68 [Google Scholar]
  • Gelman A , Carlin J. 2014 . Beyond power calculations: assessing type S (sign) and type M (magnitude) errors. Perspect. Psychol. Sci. 9 : 6 641– 51 [Google Scholar]
  • Gelman A , Loken E. 2013 . The garden of forking paths: why multiple comparisons can be a problem, even when there is no “fishing expedition” or “p-hacking” and the research hypothesis was posited ahead of time Work. Pap., Columbia Univ. New York: [Google Scholar]
  • Gergen KJ. 1973 . Social psychology as history. J. Pers. Soc. Psychol. 26 : 2 309– 20 [Google Scholar]
  • Gervais WM , Jewell JA , Najle MB , Ng BKL. 2015 . A powerful nudge? Presenting calculable consequences of underpowered research shifts incentives toward adequately powered designs. Soc. Psychol. Pers. Sci. 6 : 7 847– 54 [Google Scholar]
  • Ghelfi E , Christopherson CD , Urry HL , Lenne RL , Legate N et al. 2020 . Reexamining the effect of gustatory disgust on moral judgment: a multilab direct replication of Eskine, Kacinik, and Prinz (2011). Adv. Methods Pract. Psychol. Sci. 3 : 1 3– 23 [Google Scholar]
  • Gilbert DT , King G , Pettigrew S , Wilson TD 2016 . Comment on “Estimating the reproducibility of psychological science. Science 351 : 6277 1037 [Google Scholar]
  • Giner-Sorolla R. 2012 . Science or art? How aesthetic standards grease the way through the publication bottleneck but undermine science. Perspect. Psychol. Sci. 7 : 6 562– 71 [Google Scholar]
  • Giner-Sorolla R. 2019 . From crisis of evidence to a “crisis” of relevance? Incentive-based answers for social psychology's perennial relevance worries. Eur. Rev. Soc. Psychol. 30 : 1 1– 38 [Google Scholar]
  • Gollwitzer M. 2020 . DFG Priority Program SPP 2317 Proposal: A meta-scientific program to analyze and optimize replicability in the behavioral, social, and cognitive sciences (META-REP). PsychArchives, May 29. http://dx.doi.org/10.23668/psycharchives.3010 [Crossref]
  • Gordon M , Viganola D , Bishop M , Chen Y , Dreber A et al. 2020 . Are replication rates the same across academic fields? Community forecasts from the DARPA SCORE programme. R. Soc. Open Sci. 7 : 7 200566 [Google Scholar]
  • Götz M , O'Boyle EH , Gonzalez-Mulé E , Banks GC , Bollmann SS 2020 . The “Goldilocks Zone”: (Too) many confidence intervals in tests of mediation just exclude zero. Psychol. Bull. 147 : 1 95– 114 [Google Scholar]
  • Greenwald AG. 1975 . Consequences of prejudice against the null hypothesis. Psychol. Bull. 82 : 1 1– 20 [Google Scholar]
  • Hagger MS , Chatzisarantis NLD , Alberts H , Anggono CO , Batailler C et al. 2016 . A multilab preregistered replication of the ego-depletion effect. Perspect. Psychol. Sci. 11 : 4 546– 73 [Google Scholar]
  • Hanea AM , McBride MF , Burgman MA , Wintle BC , Fidler F et al. 2017 . I nvestigate D iscuss E stimate A ggregate for structured expert judgement. Int. J. Forecast. 33 : 1 267– 79 [Google Scholar]
  • Hardwicke TE , Bohn M , MacDonald KE , Hembacher E , Nuijten MB et al. 2021 . Analytic reproducibility in articles receiving open data badges at the journal Psychological Science : an observational study. R. Soc. Open Sci. 8 : 1 201494 [Google Scholar]
  • Hardwicke TE , Mathur MB , MacDonald K , Nilsonne G , Banks GC et al. 2018 . Data availability, reusability, and analytic reproducibility: evaluating the impact of a mandatory open data policy at the journal Cognition . R. Soc. Open Sci. 5 : 8 180448 [Google Scholar]
  • Hardwicke TE , Serghiou S , Janiaud P , Danchev V , Crüwell S et al. 2020a . Calibrating the scientific ecosystem through meta-research. Annu. Rev. Stat. Appl. 7 : 11– 37 [Google Scholar]
  • Hardwicke TE , Thibault RT , Kosie JE , Wallach JD , Kidwell M , Ioannidis J. 2020b . Estimating the prevalence of transparency and reproducibility-related research practices in psychology (2014–2017). MetaArXiv, Jan. 2. https://doi.org/10.31222/osf.io/9sz2y [Crossref]
  • Hedges LV , Schauer JM. 2019 . Statistical analyses for studying replication: meta-analytic perspectives. Psychol. Methods 24 : 5 557– 70 [Google Scholar]
  • Hoogeveen S , Sarafoglou A , Wagenmakers E-J. 2020 . Laypeople can predict which social-science studies will be replicated successfully. Adv. Methods Pract. Psychol. Sci. 3 : 3 267– 85 [Google Scholar]
  • Hughes BM. 2018 . Psychology in Crisis London: Palgrave Macmillan [Google Scholar]
  • Inbar Y. 2016 . Association between contextual dependence and replicability in psychology may be spurious. PNAS 113 : 34 E4933– 34 [Google Scholar]
  • Ioannidis JPA. 2005 . Why most published research findings are false. PLOS Med 2 : 8 e124 [Google Scholar]
  • Ioannidis JPA. 2008 . Why most discovered true associations are inflated. Epidemiology 19 : 5 640– 48 [Google Scholar]
  • Ioannidis JPA. 2014 . How to make more published research true. PLOS Med 11 : 10 e1001747 [Google Scholar]
  • Ioannidis JPA , Trikalinos TA. 2005 . Early extreme contradictory estimates may appear in published research: the Proteus phenomenon in molecular genetics research and randomized trials. J. Clin. Epidemiol. 58 : 6 543– 49 [Google Scholar]
  • Isager PM , van Aert RCM , Bahník Š , Brandt M , DeSoto KA et al. 2020 . Deciding what to replicate: A formal definition of “replication value” and a decision model for replication study selection. MetaArXiv, Sept. 2. https://doi.org/10.31222/osf.io/2gurz [Crossref]
  • John LK , Loewenstein G , Prelec D. 2012 . Measuring the prevalence of questionable research practices with incentives for truth telling. Psychol. Sci. 23 : 5 524– 32 [Google Scholar]
  • Kahneman D. 2003 . Experiences of collaborative research. Am. Psychol. 58 : 9 723– 30 [Google Scholar]
  • Kerr NL. 1998 . HARKing: Hypothesizing after the results are known. Pers. Soc. Psychol. Rev. 2 : 3 196– 217 [Google Scholar]
  • Kidwell MC , Lazarević LB , Baranski E , Hardwicke TE , Piechowski S et al. 2016 . Badges to acknowledge open practices: a simple, low-cost, effective method for increasing transparency. PLOS Biol 14 : 5 e1002456 [Google Scholar]
  • Klein RA , Cook CL , Ebersole CR , Vitiello C , Nosek BA et al. 2019 . Many Labs 4: failure to replicate mortality salience effect with and without original author involvement. PsyArXiv, Dec. 11. https://doi.org/10/ghwq2w [Crossref]
  • Klein RA , Ratliff KA , Vianello M , Adams RB , Bahník Š et al. 2014 . Investigating variation in replicability: a “many labs” replication project. Soc. Psychol. 45 : 3 142– 52 [Google Scholar]
  • Klein RA , Vianello M , Hasselman F , Adams BG , Adams RB et al. 2018 . Many Labs 2: investigating variation in replicability across samples and settings. Adv. Methods Pract. Psychol. Sci. 1 : 4 443– 90 [Google Scholar]
  • Kunda Z. 1990 . The case for motivated reasoning. Psychol. Bull. 108 : 3 480– 98 [Google Scholar]
  • Lakens D. 2019 . The value of preregistration for psychological science: a conceptual analysis. PsyArXiv, Nov. 18. https://doi.org/10.31234/osf.io/jbh4w [Crossref]
  • Lakens D , Adolfi FG , Albers CJ , Anvari F , Apps MA et al. 2018 . Justify your alpha. Nat. Hum. Behav. 2 : 3 168– 71 [Google Scholar]
  • Landy JF , Jia ML , Ding IL , Viganola D , Tierney W et al. 2020 . Crowdsourcing hypothesis tests: making transparent how design choices shape research results. Psychol. Bull. 146 : 5 451– 79 [Google Scholar]
  • Leary MR , Diebels KJ , Davisson EK , Jongman-Sereno KP , Isherwood JC et al. 2017 . Cognitive and interpersonal features of intellectual humility. Pers. Soc. Psychol. Bull. 43 : 6 793– 813 [Google Scholar]
  • LeBel EP , McCarthy RJ , Earp BD , Elson M , Vanpaemel W. 2018 . A unified framework to quantify the credibility of scientific findings. Adv. Methods Pract. Psychol. Sci. 1 : 3 389– 402 [Google Scholar]
  • Leighton DC , Legate N , LePine S , Anderson SF , Grahe J 2018 . Self-esteem, self-disclosure, self-expression, and connection on Facebook: a collaborative replication meta-analysis. Psi Chi J. Psychol. Res. 23 : 2 98– 109 [Google Scholar]
  • Leising D , Thielmann I , Glöckner A , Gärtner A , Schönbrodt F. 2020 . Ten steps toward a better personality science—how quality may be rewarded more in research evaluation. PsyArXiv, May 31. https://doi.org/10.31234/osf.io/6btc3 [Crossref]
  • Leonelli S 2018 . Rethinking reproducibility as a criterion for research quality. Research in the History of Economic Thought and Methodology 36 L Fiorito, S Scheall, CE Suprinyak 129– 46 Bingley, UK: Emerald [Google Scholar]
  • Lewandowsky S , Oberauer K. 2020 . Low replicability can support robust and efficient science. Nat. Commun. 11 : 1 358 [Google Scholar]
  • Maassen E , van Assen MALM , Nuijten MB , Olsson-Collentine A , Wicherts JM. 2020 . Reproducibility of individual effect sizes in meta-analyses in psychology. PLOS ONE 15 : 5 e0233107 [Google Scholar]
  • Machery E. 2020 . What is a replication?. Philos. Sci. 87 : 4 545 – 67 [Google Scholar]
  • ManyBabies Consort 2020 . Quantifying sources of variability in infancy research using the infant-directed-speech preference. Adv. Methods Pract. Psychol. Sci. 3 : 1 24– 52 [Google Scholar]
  • Marcus A , Oransky I 2018 . Meet the “data thugs” out to expose shoddy and questionable research. Science Feb. 18. https://www.sciencemag.org/news/2018/02/meet-data-thugs-out-expose-shoddy-and-questionable-research [Google Scholar]
  • Marcus A , Oransky I. 2020 . Tech firms hire “Red Teams.” Scientists should, too. WIRED July 16. https://www.wired.com/story/tech-firms-hire-red-teams-scientists-should-too/ [Google Scholar]
  • Mathur MB , VanderWeele TJ. 2020 . New statistical metrics for multisite replication projects. J. R. Stat. Soc. A 183 : 3 1145– 66 [Google Scholar]
  • Maxwell SE. 2004 . The persistence of underpowered studies in psychological research: causes, consequences, and remedies. Psychol. Methods 9 : 2 147– 63 [Google Scholar]
  • Maxwell SE , Lau MY , Howard GS. 2015 . Is psychology suffering from a replication crisis? What does “failure to replicate” really mean?. Am. Psychol. 70 : 6 487– 98 [Google Scholar]
  • Mayo DG. 2018 . Statistical Inference as Severe Testing Cambridge, UK: Cambridge Univ. Press [Google Scholar]
  • McCarthy R , Gervais W , Aczel B , Al-Kire R , Baraldo S et al. 2021 . A multi-site collaborative study of the hostile priming effect. Collabra Psychol 7 : 1 18738 [Google Scholar]
  • McCarthy RJ , Hartnett JL , Heider JD , Scherer CR , Wood SE et al. 2018 . An investigation of abstract construal on impression formation: a multi-lab replication of McCarthy and Skowronski (2011). Int. Rev. Soc. Psychol. 31 : 1 15 [Google Scholar]
  • Meehl PE. 1978 . Theoretical risks and tabular asterisks: Sir Karl, Sir Ronald, and the slow progress of soft psychology. J. Consult. Clin. Psychol. 46 : 4 806– 34 [Google Scholar]
  • Meyer MN , Chabris C. 2014 . Why psychologists' food fight matters. Slate Magazine July 31. https://slate.com/technology/2014/07/replication-controversy-in-psychology-bullying-file-drawer-effect-blog-posts-repligate.html [Google Scholar]
  • Mischel W. 2008 . The toothbrush problem. APS Observer Dec. 1. https://www.psychologicalscience.org/observer/the-toothbrush-problem [Google Scholar]
  • Moran T , Hughes S , Hussey I , Vadillo MA , Olson MA et al. 2020 . Incidental attitude formation via the surveillance task: a Registered Replication Report of Olson and Fazio (2001). PsyArXiv, April 17. https://doi.org/10/ghwq2z [Crossref]
  • Moshontz H , Campbell L , Ebersole CR , IJzerman H , Urry HL et al. 2018 . The Psychological Science Accelerator: advancing psychology through a distributed collaborative network. Adv. Methods Pract. Psychol. Sci. 1 : 4 501– 15 [Google Scholar]
  • Munafò MR , Chambers CD , Collins AM , Fortunato L , Macleod MR. 2020 . Research culture and reproducibility. Trends Cogn. Sci. 24 : 2 91– 93 [Google Scholar]
  • Muthukrishna M , Henrich J. 2019 . A problem in theory. Nat. Hum. Behav. 3 : 3 221– 29 [Google Scholar]
  • Natl. Acad. Sci. Eng. Med 2019 . Reproducibility and Replicability in Science Washington, DC: Natl. Acad. Press [Google Scholar]
  • Nelson LD , Simmons J , Simonsohn U. 2018 . Psychology's renaissance. Annu. Rev. Psychol. 69 : 511– 34 [Google Scholar]
  • Nickerson RS. 1998 . Confirmation bias: a ubiquitous phenomenon in many guises. Rev. Gen. Psychol. 2 : 2 175– 220 [Google Scholar]
  • Nosek B. 2019a . Strategy for culture change. Center for Open Science June 11. https://www.cos.io/blog/strategy-for-culture-change [Google Scholar]
  • Nosek B. 2019b . The rise of open science in psychology, a preliminary report. Center for Open Science June 3. https://www.cos.io/blog/rise-open-science-psychology-preliminary-report [Google Scholar]
  • Nosek BA , Alter G , Banks GC , Borsboom D , Bowman SD et al. 2015 . Promoting an open research culture. Science 348 : 6242 1422– 25 [Google Scholar]
  • Nosek BA , Beck ED , Campbell L , Flake JK , Hardwicke TE et al. 2019 . Preregistration is hard, and worthwhile. Trends Cogn. Sci. 23 : 10 815– 18 [Google Scholar]
  • Nosek BA , Ebersole CR , DeHaven AC , Mellor DT. 2018 . The preregistration revolution. PNAS 115 : 11 2600– 6 [Google Scholar]
  • Nosek BA , Errington TM. 2020a . What is replication?. PLOS Biol 18 : 3 e3000691 [Google Scholar]
  • Nosek BA , Errington TM. 2020b . The best time to argue about what a replication means? Before you do it. Nature 583 : 7817 518– 20 [Google Scholar]
  • Nosek BA , Gilbert EA. 2017 . Mischaracterizing replication studies leads to erroneous conclusions. PsyArXiv, April 18. https://doi.org/10.31234/osf.io/nt4d3 [Crossref]
  • Nosek BA , Spies JR , Motyl M. 2012 . Scientific utopia: II. Restructuring incentives and practices to promote truth over publishability. Perspect. On Psychol. Sci. 7 : 6 615– 31 [Google Scholar]
  • Nuijten MB , Bakker M , Maassen E , Wicherts JM. 2018 . Verify original results through reanalysis before replicating. Behav. Brain Sci. 41 : e143 [Google Scholar]
  • Nuijten MB , Hartgerink CHJ , van Assen MALM , Epskamp S , Wicherts JM 2016 . The prevalence of statistical reporting errors in psychology (1985–2013). Behav. Res. Methods 48 : 4 1205– 26 [Google Scholar]
  • Nuijten MB , van Assen MA , Veldkamp CL , Wicherts JM. 2015 . The replication paradox: Combining studies can decrease accuracy of effect size estimates. Rev. Gen. Psychol. 19 : 2 172– 82 [Google Scholar]
  • O'Donnell M , Nelson LD , Ackermann E , Aczel B , Akhtar A et al. 2018 . Registered Replication Report: Dijksterhuis and van Knippenberg (1998). Perspect. Psychol. Sci. 13 : 2 268– 94 [Google Scholar]
  • Olsson-Collentine A , Wicherts JM , van Assen MALM. 2020 . Heterogeneity in direct replications in psychology and its association with effect size. Psychol. Bull. 146 : 10 922– 40 [Google Scholar]
  • Open Sci. Collab 2015 . Estimating the reproducibility of psychological science. Science 349 : 6251 aac4716 [Google Scholar]
  • Patil P , Peng RD , Leek JT. 2016 . What should researchers expect when they replicate studies? A statistical view of replicability in psychological science. Perspect. Psychol. Sci. 11 : 4 539– 44 [Google Scholar]
  • Pawel S , Held L. 2020 . Probabilistic forecasting of replication studies. PLOS ONE 15 : 4 e0231416 [Google Scholar]
  • Perugini M , Gallucci M , Costantini G. 2014 . Safeguard power as a protection against imprecise power estimates. Perspect. Psychol. Sci. 9 : 3 319– 32 [Google Scholar]
  • Protzko J , Krosnick J , Nelson LD , Nosek BA , Axt J et al. 2020 . High replicability of newly-discovered social-behavioral findings is achievable. PsyArXiv, Sept. 10. https://doi.org/10.31234/osf.io/n2a9x [Crossref]
  • Rogers EM. 2003 . Diffusion of Innovations New York: Free Press, 5th ed.. [Google Scholar]
  • Romero F. 2017 . Novelty versus replicability: virtues and vices in the reward system of science. Philos. Sci. 84 : 5 1031– 43 [Google Scholar]
  • Rosenthal R. 1979 . The file drawer problem and tolerance for null results. Psychol. Bull. 86 : 3 638– 41 [Google Scholar]
  • Rothstein HR , Sutton AJ , Borenstein M 2005 . Publication bias in meta-analysis. Publication Bias in Meta-Analysis: Prevention, Assessment and Adjustments HR Rothstein, AJ Sutton, M Borenstein 1– 7 Chichester, UK: Wiley & Sons [Google Scholar]
  • Rouder JN. 2016 . The what, why, and how of born-open data. Behav. Res. Methods 48 : 3 1062– 69 [Google Scholar]
  • Scheel AM , Schijen M , Lakens D. 2020 . An excess of positive results: comparing the standard psychology literature with Registered Reports. PsyArXiv, Febr. 5. https://doi.org/10.31234/osf.io/p6e9c [Crossref]
  • Schimmack U. 2012 . The ironic effect of significant results on the credibility of multiple-study articles. Psychol. Methods 17 : 4 551– 66 [Google Scholar]
  • Schmidt S. 2009 . Shall we really do it again? The powerful concept of replication is neglected in the social sciences. Rev. Gen. Psychol. 13 : 2 90– 100 [Google Scholar]
  • Schnall S 2014 . Commentary and rejoinder on Johnson, Cheung, and Donnellan (2014a). Clean data: Statistical artifacts wash out replication efforts. Soc. Psychol. 45 : 4 315– 17 [Google Scholar]
  • Schwarz N , Strack F. 2014 . Does merely going through the same moves make for a “direct” replication? Concepts, contexts, and operationalizations. Soc. Psychol. 45 : 4 305– 6 [Google Scholar]
  • Schweinsberg M , Madan N , Vianello M , Sommer SA , Jordan J et al. 2016 . The pipeline project: pre-publication independent replications of a single laboratory's research pipeline. J. Exp. Soc. Psychol. 66 : 55– 67 [Google Scholar]
  • Sedlmeier P , Gigerenzer G. 1992 . Do studies of statistical power have an effect on the power of studies?. Psychol. Bull. 105 : 2 309– 16 [Google Scholar]
  • Shadish WR , Cook TD , Campbell DT 2002 . Experimental and Quasi-Experimental Designs for Generalized Causal Inference Boston: Houghton Mifflin [Google Scholar]
  • Shiffrin RM , Börner K , Stigler SM. 2018 . Scientific progress despite irreproducibility: a seeming paradox. PNAS 115 : 11 2632– 39 [Google Scholar]
  • Shih M , Pittinsky TL 2014 . Reflections on positive stereotypes research and on replications. Soc. Psychol. 45 : 4 335– 38 [Google Scholar]
  • Silberzahn R , Uhlmann EL , Martin DP , Anselmi P , Aust F et al. 2018 . Many analysts, one data set: making transparent how variations in analytic choices affect results. Adv. Methods Pract. Psychol. Sci. 1 : 3 337– 56 [Google Scholar]
  • Simmons JP , Nelson LD , Simonsohn U 2011 . False-positive psychology: Undisclosed flexibility in data collection and analysis allows presenting anything as significant. Psychol. Sci. 22 : 11 1359– 66 [Google Scholar]
  • Simons DJ. 2014 . The value of direct replication. Perspect. Psychol. Sci. 9 : 1 76– 80 [Google Scholar]
  • Simons DJ , Shoda Y , Lindsay DS. 2017 . Constraints on generality (COG): a proposed addition to all empirical papers. Perspect. Psychol. Sci. 12 : 6 1123– 28 [Google Scholar]
  • Simonsohn U. 2015 . Small telescopes: detectability and the evaluation of replication results. Psychol. Sci. 26 : 5 559– 69 [Google Scholar]
  • Simonsohn U , Simmons JP , Nelson LD. 2020 . Specification curve analysis. Nat. Hum. Behav. 4 : 1208– 14 [Google Scholar]
  • Smaldino PE , McElreath R. 2016 . The natural selection of bad science. R. Soc. Open Sci. 3 : 9 160384 [Google Scholar]
  • Smith PL , Little DR. 2018 . Small is beautiful: in defense of the small-N design. Psychon. Bull. Rev. 25 : 6 2083– 101 [Google Scholar]
  • Soderberg CK. 2018 . Using OSF to share data: a step-by-step guide. Adv. Methods Pract. Psychol. Sci. 1 : 1 115– 20 [Google Scholar]
  • Soderberg CK , Errington T , Schiavone SR , Bottesini JG , Thorn FS et al. 2021 . Initial evidence of research quality of Registered Reports compared with the standard publishing model. Nat. Hum. Behav 5 : 8 990 – 97 [Google Scholar]
  • Soto CJ. 2019 . How replicable are links between personality traits and consequential life outcomes? The life outcomes of personality replication project. Psychol. Sci. 30 : 5 711– 27 [Google Scholar]
  • Spellman BA. 2015 . A short (personal) future history of revolution 2.0. Perspect. Psychol. Sci. 10 : 6 886– 99 [Google Scholar]
  • Steegen S , Tuerlinckx F , Gelman A , Vanpaemel W 2016 . Increasing transparency through a multiverse analysis. Perspect. Psychol. Sci. 11 : 5 702– 12 [Google Scholar]
  • Sterling TD. 1959 . Publication decisions and their possible effects on inferences drawn from tests of significance—or vice versa. J. Am. Stat. Assoc. 54 : 285 30– 34 [Google Scholar]
  • Sterling TD , Rosenbaum WL , Weinkam JJ. 1995 . Publication decisions revisited: the effect of the outcome of statistical tests on the decision to publish and vice versa. Am. Stat. 49 : 108– 12 [Google Scholar]
  • Stroebe W , Strack F. 2014 . The alleged crisis and the illusion of exact replication. Perspect. Psychol. Sci. 9 : 1 59– 71 [Google Scholar]
  • Szucs D , Ioannidis JPA. 2017 . Empirical assessment of published effect sizes and power in the recent cognitive neuroscience and psychology literature. PLOS Biol 15 : 3 e2000797 [Google Scholar]
  • Tiokhin L , Derex M. 2019 . Competition for novelty reduces information sampling in a research game—a registered report. R. Soc. Open Sci. 6 : 5 180934 [Google Scholar]
  • Van Bavel JJ , Mende-Siedlecki P , Brady WJ , Reinero DA 2016 . Contextual sensitivity in scientific reproducibility. PNAS 113 : 23 6454– 59 [Google Scholar]
  • Vazire S. 2018 . Implications of the credibility revolution for productivity, creativity, and progress. Perspect. Psychol. Sci. 13 : 4 411– 17 [Google Scholar]
  • Vazire S , Schiavone SR , Bottesini JG. 2020 . Credibility beyond replicability: improving the four validities in psychological science. PsyArXiv, Oct. 7. https://doi.org/10.31234/osf.io/bu4d3 [Crossref]
  • Verhagen J , Wagenmakers E-J. 2014 . Bayesian tests to quantify the result of a replication attempt. J. Exp. Psychol. Gen. 143 : 4 1457– 75 [Google Scholar]
  • Verschuere B , Meijer EH , Jim A , Hoogesteyn K , Orthey R et al. 2018 . Registered Replication Report on Mazar, Amir, and Ariely (2008). Adv. Methods Pract. Psychol. Sci. 1 : 3 299– 317 [Google Scholar]
  • Vosgerau J , Simonsohn U , Nelson LD , Simmons JP 2019 . 99% impossible: a valid, or falsifiable, internal meta-analysis. J. Exp. Psychol. Gen. 148 : 9 1628– 39 [Google Scholar]
  • Wagenmakers E-J , Beek T , Dijkhoff L , Gronau QF , Acosta A et al. 2016 . Registered Replication Report: Strack, Martin, & Stepper (1988). Perspect. Psychol. Sci. 11 : 6 917– 28 [Google Scholar]
  • Wagenmakers E-J , Wetzels R , Borsboom D , van der Maas HL. 2011 . Why psychologists must change the way they analyze their data. The case of psi: comment on Bem (2011). J. Pers. Soc. Psychol. 100 : 3 426– 32 [Google Scholar]
  • Wagenmakers E-J , Wetzels R , Borsboom D , van der Maas HL , Kievit RA. 2012 . An agenda for purely confirmatory research. Perspect. Psychol. Sci. 7 : 6 632– 38 [Google Scholar]
  • Wagge J , Baciu C , Banas K , Nadler JT , Schwarz S et al. 2018 . A demonstration of the Collaborative Replication and Education Project: replication attempts of the red-romance effect. PsyArXiv, June 22. https://doi.org/10.31234/osf.io/chax8 [Crossref]
  • Whitcomb D , Battaly H , Baehr J , Howard-Snyder D. 2017 . Intellectual humility: owning our limitations. Philos. Phenomenol. Res. 94 : 3 509– 39 [Google Scholar]
  • Wicherts JM , Bakker M , Molenaar D. 2011 . Willingness to share research data is related to the strength of the evidence and the quality of reporting of statistical results. PLOS ONE 6 : 11 e26828 [Google Scholar]
  • Wiktop G. 2020 . Systematizing Confidence in Open Research and Evidence (SCORE). Defense Advanced Research Projects Agency https://www.darpa.mil/program/systematizing-confidence-in-open-research-and-evidence [Google Scholar]
  • Wilkinson MD , Dumontier M , Aalbersberg IJ , Appleton G , Axton M et al. 2016 . The FAIR Guiding Principles for scientific data management and stewardship. Sci. Data 3 : 1 160018 [Google Scholar]
  • Wilson BM , Harris CR , Wixted JT. 2020 . Science is not a signal detection problem. PNAS 117 : 11 5559– 67 [Google Scholar]
  • Wilson BM , Wixted JT. 2018 . The prior odds of testing a true effect in cognitive and social psychology. Adv. Methods Pract. Psychol. Sci. 1 : 2 186– 97 [Google Scholar]
  • Wintle B , Mody F , Smith E , Hanea A , Wilkinson DP et al. 2021 . Predicting and reasoning about replicability using structured groups. MetaArXiv, May 4. https://doi.org/10.31222/osf.io/vtpmb [Crossref]
  • Yang Y , Youyou W , Uzzi B. 2020 . Estimating the deep replicability of scientific findings using human and artificial intelligence. PNAS 117 : 20 10762– 68 [Google Scholar]
  • Yarkoni T. 2019 . The generalizability crisis. PsyArXiv, Nov. 22. https://doi.org/10.31234/osf.io/jqw35 [Crossref]
  • Yong E 2012 . A failed replication draws a scathing personal attack from a psychology professor. National Geo-graphic March 10. https://www.nationalgeographic.com/science/phenomena/2012/03/10/failed-replication-bargh-psychology-study-doyen/ [Google Scholar]

Supplementary Data

Download the Supplemental Material as a single PDF. Includes Supplemental Text, Supplemental Tables 1-12, and Supplemental Figures 1-2.

  • Article Type: Review Article

COMMENTS

  1. Replication in Psychology: Definition, Steps, and Challenges

    In psychology, replication is defined as reproducing a study. It is essential for validity, but it's not always easy to perform experiments and get the same result.

  2. A causal replication framework for designing and assessing ...

    Through two applied examples, the article demonstrates how the causal replication framework may be utilized to plan prospective replication designs, as well as to interpret results from existing replication efforts.

  3. Replications in Psychology Research: How Often Do They Really ...

    One topic receiving substantial attention is the role of replication in psychological science. Using the complete publication history of the 100 psychology journals with the highest 5-year impact factors, the current article provides an overview of replications in psychological research since 1900.

  4. A discipline-wide investigation of the replicability of ...

    Using the replication scores, we conducted three sets of analyses: First, we determined subfield differences in estimated replication rates, bridging the gaps in previous small-sample manual replications; second, we compared replication rates between experimental and non-experimental research designs; and third, we examined how replicability ...

  5. The Psychology of Replication and Replication in Psychology

    Like other scientists, psychologists believe experimental replication to be the final arbiter for determining the validity of an empirical finding. Reports in psychology journals often attempt to prove the validity of a hypothesis or theory with multiple experiments that replicate a finding.

  6. Replication in Psychological Science - D. Stephen Lindsay, 2015

    Replication rates would be high in psychology if all of the effects studied were huge and robust, but if psychologists studied only huge and robust effects, then progress toward understanding subtleties of psychology would surely be thwarted.

  7. The role of replication in psychological science | European ...

    I provide herein a new, functional account of the role of replication in a scientific discipline: to undercut the underdetermination of scientific hypotheses from data, typically by hypotheses that connect data with phenomena. These include hypotheses that concern sampling error, experimental control, and operationalization.

  8. Psychology, replication & beyond - PMC - National Center for ...

    The lack of replication in psychology is systemic and widespread, and particularly the bias against publishing direct replications. In their survey of social science journal editors, Neuliep & Crandall found almost three quarters preferred to publish novel findings rather than replications.

  9. The Psychology of Replication and Replication in ... - PubMed

    Like other scientists, psychologists believe experimental replication to be the final arbiter for determining the validity of an empirical finding. Reports in psychology journals often attempt to prove the validity of a hypothesis or theory with multiple experiments that replicate a finding.

  10. Replicability, Robustness, and Reproducibility in ...

    Replication, an important, uncommon, and misunderstood practice, is gaining appreciation in psychology. Achieving replicability is important for making research progress. If findings are not replicable, then prediction and theory development are stifled.