
Perspective

Published: 12 October 2020

Eight problems with literature reviews and how to fix them

  • Neal R. Haddaway (ORCID: orcid.org/0000-0003-3902-2234) 1,2,3,
  • Alison Bethel 4,
  • Lynn V. Dicks 5,6,
  • Julia Koricheva (ORCID: orcid.org/0000-0002-9033-0171) 7,
  • Biljana Macura (ORCID: orcid.org/0000-0002-4253-1390) 2,
  • Gillian Petrokofsky 8,
  • Andrew S. Pullin 9,
  • Sini Savilaakso (ORCID: orcid.org/0000-0002-8514-8105) 10,11 &
  • Gavin B. Stewart (ORCID: orcid.org/0000-0001-5684-1544) 12

Nature Ecology & Evolution volume 4, pages 1582–1589 (2020)


Subjects: Conservation biology, Environmental impact

An Author Correction to this article was published on 19 October 2020.

Traditional approaches to reviewing literature may be susceptible to bias and result in incorrect decisions. This is of particular concern when reviews address policy- and practice-relevant questions. Systematic reviews have been introduced as a more rigorous approach to synthesizing evidence across studies; they rely on a suite of evidence-based methods aimed at maximizing rigour and minimizing susceptibility to bias. Despite the increasing popularity of systematic reviews in the environmental field, evidence synthesis methods continue to be poorly applied in practice, resulting in the publication of syntheses that are highly susceptible to bias. Recognizing the constraints that researchers can sometimes feel when attempting to plan, conduct and publish rigorous and comprehensive evidence syntheses, we aim here to identify major pitfalls in the conduct and reporting of systematic reviews, making use of recent examples from across the field. Adopting a ‘critical friend’ role in supporting would-be systematic reviews and avoiding individual responses to police use of the ‘systematic review’ label, we go on to identify methodological solutions to mitigate these pitfalls. We then highlight existing support available to avoid these issues and call on the entire community, including systematic review specialists, to work towards better evidence syntheses for better evidence and better decisions.





Acknowledgements

We thank C. Shortall from Rothamsted Research for useful discussions on the topic.

Author information

Authors and Affiliations

Mercator Research Institute on Climate Change and Global Commons, Berlin, Germany

Neal R. Haddaway

Stockholm Environment Institute, Stockholm, Sweden

Neal R. Haddaway & Biljana Macura

Africa Centre for Evidence, University of Johannesburg, Johannesburg, South Africa

College of Medicine and Health, Exeter University, Exeter, UK

Alison Bethel

Department of Zoology, University of Cambridge, Cambridge, UK

Lynn V. Dicks

School of Biological Sciences, University of East Anglia, Norwich, UK

Department of Biological Sciences, Royal Holloway University of London, Egham, UK

Julia Koricheva

Department of Zoology, University of Oxford, Oxford, UK

Gillian Petrokofsky

Collaboration for Environmental Evidence, UK Centre, School of Natural Sciences, Bangor University, Bangor, UK

Andrew S. Pullin

Liljus ltd, London, UK

Sini Savilaakso

Department of Forest Sciences, University of Helsinki, Helsinki, Finland

Evidence Synthesis Lab, School of Natural and Environmental Sciences, University of Newcastle, Newcastle-upon-Tyne, UK

Gavin B. Stewart


Contributions

N.R.H. developed the manuscript idea and a first draft. All authors contributed to examples and edited the text. All authors have read and approve of the final submission.

Corresponding author

Correspondence to Neal R. Haddaway.

Ethics declarations

Competing interests

S.S. is a co-founder of Liljus ltd, a firm that provides research services in sustainable finance as well as forest conservation and management. The other authors declare no competing interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Table

Examples of literature reviews and common problems identified.


About this article

Cite this article

Haddaway, N.R., Bethel, A., Dicks, L.V. et al. Eight problems with literature reviews and how to fix them. Nat Ecol Evol 4, 1582–1589 (2020). https://doi.org/10.1038/s41559-020-01295-x


Received: 24 March 2020

Accepted: 31 July 2020

Published: 12 October 2020

Issue Date: December 2020

DOI: https://doi.org/10.1038/s41559-020-01295-x




Chapter 7: Considering bias and conflicts of interest among the included studies

Isabelle Boutron, Matthew J Page, Julian PT Higgins, Douglas G Altman, Andreas Lundh, Asbjørn Hróbjartsson; on behalf of the Cochrane Bias Methods Group

Key Points:

  • Review authors should seek to minimize bias. We draw a distinction between two places in which bias should be considered. The first is in the results of the individual studies included in a systematic review. The second is in the result of the meta-analysis (or other synthesis) of findings from the included studies.
  • Problems with the design and execution of individual studies of healthcare interventions raise questions about the internal validity of their findings; empirical evidence provides support for this concern.
  • An assessment of the internal validity of studies included in a Cochrane Review should emphasize the risk of bias in their results, that is, the risk that they will over-estimate or under-estimate the true intervention effect.
  • Results of meta-analyses (or other syntheses) across studies may additionally be affected by bias due to the absence of results from studies that should have been included in the synthesis.
  • Review authors should consider source of funding and conflicts of interest of authors of the study, which may inform the exploration of directness and heterogeneity of study results, assessment of risk of bias within studies, and assessment of risk of bias in syntheses owing to missing results.

Cite this chapter as: Boutron I, Page MJ, Higgins JPT, Altman DG, Lundh A, Hróbjartsson A. Chapter 7: Considering bias and conflicts of interest among the included studies. In: Higgins JPT, Thomas J, Chandler J, Cumpston M, Li T, Page MJ, Welch VA (editors). Cochrane Handbook for Systematic Reviews of Interventions version 6.4 (updated August 2023). Cochrane, 2023. Available from www.training.cochrane.org/handbook .

7.1 Introduction

Cochrane Reviews seek to minimize bias. We define bias as a systematic error, or deviation from the truth, in results. Biases can lead to under-estimation or over-estimation of the true intervention effect and can vary in magnitude: some are small (and trivial compared with the observed effect) and some are substantial (so that an apparent finding may be due entirely to bias). A source of bias may even vary in direction across studies. For example, bias due to a particular design flaw such as lack of allocation sequence concealment may lead to under-estimation of an effect in one study but over-estimation in another (Jüni et al 2001).

Bias can arise because of the actions of primary study investigators or because of the actions of review authors, or may be unavoidable due to constraints on how research can be undertaken in practice. Actions of authors can, in turn, be influenced by conflicts of interest. In this chapter we introduce issues of bias in the context of a Cochrane Review, covering both biases in the results of included studies and biases in the results of a synthesis. We introduce the general principles of assessing the risk that bias may be present, as well as the presentation of such assessments and their incorporation into analyses. Finally, we address how source of funding and conflicts of interest of study authors may impact on study design, conduct and reporting. Conflicts of interest held by review authors are also of concern; these should be addressed using editorial procedures and are not covered by this chapter (see Chapter 1, Section 1.3).

We draw a distinction between two places in which bias should be considered. The first is in the results of the individual studies included in a systematic review. Since the conclusions drawn in a review depend on the results of the included studies, if these results are biased, then a meta-analysis of the studies will produce a misleading conclusion. Therefore, review authors should systematically take into account risk of bias in results of included studies when interpreting the results of their review.

The second place in which bias should be considered is the result of the meta-analysis (or other synthesis) of findings from the included studies. This result will be affected by biases in the included studies, and may additionally be affected by bias due to the absence of results from studies that should have been included in the synthesis. Specifically, the conclusions of the review may be compromised when decisions about how, when and where to report results of eligible studies are influenced by the nature and direction of the results. This is the problem of ‘non-reporting bias’ (also described as ‘publication bias’ and ‘selective reporting bias’). There is convincing evidence that results that are statistically non-significant and unfavourable to the experimental intervention are less likely to be published than statistically significant results, and hence are less easily identified by systematic reviews (see Section 7.2.3). This leads to results being missing systematically from syntheses, which can lead to syntheses over-estimating or under-estimating the effects of an intervention. For this reason, the assessment of risk of bias due to missing results is another essential component of a Cochrane Review.
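The mechanism can be made concrete with a toy simulation; this is a minimal sketch with invented parameters, not an analysis from the Handbook. It generates many small two-arm trials of an intervention with a modest true effect, ‘publishes’ only the statistically significant ones, and compares the average effect across all trials with the average across the published subset.

```python
import random, math

random.seed(7)
TRUE_EFFECT, SD, N = 0.2, 1.0, 50  # standardized effect, per-arm sample size (invented)

published, all_trials = [], []
for _ in range(2000):
    treated = [random.gauss(TRUE_EFFECT, SD) for _ in range(N)]
    control = [random.gauss(0.0, SD) for _ in range(N)]
    diff = sum(treated) / N - sum(control) / N
    se = math.sqrt(2 * SD**2 / N)
    all_trials.append(diff)
    if abs(diff) / se > 1.96:  # only 'significant' trials get published
        published.append(diff)

print("mean effect, all trials:       %.2f" % (sum(all_trials) / len(all_trials)))
print("mean effect, published trials: %.2f" % (sum(published) / len(published)))
```

Because significance filtering preferentially retains the largest estimates, the published subset over-estimates the true effect by a wide margin; a synthesis restricted to published results inherits exactly this distortion.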

Both the risk of bias in included studies and risk of bias due to missing results may be influenced by conflicts of interest of study investigators or funders. For example, investigators with a financial interest in showing that a particular drug works may exclude participants who did not respond favourably to the drug from the analysis, or fail to report unfavourable results of the drug in a manuscript.

Further discussion of assessing risk of bias in the results of an individual randomized trial is available in Chapter 8, and of a non-randomized study in Chapter 25. Further discussion of assessing risk of bias due to missing results is available in Chapter 13.

7.1.1 Why consider risk of bias?

There is good empirical evidence that particular features of the design, conduct and analysis of randomized trials lead to bias on average, and that some results of randomized trials are suppressed from dissemination because of their nature. However, it is usually impossible to know to what extent biases have affected the results of a particular study or analysis (Savović et al 2012). For these reasons, it is more appropriate to consider whether a result is at risk of bias rather than claiming with certainty that it is biased. Most recent tools for assessing the internal validity of findings from quantitative studies in health now focus on risk of bias, whereas previous tools targeted the broader notion of ‘methodological quality’ (see also Section 7.1.2).

Bias should not be confused with imprecision. Bias refers to systematic error, meaning that multiple replications of the same study would reach the wrong answer on average. Imprecision refers to random error, meaning that multiple replications of the same study will produce different effect estimates because of sampling variation, but would give the right answer on average. Precision depends on the number of participants and (for dichotomous outcomes) the number of events in a study, and is reflected in the confidence interval around the intervention effect estimate from each study. The results of smaller studies are subject to greater sampling variation and hence are less precise. A small trial may be at low risk of bias yet its result may be estimated very imprecisely, with a wide confidence interval. Conversely, the results of a large trial may be precise (narrow confidence interval) but also at a high risk of bias.
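The distinction can be illustrated with a minimal simulation (all numbers invented): replications of a small unbiased trial scatter widely around the right answer, while replications of a large trial with a design flaw that adds a constant distortion cluster tightly around the wrong one.

```python
import random

random.seed(1)
TRUE_EFFECT = 10.0  # hypothetical 'truth': intervention improves a score by 10 points

def run_trial(n_per_arm, bias=0.0):
    """Simulate one two-arm trial and return its estimated effect."""
    control = [random.gauss(50, 15) for _ in range(n_per_arm)]
    treated = [random.gauss(50 + TRUE_EFFECT + bias, 15) for _ in range(n_per_arm)]
    return sum(treated) / n_per_arm - sum(control) / n_per_arm

# Small, unbiased trials: estimates vary widely (random error) but average
# out near the true effect.
small_unbiased = [run_trial(20) for _ in range(1000)]

# Large, biased trials (a flaw adds +5 on average): each estimate is precise,
# yet replications are wrong on average (systematic error).
large_biased = [run_trial(2000, bias=5.0) for _ in range(1000)]

print("small unbiased trials: mean estimate = %.1f" % (sum(small_unbiased) / 1000))
print("large biased trials:   mean estimate = %.1f" % (sum(large_biased) / 1000))
```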

Bias should also not be confused with the external validity of a study, that is, the extent to which the results of a study can be generalized to other populations and settings. For example, a study may enrol participants who are not representative of the population who most commonly experience a particular clinical condition. The results of this study may have limited generalizability to the wider population, but will not necessarily give a biased estimate of the effect in the highly specific population on which it is based. Factors influencing the applicability of an included study to the review question are covered in Chapter 14 and Chapter 15.

7.1.2 From quality scales to domain-based tools

Critical assessment of included studies has long been an important component of a systematic review or meta-analysis, and methods have evolved greatly over time. Early appraisal tools were structured as quality ‘scales’, which combined information on several features into a single score. However, this approach was questioned after it was revealed that the type of quality scale used could significantly influence the interpretation of the meta-analysis results (Jüni et al 1999). That is, risk ratios of trials deemed ‘high quality’ by some scales suggested that the experimental intervention was superior, whereas when trials were deemed ‘high quality’ by other scales, the opposite was the case. The lack of a theoretical framework underlying the concept of ‘quality’ assessed by these scales resulted in tools mixing different concepts such as risk of bias, imprecision, relevance, applicability, ethics, and completeness of reporting. Furthermore, the summary score combining these components is difficult to interpret (Jüni et al 2001).

In 2008, Cochrane released the Cochrane risk-of-bias (RoB) tool, which was slightly revised in 2011 (Higgins et al 2011). The tool was built on the following key principles:

  • The tool focused on a single concept: risk of bias. It did not consider other concepts such as the quality of reporting, precision (the extent to which results are free of random errors), or external validity (directness, applicability or generalizability).
  • The tool was based on a domain-based (or component) approach, in which different types of bias are considered in turn. Users were asked to assess seven domains: random sequence generation, allocation sequence concealment, blinding of participants and personnel, blinding of outcome assessment, incomplete outcome data, selective outcome reporting, and other sources of bias. There was no scoring system in the tool.
  • The domains were selected to characterize mechanisms through which bias may be introduced into a trial, based on a combination of theoretical considerations and empirical evidence.
  • The assessment of risk of bias required judgement and should thus be completely transparent. Review authors provided a judgement for each domain, rated as ‘low’, ‘high’ or ‘unclear’ risk of bias, and provided reasons to support their judgement.

This tool has been implemented widely both in Cochrane Reviews and non-Cochrane reviews (Jørgensen et al 2016). However, user testing has raised some concerns related to the modest inter-rater reliability of some domains (Hartling et al 2013), the need to rethink the theoretical background of the ‘selective outcome reporting’ domain (Page and Higgins 2016), the misuse of the ‘other sources of bias’ domain (Jørgensen et al 2016), and the lack of appropriate consideration of the risk-of-bias assessment in the analyses and interpretation of results (Hopewell et al 2013).

To address these concerns, a new version of the Cochrane risk-of-bias tool, RoB 2, has been developed, and this should be used for all randomized trials in Cochrane Reviews (MECIR Box 7.1.a). The tool, described in Chapter 8, includes important innovations in the assessment of risk of bias in randomized trials. The structure of the tool is similar to that of the ROBINS-I tool for non-randomized studies of interventions (described in Chapter 25). Both tools include a fixed set of bias domains, which are intended to cover all issues that might lead to a risk of bias. To help reach risk-of-bias judgements, a series of ‘signalling questions’ are included within each domain. Also, the assessment is typically specific to a particular result. This is because the risk of bias may differ depending on how an outcome is measured and how the data for the outcome are analysed. For example, if two analyses for a single outcome are presented, one adjusted for baseline prognostic factors and the other not, then the risk of bias in the two results may be different.

MECIR Box 7.1.a Relevant expectations for conduct of intervention reviews

Assessing risk of bias

   Risk of bias in individual study results for the included studies should be explicitly considered to determine the extent to which findings of the studies can be believed. Risks of bias might vary by result. It may not be feasible to assess the risk of bias in every single result available across the included studies, particularly if a large number of studies and results are available. Review authors should therefore assess risk of bias in the results of outcomes included in their ‘Summary of findings’ tables, which present the findings of seven or fewer outcomes that are most important to patients. The RoB 2 tool, as described in Chapter 8, is the preferred tool for all randomized trials in new reviews. The Cochrane Evidence Production and Methods Directorate is, however, aware that there remain challenges in learning and implementation of the tool, and use of the original Cochrane risk of bias tool is acceptable for the time being.

7.2 Empirical evidence of bias

Where possible, assessments of risk of bias in a systematic review should be informed by evidence. The following sections summarize some of the key evidence about bias that informs our guidance on risk-of-bias assessments in Cochrane Reviews.

7.2.1 Empirical evidence of bias in randomized trials: meta-epidemiologic studies

Many empirical studies have shown that methodological features of the design, conduct and reporting of studies are associated with biased intervention effect estimates. This evidence is mainly based on meta-epidemiologic studies using a large collection of meta-analyses to investigate the association between a reported methodological characteristic and intervention effect estimates in randomized trials. The first meta-epidemiologic study was published in 1995. It showed exaggerated intervention effect estimates when intervention allocation methods were inadequate or unclear and when trials were not described as double-blinded (Schulz et al 1995). These results were subsequently confirmed in several meta-epidemiologic studies, showing that lack of reporting of adequate random sequence generation, allocation sequence concealment, double blinding and more specifically blinding of outcome assessors tend to yield higher intervention effect estimates on average (Dechartres et al 2016a, Page et al 2016).

Evidence from meta-epidemiologic studies suggests that the influence of methodological characteristics such as lack of blinding and inadequate allocation sequence concealment varies by the type of outcome. For example, the extent of over-estimation is larger when the outcome is subjectively measured (e.g. pain) and therefore likely to be influenced by knowledge of the intervention received, and lower when the outcome is objectively measured (e.g. death) and therefore unlikely to be influenced by knowledge of the intervention received (Wood et al 2008, Savović et al 2012).

7.2.2 Trial characteristics explored in meta-epidemiologic studies that are not considered sources of bias

Researchers have also explored the influence of other trial characteristics that are not typically considered a threat to a direct causal inference for intervention effect estimates. Recent meta-epidemiologic studies have shown that effect estimates were lower in prospectively registered trials compared with trials not registered or registered retrospectively (Dechartres et al 2016b, Odutayo et al 2017). Others have shown an association between sample size and effect estimates, with larger effects observed in smaller trials (Dechartres et al 2013). Studies have also shown a consistent association between intervention effect and single or multiple centre status, with single-centre trials showing larger effect estimates, even after controlling for sample size (Dechartres et al 2011).

In some of these cases, plausible bias mechanisms can be hypothesized. For example, both the number of centres and sample size may be associated with intervention effect estimates because of non-reporting bias (e.g. single-centre studies and small studies may be more likely to be published when they have larger, statistically significant effects than when they have smaller, non-significant effects); or single-centre and small studies may be subject to less stringent controls and checks. However, alternative explanations are possible, such as differences in factors relating to external validity (e.g. participants in small, single-centre trials may be more homogenous than participants in other trials). Because of this, these factors are not directly captured by the risk-of-bias tools recommended by Cochrane. Review authors should record these characteristics systematically for each study included in the systematic review (e.g. in the ‘Characteristics of included studies’ table) where appropriate. For example, trial registration status should be recorded for all randomized trials identified.

7.2.3 Empirical evidence of non-reporting biases

A list of the key types of non-reporting biases is provided in Table 7.2.a. In the sections that follow, we provide some of the evidence that underlies this list.

Table 7.2.a Definitions of some types of non-reporting biases

Publication bias: The publication or non-publication of research findings, depending on the nature and direction of the results.

Time-lag bias: The rapid or delayed publication of research findings, depending on the nature and direction of the results.

Language bias: The publication of research findings in a particular language, depending on the nature and direction of the results.

Citation bias: The citation or non-citation of research findings, depending on the nature and direction of the results.

Multiple (duplicate) publication bias: The multiple or singular publication of research findings, depending on the nature and direction of the results.

Location bias: The publication of research findings in journals with different ease of access or levels of indexing in standard databases, depending on the nature and direction of results.

Selective (non-) reporting bias: The selective reporting of some outcomes or analyses, but not others, depending on the nature and direction of the results.

7.2.3.1 Selective publication of study reports

There is convincing evidence that the publication of a study report is influenced by the nature and direction of its results (Chan et al 2014). Direct empirical evidence of such selective publication (or ‘publication bias’) is obtained from analysing a cohort of studies in which there is a full accounting of what is published and unpublished (Franco et al 2014). Schmucker and colleagues analysed the proportion of published studies in 39 cohorts (including 5112 studies identified from research ethics committees and 12,660 studies identified from trials registers) (Schmucker et al 2014). Only half of the studies were published, and studies with statistically significant results were more likely to be published than those with non-significant results (odds ratio (OR) 2.8; 95% confidence interval (CI) 2.2 to 3.5) (Schmucker et al 2014). Similar findings were observed by Scherer and colleagues, who conducted a systematic review of 425 studies that explored subsequent full publication of research initially presented at biomedical conferences (Scherer et al 2018). Only 37% of the 307,028 abstracts presented at conferences were published later in full (60% for randomized trials), and abstracts with statistically significant results in favour of the experimental intervention (versus results in favour of the comparator intervention) were more likely to be published in full (OR 1.17; 95% CI 1.07 to 1.28) (Scherer et al 2018). By examining a cohort of 164 trials submitted to the FDA for regulatory approval, Rising and colleagues found that trials with favourable results were more likely than those with unfavourable results to be published (OR 4.7; 95% CI 1.33 to 17.1) (Rising et al 2008).
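For readers unfamiliar with how such odds ratios are derived, the sketch below shows the standard arithmetic from a 2×2 table of publication status against statistical significance. The counts are hypothetical, chosen only so the point estimate echoes the 2.8 reported above; they are not Schmucker and colleagues’ data.

```python
import math

# Hypothetical counts (illustration only): publication status by whether a
# study's results were statistically significant.
published_sig, unpublished_sig = 280, 120        # significant results
published_nonsig, unpublished_nonsig = 150, 180  # non-significant results

# Odds ratio: odds of publication for significant vs non-significant results.
odds_sig = published_sig / unpublished_sig
odds_nonsig = published_nonsig / unpublished_nonsig
odds_ratio = odds_sig / odds_nonsig

# 95% CI via the standard error of log(OR): sqrt(1/a + 1/b + 1/c + 1/d).
se_log_or = math.sqrt(1 / published_sig + 1 / unpublished_sig
                      + 1 / published_nonsig + 1 / unpublished_nonsig)
lower = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
upper = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)
print(f"OR = {odds_ratio:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```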

In addition to being more likely than unpublished randomized trials to have statistically significant results, published trials also tend to report larger effect estimates in favour of the experimental intervention than trials disseminated elsewhere (e.g. in conference abstracts, theses, books or government reports) (ratio of odds ratios 0.90; 95% CI 0.82 to 0.98) (Dechartres et al 2018). This bias has been observed in studies in many scientific disciplines, including the medical, biological, physical and social sciences (Polanin et al 2016, Fanelli et al 2017).

7.2.3.2 Other types of selective dissemination of study reports

The length of time between completion of a study and publication of its results can be influenced by the nature and direction of the study results (‘time-lag bias’). Several studies suggest that randomized trials with results that favour the experimental intervention are published in journals about one year earlier on average than trials with unfavourable results (Hopewell et al 2007, Urrutia et al 2016).

Investigators working in a non-English speaking country may publish some of their work in local, non-English language journals, which may not be indexed in the major biomedical databases (‘language bias’). It has long been assumed that investigators are more likely to publish positive studies in English-language journals than in local, non-English language journals (Morrison et al 2012). Contrary to this belief, Dechartres and colleagues identified larger intervention effects in randomized trials published in a language other than English than in English (ratio of odds ratios 0.86; 95% CI 0.78 to 0.95), which the authors hypothesized may be related to the higher risk of bias observed in the non-English language trials (Dechartres et al 2018). Several studies have found that in most cases there were no major differences between summary estimates of meta-analyses restricted to English-language studies compared with meta-analyses including studies in languages other than English (Morrison et al 2012, Dechartres et al 2018).

The number of times a study report is cited appears to be influenced by the nature and direction of its results (‘citation bias’). In a meta-analysis of 21 methodological studies, Duyx and colleagues observed that articles with statistically significant results were cited at 1.57 times the rate of articles with non-significant results (rate ratio 1.57; 95% CI 1.34 to 1.83) (Duyx et al 2017). They also found that articles with results in a positive direction (regardless of their statistical significance) were cited at 2.14 times the rate of articles with results in a negative direction (rate ratio 2.14; 95% CI 1.29 to 3.56) (Duyx et al 2017). In an analysis of 33,355 studies across all areas of science, Fanelli and colleagues found that the number of citations received by a study was positively correlated with the magnitude of effects reported (Fanelli et al 2017). If positive studies are more likely to be cited, they may be more likely to be located, and thus more likely to be included in a systematic review.

Investigators may report the results of their study across multiple publications; for example, Blümle and colleagues found that of 807 studies approved by a research ethics committee in Germany from 2000 to 2002, 135 (17%) had more than one corresponding publication (Blümle et al 2014). Evidence suggests that studies with statistically significant results or larger treatment effects are more likely to lead to multiple publications (‘multiple (duplicate) publication bias’) (Easterbrook et al 1991, Tramèr et al 1997), which makes it more likely that they will be located and included in a meta-analysis.

Research suggests that the accessibility or level of indexing of journals is associated with effect estimates in trials (‘location bias’). For example, a study of 61 meta-analyses found that trials published in journals indexed in Embase but not MEDLINE yielded smaller effect estimates than trials indexed in MEDLINE (ratio of odds ratios 0.71; 95% CI 0.56 to 0.90); however, the risk of bias due to not searching Embase may be minor, given the lower prevalence of Embase-unique trials (Sampson et al 2003). Also, Moher and colleagues estimate that 18,000 biomedical research studies are tucked away in ‘predatory’ journals, which actively solicit manuscripts and charge publication fees without providing robust editorial services (such as peer review and archiving or indexing of articles) (Moher et al 2017). The direction of bias associated with non-inclusion of studies published in predatory journals depends on whether they are publishing valid studies with null results or studies whose results are biased towards finding an effect.

7.2.3.3 Selective dissemination of study results

The need to compress a substantial amount of information into a few journal pages, along with a desire for the most noteworthy findings to be published, can lead to omission from publication of results for some outcomes because of the nature and direction of the findings. Particular results may not be reported at all (‘selective non-reporting of results’) or be reported incompletely (‘selective under-reporting of results’, e.g. stating only that “P>0.05” rather than providing summary statistics or an effect estimate and measure of precision) (Kirkham et al 2010). In such instances, the data necessary to include the results in a meta-analysis are unavailable. Excluding such studies from the synthesis ignores the information that no significant difference was found, and biases the synthesis towards finding a difference (Schmid 2016).

Evidence of selective non-reporting and under-reporting of results in randomized trials has been obtained by comparing what was pre-specified in a trial protocol with what is available in the final trial report. In two landmark studies, Chan and colleagues found that results were not reported for at least one benefit outcome in 71% of randomized trials in one cohort (Chan et al 2004a) and 88% in another (Chan et al 2004b). Results were under-reported (e.g. stating only that “P>0.05”) for at least one benefit outcome in 92% of randomized trials in one cohort and 96% in another. Statistically significant results for benefit outcomes were twice as likely as non-significant results to be completely reported (range of odds ratios 2.4 to 2.7) (Chan et al 2004a, Chan et al 2004b). Reviews of studies investigating selective non-reporting and under-reporting of results suggest that it is more common for outcomes defined by trialists as secondary rather than primary (Jones et al 2015, Li et al 2018).

Selective non-reporting and under-reporting of results occurs for both benefit and harm outcomes. Examining the studies included in a sample of 283 Cochrane Reviews, Kirkham and colleagues suspected that 50% of 712 studies with results missing for the primary benefit outcome of the review were missing because of the nature of the results (Kirkham et al 2010). This estimate was slightly higher (63%) in 393 studies with results missing for the primary harm outcome of 322 systematic reviews (Saini et al 2014).

7.3 General procedures for risk-of-bias assessment

7.3.1 Collecting information for assessment of risk of bias

Information for assessing the risk of bias can be found in several sources, including published articles, trials registers, protocols, clinical study reports (i.e. documents prepared by pharmaceutical companies, which provide extensive detail on trial methods and results), and regulatory reviews (see also Chapter 5, Section 5.2).

Published articles are the most frequently used source of information for assessing risk of bias. This source is theoretically very valuable because it has been reviewed by editors and peer reviewers, who ideally will have prompted authors to report their methods transparently. However, the completeness of reporting of published articles is, in general, quite poor, and essential information for assessing risk of bias is frequently missing. For example, across 20,920 randomized trials included in 2001 Cochrane Reviews, the percentage of trials at unclear risk of bias was 49% for random sequence generation, 57% for allocation sequence concealment, 31% for blinding and 25% for incomplete outcome data (Dechartres et al 2017). Nevertheless, more recent trials were less likely to be judged at unclear risk of bias, suggesting that reporting is improving over time (Dechartres et al 2017).

Trials registers can be a useful source of information to obtain results of studies that have not yet been published (Riveros et al 2013). However, registers typically report only limited information about methods used in the trial to inform an assessment of risk of bias (Wieseler et al 2012). Protocols, which outline the objectives, design, methodology, statistical considerations and procedural aspects of a clinical study, may provide more detailed information on the methods used than that provided in the results report of a study. They are increasingly being published or made available by journals that publish the final report of a study. Protocols are also available in some trials registers, particularly ClinicalTrials.gov (Zarin et al 2016), on websites dedicated to data sharing such as ClinicalStudyDataRequest.com, or from drug regulatory authorities such as the European Medicines Agency. Clinical study reports are another highly useful source of information (Wieseler et al 2012, Jefferson et al 2014).

It may be necessary to contact study investigators to request access to the trial protocol, to clarify incompletely reported information or understand discrepant information available in different sources. To reduce the risk that study authors provide overly positive answers to questions about study design and conduct, we suggest review authors use open-ended questions. For example, to obtain information about the randomization process, review authors might consider asking: ‘What process did you use to assign each participant to an intervention?’ To obtain information about blinding of participants, it might be useful to request something like, ‘Please describe any measures used to ensure that trial participants were unaware of the intervention to which they were assigned’. More focused questions can then be asked to clarify remaining uncertainties.

7.3.2 Performing assessments of risk of bias   

Risk-of-bias assessments in Cochrane Reviews should be performed independently by at least two people (MECIR Box 7.3.a). Doing so can minimize errors in assessments and ensure that the judgement is not influenced by a single person’s preconceptions. Review authors should also define in advance the process for resolving disagreements. For example, both assessors may attempt to resolve disagreements via discussion, and if that fails, call on another author to adjudicate the final judgement. Review authors assessing risk of bias should have either content or methodological expertise (or both), and an adequate understanding of the relevant methodological issues addressed by the risk-of-bias tool. There is some evidence that intensive, standardized training may significantly improve the reliability of risk-of-bias assessments (da Costa et al 2017). To improve reliability of assessments, a review team could consider piloting the risk-of-bias tool on a sample of articles. This may help ensure that criteria are applied consistently and that consensus can be reached. Three to six papers should provide a suitable sample for this. We do not recommend the use of statistical measures of agreement (such as kappa statistics) to describe the extent to which assessments by multiple authors were the same. It is more important that reasons for any disagreement are explored and resolved.

MECIR Box 7.3.a Relevant expectations for conduct of intervention reviews

Assessing risk of bias in duplicate

Duplicating the risk-of-bias assessment reduces both the risk of making mistakes and the possibility that assessments are influenced by a single person’s biases.

The process for reaching risk-of-bias judgements should be transparent. In other words, readers should be able to discern why a particular result was rated at low risk of bias and why another was rated at high risk of bias. This can be achieved by review authors providing information in risk-of-bias tables to justify the judgement made. Such information may include direct quotes from study reports that articulate which methods were used, and an explanation for why such a method is flawed. Cochrane Review authors are expected to record the source of information (including the precise location within a document) that informed each risk-of-bias judgement (MECIR Box 7.3.b).

MECIR Box 7.3.b Relevant expectations for conduct of intervention reviews

Supporting judgements of risk of bias

Providing support for the judgement makes the process transparent.

Providing sources of information for risk-of-bias assessments

Readers, editors and referees should have the opportunity to see for themselves from where supports for judgements have been obtained.

Many results are often available in trial reports, so review authors should think carefully about which results to assess for risk of bias. Review authors should assess risk of bias in results for outcomes that are included in the ‘Summary of findings’ table (MECIR Box 7.1.a). Such tables typically include seven or fewer patient-important outcomes (for more details on constructing a ‘Summary of findings’ table, see Chapter 14).

Novel methods for assessing risk of bias are emerging, including machine learning systems designed to semi-automate risk-of-bias assessment (Marshall et al 2016, Millard et al 2016). These methods involve using a sample of previous risk-of-bias assessments to train machine learning models to predict risk of bias from PDFs of study reports, and to extract supporting text for the judgements. Some of these approaches showed good performance in identifying sentences pertinent to risk of bias in the full-text content of research articles describing clinical trials. A study showed that about one-third of articles could be assessed by just one reviewer if such a tool is used instead of the two required reviewers (Millard et al 2016). However, reliability in reaching judgements about risk of bias compared with human reviewers was slight to moderate depending on the domain assessed (Gates et al 2018).

7.4 Presentation of assessment of risk of bias

Risk-of-bias assessments may be presented in a Cochrane Review in various ways. A full risk-of-bias table includes responses to each signalling question within each domain (see Chapter 8, Section 8.2) and risk-of-bias judgements, along with text to support each judgement. Such full tables are lengthy and are unlikely to be of great interest to readers, so should generally not be included in the main body of the review. It is nevertheless good practice to make these full tables available for reference.

We recommend the use of forest plots that present risk-of-bias judgements alongside the results of each study included in a meta-analysis (see Figure 7.4.a). This will give a visual impression of the relative contributions of the studies at different levels of risk of bias, especially when considered in combination with the weight given to each study. This may assist authors in reaching overall conclusions about the risk of bias of the synthesized result, as discussed in Section 7.6. Optionally, forest plots or other tables or graphs can be ordered (stratified) by judgements on each risk-of-bias domain or by the overall risk-of-bias judgement for each result.

Review authors may wish to generate bar graphs illustrating the relative contributions of studies with each risk-of-bias judgement (low risk of bias, some concerns, and high risk of bias). When dividing up a bar into three regions for this purpose, it is preferable to determine the regions according to statistical information (e.g. precision, or weight in a meta-analysis) arising from studies in each category, rather than according to the number of studies in each category.
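As a sketch of that weighting idea, with entirely hypothetical studies and standard errors, the following computes each study’s inverse-variance weight and the share of total meta-analytic weight falling in each risk-of-bias category. Note how a single precise low-risk trial can dominate even when most studies, counted per study, are at higher risk.

```python
# Hypothetical studies: (label, standard error of effect estimate, RoB 2 judgement).
studies = [
    ("Trial A", 0.10, "low risk"),
    ("Trial B", 0.25, "some concerns"),
    ("Trial C", 0.30, "some concerns"),
    ("Trial D", 0.40, "high risk"),
    ("Trial E", 0.45, "high risk"),
]

# Inverse-variance weight for each study, as in a fixed-effect meta-analysis.
weights = {label: 1 / se**2 for label, se, _ in studies}
total = sum(weights.values())

# Share of total meta-analytic weight per risk-of-bias category.
shares = {}
for label, se, rob in studies:
    shares[rob] = shares.get(rob, 0.0) + weights[label] / total

for rob, share in shares.items():
    print(f"{rob}: {share:.0%} of meta-analytic weight")
```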

Figure 7.4.a Forest plot displaying RoB 2 risk-of-bias judgements for each randomized trial included in a meta-analysis of mental health first aid (MHFA) knowledge scores. Adapted from Morgan et al (2018).


7.5 Summary assessments of risk of bias

Review authors should make explicit summary judgements about the risk of bias for important results both within studies and across studies (see MECIR Box 7.5.a). The tools currently recommended by Cochrane for assessing risk of bias within included studies (RoB 2 and ROBINS-I) produce an overall judgement of risk of bias for the result being assessed. These overall judgements are derived from assessments of individual bias domains as described, for example, in Chapter 8, Section 8.2.

To summarize risk of bias across study results in a synthesis, review authors should follow guidance for assessing certainty in the body of evidence (e.g. using GRADE), as described in Chapter 14, Section 14.2.2. When a meta-analysis is dominated by study results at high risk of bias, the certainty of the body of evidence may be rated as being lower than if such studies were excluded from the meta-analysis. Section 7.6 discusses some possible courses of action that may be preferable to retaining such studies in the synthesis.

MECIR Box 7.5.a Relevant expectations for conduct of intervention reviews

Summarizing risk-of-bias assessments

7.6 Incorporating assessment of risk of bias into analyses

7.6.1 Introduction

When performing and presenting meta-analyses, review authors should address risk of bias in the results of included studies (MECIR Box 7.6.a). It is not appropriate to present analyses and interpretations while ignoring flaws identified during the assessment of risk of bias. In this section we present suitable strategies for addressing risk of bias in results from studies included in a meta-analysis, either in order to understand the impact of bias or to determine a suitable estimate of intervention effect (Section 7.6.2). For the latter, decisions often involve a trade-off between bias and precision. A meta-analysis that includes all eligible studies may produce a result with high precision (narrow confidence interval) but be seriously biased because of flaws in the conduct of some of the studies. However, including only the studies at low risk of bias in all domains assessed may produce a result that is unbiased but imprecise (if there are only a few studies at low risk of bias).

MECIR Box 7.6.a Relevant expectations for conduct of intervention reviews: addressing risk of bias in the synthesis, and incorporating assessments of risk of bias (including when randomized trials have been assessed using one or more tools in addition to the RoB 2 tool).

7.6.2 Including risk-of-bias assessments in analyses

Broadly speaking, studies at high risk of bias should be given reduced weight in meta-analyses compared with studies at low risk of bias. However, methodological approaches for weighting studies according to their risk of bias are not sufficiently well developed that they can currently be recommended for use in Cochrane Reviews.

When risks of bias vary across studies in a meta-analysis, four broad strategies are available to incorporate assessments into the analysis. The choice of strategy will influence which result to present as the main finding for a particular outcome (e.g. in the Abstract). The intended strategy should be described in the protocol for the review.

(1) Primary analysis restricted to studies at low risk of bias

The first approach involves restricting the primary analysis to studies judged to be at low risk of bias overall. Review authors who restrict their primary analysis in this way are encouraged to perform sensitivity analyses to show how conclusions might be affected if studies at a high risk of bias were included.
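A minimal sketch of this strategy follows, with hypothetical data and simple fixed-effect inverse-variance pooling (a random-effects model would be handled analogously): the primary analysis is restricted to studies at low risk of bias, and a sensitivity analysis including all studies makes the trade-off between bias and precision visible in the interval widths.

```python
# A minimal sketch: primary analysis restricted to studies at low risk of
# bias, with a sensitivity analysis including all studies. Fixed-effect
# inverse-variance pooling; all study data below are hypothetical.
import numpy as np

effects = np.array([0.40, 0.35, 0.55, 0.90, 0.85])   # effect estimates
se      = np.array([0.12, 0.15, 0.20, 0.10, 0.18])   # standard errors
low_rob = np.array([True, True, True, False, False]) # overall judgement

def pool(y, s):
    w = 1 / s**2
    est = np.sum(w * y) / np.sum(w)
    half = 1.96 * np.sqrt(1 / np.sum(w))
    return est, est - half, est + half

print("primary (low RoB only):", pool(effects[low_rob], se[low_rob]))
print("sensitivity (all):     ", pool(effects, se))
```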

(2) Present multiple (stratified) analyses

Stratifying according to the overall risk of bias will produce multiple estimates of the intervention effect: for example, one based on all studies, one based on studies at low risk of bias, and one based on studies at high risk of bias. Two or more such estimates might be considered with equal prominence (e.g. the first and second of these). However, presenting the results in this way may be confusing for readers. In particular, people who need to make a decision usually require a single estimate of effect. Furthermore, ‘Summary of findings’ tables typically present only a single result for each outcome. On the other hand, a stratified forest plot presents all the information transparently. Although we generally recommend stratifying on the basis of overall risk of bias, review authors may choose to conduct subgroup analyses based on specific bias domains (e.g. risk of bias arising from the randomization process).

Formal comparisons of intervention effects according to risk of bias can be done with a test for differences across subgroups (e.g. comparing studies at high risk of bias with studies at low risk of bias), or by using meta-regression (for more details see Chapter 10, Section 10.11.4 ). However, review authors should be cautious in planning and carrying out such analyses, because an individual review may not have enough studies in each category of risk of bias to identify meaningful differences. Lack of a statistically significant difference between studies at high and low risk of bias should not be interpreted as absence of bias, because these analyses typically have low power.
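A minimal sketch of such a formal comparison follows, using the standard fixed-effect Q-test for subgroup differences on hypothetical data (a meta-regression on a risk-of-bias covariate would be an alternative). As noted above, with few studies per stratum the test will usually have low power.

```python
# A minimal sketch: Q-test comparing pooled effects in low- vs high-RoB
# subgroups. Each subgroup is pooled by inverse variance, and the
# between-subgroup heterogeneity statistic is referred to a chi-squared
# distribution. All data are hypothetical.
import numpy as np
from scipy.stats import chi2

groups = {
    "low RoB":  (np.array([0.40, 0.35, 0.55]), np.array([0.12, 0.15, 0.20])),
    "high RoB": (np.array([0.90, 0.85]),       np.array([0.10, 0.18])),
}

est, var = {}, {}
for name, (y, s) in groups.items():
    w = 1 / s**2
    est[name] = np.sum(w * y) / np.sum(w)   # subgroup pooled estimate
    var[name] = 1 / np.sum(w)               # its variance

wg = np.array([1 / v for v in var.values()])
e  = np.array(list(est.values()))
overall = np.sum(wg * e) / np.sum(wg)
q_between = np.sum(wg * (e - overall) ** 2)
p = chi2.sf(q_between, df=len(groups) - 1)
print(f"Q_between = {q_between:.2f}, p = {p:.4f}")
```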

The choice between strategies (1) and (2) should be based to a large extent on the balance between the potential for bias and the loss of precision when studies at high or unclear risk of bias are excluded.

(3) Present all studies and provide a narrative discussion of risk of bias

The simplest approach to incorporating risk-of-bias assessments in results is to present an estimated intervention effect based on all available studies, together with a description of the risk of bias in individual domains, or a description of the summary risk of bias, across studies. This is the only feasible option when all studies are at the same risk of bias. However, when studies have different risks of bias, we discourage such an approach for two reasons. First, detailed descriptions of risk of bias in the Results section, together with a cautious interpretation in the Discussion section, will often be lost in the Authors’ conclusions, Abstract and ‘Summary of findings’ table, so that the final interpretation ignores the risk of bias and decisions continue to be based, at least in part, on compromised evidence. Second, such an analysis fails to down-weight studies at high risk of bias and so will lead to an overall intervention effect that is too precise, as well as being potentially biased.

When the primary analysis is based on all studies, summary assessments of risk of bias should be incorporated into explicit measures of the certainty of evidence for each important outcome, for example, by using the GRADE system (Guyatt et al 2008). This incorporation can help to ensure that judgements about the risk of bias, as well as other factors affecting the quality of evidence, such as imprecision, heterogeneity and publication bias, are considered appropriately when interpreting the results of the review (see Chapter 14 and Chapter 15 ).

(4) Adjust effect estimates for bias

A final, more sophisticated, option is to adjust the result from each study in an attempt to remove the bias. Adjustments are usually undertaken within a Bayesian framework, with assumptions about the size of the bias and its uncertainty being expressed through prior distributions (see Chapter 10, Section 10.13 ). Prior distributions may be based on expert opinion or on meta-epidemiological findings (Turner et al 2009, Welton et al 2009). The approach is increasingly used in decision making, where adjustments can additionally be made for applicability of the evidence to the decision at hand. However, we do not encourage use of bias adjustments in the context of a Cochrane Review because the assumptions required are strong, limited methodological expertise is available, and it is not possible to account for issues of applicability due to the diverse intended audiences for Cochrane Reviews. The approach might be entertained as a sensitivity analysis in some situations.
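To show the mechanics only, here is a minimal sketch of an additive bias adjustment in the spirit of the approaches cited above (Turner et al 2009, Welton et al 2009): each high-risk result is shifted by the prior mean of the assumed bias, and the prior variance of the bias is added to its sampling variance, which down-weights it in the pooled estimate. The prior values are purely illustrative, not derived from expert opinion or meta-epidemiological data, and this is not a full Bayesian analysis.

```python
# A minimal sketch of an additive bias adjustment (illustrative only, not a
# full Bayesian analysis): shift high-RoB results by the prior mean bias and
# inflate their variances by the prior variance of the bias.
import numpy as np

effects  = np.array([0.40, 0.55, 0.90])
variance = np.array([0.015, 0.040, 0.010])
high_rob = np.array([False, False, True])

bias_mean, bias_var = 0.20, 0.05    # illustrative prior, not evidence-based

adj_y = np.where(high_rob, effects - bias_mean, effects)
adj_v = np.where(high_rob, variance + bias_var, variance)

w = 1 / adj_v
pooled = np.sum(w * adj_y) / np.sum(w)
print(f"bias-adjusted pooled estimate: {pooled:.3f} "
      f"(SE {np.sqrt(1 / np.sum(w)):.3f})")
```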

7.7 Considering risk of bias due to missing results

The 2011 Cochrane risk-of-bias tool for randomized trials encouraged a study-level judgement about whether there has been selective reporting, in general, of the trial results. As noted in Section 7.2.3.3 , selective reporting can arise in several ways: (1) selective non-reporting of results, where results for some of the analysed outcomes are selectively omitted from a published report; (2) selective under-reporting of data, where results for some outcomes are selectively reported with inadequate detail for the data to be included in a meta-analysis; and (3) bias in selection of the reported result, where a result has been selected for reporting by the study authors, on the basis of the results, from multiple measurements or analyses that have been generated for the outcome domain (Page and Higgins 2016).

The RoB 2 and ROBINS-I tools focus solely on risk of bias as it pertains to a specific trial result. With respect to selective reporting, RoB 2 and ROBINS-I examine whether a specific result from the trial is likely to have been selected from multiple possible results on the basis of the findings (scenario 3 above). Guidance on assessing the risk of bias in selection of the reported result is available in Chapter 8 (for randomized trials) and Chapter 25 (for non-randomized studies of interventions).

If there is no result (i.e. it has been omitted selectively from the report or under-reported), then a risk-of-bias assessment at the level of the study result is not applicable. Selective non-reporting of results and selective under-reporting of data are therefore not covered by the RoB 2 and ROBINS-I tools. Instead, selective non-reporting of results and under-reporting of data should be assessed at the level of the synthesis across studies. Both practices lead to a situation similar to that when an entire study report is unavailable because of the nature of the results (also known as publication bias). Regardless of whether an entire study report or only a particular result of a study is unavailable, the same consequence can arise: bias in a synthesis because available results differ systematically from missing results (Page et al 2018). Chapter 13 provides detailed guidance on assessing risk of bias due to missing results in a systematic review.
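One widely used diagnostic for this kind of missingness, covered in detail in Chapter 13, is a test for funnel-plot asymmetry. The sketch below runs Egger's regression test on hypothetical data (the standardized effect is regressed on precision; a non-zero intercept suggests small-study effects). Note that asymmetry can have causes other than missing results, and the test has low power in small meta-analyses.

```python
# A minimal sketch of Egger's regression test for funnel-plot asymmetry,
# one diagnostic for bias due to missing results. Data are hypothetical.
# Requires scipy >= 1.6 for LinregressResult.intercept_stderr.
import numpy as np
from scipy.stats import linregress, t as t_dist

effects = np.array([0.10, 0.25, 0.40, 0.55, 0.70, 0.90])
se      = np.array([0.05, 0.10, 0.15, 0.20, 0.28, 0.35])

res = linregress(1 / se, effects / se)    # standardized effect on precision
t_stat = res.intercept / res.intercept_stderr
p = 2 * t_dist.sf(abs(t_stat), df=len(effects) - 2)
print(f"Egger intercept = {res.intercept:.2f}, p = {p:.3f}")
```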

7.8 Considering source of funding and conflict of interest of authors of included studies

Readers of a trial report often need to reflect on whether conflicts of interest have influenced the design, conduct, analysis and reporting of a trial. It is therefore now common for scientific journals to require authors of trial reports to provide a declaration of conflicts of interest (sometimes called ‘competing’ or ‘declarations of’ interest), to report funding sources and to describe any funder’s role in the trial.

In this section, we characterize conflicts of interest in randomized trials and discuss how conflicts of interest may impact on trial design and effect estimates. We also suggest how review authors can collect, process and use information on conflicts of interest in the assessment of:

  • directness of studies to the review’s research question;
  • heterogeneity in results due to differences in the designs of eligible studies;
  • risk of bias in results of included studies;
  • risk of bias in a synthesis due to missing results.

At the time of writing, a formal Tool for Addressing Conflicts of Interest in Trials (TACIT) is being developed under the auspices of the Cochrane Bias Methods Group. The TACIT development process has informed the content of this section, and we encourage readers to check http://tacit.one for more detailed guidance that will become available.

7.8.1 Characteristics of conflicts of interest

The Institute of Medicine defined conflicts of interest as “a set of circumstances that creates a risk that professional judgment or actions regarding a primary interest will be unduly influenced by a secondary interest” (Lo et al 2009). In a clinical trial, the primary interest is to provide patients, clinicians and health policy makers with an unbiased and clinically relevant estimate of an intervention effect. Secondary interests may be financial or non-financial.

Financial conflicts of interest involve both financial interests related to a specific trial (for example, a company funding a trial of a drug produced by the same company) and financial interests related to the authors of a trial report (for example, authors’ ownership of stocks or employment by a drug company).

For drug and device companies and other manufacturers, the financial difference between a negative and positive pivotal trial can be considerable. For example, the mean stock price of the companies funding 23 positive pivotal oncology trials increased by 14% after disclosure of the results (Rothenstein et al 2011). Industry funding is common, especially in drug trials. In a study of 200 trial publications from 2015, 68 (38%) of 178 trials with funding declarations were industry funded (Hakoum et al 2017). Also, in a cohort of oncology drug trials, industry funded 44% of trials and authors declared conflicts of interest in 69% of trials (Riechelmann et al 2007).

The degree of funding, and the type of involvement of industry funders, may differ across trials. In some situations, involvement includes only the provision of free study medication for a trial that has otherwise been planned and conducted independently, and funded largely by public means. In other situations, a company fully funds and controls a trial. In rarer cases, head-to-head trials comparing two drugs may be funded by the two different companies producing the drugs.

A Cochrane Methodology Review analysed 75 studies of the association between industry funding and trial results (Lundh et al 2017). The authors concluded that trials funded by a drug or device company were more likely to have positive conclusions and statistically significant results, and that this association could not be explained by differences in risk of bias between industry and non-industry funded trials. However, industry and non-industry trials may differ in ways that may confound the association; for example due to choice of patient population, comparator interventions or outcomes. Only one of the included studies used a meta-epidemiological design and found no clear association between industry funding and the magnitude of intervention effects (Als-Nielsen et al 2003). Similar to the association with industry funding, other studies have reported that results of trials conducted by authors with a financial conflict of interest were more likely to be positive (Ahn et al 2017).

Conflicts of interest may also be non-financial (Viswanathan et al 2014). Characterizations of non-financial conflicts of interest differ somewhat, but typically distinguish between conflicts related mainly to an individual (e.g. adherence to a theory or ideology), relationships to other individuals (e.g. loyalty to friends, family members or close colleagues), or relationship to groups (e.g. work place or professional groups). In medicine, non-financial conflicts of interest have received less attention than financial conflicts of interest. In addition, financial and non-financial conflicts are often intertwined; for example, non-financial conflicts related to institutional association can be considered as indirect financial conflicts linked to employment. Definitions of what should be characterized as a ‘non-financial’ conflict of interest, and, in particular, whether personal beliefs, experiences or intellectual commitments should be considered conflicts of interest, have been debated (Bero and Grundy 2016).

It is useful to differentiate between non-financial conflicts of interest of a trial researcher and the basic interests and hopes involved in doing good trial research. Most researchers conducting a trial will have an interest in the scientific problem addressed, a well-articulated theoretical position, anticipation of a specific trial result, and hopes for publication in a respectable journal. This is not a conflict of interest but a basic condition for doing health research. However, individual researchers may lose sight of the primacy of methodological neutrality at the heart of scientific enquiry, and become unduly occupied with the secondary interest of how trial results may affect their academic standing or chances of future funding. Extreme examples are the publication of fabricated trial data or trials, some of which have had an impact on systematic reviews (Marret et al 2009).

Few empirical studies of non-financial conflicts of interest in randomized trials have been published, and to our knowledge there are none that assess the impact of non-financial conflicts of interest on trial results and conclusions. However, non-financial conflicts of interests have been investigated in other types of clinical research; for example, guideline authors’ specialty appears to have influenced their voting behaviour while developing guidelines for mammography screening (Norris et al 2012).

7.8.2 Conflict of interest and trial design

Core decisions on designing a trial involve defining the type of participants to be included, the type of experimental intervention, the type of comparator, the outcomes (and timing of outcome assessments) and the choice of analysis. Such decisions will often reflect a compromise between what is clinically and scientifically ideal and what is practically possible. However, when investigators have important conflicts of interest, a trial may be designed in a way that increases its chances of detecting a positive trial result, at the expense of clinical applicability. For example, narrow eligibility criteria may exclude older and frail patients, thus reducing the possibility of detecting clinically relevant harms. Alternatively, trial designers may choose placebo as a comparator despite an effective intervention being in regular use, or they may focus on short-term surrogate outcomes rather than clinically relevant long-term outcomes (Estellat and Ravaud 2012, Wieland et al 2017).

Trial design choices may be more subtle. For example, a trial may be designed to favour an experimental drug by using an inferior comparator drug when better alternatives exist (Safer 2002) or by using a low dose of the comparator drug when the focus is efficacy and a high dose of the comparator drug when the focus is harms (Mann and Djulbegovic 2013). In a typical Cochrane Review with fairly broad eligibility criteria aiming to identify and summarize all relevant trials, it is pertinent to consider the degree to which a given trial result directly relates to the question posed by the review. If all or most identified trials have narrow eligibility criteria and short-term outcomes, a review question focusing on broad patient categories and long-term effects can only be answered indirectly by the included studies. This has implications for the assessment of the certainty of the evidence provided by the review, which is addressed through the concept of indirectness in the GRADE framework (see Chapter 14, Section 14.2 ).

If results in a meta-analysis display heterogeneity, then differences in design choices that are driven by conflicts of interest may be one reason for this. Thus, conflicts of interest may also affect reflections on the certainty of the evidence through the GRADE concept of inconsistency.

7.8.3 Conflicts of interest and risk of bias in a trial’s effect estimate

Authors of Cochrane Reviews have sometimes included conflicts of interest as an ‘other source of bias’ while using the previous versions of the risk-of-bias tool (Jørgensen et al 2016). Consistent with previous versions of the Handbook , we discourage the inclusion of conflicts of interest directly in the risk-of-bias assessment. Adding conflicts of interest to the bias tool is inconsistent with the conceptual structure of the tool, which is built on mechanistically defined bias domains. Also, restricting consideration of the potential impact of conflicts of interest to a question of risk of bias in an individual trial result overlooks other important aspects, such as the design of the trial (see Section 7.8.2 ) and potential bias in a meta-analysis due to missing results (see Section 7.8.4 ).

Conflicts of interest may lead to bias in effect estimates from a trial through several mechanisms. For example, if those recruiting participants into a trial have important conflicts of interest and the allocation sequence is not concealed, then they may be more likely to subvert the allocation process to produce intervention groups that are systematically unbalanced in favour of their preferred intervention. Similarly, investigators with important conflicts of interests may decide to exclude from the analysis some patients who did not respond as anticipated to the experimental intervention, resulting in bias due to missing outcome data. Furthermore, selective reporting of a favourable result may be strongly associated with conflicts of interest (McGauran et al 2010), due to either selective reporting of particular outcome measurements or selective reporting of particular analyses (Eyding et al 2010, Vedula et al 2013). One study found that use of modified-intention-to-treat analysis and post-randomization exclusions occurred more often in trials with industry funding or author conflicts of interest (Montedori et al 2011). Accessing the trial protocol and statistical analysis plan to determine which outcomes and analyses were pre-specified is therefore especially important for a trial with relevant conflicts of interest.

Review authors should explain how consideration of conflicts of interest informed their risk-of-bias judgements. For example, when information on the analysis plans is lacking, review authors may judge the risk of bias in selection of the reported result to be high if the study investigators had important financial conflicts of interest. Conversely, if trial investigators have clearly used methods that are likely to minimize bias, review authors should not judge the risk of bias for each domain higher just because the investigators happen to have conflicts of interest. In addition, as an optional component in the revised risk-of-bias tool, review authors may reflect on the direction of bias (e.g. bias in favour of the experimental intervention). Information on conflicts of interest may inform the assessment of direction of bias.

7.8.4 Conflicts of interest and risk of bias in a synthesis of trial results

Conflicts of interest may also affect the decision not to report trial results. Conflicts of interest are probably one of several important reasons for decisions not to publish trials with negative findings, and not to publish unfavourable results (Sterne 2013). When relevant trial results are systematically missing from a meta-analysis because of the nature of the findings, the synthesis is at risk of bias due to missing results. Chapter 13 provides detailed guidance on assessing risk of bias due to missing results in a systematic review.

7.8.5 Practical approach to identifying and extracting information on conflicts of interest

When assessing conflicts of interest in a trial, review authors will, to a large degree, rely on declared conflicts. Source of funding may be reported in a trial publication, and conflicts of interest may be reported in an accompanying declaration, for example the International Committee of Medical Journal Editors ( ICMJE ) declaration. In a random sample of 1002 articles published in 2016, authors of 229 (23%) declared having a conflict of interest (Grundy et al 2018). Unfortunately, undeclared conflicts of interest and sources of funding are fairly common (Rasmussen et al 2015, Patel et al 2018).

It is always prudent to examine closely the conflicts of interest of lead and corresponding authors, based on information reported in the trial publication and the author declaration (for example, the ICMJE declaration form). Review authors should also consider examining conflicts of interest of trial co-authors and any commercial collaborators with conflicts of interest; for example, a commercial contract research organization hired by the funder to collect and analyse trial data or the involvement of a medical writing agency. Due to the high prevalence of undisclosed conflicts of interest, review authors should consider expanding their search for conflicts of interest data from other sources (e.g. disclosure in other publications by the authors, the trial protocol, the clinical study report, and public conflicts of interest registries (e.g. Open Payments database)).

We suggest that review authors balance the workload involved against the expected gain, and search additional sources of information on conflicts of interest when there is reason to suspect important conflicts of interest. As a rule of thumb, in trials with an unclear funding source and no declaration of conflicts of interest from lead or corresponding authors, we suggest review authors search the Open Payments database, ClinicalTrials.gov, and conflicts of interest declarations in a few previous publications by the study authors. In trials with no commercial funding (including no company employee co-authors) and no declared conflicts of interest for lead or corresponding authors, we suggest review authors need not consult additional sources. Also, for trials where lead or corresponding authors have clear conflicts of interest, little additional information may be gained from checking the conflicts of interest of co-authors.

Gaining access to relevant information on financial conflicts of interest is possible for a considerable number of trials, despite inherent problems of undeclared conflicts. We expect that the proportion of trials with relevant declarations will increase further.

Access to relevant information on non-financial conflicts of interest is more difficult to obtain. Declaration of non-financial conflicts of interest is requested by approximately 50% of journals (Shawwa et al 2016). The term was deleted from the ICMJE declaration in 2010 and replaced by a broad category of “Other relationships or activities” (Drazen et al 2010). Non-financial conflicts of interest are therefore seldom self-declared, although, where available, such information should be considered.

Non-financial conflicts of interest are difficult to address due to lack of relevant empirical studies on their impact on study results, lack of relevant thresholds for importance, and lack of declaration in many previous trials. However, as a rule of thumb, we suggest that review authors assume trial authors have no non-financial conflicts of interest unless there are clear suggestions of the opposite. Examples of such clues could be a considerable spin in trial publications (Boutron et al 2010), an institutional relationship pertinent to the intervention tested, or external evidence of a fixated ideological or theoretical position.

7.8.6 Judgement of notable concern about conflict of interest

Review authors should describe funding information and conflicts of interest of authors for all studies in the ‘Characteristics of included studies’ table (MECIR Box 7.8.a). Review authors may also want to explore (e.g. in a subgroup analysis) whether trials with conflicts of interest have different, or more variable, intervention effect estimates than trials without conflicts of interest. In both cases, review authors need to set a relevant threshold for when a conflict of interest is deemed important. If the threshold is set too low, trivial conflicts of interest may cloud important ones; if it is set too high, important conflicts of interest may be downplayed or ignored.

This judgement should take into account both the degree of the study authors’ conflicts of interest and the extent of their involvement in the study. We pragmatically suggest review authors aim for a judgement about whether or not there is reason for ‘notable concern’ about conflicts of interest. This information could be displayed in a table with three columns (a simple sketch of such a table follows the list):

  • trial identifier;
  • judgement (e.g. ‘notable concern about conflict of interest’ versus ‘no notable concern about conflict of interest’); and
  • rationale for judgement, potentially subdivided according to who had conflicts of interest (e.g. lead or corresponding authors, other authors) and stage(s) of the trial to which they contributed (design, conduct, analysis, reporting).
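As referenced above, here is a simple sketch of such a three-column table; the trial names, judgements and rationales are hypothetical illustrations of the structure only.

```python
# A minimal sketch of the three-column judgement table described above.
# Trial names, judgements and rationales are hypothetical illustrations.
header = ("Trial", "Judgement", "Rationale")
rows = [
    ("Trial A", "notable concern about conflict of interest",
     "lead author employed by funder; funder involved in design and analysis"),
    ("Trial B", "no notable concern about conflict of interest",
     "publicly funded; free study drug only, arm's-length principle applied"),
]
for trial, judgement, rationale in [header] + rows:
    print(f"{trial:<8} | {judgement:<46} | {rationale}")
```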

A judgement of ‘notable concern about conflict of interest’ should be based on a considered assessment of identified conflicts of interest. The hypothetical possibility of undeclared conflicts of interest is, as a rule of thumb, not sufficient reason for ‘notable concern’. By ‘notable concern’ we mean important conflicts of interest expected to have a potential impact on study design, risk of bias in study results or risk of bias in a synthesis due to missing results. For example, financial conflicts of interest are important in a trial initiated, designed, analysed and reported by drug or device company employees. Conversely, financial conflicts of interest are less important in a trial initiated, designed, analysed and reported by academics adhering to the arm’s length principle when acquiring free trial medication from a drug company, and where the lead authors have no conflicts of interest. Similarly, non-financial conflicts of interest may be important in a trial of a highly controversial and ideologically loaded question, such as the adverse effects of male circumcision. Non-financial conflicts of interest are less concerning in a trial comparing two treatments in general use that has no connection to highly controversial scientific theories, ideologies or professional groups. Mixing trivial conflicts of interest with important ones may mask the latter and will expand review author workload considerably.

MECIR Box 7.8.a Relevant expectations for conduct of intervention reviews: addressing conflicts of interest in included trials.

7.9 Chapter information

Authors: Isabelle Boutron, Matthew J Page, Julian PT Higgins, Douglas G Altman, Andreas Lundh, Asbjørn Hróbjartsson

Acknowledgements: We thank Gerd Antes, Peter Gøtzsche, Peter Jüni, Steff Lewis, David Moher, Andrew Oxman, Ken Schulz, Jonathan Sterne and Simon Thompson for their contributions to previous versions of this chapter.

7.10 References

Ahn R, Woodbridge A, Abraham A, Saba S, Korenstein D, Madden E, Boscardin WJ, Keyhani S. Financial ties of principal investigators and randomized controlled trial outcomes: cross sectional study. BMJ 2017; 356 : i6770.

Als-Nielsen B, Chen W, Gluud C, Kjaergard LL. Association of funding and conclusions in randomized drug trials: a reflection of treatment effect or adverse events? JAMA 2003; 290 : 921-928.

Bero LA, Grundy Q. Why Having a (Nonfinancial) Interest Is Not a Conflict of Interest. PLoS Biology 2016; 14 : e2001221.

Blümle A, Meerpohl JJ, Schumacher M, von Elm E. Fate of clinical research studies after ethical approval--follow-up of study protocols until publication. PloS One 2014; 9 : e87184.

Boutron I, Dutton S, Ravaud P, Altman DG. Reporting and interpretation of randomized controlled trials with statistically nonsignificant results for primary outcomes. JAMA 2010; 303 : 2058-2064.

Chan A-W, Song F, Vickers A, Jefferson T, Dickersin K, Gøtzsche PC, Krumholz HM, Ghersi D, van der Worp HB. Increasing value and reducing waste: addressing inaccessible research. The Lancet 2014; 383 : 257-266.

Chan AW, Hróbjartsson A, Haahr MT, Gøtzsche PC, Altman DG. Empirical evidence for selective reporting of outcomes in randomized trials: comparison of protocols to published articles. JAMA 2004a; 291 : 2457-2465.

Chan AW, Krleža-Jeric K, Schmid I, Altman DG. Outcome reporting bias in randomized trials funded by the Canadian Institutes of Health Research. Canadian Medical Association Journal 2004b; 171 : 735-740.

da Costa BR, Beckett B, Diaz A, Resta NM, Johnston BC, Egger M, Jüni P, Armijo-Olivo S. Effect of standardized training on the reliability of the Cochrane risk of bias assessment tool: a prospective study. Systematic Reviews 2017; 6 : 44.

Dechartres A, Boutron I, Trinquart L, Charles P, Ravaud P. Single-center trials show larger treatment effects than multicenter trials: evidence from a meta-epidemiologic study. Annals of Internal Medicine 2011; 155 : 39-51.

Dechartres A, Trinquart L, Boutron I, Ravaud P. Influence of trial sample size on treatment effect estimates: meta-epidemiological study. BMJ 2013; 346 : f2304.

Dechartres A, Trinquart L, Faber T, Ravaud P. Empirical evaluation of which trial characteristics are associated with treatment effect estimates. Journal of Clinical Epidemiology 2016a; 77 : 24-37.

Dechartres A, Ravaud P, Atal I, Riveros C, Boutron I. Association between trial registration and treatment effect estimates: a meta-epidemiological study. BMC Medicine 2016b; 14 : 100.

Dechartres A, Trinquart L, Atal I, Moher D, Dickersin K, Boutron I, Perrodeau E, Altman DG, Ravaud P. Evolution of poor reporting and inadequate methods over time in 20 920 randomised controlled trials included in Cochrane reviews: research on research study. BMJ 2017; 357 : j2490.

Dechartres A, Atal I, Riveros C, Meerpohl J, Ravaud P. Association between publication characteristics and treatment effect estimates: A meta-epidemiologic study. Annals of Internal Medicine 2018.

Drazen JM, de Leeuw PW, Laine C, Mulrow C, DeAngelis CD, Frizelle FA, Godlee F, Haug C, Hébert PC, Horton R, Kotzin S, Marusic A, Reyes H, Rosenberg J, Sahni P, Van der Weyden MB, Zhaori G. Towards more uniform conflict disclosures: the updated ICMJE conflict of interest reporting form. BMJ 2010; 340 : c3239.

Duyx B, Urlings MJE, Swaen GMH, Bouter LM, Zeegers MP. Scientific citations favor positive results: a systematic review and meta-analysis. Journal of Clinical Epidemiology 2017; 88 : 92-101.

Easterbrook PJ, Berlin JA, Gopalan R, Matthews DR. Publication bias in clinical research. Lancet 1991; 337 : 867-872.

Estellat C, Ravaud P. Lack of head-to-head trials and fair control arms: randomized controlled trials of biologic treatment for rheumatoid arthritis. Archives of Internal Medicine 2012; 172 : 237-244.

Eyding D, Lelgemann M, Grouven U, Harter M, Kromp M, Kaiser T, Kerekes MF, Gerken M, Wieseler B. Reboxetine for acute treatment of major depression: systematic review and meta-analysis of published and unpublished placebo and selective serotonin reuptake inhibitor controlled trials. BMJ 2010; 341 : c4737.

Fanelli D, Costas R, Ioannidis JPA. Meta-assessment of bias in science. Proceedings of the National Academy of Sciences of the United States of America 2017; 114 : 3714-3719.

Franco A, Malhotra N, Simonovits G. Social science. Publication bias in the social sciences: unlocking the file drawer. Science 2014; 345 : 1502-1505.

Gates A, Vandermeer B, Hartling L. Technology-assisted risk of bias assessment in systematic reviews: a prospective cross-sectional evaluation of the RobotReviewer machine learning tool. Journal of Clinical Epidemiology 2018; 96 : 54-62.

Grundy Q, Dunn AG, Bourgeois FT, Coiera E, Bero L. Prevalence of Disclosed Conflicts of Interest in Biomedical Research and Associations With Journal Impact Factors and Altmetric Scores. JAMA 2018; 319 : 408-409.

Guyatt GH, Oxman AD, Vist GE, Kunz R, Falck-Ytter Y, Alonso-Coello P, Schünemann HJ. GRADE: an emerging consensus on rating quality of evidence and strength of recommendations. BMJ 2008; 336 : 924-926.

Hakoum MB, Jouni N, Abou-Jaoude EA, Hasbani DJ, Abou-Jaoude EA, Lopes LC, Khaldieh M, Hammoud MZ, Al-Gibbawi M, Anouti S, Guyatt G, Akl EA. Characteristics of funding of clinical trials: cross-sectional survey and proposed guidance. BMJ Open 2017; 7 : e015997.

Hartling L, Hamm MP, Milne A, Vandermeer B, Santaguida PL, Ansari M, Tsertsvadze A, Hempel S, Shekelle P, Dryden DM. Testing the risk of bias tool showed low reliability between individual reviewers and across consensus assessments of reviewer pairs. Journal of Clinical Epidemiology 2013; 66 : 973-981.

Higgins JPT, Altman DG, Gøtzsche PC, Jüni P, Moher D, Oxman AD, Savovic J, Schulz KF, Weeks L, Sterne JAC. The Cochrane Collaboration's tool for assessing risk of bias in randomised trials. BMJ 2011; 343 : d5928.

Hopewell S, Clarke M, Stewart L, Tierney J. Time to publication for results of clinical trials. Cochrane Database of Systematic Reviews 2007; 2 : MR000011.

Hopewell S, Boutron I, Altman D, Ravaud P. Incorporation of assessments of risk of bias of primary studies in systematic reviews of randomised trials: a cross-sectional study. BMJ Open 2013; 3 : 8.

Jefferson T, Jones MA, Doshi P, Del Mar CB, Hama R, Thompson MJ, Onakpoya I, Heneghan CJ. Risk of bias in industry-funded oseltamivir trials: comparison of core reports versus full clinical study reports. BMJ Open 2014; 4 : e005253.

Jones CW, Keil LG, Holland WC, Caughey MC, Platts-Mills TF. Comparison of registered and published outcomes in randomized controlled trials: a systematic review. BMC Medicine 2015; 13 : 282.

Jørgensen L, Paludan-Muller AS, Laursen DR, Savovic J, Boutron I, Sterne JAC, Higgins JPT, Hróbjartsson A. Evaluation of the Cochrane tool for assessing risk of bias in randomized clinical trials: overview of published comments and analysis of user practice in Cochrane and non-Cochrane reviews. Systematic Reviews 2016; 5 : 80.

Jüni P, Witschi A, Bloch R, Egger M. The hazards of scoring the quality of clinical trials for meta-analysis. JAMA 1999; 282 : 1054-1060.

Jüni P, Altman DG, Egger M. Systematic reviews in health care: Assessing the quality of controlled clinical trials. BMJ 2001; 323 : 42-46.

Kirkham JJ, Dwan KM, Altman DG, Gamble C, Dodd S, Smyth R, Williamson PR. The impact of outcome reporting bias in randomised controlled trials on a cohort of systematic reviews. BMJ 2010; 340 : c365.

Li G, Abbade LPF, Nwosu I, Jin Y, Leenus A, Maaz M, Wang M, Bhatt M, Zielinski L, Sanger N, Bantoto B, Luo C, Shams I, Shahid H, Chang Y, Sun G, Mbuagbaw L, Samaan Z, Levine MAH, Adachi JD, Thabane L. A systematic review of comparisons between protocols or registrations and full reports in primary biomedical research. BMC Medical Research Methodology 2018; 18 : 9.

Lo B, Field MJ, Institute of Medicine (US) Committee on Conflict of Interest in Medical Research Education and Practice. Conflict of Interest in Medical Research, Education, and Practice . Washington, D.C.: National Academies Press (US); 2009.

Lundh A, Lexchin J, Mintzes B, Schroll JB, Bero L. Industry sponsorship and research outcome. Cochrane Database of Systematic Reviews 2017; 2 : MR000033.

Mann H, Djulbegovic B. Comparator bias: why comparisons must address genuine uncertainties. Journal of the Royal Society of Medicine 2013; 106 : 30-33.

Marret E, Elia N, Dahl JB, McQuay HJ, Møiniche S, Moore RA, Straube S, Tramèr MR. Susceptibility to fraud in systematic reviews: lessons from the Reuben case. Anesthesiology 2009; 111 : 1279-1289.

Marshall IJ, Kuiper J, Wallace BC. RobotReviewer: evaluation of a system for automatically assessing bias in clinical trials. Journal of the American Medical Informatics Association 2016; 23 : 193-201.

McGauran N, Wieseler B, Kreis J, Schuler YB, Kolsch H, Kaiser T. Reporting bias in medical research - a narrative review. Trials 2010; 11 : 37.

Millard LA, Flach PA, Higgins JPT. Machine learning to assist risk-of-bias assessments in systematic reviews. International Journal of Epidemiology 2016; 45 : 266-277.

Moher D, Shamseer L, Cobey KD, Lalu MM, Galipeau J, Avey MT, Ahmadzai N, Alabousi M, Barbeau P, Beck A, Daniel R, Frank R, Ghannad M, Hamel C, Hersi M, Hutton B, Isupov I, McGrath TA, McInnes MDF, Page MJ, Pratt M, Pussegoda K, Shea B, Srivastava A, Stevens A, Thavorn K, van Katwyk S, Ward R, Wolfe D, Yazdi F, Yu AM, Ziai H. Stop this waste of people, animals and money. Nature 2017; 549 : 23-25.

Montedori A, Bonacini MI, Casazza G, Luchetta ML, Duca P, Cozzolino F, Abraha I. Modified versus standard intention-to-treat reporting: are there differences in methodological quality, sponsorship, and findings in randomized trials? A cross-sectional study. Trials 2011; 12 : 58.

Morgan AJ, Ross A, Reavley NJ. Systematic review and meta-analysis of Mental Health First Aid training: Effects on knowledge, stigma, and helping behaviour. PloS One 2018; 13 : e0197102.

Morrison A, Polisena J, Husereau D, Moulton K, Clark M, Fiander M, Mierzwinski-Urban M, Clifford T, Hutton B, Rabb D. The effect of English-language restriction on systematic review-based meta-analyses: a systematic review of empirical studies. International Journal of Technology Assessment in Health Care 2012; 28 : 138-144.

Norris SL, Burda BU, Holmer HK, Ogden LA, Fu R, Bero L, Schunemann H, Deyo R. Author's specialty and conflicts of interest contribute to conflicting guidelines for screening mammography. Journal of Clinical Epidemiology 2012; 65 : 725-733.

Odutayo A, Emdin CA, Hsiao AJ, Shakir M, Copsey B, Dutton S, Chiocchia V, Schlussel M, Dutton P, Roberts C, Altman DG, Hopewell S. Association between trial registration and positive study findings: cross sectional study (Epidemiological Study of Randomized Trials-ESORT). BMJ 2017; 356 : j917.

Page MJ, Higgins JPT. Rethinking the assessment of risk of bias due to selective reporting: a cross-sectional study. Systematic Reviews 2016; 5 : 108.

Page MJ, Higgins JPT, Clayton G, Sterne JAC, Hróbjartsson A, Savović J. Empirical evidence of study design biases in randomized trials: systematic review of meta-epidemiological studies. PloS One 2016; 11 : 7.

Page MJ, McKenzie JE, Higgins JPT. Tools for assessing risk of reporting biases in studies and syntheses of studies: a systematic review. BMJ Open 2018; 8 : e019703.

Patel SV, Yu D, Elsolh B, Goldacre BM, Nash GM. Assessment of conflicts of interest in robotic surgical studies: validating author's declarations with the open payments database. Annals of Surgery 2018; 268 : 86-92.

Polanin JR, Tanner-Smith EE, Hennessy EA. Estimating the difference between published and unpublished effect sizes: a meta-review. Review of Educational Research 2016; 86 : 207-236.

Rasmussen K, Schroll J, Gøtzsche PC, Lundh A. Under-reporting of conflicts of interest among trialists: a cross-sectional study. Journal of the Royal Society of Medicine 2015; 108 : 101-107.

Riechelmann RP, Wang L, O'Carroll A, Krzyzanowska MK. Disclosure of conflicts of interest by authors of clinical trials and editorials in oncology. Journal of Clinical Oncology 2007; 25 : 4642-4647.

Rising K, Bacchetti P, Bero L. Reporting bias in drug trials submitted to the Food and Drug Administration: review of publication and presentation. PLoS Medicine 2008; 5 : e217.

Riveros C, Dechartres A, Perrodeau E, Haneef R, Boutron I, Ravaud P. Timing and completeness of trial results posted at ClinicalTrials.gov and published in journals. PLoS Medicine 2013; 10 : e1001566.

Rothenstein JM, Tomlinson G, Tannock IF, Detsky AS. Company stock prices before and after public announcements related to oncology drugs. Journal of the National Cancer Institute 2011; 103 : 1507-1512.

Safer DJ. Design and reporting modifications in industry-sponsored comparative psychopharmacology trials. Journal of Nervous and Mental Disease 2002; 190 : 583-592.

Saini P, Loke YK, Gamble C, Altman DG, Williamson PR, Kirkham JJ. Selective reporting bias of harm outcomes within studies: findings from a cohort of systematic reviews. BMJ 2014; 349 : g6501.

Sampson M, Barrowman NJ, Moher D, Klassen TP, Pham B, Platt R, St John PD, Viola R, Raina P. Should meta-analysts search Embase in addition to Medline? Journal of Clinical Epidemiology 2003; 56 : 943-955.

Savović J, Jones HE, Altman DG, Harris RJ, Jüni P, Pildal J, Als-Nielsen B, Balk EM, Gluud C, Gluud LL, Ioannidis JPA, Schulz KF, Beynon R, Welton NJ, Wood L, Moher D, Deeks JJ, Sterne JAC. Influence of reported study design characteristics on intervention effect estimates from randomized, controlled trials. Annals of Internal Medicine 2012; 157 : 429-438.

Scherer RW, Meerpohl JJ, Pfeifer N, Schmucker C, Schwarzer G, von Elm E. Full publication of results initially presented in abstracts. Cochrane Database of Systematic Reviews 2018; 11 : MR000005.

Schmid CH. Outcome Reporting Bias: A Pervasive Problem in Published Meta-analyses. American Journal of Kidney Diseases 2016; 69 : 172-174.

Schmucker C, Schell LK, Portalupi S, Oeller P, Cabrera L, Bassler D, Schwarzer G, Scherer RW, Antes G, von Elm E, Meerpohl JJ. Extent of non-publication in cohorts of studies approved by research ethics committees or included in trial registries. PloS One 2014; 9 : e114023.

Schulz KF, Chalmers I, Hayes RJ, Altman DG. Empirical evidence of bias. Dimensions of methodological quality associated with estimates of treatment effects in controlled trials. JAMA 1995; 273 : 408-412.

Shawwa K, Kallas R, Koujanian S, Agarwal A, Neumann I, Alexander P, Tikkinen KA, Guyatt G, Akl EA. Requirements of Clinical Journals for Authors’ Disclosure of Financial and Non-Financial Conflicts of Interest: A Cross Sectional Study. PloS One 2016; 11 : e0152301.

Sterne JAC. Why the Cochrane risk of bias tool should not include funding source as a standard item [editorial]. Cochrane Database of Systematic Reviews 2013; 12 : ED000076.

Tramèr MR, Reynolds DJ, Moore RA, McQuay HJ. Impact of covert duplicate publication on meta-analysis: a case study. BMJ 1997; 315 : 635-640.

Turner RM, Spiegelhalter DJ, Smith GC, Thompson SG. Bias modelling in evidence synthesis. Journal of the Royal Statistical Society Series A, (Statistics in Society) 2009; 172 : 21-47.

Urrutia G, Ballesteros M, Djulbegovic B, Gich I, Roque M, Bonfill X. Cancer randomized trials showed that dissemination bias is still a problem to be solved. Journal of Clinical Epidemiology 2016; 77 : 84-90.

Vedula SS, Li T, Dickersin K. Differences in reporting of analyses in internal company documents versus published trial reports: comparisons in industry-sponsored trials in off-label uses of gabapentin. PLoS Medicine 2013; 10 : e1001378.

Viswanathan M, Carey TS, Belinson SE, Berliner E, Chang SM, Graham E, Guise JM, Ip S, Maglione MA, McCrory DC, McPheeters M, Newberry SJ, Sista P, White CM. A proposed approach may help systematic reviews retain needed expertise while minimizing bias from nonfinancial conflicts of interest. Journal of Clinical Epidemiology 2014; 67 : 1229-1238.

Welton NJ, Ades AE, Carlin JB, Altman DG, Sterne JAC. Models for potentially biased evidence in meta-analysis using empirically based priors. Journal of the Royal Statistical Society: Series A (Statistics in Society) 2009; 172 : 119-136.

Wieland LS, Berman BM, Altman DG, Barth J, Bouter LM, D'Adamo CR, Linde K, Moher D, Mullins CD, Treweek S, Tunis S, van der Windt DA, Zwarenstein M, Witt C. Rating of Included Trials on the Efficacy-Effectiveness Spectrum: development of a new tool for systematic reviews. Journal of Clinical Epidemiology 2017; 84 .

Wieseler B, Kerekes MF, Vervoelgyi V, McGauran N, Kaiser T. Impact of document type on reporting quality of clinical drug trials: a comparison of registry reports, clinical study reports, and journal publications. BMJ 2012; 344 : d8141.

Wood L, Egger M, Gluud LL, Schulz K, Jüni P, Altman DG, Gluud C, Martin RM, Wood AJG, Sterne JAC. Empirical evidence of bias in treatment effect estimates in controlled trials with different interventions and outcomes: meta-epidemiological study. BMJ 2008; 336 : 601-605.

Zarin DA, Tse T, Williams RJ, Carr S. Trial Reporting in ClinicalTrials.gov - The Final Rule. New England Journal of Medicine 2016; 375 : 1998-2004.



Systematic Reviews: Minimize Bias


Minimizing Bias


Multiple types of bias may impact health evidence. The Cochrane Handbook for Systematic Reviews of Interventions (Table 7.2.a) 1 provides definitions of non-reporting biases that can be minimized by identifying all relevant literature on a research topic.

Bias in Locating Studies

  • Publication bias: the publication or non-publication of research findings, depending on the nature and direction of the results, i.e., “the selective publication of manuscripts based on the magnitude or direction of the study results.” 2
  • Time-lag bias: the rapid or delayed publication of research findings, depending on the nature and direction of the results, i.e., “[t]ime-lag bias occurs when the results of negative trials take substantially longer to publish than positive trials.” 3
  • Language bias: the publication of research findings in a particular language, depending on the nature and direction of results; language bias “introduces the risk of ignoring key data… as well as missing important cultural contexts, which may limit the review’s findings and usefulness.” 4
  • Citation bias: the citation or non-citation of research findings, depending on the nature and direction of the results. Citation bias occurs during citation searching for related publications to include in the review; bias may be introduced by “selective inclusion of statistically significant studies with effect sizes similar to other published studies retrieved from database searching.” 5
  • Multiple publication bias: the multiple or singular publication of research findings. When “studies are published in more than one journal to maximize readership and impact of study findings,” they may inadvertently be included in the systematic review more than once. 6
  • Location bias: the publication of research findings in journals with different ease of access or levels of indexing in standard databases, depending on the nature and direction of results.
  • Non-reporting (selective reporting) bias: the selective reporting of some outcomes or analyses, but not others, depending on the nature and direction of the results, i.e., “[s]elective reporting bias…, the incomplete publication of outcomes measured or of analyses performed in a study, may lead to the over- or underestimation of treatment effects or harms.” 7

References & Recommended Reading

1. Boutron I, Page MJ, Higgins JP, Altman DG, Lundh A, Hróbjartsson A. Considering bias and conflicts of interest among the included studies. In: Higgins JPT, Thomas J, Chandler J, et al., eds. Cochrane Handbook for Systematic Reviews of Interventions. Version 6.2: Cochrane; 2021.

2. Montori VM, Smieja M, Guyatt GH. Publication bias: a brief review for clinicians. Mayo Clinic Proceedings. 2000;75(12):1284-1288.

3. Reyes MM, Panza KE, Martin A, Bloch MH. Time-lag bias in trials of pediatric antidepressants: a systematic review and meta-analysis. Journal of the American Academy of Child and Adolescent Psychiatry. 2011;50(1):63-72.

4. Stern C, Kleijnen J. Language bias in systematic reviews: you only get out what you put in. JBI Evidence Synthesis. 2020;18(9).

5. Vassar M, Johnson AL, Sharp A, Wayant C. Citation bias in otolaryngology systematic reviews. Journal of the Medical Library Association: JMLA. 2021;109(1):62-67.

6. Fairfield CJ, Harrison EM, Wigmore SJ. Duplicate publication bias weakens the validity of meta-analysis of immunosuppression after transplantation. World Journal of Gastroenterology. 2017;23(39):7198-7200.

7. Reid EK, Tejani AM, Huan LN, et al. Managing the incidence of selective reporting bias: a survey of Cochrane review groups. Systematic Reviews. 2015;4:85.


Systematic Reviews: Reporting the quality/risk of bias


Apply grading criteria to your selected studies: PRISMA Item 12

When reading the full text of each article identified for inclusion in the review, you may wish to apply one of the following scales or assessment tools to each study. Choose a method that best fits your type of review; before selecting, read other reviews written by or for subject-matter experts in your discipline, field or profession, because you will want grading criteria recognized and used by your peers. Report the quality/risk-of-bias scale you used in your Methods section, and report the grade or level of quality you assign to each study either summarized in the Results section or as an extra column in your study-characteristics table.

  • GRADE - Grading of Recommendations Assessment, Development and Evaluation GRADE has two levels: strong and weak recommendations. It is a tool for judging the body of evidence as a whole.
  • RoB 2: Cochrane risk-of-bias tool for randomized trials. The tool is structured into five domains through which bias might be introduced into the result; each result is judged as high risk of bias, some concerns, or low risk of bias. The assessment is specific to a single trial result, i.e., an estimate of the relative effect of two interventions or intervention strategies on a particular outcome.
  • ROBINS-I: Cochrane risk-of-bias tool for non-randomised studies of interventions. The tool has seven domains for appraising non-randomized observational studies, such as cohort and case-control studies in which intervention groups are allocated during the course of usual treatment decisions, or quasi-randomised studies in which the method of allocation falls short of full randomisation. The evaluation is expressed as one of five judgements, from low risk of bias to critical risk of bias, or no information on which to base a judgement.
  • CEBM Levels of Evidence, Oxford Centre for Evidence Based Medicine Grades of Recommendation are A-D based on a table evaluating levels of evidence for five types of review: Therapy/Prevention, Aetiology/Harm; Prognosis; Diagnosis; Differential diagnosis/symptom prevalence study; and Economic and decision analyses
  • JADAD - Jadad quality assessment scale for rating Randomized Controlled Trials. The higher the score, the higher the quality; this permits an author to rank the studies in a review. The scale was introduced in Jadad, A., Moore, R., Carroll, D., Jenkinson, C., Reynolds, D., Gavaghan, D., & McQuay, H. (1996). Assessing the quality of reports of randomized clinical trials: is blinding necessary? Controlled Clinical Trials, 17(1), 1-12.
  • NOS - Newcastle-Ottawa quality assessment scale for case-control studies In the Newcastle-Ottawa Scale (NOS) the reviewer assigns a star rating to case-control studies in three areas: selection, comparability, and exposure.
  • AHRQ checklist for Risk of Bias assessment in Comparative Effectiveness Reviews. Unlike the methods described above, this checklist does not compute a score but rather can be used to ensure assumptions and limitations are understood and taken into account when interpreting validity and generalizability. Risk of Bias assessment for AHRQ Evidence-based Practice Centers (EPCs) for assessing the quality of studies included in Comparative Effectiveness reviews.
  • CHEC-list for methodological quality in Cost Effectiveness Analyses & Comparative Effectiveness Reviews This checklist does not compute a score but rather can be used to ensure assumptions and limitations are understood and taken into account when interpreting validity and generalizability. See the justifying article at http://journals.cambridge.org/action/displayFulltext?type=1&pdftype=1&fid=292675&jid=THC&volumeId=21&issueId=02&aid=292673
  • QUADAS-2 QUality Assessment tool for Diagnostic Accuracy Studies QUADAS is for assessing the quality of diagnostic accuracy studies. Use this tool when you are following the STARD guidelines for systematic reviews of diagnostic studies.
  • AGREE-II Appraisal of Guidelines for Research and Evaluation AGREE is a tool for assessing the quality of Clinical Practice Guidelines. It is not to be used to appraise journal articles reporting the results of clinical trials but instead it is a quality assessment instrument for evaluating or deciding which guidelines could be recommended for use in practice or to inform health policy decisions.
  • National Guideline Clearinghouse Extent of Adherence to Trustworthy Standards (NEATS) Instrument A 15-item instrument using scales of 1-5 to evaluate a guideline's adherence to the Institute of Medicine's standard for trustworthy guidelines. It has good external validity among guideline developers and good interrater reliability across trained reviewers.
  • The Navigation Guide: Environmental Health/Toxicology studies A method for evaluating the evidence about environmental contaminants and their potential effects on reproductive and developmental health. The GRADE method considers only human experimental and observational evidence. The Navigation Guide also rates studies of laboratory animals and other nonhuman streams of evidence. The result is one of five possible statements about the overall strength of the evidence pertaining to a particular environmental exposure: “known to be toxic,” “probably toxic,” “possibly toxic,” “not classifiable,” or “probably not toxic” to reproductive or developmental health. The link is to the appendix of Woodruff, T., & Sutton, P. (2011). An Evidence-Based Medicine Methodology To Bridge The Gap Between Clinical And Environmental Health Sciences. Health Affairs, 30(5), 931-937.
  • NutriGrade: A Scoring System to Assess and Judge the Meta-Evidence of Randomized Controlled Trials and Cohort Studies in Nutrition Research. NutriGrade is for evaluating the quality of evidence in human nutrition research.
  • Risk of Bias assessment instruments from the CLARITY Group at McMaster University, including: a tool to assess risk of bias in cohort studies; a tool to assess risk of bias in case-control studies; a tool to assess risk of bias in randomized controlled trials; a tool to assess risk of bias in longitudinal symptom research studies aimed at the general population; and a risk of bias instrument for cross-sectional surveys of attitudes and practices.
  • Critical Appraisal Worksheets, Oxford Centre for Evidence Based Medicine (CEBM). Used to evaluate the quality of individual systematic reviews, diagnostic studies, prognosis studies, randomised controlled trials (RCTs), or qualitative studies.

Evaluating and Critically Appraising Systematic Reviews

  • AMSTAR: a measurement tool to assess the methodological quality of systematic reviews. AMSTAR does not compute a score, but it can be used to double-check the content validity of analytical reviews of randomized studies; it is an 11-item checklist for evaluating the methodological quality of systematic reviews of randomized controlled trials (RCTs). The equivalent tool for examining systematic reviews of non-randomized/observational studies is RoBANS.
  • RoBANS: Risk of Bias Assessment Tool for Nonrandomized Studies. RoBANS does not compute a score but can be used when examining systematic reviews of non-randomized/observational studies. The tool looks at six domains of study methodology: the selection of participants, confounding variables, the measurement of exposure, the blinding of outcome assessments, incomplete outcome data, and selective outcome reporting. The equivalent tool for examining systematic reviews of randomized controlled trials (RCTs) is AMSTAR.
  • ROBIS Tool. ROBIS is a tool for assessing the risk of bias in systematic reviews (rather than in primary studies). The target audience of ROBIS is primarily guideline developers and authors of overviews of systematic reviews ("reviews of reviews").


Assessment of publication bias, selection bias, and unavailable data in meta-analyses using individual participant data: a database survey

  • Ikhlaaq Ahmed , postgraduate student 1 ,
  • Alexander J Sutton , professor of medical statistics 2 ,
  • Richard D Riley , senior lecturer in medical statistics 3
  • 1 MRC Midlands Hub for Trials Methodology Research, School of Health and Population Sciences, University of Birmingham, Birmingham B15 2TT, UK
  • 2 Department of Health Sciences, University of Leicester, Leicester LE1 7RH, UK
  • 3 School of Health and Population Sciences, University of Birmingham
  • Correspondence to: R D Riley r.d.riley{at}bham.ac.uk
  • Accepted 7 November 2011

Objective To examine the potential for publication bias, data availability bias, and reviewer selection bias in recently published meta-analyses that use individual participant data and to investigate whether authors of such meta-analyses seemed aware of these issues.

Design In a database of 383 meta-analyses of individual participant data that were published between 1991 and March 2009, we surveyed the 31 most recent meta-analyses of randomised trials that examined whether an intervention was effective. Identification of relevant articles and data extraction were undertaken by one author and checked by another.

Results Only nine (29%) of the 31 meta-analyses included individual participant data from “grey literature” (such as unpublished studies) in their primary meta-analysis, and the potential for publication bias was discussed or investigated in just 10 (32%). Sixteen (52%) of the 31 meta-analyses did not obtain all the individual participant data requested, yet five of these (31%) did not mention this as a potential limitation, and only six (38%) examined how trials without individual participant data might affect the conclusions. In nine (29%) of the meta-analyses reviewer selection bias was a potential issue, as the identification of relevant trials was either not stated or based on a more selective, non-systematic approach. Investigation of four meta-analyses containing data from ≥10 trials revealed one with an asymmetric funnel plot consistent with publication bias, and the inclusion of studies without individual participant data revealed additional heterogeneity between trials.

Conclusions Publication, availability, and selection biases are a potential concern for meta-analyses of individual participant data, but many reviewers neglect to examine or discuss them. These issues warn against uncritically viewing any meta-analysis that uses individual participant data as the most reliable. Reviewers should seek individual participant data from all studies identified by a systematic review; include, where possible, aggregate data from any studies lacking individual participant data to consider their potential impact; and investigate funnel plot asymmetry in line with recent guidelines.

Introduction

Meta-analysis combines the quantitative evidence from related studies to summarise a whole body of research on a particular clinical question, such as whether a treatment is effective. A known threat to the validity of meta-analysis is publication bias, which occurs when studies with statistically significant or clinically favourable results are more likely to be published than studies with non-significant or unfavourable results. 1 2 3 4 Other related biases exist on the continuum towards publication, 5 such as time lag bias 6 7 (where studies with unfavourable findings take longer to be published), language bias 8 (where non-English language articles are more likely to be rewritten in English if they report significant results), and selective outcome reporting 9 (where non-significant study outcomes are entirely excluded on publication). All these biases lead to meta-analyses which synthesise an incomplete set of the evidence and produce summary results potentially biased towards favourable treatment effects. 10 11

Methods to detect publication related biases and assess their potential impact have been well documented for meta-analyses that use extracted aggregated study results (such as treatment effect estimates). 2 4 12 13 14 However, there are relatively few articles that consider biases for meta-analyses that use individual participant data, 15 16 17 18 where the raw, individual level data are obtained for each study and used for synthesis. Individual participant data can be considered the original source material, and—as it allows trial results to be derived directly and independent of study reporting—it (theoretically at least) has potential to reduce publication bias in meta-analysis, especially when it is obtained for unpublished trials. 18 For this, and many other reasons documented previously in the BMJ , 16 meta-analyses using individual participant data are generally considered the most reliable approach to evidence synthesis, 15 19 20 21 22 but this does not guarantee they are bias-free.

When reviewers identify and seek individual participant data from only published trials, publication related biases can affect the subsequent analysis. Burdett et al 23 found that meta-analyses of individual participant data tended to give more favourable treatment effects when excluding data from trials in the “grey literature” (that is, unpublished trials, trials published in non-English language journals, and trials reported as meeting abstracts, book chapters, and letters). But publication related biases are not the only mechanism that may cause an incomplete and potentially biased set of evidence within meta-analyses of individual participant data; two further concerns are data availability bias and reviewer selection bias.

Data availability bias may occur if individual participant data are unavailable for some studies and their unavailability is related to the study results. 24 As with publication bias, this situation leads to a set of available studies that do not reflect the entire evidence base. The impact of availability bias is hard to predict. If researchers of studies with non-significant or clinically unimportant results are more likely to have destroyed or lost their individual participant data, this will bias meta-analyses toward a favourable treatment effect. Conversely, if researchers of studies with favourable findings do not provide their individual participant data because they want to use them further—for example, to examine subgroup effects or an extended follow-up—this may lead to meta-analyses being biased towards a lower treatment effect.

Reviewer selection bias can occur if reviewers deliberately seek only individual participant data from a subset of existing studies and this subset does not reflect the entire evidence base. 25 This is a particular concern when relevant studies are not identified by a systematic review but rather through contacts or friends in their research field, and when the selection takes place with knowledge of individual study results. The impact of selection bias on a given meta-analysis could vary, and may (directly or indirectly) be affected by the selectors’ knowledge of the subject, their research contacts and existing collaborations, and their informed opinion about the research question of interest. Note that agreement to pool individual participant data before knowing the results of studies is less of a concern, and collaborations towards meta-analysis beginning at the onset of individual studies have been encouraged under the term “prospective meta-analysis.” 26

The aim of this article was to survey recently published meta-analyses of individual participant data to empirically examine the potential for publication bias, data availability bias, and reviewer selection bias. We then investigated whether the authors of the meta-analyses seemed aware of these issues. We have used two case studies from our survey to show how such biases may affect clinical conclusions.

Identification and classification of relevant articles

We used an existing database of 383 meta-analyses of individual participant data published between 1991 and March 2009. This database was established using a systematic review of published articles in Medline, Embase, and the Cochrane Library as described elsewhere, 16 that aimed to identify all published meta-analyses of individual participant data. We searched the database to identify recent meta-analyses of randomised controlled trials. We focused on meta-analyses published between 2007 and March 2009 that aimed to establish whether an intervention was effective. Articles synthesising observational studies or a mixture of randomised trials and observational studies were excluded, as were those synthesising randomised trials but not evaluating an intervention effect (such as those investigating development of a prognostic model).

We decided a priori that a sample of about 30 articles would be suitable for uncovering whether the aforementioned biases are a concern and whether authors raise awareness of them. Using the article abstracts, IA classified all articles as a “meta-analysis of randomised trials,” “unclear,” or “not relevant.” IA took all the meta-analysis articles published in 2008 and 2009 and kept randomly sampling additional articles from 2007 until we had a total of 30 articles describing a meta-analysis of randomised trials. RDR then checked all these classifications; any discrepancies between IA and RDR were resolved by discussion between all authors. Any articles classed as “unclear” by IA were discussed by all authors and a final classification decision made.

IA then obtained the full text of those articles classed as a meta-analysis of randomised trials and further classed each as “evaluating an intervention,” “unclear,” or “not evaluating an intervention.” As before, these classifications were checked by RDR and discrepancies resolved via discussion between all authors. This resulted in a final set of relevant articles.

Data extraction

For each relevant article IA read the full publication and extracted information to answer the following questions:

How did the reviewers identify the trials for which individual participant data were sought?

Did the authors seek to obtain grey literature trials, and how many (if any) were included in their primary meta-analysis?

What proportion of requested trials actually gave their individual participant data?

If relevant, what reasons were given as to why trials did not provide individual participant data?

If relevant, was the potential impact of trials not providing individual participant data considered in the primary meta-analysis and, if so, how and what was concluded? If not, was data availability bias raised as a potential concern?

Was the potential for publication bias considered in the primary meta-analysis and, if so, how and what was concluded? If not, was publication bias even discussed as a potential concern?

All extracted information was checked by either RDR or AJS.

Statistical assessment of publication bias

For each meta-analysis article containing at least 10 trials, we aimed to examine the potential for publication bias by using a contour enhanced funnel plot and a statistical test for asymmetry. A contour enhanced funnel plot displays trial treatment effect estimates (x axis) against some measure of their precision such as standard error (y axis). When no publication bias is present the plot should show a funnel-like shape, with estimates spanning down from the larger trials symmetrically in both directions with increasing variability. Asymmetry in a funnel plot (also known as small study effects 27 ) is potentially indicative of publication biases, 13 but other sources of heterogeneity may also induce asymmetry in a funnel plot. 13 If there is asymmetry and studies are perceived to be missing in those contour regions of non-statistical significance, there is greater likelihood that the asymmetry is due to publication bias. For each funnel plot, we chose a test for asymmetry in accordance with recent recommendations, 13 and a P value <0.10 was taken to indicate statistical evidence of asymmetry.
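For readers who want to reproduce this kind of check, below is a minimal sketch of Egger's regression test in Python; the effect estimates and standard errors are hypothetical placeholders, not data from any meta-analysis discussed here.

```python
# A minimal sketch of Egger's regression test for funnel plot asymmetry.
# Inputs are hypothetical log odds ratios and their standard errors, one
# per trial; they are placeholders, not data from the surveyed reviews.
import numpy as np
import statsmodels.api as sm

def egger_test(effects, ses):
    """Regress the standardized effect (effect/SE) on precision (1/SE).
    An intercept far from zero indicates funnel plot asymmetry
    (small study effects)."""
    effects = np.asarray(effects, dtype=float)
    ses = np.asarray(ses, dtype=float)
    standardized = effects / ses
    precision = 1.0 / ses
    X = sm.add_constant(precision)          # columns: [intercept, precision]
    fit = sm.OLS(standardized, X).fit()
    return fit.params[0], fit.pvalues[0]    # intercept and its p value

# Hypothetical trials: effect estimates shrink as precision grows,
# a pattern consistent with asymmetry.
log_ors = [1.10, 0.95, 0.80, 0.55, 0.40, 0.35, 0.30, 0.28, 0.25, 0.24]
ses     = [0.60, 0.55, 0.45, 0.35, 0.28, 0.22, 0.18, 0.15, 0.12, 0.10]
intercept, p = egger_test(log_ors, ses)
print(f"Egger intercept={intercept:.2f}, P={p:.3f}")  # P<0.10 flags asymmetry
```

Note that Egger's test is only one of the recommended options; for binary outcomes with large effects, alternatives such as Peters' test may be preferred, in line with the recommendations cited above.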

The first author classified 73 articles until 30 were deemed a meta-analysis of individual participant data from randomised trials. The second and third authors checked all these 73 article classifications, and subsequently articles with a primary objective to evaluate an intervention were also identified. This produced a final set of 31 articles that were deemed relevant and included in our in-depth assessment below. 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 The flow chart for the selection and classification of articles is shown in fig 1.

Fig 1 Flow chart for identification of relevant articles describing meta-analyses using individual participant data from randomised trials that evaluated an intervention


Publication bias

In our survey, nine of the 31 articles mentioned seeking individual participant data from trials in the grey literature; all nine reported success and included grey literature data in their primary meta-analysis. For the remaining 22 articles (71%) publication bias is more of a concern, as 20 sought individual participant data only from fully published trials and in the other two it was unclear whether individual participant data from grey literature was included.

Despite the threat of publication bias, only 10 of the 31 articles discussed (9 articles) or examined statistically (1 article) the potential for publication bias in their meta-analysis. Seven of these 10 articles infer that the threat of publication bias was low. For example, De Backer et al comprehensively use “a range of techniques to find unpublished trials: searches of trials registers, contacts with other researchers and contact with the manufacturing company. None of these approaches revealed evidence of unpublished trials … this is in contrast with documented publication bias for other products in this field.” 51

Data availability bias

In our survey, 30 of the 31 articles stated the number of trials providing individual participant data out of the total number requested. Fourteen of the 30 reported obtaining data from all the trials requested; the mean percentage of trial data obtained was 87% and the median was 91% (range was 60–100%). Fourteen of the 30 reported obtaining individual participant data for fewer than 90% of the trials requested, and 10 (33%) of the 30 articles reported obtaining less than 80%. The reasons for unavailability of individual participant data included trial data being lost or destroyed and trial authors not being contactable, unwilling to collaborate, or unable to send their data.

Availability bias is thus a potential concern in the 16 meta-analyses that did not have all the individual participant data requested. Twelve of these reported the percentage of total patients across all trials covered by the individual participant data obtained. This ranged from 66% to 98%, with five (42%) of the 12 analyses obtaining individual participant data for less than 80% of the total patients. The proportion of the total events covered by the available individual participant data was rarely reported.

Five of the 16 articles with unavailable individual participant data (31%) never mention availability bias as a potential limitation, and only six (38%) examine statistically how trials lacking individual participant data might affect the conclusions of the meta-analyses presented. All these six conclude that including the trials lacking individual participant data does not change the statistical or clinical conclusions. For example, Vale et al obtained aggregate results for three of their 10 trials with missing individual data and conclude: “incorporating them into the meta-analysis did not materially change the results.” 34

Reviewer selection bias

In our survey, 22 of the 31 articles performed a systematic review to identify all relevant trials, for which individual participant data were then requested; selection bias is thus not a concern in these articles. However, in the other nine articles (29%) selection bias is a potential issue, as the identification of relevant trials was either not stated (six articles) or based on a selective, non-systematic approach (three articles). For example, Sakamoto et al state that they used a “meticulous search” to identify the five trials in their meta-analysis, 54 but this search is not described. In contrast, Papakostas et al clearly include only eligible studies sponsored by GlaxoSmithKline, but note the potential for other trials in their Methods (“To our knowledge, only two other studies comparing bupropion with an SSRI were not included”) and their Discussion (“it is quite possible that studies sponsored by other sources have been conducted but have not been yet published or presented at major scientific meetings”). 55

Detailed investigation of biases

There were eight meta-analyses that contained 10 or more trials, and in four of these we could extract suitable information to investigate funnel plot asymmetry (potential publication bias). A test for asymmetry was significant (P<0.1) in one, 50 and non-significant in the other three. 34 47 52 We now take two of these (one without asymmetry 47 and the one with asymmetry 50 ) to show our funnel plot assessments in detail and to demonstrate a possible approach for dealing with trials lacking individual participant data.

High dose chemotherapy for treatment of non-Hodgkin’s lymphoma

Greb et al reviewed whether high dose chemotherapy with autologous stem cell transplantation as part of first line treatment improves survival in adults with aggressive non-Hodgkin’s lymphoma. 47 By a systematic review, they identified 15 randomised trials comparing high dose versus conventional chemotherapy. They sought individual participant data from all 15 trials, so selection bias is not a concern. However, publication and availability biases are a threat, as all the trials were fully published and individual participant data were unavailable for five of them (33%). Greb et al examined both these issues, 47 and we now summarise their work and extend it by examining contour enhanced funnel plots.

A fixed effect meta-analysis of the 10 trials with individual participant data gives a summary hazard ratio of 1.14 (95% confidence interval 0.98 to 1.34; I²=4%), providing weak evidence that high dose chemotherapy modestly increases the hazard of death over time (top part of fig 2). To investigate availability bias, Greb et al managed to extract hazard ratio estimates for four of the five trials lacking individual participant data. 47 An updated, fixed effect meta-analysis of the 14 trials (10 with individual participant data, four without) now gives a summary hazard ratio of 1.05 (0.92 to 1.19; I²=30%), slightly closer to the null value of 1, since the trials without individual participant data have treatment effect estimates more favourable towards high dose chemotherapy than the trials with individual participant data (bottom part of fig 2), though nearly all trial confidence intervals overlap 1 (the value of no treatment effect). An alternative random effects analysis gives the same conclusion.

Fig 2  Fixed effect meta-analysis by Greb et al, 47 which compared high dose with conventional chemotherapy for survival of patients with aggressive non-Hodgkin’s lymphoma: of the 14 trials included in the analysis, 10 provided individual participant data, and four provided only aggregate results
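As a companion to the pooled numbers above, here is a minimal sketch of inverse-variance fixed effect pooling on the log hazard ratio scale, with Cochran's Q and Higgins' I²; the example inputs are hypothetical placeholders, not the trial results from Greb et al.

```python
# A minimal sketch of fixed effect (inverse-variance) meta-analysis of
# log hazard ratios; inputs are hypothetical placeholder values.
import numpy as np

def fixed_effect_pool(log_hrs, ses):
    log_hrs = np.asarray(log_hrs, dtype=float)
    ses = np.asarray(ses, dtype=float)
    w = 1.0 / ses**2                              # inverse-variance weights
    pooled = np.sum(w * log_hrs) / np.sum(w)
    se_pooled = np.sqrt(1.0 / np.sum(w))
    # Cochran's Q and Higgins' I^2 quantify between-trial heterogeneity
    q = np.sum(w * (log_hrs - pooled) ** 2)
    df = len(log_hrs) - 1
    i2 = 100.0 * max(0.0, (q - df) / q) if q > 0 else 0.0
    lo, hi = pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled
    return np.exp(pooled), (np.exp(lo), np.exp(hi)), i2

# Hypothetical log hazard ratios and standard errors for five trials
hr, ci, i2 = fixed_effect_pool([0.18, 0.05, 0.22, -0.04, 0.10],
                               [0.15, 0.20, 0.25, 0.18, 0.12])
print(f"HR={hr:.2f}, 95% CI {ci[0]:.2f} to {ci[1]:.2f}, I2={i2:.0f}%")
```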

To investigate potential publication bias, we consider the contour enhanced funnel plot of the 10 trials with individual participant data plus the four trials lacking individual participant data (fig 3). Visually, this shows only minor asymmetry (with or without inclusion of the studies lacking individual participant data), and Egger’s test for asymmetry is not significant (P=0.14). Thus a publication bias mechanism is not a major cause for concern here. 13

Fig 3  Contour enhanced funnel plot of the 14 trials included in the meta-analysis of Greb et al 47 : 10 trials provided individual participant data, four provided only aggregate results

In summary, the consideration of aggregate data from studies not providing individual participant data and the investigation of publication bias have strengthened the original clinical conclusion from the analysis of individual participant data only that high dose chemotherapy does not affect overall survival. Publication bias does not seem to pose a threat to this meta-analysis, and the pooled effect estimate moves slightly closer to 1 when those studies for which individual participant data were not available are considered.

Early glycoprotein IIb/IIIa inhibitors in primary angioplasty

De Luca et al performed a meta-analysis of individual participant data from randomised trials to evaluate the benefits of early versus late use of glycoprotein IIb/IIIa inhibitors in patients undergoing primary angioplasty for ST segment elevation myocardial infarction. 50 The primary angiographic end point was whether patients achieved preprocedural Thrombolysis in Myocardial Infarction Study (TIMI) grade 3 flow. A systematic review identified 14 relevant trials, and individual participant data were sought from them all, so selection bias is not a concern. However, availability and publication biases are a threat, as individual participant data were unavailable for three trials (21%), and all 11 trials providing individual participant data were fully published. 50 De Luca et al did not consider statistically the potential impact of studies lacking individual participant data and did not investigate publication bias. We now extend their work accordingly.

A random effects meta-analysis of the 11 trials with individual participant data gives an odds ratio of 2.06 (1.48 to 2.86), with a 95% prediction interval for the odds ratio in an individual clinical setting from 1.03 to 4.89 (fig 4); this indicates that early use of glycoprotein IIb/IIIa inhibitors was associated with a significantly improved TIMI grade 3 flow. To investigate availability bias, we managed to extract odds ratios for two of the three trials not providing individual participant data (fig 5). 56 57 Including them alongside the 11 studies with individual participant data in an updated random effects meta-analysis (fig 4) has a minimal impact on the summary odds ratio estimate (2.02 (1.45 to 2.81)) but increases the extent of between-trial heterogeneity (I²=40%), leading to a 95% prediction interval which now includes 1 (0.85 to 4.81), implying early use may not be superior in every clinical setting. 58

Fig 4 Random effects meta-analysis of the 11 studies with individual participant data considered by De Luca et al 50 (evaluating the effects of early or late use of glycoprotein IIb/IIIa inhibitors for patients achieving TIMI grade 3 flow after primary angioplasty) with investigation of the impact of two additional studies lacking individual participant data

Fig 5 Contour enhanced funnel plot for the 11 trials considered in the meta-analysis of individual participant data by De Luca et al 50 plus two trials lacking individual participant data. The solid line indicates the summary result from a meta-analysis of just individual participant data trials (odds ratio 2.06); the dotted line indicates the summary result from a meta-analysis of individual participant data combined with aggregate data from two studies lacking individual participant data (odds ratio 2.02)
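The prediction interval reported above can be computed from a standard DerSimonian-Laird random effects fit; the sketch below follows the prediction interval formula of Riley et al (reference 58), again with hypothetical inputs rather than the De Luca et al data.

```python
# A minimal sketch of DerSimonian-Laird random effects pooling with a
# 95% prediction interval (Riley et al, BMJ 2011). Inputs are hypothetical
# log odds ratios and standard errors; at least three trials are needed
# for the t distribution with k-2 degrees of freedom.
import numpy as np
from scipy import stats

def random_effects_with_pi(log_ors, ses):
    y = np.asarray(log_ors, dtype=float)
    se = np.asarray(ses, dtype=float)
    w = 1.0 / se**2
    mu_fixed = np.sum(w * y) / np.sum(w)
    q = np.sum(w * (y - mu_fixed) ** 2)
    c = np.sum(w) - np.sum(w**2) / np.sum(w)
    tau2 = max(0.0, (q - (len(y) - 1)) / c)       # DL between-trial variance
    w_star = 1.0 / (se**2 + tau2)
    mu = np.sum(w_star * y) / np.sum(w_star)
    se_mu = np.sqrt(1.0 / np.sum(w_star))
    ci = (mu - 1.96 * se_mu, mu + 1.96 * se_mu)
    # 95% prediction interval for the effect in a new clinical setting
    t = stats.t.ppf(0.975, df=len(y) - 2)
    half = t * np.sqrt(tau2 + se_mu**2)
    pi = (mu - half, mu + half)
    return tuple(np.exp(v) for v in (mu, *ci, *pi))

or_, lo, hi, pi_lo, pi_hi = random_effects_with_pi(
    [0.9, 0.7, 1.1, 0.4, 0.6], [0.30, 0.25, 0.40, 0.20, 0.35])
print(f"OR={or_:.2f} (95% CI {lo:.2f}-{hi:.2f}; 95% PI {pi_lo:.2f}-{pi_hi:.2f})")
```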

To investigate potential publication bias, we examined the contour enhanced funnel plot of the 11 trials with individual participant data plus two trials lacking individual participant data (fig 5). This shows asymmetry, with small studies systematically having larger effect sizes than the larger studies (Peters’ test for asymmetry, P=0.016). This potentially suggests missing studies on the (bottom) left hand side of the plot. Since such studies would predominantly be in the region of statistical non-significance close to an odds ratio of 1 (that is, no difference between early and late use of glycoprotein IIb/IIIa inhibitors) or less than 1 (that is, early use is not beneficial), this adds strength to the notion that publication bias mechanisms may be operating here, biasing the meta-analysis result in favour of early use. Indeed, when we use a regression method to adjust for this asymmetry, 14 59 the adjusted summary odds ratio is 1.18 (0.79 to 1.76) and non-significant. The asymmetry remains (P=0.045) even when the FINESSE-ANGIO trial 57 is removed, which De Luca et al suggested was of lower quality than the other trials in the meta-analysis. 50
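The regression-based adjustment cited here (references 14 and 59) has several variants; one simple version, sketched below with placeholder inputs, extrapolates a weighted regression of effect on variance back to a hypothetical trial with zero variance.

```python
# A minimal sketch of one regression-based adjustment for small study
# effects: weighted regression of effect on variance, extrapolated to an
# ideal trial with zero variance (cf. Moreno et al). Inputs are placeholders.
import numpy as np
import statsmodels.api as sm

def regression_adjusted_effect(effects, ses):
    y = np.asarray(effects, dtype=float)
    var = np.asarray(ses, dtype=float) ** 2
    X = sm.add_constant(var)                      # effect = a + b * variance
    fit = sm.WLS(y, X, weights=1.0 / var).fit()
    return fit.params[0]                          # intercept a = adjusted effect

log_ors = [1.10, 0.95, 0.80, 0.55, 0.40, 0.35, 0.30, 0.28, 0.25, 0.24]
ses     = [0.60, 0.55, 0.45, 0.35, 0.28, 0.22, 0.18, 0.15, 0.12, 0.10]
print(f"Adjusted OR={np.exp(regression_adjusted_effect(log_ors, ses)):.2f}")
```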

In conclusion, although De Luca et al performed a thorough systematic review (that included searching conference abstracts) and clearly raise awareness that trials lacking individual participant data were excluded, our investigations reveal additional heterogeneity when studies lacking individual participant data are included and an asymmetric funnel plot consistent with publication bias. These issues were not identified in the original publication by De Luca et al. 50 In light of this, we recommend further research is needed to identify the causes of heterogeneity (perhaps factors such as study quality and study definitions of “early”) and establish whether they contribute to the asymmetric nature of the plot.

Though they can be time-consuming and expensive, meta-analyses of individual participant data have considerable potential advantages over a traditional meta-analysis of extracted aggregate data. 16 These include the ability to use consistent inclusion-exclusion criteria and statistical methods in each trial; to use up-to-date follow-up information, which is potentially longer than that used in the original trial publications; to obtain results for unpublished or poorly reported outcomes; and to increase power to detect differential treatment effects (that is, subgroup effects, treatment-covariate interactions). For these reasons, meta-analysis of individual participant data is increasingly popular, with an average of 49 published a year between 2005 and 2009. 16

However, our survey of existing meta-analyses of individual participant data from randomised trials shows that individual participant data from the grey literature are often not included, individual participant data are commonly unavailable, and a selective, non-systematic approach is sometimes used to identify relevant trials. These problems raise the threat of publication, availability, and selection biases, respectively, but many reviewers neglect to examine or discuss them. Such shortcomings warn against uncritically accepting all meta-analyses of individual participant data as optimal without due thought as to how the data were chosen, whether data from unpublished studies were obtained, and whether data were obtained from all studies requested.

Strengths and limitations of study

We recognise that our survey contained only a modest sample of 31 meta-analyses of individual participant data and that, as we did not question review authors directly, methodological deficiencies identified in the meta-analyses are impossible to disentangle from their reporting standards (for example, some reviewers may have investigated publication bias but not reported this). However, we consider our findings sufficient to show that there needs to be greater recognition and investigation of potential biases in meta-analysis of individual participant data.

Recommendations for avoiding and assessing bias in meta-analyses

In a text box we make recommendations for dealing with biases in meta-analyses of individual participant data. All such endeavours should be clearly reported in the publication describing the meta-analysis according to recent reporting guidelines. 16 Though we have focused on meta-analyses of randomised trials, such guidance is also relevant to syntheses of individual participant data from observational studies. 17 For example, funnel plot asymmetry has been shown in a meta-analysis by the Emerging Risk Factors Collaboration (ERFC), which included individual participant data from 31 studies of cardiovascular disease. 17 Further, in a meta-analysis of individual participant data from studies of prognostic factors in lung cancer by Trivella et al, 60 10 of the 38 research groups contacted did not provide the individual participant data requested.

Recommendations for avoiding and assessing publication related biases, data availability bias, and reviewer selection bias in individual participant data meta-analyses

Meta-analyses of individual participant data should ideally be informed by a rigorous systematic review that searches for both published and unpublished studies

Researchers should seek individual participant data for all relevant studies identified (or at least those of highest quality)

When some individual participant data cannot be obtained, the impact of this on meta-analysis conclusions should be investigated by means of including the aggregate data from the studies lacking individual participant data, 24 65 66 though this may not always be possible (for example, if suitable aggregate data are not available or if individual participant data are required for complex statistical modelling)

This is especially important when the number of studies with individual participant data is small or the proportion of individual participant data missing is large (for example, when individual participant data for >10% of trials or >10% of patients or events in all the trials are unavailable)

Where the inclusion of studies lacking individual participant data seems to have an important statistical or clinical impact, it may be helpful to compare the characteristics of the studies with individual participant data and of those without and to see if there are any key differences (such as in their quality, follow-up length, statistical methods, etc)

The potential for publication bias should be considered, with assessment of funnel plot asymmetry (with and without studies lacking individual participant data) adhering to the guidelines published recently in the BMJ 13

Our survey found that most (71%) articles do not include individual participant data from the grey literature, emphasising why obtaining individual participant data does not automatically remove the potential for publication bias in meta-analysis. It was disappointing to find that grey literature was sought in only 29% of the meta-analyses. In reviews that use extracted aggregated study results there is a similar problem: Song et al found that grey literature was explicitly sought in only 50% of treatment reviews, 30% of diagnostic reviews, 32% of risk-factor reviews, and 8% of genetic reviews, and furthermore, although 34% of 300 reviews explicitly searched for grey literature, only 13% included them. 11

The potential for publication bias should thus be examined wherever possible in meta-analyses of individual participant data (box ). In particular, assessment of funnel plot asymmetry, and thus potential publication bias, should be routinely used in meta-analyses synthesising 10 or more trials, and we refer readers to more detailed guidelines in the BMJ about this. 13 Our survey shows that funnel plot investigations are currently rare in meta-analyses of individual participant data and publication bias is often not even discussed. Publication bias is also often neglected in standard meta-analyses of aggregate data: for instance, a recent review found that only 7 of 75 Cochrane reviews investigated publication bias or explained why not, 61 and the wider review by Song et al found that potential publication bias was discussed more often in genetic reviews (70%) than in treatment reviews (32%), diagnostic reviews (48%), and risk factor reviews (42%). 11

For any meta-analysis, the aim should be to obtain individual participant data or suitable aggregate data for all trials rather than selecting a potentially biased subset. 62 Meta-analyses need to be inclusive rather than exclusive; for example, a meta-analysis of individual participant data by the Early Breast Cancer Trialists’ Collaborative Group involved over 400 named collaborators, 63 who commendably provided individual participant data for 42 000 women from 78 randomised treatment comparisons. To avoid reviewer selection bias, meta-analyses should ideally be informed by rigorous systematic reviews that search for published and unpublished studies, and we encourage researchers to seek individual participant data for all relevant studies identified (or at least those of highest quality). The possible exception to this is for trials where suitable aggregate data can already be extracted from trial publications, as, other things being equal (such as length of follow-up, number of included patients, etc), such aggregate data will be sufficient 64 and so individual participant data are not needed. 16 However, because of the advantages of having individual participant data, reviewers aiming to use individual participant data will usually prefer to obtain individual participant data for as many trials as possible.

Our survey found that 33% of the meta-analyses between 2007 and 2009 obtained less than 80% of the individual participant data requested. This builds on earlier reviews of availability of individual participant data, 20 24 which found that 24% of 175 meta-analyses published up to 2005 obtained less than 80% of the individual participant data requested. 24 Thus, there is no indication that availability of individual participant data is improving over time, though we note that the UK MRC Clinical Trials Unit seems more consistently successful. 18 When reviewers are unsuccessful in obtaining individual participant data for some trials, it does not necessarily follow that a meta-analysis of the subset of trials with individual participant data is more desirable than a meta-analysis using suitable aggregate data from all trials. Indeed, the reviewers face a conundrum: the meta-analysis of individual participant data may be prone to data availability bias, but the meta-analysis of aggregate data from all trials may be limited by, for example, shorter follow-up time and inconsistent inclusion criteria and statistical methods in each study (the very reasons why the individual participant data were originally sought).

In such situations we recommend that, ideally, all synthesis options are reported and each of their limitations noted: that is, the meta-analysis of individual participant data from a subset of trials, the meta-analysis of aggregate data from all trials, and a meta-analysis that combines the individual participant data with the aggregate data from the trials lacking individual participant data. The last approach has been recommended to allow reviewers to investigate the potential impact of trials lacking individual participant data on the conclusions from the meta-analysis of individual participant data, 24 and this was illustrated in our two detailed examples (figs 2-5), where we obtained suitable aggregate data from trials lacking individual participant data and added them to the meta-analyses and funnel plot assessments. Statistical approaches that synthesise both individual participant data and aggregate data are potentially valuable, 24 65 66 though we recognise the extraction and inclusion of aggregate data become more difficult when going beyond the overall treatment effect, such as the assessment of differential treatment effects across individuals, 65 and may only serve to amplify why individual participant data were desired.
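As a concrete illustration of this last approach, the sketch below reduces each trial with individual participant data to a log odds ratio and standard error (a two-stage analysis) and pools them with aggregate results extracted for the trials lacking individual participant data; all counts and estimates are hypothetical placeholders.

```python
# A minimal two-stage sketch of combining individual participant data (IPD)
# with aggregate data: each IPD trial is reduced to a log odds ratio and
# standard error from its 2x2 counts (assumed non-zero), then pooled with
# aggregate-only results. All numbers are hypothetical placeholders.
import numpy as np

def log_or_from_counts(events_trt, n_trt, events_ctl, n_ctl):
    a, b = events_trt, n_trt - events_trt        # treatment events/non-events
    c, d = events_ctl, n_ctl - events_ctl        # control events/non-events
    return np.log((a * d) / (b * c)), np.sqrt(1/a + 1/b + 1/c + 1/d)

# Stage 1: derive estimates from the trials providing IPD
ipd = [log_or_from_counts(30, 100, 18, 100),
       log_or_from_counts(25, 80, 15, 85)]
# Stage 2: append (log OR, SE) pairs extracted from publications of the
# trials that did not provide IPD
aggregate_only = [(0.45, 0.21), (0.10, 0.30)]
y, se = map(np.array, zip(*(ipd + aggregate_only)))
w = 1.0 / se**2
pooled = np.sum(w * y) / np.sum(w)               # fixed effect pooling
print(f"Combined OR={np.exp(pooled):.2f}")
```

A one-stage analysis that models the raw IPD directly alongside aggregate likelihood contributions is also possible, but the two-stage reduction shown here is the simpler and more common route.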

It may also be worth comparing the characteristics (such as quality) of studies lacking individual participant data and those with individual participant data. For example, McCormack et al compared hernia trials that provided individual participant data with those not providing such data and concluded: “Other than the availability of unpublished data, there were no clear differences in trial characteristics between those with or without individual participant data.” 67 We also did not identify any clear differences in the characteristics of studies with or without individual participant data in our two detailed examples (figs 2 and 4), but a broader investigation of any differences in a wide range of fields would be informative. In situations where there are differences (such as the studies lacking individual participant data being of poorer quality or having different inclusion criteria, statistical methods, etc) this may lead to different summary results and increased between-trial heterogeneity in a meta-analysis combining studies with and without individual participant data compared with an analysis of individual participant data only. In a sensitivity analysis reviewers could investigate whether any indication of bias (such as different sizes of estimates from studies with individual participant data and from those without, or evidence of funnel plot asymmetry, etc) remains when studies with individual participant data are standardised to match those lacking individual participant data as far as possible (such as in terms of length of follow-up, statistical analysis methods, inclusion criteria, etc).

Finally, we recognise it is clearly best to prevent biases occurring in the first place, so we strongly support calls for data sharing 68 and transparency of research through study protocols, study registers, 1 69 and complete reporting. 70

What is already known on this topic

Publication related biases hide relevant trials and their results, and potentially lead to meta-analyses being biased toward favourable treatment effects

This problem has received little attention in meta-analyses that use individual participant data

What this study adds

A survey of 31 meta-analyses of individual participant data from randomised trials published between 2007 and 2009 reveals that only 29% included trials from “grey literature” (such as unpublished trials or trials published only as conference abstracts), thus publication bias is still a concern for many meta-analyses, but this was often not discussed by authors

A third of the meta-analyses obtained less than 80% of the individual participant data requested, making them susceptible to data availability bias, but this was often not considered by authors

In 29% of the meta-analyses identification of relevant trials was either not stated or based on a selective, non-systematic approach, raising the possibility of reviewer selection bias

Cite this as: BMJ 2012;344:d7762

Contributorship: RDR conceived and supervised the study, alongside AJS. IA identified relevant articles for the survey, checked by RDR and AJS. IA performed data extractions and initial meta-analyses, checked by RDR and AJS. RDR obtained the aggregate data from the two non-individual participant data trials in example 2 and extended the meta-analysis accordingly. IA drafted the first version of the article, and this was revised by RDR and AJS.

Funding: IA is funded by the MRC Midlands Hub for Trials Methodology Research, of which RDR is its deputy director.

Competing interests: All authors have completed the ICMJE uniform disclosure form at www.icmje.org/coi_disclosure.pdf (available on request from the corresponding author) and declare: no support from any organisation for the submitted work; no financial relationships with any organisations that might have an interest in the submitted work in the previous three years; no other relationships or activities that could appear to have influenced the submitted work.

Ethical approval: Not required.

Data sharing: No additional data available.

This is an open-access article distributed under the terms of the Creative Commons Attribution Non-commercial License, which permits use, distribution, and reproduction in any medium, provided the original work is properly cited, the use is non commercial and is otherwise in compliance with the license. See: http://creativecommons.org/licenses/by-nc/2.0/ and http://creativecommons.org/licenses/by-nc/2.0/legalcode .

References

1. Simes RJ. Publication bias: the case for an international registry of clinical trials. J Clin Oncol 1986;4:1529-41.
2. Sterne JA, Egger M, Smith GD. Systematic reviews in health care: investigating and dealing with publication and other biases in meta-analysis. BMJ 2001;323:101-5.
3. Sutton AJ, Duval SJ, Tweedie RL, Abrams KR, Jones DR. Empirical assessment of effect of publication bias on meta-analyses. BMJ 2000;320:1574-7.
4. Rothstein HR, Sutton AJ, Borenstein M, eds. Publication bias in meta-analysis. Wiley, 2005.
5. Smith R. What is publication? A continuum. BMJ 1999;318:142.
6. Ioannidis JPA. Effect of the statistical significance of results on the time to completion and publication of randomized efficacy trials. JAMA 1998;279:281-6.
7. Clarke M, Stewart LA. Time lag bias in publishing clinical trials [letter]. JAMA 1998;279:1952-3.
8. Egger M, Zellweger-Zahner T, Schneider M, Junker C, Lengeler C, Antes G. Language bias in randomised controlled trials published in English and German. Lancet 1997;350:326-9.
9. Kirkham JJ, Dwan KM, Altman DG, Gamble C, Dodd S, Smyth R, et al. The impact of outcome reporting bias in randomised controlled trials on a cohort of systematic reviews. BMJ 2010;340:c365.
10. Turner EH, Matthews AM, Linardatos E, Tell RA, Rosenthal R. Selective publication of antidepressant trials and its influence on apparent efficacy. N Engl J Med 2008;358:252-60.
11. Song F, Parekh S, Hooper L, Loke YK, Ryder J, Sutton AJ. Dissemination and publication of research findings: an updated review of related biases. Health Technol Assess 2010;14:iii,ix-xi,1-193.
12. Egger M, Davey Smith G, Schneider M, Minder C. Bias in meta-analysis detected by a simple, graphical test. BMJ 1997;315:629-34.
13. Sterne JAC, Sutton AJ, Ioannidis JPA, Terrin N, Jones DR, Lau J, et al. Recommendations for examining and interpreting funnel plot asymmetry in meta-analyses of randomised controlled trials. BMJ 2011;342:d4002.
14. Moreno SG, Sutton AJ, Turner EH, Abrams KR, Cooper NJ, Palmer TM, et al. Novel methods to deal with publication biases: secondary analysis of antidepressant trials in the FDA trial registry database and related journal publications. BMJ 2009;339:b2981.
15. Stewart LA, Tierney JF. To IPD or not to IPD? Advantages and disadvantages of systematic reviews using individual patient data. Eval Health Prof 2002;25:76-97.
16. Riley RD, Lambert PC, Abo-Zaid G. Meta-analysis of individual participant data: conduct, rationale and reporting. BMJ 2010;340:c221.
17. Riley RD. Commentary: Like it and lump it? Meta-analysis using individual participant data. Int J Epidemiol 2010;39:1359-61.
18. Stewart L, Tierney J, Burdett S. Do systematic reviews based on individual patient data offer a means of circumventing biases associated with trial publications? In: Rothstein HR, Sutton AJ, Borenstein M, eds. Publication bias in meta-analysis: prevention, assessment and adjustments. John Wiley, 2006.
19. Broeze KA, Opmeer BC, van der Veen F, Bossuyt PM, Bhattacharya S, Mol BW. Individual patient data meta-analysis: a promising approach for evidence synthesis in reproductive medicine. Hum Reprod Update 2010;16:561-7.
20. Simmonds MC, Higgins JPT, Stewart LA, Tierney JF, Clarke MJ, Thompson SG. Meta-analysis of individual patient data from randomized trials: a review of methods used in practice. Clin Trials 2005;2:209-17.
21. Zheng MH, Shi KQ, Fan YC, Chen YP. Meta-analysis using individual participant data is the gold standard for diagnostic studies. Hepatology 2011;53:1062 [author reply 1062-3].
22. Stewart LA, Parmar MK. Meta-analysis of the literature or of individual patient data: is there a difference? Lancet 1993;341:418-22.
23. Burdett S, Stewart LA, Tierney JF. Publication bias and meta-analyses: a practical example. Int J Technol Assess Health Care 2003;19:129-34.
24. Riley RD, Simmonds MC, Look MP. Evidence synthesis combining individual patient data and aggregate data: a systematic review identified current practice and possible methods. J Clin Epidemiol 2007;60:431-9.
25. Clarke MJ, Stewart LA. Obtaining data from randomised controlled trials: how much do we need for reliable and informative meta-analyses? In: Chalmers I, Altman DG, eds. Systematic reviews. BMJ Publishing, 1995:37-47.
26. Margitic SE, Morgan TM, Sager MA, Furberg CD. Lessons learned from a prospective meta-analysis. J Am Geriatr Soc 1995;43:435-9.
27. Sterne JA, Gavaghan D, Egger M. Publication and related bias in meta-analysis: power of statistical tests and prevalence in the literature. J Clin Epidemiol 2000;53:1119-29.
28. Shepperd S, Doll H, Broad J, Gladman J, Iliffe S, Langhorne P, et al. Early discharge hospital at home. Cochrane Database Syst Rev 2009;(1):CD000356.
29. Rosenfeld E, Beyerlein A, Hadders-Algra M, Kennedy K, Singhal A, Fewtrell M, et al. IPD meta-analysis shows no effect of LC-PUFA supplementation on infant growth at 18 months. Acta Paediatr 2009;98:91-7.
30. Geddes JR, Calabrese JR, Goodwin GM. Lamotrigine for treatment of bipolar depression: independent meta-analysis and meta-regression of individual patient data from five randomised trials. Br J Psychiatry 2009;194:4-9.
31. Cranney A, Wells GA, Yetisir E, Adami S, Cooper C, Delmas PD, et al. Ibandronate for the prevention of nonvertebral fractures: a pooled analysis of individual patient data. Osteoporosis Int 2009;20:291-7.
32. Young J, De Sutter A, Merenstein D, van Essen GA, Kaiser L, Varonen H, et al. Antibiotics for adults with clinically diagnosed acute rhinosinusitis: a meta-analysis of individual patient data. Lancet 2008;371:908-14.
33. Wang D, Connock M, Barton P, Fry-Smith A, Aveyard P, Moore D. ‘Cut down to quit’ with nicotine replacement therapies in smoking cessation: a systematic review of effectiveness and economic analysis. Health Technol Assess 2008;12:i-156.
34. Vale C, Tierney JF, Stewart LA, Brady M, Dinshaw K, Jakobsen A, et al. Reducing uncertainties about the effects of chemoradiotherapy for cervical cancer: a systematic review and meta-analysis of individual patient data from 18 randomized trials. J Clin Oncol 2008;26:5802-12.
35. Uronis HE, Currow DC, McCrory DC, Samsa GP, Abernethy AP. Oxygen for relief of dyspnoea in mildly- or non-hypoxaemic patients with cancer: a systematic review and meta-analysis. Br J Cancer 2008;98:294-9.
36. Shepperd S, Doll H, Angus RM, Clarke MJ, Iliffe S, Kalra L, et al. Admission avoidance hospital at home. Cochrane Database Syst Rev 2008;(4):CD007491.
37. Pollack MH, Endicott J, Liebowitz M, Russell J, Detke M, Spann M, et al. Examining quality of life in patients with generalized anxiety disorder: clinical relevance and response to duloxetine treatment. J Psychiatr Res 2008;42:1042-9.
38. Pignon JP, Tribodet H, Scagliotti GV, Douillard JY, Shepherd FA, Stephens RJ, et al. Lung adjuvant cisplatin evaluation: a pooled analysis by the LACE collaborative group. J Clin Oncol 2008;26:3552-9.
39. Piccart-Gebhart MJ, Burzykowski T, Buyse M, Sledge G, Carmichael J, Luck HJ, et al. Taxanes alone or in combination with anthracyclines as first-line therapy of patients with metastatic breast cancer. J Clin Oncol 2008;26:1980-6.
40. Pearse RM, Belsey JD, Cole JN, Bennett ED. Effect of dopexamine infusion on mortality following major surgery: individual patient data meta-regression analysis of published clinical trials. Crit Care Med 2008;36:1323-9.
41. Papakostas GI, Trivedi MH, Alpert JE, Seifert CA, Krishen A, Goodale EP, et al. Efficacy of bupropion and the selective serotonin reuptake inhibitors in the treatment of anxiety symptoms in major depressive disorder: a meta-analysis of individual patient data from 10 double-blind, randomized clinical trials. J Psychiatr Res 2008;42:134-40.
42. Mehta SR, Boden WE, Eikelboom JW, Flather M, Steg PG, Avezum A, et al. Antithrombotic therapy with fondaparinux in relation to interventional management strategy in patients with ST- and non-ST-segment elevation acute coronary syndromes: an individual patient-level combined analysis of the fifth and sixth Organization to Assess Strategies in Ischemic Syndromes (OASIS 5 and 6) randomized trials. Circulation 2008;118:2038-46.
43. Koopman L, Hoes AW, Glasziou PP, Appelman CL, Burke P, McCormick DP, et al. Antibiotic therapy to prevent the development of asymptomatic middle ear effusion in children with acute otitis media: a meta-analysis of individual patient data. Arch Otolaryngol Head Neck Surg 2008;134:128-32.
44. Harris ST, Blumentals WA, Miller PD. Ibandronate and the risk of non-vertebral and clinical fractures in women with postmenopausal osteoporosis: results of a meta-analysis of phase III studies. Curr Med Res Opin 2008;24:237-45.
45. Halkes PHA, Gray LJ, Bath PMW, Diener HC, Guiraud-Chaumeil B, Yatsu FM, et al. Dipyridamole plus aspirin versus aspirin alone in secondary prevention after TIA or stroke: a meta-analysis by risk. J Neurol Neurosurg Psychiatry 2008;79:1218-23.
46. Early Breast Cancer Trialists' Collaborative Group (EBCTCG). Adjuvant chemotherapy in oestrogen-receptor-poor breast cancer: patient-level meta-analysis of randomised trials. Lancet 2008;371:29-40.
47. Greb A, Bohlius J, Schiefer D, Schwarzer G, Schulz H, Engert A. High-dose chemotherapy with autologous stem cell transplantation in the first line treatment of aggressive non-Hodgkin lymphoma (NHL) in adults. Cochrane Database Syst Rev 2008;(1):CD004024.
48. Fruh M, Rolland E, Pignon JP, Seymour L, Ding K, Tribodet H, et al. Pooled analysis of the effect of age on adjuvant cisplatin-based chemotherapy for completely resected non-small-cell lung cancer. J Clin Oncol 2008;26:3573-81.
49. Ford AC, Moayyedi P, Jarbol DE, Logan RFA, Delaney BC. Meta-analysis: Helicobacter pylori ‘test and treat’ compared with empirical acid suppression for managing dyspepsia. Aliment Pharmacol Ther 2008;28:534-44.
50. De Luca G, Gibson CM, Bellandi F, Murphy S, Maioli M, Noc M, et al. Early glycoprotein IIb-IIIa inhibitors in primary angioplasty (EGYPT) cooperation: an individual patient data meta-analysis. Heart 2008;94:1548-58.
51. De Backer TLM, Vander Stichele R, Lehert P, Van Bortel L. Naftidrofuryl for intermittent claudication. Cochrane Database Syst Rev 2008;(2):CD001368.
52. Burdett S, Stephens R, Stewart L, Tierney J, Auperin A, Le CT, et al. Chemotherapy in addition to supportive care improves survival in advanced non-small-cell lung cancer: a systematic review and meta-analysis of individual patient data from 16 randomized controlled trials. J Clin Oncol 2008;26:4617-25.
53. Timmer JR, Ottervanger JP, De Boer MJ, Boersma E, Grines CL, Westerhout CM, et al. Primary percutaneous coronary intervention compared with fibrinolysis for myocardial infarction in diabetes mellitus: results from the primary coronary angioplasty vs thrombolysis-2 trial. Arch Intern Med 2007;167:1353-9.
54. Sakamoto J, Hamada C, Yoshida S, Kodaira S, Yasutomi M, Kato T, et al. An individual patient data meta-analysis of adjuvant therapy with uracil-tegafur (UFT) in patients with curatively resected rectal cancer. Br J Cancer 2007;96:1170-7.
55. Papakostas GI, Montgomery SA, Thase ME, Katz JR, Krishen A, Tucker VL. Comparing the rapidity of response during treatment of major depressive disorder with bupropion and the SSRIs: a pooled survival analysis of 7 double-blind, randomized clinical trials. J Clin Psychiatry 2007;68:1907-12.
56. Lee DP, Herity NA, Hiatt BL, Fearon WF, Rezaee M, Carter AJ, et al. Adjunctive platelet glycoprotein IIb/IIIa receptor inhibition with tirofiban before primary angioplasty improves angiographic outcomes: results of the TIrofiban Given in the Emergency Room before Primary Angioplasty (TIGER-PA) pilot trial. Circulation 2003;107:1497-501.
57. Prati F, Petronio S, Van Boven AJ, Tendera M, De Luca L, de Belder MA, et al. Evaluation of infarct-related coronary artery patency and microcirculatory function after facilitated percutaneous primary coronary angioplasty: the FINESSE-ANGIO (Facilitated Intervention with Enhanced Reperfusion Speed to Stop Events-Angiographic) study. JACC Cardiovasc Interv 2010;3:1284-91.
58. Riley RD, Higgins JP, Deeks JJ. Interpretation of random effects meta-analyses. BMJ 2011;342:d549.
59. Moreno SG, Sutton AJ, Ades AE, Stanley TD, Abrams KR, Peters JL, et al. Assessment of regression-based methods to adjust for publication bias through a comprehensive simulation study. BMC Med Res Methodol 2009;9:2.
60. Trivella M, Pezzella F, Pastorino U, Harris AL, Altman DG. Microvessel density as a prognostic factor in non-small-cell lung carcinoma: a meta-analysis of individual patient data. Lancet Oncol 2007;8:488-99.
61. Riley RD, Gates SG, Neilson J, Alfirevic Z. Statistical methods used within Cochrane pregnancy and childbirth reviews: a review found improvements are necessary. J Clin Epidemiol 2011;64:608-18.
62. Higgins JPT, Green S, eds. Cochrane handbook for systematic reviews of interventions. John Wiley, 2008.
63. Early Breast Cancer Trialists' Collaborative Group (EBCTCG). Effects of radiotherapy and of differences in the extent of surgery for early breast cancer on local recurrence and 15-year survival: an overview of the randomised trials. Lancet 2005;366:2087-106.
64. Mathew T, Nordstrom K. On the equivalence of meta-analysis using literature and using individual patient data. Biometrics 1999;55:1221-3.
65. Riley RD, Lambert PC, Staessen JA, Wang J, Gueyffier F, Thijs L, et al. Meta-analysis of continuous outcomes combining individual patient data and aggregate data. Stat Med 2008;27:1870-93.
66. Sutton AJ, Kendrick D, Coupland CA. Meta-analysis of individual- and aggregate-level data. Stat Med 2008;27:651-69.
67. McCormack K, Scott N, Grant A. Are trials with individual patient data available different from trials without individual patient data available? 9th Annual Cochrane Colloquium Abstracts, Lyon, 2001.
68. Vickers AJ. Making raw data more widely available. BMJ 2011;342:d2323.
69. Horton R, Smith R. Time to register randomised trials. The case is now unanswerable. BMJ 1999;319:865-6.
70. Schulz KF, Altman DG, Moher D. CONSORT 2010 statement: updated guidelines for reporting parallel group randomised trials. BMJ 2010;340:c332.

bias in literature review

DistillerSR Logo

Types of Bias in Systematic Reviews


Bias can be introduced into a study at any stage of the process – from formulating the research question and establishing the eligibility criteria for inclusion and exclusion of primary studies, to reviewing the collected resources and choosing which findings to publish. The hallmark of a systematic review is a reduced risk of bias, but systematic reviews are not fully immune to it. The strengths and weaknesses of a systematic review depend on how well the reviewer addresses these potential errors. Let us look at the types of bias that can creep into a review at each of its stages.

Bias In Study Design

This kind of bias arises in the first step: formulating the review design and protocol. Bias can enter through the way the author frames the research question, often due to insufficient knowledge of the field. The author could, for example, decide to include only males in the study, wrongly assuming that no previous studies have been conducted on females. Other errors arise from an inefficient search strategy – for example, when reviewers impose arbitrary search limiters such as geographical region or year of publication. Such limiters produce a biased sample set because the search fails to collect all the available evidence.

Selection Bias

This kind of bias is introduced while collecting the primary resources for the study. If the collection of resources is not exhaustive, it can lead to over- or underestimation of the results. The search for a systematic review must therefore cover all available resources, including grey literature. Personal bias can also be introduced by the reviewers in charge of selecting the primary studies. To avoid this, the eligibility criteria for including and excluding studies must be clearly stated. Most of the known errors in systematic reviews arise in the selection and publication stages.

Publication Bias

An author or publisher may decline to publish a study whose results are negative or not statistically significant; this is called publication bias. Such outcomes may hold little interest for the publisher, yet they may have serious clinical implications.

Selective Outcome Reporting

Selective outcome reporting occurs when only a subset of the outcomes a study measured – typically the statistically significant or favourable ones – is reported, while the rest are omitted. Comparing the outcomes reported in a paper against those pre-specified in its protocol or registration is the standard way to detect this form of bias.


Lack Of A Risk Of Bias Assessment

When primary studies are selected for inclusion, the risk of bias in each of them has to be assessed. Failure to critically appraise each primary study can result in the accumulation of bias in the final outcomes of the systematic review.

Conclusion Bias

Conclusion bias relates to the way the author chooses to relay the conclusions derived from the systematic review. Again, this goes back to careful consideration of the research question. The decision to represent the outcomes qualitatively or quantitatively is crucial to how the outcome is used in the future.


8 common problems with literature reviews and how to fix them

Neal Haddaway, October 19th, 2020


Literature reviews are an integral part of the process and communication of scientific research. Whilst systematic reviews have become regarded as the highest standard of evidence synthesis, many literature reviews fall short of these standards and may end up presenting biased or incorrect conclusions. In this post, Neal Haddaway highlights 8 common problems with literature review methods, provides examples of each, and offers practical solutions for mitigating them.


Researchers regularly review the literature – it’s an integral part of day-to-day research: finding relevant research, reading and digesting the main findings, summarising across papers, and making conclusions about the evidence base as a whole. However, there is a fundamental difference between brief, narrative approaches to summarising a selection of studies and attempting to reliably and comprehensively summarise an evidence base to support decision-making in policy and practice.

So-called ‘evidence-informed decision-making’ (EIDM) relies on rigorous systematic approaches to synthesising the evidence. Systematic review has become the highest standard of evidence synthesis and is well established in the pipeline from research to practice in the field of health. Systematic reviews must include a suite of specifically designed methods for the conduct and reporting of all synthesis activities (planning, searching, screening, appraising, extracting data, qualitative/quantitative/mixed methods synthesis, writing; e.g. see the Cochrane Handbook). The method has been widely adapted into other fields, including environment (the Collaboration for Environmental Evidence) and social policy (the Campbell Collaboration).


Despite the growing interest in systematic reviews, traditional approaches to reviewing the literature continue to persist in contemporary publications across disciplines. These reviews, some of which are incorrectly referred to as ‘systematic’ reviews, may be susceptible to bias and, as a result, may end up providing incorrect conclusions. This is of particular concern when reviews address key policy- and practice-relevant questions, such as the ongoing COVID-19 pandemic or climate change.

These limitations of traditional literature review approaches could be addressed relatively easily with a few key procedures, some of which are not prohibitively costly in terms of skill, time or resources.

In our recent paper in Nature Ecology and Evolution , we highlight 8 common problems with traditional literature review methods, provide examples for each from the field of environmental management and ecology, and provide practical solutions for ways to mitigate them.

Problem: Lack of relevance – limited stakeholder engagement can produce a review that is of limited practical use to decision-makers.
Solution: Stakeholders can be identified, mapped and contacted for feedback and inclusion without the need for extensive budgets – check out best-practice guidance.

Problem: Mission creep – reviews that don’t publish their methods in an a priori protocol can suffer from shifting goals and inclusion criteria.
Solution: Carefully design and publish an a priori protocol that outlines planned methods for searching, screening, data extraction, critical appraisal and synthesis in detail. Make use of existing organisations to support you (e.g. the Collaboration for Environmental Evidence).

Problem: A lack of transparency/replicability in the review methods may mean that the review cannot be replicated – a central tenet of the scientific method!
Solution: Be explicit, and make use of high-quality guidance and standards for review conduct (e.g. CEE Guidance) and reporting (PRISMA or ROSES).

Problem: Selection bias (where included studies are not representative of the evidence base) and a lack of comprehensiveness (an inappropriate search method) can mean that reviews end up with the wrong evidence for the question at hand.
Solution: Carefully design a search strategy with an info specialist; trial the search strategy (against a benchmark list); use multiple bibliographic databases/languages/sources of grey literature; publish search methods in an a priori protocol for peer review.

Problem: The exclusion of grey literature and failure to test for evidence of publication bias can result in incorrect or misleading conclusions.
Solution: Include attempts to find grey literature, including both ‘file-drawer’ (unpublished academic) research and organisational reports. Test for possible evidence of publication bias.

Problem: Traditional reviews often lack appropriate critical appraisal of included study validity, treating all evidence as equally valid – we know some research is more valid and we need to account for this in the synthesis.
Solution: Carefully plan and trial a critical appraisal tool before starting the process in full, learning from existing robust critical appraisal tools.

Problem: Inappropriate synthesis (e.g. using vote-counting and inappropriate statistics) can negate all of the preceding systematic effort. Vote-counting (tallying studies based on their statistical significance) ignores study validity and magnitude of effect sizes.
Solution: Select the synthesis method carefully based on the data analysed. Vote-counting should never be used instead of meta-analysis – a worked contrast between the two is sketched just below this table. Formal methods for narrative synthesis should be used to summarise and describe the evidence base.
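To make the vote-counting pitfall concrete, here is a minimal Python sketch with entirely hypothetical effect sizes and standard errors (not data from any real review), contrasting a tally of ‘significant’ studies with a fixed-effect inverse-variance meta-analysis of the same five studies:

```python
import math

# Five hypothetical studies: standardized effect sizes and standard errors.
effects = [0.40, 0.12, 0.35, 0.08, 0.50]
ses     = [0.30, 0.08, 0.25, 0.06, 0.35]

# Vote-counting: tally studies whose individual 95% CI excludes zero.
votes = sum(1 for g, se in zip(effects, ses) if abs(g) > 1.96 * se)
print(f"Vote count: {votes}/{len(effects)} individually significant")  # -> 0/5

# Fixed-effect inverse-variance meta-analysis of the same five studies.
weights = [1 / se ** 2 for se in ses]
pooled = sum(w * g for w, g in zip(weights, effects)) / sum(weights)
pooled_se = math.sqrt(1 / sum(weights))
lo, hi = pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se
print(f"Pooled effect: {pooled:.2f} (95% CI {lo:.2f} to {hi:.2f})")
```

On these invented numbers, vote-counting reports 0/5 ‘positive’ studies, whereas the pooled estimate (≈0.12, 95% CI ≈0.03 to 0.21) reveals a small but statistically significant overall effect – precisely the information about magnitude and precision that tallying significance throws away.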

There is a lack of awareness and appreciation of the methods needed to ensure systematic reviews are as free from bias and as reliable as possible, as demonstrated by recent, flawed, high-profile reviews. We call on review authors to conduct more rigorous reviews, on editors and peer-reviewers to gate-keep more strictly, and on the community of methodologists to better support the broader research community. Only by working together can we build and maintain a strong system of rigorous, evidence-informed decision-making in conservation and environmental management.

Note: This article gives the views of the authors, and not the position of the LSE Impact Blog, nor of the London School of Economics. Please review our  comments policy  if you have any concerns on posting a comment below


About the author


Neal Haddaway is a Senior Research Fellow at the Stockholm Environment Institute, a Humboldt Research Fellow at the Mercator Research Institute on Global Commons and Climate Change, and a Research Associate at the Africa Centre for Evidence. He researches evidence synthesis methodology and conducts systematic reviews and maps in the field of sustainability and environmental science. His main research interests focus on improving the transparency, efficiency and reliability of evidence synthesis as a methodology and supporting evidence synthesis in resource constrained contexts. He co-founded and coordinates the Evidence Synthesis Hackathon (www.eshackathon.org) and is the leader of the Collaboration for Environmental Evidence centre at SEI. @nealhaddaway

Why is mission creep a problem and not a legitimate response to an unexpected finding in the literature? Surely the crucial points are that the review’s scope is stated clearly and implemented rigorously, not when the scope was finalised.


#9. Most of them are terribly boring. Which is why I teach students how to make them engaging…and useful.


Language bias in systematic reviews: you only get out what you put in

Stern, Cindy 1 ; Kleijnen, Jos 2

1 JBI, Faculty of Health and Medical Sciences, The University of Adelaide, Adelaide, Australia

2 Department of Family Medicine, School for Public Health and Primary Care, Maastricht, The Netherlands

Limiting study inclusion on the basis of language of publication is a common practice in systematic reviews. Neimann Rasmussen and Montgomery cite lack of time, insufficient funding, and unavailability of language resources (e.g. professional translators) as the most common reasons for not including languages other than English (LOTE) in a systematic review. 1 Thirty-eight percent (95% confidence interval, 34-42%) of a random sample of 516 reviews (out of a total of 18,140 systematic reviews published in 2016) reported language restrictions (source: www.ksrevidence.com ). While often the most feasible option, limiting by language introduces the risk of ignoring key data, introducing bias (referred to as language bias), and missing important cultural contexts, which may limit the review's findings and usefulness. 2-4 Cultural context may simply be tied to geography or, in some instances, be fundamentally entwined with the review question: for example, a review on Chinese herbal remedies that does not include Chinese-language studies, nor searches Chinese databases or resources; or a review of health promotion strategies for indigenous populations in Canada that does not consider French-language studies. Such examples would seemingly demand the inclusion of LOTE.
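As a quick arithmetic check of the interval just cited – a verification sketch under a normal-approximation assumption, not the authors' actual computation (the underlying count, roughly 0.38 × 516 ≈ 196 reviews, is inferred from the rounded percentage) – a proportion of 0.38 in a sample of 516 reproduces the published 34–42% range:

```python
import math

n, p = 516, 0.38                      # sample size and reported proportion
se = math.sqrt(p * (1 - p) / n)       # normal-approximation standard error
lo, hi = p - 1.96 * se, p + 1.96 * se
print(f"95% CI: {100 * lo:.0f}% to {100 * hi:.0f}%")  # -> 34% to 42%
```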

Currently, JBI methodology does not require authors to include papers in LOTE but recommends that, where a review team has capacity, the search should ideally attempt to identify studies and papers published in any language, and may expand the search to include databases and resources that index LOTE. 2 Further, authors are advised to outline any language restrictions with appropriate justifications, and consider the potential consequences of language restriction in their discussion, 1 which aligns with the PRISMA Statement (Item 6: Eligibility criteria, and Item 25: Limitations of the review process). 5 The Campbell Collaboration takes a similar stance and warns against the risk of language bias, recommending that “ideally no language restrictions should be included in the search strategy,” 6 (p.28) while Cochrane advocates that searches should not be restricted by language. 7

Despite this overarching recommendation, across the diverse range of synthesis methodology and methods espoused by JBI, there are other important considerations for LOTE. If we consider the type of review question and thus the methodological design required, there may be different implications for qualitative reviews and mixed methods reviews due to the nature of their data and the potential issues in their translation. 8 Scoping reviews may also not fall under this remit due to their very nature; therefore, it is clear that we cannot assume a one-size-fits-all approach for the inclusion of LOTE.

Many protocols and reviews submitted to JBI Evidence Synthesis limit the search parameters to English only, with authors overwhelmingly stating this is due to the limited resources available. The infrequent exception to this arises from author teams in Europe, South America, and Asia who include at least one additional LOTE (largely based on the languages spoken by the author team) and search databases or resources in LOTE. Of the 17 reviews published in JBI Evidence Synthesis in the first half of 2020, seven (41%) did not limit the language to English. Pleasingly, in this issue, half of the protocols published also do not limit the language to English, with the languages chosen to represent those of the author team and/or those relevant to the cultural context (see examples 9,10 ).

A key message that JBI highlights in its global systematic review training program 11 is that an attempt should be made to locate all evidence (published and unpublished) that is relevant to a review question; however, by allowing reviews that limit by language, JBI systematic reviews are essentially overlooking this very feature that they should be promoting. JBI has reconsidered its stance on the inclusion of LOTE in JBI systematic reviews and is currently deliberating on how best to implement this; for example, standards regarding databases and other resources in LOTE (e.g. which to include as well as training and access), the use of Google Translate and other translation tools to screen/assess suitability, recruitment of collaborators to assist with LOTE, and acknowledgment versus authorship of collaborators.

There are also multiple ways to deal with difficulties in reading and managing LOTE studies in a systematic review. Rather than expensive full translations of published articles, which are often not necessary, a more economical solution may be for a reviewer to work closely with a person who can read the language and facilitate identification and extraction of the required information. In addition, studies for which nobody can be found to help with translation could be listed in the review with a remark that the reviewers could not process the study. This would at least enable the readers to make a judgment about the possible bias involved.

While it is clear this will impact authors, we must move forward to ensure we capture a truly global picture of the evidence. Should we expect authors to include every piece of research ever written that fits their review's inclusion criteria? It simply may not be feasible; however, by limiting a review to one language from the outset, we are violating the very essence of what a systematic review is and its purpose in assisting in making informed decisions from the best available evidence.


Identifying stigmatizing language in clinical documentation: A scoping review of emerging literature

Veronica Barcelona, Danielle Scharp, Betina R. Idnay, Hans Moen, Kenrick Cato, Maxim Topaz

Affiliations: Columbia University School of Nursing, New York, New York, United States of America; Department of Biomedical Informatics, Columbia University, New York, New York, United States of America; Department of Computer Science, Aalto University, Aalto, Finland; University of Pennsylvania School of Nursing, Philadelphia, Pennsylvania, United States of America

  • Published: June 28, 2024
  • https://doi.org/10.1371/journal.pone.0303653

Racism and implicit bias underlie disparities in health care access, treatment, and outcomes. An emerging area of study in examining health disparities is the use of stigmatizing language in the electronic health record (EHR).

We sought to summarize the existing literature related to stigmatizing language documented in the EHR. To this end, we conducted a scoping review to identify, describe, and evaluate the current body of literature related to stigmatizing language and clinician notes.

We searched PubMed, Cumulative Index of Nursing and Allied Health Literature (CINAHL), and Embase databases in May 2022, and also conducted a hand search of IEEE Xplore to identify studies investigating stigmatizing language in clinical documentation. We included all studies published through April 2022. The results for each search were uploaded into EndNote X9 software, de-duplicated using the Bramer method, and then exported to Covidence software for title and abstract screening.

Studies (N = 9) used cross-sectional (n = 3), qualitative (n = 3), mixed methods (n = 2), and retrospective cohort (n = 1) designs. Stigmatizing language was defined via content analysis of clinical documentation (n = 4), literature review (n = 2), interviews with clinicians (n = 3) and patients (n = 1), expert panel consultation, and task force guidelines (n = 1). Natural language processing was used in four studies to identify and extract stigmatizing words from clinical notes. All of the studies reviewed concluded that negative clinician attitudes and the use of stigmatizing language in documentation could negatively impact patient perception of care or health outcomes.

The current literature indicates that NLP is an emerging approach to identifying stigmatizing language documented in the EHR. NLP-based solutions can be developed and integrated into routine documentation systems to screen for stigmatizing language and alert clinicians or their supervisors. Potential interventions resulting from this research could generate awareness about how implicit biases affect communication patterns and work to achieve equitable health care for diverse populations.

Citation: Barcelona V, Scharp D, Idnay BR, Moen H, Cato K, Topaz M (2024) Identifying stigmatizing language in clinical documentation: A scoping review of emerging literature. PLoS ONE 19(6): e0303653. https://doi.org/10.1371/journal.pone.0303653

Editor: Guanghui Liu, State University of New York at Oswego, UNITED STATES

Received: April 13, 2023; Accepted: April 30, 2024; Published: June 28, 2024

Copyright: © 2024 Barcelona et al. This is an open access article distributed under the terms of the Creative Commons Attribution License , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: All relevant data are fully publicly available; full citations and DOIs can be found in the "Data Availability Statement" supporting information file.

Funding: Columbia University Data Science Institute Seeds Funds Program (VB, MT, KC). https://datascience.columbia.edu/ The Gordon and Betty Moore Foundation (Grant number: GBMF9048) (VB, MT, KC). https://health.ucdavis.edu/nursing/NurseLeaderFellows/index.html The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

Introduction

Racial and ethnic disparities in health care access, treatment, and outcomes have been documented for decades [ 1 ]. Prior studies have shown that concerns expressed by Black patients are more likely to be dismissed or ignored than those expressed by White patients [ 2 ]. This differential treatment has been observed among Black and African American patients, leading to disparities in outcomes [ 1 , 3 , 4 ], specifically in the treatment of cardiovascular diseases [ 5 ], pain [ 6 ], and breast cancer [ 7 ]. Racism occurring on the structural, interpersonal, or cultural levels has been identified as the primary reason for disparities in health outcomes [ 8 ]. Researchers have examined clinician biases by studying racial bias in patient-clinician interactions, finding that stereotyping and lack of empathy towards patients by race influenced health care outcomes [ 9 ].

Stigmatizing language has been defined as language that communicates unintended meanings that can perpetuate socially constructed power dynamics and result in bias [ 10 ]. Recent studies suggest that racial biases may also be identified by examining stigmatizing language in clinician notes documented in the electronic health record (EHR) [ 11 – 14 ]. Racial differences in documentation patterns may reflect unconscious biases and stereotypes that could negatively affect the quality of care [ 14 ]. Examples of stigmatizing language may include the use of quotations to identify disbelief in what the patient is reporting, questioning patient credibility, sentence construction that implies hearsay, and the use of judgment words [ 13 ]. Stigmatizing language in clinical notes has been associated with more negative attitudes towards the patient and less effective management of patient pain by physicians [ 14 ].

It is unknown to what extent and how stigmatizing language has been studied in healthcare settings, and study designs and foci differ. Emerging studies have used traditional qualitative methods, including interviews with patients and clinicians. Other research has used natural language processing (NLP), a computer science-based technique that helps extract meaning from large bodies of text, to quantify how EHR notes reflect stigmatizing language by race and ethnicity. The purpose of this scoping review was to identify, describe, and evaluate the presence and type of stigmatizing language in clinician documentation in the literature.

A scoping review was chosen instead of a systematic review as the purpose was to identify and map the emerging evidence [ 15 ]. This review was conducted using PRISMA-ScR guidelines for scoping reviews [ 16 ].

Materials & methods

Search strategy.

The authors discussed the selection and coverage of three concepts (i.e., stigmatizing language, clinician, and clinical documentation) for review based on the research question. For purposes of the current study, the concept of “clinician” includes physicians and nurses. We searched PubMed, Cumulative Index of Nursing and Allied Health Literature (CINAHL), and Embase databases in May 2022 to identify studies investigating stigmatizing language in clinical documentation. We also conducted an updated hand search of the IEEE Xplore database for articles published through April 2022; however, this did not identify additional articles that met inclusion criteria and were not already included in our review. The results for each search were uploaded into EndNote X9 software, de-duplicated using the Bramer method [ 17 ], and then exported to Covidence software for title and abstract screening. The search strategy is detailed in S1 Table .

Inclusion criteria

The initial search yielded 1,482 articles for review. After de-duplication, 897 articles were included for title and abstract screening. Two authors (BI, DS) independently screened all articles by title and abstract and documented reasons for exclusion, when applicable. Studies were included if they investigated stigmatizing language in clinical documentation. Studies that looked into stigmatizing language with patient-provider interaction that did not include documentation (e.g., verbal communication) were excluded. Articles not in English, review articles, editorials, commentaries, and articles without full-text availability were also excluded. The same reviewers independently assessed all potentially relevant articles in the full-text review to comprehensively determine eligibility for inclusion, as well as searching reference lists for additional articles. Discrepancies were discussed with the team to achieve consensus. From the 40 articles included for full-text review, nine articles were included for final synthesis ( Fig 1 ).

Fig 1. Study selection flow diagram: https://doi.org/10.1371/journal.pone.0303653.g001

Data extraction and quality assessment

Relevant information categories from each included article were extracted by two authors (BI, DS). Two other co-authors with expertise in health informatics (MT, HM) reviewed and validated all the extracted data elements. These information categories included: authors, year of publication, study aim and design, clinical setting, data source, clinician specialty, clinical note type (when available), study population, number of clinical notes used, data analysis approach, outcomes, and stigmatizing language identified. The Mixed Methods Appraisal Tool (MMAT) [ 18 ] was used to evaluate study quality and the risk of bias in the included articles.

Nine articles meeting all inclusion criteria were included in this scoping review ( Table 1 ). Overall, study designs (N = 9) included cross-sectional (n = 3), [ 11 – 13 ] qualitative (n = 3), [ 19 – 21 ] mixed methods (n = 2), [ 22 , 23 ] and retrospective cohort (n = 1) [ 24 ]. Studies took place in exclusively inpatient (n = 3) [ 12 , 19 , 24 ] or outpatient (n = 4) [ 13 , 21 – 23 ] settings. One study was conducted in an emergency department (ED) (n = 1), [ 20 ] and another included participants from inpatient, outpatient, and ED settings (n = 1) [ 11 ]. In terms of patient population, six focused on general medicine, [ 11 – 13 , 19 , 21 , 23 ] and one article each on oncology, [ 22 ] psychiatry, [ 24 ] and pediatrics [ 20 ].

Table 1: https://doi.org/10.1371/journal.pone.0303653.t001

Methods for measuring and defining stigmatizing language varied by study. Specifically, stigmatizing language was identified via interviews with clinicians [ 19 , 20 , 22 ] and patients, [ 19 ] content analysis of clinical documentation, [ 13 , 21 , 23 , 24 ] literature review, [ 11 , 12 ] expert panel consultation, [ 11 ] and task force guidelines from relevant professional organizations [ 12 ]. Definitions of stigmatizing language or bias varied as well by study, with most studies focusing on discipline-specific words communicating judgment or negative bias ( Table 1 ). Stigmatizing language often included stereotyping by race and ethnicity. An example found in clinician documentation in the EHR was in the form of quotes highlighting “unsophisticated” patient language, i.e., “…patient states that the wound ‘busted open’” [ 21 ]. Another study found that physician notes written about Black patients had up to 50% higher odds of containing evidentials (language used by the writer questioning the veracity of the patient’s words) and stigmatizing language than those of White patients [ 13 ]. Similarly, physicians documented more negative feelings such as disapproval, discrediting, and stereotyping toward Black patients than White patients [ 21 ].

Often, clinical documentation studied was in the form of clinical notes. The most commonly analyzed clinical notes included those documented by physicians (n = 3), [ 12 , 13 , 22 ] followed by nurses (n = 1), [ 24 ] advanced practice providers (n = 1), [ 12 ] and interdisciplinary team members including radiologists, respiratory therapists, nutritionists, social workers, case managers, and pharmacists (n = 1). Sun et al. examined history and physical notes written by medical providers, although no further detail about the type of providers was specified [ 11 ].

Reporting of race and ethnicity of study participants varied widely. In three studies, race was not specified at all, [ 20 , 22 , 24 ] or studies reported only White and Black participant races (n = 2) [ 13 , 21 ]. Two studies described findings by race and ethnicity, including Black (or African American), Hispanic, White, and Asian categories [ 12 , 23 ]. The remaining studies either reported race and ethnicity as: White, Black or Hispanic, [ 11 ] or White or Hispanic [ 19 ].

Studies that conducted interviews focused on how clinical notes were written and may be interpreted by patients, [ 22 ] barriers and facilitators to providing care, [ 19 ] patients’ perceptions of their hospitalization, [ 19 ] and clinician insights on racial bias and EHR documentation [ 20 ]. Qualitative themes identified related to stigmatizing language included a reluctance to describe patients as “difficult” or “obese” due to the social stigma attached to common medical language, [ 22 ] intentional and unintentional perpetration of stigma in clinical notes, [ 19 ] and identification of potential racial bias through documentation [ 20 ].

In terms of methods, four studies used NLP [ 11 – 13 , 22 ] to extract terms from clinical notes matching those in predefined vocabularies of stigmatizing language terms. After NLP, statistical analyses were conducted to calculate and compare the odds of stigmatizing language occurrence among different patient populations. Two of the NLP-based studies used Linguistic Inquiry and Word Count (LIWC: a standardized vocabulary of terms), while others created their own hand-crafted vocabularies. One of the studies that involved the use of NLP [ 11 ] developed a machine learning classifier that would automatically detect stigmatizing language. This was the only study that measured the accuracy of automated NLP-based stigmatizing language detection and found it very accurate (F1 score = 0.94).
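As an illustration of the lexicon-matching approach these studies describe, the following sketch flags notes containing terms from a predefined stigmatizing-language vocabulary and then scores the flags against gold-standard labels using the F1 metric mentioned above. The vocabulary and notes are hypothetical placeholders, not terms drawn from LIWC or from any of the reviewed studies.

```python
# Hypothetical stigmatizing-language lexicon (illustrative only).
LEXICON = {"noncompliant", "drug-seeking", "difficult", "claims"}

def flag_note(text: str) -> bool:
    """Flag a note if any token matches the lexicon after basic cleanup."""
    tokens = {t.strip(".,;:\"'()").lower() for t in text.split()}
    return bool(tokens & LEXICON)

# (note text, gold-standard label) pairs, invented for this example.
notes = [
    ("Patient is noncompliant with insulin regimen.", True),
    ("Patient reports 8/10 back pain, worse at night.", False),
    ("Claims the wound 'busted open' yesterday.", True),
]

tp = sum(1 for text, gold in notes if flag_note(text) and gold)
fp = sum(1 for text, gold in notes if flag_note(text) and not gold)
fn = sum(1 for text, gold in notes if not flag_note(text) and gold)
precision = tp / (tp + fp) if tp + fp else 0.0
recall = tp / (tp + fn) if tp + fn else 0.0
f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
```

On these toy notes the matcher scores perfectly; real vocabularies and clinical text are far messier, which is why evaluating detection accuracy against expert annotations – reported in only one reviewed study (F1 = 0.94) – matters.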

Despite the wide variety of clinical settings in the reviewed studies, negative language, bias, racial bias, or stigmatizing language was identified in clinician attitudes and/or documentation across all studies, and this could negatively impact patient perception of care or outcomes. Disparities in stigmatizing language use in the EHR were evident by race and ethnicity both in clinician interviews [ 20 , 22 , 24 ] and in analyses of clinical notes [ 11 – 13 , 19 , 21 , 23 ]. There may be discipline-specific stigmatizing language and terms [i.e., addiction [ 19 ]] and paternalistic attitudes that hold that clinical notes are for clinician communication and not for patients to read [i.e., oncology [ 22 ]] that warrant further investigation.

In Table 2 , results of the study quality assessments are presented. All studies asked clear research questions and collected data to address them. Among quantitative studies (n = 4), three met all five criteria for quality; the remaining study did not adequately describe measurement, confounders, or intervention fidelity. The qualitative studies (n = 3) met four of the five quality components assessed, with two studies lacking an explicit discussion of the qualitative approach. Neither of the mixed methods studies (n = 2) met all quality criteria: one did not include an adequate rationale for using this design, the other did not discuss inconsistencies between quantitative and qualitative results, and neither adhered to all criteria for quantitative and/or qualitative methods.

Table 2: https://doi.org/10.1371/journal.pone.0303653.t002

In this review, we identified the types and frequency of stigmatizing language in EHR notes, establishing an underpinning for future research on the correlation between communication patterns and outcomes (i.e., hospitalization, mortality, complications, disease stability, symptom control). With continuous advancements in the field of NLP, we believe that these methods (including deep learning-based methods) will be essential tools in future stigmatizing language studies.

It is crucial to evaluate the performance of NLP-based systems to ensure accurate concept identification and reliable results; however, this was only done in one study [ 11 ]. Further NLP studies are needed that evaluate the accuracy of the resulting systems and ensure stigmatizing language is identified correctly. Two of the studies reviewed here that used NLP did not assess clinical relevance, limiting their findings. In addition to accurate identification of stigmatizing language, clinical relevance must be assessed to determine to what extent NLP systems are useful for predicting the association between language use and clinical outcomes.

Finally, there is a gap in the literature for NLP-specific bias assessment. Further development of NLP for identifying stigmatizing language is needed, as these methods may not detect all stigmatizing language, and outcomes may be driven by the level of bias among annotators. The quality of the training data is vital in algorithm development, and more research should describe the biases of the people performing annotation. This type of acknowledgment is increasingly common in journals, where authors are required to submit positionality statements; however, we suggest that this go further for annotators, as life experiences influence assessments of whether bias or stigma is present. We did not perform a specific evaluation of the NLP-only studies due to their small number, but further work should evaluate the quality of NLP studies and the validity of NLP results, and specific criteria for this domain should be developed.
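One standard way to quantify the annotator-dependence raised above is inter-annotator agreement, for example Cohen's kappa over two raters' binary "stigmatizing / not stigmatizing" labels. The sketch below uses invented labels; the reviewed studies do not report this exact procedure, so treat it as one possible check rather than an established pipeline.

```python
def cohens_kappa(a: list[int], b: list[int]) -> float:
    """Cohen's kappa for two raters' binary labels over the same items."""
    assert len(a) == len(b)
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Expected agreement if each rater labelled independently at their own rate.
    pa1, pb1 = sum(a) / n, sum(b) / n
    expected = pa1 * pb1 + (1 - pa1) * (1 - pb1)
    return (observed - expected) / (1 - expected)

# Hypothetical labels: 1 = note judged stigmatizing, 0 = not.
rater1 = [1, 0, 1, 1, 0, 0, 1, 0]
rater2 = [1, 0, 0, 1, 0, 1, 1, 0]
print(f"kappa = {cohens_kappa(rater1, rater2):.2f}")  # -> 0.50 (moderate)
```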

The identification of stigmatizing language use in EHR notes is vital as this language may foster the transmission of bias between clinicians and may represent a value judgment of the intrinsic worth assigned to a patient [ 11 ]. Further, with the passage of The 21st Century Cures Act in the US, federal policy now requires the availability of clinical notes to patients [ 25 ]. Clinical notes that reflect clinician bias may harm the patient-clinician relationship and hinder or damage the establishment of trust required for positive interactions in health care settings. Medical mistrust is a persistent problem contributing to delays in seeking care and widening disparities in disease outcomes for many vulnerable populations, [ 26 ] hence efforts are needed to improve the current situation.

Definitions of stigmatizing language varied in the studies reviewed, and also represent an area for future research. Stigmatizing language may best be defined by the vulnerable populations at risk, in partnership with researchers. Further, discipline-specific language should be discussed and agreed upon, as this may vary by patient population. For example, guidelines have been suggested for addressing the intersectional nature of language in the care of birthing people [ 27 ].

Three studies reviewed here did not specify race or ethnicity of their clinician and patient participants [ 20 , 22 , 24 ]. This is a significant issue as patient-clinician race discordance has been associated with increased risk of mortality [ 28 ]. Racial concordance, however, does not necessarily lead to better communication as perceived by patients [ 29 ]. Given the inconsistency in reporting of race and ethnicity in the reviewed studies, future research in this area should carefully operationalize and define race and ethnicity variables extracted from the EHR. In addition, studies whose primary focus was to identify bias did not blind for patient race, as in many cases race was considered a primary predictor or variable of interest. This underscores an important gap in the literature for NLP-specific bias assessment. Blinding sensitive categories when screening records for bias may improve the validity of outcome ascertainment; however, it is often necessary for reviewers to rely on context and include categories such as race and ethnicity when evaluating for stigmatizing language.

The measurement of race is a contentious issue in many medical and scientific disciplines; though it is a social construction with no biological basis, it remains an indicator of the likelihood of encountering racism and racist structures that lead to health disparities. EHR demographic data have been shown to have several quality issues, with some studies indicating that data from Latino patients have higher rates of misclassification than those from other racial/ethnic groups [ 30 ]. It is important to consider who enters race and ethnicity data in the EHR: patient self-identification is often used as the “gold standard” in research, yet the patient's apparent phenotype may be an even more important predictor of clinician perception and subsequent clinical documentation. Indeed, recent work has identified that patient race can be predicted using machine learning algorithms applied to other clinical indicators from the EHR [ 31 – 33 ]. From a validity and reliability perspective, researchers must align their methodological definition of race and ethnicity with the stated research objectives. Further, consistent definitions of racial and ethnic categories are essential to identifying associations between stigmatizing language use and patient outcomes as future studies developing interventions are considered. Future research should include larger proportions of minoritized patient and clinician participants to elucidate these issues further, and examine the underlying factors associated with poorer outcomes in various healthcare settings.

Finally, six of the studies reviewed [ 12 , 13 , 19 – 22 ] included physicians, and many included other health care provider types (e.g., nurses, respiratory therapists, pharmacists) either alone [ 24 ] or in addition to physician notes/participants [ 12 , 19 , 20 ]. Limited information was provided about the type of notes analyzed. Further detail about the type of clinicians and notes would allow for the identification of what other disciplines are reading or writing, to draw conclusions about the transmission of bias over the trajectory of patient care.

There are several opportunities for policy change to address the use of stigmatizing language in clinical documentation. First, stigmatizing language can be identified automatically with NLP. NLP-based solutions can be developed and integrated into routine documentation systems to screen for stigmatizing language and alert clinicians or their supervisors. Previously published instances of flags in EHR documentation have provided evidence of improved outcomes of care, including in diagnosis of stroke, increasing health care access for patients at risk of suicide, and improving community rates of Hepatitis C screening for those at high risk [ 34 – 36 ]. To our knowledge, NLP findings of stigmatizing language use in the EHR has not yet been applied to clinical practice, identifying a need for future research that could lead to practice and policy change.

Second, clinicians’ less than optimal working conditions may contribute to burnout and negative language use toward patients. One study found that resident physicians who reported higher levels of burnout had greater explicit and implicit racial biases [ 37 ]. Individually focused interventions for clinicians, such as mindfulness training, have also been suggested as a method to reduce bias in clinical care, [ 38 ] but have yet to be evaluated. A study of nurses in Taiwan suggested that workplace burnout was associated with poorer patient care outcomes, though stigmatizing language was not examined [ 39 ]. The COVID-19 pandemic has also contributed to moral injury for nurses, affecting patient care [ 40 ]. Burnout does not foster an environment where clinicians can develop and sustain empathy for patients, and empathy is a critical component of reducing bias and building support for antiracism efforts to reduce inequities [ 41 , 42 ]. Antiracism and bias efforts in hospitals should include analyzing whether clinician burnout is associated with stigmatizing language use in EHR documentation, and whether it reinforces bias between clinicians, potentially contributing to health inequities.

In summary, this review highlights a new and promising application of qualitative research and NLP to clinical documentation in the study of racial and ethnic disparities in health care. We suggest that further research be done applying NLP to identify stigmatizing language, with the ultimate goal of reducing clinicians’ stigmatizing language use in health documentation. By improving identification of stigmatizing language through NLP and other methods, potential interventions can be developed to generate awareness and design educational interventions about how implicit biases affect communication patterns and work to achieve equitable health care for diverse populations.

Supporting information

S1 Checklist. Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews (PRISMA-ScR) checklist.

https://doi.org/10.1371/journal.pone.0303653.s001

S1 File. Data availability statement.

https://doi.org/10.1371/journal.pone.0303653.s002

S1 Table. Search strategy.

https://doi.org/10.1371/journal.pone.0303653.s003

  • 1. Smedley BD, Stith AY, Nelson AR, eds. Unequal Treatment: Confronting Racial and Ethnic Disparities in Health Care. Washington, D.C.: National Academy of Sciences; 2003.
  • 7. Wilson J, Sule AA. Disparity in Early Detection of Breast Cancer. StatPearls. Treasure Island (FL): StatPearls Publishing; 2021.
  • 25. United States Department of Health and Human Services. 21st Century Cures Act: Interoperability, information blocking, and the ONC health IT certification program. Federal Register: National Archives; 2020 [updated 08/04/2020; cited 2021 November 5]. Available from: https://www.federalregister.gov/documents/2020/05/01/2020-07419/21st-century-cures-act-interoperability-information-blocking-and-the-onc-health-it-certification .
American Journal of Neuroradiology

Does Long-term Surveillance Imaging Improve Survival in Patients Treated for Head and Neck Squamous Cell Carcinoma? A Systematic Review of the Current Evidence


BACKGROUND: Long-term post-treatment surveillance imaging algorithms for head and neck squamous cell carcinoma are not standardized due to debates over optimal surveillance strategy and efficacy. Consequently, current guidelines do not provide long-term surveillance imaging recommendations beyond 6 months.

PURPOSE: We performed a systematic review to evaluate the impact of long-term imaging surveillance (i.e., imaging beyond 6 months following treatment completion) on survival in patients treated definitively for head and neck squamous cell carcinoma.

DATA SOURCES: A search was conducted on PubMed, Embase, Scopus, the Cochrane Central Register of Controlled Trials, and Web of Science for English literature published between 2003 and 2024 evaluating the impact of long-term surveillance imaging on survival in patients with head and neck squamous cell carcinoma.

STUDY SELECTION: 718 abstracts were screened and 95 underwent full-text review, with 2 articles meeting inclusion criteria. The Risk of Bias in Non-randomized Studies of Interventions (ROBINS-I) assessment tool was used.

DATA ANALYSIS: A qualitative assessment without a pooled analysis was performed for the two studies meeting inclusion criteria.

DATA SYNTHESIS: No randomized prospective controlled trials were identified. Two retrospective two-arm studies comparing long-term surveillance imaging with clinical surveillance were included, each rated as having a moderate risk of bias. Each study included heterogeneous populations with variable risk profiles and imaging surveillance protocols. Both studies investigated the impact of long-term surveillance imaging on overall survival but reached different conclusions: one reported a survival benefit for long-term surveillance imaging with FDG PET/CT in patients with stage III or IV disease or an oropharyngeal primary tumor, while the other demonstrated no survival benefit.

LIMITATIONS: The limited, heterogeneous retrospective data available preclude definitive conclusions on the impact of long-term surveillance imaging in head and neck squamous cell carcinoma.

CONCLUSIONS: There is insufficient high-quality evidence regarding the impact of long-term surveillance imaging on survival in patients treated definitively for head and neck squamous cell carcinoma. The lack of a standardized definition of long-term surveillance, variable surveillance protocols, and inconsistencies in results reporting underscore the need for a prospective multi-center registry assessing outcomes.

ABBREVIATIONS: HNSCC = Head and Neck Squamous Cell Carcinoma; RT= radiotherapy; NCCN = National Comprehensive Cancer Network; MPC = metachronous primary cancer; CR = complete response; OS = overall survival; CRT = chemoradiotherapy; HPV = human papillomavirus; PFS = progression-free survival; CFU = clinical follow up; NI-RADS = Neck Imaging Reporting and Data System.

The authors declare no conflicts of interest related to the content of this article.

  • © 2024 by American Journal of Neuroradiology



Bias in Observed Assessments in Medical Education: A Scoping Review

Affiliations.

  • 1 R. Ismaeel is a medical student, College of Medicine, University of Saskatchewan, Saskatoon, Canada; ORCID: https://orcid.org/0009-0009-5975-4847 .
  • 2 L. Pusic is a student, Columbia College, Columbia University in the City of New York, New York, New York; ORCID: https://orcid.org/0009-0007-5994-7222 .
  • 3 M. Gottlieb is associate professor, vice chair of research, and ultrasound division director, Department of Emergency Medicine, Rush University Medical Center, Chicago, Illinois; ORCID: https://orcid.org/0000-0003-3276-8375 .
  • 4 T.M. Chan is dean, Toronto Metropolitan University School of Medicine, Toronto, Ontario, Canada, and associate clinical professor, Department of Medicine, and adjunct scientist, McMaster Education Research, Innovation, and Theory, Faculty of Health Sciences, McMaster University, Hamilton, Ontario, Canada; ORCID: https://orcid.org/0000-0001-6104-462X .
  • 5 T.O. Oyedokun is clinical associate professor, Department of Emergency Medicine, College of Medicine, University of Saskatchewan, Saskatoon, Saskatchewan, Canada; ORCID: https://orcid.org/0000-0002-6684-6549 .
  • 6 B. Thoma is professor, Department of Emergency Medicine, College of Medicine, University of Saskatchewan, Saskatoon, Saskatchewan, Canada, and clinical professor, Toronto Metropolitan University School of Medicine, Toronto, Ontario, Canada; ORCID: https://orcid.org/0000-0003-1124-5786 .
  • PMID: 38924499
  • DOI: 10.1097/ACM.0000000000005794

Purpose: Observed assessments are integral to medical education but may be biased against structurally marginalized communities. Current understanding of assessment bias is limited because studies have focused on single specialties, levels of training, or social identity characteristics (SIDCs). This scoping review maps studies investigating bias in observed assessments in medical education arising from trainees' observable SIDCs at different medical training levels, with consideration of medical specialties, assessment environments, and assessment tools.

Method: MEDLINE, Embase, ERIC, PsycINFO, Scopus, Web of Science Core Collection, and Cochrane Library were searched for articles published between January 1, 2008, and March 15, 2023, on assessment bias related to 6 observable SIDCs: gender (binary), gender nonconformance, race and ethnicity, religious expression, visible disability, and age. Two authors reviewed the articles, with conflicts resolved by consensus or a third reviewer. Results were interpreted through group review and informed by consultation with experts and stakeholders.

Results: Sixty-six of 2,920 articles (2.3%) were included. These studies most frequently investigated graduate medical education (44 [66.7%]), used quantitative methods (52 [78.8%]), and explored gender bias (63 [95.5%]). No studies investigated gender nonconformance, religious expression, or visible disability. One evaluated intersectionality, with SIDCs described inconsistently. General surgery (16 [24.2%]) and internal medicine (12 [18.2%]) were the most studied specialties. Simulated environments (37 [56.0%]) were studied more frequently than clinical environments (29 [43.9%]). Bias favoring men was found more in assessments of intraoperative autonomy (5 of 9 [55.6%]), whereas clinical examination bias often favored women (15 of 19 [78.9%]). When race and ethnicity bias was identified, it consistently favored White students.

Conclusions: This review mapped studies of gender, race, and ethnicity bias in the medical education assessment literature, finding limited studies on other SIDCs and intersectionality. These findings will guide future research by highlighting the importance of consistent terminology, unexplored SIDCs, and intersectionality.

Copyright © 2024 the Association of American Medical Colleges.


Ethical implications of AI-driven bias assessment in medicine

  • Yanyi Wu 1, 2 (ORCID: http://orcid.org/0009-0002-2948-0493 )
  • Chenghua Lin 1, 2
  • 1 School of Public Affairs, Zhejiang University, Hangzhou, China
  • 2 Institute of China's Science, Technology and Policy, Zhejiang University, Hangzhou, China
  • Correspondence to Dr Yanyi Wu, School of Public Affairs, Zhejiang University, Hangzhou, China; yanyi.wu@hotmail.com

https://doi.org/10.1136/bmjebm-2024-113095


Barsby et al present a thought-provoking pilot study on the application of large language models (LLMs) to automating risk-of-bias (RoB) assessments in systematic reviews. 1 Although LLMs show potential for streamlining evidence synthesis, the major ethical concerns raised by their integration into medical decision-making require careful consideration.

Patient safety is paramount. RoB assessments directly impact the quality of evidence used to guide clinical decisions. As highlighted by Barsby et al , current LLM performance in RoB assessment remains suboptimal, with both ChatGPT 3.5 and ChatGPT 4 demonstrating only moderate agreement with human assessors. 1 Prematurely relying on these models could lead to misinformed judgements, …
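The "moderate agreement" at issue here is typically quantified with a chance-corrected statistic such as Cohen's kappa. As a rough illustration of what that means in practice (a minimal sketch, not the authors' analysis; the three rating labels and both rating vectors below are invented), the following Python computes kappa between hypothetical human and LLM risk-of-bias ratings:

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters on the same items."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    # Agreement expected by chance, from each rater's marginal label frequencies.
    expected = sum((freq_a[label] / n) * (freq_b[label] / n)
                   for label in set(freq_a) | set(freq_b))
    return (observed - expected) / (1 - expected)

# Hypothetical domain-level risk-of-bias judgements for ten trials.
human = ["low", "high", "some", "low", "high", "low", "some", "low", "high", "low"]
llm   = ["low", "some", "some", "low", "high", "high", "some", "low", "low", "low"]

print(f"kappa = {cohens_kappa(human, llm):.2f}")  # kappa = 0.52
```

On the widely used Landis and Koch benchmarks, values of 0.41 to 0.60 count as "moderate" agreement, which is where this example lands; raw percentage agreement (7 of 10 here) overstates concordance because some of it occurs by chance.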

Contributors YW conceptualised the study and drafted the manuscript. CL contributed to reviewing and editing the manuscript. Both authors reviewed and approved the final manuscript.

Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.

Competing interests None declared.

Provenance and peer review Not commissioned; internally peer reviewed.

Linked Articles

  • Letter: Barsby J, Hume S, Lemmey HAL, Cutteridge J, Lee R, Bera KD. Pilot study on large language models for risk-of-bias assessments in systematic reviews: A(I) new type of bias? BMJ Evidence-Based Medicine 2024. Published Online First: 23 May 2024. doi: 10.1136/bmjebm-2024-112990

J Psychiatry Neurosci. 2009 Nov;34(6).

Bias in the research literature and conflict of interest: an issue for publishers, editors, reviewers and authors, and it is not just about the money

Conflicts of interest (COIs) of researchers have been a frequent topic recently in the popular press and scientific journals. Of particular interest to psychiatric researchers are the investigations in the US Senate, led by Senator Charles Grassley. A recent article in Science discusses the history and current state of these investigations. 1 For those who like to keep score, Science has a list of the 9 psychiatric researchers who have been investigated, the amounts of money they received from drug companies and the amounts they mention in COI disclosures. 2 Much of what has been written about COI concerns drug company payments to researchers. However, COIs are an issue for publishers and editors of journals, reviewers of manuscripts and authors. Conflicts of interest exist in every aspect of the production of research journals, and the conflicts derive from more than just money paid to researchers by drug companies. The purpose of this editorial is first to discuss the nature of COIs and to describe some of the human behavioural research relevant to COIs. I will then discuss how COIs pervade every aspect of publishing and how the Journal of Psychiatry and Neuroscience attempts to deal with these issues. Finally, I will argue that there is no entirely satisfactory way of dealing with COIs, but that all researchers should be aware of the issues discussed here to minimize the extent to which COIs can distort the scientific literature.

Creation of bias and the nature of COIs

A COI occurs when individuals’ personal interests are in conflict with their professional obligations. Often this means that someone will profit personally from decisions made in his or her professional role. The personal profit is not necessarily monetary; it could be progress toward the personal goals of the individual or organization, for example the success of a journal for a publisher or editor or the acceptance of ideas for a researcher. The concern is that a COI may bias behaviour, and it is the potential for bias that makes COIs so important. Before getting into the specifics of COIs, I will describe some of the research on the biases we all have, the evidence that we are not always aware of our own biases, how biases can be created by vested interests and how people behave in response to revelations of COIs. The idea that scientists are objective seekers of truth is a pleasing fiction, but counterproductive in so far as it can lessen vigilance against bias.

A recent short review in Science asks how well people know their own minds and concludes the answer is not very well. 3 This is because “In real life, people do not realize that their self-knowledge is a construction, and fail to recognize that they possess a vast adaptive unconscious that operates out of their conscious awareness.” Wilson and Brekke 4 reviewed some of the unwanted influences on judgments and evaluations. They concluded that people find it difficult to avoid unwanted responses because of mental processing that is unconscious or uncontrollable. Moore and Loewenstein 5 argue that “the automatic nature of self-interest gives it a primal power to influence judgment and makes it difficult for people to understand its influence on their judgment, let alone eradicate its influence.” They also point out that in contrast to self-interest, understanding one’s ethical and professional obligations involves a more thoughtful process. The involvement of different cognitive processes may make it difficult to reconcile self-interest and obligations. MacCoun, 6 in an extensive review, examined the experimental evidence about bias in the interpretation and use of research results. He also discussed the evidence and theories concerning the cognitive and motivational mechanisms that produce bias. He concluded that people assume that their own views are objective and “that subjectivity (e.g., due to personal ideology) is the most likely explanation for their opponents’ conflicting perceptions.” This is consistent with the suggestion of Platt, almost 50 years ago, that researchers’ attachment to their own ideas results in competition among researchers rather than ideas. 7

An early experimental study by Mahoney 8 is a particularly striking example of how researchers’ bias can influence their behaviour. Reviewers were asked to referee manuscripts, all of which had identical methodology but reported different results. Reviewers were strongly biased against manuscripts that reported results that contradicted their own theoretical perspectives. This can have a deleterious effect as ideas that have long since been contradicted can persist in the literature. 9 , 10 Researchers’ biases caused by preference for their own ideas can cause a serious COI when they present their own work and when they are involved in any aspect of peer review. Nonetheless, much more attention is paid to COIs owing to external influences such as money than to COIs related to researchers’ inherent biases.

Cain and Detsky 11 reviewed some of the evidence on how biases can be created and how they can bias the opinions of everyone. Experimental evidence supports the idea that “individuals use different strategies to evaluate propositions depending on whether the hypothesis is desirable or threatening/disagreeable to them.” For example, a much higher proportion of people agree with the proposition that if someone sues you and you win the case the other person should pay your legal costs than with the essentially identical proposition that if you sue someone and lose the case you should pay the costs. Cain and Detsky discuss some of the experimental work that demonstrates how people come to have biased opinions. For example, opinion can be biased by the first information encountered on a topic, a conclusion with obvious implications if the first information a physician or researcher learns about a drug comes from the pharmaceutical company developing that drug. Experimental evidence also supports the idea that it is difficult to overcome the biases created by the effect of early information on beliefs. This explains why beliefs derived from experimental or epidemiological studies persist even after clinical trials provide more compelling contradictory evidence. Cain and Detsky suggest that “physicians have many relationships that may result in bias” — not just those involving pharmaceutical companies and not just those involving money — and warn that “such bias may be difficult to undo.” The same conclusions surely apply to researchers. In another review, Dana and Loewenstein 12 describe the evidence indicating that gifts from industry can create bias. They conclude that self-serving bias prevents individuals from being objective even when they have a motivation to be objective; that instructions given to individuals about bias do not prevent them from becoming biased, suggesting a role for the unconscious in this process; and that self-interest alters the way individuals seek out and assess information.

One of the main strategies used to mitigate the effects of bias related to COI is disclosure. Most peer-reviewed journals require authors to make a COI statement that is often published with the article. The idea behind disclosure is that the reader of the article will be more skeptical about any claims made in the article. In an experimental study, different groups read a manuscript in which a COI was mentioned or not mentioned. Those reading the study with the mention of a COI considered the study to be less interesting and important. 13 However, given the evidence that people do not always know their own minds, these results have limitations. On the basis of a review of the evidence on the effectiveness of disclosing COIs and on an experimental study, Cain and colleagues 14 concluded that disclosure may not always be useful for 2 reasons. First, those declaring a COI may feel entitled to deviate from what they consider objectivity because they have declared a COI. They may also exaggerate to overcome any diminished weight that the reader may put on what they have written. Second, those who read articles in which the author declares a COI may not discount biased information as much as they should because of a tendency to be influenced by information they know they should ignore and possibly because the act of disclosure may make them more likely to place greater weight on the author’s statements given the author’s openness in admitting to the COI. Whatever the reason, in some circumstances disclosure may result in the recipient of the biased information placing greater weight on the biased information.

A recent editorial in Nature Medicine discusses the difference between a perceived and an actual COI. 15 The editorial discusses the fact that the casual reader may consider there is a COI in sponsored content, but that because the “sponsors never have a say on the editorial content of anything [they] publish,” and because the editorial content for supplements is already commissioned before potential sponsors are approached, any COI is apparent rather than real. However, as discussed, humans do not always know their own minds and are not always aware of their own biases. Articles may be commissioned to suit a particular sponsor’s biases even without the person commissioning them being aware of that fact. In my opinion, it is not possible to state categorically that a COI is apparent rather than real.

All those involved in the research literature, including publishers, editors of journals, reviewers of manuscripts and authors, can have COIs. In the rest of this article I will discuss some of the factors that lead to COIs for each of these groups, describe how the Journal of Psychiatry and Neuroscience tries to deal with each of these issues and suggest how the current situation can be improved.

The pervasiveness of COIs in publishing

Publishers are acting with a COI whenever they interfere with the day-to-day management of a journal by the editorial staff. Two extreme versions of this have come to light recently. According to a recent report in the BMJ 16 concerning a court case about the Merck anti-arthritis drug rofecoxib (Vioxx), Elsevier has apologized for the improper publication of Merck-sponsored marketing material “that was made to look like journals.” More details are given in a report in Nature Medicine . 17 In a second case reported in Nature , 18 a computer-generated hoax article was submitted to The Open Information Science Journal published by Bentham Science Publishing. The paper was accepted and the authors were asked to pay US$800 for publication. At this point the authors withdrew their manuscript. The editor-in-chief of the journal, when contacted by Nature, reported that he had not seen the article and stated that he would resign.

Several of the top medical journals are owned by medical associations. As these journals often carry news and opinion items in addition to research reports there may sometimes be a conflict between the opinions of the editor of a journal and those of the officers of the association that owns the journal. Such conflicts have resulted in the departure of the editors of the New England Journal of Medicine , 19 the Journal of the American Medical Association 20 and the Canadian Medical Association Journal ( CMAJ ). 21 The CMAJ is published by the Canadian Medical Association, also the publisher of the Journal of Psychiatry and Neuroscience . The firing of the editor of the CMAJ led to the Canadian Medical Association adopting 25 recommendations of a review panel that enshrine editorial independence in the governance structure of all journals published by the association. 22

Given the cost of publishing, money is an important factor that can lead to COIs for publishers. This is true whether a publisher is for-profit or not-for-profit given that even not-for-profit publishers have to remain financially sound. The costs of publishing must be funded somehow, and the most common sources are journal subscriptions, advertising and publication charges. Advertising by drug companies is common in medical journals, and this is sometimes problematic. Othman and colleagues 23 did a systematic review of articles on advertisements in medical journals that included 24 articles assessing advertisements from journals in 26 countries. Although most of the advertisements made claims that were supported by a systematic review, meta-analysis or randomized controlled trial, some advertisements made claims that were not well supported by evidence. In some countries, most claims were not well supported. Another issue is that advertisements sometimes focus on the newest, most expensive drugs that may not be superior to cheaper alternatives. 24 One point of view is that medical journals should not accept advertising from industries relevant to medicine. 24 The alternatives, subscriptions and publication charges, also have their problems. The money spent on journal subscriptions by university and hospital libraries is not available for other purposes, and publication charges, which are usually paid from research grants, take away money that could otherwise be devoted to research. Thus, there is always a conflict between the publisher’s interest in remaining financially sound and its responsibility to the researchers who provide the manuscripts and read the papers. A recent article in Nature (published by the Nature Publishing Group, a for-profit publisher) on one of the most prominent open-access not-for-profit publishers, the Public Library of Science (PLoS), gives an interesting perspective on publication charges. 25 The title of the article is “PLoS stays afloat with bulk publishing.” The article states that the financial situation of PLoS has improved “thanks to a cash cow in the form of PLoS One, ” which “uses a system of ‘light’ peer review” and has generated substantial amounts of money from author fees. PLoS One reviews only for methodology, not for significance of the results, and minimizes costs by publishing only online. My own perspective is that this is an imaginative innovation that, in addition to being financially sound, may become an important model for publishing research. As the editorial board of Nature knows well, the significance of research is sometimes hard to discern. Nature itself turned down the opportunity to publish the paper by Hans Krebs describing what Krebs called the citric acid cycle and everyone else calls the Krebs cycle. 26 , 27 The issue with publication charges, as with advertising, is how the COI is addressed. Policies related to advertising in medical journals are usually available, and a recent review summarizes some of those policies from 9 of the top medical journals. 24

Publishers are capable of finding surprising ways to act inappropriately in the face of COIs. According to a recent report in the BMJ , Elsevier offered $25 gift cards to academics to encourage them to post favourable reviews of the academic textbook Clinical Psychology , although subsequently Elsevier admitted this was a mistake. 28

The Journal of Psychiatry and Neuroscience is an open-access journal that has no publication charge. Its main source of revenue is advertising in the print edition. The policies that govern advertising in the journal are available on the Canadian Medical Association website ( www.cma.ca/index.cfm/ci_id/25274/la_id/1.htm ). For me as an editor, the important issues are that I have no contact with those who obtain advertising for the journal and do not know what advertisements will appear in any issue. The administrative staff ensures that advertisements do not appear in inappropriate places (e.g., an advertisement for an antidepressant next to an article on depression or antidepressants).

Conflicts of interest for editors are usually taken to mean conflicts related to funding from industry, and the Journal of Psychiatry and Neuroscience is among those journals that publish this information on the journal website ( www.cma.ca/jpn ). However, in my opinion non-financial issues are probably more important. Every editor wants his or her journal to be a success. The most widely used, but much criticized, measure of a journal’s success is the impact factor. Acting in a way that will increase the impact factor of a journal is not always entirely compatible with the professional responsibilities of an editor.

The impact factor for a journal is based on the rate at which articles in the journal are cited. For example, the impact factor for 2008 is the sum of citations in 2008 to articles published in the journal in 2006 and 2007, divided by the number of articles published in 2006 and 2007. The number of citations a paper receives can certainly be an indication of its importance. However, the relation between citations and importance is not a tight one. Obviously papers in a popular field will tend to receive more citations than those in a less popular field, irrespective of quality. This is an issue of some concern. In an important paper on “Why most published research findings are false,” Ioannidis 29 discusses some of the factors that lead to false findings. He points out that, from a theoretical perspective, “the hotter a scientific field (with more scientific teams involved), the less likely the research findings are to be true.” Pfeiffer and Hoffmann 30 have provided some empirical support for this prediction. In biological psychiatry research, one popular area is psychiatric genetics. Unfortunately, associations that are reported between particular gene polymorphisms and disorders or symptoms are often not replicated or confirmed by meta-analyses. 31 , 32 The false discovery rate may be as high as 95%. 33 Interestingly, genetic association studies published in journals with a high impact factor are more likely to provide an overestimate of the true effect size, owing in part to small sample sizes. 34 The International Journal of Neuropsychopharmacology demonstrated an interesting approach to the problem of non-replication in psychiatric genetic studies when it published a paper on the interaction between the 5-HTTLPR serotonin transporter polymorphism and environmental adversity and the risk for depression and anxiety. 35 In the same issue there was a review on the lack of replication in such genetic studies that suggested the former paper might provide “further evidence that the literature to date is compatible with chance findings.” 36 All judgments about the quality of research papers are subjective. Nonetheless, an editor who selects for publication a psychiatric genetic study with a relatively small sample size and a level of significance not much better than 0.05 over an innovative and methodologically sound manuscript dealing with a topic that is not currently popular may be helping to enhance the impact factor of the journal at the expense of its scientific quality.
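The arithmetic behind a false discovery rate of this size is easy to reproduce. The sketch below (illustrative only; the prior, power and alpha values are invented, not taken from the cited papers) shows how, when only a small fraction of tested associations are real, significance testing at alpha = 0.05 yields mostly false positives:

```python
def false_discovery_rate(prior, power, alpha):
    """Expected share of nominally significant findings that are false,
    given the prior probability that a tested association is real."""
    true_positives = power * prior          # real effects detected
    false_positives = alpha * (1 - prior)   # null effects crossing the threshold
    return false_positives / (false_positives + true_positives)

# Candidate-gene-style scenario: 1 in 200 tested polymorphisms truly
# associated, 50% power, alpha = 0.05 (numbers invented for illustration).
print(f"{false_discovery_rate(prior=0.005, power=0.5, alpha=0.05):.0%}")  # 95%
```

Under these assumptions, 19 of every 20 “significant” associations are spurious, which matches the order of magnitude of the false discovery rate cited above.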

One direct way in which editors can manipulate impact factors is by altering the timing of publication of papers. If, for example, a paper that is likely to be highly cited was published in the December 2010 issue of a journal, citations that would contribute to the 2011 impact factor would have to occur between 1 and 13 months after publication, but citations are unlikely to occur within 6 months of publication. If the same paper were published in January 2011, citations that occur between 12 and 24 months after publication would contribute to the 2012 impact factor. Thus, publishing papers that are likely to have a high citation rate early in any year will help to inflate the impact factor of a journal. Obviously this is unfair to authors if the publication of their paper is delayed, and I am not aware whether it ever occurs. Nonetheless, there is evidence that some editors do take estimated citation rates into account when making decisions. Chew and colleagues 37 analyzed impact factor trends for medical journals and interviewed the editors. They concluded that rising impact factors were due to deliberate editorial practices, in spite of the editors’ dissatisfaction with impact factors as the measure of the quality of a journal. One quotation from an editor is particularly salient: “our basis for rejection is often ‘I don’t think this paper is going to be cited.’” It is not clear from this quotation whether the editors would reject a manuscript because they thought the citation rate was more important than the quality of the science or because they equated the citation rate with the quality of the science.
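To make the timing argument concrete, here is a small sketch (a simplification using whole-month arithmetic; real figures depend on exact publication and citation dates) of the window, in months after publication, during which a citation can still count toward a given year's two-year impact factor as defined above:

```python
def counting_window(pub_year, pub_month, if_year):
    """Approximate window, in whole months after publication, during which
    a citation can count toward if_year's two-year impact factor."""
    if if_year - pub_year not in (1, 2):
        return None  # published outside the two preceding years: never counts
    start = (if_year - pub_year) * 12 - pub_month + 1   # January of if_year
    return (start, start + 11)                          # December of if_year

print(counting_window(2010, 12, 2011))  # (1, 12): citations must arrive almost at once
print(counting_window(2011, 1, 2012))   # (12, 23): spans the usual citation peak
```

Counting partial months at both ends gives the 1-to-13 and 12-to-24 month figures quoted above; either way, a December paper has almost no time to accumulate countable citations, while a January paper's eligible window covers the peak of the typical citation curve.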

Not all COIs for editors are related to impact factors. The desire of editors to please authors by having a manuscript reviewed as quickly as possible, thereby encouraging authors to submit further manuscripts, can be in conflict with getting excellent reviews. The assertion by Ioannidis 29 that much of what is in research journals is false can only be correct if standards of reviewing are not very good. Unfortunately this idea is supported by research. In a test of what errors peer reviewers detect, reviewers detected an average of 2.6 of 9 major errors in test manuscripts, and this number was not improved after reviewer training. 38 Serious statistical errors are common even in some high-profile journals. 39 The best peer reviewers are usually busy people who will not necessarily be able to produce reviews promptly, and adding an expert statistical review to the content reviews may increase the time needed to review a manuscript. However, it is not possible to say to what extent, if at all, the poor standards of reviewing are due to the desire of some editors to speed up the process of review at the expense of the quality of the reviews.

Conflicts of interest for editors may also arise from the publication of supplements, the publication of papers by an editor, and non-adherence to important guidelines for reporting. Journal supplements, which are often subsidized by the pharmaceutical industry, can help improve the financial standing of a journal, which is often a concern for editors and publishers. However, a study concluded that manuscripts “published in journal supplements are generally of inferior quality compared with articles published in the parent journal.” 40 Editors can legitimately publish a peer-reviewed article in the journal they edit as long as the manuscript undergoes peer review that is as thorough as that for all other manuscripts, and the member of the editorial board overseeing the peer review does his or her best to ensure that any bias in the assessment of the manuscript is minimized. This may not always be so. Nature recently reported on the editor of a theoretical physics journal who was facing growing criticism after publishing nearly 60 papers in 1 year in the journal he edited. 41 In terms of guidelines for reporting, many journals adhere to the statement of the International Committee of Medical Journal Editors ( www.icmje.org/publishing_10register.html ), which requires that, to be considered for publication, clinical trials must be registered in a public trials registry at or before the onset of patient enrolment. However, some well-known journals in biological psychiatry publish the results of clinical trials without giving any information about trial registration, suggesting that the trials may not have been registered. One possible explanation is that the editors value the citations received by clinical trials, which are often highly cited, more than adherence to the trial registration policy.

Among the policies that the Journal of Psychiatry and Neuroscience has adopted to minimize any effect of editors’ COIs are reporting of financial COIs of editorial board members on the journal website, publishing peer-reviewed papers in the order in which they were accepted (with the exception of including short commentaries on topical subjects or moving a shorter paper forward when a longer paper will not fit the page allotment of the journal), giving all published papers that contain statistics a full review by a statistician, not publishing supplements, ensuring that all papers from members of the editorial board go through full peer review and adhering to guidelines such as the registration of clinical trials.

Conflicts of interest for reviewers are, in part, similar to those for authors. If a manuscript discusses medications and a reviewer has some connection with a pharmaceutical company that is involved with any medication mentioned in the manuscript or a drug of the same class produced by a competitor, this COI should be mentioned to the editor; the Journal of Psychiatry and Neuroscience asks reviewers to mention any COI to the editor. Other COIs for reviewers are less clear and are, in my experience, seldom mentioned. These include any possible personal relationship (positive or negative) with any of the authors of a manuscript and professional rivalry owing to the reviewer and authors researching similar topics. Reviewers have their own biases based on their own research approaches. In my experience, if a reviewer recommends that the authors cite an additional reference, more often than not it is to one of the reviewer’s own papers, and the recommendation is not always appropriate. An important COI for reviewers is the conflict between the professional obligation to produce a well thought-out review in a timely manner and the desire not to spend too much time on a task that is relatively thankless. Reviewers seldom read the instructions on what is required in a review. The editor of Obstetrics and Gynecology inserted the following sentence in the middle of a paragraph of instructions for reviewers: “If you read this and call or fax our office, we will send you a gift worth 20 dollars.” 42 The response rate was 17%. A minority of reviewers who agree to review a manuscript never submit their reviews or clearly do not devote the time needed to their reviews. The latter is readily apparent when, for example, a reviewer’s assessment includes factual errors about the design of the study. Behaviours like this inconvenience editors and can adversely impact authors by delaying decisions on manuscripts.

Little research has been done on the factors that influence reviewers’ decisions, and more is needed so that editors can take into account possible biases in reviewers’ assessments. As mentioned, reviewers miss many important flaws in manuscripts, and training does not improve this situation. 38 In ecology research, recommendations to reject are not influenced by the reviewer’s age, but reviewers who have more papers in high-impact journals recommend rejection of manuscripts at up to twice the rate of reviewers with few or no papers in high-impact journals. 43 Although this is an indication of different biases among different reviewers, it does not necessarily reflect a COI.

The COIs of authors include those conflicts that have the potential to affect how the research was conducted and interpreted, as well as those that influence how it is presented, which is why financial COIs for authors are an important issue. A review of studies on the extent, impact and management of financial COIs reported a significant association between industry sponsorship and pro-industry outcomes in published papers and concluded that financial ties between industry and academia influence biomedical research in important ways. 44 This is consistent with the idea discussed earlier in this editorial that admitting to a financial COI does not necessarily deal with the bias that the financial COI creates. The issue of what exactly constitutes a financial COI can be complex. The website of the National Institutes of Health (NIH) in the United States on frequently asked questions about financial COIs is more than 5,000 words long ( http://grants.nih.gov/grants/policy/coifaq.htm#c1 ). However, the bottom line is that NIH requires anything over $10 000 per year to be declared. This may seem high to some, but GlaxoSmithKline recently announced that it would limit the advisory payments and honoraria it gives to US doctors to (only?) $150 000 per year. 45 Some journals require any financial COI to be declared, no matter how small. Although payment of a $500 honorarium may not create as big a bias as a $50 000 consultant payment, it is unrealistic to think that researchers might mention the exact amount of payments when declaring COIs.

Because financial COIs have been the subject of many recent articles, this editorial focuses on other COIs that authors should be attempting to deal with. The first, and by no means trivial, COI that is an issue for the vast majority of authors is the pride and sense of ownership that authors take in the work they submit for publication. This presumably is responsible for the fact that when authors were interviewed about their published papers “important weaknesses were often admitted on direct questioning but were not included in the published article.” 46 Certainly editors are used to asking authors to mention the limitations of their studies and to be more cautious about the implications of the research. Another related factor is the desire for researchers to advance their careers and get recognition from their peers. Research suggests that social and monetary reward may work through both psychological 47 and neuroanatomical processes 48 that overlap to some extent. The big difference in relation to COIs is that social rewards, unlike monetary rewards, cannot be disclosed in any meaningful way.

In some situations COIs can arise because all the authors need to take responsibility for the content of a manuscript. If an author is included who does not fulfill the requirements of the International Committee of Medical Journal Editors for authorship ( www.icmje.org/ethical_1author.html ), then both that person and the other authors are not fulfilling their professional obligations. Another related problem is that of ghost authorship (i.e., when someone who was not involved in the work, often a pharmaceutical company employee, writes a manuscript but does not appear as an author; see Ross and colleagues 49 ). Ghostwriting may be part of a pharmaceutical company effort to promote products through “carefully orchestrated campaigns to pass off sympathetic, if not biased, research and review articles as the work of academic scientists rather than of their own contracted employees.” 50 Finally, there may be conflict among the different authors in how to present and interpret the results of a study. Attempts to resolve these issues are not always successful. Interviews of authors of papers published in The Lancet revealed that individual authors often disagreed with opinions expressed in the papers and that the papers revealed “evidence of (self)-censored criticism, obscured meaning, and confused assessment of implications.” 46 Overall, the evidence suggests that non-monetary COIs can create similar problems to monetary ones.

The Journal of Psychiatry and Neuroscience asks all authors to sign a statement about any financial COIs they may have, state what role they played in the research and writing of the manuscript, state whether they approved the final version of the manuscript, and indicate whether there was anyone involved in writing the manuscript who was not an author.

In spite of all the problems created by bias and COIs, research continues to advance. However, the speed of the advance might be enhanced if these problems could be reduced. Obviously there needs to be better training and mentoring of scientists concerning COIs and bias. Unfortunately, a recent study on the effects of mentoring and training in responsible conduct of research concluded that these interventions have the potential to influence behaviour in ways that can both increase and decrease the likelihood of problematic behaviour. 51 More research on effective training and mentoring techniques is needed urgently. Fortunately, some relevant information is available in the psychology literature. In experimental studies, for example, asking participants to consider the opposite of their own opinion was more effective in reducing their biases than asking them to be as fair and unbiased as possible without giving them a specific strategy to achieve this aim. 52

The investigations of Senator Charles Grassley have intensified the debate about sources of bias in the literature and how they may be reduced. However, the debate has focused rather narrowly on money and the objective of a literature relatively free of bias remains a pious but distant hope.

Competing interests: None declared (if you consider competing interests to be limited to financial ones).
