Homogeneity, Homogeneous Data & Homogeneous Sampling


What is Homogeneity?

A data set is homogeneous if it is made up of things (e.g., people, cells, or traits) that are similar to each other. For example, a data set made up of 20-year-old college students enrolled in Physics 101 is a homogeneous sample.

What is Homogeneous Sampling?

In homogeneous sampling, all the items in the sample are chosen because they have similar or identical traits. For example, people in a homogeneous sample might share the same age, location, or employment. The selected traits are ones that are useful to the researcher. Homogeneous sampling is a type of purposive sampling and is the opposite of maximum variation sampling.

In short, homogeneous samples tend to be made up of similar cases.

The opposite of a homogeneous sample is a heterogeneous sample. For example, you might have a heterogeneous sample of 18- to 21-year-old students drawn from History 112, Chemistry 211, and Physics 101. The same distinction applies to populations: in a heterogeneous population the items have different characteristics, while in a homogeneous population all items share the same characteristics.

Homogeneous in More General Terms

In data analysis, a set of data is also considered homogeneous if its variables are all of one type (e.g., all binary or all categorical); if the variable types are mixed (e.g., binary and categorical), the data set is heterogeneous.

While it’s common in statistics to use “homogeneous” in the general sense of being the same, a data set can also be analyzed mathematically to see whether it is homogeneous. There are several ways to do this:

  • Compare boxplots of the data sets.
  • Compare descriptive statistics (especially the variance, standard deviation, and interquartile range).
  • Run a statistical test for homogeneity.
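The first two checks can be sketched with Python’s standard library. The data and the `describe` helper below are made up for illustration:

```python
import statistics

def describe(xs):
    """Return the variance, standard deviation and interquartile range."""
    q = statistics.quantiles(xs, n=4)      # quartiles Q1, Q2, Q3
    return {
        "variance": statistics.variance(xs),
        "stdev": statistics.stdev(xs),
        "iqr": q[2] - q[0],                # Q3 - Q1
    }

group_a = [20, 21, 20, 22, 21, 20, 21]    # ages in a homogeneous sample
group_b = [18, 25, 40, 19, 33, 52, 21]    # ages in a heterogeneous sample

stats_a = describe(group_a)
stats_b = describe(group_b)
```

A homogeneous group should show a noticeably smaller spread on all three measures, which is exactly what comparing `stats_a` against `stats_b` reveals here.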

Statistical Tests

Running statistical tests for homogeneity becomes important when performing any kind of data analysis, as many hypothesis tests rest on the assumption that the data have some type of homogeneity. For example, an ANOVA test assumes that the variances of the different populations are equal (i.e. homogeneous).

One example of such a test is the Chi-Square Test for Homogeneity. It tests whether two populations come from the same unknown distribution (if they do, they are homogeneous). The test is run the same way as the standard chi-square test: the Χ² statistic is computed, and the null hypothesis (that the data come from the same distribution) is either rejected or not rejected.
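The statistic itself is easy to compute by hand. In the sketch below, the counts are made up and `chi_square_homogeneity` is a hypothetical helper (not a library function); each expected count is (row total × column total) / grand total, and the statistic sums (O − E)²/E over every cell:

```python
def chi_square_homogeneity(observed):
    """observed: list of rows (one per population) of category counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand
            chi2 += (o - expected) ** 2 / expected
    return chi2

# Two samples, two categories, so df = (2-1) * (2-1) = 1.
sample_1 = [30, 70]
sample_2 = [32, 68]
stat = chi_square_homogeneity([sample_1, sample_2])

# Reject the null at the 5% level if the statistic exceeds
# the df=1 critical value of 3.841.
homogeneous = stat < 3.841
```

With counts this similar the statistic is tiny, so the null hypothesis is not rejected and the two samples are consistent with one shared distribution.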

Homogeneity of Variance

Homogeneity of variance (also called homoscedasticity) means that two or more groups or populations have equal variances. It is the assumption behind ANOVA noted above, and it can be checked formally with tests such as Levene’s test or Bartlett’s test.
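Before running a formal test such as Levene’s, a rough informal check is to compare the sample variances directly; a common rule of thumb holds that if the largest group variance is no more than about four times the smallest, the assumption is plausible. A minimal stdlib sketch, with made-up data and a hypothetical `variance_ratio_check` helper:

```python
import statistics

def variance_ratio_check(groups, threshold=4.0):
    """Rule-of-thumb screen for homogeneity of variance: plausible if the
    largest group variance is at most `threshold` times the smallest."""
    variances = [statistics.variance(g) for g in groups]
    return max(variances) / min(variances) <= threshold

g1 = [5.1, 4.9, 5.3, 5.0, 4.8]
g2 = [5.4, 5.2, 5.6, 5.1, 5.3]
ok = variance_ratio_check([g1, g2])   # the two spreads are nearly identical
```

This is only a screen, not a test; a borderline ratio should be followed up with Levene’s or Bartlett’s test before relying on ANOVA results.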


An Introduction to Model-Based Survey Sampling with Applications


3 Homogeneous Populations

  • Published: January 2012

This chapter describes the simplest possible model for a finite population: the homogeneous population model. It is appropriate when there is no auxiliary information that can distinguish between different population units. The homogeneous population model assumes equal expected value and variance of the variable of interest for all population units. Values from different units are assumed to be independent, although this assumption is relaxed in the last section of the chapter. The empirical best and best linear unbiased predictors of a population total are derived under the model. Inference, sample design and sample size calculation are also discussed. The most appropriate design for this kind of population is usually simple random sampling without replacement. The urn model (also known as the hypergeometric model), a special case of the homogeneous population model, is also discussed.
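Under this model with simple random sampling without replacement, the natural predictor of the population total is the expansion estimator N·ȳ. The sketch below uses the standard SRSWOR variance estimate N²(1 − n/N)s²/n with a finite population correction, not necessarily the book’s own notation, and the population data are simulated:

```python
import random
import statistics

def estimate_total(sample, N):
    """Expansion estimator N * ybar of a population total under the
    homogeneous model, with its SRSWOR variance estimate."""
    n = len(sample)
    ybar = statistics.mean(sample)
    s2 = statistics.variance(sample)
    total_hat = N * ybar
    var_hat = N * N * (1 - n / N) * s2 / n   # finite population correction
    return total_hat, var_hat

random.seed(0)
population = [random.gauss(50, 5) for _ in range(1000)]  # homogeneous units
sample = random.sample(population, 100)
total_hat, var_hat = estimate_total(sample, N=1000)
```

Note that when the whole population is sampled (n = N) the finite population correction drives the variance estimate to zero, as it should: the total is then known exactly.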


Research Design Review

A discussion of qualitative and quantitative research design. Focus groups: heterogeneity vs. homogeneity.

The following is a modified excerpt from Applied Qualitative Research Design: A Total Quality Framework Approach (Roller & Lavrakas, 2015, pp. 107-109).

Heterogeneity

Fundamental to the design of a focus group study is group composition. Specifically, the researcher must determine the degree of homogeneity or heterogeneity that should be represented by the group participants. There are many questions the researcher needs to contemplate, such as the extent of similarity or dissimilarity in participants’ demographic characteristics, as well as in their experiences and involvement with the subject matter.

A few of the questions the focus group researcher might consider when determining the desired heterogeneity or homogeneity among group participants are discussed below.

Whether or not—or the degree to which—group participants should be homogeneous in some or all characteristics has been at the center of debate for some years. On the one hand, Grønkjær, Curtis, Crespigny, and Delmar (2011) claim that at least some “homogeneity in focus group construction is considered essential for group interaction and dynamics” (p. 23)—for example, participants belonging to the same age group may have similar frames of reference and feel comfortable sharing their thoughts with people who have lived through the same experience. In the same vein, Sim (1998) states that, “the more homogeneous the membership of the group, in terms of social background, level of education, knowledge, and experience, the more confident individual group members are likely to be in voicing their [own] views” (p. 348). Even among strangers, there is a certain amount of comfort and safety in the group environment when the participants share key demographic characteristics, cultural backgrounds, and/or relevant experience.

A problem arises, however, if this comfortable, safe environment breeds a single-mindedness (or “groupthink”) that, without the tactics of a skillful moderator, can stifle divergent thinking and result in erroneous, one-sided data. Heterogeneity of group participants (e.g., including users and nonusers of a particular child care service within the same focus group) potentially heads off these problems by stimulating different points of view and a depth of understanding that comes from listening to participants “defend” their way of thinking (e.g., product or service preferences). As Grønkjær et al. (2011) state, “a group may be too homogeneous; thus influencing the range and variety of the data that emerges” (p. 26). The tension that heterogeneity may create in a group discussion can serve to uncover deeper insights into what is being studied, provided the moderator is able to channel this tension in constructive directions. In addition to a heightened level of diversity, heterogeneous groups may also be a very pragmatic choice for the researcher who is working with limited time and financial resources, or whose target population for the research is confined to a very narrow group (e.g., nurses working at a community hospital).

Ultimately, the answer to the question of whether group participants should be homogeneous or heterogeneous is “it depends.” As a general rule, group participants should have similar experiences with, or knowledge of, the research topic (e.g., using the Web to diagnose a health problem, weekly consumption of fat-free milk), but the need for “sameness” among participants on other parameters can fluctuate depending on the circumstance. Halcomb, Gholizadeh, DiGiacomo, Phillips, and Davidson (2007), for example, report that homogeneity of age is particularly important in non-Western countries where younger people may believe it is disrespectful to offer comments that differ from those stated by their elders. Homogeneous groups are also important when investigating sensitive topics, such as drug use among teenagers, where a more mixed group of participants with people who are perceived as “different” (in terms of demographics and knowledge/experience with drugs) may choke the discussion and lead to a struggle for control among participants (e.g., one or more participants trying to dominate the discussion).

Homogeneity of gender, on the other hand, may or may not be important to the success (usefulness) of a focus group study. For example, an organization conducting employee focus group research to explore employees’ attitudes toward recent shifts in management would need to conduct separate groups with men and women in order to discover how the underlying emotional response to new management differs between male and female employees. In contrast, a focus group study among city residents concerning public transportation might benefit from including both men and women in the same discussion, among whom the varied use and perceptions of the transportation services would serve to stimulate thinking and enrich the research findings. The heightened level of dynamics in groups that are heterogeneous in gender and other aspects may also provoke conversations on taboo subjects (e.g., racism) that might not be forthcoming in other methods such as in-depth interviews.

Grønkjær, M., Curtis, T., de Crespigny, C., & Delmar, C. (2011). Analysing group interaction in focus group research: Impact on content and the role of the moderator. Qualitative Studies, 2(1), 16–30.

Halcomb, E. J., Gholizadeh, L., DiGiacomo, M., Phillips, J., & Davidson, P. M. (2007). Literature review: Considerations in undertaking focus group research with culturally and linguistically diverse groups. Journal of Clinical Nursing, 16(6), 1000–1011. https://doi.org/10.1111/j.1365-2702.2006.01760.x

Sim, J. (1998). Collecting and analysing qualitative data: Issues raised by the focus group. Journal of Advanced Nursing, 28(2), 345–352. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/9725732



Design decisions about focus groups should be deliberate (unless the research is conducted with a naturally occurring group). However, as with any research endeavor, what drives design decisions is the research question(s) being asked.

Thank you, Joe, for the comment. Indeed, the research question(s), and the type of participants needed to address the objectives, guide all sorts of design decisions. Reflecting on the type of participants and the objectives is important to the heterogeneity–homogeneity consideration.


  • Published: 14 October 2021

Improving diversity in medical research

Ashwarya Sharma (ORCID: orcid.org/0000-0002-8653-1325) & Latha Palaniappan

Nature Reviews Disease Primers, volume 7, article number 74 (2021)


Clinical research is essential for the advancement of medicine; however, trials often enrol homogeneous populations that do not accurately represent the patient populations served. Representative and diverse research participation is necessary to establish fair standards of care, minimize outcome disparities between populations, and achieve and uphold social equity.

Since the early 2000s, clinical research has become more global and complex, as the number of clinical trials conducted to create interventions that improve patients’ health increases worldwide. Interest is growing in the research community in recognizing the unique characteristics of clinical trial participants, in an attempt to better understand variability in drug responses. However, clinical trials still under-represent the diversity of patient populations, even though diversity is a vital component in enabling medical research to move towards precision medicine approaches.

Severe imbalance in the representation of minorities is not new; clinical research has long been criticized for enrolling homogeneous populations that do not accurately represent the communities served.

In a 2020 analysis of global participation in clinical trials, the FDA highlighted the vast difference between the enrolled participants and the global population. Of 292,537 participants in clinical trials globally, 76% were white, 11% were Asian and only 7% were Black [1]. In comparison, the global population (~7.8 billion) is distributed with ~60% of the population in Asia, ~16% in Africa, ~10% in Europe and ~8% in Latin America (World Population Review). Similarly, a review of 379 clinical trials funded by the US National Institute of Mental Health and published between 1995 and 2004 found that all racial or ethnic groups except white individuals and African Americans were under-represented, and only ~48% of the studies provided complete racial or ethnic information [2]. Thus, data for global populations are lacking, and current guidelines and clinical decisions are based on insufficiently diverse trials and studies.

Past thinking has favoured the enrolment of individuals with similar characteristics to limit heterogeneity and to decrease the effects of interindividual variability and achieve consistent short-term results. In the future, we must strive to represent all populations that will eventually use the tested drugs and devices. Although enrolling diverse populations may initially bring higher variability in results than enrolling homogeneous populations, the outcome data can be leveraged through statistical analysis and novel study designs to tailor and individualize therapies, which can ultimately result in improved generalizability for the populations we serve.


Development of interventions that are not tested in diverse populations can lead to treatments that are less effective and less trusted in some populations, despite their need for the intervention. For instance, 5-fluorouracil, a well-studied, commonly used chemotherapeutic drug, was found to lead to adverse effects, including haematological toxicities, in certain individuals. These toxic effects occurred at higher rates in African American individuals than in white individuals [3]. However, this observation was not revealed in preceding clinical trials, as these had limited patient diversity, which ultimately negatively affected African American individuals’ health care [4]. Similarly, Ninlaro (ixazomib), approved by the FDA in 2015 for the treatment of multiple myeloma, had only 1.8% Black participants in the phase III clinical trial despite African Americans having higher incidence and prevalence of this disease than European Americans [5,6]. Under-represented populations are deeply affected by these inequities, as they can lead to distrust and worse health outcomes for certain populations compared with others.

This failure to achieve meaningful diversity in health research also has considerable social and ethical implications, as individuals and entire subgroups that are traditionally under-studied may be unable to access potentially beneficial research. Overall, the imbalance leads to substantial differences in their lifelong care, leading to additional health inequities. Notably, the COVID-19 pandemic has further exposed these great inequalities in health, as Black, Latinx, Pacific Islander and other vulnerable populations have been disproportionately affected by SARS-CoV-2. A 2020 study showed that 34% of overall deaths were among non-Latinx Black people, although this group accounts for only 13% of the overall US population [7]. This increased disease mortality in these populations is thought to be due to pre-existing comorbidities, such as hypertension or diabetes, decreased access to testing, inequities in health-care delivery, exposure risks and, potentially, genetic differences. However, the relative risk of any of these underlying factors is unknown, as data addressing these issues are lacking.


Historically, five major challenges have been highlighted for reduced participation in clinical trials: low income; investigator bias; mistrust in medical research and professionals; limited health and research literacy; and lack of access to transportation [8]. In particular, investigator bias and medical mistrust are uniquely present in medical practice. Investigator bias, or the implicit biases that health-care providers may hold, can interfere with enrolment in clinical studies and is associated with poor quality of care. Patients from racial or ethnic minority groups have been found to receive poorer care than white patients across numerous illnesses, in part owing to such biases and to a lack of research on how specific diseases may uniquely affect various populations [8]. For instance, studies have shown that Asian American individuals are more likely to develop diabetes mellitus at lower body weight than white Americans [9], yet health information and research dedicated specifically to Asian American audiences remain limited. Mistrust and skepticism of medical professionals and the health-care system by minority and other under-represented groups exist owing to historical abuses, such as the US Public Health Service (USPHS) Syphilis Study at Tuskegee and the forced sterilization of American Indians. As a consequence, the affected communities participate less in trials and, in some cases, have poorer health outcomes [10].

To overcome the long-standing inequalities in health care and patient outcomes, the research community must commit to diversity and inclusion in clinical research. Here, we provide a framework for increasing diversity in clinical trials using the socioecological model (Fig.  1 ). Changes to public policy, community, institutional, interpersonal and intrapersonal domains can be used to increase diversity in research. At the public policy level, we can set strict requirements for representation of diverse populations as a necessity for approval of new drugs and devices. Uniform standards across research are needed to collect and record variables that capture various aspects of diversity, such as race or ethnicity, ancestry, language, religious practices and sexual orientation. At the community level, researchers must consider the specific priorities of patients and communities affected by the condition to ensure that the intended populations can be effectively recruited. At the institutional level, we must develop knowledge resources specifically for communities with historical medical mistrust. Institutions can transparently provide data for drug efficacy across different populations and acknowledge and address areas in which data do not currently exist. At the interpersonal level, we need to increase representation across training pathways to ensure diversity in all research and development teams. Further work is needed to understand knowledge, beliefs and attitudes towards clinical research at the intrapersonal level and known barriers to involvement in research should be addressed with the required support measures.

Figure 1. Changes to public policy, community, institutional, interpersonal and intrapersonal domains can result in increased diversity in research and help overcome inequalities in health care and patient outcomes.

The imbalance of representation of diverse groups in clinical research is a problem that continues to adversely affect health care for all and is one that medical professionals must be prepared to address.

[1] U.S. Food & Drug Administration. 2015–2019 drug trials snapshots summary report. FDA https://www.fda.gov/media/143592/download (2020).

[2] Mak, W. W. S. et al. Gender and ethnic diversity in NIMH-funded clinical trials: review of a decade of published research. Adm. Policy Ment. Health 34, 497–503 (2007).

[3] McCollum, A. D. et al. Outcomes and toxicity in African-American and caucasian patients in a randomized adjuvant chemotherapy trial for colon cancer. J. Natl Cancer Inst. 94, 1160–1167 (2002).

[4] Meta-Analysis Group In Cancer et al. Toxicity of fluorouracil in patients with advanced colorectal cancer: effect of administration schedule and prognostic factors. J. Clin. Oncol. 16, 3537–3541 (1998).

[5] U.S. Food & Drug Administration. Drug trials snapshots: Ninlaro. FDA https://www.fda.gov/drugs/drug-approvals-and-databases/drug-trials-snapshots-ninlaro (2016).

[6] Landgren, O. et al. Risk of monoclonal gammopathy of undetermined significance (MGUS) and subsequent multiple myeloma among African American and white veterans in the United States. Blood 107, 904–906 (2006).

[7] Holmes, L. Jr. et al. Black–white risk differentials in COVID-19 (SARS-COV2) transmission, mortality and case fatality in the United States: translational epidemiologic perspective and challenges. Int. J. Environ. Res. Public Health 17, 4322 (2020).

[8] Bierer, B. E. et al. Achieving diversity, inclusion, and equity in clinical research guidance document version 1.2. The MRCT Center of Brigham and Women’s Hospital and Harvard https://mrctcenter.org/diversity-in-clinical-research/download/2326/ (2021).

[9] Hsu, W. C. et al. BMI cut points to identify at-risk Asian Americans for type 2 diabetes screening. Diabetes Care 38, 150–158 (2015).

[10] George, S., Duran, N. & Norris, K. A systematic review of barriers and facilitators to minority research participation among African Americans, Latinos, Asian Americans, and Pacific Islanders. Am. J. Public Health 104, e16–e31 (2014).


Author information

Ashwarya Sharma & Latha Palaniappan, Department of Medicine, Stanford University School of Medicine, Stanford, CA, USA

Corresponding author

Correspondence to Latha Palaniappan .

Ethics declarations

Competing interests.

The authors declare no competing interests.

Related links

World Population Review: https://worldpopulationreview.com/continents


Sharma, A. & Palaniappan, L. Improving diversity in medical research. Nat Rev Dis Primers 7, 74 (2021). https://doi.org/10.1038/s41572-021-00316-8



Heterogeneity: what is it and why does it matter?

Posted on 29th November 2018 by Maximilian Siebert


Heterogeneity is not something to be afraid of; it just means that there is variability in your data. If you bring different studies together in a meta-analysis, there will inevitably be differences between them. The opposite of heterogeneity is homogeneity, meaning that all studies show the same effect.

It is important to note that there are different types of heterogeneity:

  • Clinical : Differences in participants, interventions or outcomes
  • Methodological : Differences in study design, risk of bias
  • Statistical : Variation in intervention effects or results

We are interested in these differences because they can indicate that our intervention may not be working in the same way every time it’s used. By investigating these differences, you can reach a much greater understanding of what factors influence the intervention, and what result you can expect next time the intervention is implemented.

Although clinical and methodological heterogeneity are important, this blog will be focusing on statistical heterogeneity .

How to identify and measure heterogeneity

Eyeball test

In your forest plot, look at how much the confidence intervals overlap, rather than at which side of the line of no effect the effect estimates fall. Whether the results lie on either side of the line of no effect may not affect your assessment of whether heterogeneity is present, but it may influence your assessment of whether the heterogeneity matters.

With this in mind, take a look at the graph below and decide which plot is more homogeneous.

[Figure: two forest plots, numbered 1 and 2]

Of course, the more homogeneous one is plot number 1: the confidence intervals all overlap and, in addition, all studies favour the control intervention.

For those who prefer to measure things rather than just eyeball them, don’t worry: there are statistical methods to help you grasp the extent of heterogeneity.

Chi-squared (χ²) test

This test assumes the null hypothesis that all the studies are homogeneous, i.e. that each study measures an identical effect, and gives us a p-value to test this hypothesis. If the p-value is low, we can reject the null hypothesis: heterogeneity is present.

Because the test is often not sensitive enough, and heterogeneity can therefore be wrongly ruled out, many scientists use a p-value cut-off of < 0.1 instead of < 0.05.

I² statistic

The I² statistic was developed by Professor Julian Higgins to measure the extent of heterogeneity rather than merely state whether it is present. It is calculated from the chi-squared statistic Q and its degrees of freedom (df) as I² = (Q − df) / Q × 100%.

Thresholds for the interpretation of I² can be misleading, since the importance of inconsistency depends on several factors. A rough guide to interpretation is as follows:

  • 0% to 40%: might not be important
  • 30% to 60%: moderate heterogeneity
  • 50% to 90%: substantial heterogeneity
  • 75% to 100%: considerable heterogeneity
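The Q and I² calculations are easy to reproduce yourself. Below is a minimal Python sketch; the five studies, their effect sizes and variances are invented for illustration. It computes Cochran's Q from inverse-variance weights and derives I² from it:

```python
import numpy as np

def cochran_q_and_i2(effects, variances):
    """Compute Cochran's Q and the I-squared statistic for a set of
    study effect estimates and their within-study variances."""
    effects = np.asarray(effects, dtype=float)
    weights = 1.0 / np.asarray(variances, dtype=float)  # inverse-variance weights
    pooled = np.sum(weights * effects) / np.sum(weights)  # fixed-effect pooled estimate
    q = np.sum(weights * (effects - pooled) ** 2)         # Cochran's Q
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0  # I² as a percentage
    return q, df, i2

# Five hypothetical studies: effect estimates and their variances
effects = [0.10, 0.55, -0.20, 0.60, 0.15]
variances = [0.04, 0.05, 0.06, 0.04, 0.05]
q, df, i2 = cochran_q_and_i2(effects, variances)
print(f"Q = {q:.2f} on {df} df, I² = {i2:.0f}%")  # Q = 8.98 on 4 df, I² = 55%
```

With these invented numbers, Q ≈ 8.98 on 4 degrees of freedom and I² ≈ 55%, which the rough guide above would call moderate to substantial heterogeneity.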

To understand the theory above have a look at the following example.

[Figure: example meta-analysis forest plot illustrating the I² statistic]

We can see that the p-value of the chi-squared test is 0.11, so the null hypothesis cannot be rejected, which would suggest homogeneity. However, by looking at the interventions we can already see some heterogeneity in the results. Furthermore, the I² value is 51%, suggesting moderate to substantial heterogeneity.

This is a good example of how the χ² test can be misleading when there are only a few studies in the meta-analysis.

How to deal with heterogeneity?

Once you have detected variability in your results you need to deal with it. Here are some steps on how you can treat this issue:

  • Check your data for mistakes – go back and see whether you typed in something wrong
  • Don’t do a meta-analysis if heterogeneity is too high – not every systematic review needs a meta-analysis
  • Explore heterogeneity – this can be done by subgroup analysis or meta-regression
  • Perform a random-effects meta-analysis – bear in mind that this approach incorporates heterogeneity that cannot otherwise be explained, treating it as random variation between studies
  • Change the effect measure – if, say, you use the risk difference and find high heterogeneity, try the risk ratio or odds ratio
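The random-effects option can be made concrete. The sketch below (illustrative numbers only, matching nothing in any real review) implements the standard DerSimonian–Laird method: it estimates the between-study variance tau² from Cochran's Q and uses it to deflate each study's weight, so that unexplained heterogeneity widens the pooled confidence interval:

```python
import numpy as np

def dersimonian_laird(effects, variances):
    """Random-effects pooled estimate via the DerSimonian-Laird
    method-of-moments estimate of the between-study variance tau²."""
    effects = np.asarray(effects, dtype=float)
    variances = np.asarray(variances, dtype=float)
    w = 1.0 / variances                      # fixed-effect (inverse-variance) weights
    pooled_fe = np.sum(w * effects) / np.sum(w)
    q = np.sum(w * (effects - pooled_fe) ** 2)   # Cochran's Q
    df = len(effects) - 1
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau2 = max(0.0, (q - df) / c)            # between-study variance estimate
    w_re = 1.0 / (variances + tau2)          # random-effects weights
    pooled_re = np.sum(w_re * effects) / np.sum(w_re)
    se_re = np.sqrt(1.0 / np.sum(w_re))
    return pooled_re, se_re, tau2

# Same five hypothetical studies as before
effects = [0.10, 0.55, -0.20, 0.60, 0.15]
variances = [0.04, 0.05, 0.06, 0.04, 0.05]
est, se, tau2 = dersimonian_laird(effects, variances)
print(f"pooled = {est:.3f} ± {1.96 * se:.3f}, tau² = {tau2:.3f}")
```

Because tau² is added to every study's variance, the random-effects standard error comes out larger than the fixed-effect one, reflecting the extra between-study variation.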

(1) Fletcher, J. What is heterogeneity and is it important? BMJ 2007;334:94

(2) Deeks JJ, Higgins JPT, Altman DG (editors). Chapter 9: Analysing data and undertaking meta-analyses. In: Higgins JPT, Green S (editors). Cochrane Handbook for Systematic Reviews of Interventions Version 5.1.0 [updated March 2011]. The Cochrane Collaboration, 2011. Available from www.cochrane-handbook.org.

(3) https://www.mathsisfun.com/data/chi-square-test.html



Population Genetics: Why structure matters

Nick Barton, Joachim Hermisson and Magnus Nordborg

IST Austria, Austria; University of Vienna, Austria; Austrian Academy of Sciences, Vienna BioCenter, Austria

Human height is the classic example of a quantitative trait: its distribution is continuous, presumably because it is influenced by variation at a very large number of genes, most with a small effect ( Fisher, 1918 ). Yet height is also strongly affected by the environment: average height in many countries increased during the last century and the children of immigrants are often taller than relatives in their country of origin – in both cases presumably due to changing diet and other environmental factors ( Cavalli-Sforza and Bodmer, 1971 ; Grasgruber et al., 2016 ; NCD Risk Factor Collaboration, 2016 ). This makes it very difficult to determine the cause of geographic patterns for height, such as the ‘latitudinal cline’ seen in Europe ( Figure 1 ).


Distribution of average male height in Europe, calculated from studies performed between 1999 and 2013.

In general, southern Europeans tend to be shorter than northern Europeans. Image reproduced from Grasgruber et al., 2014  (CC BY 3.0).

Are such patterns caused by environmental or genetic differences – or by a complex combination of both? And to the extent that genetic differences are involved, do they reflect selection or simply random history? A number of recent papers have relied on so-called Genome-Wide Association Studies (GWAS) to address these questions, and reported strong evidence for both genetics and selection. Now, in eLife, two papers – one by Jeremy Berg, Arbel Harpak, Nasa Sinnott-Armstrong and colleagues ( Berg et al., 2019 ); the other by Mashaal Sohail, Robert Maier and colleagues ( Sohail et al., 2019 ) – independently reject these conclusions. Even more importantly, they identify problems with GWAS that have broader implications for human genetics.

As the name suggests, GWAS scan the genome for variants – typically single nucleotide polymorphisms (SNPs) – that are associated with a particular condition or trait (phenotype). The first GWAS for height found a small number of SNPs that jointly explained only a tiny fraction of the variation. Because this was in contrast with the high heritability seen in twin studies, it was dubbed ‘the missing heritability problem’ (reviewed in Yang et al., 2010 ). It was suggested that the problem was simply due to a lack of statistical power to detect polymorphisms of small effect. Subsequent studies with larger sample sizes have supported this explanation: more and more loci have been identified although most of the variation remains ‘unmappable’, presumably because sample sizes on the order of a million are still not large enough ( Yengo et al., 2018 ).

One way in which the unmappable component of genetic variation can be included in a statistical measure is via so-called polygenic scores. These scores sum the estimated contributions to the trait across many SNPs, including those whose effects, on their own, are not statistically significant. Polygenic scores thus represent a shift from the goal of identifying major genes to predicting phenotype from genotype. Originally designed for plant and animal breeding purposes, polygenic scores can, in principle, also be used to study the genetic basis of differences between individuals and groups.
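As a toy illustration of the idea (all numbers below are simulated, not real GWAS output), a polygenic score is just a dosage-weighted sum of per-SNP effect estimates:

```python
import numpy as np

rng = np.random.default_rng(0)
m_snps, n_people = 1000, 5

# Hypothetical per-SNP effect estimates (betas), as a GWAS might report,
# including many tiny, individually non-significant effects.
betas = rng.normal(0.0, 0.01, size=m_snps)

# Genotype dosages: 0, 1 or 2 copies of the effect allele at each SNP.
genotypes = rng.integers(0, 3, size=(n_people, m_snps))

# The polygenic score sums estimated contributions across all SNPs.
scores = genotypes @ betas
print(scores.round(3))
```

Prediction then hinges entirely on the betas being unbiased, which is exactly where structured populations cause trouble, as described next.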

This, however, requires accurate and unbiased estimation of the effects of all SNPs included in the score, which is difficult in a structured (non-homogeneous) population when environmental differences cannot be controlled. To see why this is a problem, consider the classic example of chopstick-eating skills ( Lander and Schork, 1994 ). While there surely are genetic variants affecting our ability to handle chopsticks, most of the variation for this trait across the globe is due to environmental differences (cultural background), and a GWAS would mostly identify variants that had nothing to do with chopstick skills, but simply happened to differ in frequency between East Asia and the rest of the world.

Several methods for dealing with this problem have been proposed. When a GWAS is carried out to identify major genes, it is relatively simple to avoid false positives by eliminating associations outside major loci regardless of whether they are due to population structure confounding or an unmappable polygenic background ( Vilhjálmsson and Nordborg, 2013 ). However, if the goal is to make predictions, or to understand differences among populations (such as the latitudinal cline in height), we need accurate and unbiased estimates for all SNPs. Accomplishing this is extremely challenging, and it is also difficult to know whether one has succeeded.

One possibility is to compare the population estimates with estimates taken from sibling data, which should be relatively unbiased by environmental differences. In one of many examples of this, Robinson et al. used data from the GIANT Consortium ( Wood et al., 2014 ) together with sibling data to estimate that genetic variation contributes significantly to height variation across Europe ( Robinson et al., 2015 ). They also argued that selection must have occurred, because the differences were too large to have arisen by chance. Using estimated effect sizes provided by Robinson et al., a more sophisticated analysis by Field et al. found extremely strong evidence for selection for height across Europe (p = 10⁻⁷⁴; Field et al., 2016 ). Several other studies reached the same conclusion based on the GIANT data (reviewed in Berg et al., 2019 ; Sohail et al., 2019 ).

Berg et al. (who are based at Columbia University, Stanford University, UC Davis and the University of Copenhagen) and Sohail et al. (who are based at Harvard Medical School, the Broad Institute, and other institutes in the US, Finland and Sweden) now re-examine these conclusions using the recently released data from the UK Biobank ( Sudlow et al., 2015 ). Estimating effect sizes from these data allows possible biases due to population structure confounding to be investigated, because the UK Biobank data comes from a (supposedly) more homogenous population than the GIANT data.

Using these new estimates, Berg et al. and Sohail et al. independently found that evidence for selection vanishes – along with evidence for a genetic cline in height across Europe. Instead, they show that the previously published results were due to the cumulative effects of slight biases in the effect-size estimates in the GIANT data. Surprisingly, they also found evidence for confounding in the sibling data used as a control by Robinson et al. and Field et al. This turned out to be due to a technical error in the data distributed by Robinson et al. after they published their paper.

This means we still do not know whether genetics and selection are responsible for the pattern of height differences seen across Europe. That genetics plays a major role in height differences between individuals is not in doubt, and it is also clear that the signal from GWAS is mostly real. The issue is that there is no perfect way to control for complex population structure and environmental heterogeneity. Biases at individual loci may be tiny, but they become highly significant when summed across thousands of loci – as is done in polygenic scores. Standard methods to control for these biases, such as principal component analysis, may work well in simulations but are often insufficient when confronted with real data. Importantly, no natural population is unstructured: indeed, even the data in the UK Biobank seems to contain significant structure ( Haworth et al., 2019 ).
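To see how structure confounds effect estimates, and how a principal-component covariate can (partly) absorb the bias, here is a self-contained simulation. All data are synthetic, and this is the textbook PCA correction in miniature, not the specific analyses of Berg et al. or Sohail et al.: two subpopulations differ in allele frequencies at every SNP and in a purely environmental phenotype shift, so a non-causal SNP shows a spurious association that shrinks once PC1 of the genotype matrix is included as a covariate.

```python
import numpy as np

rng = np.random.default_rng(42)
n_per_pop, n_snps = 200, 200

# Two subpopulations whose allele frequencies have drifted apart at every SNP.
p_a = rng.uniform(0.1, 0.9, n_snps)
p_b = np.clip(p_a + rng.normal(0.0, 0.2, n_snps), 0.05, 0.95)
geno = np.vstack([
    rng.binomial(2, p_a, size=(n_per_pop, n_snps)),
    rng.binomial(2, p_b, size=(n_per_pop, n_snps)),
]).astype(float)

# The phenotype difference is purely environmental: no SNP is causal.
pheno = np.concatenate([np.zeros(n_per_pop), np.ones(n_per_pop)])
pheno = pheno + rng.normal(0.0, 0.5, 2 * n_per_pop)

def ols_slope(y, x, covar=None):
    """OLS slope of y on x, optionally adjusting for one covariate."""
    cols = [np.ones_like(x), x] + ([covar] if covar is not None else [])
    beta = np.linalg.lstsq(np.column_stack(cols), y, rcond=None)[0]
    return beta[1]

# Focal SNP: the most differentiated one, hence the most confounded.
focal = int(np.argmax(np.abs(p_a - p_b)))
raw = ols_slope(pheno, geno[:, focal])

# PC1 of the centred genotype matrix captures the population structure.
centred = geno - geno.mean(axis=0)
u, s, _ = np.linalg.svd(centred, full_matrices=False)
pc1 = u[:, 0] * s[0]
adjusted = ols_slope(pheno, geno[:, focal], covar=pc1)

print(f"raw slope = {raw:.3f}, PC1-adjusted slope = {adjusted:.3f}")
```

With many strongly differentiated SNPs, PC1 tracks the population split closely and the spurious slope largely disappears; the point made by the two eLife papers is that with subtler, real-world structure, this correction is often incomplete.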

Berg et al. and Sohail et al. demonstrate the potential for population structure to create spurious results, especially when using methods that rely on large numbers of small effects, such as polygenic scores. Caution is clearly needed when interpreting and using the results of such studies. For clinical predictions, risks must be weighed against benefits ( Rosenberg et al., 2019 ). In some cases, such as recommendations for more frequent medical checkups for patients found at higher ‘genetic’ risk of a condition, it may not matter greatly whether predictors are confounded as long as they work. By contrast, the results of behavioral studies of traits such as IQ and educational attainment ( Plomin and von Stumm, 2018 ) must be presented carefully, because while the benefits are far from obvious, the risks of such results being misinterpreted and misused are quite clear. The problem is worsened by the tendency of popular media to ignore caveats and uncertainties of estimates.

Finally, although quantitative genetics has proved highly successful in plant and animal breeding, it should be remembered that this success has been based on large pedigrees, well-controlled environments, and short-term prediction. When these methods have been applied to natural populations, even the most basic predictions fail, in large part due to poorly understood environmental factors ( Charmantier et al., 2014 ). Natural populations are never homogeneous, and it is therefore misleading to imply there is a qualitative difference between ‘within-population’ and ‘between-population’ comparisons – as was recently done in connection with James Watson’s statements about race and IQ ( Harmon, 2019 ). With respect to confounding by population structure, the key qualitative difference is between controlling the environment experimentally, and not doing so. Once we leave an experimental setting, we are effectively skating on thin ice, and whether the ice will hold depends on how far out we skate.


Article and author information

Author details

Nick Barton is at IST Austria, Klosterneuburg, Austria


Joachim Hermisson is at the Department of Mathematics and at the Max F. Perutz Laboratories, University of Vienna, Vienna, Austria

Magnus Nordborg is at the Gregor Mendel Institute, Austrian Academy of Sciences, Vienna BioCenter, Vienna, Austria

Acknowledgements

We thank Jeremy Berg and Peter Visscher for answering our questions, and Molly Przeworski for helpful discussions.

Publication history

  • Version of Record published: March 21, 2019 (version 1)

© 2019, Barton et al.

This article is distributed under the terms of the Creative Commons Attribution License , which permits unrestricted use and redistribution provided that the original author and source are credited.

Categories and tags

  • polygenic adaptation
  • population structure
  • population genetics
  • quantitative genetics
  • selection for human height

Further reading

Polygenic adaptation on height is overestimated due to uncorrected stratification in genome-wide association studies

Genetic predictions of height differ among human populations and these differences have been interpreted as evidence of polygenic adaptation. These differences were first detected using SNPs genome-wide significantly associated with height, and shown to grow stronger when large numbers of sub-significant SNPs were included, leading to excitement about the prospect of analyzing large fractions of the genome to detect polygenic adaptation for multiple traits. Previous studies of height have been based on SNP effect size measurements in the GIANT Consortium meta-analysis. Here we repeat the analyses in the UK Biobank, a much more homogeneously designed study. We show that polygenic adaptation signals based on large numbers of SNPs below genome-wide significance are extremely sensitive to biases due to uncorrected population stratification. More generally, our results imply that typical constructions of polygenic scores are sensitive to population stratification and that population-level differences should be interpreted with caution.

Editorial note: This article has been through an editorial process in which the authors decide how to respond to the issues raised during peer review. The Reviewing Editor's assessment is that all the issues have been addressed ( see decision letter ).

Reduced signal for polygenic adaptation of height in UK Biobank

Several recent papers have reported strong signals of selection on European polygenic height scores. These analyses used height effect estimates from the GIANT consortium and replication studies. Here, we describe a new analysis based on the UK Biobank (UKB), a large, independent dataset. We find that the signals of selection using UKB effect estimates are strongly attenuated or absent. We also provide evidence that previous analyses were confounded by population stratification. Therefore, the conclusion of strong polygenic adaptation now lacks support. Moreover, these discrepancies highlight (1) that methods for correcting for population stratification in GWAS may not always be sufficient for polygenic trait analyses, and (2) that claims of differences in polygenic scores between populations should be treated with caution until these issues are better understood.



Data Homogeneity

It is often important to determine if a set of data is homogeneous before any statistical technique is applied to it. Homogeneous data are drawn from a single population. In other words, all outside processes that could potentially affect the data must remain constant for the complete time period of the sample. Inhomogeneities are caused when artificial changes affect the statistical properties of the observations through time. These changes may be abrupt or gradual, depending on the nature of the disturbance. Realistically, obtaining perfectly homogeneous data is almost impossible, as unavoidable changes in the area surrounding the observing station will often affect the data.

  • Many observed trends are the result of inhomogeneities and not some other large-scale climatic change (e.g., global warming).
  • Characterizing a short-term artificially-induced trend as natural variations in the climate system can cause substantial errors in conclusions drawn from the data.
  • The interpretation of a given process depends on the time scales considered (i.e., analyses using short time scales should not be used as evidence to support phenomena observed over larger time scales).

Analyzing the Homogeneity of a Dataset

  • Calculate the median.
  • Subtract the median from each value in the dataset.
  • Count the number of runs the data makes above or below the median (i.e., persistence of positive or negative values).
  • Use significance tables to determine thresholds for homogeneity.
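The run-counting step above can be sketched in Python (a minimal illustration; the significance-table comparison is a separate step):

```python
import statistics

def runs_about_median(values):
    """Count runs of consecutive values above/below the median.

    Values equal to the median are skipped, a common runs-test convention.
    """
    med = statistics.median(values)
    signs = [v > med for v in values if v != med]
    runs = 1 if signs else 0
    for prev, cur in zip(signs, signs[1:]):
        if prev != cur:
            runs += 1
    return runs

# Alternating values give the maximum possible number of runs
print(runs_about_median([1, 5, 2, 6, 1, 7, 2, 8]))  # -> 8
```

A strongly trending series produces long runs on one side of the median, and hence few runs overall, which is the signature of inhomogeneity this test detects.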
Locate Dataset, Variable, and Station
Select the station identification number for Sherbrooke. (General information on finding station IDs is given in the station-ID tutorial.)
Select Temporal Domain
Enter the desired time period in the Time text box.
Compute Yearly Mean Minimum Temperature
This command computes the mean minimum temperature for each year by taking a 365-day average of the daily minimum temperature. This is not an exact yearly average, because every fourth year is a leap year with one extra day; since leap days are ignored, each successive 365-day range starts one day earlier every four years. Even so, the result is a good approximation of the mean minimum temperature per year.
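The 365-day block averaging can be sketched as follows (a random toy series stands in for the station data, which is not reproduced here):

```python
import numpy as np

n_years = 50
# Toy stand-in for a 50-year daily minimum temperature series
daily_min = np.random.default_rng(1).normal(0.0, 5.0, n_years * 366)

# Average non-overlapping 365-day blocks. Leap days are ignored, so
# each successive "year" starts one day earlier every four years.
yearly_mean = daily_min[: n_years * 365].reshape(n_years, 365).mean(axis=1)
print(yearly_mean.shape)  # -> (50,)
```

Each element of `yearly_mean` is one approximate yearly mean minimum temperature, i.e. one element of the series analyzed below.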
View Yearly Mean Minimum Temperature Time Series
A gradual upward trend is noticeable over the selected time range. The increase in temperature may have been caused by urbanization in the region surrounding the observing station. Has the urbanization made a sufficient impact on the data so that it may no longer be considered homogeneous over this time period? To answer this question, it is necessary to analyze the distribution of the data around the median.

Subtract Median From Dataset
The median is displayed in bold below the expert mode text box: 0.8683567 degrees Celsius. Take note of this value.
This will undo the medianover command.
The above command subtracts the median (0.8683567°C) from each value in the dataset.
Analyze Homogeneity of Data
A table will appear with Time in one column and (Min Temp - 0.8683567) in the other column. The day in the Time column changes every four years because of the leap-year issue mentioned earlier. The significance table below lists the limits on the number of runs for a given number of values above (Na) and below (Nb) the median. For a 40-year series, for example, Na = Nb = 20. If the number of runs falls between the .10 and .90 significance limits, there is a high probability that the data is homogeneous. Significance tables for sample sizes not shown here are also available.
Na = Nb    .10 limit    .90 limit
10         8            13
11         9            14
12         9            16
13         10           17
14         11           18
15         12           19
16         13           20
17         14           21
18         15           22
19         16           23
20         16           25
25         22           30
30         26           36
35         31           41
40         35           47
45         40           52
50         45           57

Source: Oliver, John E. Climatology: Selected Applications, p. 7.

There are 18 runs in the Sherbrooke data from 1920 to 1970. The total number of elements that make up the sample is 50 (each yearly mean minimum temperature constitutes one element). According to the table, at a .10 significance limit there should be at least 22 runs. We can therefore conclude, with 90% confidence, that this data is not homogeneous. Is this inhomogeneity caused by a large-scale climatic change or by an inconsistency in the area surrounding the observing station? To answer this question, we analyze the mean minimum temperature at another station only a few miles away.
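Applying the significance limits to the Sherbrooke result can be sketched as follows (the limit rows are taken from the table above; `is_homogeneous` is an illustrative helper, not part of the data library):

```python
# Subset of the runs-test significance limits (Oliver, Climatology, p. 7):
# key = number of values above (= below) the median,
# value = (.10 limit, .90 limit) on the number of runs.
LIMITS = {20: (16, 25), 25: (22, 30)}

def is_homogeneous(n_runs, n_half):
    """True if the run count falls between the .10 and .90 limits."""
    low, high = LIMITS[n_half]
    return low <= n_runs <= high

# Sherbrooke: 50 yearly values (25 above, 25 below the median), 18 runs
print(is_homogeneous(18, 25))  # -> False
```

Since 18 falls below the lower limit of 22 for Na = Nb = 25, the series fails the test, matching the conclusion in the text.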

Locate Dataset and Variable
Select Temporal Domain and Station
Enter the desired time period in the Time text box and the station ID in the ISTA Station text box. The station ID 7018000 is for Shawinigan. (Finding station IDs is covered in the station-ID tutorial.)
Compute Yearly Mean Minimum Temperature
View Yearly Mean Minimum Temperature Time Series
Based on visual inspection, these data appear to be more homogeneous than the data taken at Sherbrooke. There isn't a distinct upward trend in the minimum temperatures, as there was in the Sherbrooke data.
Subtract Median From Dataset
The median should be -0.6845208 degrees Celsius. Take note of this value.
The above command subtracts the median (-0.6845208° Celsius) from each value in the dataset.
Analyze Homogeneity of Data
A table will appear with Time in one column and (Min Temp - -0.6845208) in the other column. Shawinigan is only located a few miles northwest of Sherbrooke across the St. Lawrence River, yet the minimum temperature at Sherbrooke exhibited a noticeable upward trend over the time period while the minimum temperature at Shawinigan did not. Therefore, we can conclude that the inhomogeneity at Sherbrooke is not the result of large-scale climatic change. Instead, from 1920 to 1970, Sherbrooke had been heavily affected by human development. The increased density and height of buildings surrounding the observing station in Sherbrooke caused a small heat island, which in turn created an inhomogeneity in the data. Shawinigan, on the other hand, was not affected by development and in turn, did not experience a gradual warming over the period.


Homogenous sampling


Homogenous sampling involves selecting similar cases to further investigate a particular phenomenon or subgroup of interest.

The logic of homogenous sampling is in contrast to the logic of maximum variation sampling.

In a recent evaluation of village-level revitalization in Aceh, post-tsunami, leadership was identified as a contributing factor to village success. Those villages with effective leaders were able to rebuild more productively than those without effective leadership. A homogenous sample of village leaders could be a useful addition to the study, identifying common leadership characteristics and circumstances.


© 2022 BetterEvaluation. All rights reserved.

9.5.1   What is heterogeneity?

Inevitably, studies brought together in a systematic review will differ. Any kind of variability among studies in a systematic review may be termed heterogeneity. It can be helpful to distinguish between different types of heterogeneity. Variability in the participants, interventions and outcomes studied may be described as clinical diversity (sometimes called clinical heterogeneity), and variability in study design and risk of bias may be described as methodological diversity (sometimes called methodological heterogeneity). Variability in the intervention effects being evaluated in the different studies is known as statistical heterogeneity , and is a consequence of clinical or methodological diversity, or both, among the studies. Statistical heterogeneity manifests itself in the observed intervention effects being more different from each other than one would expect due to random error (chance) alone. We will follow convention and refer to statistical heterogeneity simply as heterogeneity .

Clinical variation will lead to heterogeneity if the intervention effect is affected by the factors that vary across studies; most obviously, the specific interventions or patient characteristics. In other words, the true intervention effect will be different in different studies.

Differences between studies in terms of methodological factors, such as use of blinding and concealment of allocation, or if there are differences between studies in the way the outcomes are defined and measured, may be expected to lead to differences in the observed intervention effects. Significant statistical heterogeneity arising from methodological diversity or differences in outcome assessments suggests that the studies are not all estimating the same quantity, but does not necessarily suggest that the true intervention effect varies. In particular, heterogeneity associated solely with methodological diversity would indicate the studies suffer from different degrees of bias. Empirical evidence suggests that some aspects of design can affect the result of clinical trials, although this is not always the case. Further discussion appears in Chapter 8 .
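Statistical heterogeneity is commonly quantified with Cochran's Q statistic and the derived I-squared measure; the following is a minimal sketch using made-up effect estimates, not data from any actual review:

```python
def q_and_i2(effects, variances):
    """Cochran's Q statistic and I^2 (%) for a fixed-effect meta-analysis."""
    w = [1.0 / v for v in variances]                       # inverse-variance weights
    pooled = sum(wi * e for wi, e in zip(w, effects)) / sum(w)
    q = sum(wi * (e - pooled) ** 2 for wi, e in zip(w, effects))
    df = len(effects) - 1
    # I^2: proportion of variability beyond chance, floored at zero
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, i2

# Three hypothetical trials with similar effects: Q falls below its df,
# so I^2 is zero (no detectable heterogeneity)
q, i2 = q_and_i2([0.30, 0.35, 0.32], [0.010, 0.020, 0.015])
print(round(q, 3), i2)
```

When the observed effects differ more than chance predicts, Q exceeds its degrees of freedom and I-squared rises toward 100%, signalling the statistical heterogeneity described above.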

The scope of a review will largely determine the extent to which studies included in a review are diverse. Sometimes a review will include studies addressing a variety of questions, for example when several different interventions for the same condition are of interest (see also Chapter 5, Section 5.6 ). Studies of each intervention should be analysed and presented separately. Meta-analysis should only be considered when a group of studies is sufficiently homogeneous in terms of participants, interventions and outcomes to provide a meaningful summary. It is often appropriate to take a broader perspective in a meta-analysis than in a single clinical trial. A common analogy is that systematic reviews bring together apples and oranges, and that combining these can yield a meaningless result. This is true if apples and oranges are of intrinsic interest on their own, but may not be if they are used to contribute to a wider question about fruit. For example, a meta-analysis may reasonably evaluate the average effect of a class of drugs by combining results from trials where each evaluates the effect of a different drug from the class.

There may be specific interest in a review in investigating how clinical and methodological aspects of studies relate to their results. Where possible these investigations should be specified a priori , i.e. in the systematic review protocol. It is legitimate for a systematic review to focus on examining the relationship between some clinical characteristic(s) of the studies and the size of intervention effect, rather than on obtaining a summary effect estimate across a series of studies (see Section 9.6 ). Meta-regression may best be used for this purpose, although it is not implemented in RevMan (see Section 9.6.4 ).


Homogeneous: Definition, Types, and Examples


Homogeneous Definition

Homogeneous can be defined as "the same" or "similar." It can be used to describe things that have similar characteristics. Homogeneous substances, for example, are uniform in composition throughout their entire volume. As a result, two samples taken from two different portions of a homogeneous mixture or substance will have the same composition and properties.

Homogeneous Etymology

The word homogeneous is derived from two Greek words: "homo" (meaning "the same") and "genous" (meaning "kind"). As a result, homogeneous refers to things that are all the same, similar, or present in the same proportion.

What is Homogeneous Mixture?

Homogeneous means "of the same sort" or "similar." In biology, "homogenous" (without the second "e") is an older term for homologous, meaning "having matching components, similar structures, or the same anatomical locations." As noted above, the word derives from the Greek "homo" ("same") and "genous" ("kind"). Heterogeneous is the antonym of homogeneous.

A mixture is formed when two or more components combine without undergoing any chemical changes. The mechanical blending or mixing of objects like elements and compounds defines a mixture. There is no chemical bonding or chemical change in this process.

As a result, the chemical characteristics and structure of the components in a combination are preserved. Size, form, colour, height, weight, distribution, texture, temperature, radioactivity, structure, and a variety of other characteristics all stay consistent throughout the homogeneous material.

A common everyday example of homogeneity is a pigment (such as ink) combined with water: the resulting solution is highly homogeneous. The pigment mixes evenly with the water, and every part of the solution has the same makeup.

The components of a mixture can be separated by mechanical techniques. Centrifugation, filtration, heating, and gravity sorting are some of these methods.

That’s all there is to it when it comes to the term’s use in chemistry or biology. The term “homogenous” is used in various research areas, such as ecology, to describe a population’s homogeneity.

A group of organisms produced only by asexual reproduction, with identical genes and traits, is homogeneous, for example. In cosmology, scientists have hypothesized that, having arisen from a single source, the universe behaves similarly in every direction. Evolutionary biology is another area of biology where the term homogeneous is employed.

Homogeneous is an ancient word for homologous, which refers to anatomical components that exhibit structural similarities, such as those generated by descent from a common ancestor.

The term homogeneous has been used widely in different fields of research, such as biology, chemistry, and ecology, but it is always used to describe organisms in a mixture who have the same properties.

In chemistry, homogeneous refers to a combination in which the ingredients are uniformly distributed. However, there are no chemical connections between them at the molecular level. Air is the most typical example of a homogeneous mixture in our environment.

Homogenous vs Heterogenous

A mixture, as previously stated, is the physical coming together of components (which, in chemistry, can be elements or compounds). There are two sorts of mixtures: homogeneous and heterogeneous.

The opposite of homogeneous is heterogeneous (variant: heterogenous). It refers to components in a combination that have distinct properties ("hetero" means "different"). The most obvious example of a heterogeneous combination is oil and water, which are immiscible and separate into two distinct layers.

One of the most notable characteristics of heterogeneous mixes is that the particles are not dispersed equally throughout the mixture. Analysing the combination with the naked eye reveals the heterogeneous character of the mixture. In addition, the components of all heterogeneous mixes are not uniform.

Composition is uniform in homogeneous mixtures and non-uniform in heterogeneous ones. Heterogeneous mixtures show multiple phases, while homogeneous mixtures show a single phase.

In both types of mixture, the substances can be separated from one another by physical methods such as distillation, evaporation, centrifugation, chromatography, and crystallization. Homogeneous mixtures contain relatively few distinct species, while heterogeneous mixtures contain more.

Although the concepts and compositions of homogeneous and heterogeneous substances are vastly different, both are prone to change depending on context and composition. Let’s take the example of blood. If we look at the blood with our naked eyes, it seems to be homogeneous.

Blood, on the other hand, has a variety of components under the microscope, including red blood cells, plasma, and platelets, showing that it is heterogeneous.

Homogeneous Examples

We come across numerous examples of homogeneous mixes and entities in our daily lives. In biology, a homogeneous population is one in which all of the individuals have virtually the same genetic makeup, as a result of some types of asexual reproduction.

Asexual reproduction produces homogeneous offspring that are identical to one another and to their parent.

Many animals, such as goat populations, look homogenous but are not because they reproduce through sexual reproduction.

According to experts, homogeneity reduces biodiversity, and as a result, the odds of early extinction due to environmental changes are significant. Animal cloning is a frequent example of a homogeneous population.

Dolly the sheep was the first mammal to be successfully cloned from a somatic cell in an adult.

Homogeneous species are those that exhibit indistinguishable characteristics and appear to be identical. Such species appear to have a lower level of biodiversity.

The diversity and frequency of species in a particular region and period, as well as the ecosystem’s homogeneity, may be quantified using a specific fundamental unit called species richness.

Species richness refers to the number of different species found in a specific ecological community. It reflects how many species are present rather than their relative abundance. As a result, species richness will be lower in a homogeneous environment, since high species richness indicates variability.

This is particularly evident in endemic species, which are species that have evolved through time in a specific geographic region and aren’t found anywhere else.

Grass, trees, ants, fungus, and certain animals are all instances of homogeneous in the ecosystem. Many endemic species found nowhere else in the world may be found in New Zealand.

Homogeneous used to be a very popular term in evolutionary biology to describe physically comparable features in various species, indicating a shared evolutionary origin.

For example, the forelimbs of several animals share the same underlying bone components, indicating a common evolutionary ancestor.

Homogeneous Summary

As a result of the preceding discussion, homogeneous substances are those that are uniform in volume and composition throughout. Homogeneous mixtures in chemistry have the same size, shape, colour, texture, and many other characteristics.

A solution that does not separate from each other over time is known as a homogeneous mixture. Homogeneous species are those that are genetically similar but lack biodiversity and species richness, as defined in biology and ecology.

Similarly, various solutions are widely used in our daily lives, and blood and DNA appear homogeneous at a glance (though, as noted above, blood is heterogeneous under the microscope). Heterogeneous mixtures have properties that are the opposite of those of homogeneous mixtures.

As a result, a heterogeneous mixture has a non-uniform composition and contains multiple phases that can be distinguished visually or separated by physical means.

Similarly, it has been demonstrated that both homogeneous and heterogeneous mixes are prone to change depending on their environment and composition. As a result, both heterogeneous and homogeneous mixes might be seen as equally important.




Frequently asked questions

When are populations used in research?

Populations are used when a research question requires data from every member of the population. This is usually only feasible when the population is small and easily accessible.

Frequently asked questions: Methodology

Attrition refers to participants leaving a study. It always happens to some extent—for example, in randomized controlled trials for medical research.

Differential attrition occurs when attrition or dropout rates differ systematically between the intervention and the control group . As a result, the characteristics of the participants who drop out differ from the characteristics of those who stay in the study. Because of this, study results may be biased .

Action research is conducted in order to solve a particular issue immediately, while case studies are often conducted over a longer period of time and focus more on observing and analyzing a particular ongoing phenomenon.

Action research is focused on solving a problem or informing individual and community-based knowledge in a way that impacts teaching, learning, and other related processes. It is less focused on contributing theoretical input, instead producing actionable input.

Action research is particularly popular with educators as a form of systematic inquiry because it prioritizes reflection and bridges the gap between theory and practice. Educators are able to simultaneously investigate an issue as they solve it, and the method is very iterative and flexible.

A cycle of inquiry is another name for action research . It is usually visualized in a spiral shape following a series of steps, such as “planning → acting → observing → reflecting.”

To make quantitative observations , you need to use instruments that are capable of measuring the quantity you want to observe. For example, you might use a ruler to measure the length of an object or a thermometer to measure its temperature.

Criterion validity and construct validity are both types of measurement validity . In other words, they both show you how accurately a method measures something.

While construct validity is the degree to which a test or other measurement method measures what it claims to measure, criterion validity is the degree to which a test can predictively (in the future) or concurrently (in the present) measure something.

Construct validity is often considered the overarching type of measurement validity . You need to have face validity , content validity , and criterion validity in order to achieve construct validity.

Convergent validity and discriminant validity are both subtypes of construct validity . Together, they help you evaluate whether a test measures the concept it was designed to measure.

  • Convergent validity indicates whether a test that is designed to measure a particular construct correlates with other tests that assess the same or similar construct.
  • Discriminant validity indicates whether two tests that should not be highly related to each other are indeed not related. This type of validity is also called divergent validity .

You need to assess both in order to demonstrate construct validity. Neither one alone is sufficient for establishing construct validity.


Content validity shows you how accurately a test or other measurement method taps into the various aspects of the specific construct you are researching.

In other words, it helps you answer the question: “does the test measure all aspects of the construct I want to measure?” If it does, then the test has high content validity.

The higher the content validity, the more accurate the measurement of the construct.

If the test fails to include parts of the construct, or irrelevant parts are included, the validity of the instrument is threatened, which brings your results into question.

Face validity and content validity are similar in that they both evaluate how suitable the content of a test is. The difference is that face validity is subjective, and assesses content at surface level.

When a test has strong face validity, anyone would agree that the test’s questions appear to measure what they are intended to measure.

For example, looking at a 4th grade math test consisting of problems in which students have to add and multiply, most people would agree that it has strong face validity (i.e., it looks like a math test).

On the other hand, content validity evaluates how well a test represents all the aspects of a topic. Assessing content validity is more systematic and relies on expert evaluation of each question, analyzing whether each one covers the aspects that the test was designed to cover.

A 4th grade math test would have high content validity if it covered all the skills taught in that grade. Experts (in this case, math teachers) would have to evaluate the content validity by comparing the test to the learning objectives.

Snowball sampling is a non-probability sampling method . Unlike probability sampling (which involves some form of random selection ), the initial individuals selected to be studied are the ones who recruit new participants.

Because not every member of the target population has an equal chance of being recruited into the sample, selection in snowball sampling is non-random.

Snowball sampling is a non-probability sampling method , where there is not an equal chance for every member of the population to be included in the sample .

This means that you cannot use inferential statistics and make generalizations —often the goal of quantitative research . As such, a snowball sample is not representative of the target population and is usually a better fit for qualitative research .

Snowball sampling relies on the use of referrals. Here, the researcher recruits one or more initial participants, who then recruit the next ones.

Participants share similar characteristics and/or know each other. Because of this, not every member of the population has an equal chance of being included in the sample, giving rise to sampling bias .

Snowball sampling is best used in the following cases:

  • If there is no sampling frame available (e.g., people with a rare disease)
  • If the population of interest is hard to access or locate (e.g., people experiencing homelessness)
  • If the research focuses on a sensitive topic (e.g., extramarital affairs)

The reproducibility and replicability of a study can be ensured by writing a transparent, detailed method section and using clear, unambiguous language.

Reproducibility and replicability are related terms.

  • Reproducing research entails reanalyzing the existing data in the same manner.
  • Replicating (or repeating ) the research entails reconducting the entire analysis, including the collection of new data . 
  • A successful reproduction shows that the data analyses were conducted in a fair and honest manner.
  • A successful replication shows that the reliability of the results is high.

Stratified sampling and quota sampling both involve dividing the population into subgroups and selecting units from each subgroup. The purpose in both cases is to select a representative sample and/or to allow comparisons between subgroups.

The main difference is that in stratified sampling, you draw a random sample from each subgroup ( probability sampling ). In quota sampling you select a predetermined number or proportion of units, in a non-random manner ( non-probability sampling ).
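The contrast can be sketched in a few lines of Python, using a hypothetical population of 600 undergraduates and 400 graduates (all names and sizes here are invented for illustration):

```python
import random

random.seed(0)

# Hypothetical population: 600 undergraduates, then 400 graduates.
population = [{"id": i, "level": "undergrad" if i < 600 else "grad"}
              for i in range(1000)]

def stratified_sample(pop, per_stratum):
    """Probability sampling: draw a *random* sample from each subgroup."""
    sample = []
    for level in ("undergrad", "grad"):
        stratum = [p for p in pop if p["level"] == level]
        sample.extend(random.sample(stratum, per_stratum))
    return sample

def quota_sample(pop, per_stratum):
    """Non-probability sampling: take the first units encountered
    (e.g., order of arrival) until each quota is filled."""
    counts = {"undergrad": 0, "grad": 0}
    sample = []
    for p in pop:
        if counts[p["level"]] < per_stratum:
            sample.append(p)
            counts[p["level"]] += 1
    return sample

print(len(stratified_sample(population, 50)))  # 100
print(len(quota_sample(population, 50)))       # 100
```

Both samples contain 50 units per subgroup, but only the stratified sample gives every member of each stratum an equal chance of selection; the quota sample simply takes the first units it encounters.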

Purposive and convenience sampling are both sampling methods that are typically used in qualitative data collection.

A convenience sample is drawn from a source that is conveniently accessible to the researcher. Convenience sampling does not distinguish characteristics among the participants. On the other hand, purposive sampling focuses on selecting participants possessing characteristics associated with the research study.

The findings of studies based on either convenience or purposive sampling can only be generalized to the (sub)population from which the sample is drawn, and not to the entire population.

Random sampling or probability sampling is based on random selection. This means that each unit has an equal chance (i.e., equal probability) of being included in the sample.

On the other hand, convenience sampling involves selecting whoever happens to be available, which means that not everyone has an equal chance of being selected, depending on the place, time, or day you are collecting your data.

Convenience sampling and quota sampling are both non-probability sampling methods. They both use non-random criteria like availability, geographical proximity, or expert knowledge to recruit study participants.

However, in convenience sampling, you continue to sample units or cases until you reach the required sample size.

In quota sampling, you first need to divide your population of interest into subgroups (strata) and estimate their proportions (quota) in the population. Then you can start your data collection, using convenience sampling to recruit participants, until the proportions in each subgroup coincide with the estimated proportions in the population.

A sampling frame is a list of every member in the entire population . It is important that the sampling frame is as complete as possible, so that your sample accurately reflects your population.

Stratified and cluster sampling may look similar, but bear in mind that groups created in cluster sampling are heterogeneous , so the individual characteristics in the cluster vary. In contrast, groups created in stratified sampling are homogeneous , as units share characteristics.

Relatedly, in cluster sampling you randomly select entire groups and include all units of each group in your sample. However, in stratified sampling, you select some units of all groups and include them in your sample. In this way, both methods can ensure that your sample is representative of the target population .
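A minimal Python sketch of this difference, assuming a hypothetical population of ten equally sized groups (e.g., schools) of 20 units each:

```python
import random

random.seed(1)

# Hypothetical population grouped into 10 clusters of 20 units each.
clusters = {c: [f"unit-{c}-{i}" for i in range(20)] for c in range(10)}

# Cluster sampling: randomly select whole groups, keep *all* their units.
chosen_clusters = random.sample(list(clusters), 3)
cluster_sample = [u for c in chosen_clusters for u in clusters[c]]

# Stratified sampling: select *some* units from *every* group.
stratified_sample = [u for c in clusters for u in random.sample(clusters[c], 6)]

print(len(cluster_sample))     # 3 clusters x 20 units = 60
print(len(stratified_sample))  # 10 strata x 6 units  = 60
```

Both samples have 60 units, but the cluster sample comes from only three groups, while the stratified sample touches all ten.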

A systematic review is secondary research because it uses existing research. You don’t collect new data yourself.

The key difference between observational studies and experimental designs is that a well-done observational study does not influence the responses of participants, while experiments do have some sort of treatment condition applied to at least some participants by random assignment .

An observational study is a great choice for you if your research question is based purely on observations. If there are ethical, logistical, or practical concerns that prevent you from conducting a traditional experiment , an observational study may be a good choice. In an observational study, there is no interference or manipulation of the research subjects, as well as no control or treatment groups .

It’s often best to ask a variety of people to review your measurements. You can ask experts, such as other researchers, or laypeople, such as potential participants, to judge the face validity of tests.

While experts have a deep understanding of research methods , the people you’re studying can provide you with valuable insights you may have missed otherwise.

Face validity is important because it’s a simple first step to measuring the overall validity of a test or technique. It’s a relatively intuitive, quick, and easy way to start checking whether a new measure seems useful at first glance.

Good face validity means that anyone who reviews your measure says that it seems to be measuring what it’s supposed to. With poor face validity, someone reviewing your measure may be left confused about what you’re measuring and why you’re using this method.

Face validity is about whether a test appears to measure what it’s supposed to measure. This type of validity is concerned with whether a measure seems relevant and appropriate for what it’s assessing only on the surface.

Statistical analyses are often applied to test validity with data from your measures. You test convergent validity and discriminant validity with correlations to see if results from your test are positively or negatively related to those of other established tests.

You can also use regression analyses to assess whether your measure is actually predictive of outcomes that you expect it to predict theoretically. A regression analysis that supports your expectations strengthens your claim of construct validity .

When designing or evaluating a measure, construct validity helps you ensure you’re actually measuring the construct you’re interested in. If you don’t have construct validity, you may inadvertently measure unrelated or distinct constructs and lose precision in your research.

Construct validity is often considered the overarching type of measurement validity ,  because it covers all of the other types. You need to have face validity , content validity , and criterion validity to achieve construct validity.

Construct validity is about how well a test measures the concept it was designed to evaluate. It’s one of four types of measurement validity ; the others are face validity , content validity, and criterion validity.

There are two subtypes of construct validity.

  • Convergent validity : The extent to which your measure corresponds to measures of related constructs
  • Discriminant validity : The extent to which your measure is unrelated or negatively related to measures of distinct constructs

Naturalistic observation is a valuable tool because of its flexibility, external validity , and suitability for topics that can’t be studied in a lab setting.

The downsides of naturalistic observation include its lack of scientific control , ethical considerations , and potential for bias from observers and subjects.

Naturalistic observation is a qualitative research method where you record the behaviors of your research subjects in real world settings. You avoid interfering or influencing anything in a naturalistic observation.

You can think of naturalistic observation as “people watching” with a purpose.

A dependent variable is what changes as a result of the independent variable manipulation in experiments . It’s what you’re interested in measuring, and it “depends” on your independent variable.

In statistics, dependent variables are also called:

  • Response variables (they respond to a change in another variable)
  • Outcome variables (they represent the outcome you want to measure)
  • Left-hand-side variables (they appear on the left-hand side of a regression equation)

An independent variable is the variable you manipulate, control, or vary in an experimental study to explore its effects. It’s called “independent” because it’s not influenced by any other variables in the study.

Independent variables are also called:

  • Explanatory variables (they explain an event or outcome)
  • Predictor variables (they can be used to predict the value of a dependent variable)
  • Right-hand-side variables (they appear on the right-hand side of a regression equation).
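The left-/right-hand-side terminology can be made concrete with a generic regression equation (the variable names here are hypothetical): the dependent variable sits alone on the left, while the independent variable appears on the right.

```latex
\underbrace{\text{salary}}_{\text{dependent (left-hand side)}}
  \;=\; \beta_0
  \;+\; \beta_1 \cdot \underbrace{\text{education}}_{\text{independent (right-hand side)}}
  \;+\; \varepsilon
```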

As a rule of thumb, questions related to thoughts, beliefs, and feelings work well in focus groups. Take your time formulating strong questions, paying special attention to phrasing. Be careful to avoid leading questions , which can bias your responses.

Overall, your focus group questions should be:

  • Open-ended and flexible
  • Impossible to answer with “yes” or “no” (questions that start with “why” or “how” are often best)
  • Unambiguous, getting straight to the point while still stimulating discussion
  • Unbiased and neutral

A structured interview is a data collection method that relies on asking questions in a set order to collect data on a topic. It is often quantitative in nature. Structured interviews are best used when:

  • You already have a very clear understanding of your topic. Perhaps significant research has already been conducted, or you have done some prior research yourself, and you already possess a baseline for designing strong structured questions.
  • You are constrained in terms of time or resources and need to analyze your data quickly and efficiently.
  • Your research question depends on strong parity between participants, with environmental conditions held constant.

More flexible interview options include semi-structured interviews , unstructured interviews , and focus groups .

Social desirability bias is the tendency for interview participants to give responses that will be viewed favorably by the interviewer or other participants. It occurs in all types of interviews and surveys , but is most common in semi-structured interviews , unstructured interviews , and focus groups .

Social desirability bias can be mitigated by ensuring participants feel at ease and comfortable sharing their views. Make sure to pay attention to your own body language and any physical or verbal cues, such as nodding or widening your eyes.

This type of bias can also occur in observations if the participants know they’re being observed. They might alter their behavior accordingly.

The interviewer effect is a type of bias that emerges when a characteristic of an interviewer (race, age, gender identity, etc.) influences the responses given by the interviewee.

There is a risk of an interviewer effect in all types of interviews , but it can be mitigated by writing really high-quality interview questions.

A semi-structured interview is a blend of structured and unstructured types of interviews. Semi-structured interviews are best used when:

  • You have prior interview experience. Spontaneous questions are deceptively challenging, and it’s easy to accidentally ask a leading question or make a participant uncomfortable.
  • Your research question is exploratory in nature. Participant answers can guide future research questions and help you develop a more robust knowledge base for future research.

An unstructured interview is the most flexible type of interview, but it is not always the best fit for your research topic.

Unstructured interviews are best used when:

  • You are an experienced interviewer and have a very strong background in your research topic, since it is challenging to ask spontaneous, colloquial questions.
  • Your research question is exploratory in nature. While you may have developed hypotheses, you are open to discovering new or shifting viewpoints through the interview process.
  • You are seeking descriptive data, and are ready to ask questions that will deepen and contextualize your initial thoughts and hypotheses.
  • Your research depends on forming connections with your participants and making them feel comfortable revealing deeper emotions, lived experiences, or thoughts.

The four most common types of interviews are:

  • Structured interviews : The questions are predetermined in both topic and order. 
  • Semi-structured interviews : A few questions are predetermined, but other questions aren’t planned.
  • Unstructured interviews : None of the questions are predetermined.
  • Focus group interviews : The questions are presented to a group instead of one individual.

Deductive reasoning is commonly used in scientific research, and it’s especially associated with quantitative research .

In research, you might have come across something called the hypothetico-deductive method . It’s the scientific method of testing hypotheses to check whether your predictions are substantiated by real-world data.

Deductive reasoning is a logical approach where you progress from general ideas to specific conclusions. It’s often contrasted with inductive reasoning , where you start with specific observations and form general conclusions.

Deductive reasoning is also called deductive logic.

There are many different types of inductive reasoning that people use formally or informally.

Here are a few common types:

  • Inductive generalization : You use observations about a sample to come to a conclusion about the population it came from.
  • Statistical generalization: You use specific numbers about samples to make statements about populations.
  • Causal reasoning: You make cause-and-effect links between different things.
  • Sign reasoning: You make a conclusion about a correlational relationship between different things.
  • Analogical reasoning: You make a conclusion about something based on its similarities to something else.

Inductive reasoning is a bottom-up approach, while deductive reasoning is top-down.

Inductive reasoning takes you from the specific to the general, while in deductive reasoning, you make inferences by going from general premises to specific conclusions.

In inductive research , you start by making observations or gathering data. Then, you take a broad scan of your data and search for patterns. Finally, you make general conclusions that you might incorporate into theories.

Inductive reasoning is a method of drawing conclusions by going from the specific to the general. It’s usually contrasted with deductive reasoning, where you proceed from general information to specific conclusions.

Inductive reasoning is also called inductive logic or bottom-up reasoning.

A hypothesis states your predictions about what your research will find. It is a tentative answer to your research question that has not yet been tested. For some research projects, you might have to write several hypotheses that address different aspects of your research question.

A hypothesis is not just a guess — it should be based on existing theories and knowledge. It also has to be testable, which means you can support or refute it through scientific research methods (such as experiments, observations and statistical analysis of data).

Triangulation can help:

  • Reduce research bias that comes from using a single method, theory, or investigator
  • Enhance validity by approaching the same topic with different tools
  • Establish credibility by giving you a complete picture of the research problem

But triangulation can also pose problems:

  • It’s time-consuming and labor-intensive, often involving an interdisciplinary team.
  • Your results may be inconsistent or even contradictory.

There are four main types of triangulation :

  • Data triangulation : Using data from different times, spaces, and people
  • Investigator triangulation : Involving multiple researchers in collecting or analyzing data
  • Theory triangulation : Using varying theoretical perspectives in your research
  • Methodological triangulation : Using different methodologies to approach the same topic

Many academic fields use peer review , largely to determine whether a manuscript is suitable for publication. Peer review enhances the credibility of the published manuscript.

However, peer review is also common in non-academic settings. The United Nations, the European Union, and many individual nations use peer review to evaluate grant applications. It is also widely used in medical and health-related fields as a teaching or quality-of-care measure. 

Peer assessment is often used in the classroom as a pedagogical tool. Both receiving feedback and providing it are thought to enhance the learning process, helping students think critically and collaboratively.

Peer review can stop obviously problematic, falsified, or otherwise untrustworthy research from being published. It also represents an excellent opportunity to get feedback from renowned experts in your field. It acts as a first defense, helping you ensure your argument is clear and that there are no gaps, vague terms, or unanswered questions for readers who weren’t involved in the research process.

Peer-reviewed articles are considered a highly credible source due to the stringent process they go through before publication.

In general, the peer review process follows the following steps: 

  • First, the author submits the manuscript to the editor.
  • The editor then decides whether to reject the manuscript and send it back to the author, or to send it onward to the selected peer reviewer(s).
  • Next, the peer review process occurs. The reviewer provides feedback, addressing any major or minor issues with the manuscript, and gives their advice regarding what edits should be made. 
  • Lastly, the edited manuscript is sent back to the author. They input the edits, and resubmit it to the editor for publication.

Exploratory research is often used when the issue you’re studying is new or when the data collection process is challenging for some reason.

You can use exploratory research if you have a general idea or a specific question that you want to study but there is no preexisting knowledge or paradigm with which to study it.

Exploratory research is a methodology approach that explores research questions that have not previously been studied in depth. It is often used when the issue you’re studying is new, or the data collection process is challenging in some way.

Explanatory research is used to investigate how or why a phenomenon occurs. Therefore, this type of research is often one of the first stages in the research process , serving as a jumping-off point for future research.

Exploratory research aims to explore the main aspects of an under-researched problem, while explanatory research aims to explain the causes and consequences of a well-defined problem.

Explanatory research is a research method used to investigate how or why something occurs when only a small amount of information is available pertaining to that topic. It can help you increase your understanding of a given topic.

Clean data are valid, accurate, complete, consistent, unique, and uniform. Dirty data include inconsistencies and errors.

Dirty data can come from any part of the research process, including poor research design , inappropriate measurement materials, or flawed data entry.

Data cleaning takes place between data collection and data analyses. But you can use some methods even before collecting data.

For clean data, you should start by designing measures that collect valid data. Data validation at the time of data entry or collection helps you minimize the amount of data cleaning you’ll need to do.

After data collection, you can use data standardization and data transformation to clean your data. You’ll also deal with any missing values, outliers, and duplicate values.

Every dataset requires different techniques to clean dirty data , but you need to address these issues in a systematic way. You focus on finding and resolving data points that don’t agree or fit with the rest of your dataset.

These data might be missing values, outliers, duplicate values, incorrectly formatted, or irrelevant. You’ll start with screening and diagnosing your data. Then, you’ll often standardize and accept or remove data to make your dataset consistent and valid.
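A minimal sketch of that screening workflow in Python, using invented example records (the field names, values, and cutoffs are all hypothetical):

```python
# Invented records: screen for duplicates, missing values, and
# implausible entries, then keep the clean rows.
raw = [
    {"id": 1, "weight_kg": 71.2},
    {"id": 2, "weight_kg": None},   # missing value
    {"id": 3, "weight_kg": 70.9},
    {"id": 3, "weight_kg": 70.9},   # duplicate entry
    {"id": 4, "weight_kg": 712.0},  # likely data-entry error
]

# 1. Remove exact duplicates.
seen, deduped = set(), []
for row in raw:
    key = (row["id"], row["weight_kg"])
    if key not in seen:
        seen.add(key)
        deduped.append(row)

# 2. Drop rows with missing values.
complete = [r for r in deduped if r["weight_kg"] is not None]

# 3. Remove values outside a plausible range (a simple validity check
#    based on domain knowledge: adult weights of 30-200 kg).
clean = [r for r in complete if 30 <= r["weight_kg"] <= 200]

print([r["id"] for r in clean])  # [1, 3]
```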

Data cleaning is necessary for valid and appropriate analyses. Dirty data contain inconsistencies or errors , but cleaning your data helps you minimize or resolve these.

Without data cleaning, you could end up with a Type I or II error in your conclusion. These types of erroneous conclusions can be practically significant with important consequences, because they lead to misplaced investments or missed opportunities.

Data cleaning involves spotting and resolving potential data inconsistencies or errors to improve your data quality. An error is any value (e.g., recorded weight) that doesn’t reflect the true value (e.g., actual weight) of something that’s being measured.

In this process, you review, analyze, detect, modify, or remove “dirty” data to make your dataset “clean.” Data cleaning is also called data cleansing or data scrubbing.

Research misconduct means making up or falsifying data, manipulating data analyses, or misrepresenting results in research reports. It’s a form of academic fraud.

These actions are committed intentionally and can have serious consequences; research misconduct is not a simple mistake or a point of disagreement but a serious ethical failure.

Anonymity means you don’t know who the participants are, while confidentiality means you know who they are but remove identifying information from your research report. Both are important ethical considerations .

You can only guarantee anonymity by not collecting any personally identifying information—for example, names, phone numbers, email addresses, IP addresses, physical characteristics, photos, or videos.

You can keep data confidential by using aggregate information in your research report, so that you only refer to groups of participants rather than individuals.

Research ethics matter for scientific integrity, human rights and dignity, and collaboration between science and society. These principles make sure that participation in studies is voluntary, informed, and safe.

Ethical considerations in research are a set of principles that guide your research designs and practices. These principles include voluntary participation, informed consent, anonymity, confidentiality, potential for harm, and results communication.

Scientists and researchers must always adhere to a certain code of conduct when collecting data from others .

These considerations protect the rights of research participants, enhance research validity , and maintain scientific integrity.

In multistage sampling , you can use probability or non-probability sampling methods .

For a probability sample, you have to conduct probability sampling at every stage.

You can mix it up by using simple random sampling , systematic sampling , or stratified sampling to select units at different stages, depending on what is applicable and relevant to your study.

Multistage sampling can simplify data collection when you have large, geographically spread samples, and you can obtain a probability sample without a complete sampling frame.

But multistage sampling may not lead to a representative sample, and larger samples are needed for multistage samples to achieve the statistical properties of simple random samples .
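The stage-by-stage narrowing can be sketched as follows, assuming a hypothetical state → city → resident hierarchy (all sizes invented):

```python
import random

random.seed(2)

# Hypothetical hierarchy: 5 states -> 4 cities each -> 30 residents each.
states = {f"state-{s}": {f"city-{s}-{c}": [f"res-{s}-{c}-{r}" for r in range(30)]
                         for c in range(4)}
          for s in range(5)}

# Stage 1: randomly select states.
stage1 = random.sample(list(states), 2)
# Stage 2: within each selected state, randomly select cities.
stage2 = {st: random.sample(list(states[st]), 2) for st in stage1}
# Stage 3: within each selected city, randomly select residents.
sample = [r
          for st, cities in stage2.items()
          for city in cities
          for r in random.sample(states[st][city], 10)]

print(len(sample))  # 2 states x 2 cities x 10 residents = 40
```

Because every stage uses random selection, the result is a probability sample even though no complete list of all residents was ever needed.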

These are four of the most common mixed methods designs :

  • Convergent parallel: Quantitative and qualitative data are collected at the same time and analyzed separately. After both analyses are complete, compare your results to draw overall conclusions. 
  • Embedded: Quantitative and qualitative data are collected at the same time, but within a larger quantitative or qualitative design. One type of data is secondary to the other.
  • Explanatory sequential: Quantitative data is collected and analyzed first, followed by qualitative data. You can use this design if you think your qualitative data will explain and contextualize your quantitative findings.
  • Exploratory sequential: Qualitative data is collected and analyzed first, followed by quantitative data. You can use this design if you think the quantitative data will confirm or validate your qualitative findings.

Triangulation in research means using multiple datasets, methods, theories and/or investigators to address a research question. It’s a research strategy that can help you enhance the validity and credibility of your findings.

Triangulation is mainly used in qualitative research , but it’s also commonly applied in quantitative research . Mixed methods research always uses triangulation.

In multistage sampling , or multistage cluster sampling, you draw a sample from a population using smaller and smaller groups at each stage.

This method is often used to collect data from a large, geographically spread group of people in national surveys, for example. You take advantage of hierarchical groupings (e.g., from state to city to neighborhood) to create a sample that’s less expensive and time-consuming to collect data from.

No, the steepness or slope of the line isn’t related to the correlation coefficient value. The correlation coefficient only tells you how closely your data fit on a line, so two datasets with the same correlation coefficient can have very different slopes.

To find the slope of the line, you’ll need to perform a regression analysis .

Correlation coefficients always range between -1 and 1.

The sign of the coefficient tells you the direction of the relationship: a positive value means the variables change together in the same direction, while a negative value means they change together in opposite directions.

The absolute value of a number is equal to the number without its sign. The absolute value of a correlation coefficient tells you the magnitude of the correlation: the greater the absolute value, the stronger the correlation.

These are the assumptions your data must meet if you want to use Pearson’s r :

  • Both variables are on an interval or ratio level of measurement
  • Data from both variables follow normal distributions
  • Your data have no outliers
  • Your data are from a random or representative sample
  • You expect a linear relationship between the two variables

Quantitative research designs can be divided into two main categories:

  • Correlational and descriptive designs are used to investigate characteristics, averages, trends, and associations between variables.
  • Experimental and quasi-experimental designs are used to test causal relationships .

Qualitative research designs tend to be more flexible. Common types of qualitative design include case study , ethnography , and grounded theory designs.

A well-planned research design helps ensure that your methods match your research aims, that you collect high-quality data, and that you use the right kind of analysis to answer your questions, utilizing credible sources . This allows you to draw valid , trustworthy conclusions.

The priorities of a research design can vary depending on the field, but you usually have to specify:

  • Your research questions and/or hypotheses
  • Your overall approach (e.g., qualitative or quantitative )
  • The type of design you’re using (e.g., a survey , experiment , or case study )
  • Your sampling methods or criteria for selecting subjects
  • Your data collection methods (e.g., questionnaires , observations)
  • Your data collection procedures (e.g., operationalization , timing and data management)
  • Your data analysis methods (e.g., statistical tests  or thematic analysis )

A research design is a strategy for answering your research question . It defines your overall approach and determines how you will collect and analyze data.

Questionnaires can be self-administered or researcher-administered.

Self-administered questionnaires can be delivered online or in paper-and-pen formats, in person or through mail. All questions are standardized so that all respondents receive the same questions with identical wording.

Researcher-administered questionnaires are interviews that take place by phone, in-person, or online between researchers and respondents. You can gain deeper insights by clarifying questions for respondents or asking follow-up questions.

You can organize the questions logically, with a clear progression from simple to complex, or randomly between respondents. A logical flow helps respondents process the questionnaire more easily and quickly, but it may introduce bias. Randomization can minimize the bias from order effects.

Closed-ended, or restricted-choice, questions offer respondents a fixed set of choices to select from. These questions are easier to answer quickly.

Open-ended or long-form questions allow respondents to answer in their own words. Because there are no restrictions on their choices, respondents can answer in ways that researchers may not have otherwise considered.

A questionnaire is a data collection tool or instrument, while a survey is an overarching research method that involves collecting and analyzing data from people using questionnaires.

The third variable and directionality problems are two main reasons why correlation isn’t causation .

The third variable problem means that a confounding variable affects both variables to make them seem causally related when they are not.

The directionality problem is when two variables correlate and might actually have a causal relationship, but it’s impossible to conclude which variable causes changes in the other.

Correlation describes an association between variables : when one variable changes, so does the other. A correlation is a statistical indicator of the relationship between variables.

Causation means that changes in one variable bring about changes in the other (i.e., there is a cause-and-effect relationship between variables). The two variables are correlated with each other, and there’s also a causal link between them.

While causation and correlation can exist simultaneously, correlation does not imply causation. In other words, correlation is simply a relationship where A relates to B—but A doesn’t necessarily cause B to happen (or vice versa). Mistaking correlation for causation is a common error and can lead to false cause fallacy .

Controlled experiments establish causality, whereas correlational studies only show associations between variables.

  • In an experimental design , you manipulate an independent variable and measure its effect on a dependent variable. Other variables are controlled so they can’t impact the results.
  • In a correlational design , you measure variables without manipulating any of them. You can test whether your variables change together, but you can’t be sure that one variable caused a change in another.

In general, correlational research is high in external validity while experimental research is high in internal validity .

A correlation is usually tested for two variables at a time, but you can test correlations between three or more variables.

A correlation coefficient is a single number that describes the strength and direction of the relationship between your variables.

Different types of correlation coefficients might be appropriate for your data based on their levels of measurement and distributions . The Pearson product-moment correlation coefficient (Pearson’s r ) is commonly used to assess a linear relationship between two quantitative variables.

A correlational research design investigates relationships between two variables (or more) without the researcher controlling or manipulating any of them. It’s a non-experimental type of quantitative research .

A correlation reflects the strength and/or direction of the association between two or more variables.

  • A positive correlation means that both variables change in the same direction.
  • A negative correlation means that the variables change in opposite directions.
  • A zero correlation means there’s no relationship between the variables.

Random error  is almost always present in scientific studies, even in highly controlled settings. While you can’t eradicate it completely, you can reduce random error by taking repeated measurements, using a large sample, and controlling extraneous variables .

You can avoid systematic error through careful design of your sampling , data collection , and analysis procedures. For example, use triangulation to measure your variables using multiple methods; regularly calibrate instruments or procedures; use random sampling and random assignment ; and apply masking (blinding) where possible.

Systematic error is generally a bigger problem in research.

With random error, multiple measurements will tend to cluster around the true value. When you’re collecting data from a large sample , the errors in different directions will cancel each other out.

Systematic errors are much more problematic because they can skew your data away from the true value. This can lead you to false conclusions ( Type I and II errors ) about the relationship between the variables you’re studying.

Random and systematic error are two types of measurement error.

Random error is a chance difference between the observed and true values of something (e.g., a researcher misreading a weighing scale records an incorrect measurement).

Systematic error is a consistent or proportional difference between the observed and true values of something (e.g., a miscalibrated scale consistently records weights as higher than they actually are).
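A simulation (with assumed values: a true weight of 70 kg, random noise, and a hypothetical +1.2 kg calibration bias) illustrates why averaging helps with random error but not systematic error:

```python
import random

random.seed(42)
true_value = 70.0  # hypothetical true weight in kg

# Random error: zero-mean noise on each reading. Over many measurements,
# errors in different directions cancel, so the mean approaches the true value.
random_readings = [true_value + random.gauss(0, 0.5) for _ in range(10_000)]
mean_random = sum(random_readings) / len(random_readings)

# Systematic error: a miscalibrated scale adds a constant +1.2 kg to every
# reading. No amount of averaging removes this bias.
biased_readings = [r + 1.2 for r in random_readings]
mean_biased = sum(biased_readings) / len(biased_readings)
```

The mean of the noisy readings lands very close to 70 kg, while the mean of the biased readings stays roughly 1.2 kg too high however many measurements you take.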

On graphs, the explanatory variable is conventionally placed on the x-axis, while the response variable is placed on the y-axis.

  • If both variables are quantitative , use a scatterplot or a line graph.
  • If your explanatory variable is categorical, use a bar graph.

The term “ explanatory variable ” is sometimes preferred over “ independent variable ” because, in real world contexts, independent variables are often influenced by other variables. This means they aren’t totally independent.

Multiple independent variables may also be correlated with each other, so “explanatory variables” is a more appropriate term.

The difference between explanatory and response variables is simple:

  • An explanatory variable is the expected cause, and it explains the results.
  • A response variable is the expected effect, and it responds to other variables.

In a controlled experiment , all extraneous variables are held constant so that they can’t influence the results. Controlled experiments require:

  • A control group that receives a standard treatment, a fake treatment, or no treatment.
  • Random assignment of participants to ensure the groups are equivalent.

Depending on your study topic, there are various other methods of controlling variables .

There are 4 main types of extraneous variables :

  • Demand characteristics : environmental cues that encourage participants to conform to researchers’ expectations.
  • Experimenter effects : unintentional actions by researchers that influence study outcomes.
  • Situational variables : environmental variables that alter participants’ behaviors.
  • Participant variables : any characteristic or aspect of a participant’s background that could affect study results.

An extraneous variable is any variable that you’re not investigating that can potentially affect the dependent variable of your research study.

A confounding variable is a type of extraneous variable that not only affects the dependent variable, but is also related to the independent variable.

In a factorial design, multiple independent variables are tested.

If you test two variables, each level of one independent variable is combined with each level of the other independent variable to create different conditions.

Within-subjects designs have many potential threats to internal validity , but they are also very statistically powerful .

Advantages:

  • Only requires small samples
  • Statistically powerful
  • Removes the effects of individual differences on the outcomes

Disadvantages:

  • Internal validity threats reduce the likelihood of establishing a direct relationship between variables
  • Time-related effects, such as growth, can influence the outcomes
  • Carryover effects mean that the specific order of different treatments affects the outcomes

While a between-subjects design has fewer threats to internal validity , it also requires more participants for high statistical power than a within-subjects design .

Advantages:

  • Prevents carryover effects of learning and fatigue.
  • Shorter study duration.

Disadvantages:

  • Needs larger samples for high power.
  • Uses more resources to recruit participants, administer sessions, cover costs, etc.
  • Individual differences may be an alternative explanation for results.

Yes. Between-subjects and within-subjects designs can be combined in a single study when you have two or more independent variables (a factorial design). In a mixed factorial design, one variable is altered between subjects and another is altered within subjects.

In a between-subjects design , every participant experiences only one condition, and researchers assess group differences between participants in various conditions.

In a within-subjects design , each participant experiences all conditions, and researchers test the same participants repeatedly for differences between conditions.

The word “between” means that you’re comparing different conditions between groups, while the word “within” means you’re comparing different conditions within the same group.

Random assignment is used in experiments with a between-groups or independent measures design. In this research design, there’s usually a control group and one or more experimental groups. Random assignment helps ensure that the groups are comparable.

In general, you should always use random assignment in this type of experimental design when it is ethically possible and makes sense for your study topic.

To implement random assignment , assign a unique number to every member of your study’s sample .

Then, you can use a random number generator or a lottery method to randomly assign each number to a control or experimental group. You can also do so manually, by flipping a coin or rolling a die to randomly assign participants to groups.
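The numbering-and-randomizing procedure can be sketched in a few lines of Python (the sample size of 20 and the even split are assumptions for illustration):

```python
import random

random.seed(0)
participants = list(range(1, 21))  # assign a unique number 1..20 to each member

# Shuffle the numbered list, then split it: first half -> control group,
# second half -> experimental group.
shuffled = participants[:]
random.shuffle(shuffled)
control = sorted(shuffled[:10])
experimental = sorted(shuffled[10:])
```

Every participant ends up in exactly one group, and because the split follows a random shuffle, each participant had an equal chance of landing in either group.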

Random selection, or random sampling , is a way of selecting members of a population for your study’s sample.

In contrast, random assignment is a way of sorting the sample into control and experimental groups.

Random sampling enhances the external validity or generalizability of your results, while random assignment improves the internal validity of your study.

In experimental research, random assignment is a way of placing participants from your sample into different groups using randomization. With this method, every member of the sample has a known or equal chance of being placed in a control group or an experimental group.

“Controlling for a variable” means measuring extraneous variables and accounting for them statistically to remove their effects on other variables.

Researchers often model control variable data along with independent and dependent variable data in regression analyses and ANCOVAs . That way, you can isolate the control variable’s effects from the relationship between the variables of interest.
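As a rough illustration of statistical control, the sketch below generates synthetic data with assumed effect sizes (an independent-variable effect of 2.0 and a control-variable effect of 3.0) and fits a regression with and without the control variable:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500

control = rng.normal(size=n)                       # measured extraneous variable
x = 0.8 * control + rng.normal(size=n)             # IV, partly driven by control
y = 2.0 * x + 3.0 * control + rng.normal(size=n)   # DV depends on both

# Naive model: regress y on x alone. The coefficient on x absorbs some of
# the control variable's effect and overstates the IV's influence.
X_naive = np.column_stack([np.ones(n), x])
b_naive = np.linalg.lstsq(X_naive, y, rcond=None)[0]

# Controlled model: include the control variable as a predictor. The
# coefficient on x is now close to its true value of 2.0.
X_ctrl = np.column_stack([np.ones(n), x, control])
b_ctrl = np.linalg.lstsq(X_ctrl, y, rcond=None)[0]
```

This is the core idea behind modeling control variables in regression: the controlled estimate isolates the relationship of interest from the confounding pathway.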

Control variables help you establish a correlational or causal relationship between variables by enhancing internal validity .

If you don’t control relevant extraneous variables , they may influence the outcomes of your study, and you may not be able to demonstrate that your results are really an effect of your independent variable .

A control variable is any variable that’s held constant in a research study. It’s not a variable of interest in the study, but it’s controlled because it could influence the outcomes.

Including mediators and moderators in your research helps you go beyond studying a simple relationship between two variables for a fuller picture of the real world. They are important to consider when studying complex correlational or causal relationships.

Mediators are part of the causal pathway of an effect, and they tell you how or why an effect takes place. Moderators usually help you judge the external validity of your study by identifying the limitations of when the relationship between variables holds.

If something is a mediating variable :

  • It’s caused by the independent variable .
  • It influences the dependent variable.
  • When it’s taken into account, the statistical correlation between the independent and dependent variables is lower than when it isn’t considered.

A confounder is a third variable that affects variables of interest and makes them seem related when they are not. In contrast, a mediator is the mechanism of a relationship between two variables: it explains the process by which they are related.

A mediator variable explains the process through which two variables are related, while a moderator variable affects the strength and direction of that relationship.

There are three key steps in systematic sampling :

  • Define and list your population , ensuring that it is not ordered in a cyclical or periodic order.
  • Decide on your sample size and calculate your interval, k , by dividing your population size by your target sample size.
  • Choose every k th member of the population as your sample.
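The three steps above can be sketched in Python. The population of 500 hypothetical people and the target sample size of 50 are assumptions for illustration; a random starting point within the first interval is a common refinement:

```python
import random

# Step 1: list the population (ensure it has no cyclical ordering).
population = [f"person_{i:03d}" for i in range(1, 501)]

# Step 2: calculate the interval k from the target sample size.
target_sample_size = 50
k = len(population) // target_sample_size  # 500 / 50 = 10

# Step 3: pick a random start within the first interval, then take
# every k-th member from there.
random.seed(7)
start = random.randrange(k)
sample = population[start::k]
```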

Systematic sampling is a probability sampling method where researchers select members of the population at a regular interval – for example, by selecting every 15th person on a list of the population. If the population is in a random order, this can imitate the benefits of simple random sampling .

Yes, you can create a stratified sample using multiple characteristics, but you must ensure that every participant in your study belongs to one and only one subgroup. In this case, you multiply the numbers of subgroups for each characteristic to get the total number of groups.

For example, if you were stratifying by location with three subgroups (urban, rural, or suburban) and marital status with five subgroups (single, divorced, widowed, married, or partnered), you would have 3 x 5 = 15 subgroups.
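The subgroup multiplication in this example can be checked directly by crossing the two characteristics:

```python
from itertools import product

locations = ["urban", "rural", "suburban"]
marital_statuses = ["single", "divorced", "widowed", "married", "partnered"]

# Every combination of location and marital status is one mutually
# exclusive stratum: 3 x 5 = 15 subgroups in total.
strata = list(product(locations, marital_statuses))
```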

You should use stratified sampling when your sample can be divided into mutually exclusive and exhaustive subgroups that you believe will take on different mean values for the variable that you’re studying.

Using stratified sampling will allow you to obtain more precise (with lower variance ) statistical estimates of whatever you are trying to measure.

For example, say you want to investigate how income differs based on educational attainment, but you know that this relationship can vary based on race. Using stratified sampling, you can ensure you obtain a large enough sample from each racial group, allowing you to draw more precise conclusions.

In stratified sampling , researchers divide subjects into subgroups called strata based on characteristics that they share (e.g., race, gender, educational attainment).

Once divided, each subgroup is randomly sampled using another probability sampling method.

Cluster sampling is more time- and cost-efficient than other probability sampling methods , particularly when it comes to large samples spread across a wide geographical area.

However, it provides less statistical certainty than other methods, such as simple random sampling , because it is difficult to ensure that your clusters properly represent the population as a whole.

There are three types of cluster sampling : single-stage, double-stage and multi-stage clustering. In all three types, you first divide the population into clusters, then randomly select clusters for use in your sample.

  • In single-stage sampling , you collect data from every unit within the selected clusters.
  • In double-stage sampling , you select a random sample of units from within the clusters.
  • In multi-stage sampling , you repeat the procedure of randomly sampling elements from within the clusters until you have reached a manageable sample.
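The single-stage and double-stage variants can be sketched as follows, using a hypothetical population of 10 schools with 30 students each (the cluster count of 4 and within-cluster sample of 10 are also assumptions):

```python
import random

random.seed(3)

# Hypothetical population grouped into clusters (schools of 30 students).
clusters = {f"school_{c}": [f"s{c}_{i}" for i in range(30)] for c in range(10)}

# First, randomly select clusters.
chosen = random.sample(sorted(clusters), 4)

# Single-stage: collect data from every unit in the chosen clusters.
single_stage = [u for name in chosen for u in clusters[name]]

# Double-stage: additionally take a random sample of units within
# each chosen cluster.
double_stage = [u for name in chosen for u in random.sample(clusters[name], 10)]
```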

Cluster sampling is a probability sampling method in which you divide a population into clusters, such as districts or schools, and then randomly select some of these clusters as your sample.

The clusters should ideally each be mini-representations of the population as a whole.

If properly implemented, simple random sampling is usually the best sampling method for ensuring both internal and external validity . However, it can sometimes be impractical and expensive to implement, depending on the size of the population to be studied.

If you have a list of every member of the population and the ability to reach whichever members are selected, you can use simple random sampling.

The American Community Survey  is an example of simple random sampling . In order to collect detailed data on the population of the US, Census Bureau officials randomly select 3.5 million households per year and use a variety of methods to convince them to fill out the survey.

Simple random sampling is a type of probability sampling in which the researcher randomly selects a subset of participants from a population . Each member of the population has an equal chance of being selected. Data is then collected from as large a percentage as possible of this random subset.
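Given a complete list of the population, simple random sampling is a one-liner; the population of 1,000 numbered members and the sample size of 100 below are hypothetical:

```python
import random

random.seed(11)
population = list(range(1, 1001))  # a complete list of member IDs

# random.sample draws without replacement, giving every member an
# equal chance of being selected.
sample = random.sample(population, 100)
```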

Quasi-experimental design is most useful in situations where it would be unethical or impractical to run a true experiment .

Quasi-experiments have lower internal validity than true experiments, but they often have higher external validity  as they can use real-world interventions instead of artificial laboratory settings.

A quasi-experiment is a type of research design that attempts to establish a cause-and-effect relationship. The main difference with a true experiment is that the groups are not randomly assigned.

Blinding is important to reduce research bias (e.g., observer bias , demand characteristics ) and ensure a study’s internal validity .

If participants know whether they are in a control or treatment group , they may adjust their behavior in ways that affect the outcome that researchers are trying to measure. If the people administering the treatment are aware of group assignment, they may treat participants differently and thus directly or indirectly influence the final results.

  • In a single-blind study , only the participants are blinded.
  • In a double-blind study , both participants and experimenters are blinded.
  • In a triple-blind study , the assignment is hidden not only from participants and experimenters, but also from the researchers analyzing the data.

Blinding means hiding who is assigned to the treatment group and who is assigned to the control group in an experiment .

A true experiment (a.k.a. a controlled experiment) always includes at least one control group that doesn’t receive the experimental treatment.

However, some experiments use a within-subjects design to test treatments without a control group. In these designs, you usually compare one group’s outcomes before and after a treatment (instead of comparing outcomes between different groups).

For strong internal validity , it’s usually best to include a control group if possible. Without a control group, it’s harder to be certain that the outcome was caused by the experimental treatment and not by other variables.

An experimental group, also known as a treatment group, receives the treatment whose effect researchers wish to study, whereas a control group does not. They should be identical in all other ways.

Individual Likert-type questions are generally considered ordinal data , because the items have clear rank order, but don’t have an even distribution.

Overall Likert scale scores are sometimes treated as interval data. These scores are considered to have directionality and even spacing between them.

The type of data determines what statistical tests you should use to analyze your data.

A Likert scale is a rating scale that quantitatively assesses opinions, attitudes, or behaviors. It is made up of 4 or more questions that measure a single attitude or trait when response scores are combined.

To use a Likert scale in a survey , you present participants with Likert-type questions or statements, and a continuum of items, usually with 5 or 7 possible responses, to capture their degree of agreement.
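Combining item responses into an overall scale score can be sketched as a simple sum; the four items and the 5-point responses below are hypothetical:

```python
# Hypothetical responses on a 5-point scale
# (1 = strongly disagree ... 5 = strongly agree) to four items
# measuring a single attitude.
responses = {
    "item_1": 4,
    "item_2": 5,
    "item_3": 3,
    "item_4": 4,
}

# Each item on its own is ordinal, but the combined scale score is
# often treated as interval data.
scale_score = sum(responses.values())
max_score = 5 * len(responses)
```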

In scientific research, concepts are the abstract ideas or phenomena that are being studied (e.g., educational achievement). Variables are properties or characteristics of the concept (e.g., performance at school), while indicators are ways of measuring or quantifying variables (e.g., yearly grade reports).

The process of turning abstract concepts into measurable variables and indicators is called operationalization .

There are various approaches to qualitative data analysis , but they all share five steps in common:

  • Prepare and organize your data.
  • Review and explore your data.
  • Develop a data coding system.
  • Assign codes to the data.
  • Identify recurring themes.

The specifics of each step depend on the focus of the analysis. Some common approaches include textual analysis , thematic analysis , and discourse analysis .

There are five common approaches to qualitative research :

  • Grounded theory involves collecting data in order to develop new theories.
  • Ethnography involves immersing yourself in a group or organization to understand its culture.
  • Narrative research involves interpreting stories to understand how people make sense of their experiences and perceptions.
  • Phenomenological research involves investigating phenomena through people’s lived experiences.
  • Action research links theory and practice in several cycles to drive innovative changes.

Hypothesis testing is a formal procedure for investigating our ideas about the world using statistics. It is used by scientists to test specific predictions, called hypotheses , by calculating how likely it is that a pattern or relationship between variables could have arisen by chance.

Operationalization means turning abstract conceptual ideas into measurable observations.

For example, the concept of social anxiety isn’t directly observable, but it can be operationally defined in terms of self-rating scores, behavioral avoidance of crowded places, or physical anxiety symptoms in social situations.

Before collecting data , it’s important to consider how you will operationalize the variables that you want to measure.

When conducting research, collecting original data has significant advantages:

  • You can tailor data collection to your specific research aims (e.g. understanding the needs of your consumers or user testing your website)
  • You can control and standardize the process for high reliability and validity (e.g. choosing appropriate measurements and sampling methods )

However, there are also some drawbacks: data collection can be time-consuming, labor-intensive and expensive. In some cases, it’s more efficient to use secondary data that has already been collected by someone else, but the data might be less reliable.

Data collection is the systematic process by which observations or measurements are gathered in research. It is used in many different contexts by academics, governments, businesses, and other organizations.

There are several methods you can use to decrease the impact of confounding variables on your research: restriction, matching, statistical control and randomization.

In restriction , you restrict your sample by only including certain subjects that have the same values of potential confounding variables.

In matching , you match each of the subjects in your treatment group with a counterpart in the comparison group. The matched subjects have the same values on any potential confounding variables, and only differ in the independent variable .

In statistical control , you include potential confounders as variables in your regression .

In randomization , you randomly assign the treatment (or independent variable) in your study to a sufficiently large number of subjects, which allows you to control for all potential confounding variables.

A confounding variable is closely related to both the independent and dependent variables in a study. An independent variable represents the supposed cause , while the dependent variable is the supposed effect . A confounding variable is a third variable that influences both the independent and dependent variables.

Failing to account for confounding variables can cause you to wrongly estimate the relationship between your independent and dependent variables.

To ensure the internal validity of your research, you must consider the impact of confounding variables. If you fail to account for them, you might over- or underestimate the causal relationship between your independent and dependent variables , or even find a causal relationship where none exists.

Yes, but including more than one of either type requires multiple research questions .

For example, if you are interested in the effect of a diet on health, you can use multiple measures of health: blood sugar, blood pressure, weight, pulse, and many more. Each of these is its own dependent variable with its own research question.

You could also choose to look at the effect of exercise levels as well as diet, or even the additional effect of the two combined. Each of these is a separate independent variable .

To ensure the internal validity of an experiment , you should only change one independent variable at a time.

No. The value of a dependent variable depends on an independent variable, so a variable cannot be both independent and dependent at the same time. It must be either the cause or the effect, not both!

You want to find out how blood sugar levels are affected by drinking diet soda and regular soda, so you conduct an experiment .

  • The type of soda – diet or regular – is the independent variable .
  • The level of blood sugar that you measure is the dependent variable – it changes depending on the type of soda.

Determining cause and effect is one of the most important parts of scientific research. It’s essential to know which is the cause – the independent variable – and which is the effect – the dependent variable.

In non-probability sampling , the sample is selected based on non-random criteria, and not every member of the population has a chance of being included.

Common non-probability sampling methods include convenience sampling , voluntary response sampling, purposive sampling , snowball sampling, and quota sampling .

Probability sampling means that every member of the target population has a known chance of being included in the sample.

Probability sampling methods include simple random sampling , systematic sampling , stratified sampling , and cluster sampling .

Using careful research design and sampling procedures can help you avoid sampling bias . Oversampling can be used to correct undercoverage bias .

Some common types of sampling bias include self-selection bias , nonresponse bias , undercoverage bias , survivorship bias , pre-screening or advertising bias, and healthy user bias.

Sampling bias is a threat to external validity – it limits the generalizability of your findings to a broader group of people.

A sampling error is the difference between a population parameter and a sample statistic .

A statistic refers to measures about the sample , while a parameter refers to measures about the population .

Samples are used to make inferences about populations . Samples are easier to collect data from because they are practical, cost-effective, convenient, and manageable.
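A quick simulation (with an assumed population of 100,000 heights) shows the distinction between a parameter, a statistic, and the sampling error between them:

```python
import random

random.seed(5)

# Hypothetical population: 100,000 heights centered on 170 cm.
population = [random.gauss(170, 10) for _ in range(100_000)]
parameter = sum(population) / len(population)   # population mean (parameter)

# Draw a sample of 100 and compute the same measure on it.
sample = random.sample(population, 100)
statistic = sum(sample) / len(sample)           # sample mean (statistic)

# Sampling error: difference between the statistic and the parameter.
sampling_error = statistic - parameter
```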

There are seven threats to external validity : selection bias , history, experimenter effect, Hawthorne effect , testing effect, aptitude-treatment interaction, and situation effect.

The two types of external validity are population validity (whether you can generalize to other groups of people) and ecological validity (whether you can generalize to other situations and settings).

The external validity of a study is the extent to which you can generalize your findings to different groups of people, situations, and measures.

Cross-sectional studies cannot establish a cause-and-effect relationship or analyze behavior over a period of time. To investigate cause and effect, you need to do a longitudinal study or an experimental study .

Cross-sectional studies are less expensive and time-consuming than many other types of study. They can provide useful insights into a population’s characteristics and identify correlations for further research.

Sometimes only cross-sectional data is available for analysis; other times your research question may only require a cross-sectional study to answer it.

Longitudinal studies can last anywhere from weeks to decades, although they tend to be at least a year long.

The 1970 British Cohort Study , which has collected data on the lives of 17,000 Brits since their births in 1970, is one well-known example of a longitudinal study .

Longitudinal studies are better to establish the correct sequence of events, identify changes over time, and provide insight into cause-and-effect relationships, but they also tend to be more expensive and time-consuming than other types of studies.

Longitudinal studies and cross-sectional studies are two different types of research design . In a cross-sectional study you collect data from a population at a specific point in time; in a longitudinal study you repeatedly collect data from the same sample over an extended period of time.

The key differences between the two designs:

  • Longitudinal study : repeated observations of the same sample; follows changes in participants over an extended period of time.
  • Cross-sectional study : observations at a single point in time; observes a “cross-section” of different groups in the population, providing a snapshot of society at a given point.

There are eight threats to internal validity : history, maturation, instrumentation, testing, selection bias , regression to the mean, social interaction and attrition .

Internal validity is the extent to which you can be confident that a cause-and-effect relationship established in a study cannot be explained by other factors.

In mixed methods research , you use both qualitative and quantitative data collection and analysis methods to answer your research question .

The research methods you use depend on the type of data you need to answer your research question .

  • If you want to measure something or test a hypothesis , use quantitative methods . If you want to explore ideas, thoughts and meanings, use qualitative methods .
  • If you want to analyze a large amount of readily-available data, use secondary data. If you want data specific to your purposes with control over how it is generated, collect primary data.
  • If you want to establish cause-and-effect relationships between variables , use experimental methods. If you want to understand the characteristics of a research subject, use descriptive methods.

A confounding variable , also called a confounder or confounding factor, is a third variable in a study examining a potential cause-and-effect relationship.

A confounding variable is related to both the supposed cause and the supposed effect of the study. It can be difficult to separate the true effect of the independent variable from the effect of the confounding variable.

In your research design , it’s important to identify potential confounding variables and plan how you will reduce their impact.

Discrete and continuous variables are two types of quantitative variables :

  • Discrete variables represent counts (e.g. the number of objects in a collection).
  • Continuous variables represent measurable amounts (e.g. water volume or weight).

Quantitative variables are any variables where the data represent amounts (e.g. height, weight, or age).

Categorical variables are any variables where the data represent groups. This includes rankings (e.g. finishing places in a race), classifications (e.g. brands of cereal), and binary outcomes (e.g. coin flips).
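
As a rough illustration of these distinctions, the hypothetical sketch below classifies the columns of a small made-up data set by Python type (integers as discrete counts, floats as continuous measurements, everything else as categorical). This is only a heuristic for illustration; real data sets need more careful inspection.

```python
# Hypothetical survey records illustrating the variable types described above.
records = [
    {"num_children": 0, "weight_kg": 61.5, "cereal_brand": "A"},
    {"num_children": 2, "weight_kg": 80.2, "cereal_brand": "B"},
]

def variable_kind(values):
    """Crudely classify a column: ints behave like discrete counts,
    floats like continuous measurements, anything else like groups."""
    if all(isinstance(v, bool) for v in values):   # bools are ints in Python
        return "categorical"
    if all(isinstance(v, int) for v in values):
        return "discrete quantitative"
    if all(isinstance(v, float) for v in values):
        return "continuous quantitative"
    return "categorical"

columns = {k: [r[k] for r in records] for k in records[0]}
kinds = {k: variable_kind(v) for k, v in columns.items()}
print(kinds)
```

Knowing each column's kind up front is exactly what lets you pick an appropriate statistical test later.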

You need to know what type of variables you are working with to choose the right statistical test for your data and interpret your results .

You can think of independent and dependent variables in terms of cause and effect: an independent variable is the variable you think is the cause , while a dependent variable is the effect .

In an experiment, you manipulate the independent variable and measure the outcome in the dependent variable. For example, in an experiment about the effect of nutrients on crop growth:

  • The independent variable is the amount of nutrients added to the crop field.
  • The dependent variable is the biomass of the crops at harvest time.

Defining your variables, and deciding how you will manipulate and measure them, is an important part of experimental design .

Experimental design means planning a set of procedures to investigate a relationship between variables . To design a controlled experiment, you need:

  • A testable hypothesis
  • At least one independent variable that can be precisely manipulated
  • At least one dependent variable that can be precisely measured

When designing the experiment, you decide:

  • How you will manipulate the variable(s)
  • How you will control for any potential confounding variables
  • How many subjects or samples will be included in the study
  • How subjects will be assigned to treatment levels

Experimental design is essential to the internal and external validity of your experiment.

Internal validity is the degree of confidence that the causal relationship you are testing is not influenced by other factors or variables .

External validity is the extent to which your results can be generalized to other contexts.

The validity of your experiment depends on your experimental design .

Reliability and validity are both about how well a method measures something:

  • Reliability refers to the consistency of a measure (whether the results can be reproduced under the same conditions).
  • Validity refers to the accuracy of a measure (whether the results really do represent what they are supposed to measure).

If you are doing experimental research, you also have to consider the internal and external validity of your experiment.

A sample is a subset of individuals from a larger population . Sampling means selecting the group that you will actually collect data from in your research. For example, if you are researching the opinions of students in your university, you could survey a sample of 100 students.

In statistics, sampling allows you to test a hypothesis about the characteristics of a population.
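
For the university example above, a simple random sample can be sketched in a few lines with the standard library's `random.sample`; the sampling frame here is hypothetical, and the seed is fixed only so the sketch is reproducible.

```python
import random

# Hypothetical sampling frame: 5,000 students at a university.
population = [f"student_{i}" for i in range(5000)]

# Draw a simple random sample of 100 students to survey.
# random.sample selects without replacement, so no student appears twice.
random.seed(42)
sample = random.sample(population, k=100)

print(len(sample))       # 100
print(len(set(sample)))  # 100 — all sampled students are distinct
```

Note that this is probability sampling; the purposive methods discussed later in this page deliberately do not select at random.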

Quantitative research deals with numbers and statistics, while qualitative research deals with words and meanings.

Quantitative methods allow you to systematically measure variables and test hypotheses . Qualitative methods allow you to explore concepts and experiences in more detail.

Methodology refers to the overarching strategy and rationale of your research project . It involves studying the methods used in your field and the theories or principles behind them, in order to develop an approach that matches your objectives.

Methods are the specific tools and procedures you use to collect and analyze data (for example, experiments, surveys , and statistical tests ).

In shorter scientific papers, where the aim is to report the findings of a specific study, you might simply describe what you did in a methods section .

In a longer or more complex research project, such as a thesis or dissertation , you will probably include a methodology section , where you explain your approach to answering the research questions and cite relevant sources to support your choice of methods.


Purposive Sampling – Methods, Types and Examples


Definition:

Purposive sampling is a non-probability sampling technique used in research to select individuals or groups of individuals that meet specific criteria relevant to the research question or objective.

This sampling technique is also known as judgmental sampling or selective sampling, and it is often used when the population being studied is too small, too difficult to access, or too heterogeneous to use probability sampling methods.

Purposive Sampling Methods

Purposive Sampling Methods are as follows:

  • Expert sampling: In expert sampling, the researcher selects participants who are experts in a particular field or subject matter. This can be useful when studying a specialized or technical topic, as experts are likely to have a deeper understanding of the subject matter and can provide valuable insights.
  • Maximum variation sampling: Maximum variation sampling involves selecting participants who represent a wide range of characteristics or perspectives. This can be useful when the researcher wants to capture a diverse range of experiences or viewpoints.
  • Homogeneous sampling : In homogeneous sampling, the researcher selects participants who have similar characteristics or experiences. This can be useful when studying a specific subpopulation that shares common traits or experiences.
  • Critical case sampling : Critical case sampling involves selecting participants who are likely to provide important or unique insights into the research question. This can be useful when the researcher wants to focus on cases that are particularly relevant or informative.
  • Snowball sampling : Snowball sampling involves selecting participants based on referrals from other participants in the study. This can be useful when studying hard-to-reach or hidden populations, as it allows the researcher to gain access to individuals who may not be easily identifiable or accessible.
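
The referral logic of snowball sampling can be sketched as a breadth-first traversal of a referral network: start from one or more seed participants, follow their referrals, and stop once the target sample size is reached. The network and names below are invented for illustration.

```python
from collections import deque

# Hypothetical referral network: each participant names peers they know.
referrals = {
    "seed_1": ["p2", "p3"],
    "p2": ["p4"],
    "p3": ["p4", "p5"],
    "p4": [],
    "p5": ["p6"],
    "p6": [],
}

def snowball_sample(referrals, seeds, max_size):
    """Breadth-first walk over the referral network, stopping
    once the target sample size is reached."""
    sampled, queue, seen = [], deque(seeds), set(seeds)
    while queue and len(sampled) < max_size:
        person = queue.popleft()
        sampled.append(person)
        for peer in referrals.get(person, []):
            if peer not in seen:  # avoid recruiting anyone twice
                seen.add(peer)
                queue.append(peer)
    return sampled

print(snowball_sample(referrals, ["seed_1"], max_size=5))
```

Because recruitment follows the seeds' social ties, the resulting sample inherits any biases in who the seeds happen to know, which is one of the sampling-bias concerns discussed under the disadvantages below.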

How to Conduct Purposive Sampling

Here are the general steps involved in conducting purposive sampling:

  • Identify the research question or objective: The first step in conducting purposive sampling is to clearly define the research question or objective. This will help you determine the criteria for participant selection.
  • Determine the criteria for participant selection : Based on the research question or objective, determine the specific criteria for selecting participants. These criteria should be relevant to the research question and should help you identify individuals who are most likely to provide valuable insights.
  • Identify potential participants: Once you have determined the criteria for participant selection, identify potential participants who meet these criteria. Depending on the sampling method you are using, this may involve reaching out to experts in the field, identifying individuals who share certain characteristics or experiences, or asking for referrals from existing participants.
  • Select participants: Based on the identified potential participants, select the individuals who will participate in the study. Make sure to select a sufficient number of participants to ensure that you have a representative sample.
  • Collect data: After selecting participants, collect data using the appropriate research methods. Depending on the research question and objectives, this may involve conducting interviews, administering surveys, or collecting observational data.
  • Analyze data: After collecting data, analyze it to answer the research question or objective. This may involve using statistical analysis, qualitative analysis, or a combination of both.
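
Steps 2–4 above amount to encoding the selection criteria as an explicit rule and applying it to a pool of potential participants. The sketch below assumes an invented candidate pool and invented criteria (experienced oncology nurses aged 30–50) purely to illustrate the filtering step.

```python
# Hypothetical candidate pool for a purposive sample of
# experienced oncology nurses aged 30-50.
candidates = [
    {"name": "A", "role": "nurse", "specialty": "oncology", "age": 34, "years": 8},
    {"name": "B", "role": "nurse", "specialty": "cardiology", "age": 41, "years": 12},
    {"name": "C", "role": "doctor", "specialty": "oncology", "age": 38, "years": 10},
    {"name": "D", "role": "nurse", "specialty": "oncology", "age": 47, "years": 15},
]

def meets_criteria(person):
    """Step 2: the selection criteria written down as an explicit predicate."""
    return (person["role"] == "nurse"
            and person["specialty"] == "oncology"
            and 30 <= person["age"] <= 50
            and person["years"] >= 5)

# Steps 3-4: identify and select everyone in the pool who qualifies.
selected = [p["name"] for p in candidates if meets_criteria(p)]
print(selected)  # A and D qualify
```

Writing the criteria down this explicitly also addresses the transparency concern noted under the disadvantages: another researcher can see, and replicate, exactly who was eligible.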

Examples of Purposive Sampling

Here are some examples of how purposive sampling might be used in research:

  • Studying the experiences of cancer survivors : A researcher might use maximum variation sampling to select a diverse group of cancer survivors, with the aim of capturing a range of experiences and perspectives on the impact of cancer on their lives.
  • Exploring the perspectives of teachers on a new curriculum : A researcher might use expert sampling to select teachers who are experts in a particular subject area or who have experience teaching the new curriculum. These teachers can provide valuable insights on the strengths and weaknesses of the new curriculum.
  • Investigating the impact of a new therapy on a specific population: A researcher might use homogeneous sampling to select participants who share certain characteristics, such as a particular diagnosis or age group. This can help the researcher assess the effectiveness of the new therapy on this specific population.
  • Examining the experiences of refugees resettling in a new country : A researcher might use critical case sampling to select participants who have experienced particularly challenging resettlement experiences, such as those who have experienced discrimination or faced significant barriers to accessing services.
  • Understanding the experiences of homeless individuals : A researcher might use snowball sampling to identify and select homeless individuals to participate in the study. This method allows the researcher to gain access to a hard-to-reach population and capture a range of experiences and perspectives on homelessness.

Applications of Purposive Sampling

Purposive sampling has a wide range of applications across different fields of research. Here are some examples of how purposive sampling can be used:

  • Medical research: Purposive sampling is commonly used in medical research to study the experiences of patients with specific medical conditions. Researchers might use homogeneous sampling to select patients who share specific medical characteristics, such as a particular diagnosis or treatment history.
  • Market research: In market research, purposive sampling can be used to select participants who represent a particular demographic or consumer group. This might involve using quota sampling to select participants based on age, gender, income, or other relevant criteria.
  • Education research: Purposive sampling can be used in education research to select participants who have specific educational experiences or backgrounds. For example, researchers might use maximum variation sampling to select a diverse group of students who have experienced different teaching styles or classroom environments.
  • Social science research : In social science research, purposive sampling can be used to select participants who have specific social or cultural backgrounds. Researchers might use snowball sampling to identify and select participants from hard-to-reach or marginalized populations.
  • Business research: In business research, purposive sampling can be used to select participants who have specific job titles, work in particular industries, or have experience with specific products or services.

Purpose of Purposive Sampling

The purpose of purposive sampling is to select participants based on specific criteria relevant to the research question or objectives. Unlike probability sampling techniques, which rely on random selection to ensure representativeness, purposive sampling allows researchers to select participants who are most relevant to their research question or objectives.

Purposive sampling is often used when the population of interest is rare, hard to reach, or has specific characteristics that are important to the research question. By selecting participants who meet specific criteria, researchers can gather valuable insights that can help inform their research.

The ultimate goal of purposive sampling is to increase the validity and reliability of research findings by selecting participants who are most relevant to the research question or objectives. This can help researchers to make more accurate inferences about the population of interest and to develop more effective interventions or solutions based on their findings.

When to use Purposive Sampling

Purposive sampling is appropriate when researchers need to select participants who meet specific criteria relevant to their research question or objectives. Here are some situations where purposive sampling might be appropriate:

  • Rare populations: Purposive sampling is often used when the population of interest is rare, such as people with a particular medical condition or individuals who have experienced a particular event or phenomenon.
  • Hard-to-reach populations: Purposive sampling is also useful when the population of interest is hard to reach, such as homeless individuals or individuals who have experienced trauma or abuse.
  • Specific characteristics: Purposive sampling is appropriate when researchers need to select participants with specific characteristics that are relevant to the research question, such as age, gender, or ethnicity.
  • Expertise : Purposive sampling is useful when researchers need to select participants with particular expertise or knowledge, such as teachers or healthcare professionals.
  • Maximum variation : Purposive sampling can be used to select participants who represent a range of perspectives or experiences, such as individuals from different socio-economic backgrounds or who have different levels of education.

Characteristics of Purposive Sampling

Purposive sampling has several characteristics that distinguish it from other sampling methods:

  • Non-random selection : Purposive sampling involves the deliberate selection of participants based on specific criteria, rather than random selection. This allows researchers to select participants who are most relevant to their research question or objectives.
  • Small sample sizes: Purposive sampling typically involves smaller sample sizes than probability sampling methods, as the focus is on selecting participants who meet specific criteria, rather than ensuring representativeness of the larger population.
  • Heterogeneous or homogeneous samples : Purposive sampling can involve selecting participants who are either similar to each other (homogeneous) or who are diverse and represent a range of perspectives or experiences (heterogeneous).
  • Multiple sampling strategies: Purposive sampling involves a range of sampling strategies that can be used to select participants, including maximum variation sampling, expert sampling, quota sampling, and snowball sampling.
  • Flexibility : Purposive sampling is a flexible method that can be adapted to suit different research questions and objectives. It allows researchers to select participants based on specific criteria, making it a useful method for exploring complex phenomena or researching hard-to-reach populations.

Advantages of Purposive Sampling

Purposive sampling has several advantages over other sampling methods:

  • Relevant participants: Purposive sampling allows researchers to select participants who are most relevant to their research question or objectives, ensuring that the data collected is of high quality and useful for the research.
  • Efficient : Purposive sampling is an efficient method of sampling, as it allows researchers to select participants based on specific criteria, rather than randomly selecting a large number of participants. This can save time and resources, especially when the population of interest is rare or hard to reach.
  • Representative : Purposive sampling can produce samples that are representative of the population of interest, as researchers can use a range of sampling strategies to select participants who are diverse and represent a range of perspectives or experiences.
  • Ethical considerations : Purposive sampling can be used to ensure that vulnerable or marginalized populations are included in research studies, ensuring that their voices and experiences are heard and taken into account.

Disadvantages of Purposive Sampling

Some disadvantages of purposive sampling are as follows:

  • Sampling bias: Purposive sampling is susceptible to sampling bias, as the participants are not randomly selected from the population. This means that the sample may not be representative of the larger population, and the findings may not be generalizable to other populations.
  • Limited generalizability: The findings obtained from purposive sampling may be limited in their generalizability due to the small sample size and the specific selection criteria used. Therefore, it may not be possible to make broad generalizations based on the findings of a purposive sample.
  • Lack of transparency : The selection criteria used in purposive sampling may not be transparent, and this can limit the ability of other researchers to replicate the study.
  • Reliance on researcher judgment : Purposive sampling relies on the researcher’s judgment to select participants based on specific criteria, which can introduce bias into the selection process.
  • Potential for researcher subjectivity : The researcher’s subjectivity and bias may influence the selection process and the interpretation of the data collected.

About the author


Muhammad Hassan

Researcher, Academic Writer, Web developer



