7 Different Ways to Control for Confounding

Confounding can be controlled in the design phase of the study by using:

  • Random assignment
  • Restriction
  • Matching

Or in the data analysis phase by using:

  • Stratification
  • Regression
  • Inverse probability weighting
  • Instrumental variable estimation

Here’s a quick summary of the similarities and differences between these methods:

Study Phase | Method | Can easily control for multiple confounders | Can control for unmeasured and unknown confounders | Can control for time-varying confounders
Design | Random assignment | YES | YES | YES
Design | Restriction | NO | NO | NO
Design | Matching | NO | NO | NO
Data analysis | Stratification | NO | NO | NO
Data analysis | Regression | YES | NO | NO
Data analysis | Inverse probability weighting | NO | NO | YES
Data analysis | Instrumental variable estimation | YES | YES | NO

In what follows, we will explain how each of these methods works, and discuss its advantages and limitations.

1. Random assignment

How it works.

Random assignment is a process by which each participant has the same chance of being assigned to either receive or not receive a certain exposure.

[Figure: each participant is assigned, with the same chance, to be either exposed or not]

Randomizing the exposure adjusts for confounding by eliminating the influence of the confounder on the probability of receiving the exposure:

[Figure: random assignment eliminates confounding by removing the association between the confounder and the exposure]
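To make this concrete, here is a minimal simulation sketch in Python (the variable names and probabilities are hypothetical, chosen only for illustration). It shows that a confounder-driven assignment leaves the confounder associated with the exposure, while random assignment removes that association:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 100_000

# A binary confounder (e.g. high vs. low income).
confounder = rng.binomial(1, 0.5, n)

# Confounded assignment: the confounder raises the chance of being exposed.
exposed_observational = rng.binomial(1, np.where(confounder == 1, 0.7, 0.3))

# Random assignment: every participant has the same 50% chance of exposure,
# regardless of the confounder.
exposed_randomized = rng.binomial(1, 0.5, n)

df = pd.DataFrame({"confounder": confounder,
                   "observational": exposed_observational,
                   "randomized": exposed_randomized})

# The confounder predicts exposure in the observational setting (~0.30 vs ~0.70)...
print(df.groupby("confounder")["observational"].mean())
# ...but not under random assignment (both close to 0.50).
print(df.groupby("confounder")["randomized"].mean())
```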

Advantage of random assignment:

Random assignment controls for confounding due to both measurable and unmeasurable causes. So it is especially useful when confounding variables are unknown or cannot be measured.

It also controls for time-varying confounders, that is, when the exposure and the confounders are measured repeatedly in studies where participants are followed over time.

Limitation of random assignment:

Here are 3 reasons not to use random assignment:

  • Ethical reason: Randomizing participants would be unethical when studying the effect of a harmful exposure, or on the contrary, when it is known for certain that the exposure is beneficial.
  • Practical reason: Some exposures are very hard to randomize, like air pollution and education. Also, random assignment is not an option when we are analyzing observational data that we did not collect ourselves.
  • Financial reason: Random assignment is a part of experimental designs where participants are followed over time, which turns out to be highly expensive in some cases.

Whenever the exposure cannot be randomly assigned to study participants, we will have to use an observational design and control for confounding by using another method from this list.

2. Restriction

Restriction refers to including in the study only participants from a single category of the confounder, thereby eliminating its confounding effect.

For instance, if the relationship between smoking (the exposure) and heart disease (the outcome) is confounded by income, then restricting our study to only include participants of the same income category will eliminate its confounding effect:

[Figure: causal diagram representing how restriction works]
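In code, restriction is nothing more than filtering the data before the analysis. Here is a minimal sketch with a hypothetical data set (the column names are made up for illustration):

```python
import pandas as pd

# Hypothetical participants: exposure (smoker), outcome (heart disease), confounder (income).
df = pd.DataFrame({
    "smoker":        [1, 0, 1, 0, 1, 0],
    "heart_disease": [1, 0, 1, 0, 0, 0],
    "income":        ["high", "high", "low", "low", "high", "low"],
})

# Restrict the study to a single income category: within this subset, income is
# constant for everyone and can no longer confound the smoking-disease relationship.
restricted = df[df["income"] == "high"]
print(restricted.groupby("smoker")["heart_disease"].mean())
```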

Advantage of restriction:

Unlike random assignment, restriction is easy to apply and also works for observational studies.

Limitation of restriction:

The biggest problem with restricting our study to 1 category of the confounder is that the results will not generalize well to the other categories. So restriction limits the external validity of the study, especially in cases where we have more than 1 confounder to control for.

3. Matching

Matching works by distributing the confounding variable evenly between the exposed and the unexposed groups.

The idea is to pair each exposed subject with an unexposed subject that shares the same characteristics regarding the variable that we want to control for. Then, by only analyzing participants for whom we found a match, we eliminate the confounding effect of that variable.

For example, suppose we want to control for income as a confounder of the relationship between smoking (the exposure) and heart disease (the outcome).

[Figure: causal diagram representing the confounding effect of income]

In this case, each smoker should be matched with a non-smoker of the same income category.

Here’s a step-by-step description of how this works:

Initially: The confounder is unequally distributed among the exposed and unexposed groups.

[Figure: graphical representation of the initial status before matching]

Step 1: Match each smoker with a non-smoker of the same income category.

[Figure: graphical representation of the first step of matching]

Step 2: Exclude all unmatched participants from the study.

[Figure: graphical representation of step 2 of matching]

Result: The 2 groups will be balanced regarding the confounding variable.
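Here is a minimal sketch of 1-to-1 exact matching on the income category (hypothetical data; real studies usually rely on dedicated matching software and may match on propensity scores instead):

```python
import pandas as pd

# Hypothetical participants: exposure (smoker) and confounder (income category).
df = pd.DataFrame({
    "id":     range(8),
    "smoker": [1, 1, 1, 0, 0, 0, 0, 0],
    "income": ["high", "low", "low", "high", "high", "low", "low", "low"],
})

matched_pairs = []
controls = df[df["smoker"] == 0].copy()

# Step 1: pair each smoker with one not-yet-used non-smoker of the same income category.
for _, case in df[df["smoker"] == 1].iterrows():
    candidates = controls[controls["income"] == case["income"]]
    if not candidates.empty:
        control = candidates.iloc[0]
        matched_pairs.append((case["id"], control["id"]))
        controls = controls.drop(control.name)  # each non-smoker is used at most once

# Step 2: unmatched participants are simply excluded from the analysis.
print(matched_pairs)
```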

Advantage of matching

Matching can be easy to apply in certain cases. For instance, matching on income in the example above can be done by selecting 1 smoker and 1 non-smoker from the same family, therefore having the same household income.

Limitation of matching

The more confounding variables we have to control for, the more difficult matching becomes, especially for continuous variables. The problem with matching on many characteristics is that a lot of participants will end up unmatched.

4. Stratification

Stratification controls for confounding by estimating the relationship between the exposure and the outcome within different subsets of the confounding variable, and then pooling these estimates.

Stratification works because, within each subset, the value of the confounder is the same for all participants and therefore cannot affect the estimated effect of the exposure on the outcome.

Here’s a step-by-step description of how to conduct a stratified analysis:

Step 1: Start by splitting the data into multiple subgroups (a.k.a. strata) according to the different categories of the confounding variable.

[Figure: splitting the data into subgroups]

Step 2: Within each subgroup (or stratum), estimate the relationship between the exposure and the outcome.

[Figure: calculating the estimate in each subgroup]

Step 3: Pool the obtained estimates:

  • By averaging them.
  • Or by weighting them by the size of each stratum — a method called standardization.

[Figure: pooling the estimates]

Result: The pooled estimate will be free of confounding.
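Here is a minimal sketch of a stratified analysis with standardization, using a tiny hypothetical data set; the "effect" in each stratum is just the difference in outcome risk between exposed and unexposed participants:

```python
import pandas as pd

# Hypothetical data: binary exposure and outcome, one categorical confounder.
df = pd.DataFrame({
    "exposed":    [1, 1, 0, 0, 1, 1, 1, 0, 0, 0],
    "outcome":    [1, 0, 0, 0, 1, 1, 0, 1, 0, 0],
    "confounder": ["low", "low", "low", "low", "high",
                   "high", "high", "high", "high", "high"],
})

estimates, weights = [], []
for level, stratum in df.groupby("confounder"):          # Step 1: split into strata
    risk = stratum.groupby("exposed")["outcome"].mean()  # Step 2: estimate within each stratum
    estimates.append(risk.get(1, 0) - risk.get(0, 0))
    weights.append(len(stratum))  # stratum size, used for standardization

# Step 3: pool the stratum estimates, weighting each by its size (standardization).
pooled = sum(e * w for e, w in zip(estimates, weights)) / sum(weights)
print(pooled)
```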

Advantage of stratification

Stratified analysis is an old and intuitive method used to teach the logic of controlling for confounding. A more modern and practical approach would be regression analysis, which is next on our list.

Limitation of stratification

Stratification does not scale well, since controlling for multiple confounders simultaneously will lead to:

  • Complex calculations.
  • Subgroups that contain very few participants, and these will reflect the noise in the data more so than real effects.

5. Regression

Adjusting for confounding using regression simply means including the confounding variable in the model used to estimate the influence of the exposure on the outcome.

A linear regression model, for example, will be of the form:

Outcome = β0 + β1 × Exposure + β2 × Confounder + ε

where the coefficient β1 reflects the effect of the exposure on the outcome adjusted for the confounder.
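As a minimal sketch of what this looks like in practice (simulated data with made-up effect sizes, using statsmodels), the confounded and adjusted estimates can be compared directly:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 5_000

# Simulated data: the confounder drives both the exposure and the outcome.
confounder = rng.normal(size=n)
exposure = 0.8 * confounder + rng.normal(size=n)
outcome = 2.0 * exposure + 1.5 * confounder + rng.normal(size=n)  # true effect = 2.0
df = pd.DataFrame({"outcome": outcome, "exposure": exposure, "confounder": confounder})

# Omitting the confounder biases the exposure coefficient upward here;
# adding it to the model recovers a value close to the true effect of 2.0.
print(smf.ols("outcome ~ exposure", data=df).fit().params["exposure"])
print(smf.ols("outcome ~ exposure + confounder", data=df).fit().params["exposure"])
```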

Advantage of regression

Regression can easily control for multiple confounders simultaneously, as this simply means adding more variables to the model.

For more details on how to use it in practice, I wrote a separate article: An Example of Identifying and Adjusting for Confounding.

Limitation of regression

A regression model operates under certain assumptions that must be respected. For example, for linear regression these are:

  • A linear relationship between the predictors (the exposure and the confounder) and the outcome.
  • Independence, normality, and equal variance of the residuals.

6. Inverse probability weighting

Inverse probability weighting eliminates confounding by equalizing the frequency of the confounder between the exposed and the unexposed groups. This is done by counting each participant as many times as their inverse probability of being in their exposure category.

Here’s a step-by-step description of the process:

Suppose we want to control for income as a confounder of the relationship between smoking (the exposure) and heart disease (the outcome):

[Figure: causal diagram of income confounding the relationship between smoking and heart disease]

Initially: Since income and smoking are associated, participants of different income levels will have different probabilities of being smokers.

[Figure: the variable income is unequally distributed between the exposure groups]

First, let’s focus on high-income participants:

[Figure: considering high-income participants]

Step 1: Calculate the probability “P” that a person is a smoker.

[Figure: calculating the probability of being in the smoking group for high-income participants]

Step 2: Calculate the probability that a person is a non-smoker (i.e. 1 − P).

[Figure: calculating the probability of being in the non-smoking group for high-income participants]

Step 3: Multiply each person by the inverse of their calculated probability. So each participant will no longer count as 1 person in the analysis. Instead, each will be counted as many times as their calculated inverse probability weight (i.e. 1 person will be 1/P persons).

[Figure: weighting each person by their inverse probability]

Now the smoking group has: 1 × 5 = 5 participants. And the non-smoking group also has: 4 × 5/4 = 5 participants.

Finally, we have to repeat steps 1, 2, and 3 for participants in the low-income category.

[Figure: repeating the process for the low-income group]

Result: The smoker and non-smoker groups are now balanced regarding income. So its confounding effect will be eliminated because it is no longer associated with the exposure.
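Here is a minimal sketch of the same logic on simulated data (hypothetical variable names and effect sizes). Each participant is weighted by the inverse of the probability of the exposure group they actually belong to, given their income:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
n = 20_000

# Simulated data: income (confounder) affects both smoking and heart disease.
high_income = rng.binomial(1, 0.5, n)
smoker = rng.binomial(1, np.where(high_income == 1, 0.2, 0.6))
heart_disease = rng.binomial(1, 0.05 + 0.10 * smoker + 0.05 * (1 - high_income))
df = pd.DataFrame({"high_income": high_income, "smoker": smoker,
                   "heart_disease": heart_disease})

# Steps 1-2: within each income level, the probability P of being a smoker.
p_smoker = df.groupby("high_income")["smoker"].transform("mean")

# Step 3: weight smokers by 1/P and non-smokers by 1/(1 - P).
df["weight"] = np.where(df["smoker"] == 1, 1 / p_smoker, 1 / (1 - p_smoker))

# Weighted risk difference between smokers and non-smokers (close to the true 0.10).
smokers, non_smokers = df[df["smoker"] == 1], df[df["smoker"] == 0]
risk_diff = (np.average(smokers["heart_disease"], weights=smokers["weight"])
             - np.average(non_smokers["heart_disease"], weights=non_smokers["weight"]))
print(risk_diff)
```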

Advantage of inverse probability weighting

This method belongs to a family known as G-methods, which are used to control for time-varying confounders, that is, when the exposure and the confounders are measured repeatedly in studies where participants are followed over time.

Limitation of inverse probability weighting

If some participants have very large weights (i.e. when their probability of being in a certain exposure category is very low), then each of these participants would be counted as a large number of people, which leads to instability in the estimation of the causal effect of the exposure on the outcome.

One solution would be to exclude from the study participants with very high or very low weights.

7. Instrumental variable estimation

The instrumental variable method estimates the unconfounded effect of the exposure on the outcome indirectly by using a variable — the instrumental variable — that represents the exposure but is not affected by confounding.

An instrumental variable satisfies 3 properties:

  • It causes the exposure.
  • It does not cause the outcome directly — it affects the outcome only through the exposure.
  • Its association with the outcome is unconfounded.

Here’s a diagram that represents the relationship between the instrumental variable, the exposure, and the outcome:

[Figure: instrumental variable representation in a causal diagram]

An instrumental variable is chosen so that nothing appears to cause it. So in a sense, it resembles the coin flip in a randomized experiment, because it appears to be randomly assigned.

How can the instrumental variable be used to study causality?

Looking at the data, if an association is found between the instrumental variable and the outcome then it must be causal, since according to property (3) above, their relationship is unconfounded. And because the instrumental variable affects the outcome only through the exposure, according to property (2), we can conclude that the exposure has a causal effect on the outcome.

[Figure: how the instrumental variable helps identify a causal relationship between the exposure and the outcome]

So how do we quantify this causal (unconfounded) effect of the exposure on the outcome?

Let “α1” denote the magnitude of the causal effect of the instrumental variable on the exposure, and “β1” that of the exposure on the outcome.

[Figure: α1 is the effect of the instrumental variable on the exposure and β1 is the effect of the exposure on the outcome]

So our objective is to find β1.

Note that the simple regression of the outcome on the exposure produces a confounded estimate of β1, and therefore does not reflect the true β1 that we are searching for.

So how do we find this true, unconfounded β1?

Technically, if we think in terms of linear regression:

  • α1 is the change in the exposure CAUSED by a 1 unit change in the instrumental variable.
  • β1 is the change in the outcome CAUSED by a 1 unit change in the exposure.

It follows that a 1 unit change in the instrumental variable CAUSES an α1 × β1 change in the outcome (since the instrumental variable only affects the outcome through the exposure).

And as discussed above, any association between the instrumental variable and the outcome is causal. So, α1 × β1 can be estimated from the following regression model:

Outcome = a0 + a1 × Instrumental variable

where a1 = α1 × β1.

And because any association between the instrumental variable and the exposure is also causal (and unconfounded), the following model can be used to estimate α1:

Exposure = b0 + b1 × Instrumental variable

where b1 = α1.

We end up with 2 equations:

  • α1 × β1 = a1
  • α1 = b1

A simple calculation yields: β1 = a1/b1, which is our estimated causal effect of the exposure on the outcome.
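Here is a minimal simulation sketch of this ratio (Wald-type) estimator with made-up effect sizes; the confounder is generated but deliberately never used in the regressions, as if it were unmeasured:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
n = 50_000

instrument = rng.normal(size=n)   # behaves as if randomly assigned; causes the exposure
confounder = rng.normal(size=n)   # unmeasured: never entered into any model below
exposure = 0.5 * instrument + confounder + rng.normal(size=n)
outcome = 2.0 * exposure + 3.0 * confounder + rng.normal(size=n)  # true beta_1 = 2.0
df = pd.DataFrame({"instrument": instrument, "exposure": exposure, "outcome": outcome})

# Confounded estimate: regressing the outcome directly on the exposure.
naive = smf.ols("outcome ~ exposure", data=df).fit().params["exposure"]

# a1: effect of the instrument on the outcome; b1: effect of the instrument on the exposure.
a1 = smf.ols("outcome ~ instrument", data=df).fit().params["instrument"]
b1 = smf.ols("exposure ~ instrument", data=df).fit().params["instrument"]

print(naive)    # biased away from 2.0 by the unmeasured confounder
print(a1 / b1)  # instrumental variable estimate, close to the true 2.0
```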

Advantage of instrumental variable estimation

Because the calculations that we just did are not dependent on any information about the confounder, we can use the instrumental variable approach to control for any measured, unmeasured, and unknown confounder.

This method is so powerful that it can be used even in cases where we do not know whether there is confounding between the exposure and the outcome, or which variables are suspect.

Limitation of instrumental variable estimation

In cases where the instrumental variable and the exposure are weakly correlated, the estimated effect of the exposure on the outcome will be biased.

The use of linear regression is also constrained by its assumptions, especially linearity and constant variance of the residuals.

As a rule of thumb, use the instrumental variable approach in cases where there are unmeasured confounders, otherwise, use other methods from this list since they will, in general, provide a better estimate of the causal relationship between the exposure and the outcome.

If you are interested, here are 3 Real-World Examples of Using Instrumental Variables.


Further reading

  • Front-Door Criterion to Adjust for Unmeasured Confounding
  • 4 Simple Ways to Identify Confounding
  • 5 Real-World Examples of Confounding [With References]
  • Why Confounding is Not a Type of Bias
  • Using the 4 D-Separation Rules to Study a Causal Association
  • List of All Biases in Research (Sorted by Popularity)


Confounding Variable: Definition & Examples

By Jim Frost

Confounding Variable Definition

In studies examining possible causal links, a confounding variable is an unaccounted factor that impacts both the potential cause and effect and can distort the results. Recognizing and addressing these variables in your experimental design is crucial for producing valid findings. Statisticians also refer to confounding variables that cause bias as confounders, omitted variables, and lurking variables.

[Figure: diagram that displays how confounding works]

A confounding variable systematically influences both an independent and dependent variable in a manner that changes the apparent relationship between them. Failing to account for a confounding variable can bias your results, leading to erroneous interpretations. This bias can produce the following problems:

  • Overestimate the strength of an effect.
  • Underestimate the strength of an effect.
  • Change the direction of an effect.
  • Mask an effect that actually exists.
  • Create Spurious Correlations.

Additionally, confounding variables reduce an experiment’s internal validity, thereby reducing its ability to make causal inferences about treatment effects. You don’t want any of these problems!

In this post, you’ll learn about confounding variables, the problems they cause, and how to minimize their effects. I’ll provide plenty of examples along the way!

What is a Confounding Variable?

Confounding variables bias the results when researchers don’t account for them. How can variables you don’t measure affect the results for variables that you record? At first glance, this problem might not make sense.

Confounding variables influence both the independent and dependent variable, distorting the observed relationship between them. To be a confounding variable, the following two conditions must exist:

  • It must correlate with the dependent variable.
  • It must correlate with at least one independent variable in the experiment.

The diagram below illustrates these two conditions. There must be non-zero correlations (r) on all three sides of the triangle. X1 is the independent variable of interest while Y is the dependent variable. X2 is the confounding variable.

[Figure: diagram that displays the conditions for confounding variables to produce bias]

The correlation structure can cause confounding variables to bias the results that appear in your statistical output. In short, the amount of bias depends on the strength of these correlations. Strong correlations produce greater bias. If the relationships are weak, the bias might not be severe. If any of the correlations are zero, the extraneous variable won’t produce bias even if the researchers don’t control for it.

Leaving a confounding variable out of a regression model can produce omitted variable bias.

Confounding Variable Examples

Exercise and Weight Loss

In a study examining the relationship between regular exercise and weight loss, diet is a confounding variable. People who exercise are likely to have other healthy habits that affect weight loss, such as diet. Without controlling for dietary habits, it’s unclear whether weight loss is due to exercise, changes in diet, or both.

Education and Income Level

When researching the correlation between the level of education and income, geographic location can be a confounding variable. Different regions may have varying economic opportunities, influencing income levels irrespective of education. Without controlling for location, you can’t be sure if education or location is driving income.

Exercise and Bone Density

I used to work in a biomechanics lab. For a bone density study, we measured various characteristics including the subjects’ activity levels, their weights, and bone densities among many others. Bone growth theories suggest that a positive correlation between activity level and bone density likely exists. Higher activity should produce greater bone density.

Early in the study, I wanted to validate our initial data quickly by using simple regression analysis to assess the relationship between activity and bone density. There should be a positive relationship. To my great surprise, there was no relationship at all!

Long story short, a confounding variable was hiding a significant positive correlation between activity and bone density. The offending variable was the subjects’ weight, because it correlates with both the independent variable (activity) and the dependent variable (bone density), thus allowing it to bias the results.

After including weight in the regression model, the results indicated that both activity and weight are statistically significant and positively correlate with bone density. Accounting for the confounding variable revealed the true relationship!

The diagram below shows the signs of the correlations between the variables. In the next section, I’ll explain how the confounder (Weight) hid the true relationship.

[Figure: diagram of the bone density model]

Related post: Identifying Independent and Dependent Variables

How the Confounder Hid the Relationship

The diagram for the Activity and Bone Density study indicates the conditions exist for the confounding variable (Weight) to bias the results because all three sides of the triangle have non-zero correlations. Let’s find out how leaving the confounding variable of weight out of the model masked the relationship between activity and bone density.

The correlation structure produces two opposing effects of activity. More active subjects get a bone density boost directly. However, they also tend to weigh less, which reduces bone density.

When I fit a regression model with only activity, the model had to attribute both opposing effects to activity alone. Hence, the zero correlation. However, when I fit the model with both activity and weight, it could assign the opposing effects to each variable separately.

Now imagine if we didn’t have the weight data. We wouldn’t have discovered the positive correlation between activity and bone density. Hence, the example shows the importance of controlling confounding variables. Which leads to the next section!
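The masking mechanism described above can be reproduced with a small simulation (the numbers below are purely illustrative, not the study's data): activity raises bone density directly, more active subjects weigh less, and weight also raises bone density, so the two paths roughly cancel when activity is modeled alone:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(4)
n = 10_000

activity = rng.normal(size=n)
weight = -0.8 * activity + rng.normal(size=n)  # more active -> lower weight
bone_density = 1.0 * activity + 1.2 * weight + rng.normal(size=n)
df = pd.DataFrame({"activity": activity, "weight": weight, "bone_density": bone_density})

# Activity alone: the direct boost and the indirect weight penalty nearly cancel (~0).
print(smf.ols("bone_density ~ activity", data=df).fit().params["activity"])

# Adding the confounder lets the model assign each effect to its own variable,
# revealing the positive relationship between activity and bone density.
print(smf.ols("bone_density ~ activity + weight", data=df).fit().params["activity"])
```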

Reducing the Effect of Confounding Variables

As you saw above, accounting for the influence of confounding variables is essential to ensure your findings’ validity. Here are four methods to reduce their effects.

Restriction

Restriction involves limiting the study population to a specific group or criteria to eliminate confounding variables.

For example, in a study on the effects of caffeine on heart rate, researchers might restrict participants to non-smokers. This restriction eliminates smoking as a confounder that can influence heart rate.

Matching

This process involves pairing subjects by matching characteristics pertinent to the study. Then, researchers randomly assign one individual from each pair to the control group and the other to the experimental group. This randomness helps eliminate bias, ensuring a balanced and fair comparison between groups. This process controls confounding variables by equalizing them between groups. The goal is to create groups as similar as possible except for the experimental treatment.

For example, in a study examining the impact of a new education method on student performance, researchers match students on age, socioeconomic status, and baseline academic performance to control these potential confounders.

Learn more about Matched Pairs Design: Use & Examples.

Random Assignment

Randomly assigning subjects to the control and treatment groups helps ensure that the groups are statistically similar, minimizing the influence of confounding variables.

For example, in clinical trials for a new medication, participants are randomly assigned to either the treatment or control group. This random assignment helps evenly distribute variables such as age, gender, and health status across both groups.

Learn more about Random Assignment in Experiments.

Statistical Control

Statistical control involves using analytical techniques to adjust for the effect of confounding variables in the analysis phase. Researchers can use methods like regression analysis to control potential confounders.

For example, I showed you how I controlled for weight as a confounding variable in the bone density study. Including weight in the regression model revealed the genuine relationship between activity and bone density.

Learn more about controlling confounders by using regression analysis.

By incorporating these strategies into research design and analysis, researchers can significantly reduce the impact of confounding variables, leading to more accurate results.

If you aren’t careful, the hidden hazards of a confounding variable can completely flip the results of your experiment!



Reader Interactions


January 15, 2024 at 10:02 am

To address this potential problem, I collect all the possible variables and create a correlation matrix to identify all the correlations, their direction, and their statistical significance, before regression.


January 15, 2024 at 2:54 pm

That’s a great practice for understanding the underlying correlation structure of your data. Definitely a good thing to do along with graphing the scatterplots for all those pairs because they’re good at displaying curved relationships that might not register with Pearson’s correlation.

It’s been a while since I worked on the bone density study, but I’m sure I created that correlation & scatterplot matrix to get the lay of the land.

A couple of caveats:

Those correlations are pairwise relationships, equivalent to one predictor for a response (but without the directionality). So, those correlations can be affected by a confounding variable just like a simple regression model. Going back to the example in my post, if I did a pairwise correlation between all variables, including activity and bone density, that would’ve still been essentially zero–affected by the weight confounder in the same way as the regression model. At least with a correlation matrix, you’d be able to piece together that weight was a confounder likely affecting the other correlation.

And a confounder can exist outside your dataset. You might not have even measured a confounder, so it won’t be in your correlation matrix, but it can still impact your results. Hence, it’s always good to consider variables that you didn’t record as well.

I’m guessing you know all that, I’m more spelling it out for other readers.

And if I remember correctly, your background is more with randomized experiments. The random assignment process should break any correlation between a confounder and the outcome, making it essentially zero. Consequently, randomized experiments tend to prevent confounding variables from affecting the results.
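For readers who want to try the practice discussed in this exchange, here is a minimal sketch (with simulated stand-in data, not the bone density study) of building the correlation matrix and the companion scatterplot matrix; it assumes matplotlib is available for the plot:

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

rng = np.random.default_rng(5)
n = 200

activity = rng.normal(size=n)
weight = -0.5 * activity + rng.normal(size=n)
bone_density = activity + weight + rng.normal(size=n)
df = pd.DataFrame({"activity": activity, "weight": weight, "bone_density": bone_density})

# Pairwise Pearson correlations between all candidate variables.
print(df.corr())

# Scatterplot matrix, useful for spotting curved relationships that Pearson's r can miss.
pd.plotting.scatter_matrix(df, figsize=(8, 8))
plt.show()
```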


July 17, 2023 at 11:11 am

Hi Jim, In multivariate regression, I have always removed variables that aren’t significant. However, recently a reviewer said that this approach is unjustified. Is there a consensus about this? a reference article? thanks, Ray

July 17, 2023 at 4:52 pm

Hi Raymond,

I don’t have an article handy to refer you to. But based on what happens to models when you retain and exclude variables, I recommend the following approach.

Deciding whether to eliminate an insignificant independent variable from a regression model requires a thorough understanding of the theoretical implications related to that variable. If there’s strong theoretical justification for its inclusion, it might be advisable to keep it within the model, despite its insignificance.

Maintaining an insignificant variable in the model does not typically degrade its overall performance. On the contrary, removing a theoretically justified but insignificant variable can lead to biased outcomes for the remaining independent variables, a situation known as omitted variable bias . Therefore, it can be beneficial to retain an insignificant variable within the model.

It’s vital to consider two major aspects when making this decision. Firstly, whether there’s strong theoretical support for retaining the insignificant variable, and secondly, whether excluding it has a significant impact on the coefficient estimates of the remaining variables. In short, if you remove an insignificant variable and the other coefficients change, you need to assess the situation.

If there are no theoretical reasons to retain an insignificant variable and removing it doesn’t appear to bias the result, then you probably should remove it because it might increase the precision of your model somewhat.

Consequently, I advise “considering” the removal of insignificant independent variables from the model, instead of asserting that you “should” remove them, as this decision depends on the aforementioned factors and is not a hard-and-fast rule. Of course, when you do the write-up, explain your reasoning for including insignificant variables along with everything else.


January 16, 2023 at 5:31 pm

Thank you very much! That helped a lot.

January 15, 2023 at 9:12 am

thank you for the interesting post. I would like to ask a question because I think that I am very much stuck into a discipline mismatch. I come from economics but I am now working in the social sciences field.

You describe the conditions for confounding bias: 1) there is a correlation between x1 and x2 (the OVB), 2) x1 associates with y, 3) x2 associates with y. I interpret 1) as meaning that sometimes x1 may determine x2 or the contrary.

However, I read quite recently a social stat paper in which they define confounding bias differently. 2) and 3) still hold, but 1) says that x2 –> x1, not the contrary. So, the direction of the relationship cannot go the other way around. Otherwise that would be mediation.

I am a bit confused and think that this could be due to the different disciplines but I would be interested in knowing what you think.

Thank you. Best, Vero

January 16, 2023 at 12:56 am

Hi Veronica,

Some of your notation looks garbled in the comment, but I think I get the gist of your question. Unfortunately, the comments section doesn’t handle formatting well!

So, X1 and X2 are explanatory variables while Y is the outcome. The two X variables correlate with each other and the Y variable. In this scenario, yes, if you exclude X2, it will cause some degree of omitted variable bias. It is a confounding variable. The degree of bias depends on the collective strength of all three correlations.

Now, as for the question of the direction of the relationship between X1 and X2, that doesn’t matter statistically. As long as the correlation is there, the potential for confounding bias exists. This is true whether the relationship between X1 and X2 is causal in either direction or totally non-causal. It just depends on the set of correlations existing.

I think you’re correct in that this is a difference between disciplines.

The social sciences define a mediator variable as explaining the process by which two variables are related, which gets to your point about the direction of a causal relationship. When X1 –> X2, I’d say that the social sciences would call that a mediator variable AND that X2 is still a confounder that will cause bias if it is omitted from the model. Both things are true.

I hope that helps!


October 10, 2022 at 11:07 am

Thanks in advance for your awesome content.

Regarding this question brought by Lucy, I want to ask the following: If introducing variables reduces the bias (because the model controls for it), why don’t we just insert all variables at once to see the real impact of each variable?

Let’s say I have a dataset of 150 observations and I want to study the impact of 20 variables (dummies and continuous). Is it advantageous to introduce everything at once and see which variables are significant? I got the idea that introducing variables is always positive because it forces the model to show the real effects (of course I am talking about well-founded variables), but are there any caveats to doing so? Is it possible that some variables may in fact “hide” the significance of others because they will overshadow the other regressors? Usually it is said that, if the significance changes when introducing a variable, it was due to confounding. My question now is: is it possible that confounding was not the case and, in fact, the significance is just being hidden due to the presence of a much stronger predictor?

October 10, 2022 at 8:10 pm

In some ways, you’re correct. Generally speaking, it is better to include too many variables than too few. However, there is a cost for including more variables than necessary, particularly when they’re not significant. Adding more variables than needed increases the model’s variance, which reduces statistical power and precision of the estimates. Ideally, you want a balance of all the necessary variables, no more, and no less. I write about this tradeoff in my post about selecting the best model . That should answer a lot of your questions.

I think the approach of starting with a model with all possible variables has merit. You can always start removing the ones that are not significant. Just do that by removing one at a time and start by removing the least significant. Watch for any abrupt changes in coefficient signs and p-values as you remove each one.

As for caveats, there are rules of thumb as to how many independent variables you can include in a model based on how many observations you have. If you include too many, you can run into overfitting, which can produce whacky results. Read my post about overfitting models for information about that. So, in some cases, you just won’t be able to add all the potential variables at once, but that depends on the number of variables versus the number of observations. The overfitting post describes that.

And, to answer your last question, overfitting is another case where adding variables can change the significance that’s not due to confounding.


January 20, 2022 at 8:10 am

Thanks for the clear explanation, it was really helpful! I do have a question regarding this sentence: “The important takeaway here is that leaving out a confounding variable not only reduces the goodness-of-fit (larger residuals), but it can also bias the coefficient estimates.”

Is it always the case that leaving out a confounding variable leads to a lesser fit? I was thinking about the case of positive bias: say variables x and y are both negatively correlated with the dependent variable, but x and y are positively correlated with each other. If a high value for x is caused by a high value of y both variables ‘convey the information’ of variable y. So adding variable x to a model wouldn’t add any additional information, and thus wouldn’t improve the fit of the model.

Am I making a mistake in my reasoning somewhere? Or does leaving out a confounding variable not lead to a worse fit in this case?

Thanks again for the article! Sterre

January 20, 2022 at 2:20 pm

Think about it this way. In general, adding an IV always causes R-squared to increase to some degree–even when it’s only a chance correlation. That still applies when you add a confounding variable. However, with a confounding variable, you know it’s an appropriate variable to add.

Yes, the correlation with the IV in the model might capture some of the confounder’s explanatory power, but you can also be sure that adding it will cause the model to fit better. And, again, it’s an entirely appropriate variable to include because of its relationship with the DV (i.e., you’re not adding it just to artificially inflate R-squared/goodness-of-fit). Additionally, unless there’s a perfect correlation between the included IV and the confounder, the included IV can’t contain all the confounder’s information. But, if there was a perfect correlation, you wouldn’t be able to add both anyway.

There are cases where you might not want to include the confounder. If you’re mainly interested in making predictions and don’t need to understand the role of each IV, you might not need to include the confounder if your model makes sufficiently precise predictions. That’s particularly true if the confounder is difficult/expensive to measure.

Alternatively, if there is a very high, but not perfect correlation, between the included IV and the confounder, adding the confounder might introduce too much multicollinearity , which causes its own problems. So, you might be willing to take the tradeoff between exchanging multicollinearity issues for omitted variable bias. However, that’s a very specific weighing of pros and cons given the relative degree of severity for both problems for your specific model. So, there’s no general advice for which way to go. It’s also important to note that there are other types of regression analysis (Ridge and LASSO) that can effectively handle multicollinearity, although at the cost of introducing a slight bias. Another possibility to balance!

But, to your main question, yes, if you add the confounder, you can expect the model fit to improve to some degree. It may or may not be an improvement that’s important in a practical sense. Even if the fit isn’t notably better, it’s often worthwhile adding the confounder to address the bias.


May 2, 2021 at 4:23 pm

Jim, this was a great article, but I do not understand the table. I am sure it is easy, and I am missing something basic. What does it mean to be included and omitted: negative correlation, etc., in the 2-way by 2-way table? I cannot wrap my head around the titles and corresponding scenarios. Thanks, John

May 3, 2021 at 9:39 pm

When I refer to “included” and “omitted,” I’m talking about whether the variable in question is an independent variable IN the model (included), or a potential independent variable that is NOT in the model (omitted). After all, we’re talking about omitted variable bias, which is the bias caused by leaving an important variable out of the model.

The table allows you to determine the direction the coefficient estimate is being biased if you can determine the direction of the correlation between several variables.

In the example, I’m looking at a model where Activity (the included IV) predicts the bone density of the individual (the DV). The omitted confounder is weight. So, now we just need to assess the relationships between those variables to determine the direction of the bias. I explain the process of using the table with this example in the paragraph below the table, so I won’t retype it here. But, if you don’t understand something I write there, PLEASE let me know and I’ll help clarify it!

In the example, Activity = Included, Weight = Omitted, and Dependent = Bone Density. I use the signs from the triangle diagram that I include a ways before the table, which lists these three variables, to determine the column and row to use.

Again, I’m not sure which part is tripping you up!


April 27, 2021 at 2:23 am

Thank you Jim! The two groups are both people with illness, only different because they are illnesses that occur at different ages. The first illness group is of younger age, around 30, the other of older age, around 45. Overlap of ages between these groups is very minimal. By control group, I meant a third group of healthy people without illness, with ages uniformly distributed in the range represented in the two patient groups, and thus the group factor having three levels now. I was thinking this could reduce the previous problem of directly comparing the young and old patient groups, where adding age as a covariate can cause a collinearity problem.

April 28, 2021 at 10:42 pm

Ah, ok. I didn’t realize that both groups had an illness. Usually a control group won’t have a condition.

I really wouldn’t worry about the type of multicollinearity you’re referring to. You’d want to include those two groups and age plus the interaction term, which you could remove if it’s not significant. If the two groups were completely distinct in age and had a decent gap between them, there are other model estimation problems to worry about, but that doesn’t seem to be the case. If age is a factor in this study area, you definitely don’t want to exclude it. Including it allows you to control for it. Otherwise, if you leave it out, the age effect will get rolled into the groups and, thereby, bias your results. Including age is particularly important in your case because you know the groups are unbalanced in age. You don’t want the model to attribute the difference in outcomes to the illness condition when it’s actually age that is unbalanced between those two conditions. I’d go so far as to say that your model urgently needs you to include age!

That said, I would collect a true control group that has healthy people and ideally a broad range of ages that covers both groups. That will give you several benefits. Right now, you won’t know how your illness groups compare to a healthy group. You’ll only know how they compare to each other. Having that third group will allow you to compare each illness group to the healthy group. I’m assuming that’s useful information. Plus, having a full range of ages will allow the model to produce a better estimate of the age effect.

April 26, 2021 at 6:51 am

Hi JIm, Thanks a lot for your intuitive explanations!!

I want to study the effect of two Groups of patients (X1) on y (a test performance score), in a GLM framework. Age (X2) and Education (X3) are potential confounders on y.

However, it’s not possible to match these two groups for age, as they are illnesses that occur in different age groups: one group is younger than the other. Hence the mean ages are significantly different between these groups.

I’m afraid adding age as a covariate could potentially cause a multicollinearity problem, as age is significantly different between groups, and make the estimation of the group effect (β1) erroneous, although it might improve the model. Is recruiting a control group with an age distribution comparable to the pooled patient groups, hence of a mean age mid-way between the two patient groups, a good idea to improve the statistical power of the study? In this case my group factor X1 will have three levels. Can this reduce the multicollinearity problem to an extent, as the ages of patients in the two patient groups are approximately represented in the control group also? Should I add an interaction term of Age*Group in the GLM to account for the age difference between groups? Thank you in advance. -Mohan

April 26, 2021 at 11:13 pm

I’d at least try including age to see what happens. If there’s any overlap in age between the two groups, I think you’ll be ok. Even if there is no overlap, age is obviously a crucial variable. My guess would be that it’s doing more harm by excluding it from the model when it’s clearly important.

I’m a bit confused by what you’re suggesting for the control group. Isn’t one of your groups those individuals with the condition and the other without it?

It does sound possible that there would be an interaction effect in this case. I’d definitely try fitting and see what the results are! That interaction term would show whether the relationship between age and test score is different between the groups.


April 26, 2021 at 12:44 am

In the paragraph below the table, both weight and activity are referred to as included variables.

April 26, 2021 at 12:50 am

Hi Joshua, yes, you’re correct! A big thanks! I’ve corrected the text. In that example, activity is the included variable, weight is the omitted variable, and bone density is the dependent variable.


April 24, 2021 at 1:06 pm

Hi, Jim. Great article. However, is that a typo in the direction of omitted variable bias table? For the rows, it makes more sense to me if they were “correlation between dependent and omitted variables” instead of “between dependent and included variables”.

April 25, 2021 at 11:21 pm

No, that’s not a typo!


April 22, 2021 at 9:53 am

Please let me know if this summary makes sense. Again, Thanks for the great posts !

Scenario 1: There are 10 IVs. They are modeled using OLS. We get the regression coefficients.

Scenario 2: One of the IVs is removed. It is not a confounder. The only impact is on the residuals (they increase). The coefficients obtained in Scenario 1 remain intact. Is that correct?

Scenario 3: The IV that was removed in Scenario 2 is placed back into the mix. This time, another IV is removed. Now this one’s a confounder. OLS modeling is re-run. There are 3 results.

1) The residuals increase — because it is correlated with the dependent variable. 2) The coefficient of the other IV, to which this removed confounder is correlated, changes. 3) The coefficients of the other IVs remain intact.

Are these 3 scenarios an accurate summary, Jim? A reply would be much appreciated!

Again, do keep up the good work.

April 25, 2021 at 11:26 pm

Yes, that all sounds right on! 🙂

April 22, 2021 at 8:37 am

Great post, Jim !

Probably a basic question, but would appreciate your answer on this, since we have encountered this in practical scenarios. Thanks in advance.

What if we know of a variable that should get included on the IV side, we don’t have data for that, we know (from domain expertise) that it is correlated with the dependent variable, but it is not correlated with any of the IVs…In other words, it is not a confounding variable in the strictest sense of the term (since it is not correlated to any of the IVs).

How do we account for such variables?

Here again the solution would be to use proxy variables? In other words, can we consider proxy variables to be a workaround for not just confounders, but also non-confounders of the above type ?

Thanks again !

April 23, 2021 at 11:20 pm

I discuss several methods in this article. The one I’d recommend if at all possible is identifying a proxy variable that stands in for the important variable that you don’t have. It sounds like in your case it’s not a confounder. So, it’s probably not biasing your other coefficients. However, your model is missing important information. You might be able to improve the precision using a proxy variable.


March 19, 2021 at 10:45 am

Hi Jim, that article is helping me a lot during my research project, thank you so much for that! However, there is one question for which I couldn’t find a satisfactory answer on the internet, so I hope that maybe you can shed some light on this: In my panel regression, I have my main independent variable on “Policy Uncertainty”, which captures uncertainty related to the possible impact of future government policies. It is based on an index that has a mean of 100. My dependent variable is whether a firm has received funding in quarter t (Yes = 1, No = 0), thus I want to estimate the impact of policy uncertainty on the likelihood of receiving external funding. In my baseline regression, the coefficient on policy uncertainty is insignificant, suggesting that policy uncertainty has no impact. When I now add a proxy for uncertainty related to financial markets (e.g. implied stock market volatility), then policy uncertainty becomes significant at the 1% level and the market uncertainty proxy is statistically significant at the 1% level too! The correlation between both is rather low, 0.2. Furthermore, both have opposite signs (policy uncertainty is positively associated with the likelihood of receiving funding); additionally, the magnitude of the coefficients is comparable.

Now I am wondering what this tells me… did the variable on policy uncertainty previously capture the effect of market uncertainty before including the latter in the regression? Would be great if you could help 🙂

March 19, 2021 at 2:56 pm

Thanks for writing with the interesting questions!

First, I’ll assume you’re using binary logistic regression because you have a binary dependent variable. For logistic regression, you don’t interpret the coefficients the same way as you do for, say, least squares regression. Typically, you’ll assess the odds ratio to understand the IV’s relationship to the binary DV.

On to your example. It’s entirely possible that leaving out market uncertainty was causing omitted variable bias in the policy uncertainty. That might be what is happening. But, the positive sign of one and the negative sign of the other could be cancelling each other out when you only include the one. That is what happens in the example I use in this post. However, for that type of bias/confounding, you’d expect there to be a correlation between the two IVs, and you say it is low.

Another possibility is the fact that for each variable in a model, the significance refers to the Adj SS for the variable, which factors in all the other variables before entering the variable in question. So, the policy uncertainty in the model with market volatility is significant after accounting for the variance that the other variables explain, including market volatility. For the model without market volatility, the policy uncertainty is not significant in that different pool of remaining variability. Given the low correlation (0.2) between those two IVs, I’d lean towards this explanation. If there was a stronger correlation between the policy/market uncertainty, I’d lean towards omitted variable bias.

Also be sure that your model doesn’t have any other type of problems, such as overfitting or patterns in the residual plots . Those can cause weird things to happen with the coefficients.

It can be unnerving when the significance of one variable depends entirely on the presence of another variable. It makes choosing the correct model difficult! I’d let theory be your guide. I write about that towards the end of my post about selecting the correct regression model. That’s written in the context of least squares regression, but the same ideas about theory and other research apply here.

You should definitely investigate this mystery further!


February 11, 2021 at 12:31 am

Thank you for this blog. I have a question: If two independent variables are correlated, can we not convert one into the other and replace that in the model? For example, if Y = X1 + X2, and X2 = −0.5X1, then Y = 0.5X1. However, I don’t see that as a suggestion in the blog. The blog mentions that activity is related to weight, but then somehow both are finally included in the model, rather than replacing one with the other in the model. Will this not help with multicollinearity, too? I am sure I am missing something here that you can see, but I am unable to find that out. Can you please help?

Regards, Kushal Jain

February 11, 2021 at 4:45 pm

Why would you want to convert one to another? Typically, you want to understand the relationship between each independent variable and the dependent variable. In the model I talk about, I’d want to know the relationship between both activity and weight with bone density. Converting activity to weight does not help with that.

And, I’m not understanding what you mean by “then somehow both are finally included in the model.” You just include both variables in the model the normal way.

There’s no benefit to converting the variables as you describe and there are reasons not to do that!


November 25, 2020 at 2:22 pm

Hi Jim, I have been trying to figure out covariates for a study we are doing for some time. My colleague believes that if two covariates have a high correlation (>20%) then one should be removed from the model. I’m assuming this is true unless both are correlated to the dependent variable, per your discussion above? Also, what do you think about selecting covariates by using the 10% change method? Any thoughts would be helpful. We’ve had a heck of a time selecting covariates for this study. Thanks, Erin

November 27, 2020 at 2:06 am

It’s usually ok to have covariates that have a correlation greater than 20%. The exact value depends on the number of covariates and the strength of their correlations. But 20% is low and almost never a problem. When covariates are correlated, it’s known as multicollinearity. And, there’s a special measure known as the VIF that determines whether you have an excessive amount of correlation amongst your covariates. I have a post that discusses multicollinearity and how to detect and correct it.

I have not used the 10% change method myself. However, I would suggest using that method only as one point of information. I’d really place more emphasis on theory and understanding the subject area. However, observing how much a covariate changes can provide useful information about whether bias is a problem or not. In general, if you’re uncertain, I’d err on the side of unnecessarily including a covariate rather than leaving it out. There are usually fewer problems associated with having an additional variable than omitting one. However, keep an eye on the VIFs as you do that. And, having a number of unnecessary variables could lead to problems if taken to an extreme or if you have a really small sample size.

I wrote a post about model selection . I give some practical tips in it. Overall, I suggest using a mix of theory, subject area knowledge, and statistical approaches. I’d suggest reading that. It’s not specifically about controlling for confounders but the same principles apply. Also, I’d highly recommend reading about what researchers performing similar studies have done if that’s at all possible. They might have already addressed that issue!


November 5, 2020 at 6:29 am

Hi Jim, I’m not sure whether my problem fits under this category or not, so apologies if not. I am looking at whether an inflammatory biomarker (independent variable) correlates with a measure of cognitive function (dependent variable). It does if it’s just a simple linear regression; however, the biomarker (independent variable) is affected by age, sex and whether you’re a smoker or not. Correcting for these 3 covariables in the model shows that actually there is no correlation between the biomarker and cognitive function. I assume this was the correct thing to do but wanted to make sure seeing as a) none of the 3 covariables correlate with/predict my dependent variable, and b) as age correlates highly with the biomarker, does this not introduce collinearity? Thanks! Charlotte

November 6, 2020 at 9:46 pm

Hi Charlotte,

Yes, it sounds like you did the right thing. Including the other variables in the model allows the model to control for them.

The collinearity (aka multicollinearity or correlation between independent variables) between age and the biomarker is a potential concern. However, a little correlation, or a moderate amount of correlation is fine. What you really need to do is to assess the VIFs for your independent variables. I discuss VIFs and multicollinearity in my post about multicollinearity . So, your next step should be to determine whether you have problematic levels of multicollinearity.

One symptom of multicollinearity is a lack of statistical significance, which your model is experiencing. So, it would be good to check.

Actually, I’m noticing that at least several of your independent variables are binary. Smoker. Gender. Is the biomarker also binary? Present or not present? If so, that doesn’t change the rationale for including the other variables in the model, but it does mean VIFs won’t detect the multicollinearity.
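As a reference for the VIF check mentioned here, a minimal sketch with simulated stand-in data (statsmodels provides variance_inflation_factor); values above roughly 5-10 are the usual warning sign:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(6)
n = 500

# Hypothetical predictors: age correlates strongly with the biomarker.
age = rng.normal(50, 10, n)
biomarker = 0.9 * age + rng.normal(0, 5, n)
smoker = rng.binomial(1, 0.3, n)
X = sm.add_constant(pd.DataFrame({"age": age, "biomarker": biomarker, "smoker": smoker}))

# One VIF per predictor (the constant is skipped).
for i, name in enumerate(X.columns):
    if name != "const":
        print(name, variance_inflation_factor(X.values, i))
```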


October 28, 2020 at 9:33 pm

Thanks for the clarification, Jim. Best regards.

October 24, 2020 at 11:30 pm

I think the section on “Predicting the Direction of Omitted Variable Bias” has a typo on the first column, first two rows. It should state:

*Omitted* and Dependent: Negative Correlation

*Omitted* and Dependent: Positive Correlation

This makes it consistent with the required two conditions for Omitted Variable Bias to occur:

The *omitted* variable must correlate with the dependent variable. The omitted variable must correlate with at least one independent variable that is in the regression model.

October 25, 2020 at 12:24 am

Hi Humberto,

Thanks for the close reading of my article! The table is correct as it is, but you are also correct. Let’s see why!

There are the following two requirements for omitted variable bias to exist: *The omitted variable must correlate with an IV in the model. *That IV must correlate with the DV.

The table accurately depicts both those conditions. The columns indicate the relationship between the IV (included) and omitted variable. The rows indicate the nature of the relationship between the IV and DV.

If both those conditions are true, you can then infer that there is a correlation between the omitted variable and the dependent variable and the nature of the correlation, as you indicate. I could include that in the table, but it is redundant information.

We’re thinking along the same lines and portraying the same overall picture. Alas, I’d need to use a three dimensional matrix to portray those three conditions! Fortunately, using the two conditions that I show in the table, we can still determine the direction of bias. And you could use those two relationships to determine the relationship between the omitted variable and dependent variable if you so wanted. However, that information doesn’t change our understanding of the direction of bias because it’s redundant with information already in the table.

Thanks for the great comment and it’s always beneficial thinking through these things using a different perspective!


August 14, 2020 at 3:00 am

Thank you for the intuitive explanation, Jim! I would like to ask a query. Suppose I have two groups, one with a recently diagnosed lung disease and another with chronic lung disease, where I would like to do an independent t-test for the amount of lung damage. It happens that the two groups also significantly differ in their mean age. The group with recently diagnosed disease has a lower mean age than the group with chronic disease. Also, theory says age can cause some damage in the lung as a normal course too. So if I include age as a covariate in the model, won't it regress out the effect on the DV and give an underestimated effect, as the IV (age) significantly correlates with the DV (lung damage)? How do we address this confounding effect of correlation between only the IV and DV? Should it be by having a control group without lung disease? If so, can one control group help? Or should there be 2 control groups, age-matched to the two study groups? Thank you in advance.

August 15, 2020 at 3:46 pm

Hi Vineeth,

First, yes, if you know age is a factor, you should include it as a covariate in the model. It won’t “regress out” the true effect between the two groups. I would think of it a little differently.

You have two groups and you suspect that something caused those two groups to have differing amounts of lung damage. You also know that age plays a role. And those groups have different ages. So, if you look only at the groups without factoring in age, the effect of age is still present but the model is incorrectly attributing it to the groups. In your case, it will make the effect look larger.

When you include age, yes, it will reduce the effect size between the groups, but it'll reveal the correct effect by accounting for age. So, yes, in your case, it'll make the group difference look smaller, but don't think of it as "regressing out" the effect; instead, it is removing the bias in the other results. In other words, you're improving the quality of your results.

When you look at your model results for, say, the grouping variable, it's already controlling for the age variable. So, you're left with what you need: just the effect between the IV and DV that is not accounted for by another variable in the model, such as age. That's what you need!

A control group for any experiment is always a good idea if you can manage one. However, it's not always possible. I write about these experimental design issues, randomized experiments, observational studies, how to design a good experiment, etc., among other topics in my Introduction to Statistics ebook, which you might consider. It's also just now available in print on Amazon!


August 12, 2020 at 7:04 am

I was wondering whether it’s correct to check the correlation between the independent variables and the error term in order to check for endogeneity. If we assume that there is endogeneity then the estimated errors aren’t correct and so the correlation between the independent variables and those errors doesn’t say much. Am I missing something here?

best regards,


July 15, 2020 at 1:57 pm

I wanted to look at the effects of confounders on my study, but I'm not sure what analysis(es) to use for dichotomous covariates. I have one categorical IV with two levels, two continuous DVs, and then the two dichotomous confounding variables. It was hard to find information for categorical covariates online. Thanks in advance, Jim!


May 8, 2020 at 10:04 am

Thank you for your nice blog. I still have a question. Let's say I want to determine the effect of one independent variable on a dependent variable with a linear regression analysis. I have selected a number of potential confounding variables for this relationship based on literature, such as age, gender, health status, and education level. How can I check (with statistical analyses) whether these are indeed confounders? I would like to know which of them I should control for in my linear regression analysis. Can I create a correlation matrix beforehand to see whether the potential confounder is correlated with both my independent and dependent variables? And what threshold for the correlation coefficient should be used here? Is it every correlation coefficient except zero (for instance, 0.004)? Are there scientific articles/books that endorse this threshold? Or is it maybe better to use a "change-in-estimate" criterion to see whether my regression coefficient changes by a particular amount after adding my potential confounder to the linear regression model? What would be the threshold here?

I hope my question is clear. Thanks in advance!


April 29, 2020 at 2:47 am

thanks for a wonderful website! I love your example with the bone density which does not appear to be correlated to physical activity if looked at alone, and needs to have the weight added as explanatory variable to make both of them appear as significantly correlated with bone density. I would love to use this example in my class, as I think it is very important to understand that there are situations where a single-parameter model can lead you badly astray (here into thinking activity is not correlated with bone density). Of course, I could make up some numbers for my students, but it would be even nicer if I could give them your real data. Could you by any chance make a file of real measurements of bone densities, physical activity and weight available? I would be very grateful, and I suppose a lot of other teachers/students too!

best regards Martin

April 30, 2020 at 5:06 pm

When I wrote this post, I wanted to share the data. Unfortunately, it seems like I no longer have it. If I uncover it, I’ll add it to the post.


February 8, 2020 at 1:45 pm

The work you have done is amazing, and I've learned so much through this website. I am at a beginner level in SPSS, and I would be grateful if you could answer my question. I have found that a medical treatment results in worse quality of life. But I know from crosstabs that people who are taking this treatment present more severe disease (a continuous variable), which also correlates with quality of life. How can I test whether it is the treatment or the severity that worsens quality of life?

February 8, 2020 at 3:16 pm

Hi Evangelia,

Thanks so much for your kind words, I really appreciate them! And, I’m glad my website has been helpful!

That’s a great question and a valid concern to have. Fortunately, in a regression model, the solution is very simple. Just include both the treatment and severity of the disease in the model as independent variables. Doing that allows the model to hold disease severity constant (i.e., controls for it) while it estimates the effect of the treatment.

Conversely, if you did not include severity of the disease in the model, and it correlates with both the treatment and quality of life, it is uncontrolled and will be a confounding variable. In other words, if you don’t include severity of disease, the estimate for the relationship between treatment and quality of life will be biased.

We can use the table in this post for estimating the direction of bias. Based on what you wrote, I’ll assume that the treatment condition and severity have a positive correlation. Those taking the treatment present a more severe disease. And, that the treatment condition has a negative correlation with quality of life. Those on the treatment have a lower quality of life for the reasons you indicated. That puts us in the top-right quadrant of the table, which indicates that if you do not include severity of disease as an IV, the treatment effect will be underestimated.

Again, simply by including disease severity in your model will reduce the bias!
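To make the direction of that bias concrete, here is a small simulated sketch (entirely made-up numbers, not data from this comment thread) comparing the treatment estimate with and without the severity variable in the model:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 2000

# Made-up scenario: sicker patients are more likely to get the treatment,
# and severity itself lowers quality of life.
severity = rng.normal(5, 2, n)
prob_treated = 1 / (1 + np.exp(-(severity - 5)))
treatment = rng.binomial(1, prob_treated)

true_effect = 1.0                              # the treatment actually helps
quality = 50 + true_effect * treatment - 3 * severity + rng.normal(0, 2, n)

def fit_ols(X, y):
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

# Omitting severity: the treatment coefficient absorbs severity's effect
X_omit = np.column_stack([np.ones(n), treatment])
print("treatment estimate, severity omitted: ", fit_ols(X_omit, quality)[1].round(2))

# Including severity: the estimate is close to the true effect
X_full = np.column_stack([np.ones(n), treatment, severity])
print("treatment estimate, severity included:", fit_ols(X_full, quality)[1].round(2))
```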


December 7, 2019 at 7:32 pm

Just a question about what you said about power. Will adding more independent variables to a regression model cause a loss of power (at a fixed sample size)? Or does it depend on the type of independent variable added: confounder vs. non-confounder?


November 1, 2019 at 8:54 pm

You mention, "Suppose you have a regression model with two significant independent variables, X1 and X2. These independent variables correlate with each other and the dependent variable." How is it possible for two random variables (in this case the two factors) to correlate with each other if they are independent? If two random variables are independent, then the covariance is zero and therefore the correlation is zero.

Corr(X1, X2) = Cov(X1, X2) / (sqrt(Var(X1)) * sqrt(Var(X2)))
Cov(X1, X2) = E[X1*X2] - E[X1]*E[X2]
If X1 and X2 are independent, then E[X1*X2] = E[X1]*E[X2], and therefore the covariance is zero.

November 4, 2019 at 9:07 am

Ah, there’s a bit of confusion here. The explanatory variables in a regression model are often referred to as independent variables, as well as predictors, x-variables, inputs, etc. I was using “independent variable” as the name. You’re correct, if they were independent in the sense that you describe them, there would be no correlation. Ideally, there would be no correlation between them in a regression model. However, they can, in fact, be correlated. If that correlation is too strong, it will cause problems with the model.

“Independent variable” in the regression context refers to the predictors and describes their ideal state. In practice, they’ll often have some degree of correlation.

I hope this helps!


April 8, 2019 at 12:33 pm

Ah! Enlightenment!

I had taken your statement about the correlation of the independent variable with the residuals to be a statement about the computed value of the correlation between them, that is, that cor(X1, resid) was nonzero. I believe that (in a model with a constant term) this is impossible.

But I think I get now that you were using the term more loosely, referring to a (nonlinear) pattern appearing between the values of X1 and the corresponding residuals, in the same way as you would see a parabolic pattern in a scatterplot of residuals versus X if you tried to make a linear fit of quadratic data. The linear correlation between X and the residuals would still compute out, numerically, to zero, so X1 and the residuals would technically be uncorrelated, but they would not be statistically independent. If the residuals show a nonlinear pattern when plotted against X, look for a lurker.

The Albany example was very helpful. Thanks so much for digging it up!

April 8, 2019 at 8:38 am

Hi, Jim! Thanks very much for you speedy reply!

I appreciate the clarity that you aim for in your writing, and I’m sorry if I wasn’t clear in my post. Let me try again, being a bit more precise, hopefully without getting too technical.

My problem is that I think that the very process used in finding the OLS coefficients (minimizing the sum of squared residuals) results in a regression equation that satisfies two properties. First, that the sum (or mean) of the resulting residuals is zero. Second, that for any regressor Xi, Xi is orthogonal to the vector of residuals, which in turn leads to the covariance of the residuals with any regressor having to be zero. Certainly, the true error terms need not sum to zero, nor need they be uncorrelated with a regressor…but if I understand correctly, these properties of the _residuals_ are an automatic consequence of fitting OLS to a data set, regardless of whether the actual error terms are correlated with the regressor or not.

I’ve found a number of sources that seem to say this–one online example is on page two here: https://www.stat.berkeley.edu/~aditya/resources/LectureSIX.pdf . I’ll be happy to provide others on request.

I’ve also generated a number of my own data sets with correlated regressors X1 and X2 and Y values generated by a X1 + b X2 + (error), where a and b are constants and (error) is a normally distributed error term of fixed variance, independently chosen for each point in the data set. In each case, leaving X2 out of the model still left me with zero correlation between X1 and the residuals, although there was a correlation between X1 and the true error terms, of course.
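A minimal sketch of that kind of simulation (made-up coefficients and correlated regressors) shows both facts at once: the fitted residuals are essentially uncorrelated with X1, while the composite "error" of the misspecified model is not:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 10_000

# Correlated regressors and made-up coefficients
x1 = rng.normal(size=n)
x2 = 0.7 * x1 + rng.normal(scale=0.7, size=n)
eps = rng.normal(size=n)
y = 2.0 * x1 + 3.0 * x2 + eps

# Misspecified model: regress y on x1 only (with an intercept)
X = np.column_stack([np.ones(n), x1])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ beta_hat

# The "error" of the misspecified model lumps the omitted term into it
omitted_error = 3.0 * x2 + eps

print("corr(x1, residuals):     ", np.corrcoef(x1, residuals)[0, 1])      # essentially zero
print("corr(x1, omitted error): ", np.corrcoef(x1, omitted_error)[0, 1])  # clearly nonzero
print("slope estimate vs true 2:", beta_hat[1])                           # biased upward
```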

If I have it wrong, I’d love to see a data set that demonstrates what you’re talking about. If you don’t have time to find one (which I certainly understand), I’d be quite happy with any reference you might point me to that talks about this kind of correlation between residuals and one of the regressors in OLS, in any context.

Thanks again for your help, and for making regression more comprehensible to so many people.

Scott Stevens

April 8, 2019 at 10:59 am

Unfortunately, the analysis doesn't fix all possible problems with the residuals. It is possible to specify models where the residuals exhibit various problems. You mention that residuals will sum to zero. However, if you specify a model without a constant, the residuals won't necessarily sum to zero (read about that here). If you have a time series model, it's possible to have autocorrelation in the residuals if you leave out important variables. If you specify a model that doesn't adequately model curvature in the data, you'll see patterns in the residuals.

In a similar vein, if you leave out an important variable that is correlated both with the DV and another IV in the model, you can have residuals that correlate with an IV. The standard practice is to graph the residuals by the independent variable to look for that relationship because it might have a curved shape which indicates a relationship but not necessarily a linear one that correlation would detect.

As for references, any regression textbook should cover this assumption. Again, it’ll refer to error, but the key is to remember that residuals are the proxy for error.

Here’s a reference from the University of Albany about Omitted Variable Bias that goes into it in more detail from the standpoint of residuals and includes an example of graphing the residuals by the omitted variable.

April 7, 2019 at 11:17 am

Hi, Jim. I very much enjoy how you make regression more accessible, and I like to use your approaches with my own students. I’m confused, though by the matter brought up by SFDude.

I certainly see how the _error_ term in a regression model will be correlated with an independent variable when a confounding variable is omitted, but it seems to me that the normal equations that define the regression coefficients ensure that an independent variable in the model will always be uncorrelated with the _residuals_ of that model, regardless of whether an omitted confounding variable exists or not. Certainly, "X1 correlates with X2, and X2 correlates with the residuals. Ergo, variable X1 correlates with the residuals" would not hold for any three variables X1, X2, and R. For example, if A and B are independent, then "A correlates with A + B, A + B correlates with B. Ergo, A correlates with B" is a false statement.

If I’m missing something here, I’d very much appreciate a data set that demonstrates the kind of correlation between an independent variable and the residuals of the model that it seems you’re talking about.

Thanks! Scott Stevens

April 7, 2019 at 6:28 pm

Thanks for writing. And, I’m glad to hear that you find my website helpful!

The key thing to remember is that while the OLS assumptions refer to the error, we can’t directly observe the true error. So, we use the residuals as estimates of the error. If the error is correlated with an omitted variable, we’d expect the residuals to be correlated as well in approximately the same manner. Omitted variable bias is a real condition, and that description is simply getting deep into the nuts and bolts of how it works. But, it’s the accepted explanation. You can read it in textbooks. While the assumptions refer to error, we can only assess the residuals instead. They’re the best we’ve got!

When you say A and B are “independent”, if you mean they are not correlated, I’d agree that removing a truly uncorrelated variable from the model does not cause this type of bias. I mention that in this post. This bias only occurs when independent variables are correlated with each other to some degree, and with the dependent variable, and you exclude one of the IVs.

I guess I'm not exactly sure which part is causing the difficulty? The regression equations can't ensure that the residuals are uncorrelated if the model is specified in such a way that it causes them to be correlated. It's just like in time series regression models, where you have to be on the lookout for autocorrelation (correlated residuals) because the model doesn't account for time-order effects. Incorrectly specified models can and do cause problems with the residuals, including residuals that are correlated with other variables and themselves.

I’ll have to see if I can find a dataset with this condition.


March 10, 2019 at 10:41 am

Hi Jim, I am involved in a study which involves looking at a number of clinical parameters, like platelet count and haemoglobin, for patients who underwent an emergency change of a mechanical circulatory support device due to thrombosis or clotting of the actual device. The purpose is to look at whether there is a trend in these parameters in the time frame of 3 days before and 3 days after the change, and to establish whether these parameters could be used as predictors of the event. My concern is that there is no control group for this study. But I don't see the need for looking into a trend in a group which never had the event itself. Will not having a control group be considered a weakness of this study? Also, what would be the best statistical test for this? I was thinking of the generalized linear model. I would really appreciate your guidance here. Thank you


February 20, 2019 at 8:49 am

I'm looking at a published paper that develops clinical prediction rules by using logistic regression in order to help primary care doctors decide who to refer to breast clinics for further investigation. The dependent variable is simply whether breast cancer is found to be present or not. The independent variables include 11 symptoms and age in (mostly) ten-year increments (six separate age bands). The age bands were decided before the logistic regression was carried out. The paper goes on to use the data to create a scoring system based on symptoms and age. If this scoring system were to be used, then above a certain score a woman would be referred, and below a certain score a woman would not be referred.

The total sample size is 6590 women referred to a breast clinic, of which 320 were found to have breast cancer. The sample itself is very skewed. In younger women, breast cancer is rare, and so in some categories the numbers are very low. For instance, in the 18-29 age band there are 62 women referred, of whom 8 have breast cancer, and in the 30-39 age band there are 755 women referred, of whom only one has breast cancer. So my first question is: if there are fewer individuals in particular categories than symptoms, can the paper still use logistic regression to predict who to refer to a breast clinic based on a scoring system that includes both age and symptoms? My second question is: if there are meant to be at least 10 individuals per variable in logistic regression, are the numbers of women with breast cancer in these age groups too small for logistic regression to apply?

When I look at the total number of women in the sample (6590) and then the total number of symptoms (8616) there is a discrepancy. This means that some women have had more than one symptom recorded. (Or from the symptoms’ point of view, some women have been recorded more than once). So my third question is: does this mean that some of the independent variables are not actually independent of each other? (There is around a 30%-32% discrepancy in all categories. How significant is this?)

There are lots of other problems with the paper (the fact the authors only look at referred women rather than all the symptomatic women that a primary care doctor sees is a case in point) but I’d like to know whether the statistics are flawed too. If there are any other questions I need to ask about the data please do let me know.

With very best wishes,

Ms Susan Mitchell

February 20, 2019 at 11:23 pm

Offhand, I don't see anything that screams to me that there is a definite problem. I'd have to read the study to be really sure. Here are some thoughts.

I'm not in the medical field, but I've heard talks by people in that field, and it sounds like this is a fairly common use for binary logistic regression. The analyst creates a model where you indicate which characteristics, risk factors, etc. apply to an individual. Then, the model predicts the probability of an outcome for them. I've seen similar models for surgical success, death, etc. The idea is that it's fairly easy to use because someone can just enter the characteristics of the patient and the model spits out a probability. For any model of this type, you'd really have to check the residuals and see all the output to determine how well the model fits the data. But, there's nothing inherently wrong with this approach.

I don’t see a problem with the sample size (6590) and the number of IVs (12). That’s actually a very good ratio of observations per IV.

It's ok that there are fewer individuals in some categories. It's better if you have a fairly equal number, but it's not a show stopper. Categories with fewer observations will have less precise estimates. It can potentially reduce the precision of the model. You'd have to see how well the model fits the data to really know how well it works out. But, yes, if you have an extremely low number of individuals who have a particular symptom, you won't get as precise an estimate for that symptom's effect. You might see a wider CI for its odds ratio. But, it's hard to say without seeing all of that output and how the numbers break down by symptom. And, it's possible that they selected characteristics that apply to a sufficient number of women. Again, I wouldn't be able to say. It's an issue to consider for sure.

As for the number of symptoms versus the number of women, it's ok that a woman can have more than one symptom. Each symptom is in its own column and will be coded with a 1 or 0. A row corresponds to one woman, and she'll have a 1 for each characteristic that she has and 0s for the ones that she does not have. It's possible these symptoms are correlated. These are categorical variables, so you couldn't use Pearson's correlation. You'd need to use something like the chi-square test of independence. And, some correlation is okay. Only very high correlation would be problematic. Again, I can't say whether that's a problem in this study or not because it depends on the degree of correlation. It might be, but it's not necessarily a problem. You'd hope that the study strategically included a good set of IVs that aren't overly correlated.
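As a rough illustration of that kind of check (hypothetical counts, assuming the Python scipy package is available), a chi-square test of independence for two binary symptoms could look like this:

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical 2x2 table: rows = symptom A absent/present, columns = symptom B absent/present
table = np.array([[4200, 950],
                  [ 820, 620]])

chi2, p, dof, expected = chi2_contingency(table)
print(f"chi-square = {chi2:.1f}, p = {p:.3g}")
# A small p-value suggests the two symptoms are associated rather than independent.
```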

Regarding the referred women vs. symptomatic women, that comes down to the population that is being sampled and how generalizable the results are. Not being familiar with the field, I don't have a good sense of how that affects generalizability, but yes, that would be a concern to consider.

So, I don’t see anything that shouts to me that it’s a definite problem. But, as with any regression model, it would come down to the usual assessments of how well the model fits the data. You mention issues that could be concerns, but again, it depends on the specifics.

Sorry I couldn’t provide more detailed thoughts but evaluating these things requires real specific information. But, the general approach for this study seems sound to me.


February 17, 2019 at 3:48 pm

I have a question: how well can we evaluate whether a regression equation "fits" the data by examining the R-squared statistic, and test the statistical significance of the whole regression equation using the F-test?

February 18, 2019 at 4:56 pm

I have two blog posts that will be perfect for you!

  • Interpreting R-squared
  • Interpreting the F-test of Overall Significance

If you have questions about either one, please post it in the comments section of the corresponding post. But, I think those posts will go a long way in answering your questions!


January 18, 2019 at 7:00 pm

Mr. Frost, I know I need to run a regression model; however, I'm still unsure which one. I'm examining the effects of alcohol use on teenagers with 4 confounders.

January 19, 2019 at 6:47 pm

Hi Dahlia, to make the decision, I'd need to know what types of variables they all are (continuous, categorical, binary, etc.). However, if the outcome you're measuring (the effect of alcohol use) is a continuous variable, then OLS linear regression is a great place to start!

Best of luck with your analysis!


January 5, 2019 at 2:39 am

Thank you very much Jim,

Very helpful, I think my problem is really the number of observations (25 obs). Yes, I have read that post also, and I always keep the theory in mind when analyzing the IVs.

My main objective is to show the existing relationship between X2 and Y, which is also supported by literature, however, if I do not control for X1 I will never be sure that the effect I have found is due to X2 or X1, because X1 and X2 are correlated.

I think correlation alone would be ok, since my number of observations is limited, and using regression also limits the number of IVs I can include in the model, which may force me to leave some other IVs out of the model, which is also bad.

Thank you again

Best regards!

January 4, 2019 at 9:40 am

Thank you for this very good post.

However, I have a question. What should I do if the IVs X1 and X2 are correlated (say at 0.75) and both are correlated with Y (the DV) at 0.60? When I include X1 and X2 in the same model, X2 is not statistically significant, but when they are put in separately, each becomes statistically significant. On the other hand, the model with only X1 has higher explanatory power than the model with only X2.

Note: In the individual models, both meet the OLS assumptions, but together, X2 becomes not statistically significant (using stepwise regression, X2 is removed from the model). What does this mean? In addition, I know from the literature that X2 affects Y, but I am testing X1, and X1 is showing a better fit than X2.

Thank you in advance, I hope you understand my question!

January 4, 2019 at 3:15 pm

Yes, I understand completely! This situation isn't too unusual. The underlying problem is that because the two IVs are correlated, they're supplying a similar type of predictive information. There isn't enough unique predictive information for both of them to be statistically significant. If you had a larger sample size, it's possible that both would be significant. Also, keep in mind that correlation is a pairwise measure and doesn't account for other variables. When you include both IVs in the model, the relationship between each IV and the DV is determined after accounting for the other variables in the model. That's why you can see a pairwise correlation but not a relationship in a regression model.

I know you’ve read a number of my posts, but I’m not sure if you’ve read the one about model specification. In that post, a key point I make is not to use statistical measures alone to determine which IVs to leave in the model. If theory suggests that X2 should be included, you have a very strong case for including it even if it’s not significant when X1 is in the model–just be sure to include that discussion in your write-up.

Conversely, just because X1 seems to provide a better fit statistically and is significant with or without X2 doesn't mean you must include it in the model. Those are strong signs that you should consider including a variable in the model. However, as always, use theory as a guide and document the rationale for the decisions you make.

For your case, you might consider including both IVs in the model. If they're both supplying similar information and X2 is justified by theory, chances are that X1 is as well. Again, document your rationale. If you include both, check the VIFs to be sure that you don't have problematic levels of multicollinearity when both IVs are in the model. If those are the only two IVs in your model, that won't be problematic given the correlations you describe. But, it could be problematic if you have more IVs in the model that are also correlated with X1 and X2.

Another thing to look at is whether the coefficients for X1 and X2 vary greatly depending on whether you have one or both of the IVs in the model. If they don't change much, that's nice and simple. However, if they do change quite a bit, then you need to determine which coefficient values are likely to be closer to the correct values, because that corresponds to the choice about which IVs to include! I'm sounding like a broken record, but if this is a factor, document your rationale and decisions.

I hope that helps! Best of luck with your analysis!


November 28, 2018 at 11:30 pm

Another great post! Thank you for truly making statistics intuitive. I learned a lot of this material back in school, but am only now understanding them more conceptually thanks to you. Super useful for my work in analytics. Please keep it up!

November 29, 2018 at 8:54 am

Thanks, Patrick! It’s great to hear that it was helpful!


November 12, 2018 at 12:54 pm

I think there may be a typo here – “These are important variables that the statistical model does include and, therefore, cannot control.” Shouldn’t it be “does not include”, if I understand correctly?

November 12, 2018 at 1:19 pm

Thanks, Jayant! Good eagle eyes! That is indeed a typo. I will fix it. Thanks for pointing it out!


November 3, 2018 at 12:07 pm

Mr. Jim, thank you for making me understand econometrics. I thought that an omitted variable is excluded from the model and that's why it causes under/overestimation of the coefficients. Somewhere in this article you mentioned that they are still included in the model but not controlled for. I find that very confusing; would you be able to clarify? Thanks a lot.

November 3, 2018 at 2:26 pm

You’re definitely correct. Omitted variable bias occurs when you exclude a variable from the model. If I gave the impression that it’s included, please let me know where in the text because I want to clarify that! Thanks!

By excluding the variable, the model does not control for it, which biases the results. When you include a previously excluded variable, the model can now control for it and the bias goes away. Maybe I wrote that in a confusing way?

Thanks! I always strive to make my posts as clear as possible, so I’ll think about how to explain this better.

September 28, 2018 at 4:31 pm

In addition to mean square error and adjusted R-squared, I use Cp, IC, HQC, and SBIC to decide the number of independent variables in multiple regression.

September 28, 2018 at 4:39 pm

I think there are a variety of good measures. I'd also add predicted R-squared, as long as you use them in conjunction with subject-area expertise. As I mention in this post, the entire set of estimated relationships must make theoretical sense. If they don't, the statistical measures are not important.

September 28, 2018 at 4:13 pm

I have to read the article you named. Having said that, caution should be used when regression models describe systems or processes that are not in statistical control. Also, some processes have physical bounds that a regression model does not capture, so calculated predicted values may have no physical meaning. Further, models built from narrow ranges of independent variables may not be applicable outside the ranges of those variables.

September 28, 2018 at 4:19 pm

Hi Stan, those are all great points, and true. They all illustrate how you need to use your subject-area knowledge in conjunction with statistical analyses.

I talk about the issue of not going outside the range of the data, amongst other issues, in my post about Using Regression to Make Predictions .

I also agree about statistical control, which I think is underappreciated outside of the quality improvement arena. I've written about this in a post about using control charts with hypothesis tests.

September 28, 2018 at 2:30 pm

Valid confidence/prediction intervals are important if the regression model represents a process that is being characterized. When the prediction intervals are wide or too wide, the model’s validity and utility are in question.

September 28, 2018 at 2:49 pm

You're definitely correct! If the model doesn't fit the data, your predictions are worthless. There's one minor caveat that I'd add to your comment.

The prediction intervals can be too wide to be useful yet the model might still be valid. It’s really two separate assessments. Valid model and degree of precision. I write about this in several posts including the following: Understanding Precision in Prediction

September 26, 2018 at 9:13 am

Jim, does centering any independent explanatory variable require centering them all? Should the dependent and explanatory variables be centered? I always make a normal probability plot of the deleted residuals as one test of the prediction capability of the fitted model. It is remarkable how good models give good normal probability plots. I also use the Shapiro-Wilk test to assess the deleted residuals for normality. Stan Alekman

September 26, 2018 at 9:46 am

Yes, you should center all of the continuous independent variables if your goal is to reduce multicollinearity and/or to be able to interpret the intercept. I’ve never seen a reason to center the dependent variable.
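A quick sketch (made-up data) of why centering helps with structural multicollinearity, for example when a model includes an interaction term built from the continuous predictors:

```python
import numpy as np

rng = np.random.default_rng(0)
x1 = rng.normal(50, 10, 500)
x2 = rng.normal(100, 20, 500)

# Structural multicollinearity: an interaction term built from uncentered
# predictors is substantially correlated with the predictors themselves.
print(np.corrcoef(x1, x1 * x2)[0, 1])        # noticeably positive

# Centering the continuous predictors before forming the interaction removes
# most of that correlation, and the intercept then refers to the predicted
# response at the average x1 and x2 rather than at zero.
x1c, x2c = x1 - x1.mean(), x2 - x2.mean()
print(np.corrcoef(x1c, x1c * x2c)[0, 1])     # close to zero
```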

It’s funny that you mention that about normally distributed residuals! I, too, have been impressed with how frequently that occurs even with fairly simple models. I’ve recently written a post about OLS assumptions and I mention how normal residuals are sort of optional. They only need to be normally distributed if you want to perform hypothesis tests and have valid confidence/prediction intervals. Most analysts want at least the hypothesis tests!


September 25, 2018 at 2:32 am

Hey Jim, your blogs are really helpful for me in learning data science. Here is a question from my assignment:

You have built a classification model with 90% accuracy, but your client is not happy because the false positive rate was very high. What will you do? Can we do something about it via precision or recall?

This is the whole question; nothing is given in the background, though they should have provided more!


September 25, 2018 at 1:20 am

Thank you Jim Really interesting

September 25, 2018 at 1:26 am

Hi Brahim, you’re very welcome! I’m glad it was interesting!


September 24, 2018 at 10:30 pm

Hey Jim, you are awesome.

September 24, 2018 at 11:04 pm

Aw, MG, thanks so much!! 🙂


September 24, 2018 at 10:59 am

Thanks for another great article, Jim!

Q: Could you expand with a specific plot example to explain more clearly, this statement: “We know that for omitted variable bias to exist, an independent variable must correlate with the residuals. Consequently, we can plot the residuals by the variables in our model. If we see a relationship in the plot, rather than random scatter, it both tells us that there is a problem and points us towards the solution. We know which independent variable correlates with the confounding variable.”

Thanks! SFdude

September 24, 2018 at 11:48 am

Hi, thanks!

I'll try to find a good example plot to include soon. Basically, you're looking for any non-random pattern. For example, the residuals might tend to either increase or decrease as the value of the independent variable increases. That relationship can follow a straight line or display curvature, depending on the nature of the relationship.
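Here is a rough sketch of such a plot (made-up data in which the omitted variable is a nonlinear function of x1, so the pattern shows up as curvature rather than a linear correlation):

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
n = 300
x1 = rng.normal(size=n)

# Omitted confounder that is a nonlinear function of x1
omitted = x1**2 + rng.normal(scale=0.5, size=n)
y = 2 * x1 + 3 * omitted + rng.normal(size=n)

# Fit y on x1 only (the confounder is left out of the model)
X = np.column_stack([np.ones(n), x1])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ beta

# Residuals plotted against x1 show clear curvature instead of random scatter,
# which points to a problem with the specified model.
plt.scatter(x1, residuals, s=10)
plt.axhline(0, color="gray")
plt.xlabel("x1")
plt.ylabel("Residual")
plt.show()
```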


September 24, 2018 at 1:37 am

It's been a long time since I heard from you, Jim. Missed your stats!

September 24, 2018 at 9:53 am

Hi Saketh, thanks, you’re too kind! I try to post here every two weeks at least. Occasionally, weekly!



Statistics LibreTexts

1.3: Threats to Internal Validity and Different Control Techniques


Yang Lydia Yang, Kansas State University


Internal validity is often the focus from a research design perspective. To understand the pros and cons of various designs and to be able to better judge specific designs, we identify specific threats to internal validity. Before we do so, it is important to note that the primary challenge to establishing internal validity in social sciences is the fact that most of the phenomena we care about have multiple causes and are often a result of some complex set of interactions. For example, X may be only a partial cause of Y, or X may cause Y, but only when Z is present. Multiple causation and interactive effects make it very difficult to demonstrate causality. Turning now to more specific threats, Figure 1.3.1 below identifies common threats to internal validity.

Figure 1.3.1: Common Threats to Internal Validity

  • History: Any event that occurs while the experiment is in progress might provide an alternative explanation; using a control group mitigates this concern.
  • Maturation: Normal changes over time (e.g., fatigue or aging) might affect the dependent variable; using a control group mitigates this concern.
  • Selection Bias: If randomization is not used to assign participants, the groups may not be equivalent.
  • Experimental Mortality: If groups lose participants (e.g., due to dropping out of the experiment), they may not be equivalent.
  • Testing: A pre-test may confound the influence of the experimental treatment; using a control group mitigates this concern.
  • Instrumentation: Changes or differences in the measurement process might alternatively account for differences.
  • Statistical Regression: The natural tendency for extreme scores to regress, or move toward, the mean.

Different Control Techniques

All of the common threats mentioned above can introduce extraneous variables into your research design, which will potentially confound your research findings. In other words, we won't be able to tell whether it is the independent variable (i.e., the treatment we give participants) or the extraneous variable that causes the changes in the dependent variable. Controlling for extraneous variables reduces their threat to the research design and gives us a better chance to claim that the independent variable causes the changes in the dependent variable, i.e., internal validity. There are different techniques we can use to control for extraneous variables.

Random assignment

Random assignment is the single most powerful control technique we can use to minimize the potential threats of confounding variables in research design. As we have seen in Dunn and her colleagues' study earlier, participants are not allowed to self-select into either condition (spend $20 on self or spend on others). Instead, they are randomly assigned to either group by the researcher(s). By doing so, the two groups are likely to be similar on all factors except the independent variable itself. One confounding variable mentioned earlier is whether individuals had a happy childhood to begin with. Using random assignment, those who had a happy childhood will likely end up in each condition group. Similarly, those who didn't have a happy childhood will likely end up in each condition group too. As a consequence, we can expect the two condition groups to be very similar on this confounding variable. Applying the same logic, we can use random assignment to minimize all potential confounding variables (assuming your sample size is large enough!). With that, the only difference between the two groups is the condition participants are assigned to, which is the independent variable, and we can confidently infer that the independent variable actually causes the differences in the dependent variable.

It is critical to emphasize that random assignment is the only control technique that controls for both known and unknown confounding variables. With all other control techniques mentioned below, we must first know what the confounding variable is before controlling for it. Random assignment does not require that. With the simple act of randomly assigning participants to different conditions, we take care of both the confounding variables we know of and the ones we don't even know about that could threaten the internal validity of our studies. As the saying goes, "what you don't know will hurt you." Random assignment takes care of it.
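A small simulation sketch (hypothetical "childhood happiness" scores) shows how random assignment balances a confounder between conditions on average:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 1000

# Hypothetical confounder measured before the study begins
childhood_happiness = rng.normal(50, 10, n)

# Random assignment: every participant has the same chance of either condition
condition = rng.permutation(np.repeat(["self", "other"], n // 2))

print("mean happiness, self-spending: ",
      childhood_happiness[condition == "self"].mean().round(2))
print("mean happiness, other-spending:",
      childhood_happiness[condition == "other"].mean().round(2))
# The two group means are nearly equal, so the confounder is balanced on average.
```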

Matching

Matching is another technique we can use to control for extraneous variables. We must first identify the extraneous variable that can potentially confound the research design. Then we rank-order the participants on this extraneous variable, that is, list the participants in ascending or descending order. Participants who are similar on the extraneous variable will be placed into different treatment groups. In other words, they are "matched" on the extraneous variable. Then we can carry out the intervention/treatment as usual. If different treatment groups do show differences on the dependent variable, we would know it is not due to the extraneous variable, because participants are "matched," or equivalent, on the extraneous variable. Rather, it is more likely the independent variable (i.e., the treatments) that causes the changes in the dependent variable.

Consider the example above (self-spending vs. other-spending on happiness) with the same extraneous variable of whether individuals had a happy childhood to begin with. Once we identify this extraneous variable, we first need to collect some kind of data from the participants to measure how happy their childhood was. Or sometimes, data on the extraneous variables we plan to use may already be available (for example, you want to examine the effect of different types of tutoring on students' performance in a Calculus I course, and you plan to match them on this extraneous variable: college entrance test scores, which are already collected by the Admissions Office). In either case, getting the data on the identified extraneous variable is a typical step we need to take before matching.

So, going back to whether individuals had a happy childhood to begin with: once we have the data, we'd sort it in a certain order, for example, from the highest score (participants reporting the happiest childhood) to the lowest score (participants reporting the least happy childhood). We would then match participants with the highest levels of childhood happiness and place them into different treatment groups. Then we go down the scale and match participants with relatively high levels of childhood happiness and place them into different treatment groups. We repeat this down the order until we match participants with the lowest levels of childhood happiness and place them into different treatment groups. By now, each treatment group will have participants with a full range of levels of childhood happiness (which is a strength, considering the variation and the representativeness of the sample). The two treatment groups will be similar or equivalent on this extraneous variable. If the treatments, self-spending vs. other-spending, eventually show differences in individual happiness, then we know it's not due to how happy their childhood was. We will be more confident it is due to the independent variable.

You may be thinking: but wait, we have only taken care of one extraneous variable. What about other extraneous variables? Good thinking. That's exactly correct. We mentioned a few extraneous variables but have only matched on one. This is the main limitation of matching. You can match participants on more than one extraneous variable, but it's cumbersome, if not impossible, to match them on 10 or 20 extraneous variables. More importantly, the more variables we try to match participants on, the less likely we are to find a similar match. In other words, it may be easy to find/match participants on one particular extraneous variable (similar level of childhood happiness), but it's much harder to find/match participants to be similar on 10 different extraneous variables at once.
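A minimal sketch of the rank-order matching idea described above (hypothetical scores; in practice, which member of each matched pair goes to which condition is usually decided at random):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 20
happiness = rng.normal(50, 10, n)            # hypothetical extraneous variable

# Rank-order the participants on the extraneous variable (highest to lowest),
# then split each adjacent pair between the two treatment groups.
order = np.argsort(happiness)[::-1]
group_a, group_b = order[0::2], order[1::2]

print("mean happiness, group A:", happiness[group_a].mean().round(1))
print("mean happiness, group B:", happiness[group_b].mean().round(1))
# The matched groups end up nearly equivalent on the extraneous variable.
```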

Holding Extraneous Variable Constant

The holding-the-extraneous-variable-constant technique is self-explanatory: we use participants at only one level of the extraneous variable, in other words, holding the extraneous variable constant. Using the same example above, suppose we only want to study participants with a low level of childhood happiness. We need to go through the same steps as in matching: identifying the extraneous variable that can potentially confound the research design and getting the data on that variable. Once we have the data on childhood happiness scores, we only include participants on the lower end of childhood happiness scores, then place them into different treatment groups and carry out the study as before. If the condition groups, self-spending vs. other-spending, eventually show differences in individual happiness, then we know it's not due to how happy their childhood was (since we already picked only those on the lower end of childhood happiness). We will be more confident it is due to the independent variable.

Similar to matching, we have to do this one extraneous variable at a time. The more extraneous variables we hold constant, the more difficult it gets. The other limitation is that by holding the extraneous variable constant, we are excluding a big chunk of participants, in this case, anyone who is NOT low on childhood happiness. This is a major weakness: by reducing the variability across the spectrum of childhood happiness levels, we decrease the representativeness of the sample, and generalizability suffers.

Building Extraneous Variables into Design

The last control technique, building extraneous variables into the research design, is widely used. As the name suggests, we identify the extraneous variable that can potentially confound the research design and include it in the design by treating it as an additional independent variable. This control technique takes care of the limitation of the previous technique, holding the extraneous variable constant: we don't need to exclude participants based on where they stand on the extraneous variable(s). Instead, we can include participants with a wide range of levels on the extraneous variable(s). You can include multiple extraneous variables in the design at once. However, the more variables you include in the design, the larger the sample size required for the statistical analyses, which may be difficult to obtain due to limitations of time, staff, cost, access, etc.


1.4.2 - Causal Conclusions

In order to control for confounding variables, participants can be randomly assigned to different levels of the explanatory variable. This act of randomly assigning cases to different levels of the explanatory variable is known as randomization. An experiment that involves randomization may be referred to as a randomized experiment or randomized comparative experiment. By randomly assigning cases to different conditions, a causal conclusion can be made; in other words, we can say that differences in the response variable are caused by differences in the explanatory variable. Without randomization, an association can be noted, but a causal conclusion cannot be made.

Note that randomization and random sampling are different concepts. Randomization refers to the random assignment of experimental units to different conditions (e.g., different treatment groups). Random sampling refers to probability-based methods for selecting a sample from a population.

Example: Fitness Programs

Two teams have designed research studies to compare the weight loss of participants in two different fitness programs. Each team used a different research study design.

The first team surveyed people who already participate in each program. This is an observational study, which means there is no randomization. Each group is comprised of participants who made the personal decision to engage in that fitness program. With this research study design, the researchers can only determine whether or not there is an association between the fitness program and participants' weight loss. A causal conclusion cannot be made because there may be confounding variables. The people in the two groups may be different in some key ways. For example, if the cost of the two programs is different, the two groups may differ in terms of their finances.

The second team of researchers obtained a sample of participants and randomly assigned half to participate in the first fitness program and half to participate in the second fitness program. They measured each participant's weight twice: at the beginning and at the end of the study. This is a randomized experiment because the researchers randomly assigned each participant to one of the two programs. Because participants were randomly assigned to groups, the groups should be balanced in terms of any confounding variables, and a causal conclusion may be drawn from this study.

Statistical Thinking: A Simulation Approach to Modeling Uncertainty (UM STAT 216 edition)

3.6 Causation and Random Assignment

Medical researchers may be interested in showing that a drug helps improve people’s health (the cause of improvement is the drug), while educational researchers may be interested in showing a curricular innovation improves students’ learning (the curricular innovation causes improved learning).

To attribute a causal relationship, there are three criteria a researcher needs to establish:

  • Association of the Cause and Effect: There needs to be an association between the cause and effect.
  • Timing: The cause needs to happen BEFORE the effect.
  • No Plausible Alternative Explanations: ALL other possible explanations for the effect need to be ruled out.

Please read more about each of these criteria at the Web Center for Social Research Methods .

The third criterion can be quite difficult to meet. To rule out ALL other possible explanations for the effect, we want to compare the world with the cause applied to the world without the cause. In practice, we do this by comparing two different groups: a “treatment” group that gets the cause applied to them, and a “control” group that does not. To rule out alternative explanations, the groups need to be “identical” with respect to every possible characteristic (aside from the treatment) that could explain differences. This way the only characteristic that will be different is that the treatment group gets the treatment and the control group doesn’t. If there are differences in the outcome, then it must be attributable to the treatment, because the other possible explanations are ruled out.

So, the key is to make the control and treatment groups "identical" when you are forming them. One thing that makes this task (slightly) easier is that they don't have to be exactly identical, only probabilistically equivalent. This means, for example, that if you were matching groups on age, you don't need the two groups to have identical age distributions; they would only need to have roughly the same AVERAGE age. Here, roughly means "the average ages should be the same within what we expect because of sampling error."

Now we just need to create the groups so that they have, on average, the same characteristics … for EVERY POSSIBLE CHARACTERISTIC that could explain differences in the outcome.

It turns out that creating probabilistically equivalent groups is a really difficult problem. One method that works pretty well for doing this is to randomly assign participants to the groups. This works best when you have large sample sizes, but even with small sample sizes random assignment has the advantage of at least removing the systematic bias between the two groups (any differences are due to chance and will probably even out between the groups). As Wikipedia’s page on random assignment points out,

Random assignment of participants helps to ensure that any differences between and within the groups are not systematic at the outset of the experiment. Thus, any differences between groups recorded at the end of the experiment can be more confidently attributed to the experimental procedures or treatment. … Random assignment does not guarantee that the groups are matched or equivalent. The groups may still differ on some preexisting attribute due to chance. The use of random assignment cannot eliminate this possibility, but it greatly reduces it.

We use the term internal validity to describe the degree to which cause-and-effect inferences are accurate and meaningful. Causal attribution is the goal for many researchers. Thus, by using random assignment we have a pretty high degree of evidence for internal validity; we have a much higher belief in causal inferences. Much like evidence used in a court of law, it is useful to think about validity evidence on a continuum. For example, a visualization of the internal validity evidence for a study that employed random assignment in the design might be:

[Figure: a continuum of internal validity evidence, with a randomized study falling in the upper third]

The degree of internal validity evidence is high (in the upper-third). How high depends on other factors such as sample size.

To learn more about random assignment, you can read the following:

  • The research report, Random Assignment Evaluation Studies: A Guide for Out-of-School Time Program Practitioners

3.6.1 Example: Does sleep deprivation cause a decrease in performance?

Let’s consider the criteria with respect to the sleep deprivation study we explored in class.

3.6.1.1 Association of cause and effect

First, we ask: Is there an association between the cause and the effect? In the sleep deprivation study, we would ask, "Is sleep deprivation associated with a decrease in performance?"

This is what a hypothesis test helps us answer! If the result is statistically significant, then we have an association between the cause and the effect. If the result is not statistically significant, then there is not sufficient evidence for an association between cause and effect.

In the case of the sleep deprivation experiment, the result was statistically significant, so we can say that sleep deprivation is associated with a decrease in performance.
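As a sketch of that first step (made-up performance scores, not the study's actual data, assuming the Python scipy package is available), a two-sample t-test checks for an association between group and performance:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(11)
deprived = rng.normal(12, 4, 30)   # hypothetical performance scores, sleep-deprived group
control = rng.normal(16, 4, 30)    # hypothetical performance scores, control group

t, p = stats.ttest_ind(deprived, control)
print(f"t = {t:.2f}, p = {p:.4f}")
# A statistically significant result indicates an association between the
# cause (sleep deprivation) and the effect (performance).
```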

3.6.1.2 Timing

Second, we ask, Did the cause come before the effect? In the sleep deprivation study, the answer is yes. The participants were sleep deprived before their performance was tested. It may seem like this is a silly question to ask, but as the link above describes, it is not always so clear to establish the timing. Thus, it is important to consider this question any time we are interested in establishing causality.

3.6.1.3 No plausible alternative explanations

Finally, we ask Are there any plausible alternative explanations for the observed effect? In the sleep deprivation study, we would ask, “Are there plausible alternative explanations for the observed difference between the groups, other than sleep deprivation?” Because this is a question about plausibility, human judgment comes into play. Researchers must make an argument about why there are no plausible alternatives. As described above, a strong study design can help to strengthen the argument.

At first, it may seem like there are a lot of plausible alternative explanations for the difference in performance. There are a lot of things that might affect someone’s performance on a visual task! Sleep deprivation is just one of them! For example, artists may be more adept at visual discrimination than other people. This is an example of a potential confounding variable. A confounding variable is a variable that might affect the results, other than the causal variable that we are interested in.

Here’s the thing though. We are not interested in figuring out why any particular person got the score that they did. Instead, we are interested in determining why one group was different from another group. In the sleep deprivation study, the participants were randomly assigned. This means that there is no systematic difference between the groups with respect to any confounding variables. Yes—artistic experience is a possible confounding variable, and it may be the reason why two people score differently. BUT: There is no systematic difference between the groups with respect to artistic experience, and so artistic experience is not a plausible explanation as to why the groups would be different. The same can be said for any possible confounding variable. Because the groups were randomly assigned, it is not plausible to say that the groups are different with respect to any confounding variable. Random assignment helps us rule out plausible alternatives.

3.6.1.4 Making a causal claim

Now, let’s see about making a causal claim for the sleep deprivation study:

  • Association: There is a statistically significant result, so the cause is associated with the effect
  • Timing: The participants were sleep deprived before their performance was measured, so the cause came before the effect
  • Plausible alternative explanations: The participants were randomly assigned, so the groups are not systematically different on any confounding variable. The only systematic difference between the groups was sleep deprivation. Thus, there are no plausible alternative explanations for the difference between the groups, other than sleep deprivation

Thus, the internal validity evidence for this study is high, and we can make a causal claim. For the participants in this study, we can say that sleep deprivation caused a decrease in performance.

Key points: Causation and internal validity

To make a cause-and-effect inference, you need to consider three criteria:

  • Association of the Cause and Effect: There needs to be an association between the cause and effect. This can be established by a hypothesis test.
  • Timing: The cause needs to come before the effect.
  • No Plausible Alternative Explanations: There must be no plausible alternative explanations for the observed effect; a strong study design, such as one using random assignment, helps rule these out.

Random assignment removes any systematic differences between the groups (other than the treatment), and thus helps to rule out plausible alternative explanations.

Internal validity describes the degree to which cause-and-effect inferences are accurate and meaningful.

Confounding variables are variables that might affect the results, other than the causal variable that we are interested in.

Probabilistic equivalence means that there is not a systematic difference between groups. The groups are the same on average.

How can we make "equivalent" experimental groups?

Random Assignment in Psychology | Definition, Purpose & Examples

Random assignment eliminates systematic initial differences between the experimental group and the control group. It is a method of limiting the effects of confounding variables: any impact they may have is not systematic and is spread evenly across the groups, leaving the independent variable as the only systematic influence on the results.

What is an example of random assignment?

If the research design has only two groups, the experimental group and the control group, every person is just as likely to be in either group. In this way, a new medicine may be compared to an older medicine. All people with the same complaint (back pain, for example) are chosen. A coin is tossed, and based on heads or tails, each person is enrolled into either the older-medicine treatment or the newer-medicine treatment (this is the random assignment). The reduction of reported pain is then compared.

What is random assignment and random sampling?

Random assignment happens when participants are already part of the research experiment. It is used to place people in groups where every participant has an equal chance of being in the experimental group or control group. Random sampling means every person of a given population has an equal chance to be a part of a research study. First, a random sample is taken of all people matching the research criteria. Then, they are randomly assigned to groups after they are a part of the study.
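As a rough sketch of the two steps, assuming a hypothetical list of population members and using only Python’s random module (the names and sizes are made up for illustration):

```python
import random

random.seed(7)

# Hypothetical population of people who meet the research criteria.
population = [f"person_{i}" for i in range(1000)]

# Random sampling: every member of the population has an equal chance
# of being drawn into the study.
sample = random.sample(population, 40)

# Random assignment: every sampled participant then has an equal chance
# of ending up in either group.
random.shuffle(sample)
experimental_group, control_group = sample[:20], sample[20:]

print(len(experimental_group), len(control_group))  # 20 20
```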

Random assignment is a critical part of any experimental design in science, especially random assignment in psychology. The simplest random assignment definition is that every participant in the research study has an equal chance of being in either the experimental group or control group.

Before random assignment can be accomplished, there first has to be a true experiment. In a true experiment, the principal relationship being investigated is the connection between the independent variable and the dependent variable. The independent variable is what a researcher manipulates. The dependent variable is the measured outcome. In a simple design, the independent variable has two levels, which define two groups: the experimental group and the control group. Assignment to these groups is where random assignment comes into play.

The easiest way to understand the relationship between these key terms is by an example. If an experiment is investigating how a new medicine relieves headaches, the medicine is the independent variable. First, a random sample is taken from among all people with headaches. It is impossible to include everyone in the world who suffers from headaches because it would simply be too many people; instead, a sample is randomly selected. Everyone has an equal chance to be in the study if they are suffering from a headache. While this is very important, it is not random assignment.

The independent variable in this research is which medicine is given to participants to relieve the headache. Subjects must be randomly assigned to the two groups. Half of the participants will be given a standard headache medicine; this is the control group. The other half will be given the new medicine; this is the experimental group. After the medicine is given, the dependent variable of how well the headache is relieved determines how well the new medicine worked. To draw conclusions from these results, random assignment to each group must have taken place, or the results cannot provide conclusive evidence. If random assignment was not followed, the study is not a true experiment at all. For random assignment to be random, it must strictly follow these points (a brief illustrative sketch follows the list):

  • Everyone has an equal chance to be in the control or the experimental group.
  • No participant is placed into one or another group because they have greater or lesser symptoms.
  • The participant does not know what group they have been assigned to (the experimental or control group).
  • There is no systematic flaw with the assignment, such as a random number generator that is not producing random numbers.
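The sketch below illustrates those points with hypothetical participant IDs and a simulated coin toss (Python’s random module); it is one possible implementation for a two-group design, not a prescribed procedure.

```python
import random

random.seed(42)

# Hypothetical participant IDs for the back-pain medicine comparison above.
participants = [f"P{i:03d}" for i in range(1, 21)]

control_group, experimental_group = [], []
for person in participants:
    # The "coin toss": a 50% chance of either treatment, regardless of
    # symptoms, preferences, or anything the researcher knows about the person.
    if random.random() < 0.5:
        control_group.append(person)        # older medicine
    else:
        experimental_group.append(person)   # newer medicine

print(control_group)
print(experimental_group)
```

Per-person coin flips can produce groups of unequal size; shuffling the list and splitting it in half is an equally random alternative that guarantees equal group sizes.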

Purpose of Random Assignment

The purpose of random assignment is to achieve statistically comparable groups. It attempts to eliminate the impact of a confounding variable. A confounding variable is any circumstance that unintentionally impacts or affects the results. Some confounding variables can be anticipated, but not all. Random assignment tries to cancel initial group differences, that is, ways in which the groups would otherwise not be the same when the research begins. It does this by making any differences between the two groups a product of simple chance rather than a consistent bias.

Why is Random Assignment Important?

To draw conclusions of causality between the independent variable and dependent variable, both groups must begin as equally as possible in as many ways as possible. In this example, some participants will likely report they have a more painful headache than others. If this information is known at the time of assignment and participants with greater reports of pain are assigned to the new medicine hoping it will work better, the research is compromised. Any results generated cannot be used to show how well the medicines worked and cannot show which had a greater effect. The groups were not equal at the start so the outcome will also not be equal. This is one confounding variable that a researcher may reasonably expect to be a problem.

Random assignment eliminates this possibility, but it also eliminates the impact of confounding variables that may not be foreseen. Because an experiment is trying to show causality and not just a relationship, random assignment must be present. Without it, no conclusions about the independent variable impacting the dependent variable can be relied upon. Other research uses different assignment techniques purposely to answer a specific research question, but this produces a quasi-experimental design, not a true experiment.

Benefits of Random Assignment

There are several benefits to using random assignment.

  • Ease of assigning participants to the control or experimental group.
  • Reduction of initial screening needed for participants.
  • Improved scientific reliability of the research.
  • Results are more conclusive and more widely accepted.

Random assignment is complex in theory but can be simple in practice. If there are only two groups, as in the headache medicine example, a coin toss can be used to assign participants to the control group or the experimental group. A coin toss is an accepted method of randomizing a trial such as this one. With only two groups, each person has a 50% chance of being assigned to either group. Other methods can also be used if there are more groups. Computer programs are available to generate random numbers that can be used for random assignment.

Regardless of what specific method is used, every participant in the sample must have an equal chance to be assigned to either the experimental group or the control group.

A coin toss is one way to randomly assign participants to different groups

Random assignment is a part of the design of an experiment, and it is part of what sets an experiment apart from other research methods such as a quasi-experimental design. Random assignment is defined as every participant having an equal chance of being in either the experimental group or the control group. Each group is presented with the independent variable, or the condition the researcher manipulates. The dependent variable, or outcome, is then measured for changes. When compared, differences between the experimental and control group can be attributed to the difference of the independent variable. Random assignment allows cause-and-effect relationships to be identified, but it is only one part of the process.

Random assignment produces groups that are, in theory, equal at the start of the experiment. This controls for confounding variables that may skew results. Random assignment in psychology produces a true experiment and is the only way cause and effect relationships can be investigated.

Video Transcript

What is Random Assignment?

Psychologists use experiments to investigate how manipulation of one factor causes a change in another factor. Scientists refer to these factors as one of two kinds of variables. The independent variable is that first factor: the one whose influence we're trying to measure. An independent variable doesn't change based on the other variables. The second factor, the one being influenced by changes, is called a dependent variable. This kind of variable changes based on the independent variable. Experiments are the best way to determine cause and effect relationships between these variables.

Psychologists rely on random assignment to assign subjects to different groups in an experiment. Random assignment leaves it completely up to chance to determine which subjects receive the critical part of the experiment, which is imperative for determining that the independent variable is indeed what creates the result. Randomly assigning subjects helps to eliminate confounding variables, or variables other than the independent variable that could cause a change in the dependent variable.

Experimental Design

Suppose one day while studying for a test, you notice that you seem more focused and productive while you are listening to music. In fact, you think it's possible that listening to music while studying helps you earn better grades on tests. You have been taking psychology courses, and armed with the love of science, you decide to conduct an experiment to see if your hypothesis is correct.

You decide to test your hypothesis on the 300 students in your college introduction to psychology class. What is the independent variable in your experiment?

Remember that the independent variable is the part of the study that is manipulated or changed to determine a result. In your experiment, you will manipulate whether or not students listen to music while studying, so listening to music is the independent variable. The dependent variable then will be the subjects' scores on the test. The dependent variable shows the effect of the manipulation.

Experimental and Control Groups

To test the independent variable, you will need an experimental group and a control group. The experimental group is the group who receives the critical part of the experiment, the treatment. This is the group who will listen to music while studying.

But to know if the music has an effect on test scores, you will also need to compare the results of the experimental group to a control group, a group which doesn't receive the critical part of the experiment. In this case, our control group won't listen to music while studying.

How will you decide who is in the experimental group and who is in the control group? What about allowing students to choose which group they're in? No, that won't work. Maybe the students who choose to listen to music are already better students who excel at focused studying. We can't assume the results will be valid.

Okay, so how about picking the experimental group based on a first-come basis? Sorry, choosing to put the first 150 students who come to class in the experimental group is also not random assignment. Maybe those students who get up earlier to make it to class on time typically perform higher on tests because they get more sleep. Random assignment is the only way to eliminate other variables that could influence your results.

Making Random Assignment Happen

So how do you ensure random assignment? There are a lot of different methods; the only requirement is that every subject has an equal chance to be in the experimental group.

Drawing names out of a hat or creating a lottery are ways to make assignment to the experimental group random. Choosing every third name off a list of students can also work, provided the order of the list is unrelated to the students' characteristics. Several computer programs can generate a random assignment of participants for you. The important piece is that subjects are equally likely to be in the experimental group.

Random assignment is the best way to assure that the only difference between the control group and the experimental group is whether or not they receive the treatment. Any other differences between groups, such as amount of sleep, G.P.A., or even unknown factors, are more likely to be equally distributed if subjects are chosen randomly.

Random assignment is the only way to assume the difference in test scores is caused by listening to music while studying. It allows psychologists, and budding psychologists like yourself, to have confidence that the results of your study are valid.

In psychology experiments, psychologists use random assignment to assign subjects to groups. Using random methods, subjects are assigned to either an experimental group, which will receive an experimental treatment and be observed, or the control group, which is observed under normal, non-experimental conditions. Random assignment helps ensure that confounding variables, or factors other than the independent variable that could influence the result of the experiment, do not differ systematically between the groups; uncontrolled confounding variables can render an experiment's findings invalid.

There are many ways to assign subjects completely randomly: You can use a computer program, take names from a list at regular intervals, or even pick names from a hat. The important thing is that a random assignment method must allow every subject an equal chance of being in either the experimental group or the control group.

Key Takeaways

  • Random assignment helps to ensure that confounding variables do not differ systematically between the groups
  • To qualify as random assignment, each subject must have an equal chance of being assigned to any group
  • Random assignment allows researchers to be confident their findings are valid

Lesson Outcomes

After viewing this lesson, you should be able to:

  • Define random assignment and other research method key terms
  • Summarize the methods of random assignment
  • Provide examples of random assignment

PH717 Module 11 - Confounding and Effect Measure Modification

Three Methods for Minimizing Confounding in the Study Design Phase

Confounding is a major problem in epidemiologic research, and it accounts for many of the discrepancies among published studies. Nevertheless, there are ways of minimizing confounding in the design phase of a study, and there are also methods for adjusting for confounding during analysis of a study.

Randomization in a Clinical Trial

The ideal way to minimize the effects of confounding is to conduct a large randomized clinical trial so that each subject has an equal chance of being assigned to any of the treatment options. If this is done with a sufficiently large number of subjects, other risk factors (i.e., confounding factors) should be equally distributed among the exposure groups. The beauty of this is that even unknown confounding factors will be equally distributed among the comparison groups. If all of these other factors are distributed equally among the groups being compared, they will not distort the association between the treatment being studied and the outcome.

The success of randomization is usually evaluated in one of the first tables in a clinical trial, i.e., a table comparing characteristics of the exposure groups. If the groups have similar distributions of all of the known confounding factors, then randomization was successful. However, if randomization was not successful in producing equal distributions of confounding factors, then methods of adjusting for confounding must be used in the analysis of the data.
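As a hedged illustration of such a baseline-characteristics table, the sketch below uses made-up trial data and pandas (both assumptions, not part of the source): it compares the mean of each known confounder across the randomized arms.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(3)
n = 200

# Hypothetical trial: baseline characteristics measured before randomization.
df = pd.DataFrame({
    "age": rng.normal(55, 10, n).round(0),
    "female": rng.integers(0, 2, n),
    "smoker": rng.integers(0, 2, n),
})
# Randomize: each subject has an equal chance of either arm.
df["arm"] = rng.permutation(["treatment"] * (n // 2) + ["control"] * (n // 2))

# A minimal "Table 1": mean of each characteristic by arm.
print(df.groupby("arm")[["age", "female", "smoker"]].mean().round(2))
# Similar values across arms suggest randomization balanced these factors;
# a clear imbalance would call for adjustment in the analysis.
```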

Strengths of randomization:

  • There is no limit on the number of confounders that can be controlled
  • It controls for both known and unknown confounders
  • If successful, there is no need to "adjust" for confounding

Limitations of randomization to control for confounding:

  • It is limited to intervention studies (clinical trials)
  • It may not be completely effective for small trials

Restriction of Enrollment

Limiting the study to subjects in one category of the confounder is a simple way of ensuring that all participants have the same level of the confounder. For example,

  • If smoking is a confounding factor, one could limit the study population to only non-smokers or only smokers.
  • If sex is a confounding factor, limit the participants to only men or only women
  • If age is a confounding factor, restrict the study to subjects in a specific age category, e.g., persons >65.

Restriction is simple and generally effective, but it has several drawbacks:

  • It can only be used for known confounders and only when the status of potential subjects is known with respect to that variable
  • Residual confounding may occur if restriction is not narrow enough. For example, a study of the association between physical activity and heart disease might be restricted to subjects between the ages of 30-60, but that is a wide age range, and the risk of heart disease still varies widely within that range.
  • Investigators cannot evaluate the effect of the restricted variable, since it doesn't vary
  • Restriction limits the number of potential subjects and may limit sample size
  • If restriction is used, one cannot generalize the findings to those who were excluded.
  • Restriction is particularly cumbersome if used to control for multiple confounding variables.

Matching Compared Groups

Another risk factor can only cause confounding if it is distributed differently in the groups being compared. Therefore, another method of preventing confounding is to match the subjects with respect to confounding variables. This method can be used in both cohort studies and in case-control studies in order to enroll a reference group that has artificially been created to have the same distribution of a confounding factor as the index group. For example,

  • In a case-control study of lung cancer where age is a potential confounding factor, match each case with one or more control subjects of similar age (a brief sketch of this kind of age matching follows these lists). If this is done, the age distribution of the comparison groups will be the same, and there will be no confounding by age.
  • In a cohort study on the effects of smoking, each smoker (the index group) who is enrolled is matched with a non-smoker (reference group) of similar age. Once again, the groups being compared will have the same age distribution, so confounding by age will be prevented.

Advantages of matching:

  • Matching is particularly useful when trying to control for complex or difficult-to-measure confounding variables, e.g., matching by neighborhood to control for confounding by air pollution.
  • It can also be used in case-control studies with few cases when additional control subjects are enrolled to increase statistical power, e.g., 4 to 1 matching of controls to cases.

Drawbacks of matching:

  • It can only be used for known confounders.
  • It can be difficult, expensive, and time-consuming to find appropriate matches.
  • One cannot evaluate the effect of the matched variable.
  • Matching requires special analytic methods.
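To make the age-matching idea concrete, here is a minimal sketch with hypothetical cases and controls; the greedy nearest-age match within a caliper is just one simple approach, not the specific method the text prescribes.

```python
# Hypothetical cases and potential controls with their ages.
cases = {"case1": 63, "case2": 47, "case3": 55}
controls = {"ctrlA": 64, "ctrlB": 48, "ctrlC": 71, "ctrlD": 54, "ctrlE": 46}

CALIPER = 2  # maximum allowed age difference for a match

matches = {}
available = dict(controls)
for case_id, case_age in cases.items():
    # Greedily pick the closest still-available control within the caliper.
    best = min(available, key=lambda c: abs(available[c] - case_age), default=None)
    if best is not None and abs(available[best] - case_age) <= CALIPER:
        matches[case_id] = best
        del available[best]

print(matches)  # {'case1': 'ctrlA', 'case2': 'ctrlB', 'case3': 'ctrlD'}
```

Because matching was used, the matched pairs would then be analyzed with methods that respect the pairing, consistent with the drawback noted above that matching requires special analytic methods.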

Content ©2021. All Rights Reserved. Date last modified: November 11, 2021. Wayne W. LaMorte, MD, PhD, MPH

What random assignment does and does not do

Random assignment of patients to comparison groups stochastically tends, with increasing sample size or number of experiment replications, to minimize the confounding of treatment outcome differences by the effects of differences among these groups in unknown/unmeasured patient characteristics. To what degree such confounding is actually avoided we cannot know unless we have validly measured these patient variables, but completely avoiding it is quite unlikely. Even if this confounding were completely avoided, confounding by unmeasured Patient Variable x Treatment Variable interactions remains a possibility. And the causal power of the confounding variables is no less important for internal validity than the degree of confounding.

Copyright 2003 Wiley Periodicals, Inc. J Clin Psychol.

Frequently asked questions

How do I prevent confounding variables from interfering with my research?

There are several methods you can use to decrease the impact of confounding variables on your research: restriction, matching, statistical control and randomization.

In restriction, you restrict your sample by only including certain subjects that have the same values of potential confounding variables.

In matching, you match each of the subjects in your treatment group with a counterpart in the comparison group. The matched subjects have the same values on any potential confounding variables, and only differ in the independent variable.

In statistical control, you include potential confounders as variables in your regression.
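To illustrate the statistical-control option, here is a hedged sketch with simulated data; the variable names and the use of statsmodels are assumptions, not anything specified above. Entering the measured confounder alongside the exposure yields an exposure coefficient that is adjusted for that confounder.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 500

confounder = rng.normal(0, 1, n)                                   # e.g., a measured trait
exposure = (confounder + rng.normal(0, 1, n) > 0).astype(float)    # exposure depends on the confounder
outcome = 2.0 * exposure + 1.5 * confounder + rng.normal(0, 1, n)  # outcome depends on both

# Statistical control: include the confounder as a covariate, so the
# exposure coefficient is estimated holding the confounder fixed.
X = sm.add_constant(np.column_stack([exposure, confounder]))
fit = sm.OLS(outcome, X).fit()
print(fit.params)  # [intercept, adjusted exposure effect (~2.0), confounder effect (~1.5)]
```

Leaving the confounder out of this model would inflate the estimated exposure effect, which is exactly the distortion that statistical control is meant to remove.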

In randomization, you randomly assign the treatment (or independent variable) in your study to a sufficiently large number of subjects, which allows you to control for all potential confounding variables.

Frequently asked questions: Methodology

Attrition refers to participants leaving a study. It always happens to some extent—for example, in randomized controlled trials for medical research.

Differential attrition occurs when attrition or dropout rates differ systematically between the intervention and the control group . As a result, the characteristics of the participants who drop out differ from the characteristics of those who stay in the study. Because of this, study results may be biased .

Action research is conducted in order to solve a particular issue immediately, while case studies are often conducted over a longer period of time and focus more on observing and analyzing a particular ongoing phenomenon.

Action research is focused on solving a problem or informing individual and community-based knowledge in a way that impacts teaching, learning, and other related processes. It is less focused on contributing theoretical input, instead producing actionable input.

Action research is particularly popular with educators as a form of systematic inquiry because it prioritizes reflection and bridges the gap between theory and practice. Educators are able to simultaneously investigate an issue as they solve it, and the method is very iterative and flexible.

A cycle of inquiry is another name for action research . It is usually visualized in a spiral shape following a series of steps, such as “planning → acting → observing → reflecting.”

To make quantitative observations , you need to use instruments that are capable of measuring the quantity you want to observe. For example, you might use a ruler to measure the length of an object or a thermometer to measure its temperature.

Criterion validity and construct validity are both types of measurement validity . In other words, they both show you how accurately a method measures something.

While construct validity is the degree to which a test or other measurement method measures what it claims to measure, criterion validity is the degree to which a test can predictively (in the future) or concurrently (in the present) measure something.

Construct validity is often considered the overarching type of measurement validity . You need to have face validity , content validity , and criterion validity in order to achieve construct validity.

Convergent validity and discriminant validity are both subtypes of construct validity . Together, they help you evaluate whether a test measures the concept it was designed to measure.

  • Convergent validity indicates whether a test that is designed to measure a particular construct correlates with other tests that assess the same or similar construct.
  • Discriminant validity indicates whether two tests that should not be highly related to each other are indeed not related. This type of validity is also called divergent validity .

You need to assess both in order to demonstrate construct validity. Neither one alone is sufficient for establishing construct validity.

  • Discriminant validity indicates whether two tests that should not be highly related to each other are indeed not related

Content validity shows you how accurately a test or other measurement method taps  into the various aspects of the specific construct you are researching.

In other words, it helps you answer the question: “does the test measure all aspects of the construct I want to measure?” If it does, then the test has high content validity.

The higher the content validity, the more accurate the measurement of the construct.

If the test fails to include parts of the construct, or irrelevant parts are included, the validity of the instrument is threatened, which brings your results into question.

Face validity and content validity are similar in that they both evaluate how suitable the content of a test is. The difference is that face validity is subjective, and assesses content at surface level.

When a test has strong face validity, anyone would agree that the test’s questions appear to measure what they are intended to measure.

For example, looking at a 4th grade math test consisting of problems in which students have to add and multiply, most people would agree that it has strong face validity (i.e., it looks like a math test).

On the other hand, content validity evaluates how well a test represents all the aspects of a topic. Assessing content validity is more systematic and relies on expert evaluation. of each question, analyzing whether each one covers the aspects that the test was designed to cover.

A 4th grade math test would have high content validity if it covered all the skills taught in that grade. Experts(in this case, math teachers), would have to evaluate the content validity by comparing the test to the learning objectives.

Snowball sampling is a non-probability sampling method . Unlike probability sampling (which involves some form of random selection ), the initial individuals selected to be studied are the ones who recruit new participants.

Because not every member of the target population has an equal chance of being recruited into the sample, selection in snowball sampling is non-random.

Snowball sampling is a non-probability sampling method , where there is not an equal chance for every member of the population to be included in the sample .

This means that you cannot use inferential statistics and make generalizations —often the goal of quantitative research . As such, a snowball sample is not representative of the target population and is usually a better fit for qualitative research .

Snowball sampling relies on the use of referrals. Here, the researcher recruits one or more initial participants, who then recruit the next ones.

Participants share similar characteristics and/or know each other. Because of this, not every member of the population has an equal chance of being included in the sample, giving rise to sampling bias .

Snowball sampling is best used in the following cases:

  • If there is no sampling frame available (e.g., people with a rare disease)
  • If the population of interest is hard to access or locate (e.g., people experiencing homelessness)
  • If the research focuses on a sensitive topic (e.g., extramarital affairs)

The reproducibility and replicability of a study can be ensured by writing a transparent, detailed method section and using clear, unambiguous language.

Reproducibility and replicability are related terms.

  • Reproducing research entails reanalyzing the existing data in the same manner.
  • Replicating (or repeating ) the research entails reconducting the entire analysis, including the collection of new data . 
  • A successful reproduction shows that the data analyses were conducted in a fair and honest manner.
  • A successful replication shows that the reliability of the results is high.

Stratified sampling and quota sampling both involve dividing the population into subgroups and selecting units from each subgroup. The purpose in both cases is to select a representative sample and/or to allow comparisons between subgroups.

The main difference is that in stratified sampling, you draw a random sample from each subgroup ( probability sampling ). In quota sampling you select a predetermined number or proportion of units, in a non-random manner ( non-probability sampling ).

Purposive and convenience sampling are both sampling methods that are typically used in qualitative data collection.

A convenience sample is drawn from a source that is conveniently accessible to the researcher. Convenience sampling does not distinguish characteristics among the participants. On the other hand, purposive sampling focuses on selecting participants possessing characteristics associated with the research study.

The findings of studies based on either convenience or purposive sampling can only be generalized to the (sub)population from which the sample is drawn, and not to the entire population.

Random sampling or probability sampling is based on random selection. This means that each unit has an equal chance (i.e., equal probability) of being included in the sample.

On the other hand, convenience sampling involves stopping people at random, which means that not everyone has an equal chance of being selected depending on the place, time, or day you are collecting your data.

Convenience sampling and quota sampling are both non-probability sampling methods. They both use non-random criteria like availability, geographical proximity, or expert knowledge to recruit study participants.

However, in convenience sampling, you continue to sample units or cases until you reach the required sample size.

In quota sampling, you first need to divide your population of interest into subgroups (strata) and estimate their proportions (quota) in the population. Then you can start your data collection, using convenience sampling to recruit participants, until the proportions in each subgroup coincide with the estimated proportions in the population.

A sampling frame is a list of every member in the entire population . It is important that the sampling frame is as complete as possible, so that your sample accurately reflects your population.

Stratified and cluster sampling may look similar, but bear in mind that groups created in cluster sampling are heterogeneous , so the individual characteristics in the cluster vary. In contrast, groups created in stratified sampling are homogeneous , as units share characteristics.

Relatedly, in cluster sampling you randomly select entire groups and include all units of each group in your sample. However, in stratified sampling, you select some units of all groups and include them in your sample. In this way, both methods can ensure that your sample is representative of the target population .

A systematic review is secondary research because it uses existing research. You don’t collect new data yourself.

The key difference between observational studies and experimental designs is that a well-done observational study does not influence the responses of participants, while experiments do have some sort of treatment condition applied to at least some participants by random assignment .

An observational study is a great choice for you if your research question is based purely on observations. If there are ethical, logistical, or practical concerns that prevent you from conducting a traditional experiment , an observational study may be a good choice. In an observational study, there is no interference or manipulation of the research subjects, as well as no control or treatment groups .

It’s often best to ask a variety of people to review your measurements. You can ask experts, such as other researchers, or laypeople, such as potential participants, to judge the face validity of tests.

While experts have a deep understanding of research methods , the people you’re studying can provide you with valuable insights you may have missed otherwise.

Face validity is important because it’s a simple first step to measuring the overall validity of a test or technique. It’s a relatively intuitive, quick, and easy way to start checking whether a new measure seems useful at first glance.

Good face validity means that anyone who reviews your measure says that it seems to be measuring what it’s supposed to. With poor face validity, someone reviewing your measure may be left confused about what you’re measuring and why you’re using this method.

Face validity is about whether a test appears to measure what it’s supposed to measure. This type of validity is concerned with whether a measure seems relevant and appropriate for what it’s assessing only on the surface.

Statistical analyses are often applied to test validity with data from your measures. You test convergent validity and discriminant validity with correlations to see if results from your test are positively or negatively related to those of other established tests.

You can also use regression analyses to assess whether your measure is actually predictive of outcomes that you expect it to predict theoretically. A regression analysis that supports your expectations strengthens your claim of construct validity .

When designing or evaluating a measure, construct validity helps you ensure you’re actually measuring the construct you’re interested in. If you don’t have construct validity, you may inadvertently measure unrelated or distinct constructs and lose precision in your research.

Construct validity is often considered the overarching type of measurement validity ,  because it covers all of the other types. You need to have face validity , content validity , and criterion validity to achieve construct validity.

Construct validity is about how well a test measures the concept it was designed to evaluate. It’s one of four types of measurement validity , which includes construct validity, face validity , and criterion validity.

There are two subtypes of construct validity.

  • Convergent validity : The extent to which your measure corresponds to measures of related constructs
  • Discriminant validity : The extent to which your measure is unrelated or negatively related to measures of distinct constructs

Naturalistic observation is a valuable tool because of its flexibility, external validity , and suitability for topics that can’t be studied in a lab setting.

The downsides of naturalistic observation include its lack of scientific control , ethical considerations , and potential for bias from observers and subjects.

Naturalistic observation is a qualitative research method where you record the behaviors of your research subjects in real world settings. You avoid interfering or influencing anything in a naturalistic observation.

You can think of naturalistic observation as “people watching” with a purpose.

A dependent variable is what changes as a result of the independent variable manipulation in experiments . It’s what you’re interested in measuring, and it “depends” on your independent variable.

In statistics, dependent variables are also called:

  • Response variables (they respond to a change in another variable)
  • Outcome variables (they represent the outcome you want to measure)
  • Left-hand-side variables (they appear on the left-hand side of a regression equation)

An independent variable is the variable you manipulate, control, or vary in an experimental study to explore its effects. It’s called “independent” because it’s not influenced by any other variables in the study.

Independent variables are also called:

  • Explanatory variables (they explain an event or outcome)
  • Predictor variables (they can be used to predict the value of a dependent variable)
  • Right-hand-side variables (they appear on the right-hand side of a regression equation).

As a rule of thumb, questions related to thoughts, beliefs, and feelings work well in focus groups. Take your time formulating strong questions, paying special attention to phrasing. Be careful to avoid leading questions , which can bias your responses.

Overall, your focus group questions should be:

  • Open-ended and flexible
  • Impossible to answer with “yes” or “no” (questions that start with “why” or “how” are often best)
  • Unambiguous, getting straight to the point while still stimulating discussion
  • Unbiased and neutral

A structured interview is a data collection method that relies on asking questions in a set order to collect data on a topic. They are often quantitative in nature. Structured interviews are best used when: 

  • You already have a very clear understanding of your topic. Perhaps significant research has already been conducted, or you have done some prior research yourself, but you already possess a baseline for designing strong structured questions.
  • You are constrained in terms of time or resources and need to analyze your data quickly and efficiently.
  • Your research question depends on strong parity between participants, with environmental conditions held constant.

More flexible interview options include semi-structured interviews , unstructured interviews , and focus groups .

Social desirability bias is the tendency for interview participants to give responses that will be viewed favorably by the interviewer or other participants. It occurs in all types of interviews and surveys , but is most common in semi-structured interviews , unstructured interviews , and focus groups .

Social desirability bias can be mitigated by ensuring participants feel at ease and comfortable sharing their views. Make sure to pay attention to your own body language and any physical or verbal cues, such as nodding or widening your eyes.

This type of bias can also occur in observations if the participants know they’re being observed. They might alter their behavior accordingly.

The interviewer effect is a type of bias that emerges when a characteristic of an interviewer (race, age, gender identity, etc.) influences the responses given by the interviewee.

There is a risk of an interviewer effect in all types of interviews , but it can be mitigated by writing really high-quality interview questions.

A semi-structured interview is a blend of structured and unstructured types of interviews. Semi-structured interviews are best used when:

  • You have prior interview experience. Spontaneous questions are deceptively challenging, and it’s easy to accidentally ask a leading question or make a participant uncomfortable.
  • Your research question is exploratory in nature. Participant answers can guide future research questions and help you develop a more robust knowledge base for future research.

An unstructured interview is the most flexible type of interview, but it is not always the best fit for your research topic.

Unstructured interviews are best used when:

  • You are an experienced interviewer and have a very strong background in your research topic, since it is challenging to ask spontaneous, colloquial questions.
  • Your research question is exploratory in nature. While you may have developed hypotheses, you are open to discovering new or shifting viewpoints through the interview process.
  • You are seeking descriptive data, and are ready to ask questions that will deepen and contextualize your initial thoughts and hypotheses.
  • Your research depends on forming connections with your participants and making them feel comfortable revealing deeper emotions, lived experiences, or thoughts.

The four most common types of interviews are:

  • Structured interviews : The questions are predetermined in both topic and order. 
  • Semi-structured interviews : A few questions are predetermined, but other questions aren’t planned.
  • Unstructured interviews : None of the questions are predetermined.
  • Focus group interviews : The questions are presented to a group instead of one individual.

Deductive reasoning is commonly used in scientific research, and it’s especially associated with quantitative research .

In research, you might have come across something called the hypothetico-deductive method . It’s the scientific method of testing hypotheses to check whether your predictions are substantiated by real-world data.

Deductive reasoning is a logical approach where you progress from general ideas to specific conclusions. It’s often contrasted with inductive reasoning , where you start with specific observations and form general conclusions.

Deductive reasoning is also called deductive logic.

There are many different types of inductive reasoning that people use formally or informally.

Here are a few common types:

  • Inductive generalization : You use observations about a sample to come to a conclusion about the population it came from.
  • Statistical generalization: You use specific numbers about samples to make statements about populations.
  • Causal reasoning: You make cause-and-effect links between different things.
  • Sign reasoning: You make a conclusion about a correlational relationship between different things.
  • Analogical reasoning: You make a conclusion about something based on its similarities to something else.

Inductive reasoning is a bottom-up approach, while deductive reasoning is top-down.

Inductive reasoning takes you from the specific to the general, while in deductive reasoning, you make inferences by going from general premises to specific conclusions.

In inductive research , you start by making observations or gathering data. Then, you take a broad scan of your data and search for patterns. Finally, you make general conclusions that you might incorporate into theories.

Inductive reasoning is a method of drawing conclusions by going from the specific to the general. It’s usually contrasted with deductive reasoning, where you proceed from general information to specific conclusions.

Inductive reasoning is also called inductive logic or bottom-up reasoning.

A hypothesis states your predictions about what your research will find. It is a tentative answer to your research question that has not yet been tested. For some research projects, you might have to write several hypotheses that address different aspects of your research question.

A hypothesis is not just a guess — it should be based on existing theories and knowledge. It also has to be testable, which means you can support or refute it through scientific research methods (such as experiments, observations and statistical analysis of data).

Triangulation can help:

  • Reduce research bias that comes from using a single method, theory, or investigator
  • Enhance validity by approaching the same topic with different tools
  • Establish credibility by giving you a complete picture of the research problem

But triangulation can also pose problems:

  • It’s time-consuming and labor-intensive, often involving an interdisciplinary team.
  • Your results may be inconsistent or even contradictory.

There are four main types of triangulation :

  • Data triangulation : Using data from different times, spaces, and people
  • Investigator triangulation : Involving multiple researchers in collecting or analyzing data
  • Theory triangulation : Using varying theoretical perspectives in your research
  • Methodological triangulation : Using different methodologies to approach the same topic

Many academic fields use peer review , largely to determine whether a manuscript is suitable for publication. Peer review enhances the credibility of the published manuscript.

However, peer review is also common in non-academic settings. The United Nations, the European Union, and many individual nations use peer review to evaluate grant applications. It is also widely used in medical and health-related fields as a teaching or quality-of-care measure. 

Peer assessment is often used in the classroom as a pedagogical tool. Both receiving feedback and providing it are thought to enhance the learning process, helping students think critically and collaboratively.

Peer review can stop obviously problematic, falsified, or otherwise untrustworthy research from being published. It also represents an excellent opportunity to get feedback from renowned experts in your field. It acts as a first defense, helping you ensure your argument is clear and that there are no gaps, vague terms, or unanswered questions for readers who weren’t involved in the research process.

Peer-reviewed articles are considered a highly credible source due to this stringent process they go through before publication.

In general, the peer review process follows these steps: 

  • First, the author submits the manuscript to the editor. The editor either rejects the manuscript and sends it back to the author, or sends it onward to the selected peer reviewer(s).
  • Next, the peer review process occurs. The reviewer provides feedback, addressing any major or minor issues with the manuscript, and gives their advice regarding what edits should be made. 
  • Lastly, the edited manuscript is sent back to the author. They input the edits and resubmit it to the editor for publication.

Exploratory research is often used when the issue you’re studying is new or when the data collection process is challenging for some reason.

You can use exploratory research if you have a general idea or a specific question that you want to study but there is no preexisting knowledge or paradigm with which to study it.

Exploratory research is a methodology approach that explores research questions that have not previously been studied in depth. It is often used when the issue you’re studying is new, or the data collection process is challenging in some way.

Explanatory research is used to investigate how or why a phenomenon occurs. Therefore, this type of research is often one of the first stages in the research process , serving as a jumping-off point for future research.

Exploratory research aims to explore the main aspects of an under-researched problem, while explanatory research aims to explain the causes and consequences of a well-defined problem.

Explanatory research is a research method used to investigate how or why something occurs when only a small amount of information is available pertaining to that topic. It can help you increase your understanding of a given topic.

Clean data are valid, accurate, complete, consistent, unique, and uniform. Dirty data include inconsistencies and errors.

Dirty data can come from any part of the research process, including poor research design , inappropriate measurement materials, or flawed data entry.

Data cleaning takes place between data collection and data analyses. But you can use some methods even before collecting data.

For clean data, you should start by designing measures that collect valid data. Data validation at the time of data entry or collection helps you minimize the amount of data cleaning you’ll need to do.

After data collection, you can use data standardization and data transformation to clean your data. You’ll also deal with any missing values, outliers, and duplicate values.

Every dataset requires different techniques to clean dirty data , but you need to address these issues in a systematic way. You focus on finding and resolving data points that don’t agree or fit with the rest of your dataset.

These data might be missing values, outliers, duplicate values, incorrectly formatted, or irrelevant. You’ll start with screening and diagnosing your data. Then, you’ll often standardize and accept or remove data to make your dataset consistent and valid.
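
To make these steps concrete, here is a minimal sketch of screening and cleaning a small dataset, assuming the pandas and numpy libraries are available; the column names, values, and thresholds are purely hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical raw data containing typical "dirty" problems
raw = pd.DataFrame({
    "participant_id": [1, 2, 2, 3, 4, 5],
    "age": [24, 31, 31, np.nan, 29, 210],              # a missing value and an implausible outlier
    "country": ["US", "us", "us", "UK", "UK ", "US"],  # inconsistent formatting
})

# Screening: remove duplicate records and standardize text values
clean = (
    raw.drop_duplicates(subset="participant_id")
       .assign(country=lambda d: d["country"].str.strip().str.upper())
)

# Diagnosing: treat out-of-range values as missing
clean.loc[~clean["age"].between(18, 100), "age"] = np.nan

# Resolving: decide how to handle missing values (here, drop incomplete rows)
clean = clean.dropna(subset=["age"])

print(clean)
```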

Data cleaning is necessary for valid and appropriate analyses. Dirty data contain inconsistencies or errors , but cleaning your data helps you minimize or resolve these.

Without data cleaning, you could end up with a Type I or II error in your conclusion. These types of erroneous conclusions can be practically significant with important consequences, because they lead to misplaced investments or missed opportunities.

Data cleaning involves spotting and resolving potential data inconsistencies or errors to improve your data quality. An error is any value (e.g., recorded weight) that doesn’t reflect the true value (e.g., actual weight) of something that’s being measured.

In this process, you review, analyze, detect, modify, or remove “dirty” data to make your dataset “clean.” Data cleaning is also called data cleansing or data scrubbing.

Research misconduct means making up or falsifying data, manipulating data analyses, or misrepresenting results in research reports. It’s a form of academic fraud.

These actions are committed intentionally and can have serious consequences; research misconduct is not a simple mistake or a point of disagreement but a serious ethical failure.

Anonymity means you don’t know who the participants are, while confidentiality means you know who they are but remove identifying information from your research report. Both are important ethical considerations .

You can only guarantee anonymity by not collecting any personally identifying information—for example, names, phone numbers, email addresses, IP addresses, physical characteristics, photos, or videos.

You can keep data confidential by using aggregate information in your research report, so that you only refer to groups of participants rather than individuals.

Research ethics matter for scientific integrity, human rights and dignity, and collaboration between science and society. These principles make sure that participation in studies is voluntary, informed, and safe.

Ethical considerations in research are a set of principles that guide your research designs and practices. These principles include voluntary participation, informed consent, anonymity, confidentiality, potential for harm, and results communication.

Scientists and researchers must always adhere to a certain code of conduct when collecting data from others .

These considerations protect the rights of research participants, enhance research validity , and maintain scientific integrity.

In multistage sampling , you can use probability or non-probability sampling methods .

For a probability sample, you have to conduct probability sampling at every stage.

You can mix it up by using simple random sampling , systematic sampling , or stratified sampling to select units at different stages, depending on what is applicable and relevant to your study.

Multistage sampling can simplify data collection when you have large, geographically spread samples, and you can obtain a probability sample without a complete sampling frame.

But multistage sampling may not lead to a representative sample, and larger samples are needed for multistage samples to achieve the statistical properties of simple random samples .

These are four of the most common mixed methods designs :

  • Convergent parallel: Quantitative and qualitative data are collected at the same time and analyzed separately. After both analyses are complete, compare your results to draw overall conclusions. 
  • Embedded: Quantitative and qualitative data are collected at the same time, but within a larger quantitative or qualitative design. One type of data is secondary to the other.
  • Explanatory sequential: Quantitative data is collected and analyzed first, followed by qualitative data. You can use this design if you think your qualitative data will explain and contextualize your quantitative findings.
  • Exploratory sequential: Qualitative data is collected and analyzed first, followed by quantitative data. You can use this design if you think the quantitative data will confirm or validate your qualitative findings.

Triangulation in research means using multiple datasets, methods, theories and/or investigators to address a research question. It’s a research strategy that can help you enhance the validity and credibility of your findings.

Triangulation is mainly used in qualitative research , but it’s also commonly applied in quantitative research . Mixed methods research always uses triangulation.

In multistage sampling , or multistage cluster sampling, you draw a sample from a population using smaller and smaller groups at each stage.

This method is often used to collect data from a large, geographically spread group of people in national surveys, for example. You take advantage of hierarchical groupings (e.g., from state to city to neighborhood) to create a sample that’s less expensive and time-consuming to collect data from.

No, the steepness or slope of the line isn’t related to the correlation coefficient value. The correlation coefficient only tells you how closely your data fit on a line, so two datasets with the same correlation coefficient can have very different slopes.

To find the slope of the line, you’ll need to perform a regression analysis .
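
This can be demonstrated with a short simulation, assuming numpy is available; the data are synthetic and only illustrative.

```python
import numpy as np

rng = np.random.default_rng(42)
x = rng.normal(size=500)
noise = rng.normal(size=500)

# Two outcomes with the same signal-to-noise ratio but different scales
y_shallow = 1 * x + 1 * noise    # regression slope near 1
y_steep = 10 * x + 10 * noise    # regression slope near 10

for name, y in [("shallow", y_shallow), ("steep", y_steep)]:
    r = np.corrcoef(x, y)[0, 1]     # correlation coefficient
    slope = np.polyfit(x, y, 1)[0]  # slope from a simple linear regression
    print(f"{name}: r = {r:.2f}, slope = {slope:.2f}")

# Both datasets give r close to 0.71, but their slopes differ by a factor of ten.
```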

Correlation coefficients always range between -1 and 1.

The sign of the coefficient tells you the direction of the relationship: a positive value means the variables change together in the same direction, while a negative value means they change together in opposite directions.

The absolute value of a number is equal to the number without its sign. The absolute value of a correlation coefficient tells you the magnitude of the correlation: the greater the absolute value, the stronger the correlation.

These are the assumptions your data must meet if you want to use Pearson’s r :

  • Both variables are on an interval or ratio level of measurement
  • Data from both variables follow normal distributions
  • Your data have no outliers
  • Your data is from a random or representative sample
  • You expect a linear relationship between the two variables

Quantitative research designs can be divided into two main categories:

  • Correlational and descriptive designs are used to investigate characteristics, averages, trends, and associations between variables.
  • Experimental and quasi-experimental designs are used to test causal relationships .

Qualitative research designs tend to be more flexible. Common types of qualitative design include case study , ethnography , and grounded theory designs.

A well-planned research design helps ensure that your methods match your research aims, that you collect high-quality data, and that you use the right kind of analysis to answer your questions, utilizing credible sources . This allows you to draw valid , trustworthy conclusions.

The priorities of a research design can vary depending on the field, but you usually have to specify:

  • Your research questions and/or hypotheses
  • Your overall approach (e.g., qualitative or quantitative )
  • The type of design you’re using (e.g., a survey , experiment , or case study )
  • Your sampling methods or criteria for selecting subjects
  • Your data collection methods (e.g., questionnaires , observations)
  • Your data collection procedures (e.g., operationalization , timing and data management)
  • Your data analysis methods (e.g., statistical tests  or thematic analysis )

A research design is a strategy for answering your   research question . It defines your overall approach and determines how you will collect and analyze data.

Questionnaires can be self-administered or researcher-administered.

Self-administered questionnaires can be delivered online or in paper-and-pen formats, in person or through mail. All questions are standardized so that all respondents receive the same questions with identical wording.

Researcher-administered questionnaires are interviews that take place by phone, in-person, or online between researchers and respondents. You can gain deeper insights by clarifying questions for respondents or asking follow-up questions.

You can organize the questions logically, with a clear progression from simple to complex, or randomly between respondents. A logical flow helps respondents process the questionnaire more easily and quickly, but it may lead to bias. Randomization can minimize the bias from order effects.

Closed-ended, or restricted-choice, questions offer respondents a fixed set of choices to select from. These questions are easier to answer quickly.

Open-ended or long-form questions allow respondents to answer in their own words. Because there are no restrictions on their choices, respondents can answer in ways that researchers may not have otherwise considered.

A questionnaire is a data collection tool or instrument, while a survey is an overarching research method that involves collecting and analyzing data from people using questionnaires.

The third variable and directionality problems are two main reasons why correlation isn’t causation .

The third variable problem means that a confounding variable affects both variables to make them seem causally related when they are not.

The directionality problem is when two variables correlate and might actually have a causal relationship, but it’s impossible to conclude which variable causes changes in the other.

Correlation describes an association between variables : when one variable changes, so does the other. A correlation is a statistical indicator of the relationship between variables.

Causation means that changes in one variable bring about changes in the other (i.e., there is a cause-and-effect relationship between variables). The two variables are correlated with each other, and there’s also a causal link between them.

While causation and correlation can exist simultaneously, correlation does not imply causation. In other words, correlation is simply a relationship where A relates to B—but A doesn’t necessarily cause B to happen (or vice versa). Mistaking correlation for causation is a common error and can lead to false cause fallacy .

Controlled experiments establish causality, whereas correlational studies only show associations between variables.

  • In an experimental design , you manipulate an independent variable and measure its effect on a dependent variable. Other variables are controlled so they can’t impact the results.
  • In a correlational design , you measure variables without manipulating any of them. You can test whether your variables change together, but you can’t be sure that one variable caused a change in another.

In general, correlational research is high in external validity while experimental research is high in internal validity .

A correlation is usually tested for two variables at a time, but you can test correlations between three or more variables.

A correlation coefficient is a single number that describes the strength and direction of the relationship between your variables.

Different types of correlation coefficients might be appropriate for your data based on their levels of measurement and distributions . The Pearson product-moment correlation coefficient (Pearson’s r ) is commonly used to assess a linear relationship between two quantitative variables.

A correlational research design investigates relationships between two variables (or more) without the researcher controlling or manipulating any of them. It’s a non-experimental type of quantitative research .

A correlation reflects the strength and/or direction of the association between two or more variables.

  • A positive correlation means that both variables change in the same direction.
  • A negative correlation means that the variables change in opposite directions.
  • A zero correlation means there’s no relationship between the variables.

Random error  is almost always present in scientific studies, even in highly controlled settings. While you can’t eradicate it completely, you can reduce random error by taking repeated measurements, using a large sample, and controlling extraneous variables .

You can avoid systematic error through careful design of your sampling , data collection , and analysis procedures. For example, use triangulation to measure your variables using multiple methods; regularly calibrate instruments or procedures; use random sampling and random assignment ; and apply masking (blinding) where possible.

Systematic error is generally a bigger problem in research.

With random error, multiple measurements will tend to cluster around the true value. When you’re collecting data from a large sample , the errors in different directions will cancel each other out.

Systematic errors are much more problematic because they can skew your data away from the true value. This can lead you to false conclusions ( Type I and II errors ) about the relationship between the variables you’re studying.

Random and systematic error are two types of measurement error.

Random error is a chance difference between the observed and true values of something (e.g., a researcher misreading a weighing scale records an incorrect measurement).

Systematic error is a consistent or proportional difference between the observed and true values of something (e.g., a miscalibrated scale consistently records weights as higher than they actually are).
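
A small simulation can illustrate the difference, assuming numpy is available; the weighing scenario and the numbers are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
true_weight = 70.0   # the true value being measured (kg)
n = 1000             # number of repeated measurements

# Random error: zero-mean noise scattered around the true value
with_random_error = true_weight + rng.normal(0, 0.5, size=n)

# Systematic error: a miscalibrated scale that always reads 2 kg too high
with_systematic_error = true_weight + 2.0 + rng.normal(0, 0.5, size=n)

print(f"Mean with random error only: {with_random_error.mean():.2f}")      # close to 70
print(f"Mean with systematic error:  {with_systematic_error.mean():.2f}")  # close to 72
```

With enough measurements, the random errors average out, while the systematic error remains as a constant bias.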

On graphs, the explanatory variable is conventionally placed on the x-axis, while the response variable is placed on the y-axis.

  • If you have quantitative variables , use a scatterplot or a line graph.
  • If your response variable is categorical, use a scatterplot or a line graph.
  • If your explanatory variable is categorical, use a bar graph.

The term “ explanatory variable ” is sometimes preferred over “ independent variable ” because, in real world contexts, independent variables are often influenced by other variables. This means they aren’t totally independent.

Multiple independent variables may also be correlated with each other, so “explanatory variables” is a more appropriate term.

The difference between explanatory and response variables is simple:

  • An explanatory variable is the expected cause, and it explains the results.
  • A response variable is the expected effect, and it responds to other variables.

In a controlled experiment , all extraneous variables are held constant so that they can’t influence the results. Controlled experiments require:

  • A control group that receives a standard treatment, a fake treatment, or no treatment.
  • Random assignment of participants to ensure the groups are equivalent.

Depending on your study topic, there are various other methods of controlling variables .

There are 4 main types of extraneous variables :

  • Demand characteristics : environmental cues that encourage participants to conform to researchers’ expectations.
  • Experimenter effects : unintentional actions by researchers that influence study outcomes.
  • Situational variables : environmental variables that alter participants’ behaviors.
  • Participant variables : any characteristic or aspect of a participant’s background that could affect study results.

An extraneous variable is any variable that you’re not investigating that can potentially affect the dependent variable of your research study.

A confounding variable is a type of extraneous variable that not only affects the dependent variable, but is also related to the independent variable.

In a factorial design, multiple independent variables are tested.

If you test two variables, each level of one independent variable is combined with each level of the other independent variable to create different conditions.

Within-subjects designs have many potential threats to internal validity , but they are also very statistically powerful .

Advantages:

  • Only requires small samples
  • Statistically powerful
  • Removes the effects of individual differences on the outcomes

Disadvantages:

  • Internal validity threats reduce the likelihood of establishing a direct relationship between variables
  • Time-related effects, such as growth, can influence the outcomes
  • Carryover effects mean that the specific order of different treatments affects the outcomes

While a between-subjects design has fewer threats to internal validity , it also requires more participants for high statistical power than a within-subjects design .

Advantages:

  • Prevents carryover effects of learning and fatigue.
  • Shorter study duration.

Disadvantages:

  • Needs larger samples for high power.
  • Uses more resources to recruit participants, administer sessions, cover costs, etc.
  • Individual differences may be an alternative explanation for results.

Yes. Between-subjects and within-subjects designs can be combined in a single study when you have two or more independent variables (a factorial design). In a mixed factorial design, one variable is altered between subjects and another is altered within subjects.

In a between-subjects design , every participant experiences only one condition, and researchers assess group differences between participants in various conditions.

In a within-subjects design , each participant experiences all conditions, and researchers test the same participants repeatedly for differences between conditions.

The word “between” means that you’re comparing different conditions between groups, while the word “within” means you’re comparing different conditions within the same group.

Random assignment is used in experiments with a between-groups or independent measures design. In this research design, there’s usually a control group and one or more experimental groups. Random assignment helps ensure that the groups are comparable.

In general, you should always use random assignment in this type of experimental design when it is ethically possible and makes sense for your study topic.

To implement random assignment , assign a unique number to every member of your study’s sample .

Then, you can use a random number generator or a lottery method to randomly assign each number to a control or experimental group. You can also do so manually, by flipping a coin or rolling a die to randomly assign participants to groups.
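
As a minimal sketch of this procedure using only Python’s standard library (the sample size and group labels are arbitrary):

```python
import random

random.seed(1)  # fixed seed so the example is reproducible

participant_ids = list(range(1, 21))  # 20 numbered participants
random.shuffle(participant_ids)       # randomize the order

half = len(participant_ids) // 2
control_group = sorted(participant_ids[:half])
experimental_group = sorted(participant_ids[half:])

print("Control group:     ", control_group)
print("Experimental group:", experimental_group)
```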

Random selection, or random sampling , is a way of selecting members of a population for your study’s sample.

In contrast, random assignment is a way of sorting the sample into control and experimental groups.

Random sampling enhances the external validity or generalizability of your results, while random assignment improves the internal validity of your study.

In experimental research, random assignment is a way of placing participants from your sample into different groups using randomization. With this method, every member of the sample has a known or equal chance of being placed in a control group or an experimental group.

“Controlling for a variable” means measuring extraneous variables and accounting for them statistically to remove their effects on other variables.

Researchers often model control variable data along with independent and dependent variable data in regression analyses and ANCOVAs . That way, you can isolate the control variable’s effects from the relationship between the variables of interest.
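
For illustration, a regression that includes a control variable alongside the independent variable might look like the following sketch; it assumes the statsmodels, pandas, and numpy libraries, and the variable names and simulated data are hypothetical.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 500
age = rng.normal(40, 10, n)                              # control variable
exercise = rng.normal(5, 2, n) - 0.05 * age              # independent variable, related to age
health = 2 * exercise - 0.3 * age + rng.normal(0, 1, n)  # dependent variable

df = pd.DataFrame({"health": health, "exercise": exercise, "age": age})

# Adding 'age' to the model separates the exercise effect from the age effect
model = smf.ols("health ~ exercise + age", data=df).fit()
print(model.params)  # the 'exercise' coefficient is close to the true value of 2
```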

Control variables help you establish a correlational or causal relationship between variables by enhancing internal validity .

If you don’t control relevant extraneous variables , they may influence the outcomes of your study, and you may not be able to demonstrate that your results are really an effect of your independent variable .

A control variable is any variable that’s held constant in a research study. It’s not a variable of interest in the study, but it’s controlled because it could influence the outcomes.

Including mediators and moderators in your research helps you go beyond studying a simple relationship between two variables for a fuller picture of the real world. They are important to consider when studying complex correlational or causal relationships.

Mediators are part of the causal pathway of an effect, and they tell you how or why an effect takes place. Moderators usually help you judge the external validity of your study by identifying the limitations of when the relationship between variables holds.

If something is a mediating variable :

  • It’s caused by the independent variable .
  • It influences the dependent variable
  • When it’s taken into account, the statistical correlation between the independent and dependent variables is weaker than when it isn’t considered.

A confounder is a third variable that affects variables of interest and makes them seem related when they are not. In contrast, a mediator is the mechanism of a relationship between two variables: it explains the process by which they are related.

A mediator variable explains the process through which two variables are related, while a moderator variable affects the strength and direction of that relationship.

There are three key steps in systematic sampling :

  • Define and list your population, ensuring that it is not arranged in a cyclical or periodic pattern.
  • Decide on your sample size and calculate your interval, k, by dividing the population size by your target sample size.
  • Choose every k th member of the population as your sample.
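
A minimal sketch of these three steps in Python (standard library only; the population and sample size are hypothetical):

```python
import random

random.seed(3)

population = [f"member_{i}" for i in range(1, 1001)]  # listed population of 1,000
sample_size = 50
k = len(population) // sample_size                    # sampling interval: k = 20

start = random.randint(0, k - 1)          # random starting point within the first interval
systematic_sample = population[start::k]  # then every k-th member

print(len(systematic_sample), systematic_sample[:3])
```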

Systematic sampling is a probability sampling method where researchers select members of the population at a regular interval – for example, by selecting every 15th person on a list of the population. If the population is in a random order, this can imitate the benefits of simple random sampling .

Yes, you can create a stratified sample using multiple characteristics, but you must ensure that every participant in your study belongs to one and only one subgroup. In this case, you multiply the numbers of subgroups for each characteristic to get the total number of groups.

For example, if you were stratifying by location with three subgroups (urban, rural, or suburban) and marital status with five subgroups (single, divorced, widowed, married, or partnered), you would have 3 x 5 = 15 subgroups.
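
The subgroups can be enumerated directly, for example with Python’s itertools (a small illustrative sketch):

```python
from itertools import product

location = ["urban", "rural", "suburban"]
marital_status = ["single", "divorced", "widowed", "married", "partnered"]

strata = list(product(location, marital_status))
print(len(strata))  # 3 x 5 = 15 subgroups
print(strata[:3])   # ('urban', 'single'), ('urban', 'divorced'), ('urban', 'widowed')
```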

You should use stratified sampling when your sample can be divided into mutually exclusive and exhaustive subgroups that you believe will take on different mean values for the variable that you’re studying.

Using stratified sampling will allow you to obtain more precise (with lower variance ) statistical estimates of whatever you are trying to measure.

For example, say you want to investigate how income differs based on educational attainment, but you know that this relationship can vary based on race. Using stratified sampling, you can ensure you obtain a large enough sample from each racial group, allowing you to draw more precise conclusions.

In stratified sampling , researchers divide subjects into subgroups called strata based on characteristics that they share (e.g., race, gender, educational attainment).

Once divided, each subgroup is randomly sampled using another probability sampling method.
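
A minimal sketch of proportionate stratified sampling, assuming the pandas library; the sampling frame and the 10% sampling fraction are hypothetical.

```python
import pandas as pd

# Hypothetical sampling frame with a stratifying characteristic
population = pd.DataFrame({
    "person_id": range(1, 1001),
    "education": ["high school"] * 500 + ["bachelor"] * 300 + ["graduate"] * 200,
})

# Draw a 10% simple random sample within each stratum (proportionate allocation)
stratified_sample = population.groupby("education").sample(frac=0.10, random_state=1)

print(stratified_sample["education"].value_counts())  # 50, 30, 20
```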

Cluster sampling is more time- and cost-efficient than other probability sampling methods , particularly when it comes to large samples spread across a wide geographical area.

However, it provides less statistical certainty than other methods, such as simple random sampling , because it is difficult to ensure that your clusters properly represent the population as a whole.

There are three types of cluster sampling : single-stage, double-stage and multi-stage clustering. In all three types, you first divide the population into clusters, then randomly select clusters for use in your sample.

  • In single-stage sampling , you collect data from every unit within the selected clusters.
  • In double-stage sampling , you select a random sample of units from within the clusters.
  • In multi-stage sampling , you repeat the procedure of randomly sampling elements from within the clusters until you have reached a manageable sample.

Cluster sampling is a probability sampling method in which you divide a population into clusters, such as districts or schools, and then randomly select some of these clusters as your sample.

The clusters should ideally each be mini-representations of the population as a whole.
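
A sketch of single-stage and double-stage cluster sampling using only Python’s standard library; the “schools” and sample sizes are hypothetical.

```python
import random

random.seed(0)

# Hypothetical population grouped into 20 clusters (e.g., schools) of 30 students each
clusters = {f"school_{i}": [f"student_{i}_{j}" for j in range(1, 31)] for i in range(1, 21)}

# Stage 1: randomly select a subset of clusters
chosen = random.sample(list(clusters), k=5)

# Single-stage: collect data from every unit in the selected clusters
single_stage_sample = [unit for c in chosen for unit in clusters[c]]

# Double-stage: randomly sample units within each selected cluster
double_stage_sample = [unit for c in chosen for unit in random.sample(clusters[c], k=10)]

print(len(single_stage_sample), len(double_stage_sample))  # 150 and 50
```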

If properly implemented, simple random sampling is usually the best sampling method for ensuring both internal and external validity . However, it can sometimes be impractical and expensive to implement, depending on the size of the population to be studied.

If you have a list of every member of the population and the ability to reach whichever members are selected, you can use simple random sampling.

The American Community Survey is an example of simple random sampling . In order to collect detailed data on the population of the US, Census Bureau officials randomly select 3.5 million households per year and use a variety of methods to convince them to fill out the survey.

Simple random sampling is a type of probability sampling in which the researcher randomly selects a subset of participants from a population . Each member of the population has an equal chance of being selected. Data is then collected from as large a percentage as possible of this random subset.
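
A minimal sketch with Python’s standard library (the population list and sample size are hypothetical):

```python
import random

random.seed(42)

population = [f"person_{i}" for i in range(1, 10001)]  # complete sampling frame of 10,000

# Every member has an equal chance of being selected
simple_random_sample = random.sample(population, k=500)

print(len(simple_random_sample), simple_random_sample[:3])
```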

Quasi-experimental design is most useful in situations where it would be unethical or impractical to run a true experiment .

Quasi-experiments have lower internal validity than true experiments, but they often have higher external validity  as they can use real-world interventions instead of artificial laboratory settings.

A quasi-experiment is a type of research design that attempts to establish a cause-and-effect relationship. The main difference with a true experiment is that the groups are not randomly assigned.

Blinding is important to reduce research bias (e.g., observer bias , demand characteristics ) and ensure a study’s internal validity .

If participants know whether they are in a control or treatment group , they may adjust their behavior in ways that affect the outcome that researchers are trying to measure. If the people administering the treatment are aware of group assignment, they may treat participants differently and thus directly or indirectly influence the final results.

  • In a single-blind study , only the participants are blinded.
  • In a double-blind study , both participants and experimenters are blinded.
  • In a triple-blind study , the assignment is hidden not only from participants and experimenters, but also from the researchers analyzing the data.

Blinding means hiding who is assigned to the treatment group and who is assigned to the control group in an experiment .

A true experiment (a.k.a. a controlled experiment) always includes at least one control group that doesn’t receive the experimental treatment.

However, some experiments use a within-subjects design to test treatments without a control group. In these designs, you usually compare one group’s outcomes before and after a treatment (instead of comparing outcomes between different groups).

For strong internal validity , it’s usually best to include a control group if possible. Without a control group, it’s harder to be certain that the outcome was caused by the experimental treatment and not by other variables.

An experimental group, also known as a treatment group, receives the treatment whose effect researchers wish to study, whereas a control group does not. They should be identical in all other ways.

Individual Likert-type questions are generally considered ordinal data , because the items have clear rank order, but don’t have an even distribution.

Overall Likert scale scores are sometimes treated as interval data. These scores are considered to have directionality and even spacing between them.

The type of data determines what statistical tests you should use to analyze your data.

A Likert scale is a rating scale that quantitatively assesses opinions, attitudes, or behaviors. It is made up of 4 or more questions that measure a single attitude or trait when response scores are combined.

To use a Likert scale in a survey , you present participants with Likert-type questions or statements, and a continuum of items, usually with 5 or 7 possible responses, to capture their degree of agreement.

In scientific research, concepts are the abstract ideas or phenomena that are being studied (e.g., educational achievement). Variables are properties or characteristics of the concept (e.g., performance at school), while indicators are ways of measuring or quantifying variables (e.g., yearly grade reports).

The process of turning abstract concepts into measurable variables and indicators is called operationalization .

There are various approaches to qualitative data analysis , but they all share five steps in common:

  • Prepare and organize your data.
  • Review and explore your data.
  • Develop a data coding system.
  • Assign codes to the data.
  • Identify recurring themes.

The specifics of each step depend on the focus of the analysis. Some common approaches include textual analysis , thematic analysis , and discourse analysis .

There are five common approaches to qualitative research :

  • Grounded theory involves collecting data in order to develop new theories.
  • Ethnography involves immersing yourself in a group or organization to understand its culture.
  • Narrative research involves interpreting stories to understand how people make sense of their experiences and perceptions.
  • Phenomenological research involves investigating phenomena through people’s lived experiences.
  • Action research links theory and practice in several cycles to drive innovative changes.

Hypothesis testing is a formal procedure for investigating our ideas about the world using statistics. It is used by scientists to test specific predictions, called hypotheses , by calculating how likely it is that a pattern or relationship between variables could have arisen by chance.
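
For instance, a two-sample t test on simulated data shows the basic mechanics; this sketch assumes the scipy and numpy libraries, and the groups and effect size are hypothetical.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
control = rng.normal(loc=50, scale=10, size=100)    # e.g., outcome scores without treatment
treatment = rng.normal(loc=55, scale=10, size=100)  # e.g., outcome scores with treatment

t_stat, p_value = stats.ttest_ind(treatment, control)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
# A small p-value means the observed difference would be unlikely
# if the null hypothesis of no difference were true.
```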

Operationalization means turning abstract conceptual ideas into measurable observations.

For example, the concept of social anxiety isn’t directly observable, but it can be operationally defined in terms of self-rating scores, behavioral avoidance of crowded places, or physical anxiety symptoms in social situations.

Before collecting data , it’s important to consider how you will operationalize the variables that you want to measure.

When conducting research, collecting original data has significant advantages:

  • You can tailor data collection to your specific research aims (e.g. understanding the needs of your consumers or user testing your website)
  • You can control and standardize the process for high reliability and validity (e.g. choosing appropriate measurements and sampling methods )

However, there are also some drawbacks: data collection can be time-consuming, labor-intensive and expensive. In some cases, it’s more efficient to use secondary data that has already been collected by someone else, but the data might be less reliable.

Data collection is the systematic process by which observations or measurements are gathered in research. It is used in many different contexts by academics, governments, businesses, and other organizations.

A confounding variable is closely related to both the independent and dependent variables in a study. An independent variable represents the supposed cause , while the dependent variable is the supposed effect . A confounding variable is a third variable that influences both the independent and dependent variables.

Failing to account for confounding variables can cause you to wrongly estimate the relationship between your independent and dependent variables.

To ensure the internal validity of your research, you must consider the impact of confounding variables. If you fail to account for them, you might over- or underestimate the causal relationship between your independent and dependent variables , or even find a causal relationship where none exists.

Yes, but including more than one of either type requires multiple research questions .

For example, if you are interested in the effect of a diet on health, you can use multiple measures of health: blood sugar, blood pressure, weight, pulse, and many more. Each of these is its own dependent variable with its own research question.

You could also choose to look at the effect of exercise levels as well as diet, or even the additional effect of the two combined. Each of these is a separate independent variable .

To ensure the internal validity of an experiment , you should only change one independent variable at a time.

No. The value of a dependent variable depends on an independent variable, so a variable cannot be both independent and dependent at the same time. It must be either the cause or the effect, not both!

You want to find out how blood sugar levels are affected by drinking diet soda and regular soda, so you conduct an experiment .

  • The type of soda – diet or regular – is the independent variable .
  • The level of blood sugar that you measure is the dependent variable – it changes depending on the type of soda.

Determining cause and effect is one of the most important parts of scientific research. It’s essential to know which is the cause – the independent variable – and which is the effect – the dependent variable.

In non-probability sampling , the sample is selected based on non-random criteria, and not every member of the population has a chance of being included.

Common non-probability sampling methods include convenience sampling , voluntary response sampling, purposive sampling , snowball sampling, and quota sampling .

Probability sampling means that every member of the target population has a known chance of being included in the sample.

Probability sampling methods include simple random sampling , systematic sampling , stratified sampling , and cluster sampling .

Using careful research design and sampling procedures can help you avoid sampling bias . Oversampling can be used to correct undercoverage bias .

Some common types of sampling bias include self-selection bias , nonresponse bias , undercoverage bias , survivorship bias , pre-screening or advertising bias, and healthy user bias.

Sampling bias is a threat to external validity – it limits the generalizability of your findings to a broader group of people.

A sampling error is the difference between a population parameter and a sample statistic .

A statistic refers to measures about the sample , while a parameter refers to measures about the population .

Populations are used when a research question requires data from every member of the population. This is usually only feasible when the population is small and easily accessible.

Samples are used to make inferences about populations . Samples are easier to collect data from because they are practical, cost-effective, convenient, and manageable.

There are seven threats to external validity : selection bias , history, experimenter effect, Hawthorne effect , testing effect, aptitude-treatment and situation effect.

The two types of external validity are population validity (whether you can generalize to other groups of people) and ecological validity (whether you can generalize to other situations and settings).

The external validity of a study is the extent to which you can generalize your findings to different groups of people, situations, and measures.

Cross-sectional studies cannot establish a cause-and-effect relationship or analyze behavior over a period of time. To investigate cause and effect, you need to do a longitudinal study or an experimental study .

Cross-sectional studies are less expensive and time-consuming than many other types of study. They can provide useful insights into a population’s characteristics and identify correlations for further research.

Sometimes only cross-sectional data is available for analysis; other times your research question may only require a cross-sectional study to answer it.

Longitudinal studies can last anywhere from weeks to decades, although they tend to be at least a year long.

The 1970 British Cohort Study , which has collected data on the lives of 17,000 Brits since their births in 1970, is one well-known example of a longitudinal study .

Longitudinal studies are better to establish the correct sequence of events, identify changes over time, and provide insight into cause-and-effect relationships, but they also tend to be more expensive and time-consuming than other types of studies.

Longitudinal studies and cross-sectional studies are two different types of research design . In a cross-sectional study you collect data from a population at a specific point in time; in a longitudinal study you repeatedly collect data from the same sample over an extended period of time.

Longitudinal study:

  • Repeated observations
  • Observes the same sample multiple times
  • Follows changes in participants over time

Cross-sectional study:

  • Observations at a single point in time
  • Observes different samples (a “cross-section”) in the population
  • Provides a snapshot of society at a given point

There are eight threats to internal validity : history, maturation, instrumentation, testing, selection bias , regression to the mean, social interaction and attrition .

Internal validity is the extent to which you can be confident that a cause-and-effect relationship established in a study cannot be explained by other factors.

In mixed methods research , you use both qualitative and quantitative data collection and analysis methods to answer your research question .

The research methods you use depend on the type of data you need to answer your research question .

  • If you want to measure something or test a hypothesis , use quantitative methods . If you want to explore ideas, thoughts and meanings, use qualitative methods .
  • If you want to analyze a large amount of readily-available data, use secondary data. If you want data specific to your purposes with control over how it is generated, collect primary data.
  • If you want to establish cause-and-effect relationships between variables , use experimental methods. If you want to understand the characteristics of a research subject, use descriptive methods.

A confounding variable , also called a confounder or confounding factor, is a third variable in a study examining a potential cause-and-effect relationship.

A confounding variable is related to both the supposed cause and the supposed effect of the study. It can be difficult to separate the true effect of the independent variable from the effect of the confounding variable.

In your research design , it’s important to identify potential confounding variables and plan how you will reduce their impact.

Discrete and continuous variables are two types of quantitative variables :

  • Discrete variables represent counts (e.g. the number of objects in a collection).
  • Continuous variables represent measurable amounts (e.g. water volume or weight).

Quantitative variables are any variables where the data represent amounts (e.g. height, weight, or age).

Categorical variables are any variables where the data represent groups. This includes rankings (e.g. finishing places in a race), classifications (e.g. brands of cereal), and binary outcomes (e.g. coin flips).

You need to know what type of variables you are working with to choose the right statistical test for your data and interpret your results .

You can think of independent and dependent variables in terms of cause and effect: an independent variable is the variable you think is the cause , while a dependent variable is the effect .

In an experiment, you manipulate the independent variable and measure the outcome in the dependent variable. For example, in an experiment about the effect of nutrients on crop growth:

  • The  independent variable  is the amount of nutrients added to the crop field.
  • The  dependent variable is the biomass of the crops at harvest time.

Defining your variables, and deciding how you will manipulate and measure them, is an important part of experimental design .

Experimental design means planning a set of procedures to investigate a relationship between variables . To design a controlled experiment, you need:

  • A testable hypothesis
  • At least one independent variable that can be precisely manipulated
  • At least one dependent variable that can be precisely measured

When designing the experiment, you decide:

  • How you will manipulate the variable(s)
  • How you will control for any potential confounding variables
  • How many subjects or samples will be included in the study
  • How subjects will be assigned to treatment levels

Experimental design is essential to the internal and external validity of your experiment.

Internal validity is the degree of confidence that the causal relationship you are testing is not influenced by other factors or variables.

External validity is the extent to which your results can be generalized to other contexts.

The validity of your experiment depends on your experimental design .

Reliability and validity are both about how well a method measures something:

  • Reliability refers to the  consistency of a measure (whether the results can be reproduced under the same conditions).
  • Validity   refers to the  accuracy of a measure (whether the results really do represent what they are supposed to measure).

If you are doing experimental research, you also have to consider the internal and external validity of your experiment.

A sample is a subset of individuals from a larger population . Sampling means selecting the group that you will actually collect data from in your research. For example, if you are researching the opinions of students in your university, you could survey a sample of 100 students.

In statistics, sampling allows you to test a hypothesis about the characteristics of a population.

Quantitative research deals with numbers and statistics, while qualitative research deals with words and meanings.

Quantitative methods allow you to systematically measure variables and test hypotheses . Qualitative methods allow you to explore concepts and experiences in more detail.

Methodology refers to the overarching strategy and rationale of your research project . It involves studying the methods used in your field and the theories or principles behind them, in order to develop an approach that matches your objectives.

Methods are the specific tools and procedures you use to collect and analyze data (for example, experiments, surveys , and statistical tests ).

In shorter scientific papers, where the aim is to report the findings of a specific study, you might simply describe what you did in a methods section .

In a longer or more complex research project, such as a thesis or dissertation , you will probably include a methodology section , where you explain your approach to answering the research questions and cite relevant sources to support your choice of methods.


Controlled Experiment


A controlled experiment is a study in which a hypothesis is scientifically tested.

In a controlled experiment, an independent variable (the cause) is systematically manipulated, and the dependent variable (the effect) is measured; any extraneous variables are controlled.

The researcher can operationalize (i.e., define) the studied variables so they can be objectively measured. The quantitative data can be analyzed to see if there is a difference between the experimental and control groups.

controlled experiment cause and effect

What is the control group?

In experiments scientists compare a control group and an experimental group that are identical in all respects, except for one difference – experimental manipulation.

Unlike the experimental group, the control group is not exposed to the independent variable under investigation and so provides a baseline against which any changes in the experimental group can be compared.

Since experimental manipulation is the only difference between the experimental and control groups, we can be sure that any differences between the two are due to experimental manipulation rather than chance.

Randomly allocating participants to independent variable groups means that all participants should have an equal chance of participating in each condition.

The principle of random allocation is to avoid bias in how the experiment is carried out and limit the effects of participant variables.

control group experimental group

What are extraneous variables?

The researcher wants to ensure that the manipulation of the independent variable is what has caused the changes in the dependent variable.

Hence, all other variables that could cause the dependent variable to change must be controlled. These other variables are called extraneous or confounding variables.

Extraneous variables should be controlled where possible, as they might be important enough to provide alternative explanations for the effects.

controlled experiment extraneous variables

In practice, it would be difficult to control all the variables in a child’s educational achievement. For example, it would be difficult to control variables that have happened in the past.

A researcher can only control the current environment of participants, such as time of day and noise levels.

controlled experiment variables

Why conduct controlled experiments?

Scientists use controlled experiments because they allow for precise control of extraneous and independent variables. This allows a cause-and-effect relationship to be established.

Controlled experiments also follow a standardized step-by-step procedure. This makes it easy for another researcher to replicate the study.

Key Terminology

Experimental group.

The group being treated or otherwise manipulated for the sake of the experiment.

Control Group

The group that receives no treatment and is used as a comparison group.

Ecological validity

The degree to which an investigation represents real-life experiences.

Experimenter effects

These are the ways that the experimenter can accidentally influence the participant through their appearance or behavior.

Demand characteristics

The clues in an experiment that lead the participants to think they know what the researcher is looking for (e.g., the experimenter’s body language).

Independent variable (IV)

The variable the experimenter manipulates (i.e., changes); it is assumed to have a direct effect on the dependent variable.

Dependent variable (DV)

Variable the experimenter measures. This is the outcome (i.e., the result) of a study.

Extraneous variables (EV)

All variables that are not independent variables but could affect the results (DV) of the experiment. Extraneous variables should be controlled where possible.

Confounding variables

Variable(s) that have affected the results (DV), apart from the IV. A confounding variable could be an extraneous variable that has not been controlled.

Random Allocation

Randomly allocating participants to independent variable conditions means that all participants should have an equal chance of participating in each condition.

Order effects

Changes in participants’ performance due to their repeating the same or similar test more than once. Examples of order effects include:

(i) practice effect: an improvement in performance on a task due to repetition, for example, because of familiarity with the task;

(ii) fatigue effect: a decrease in performance of a task due to repetition, for example, because of boredom or tiredness.

What is the control in an experiment?

In an experiment , the control is a standard or baseline group not exposed to the experimental treatment or manipulation. It serves as a comparison group to the experimental group, which does receive the treatment or manipulation.

The control group helps to account for other variables that might influence the outcome, allowing researchers to attribute differences in results more confidently to the experimental treatment.

This comparison is critical for establishing a cause-and-effect relationship between the manipulated variable (independent variable) and the outcome (dependent variable).

What is the purpose of controlling the environment when testing a hypothesis?

Controlling the environment when testing a hypothesis aims to eliminate or minimize the influence of extraneous variables. These are variables other than the independent variable that might affect the dependent variable, potentially confounding the results.

By controlling the environment, researchers can ensure that any observed changes in the dependent variable are likely due to the manipulation of the independent variable, not other factors.

This enhances the experiment’s validity, allowing for more accurate conclusions about cause-and-effect relationships.

It also improves the experiment’s replicability, meaning other researchers can repeat the experiment under the same conditions to verify the results.

Why are hypotheses important to controlled experiments?

Hypotheses are crucial to controlled experiments because they provide a clear focus and direction for the research. A hypothesis is a testable prediction about the relationship between variables.

It guides the design of the experiment, including what variables to manipulate (independent variables) and what outcomes to measure (dependent variables).

The experiment is then conducted to test the validity of the hypothesis. If the results align with the hypothesis, they provide evidence supporting it.

The hypothesis may be revised or rejected if the results do not align. Thus, hypotheses are central to the scientific method, driving the iterative inquiry, experimentation, and knowledge advancement process.

What is the experimental method?

The experimental method is a systematic approach in scientific research where an independent variable is manipulated to observe its effect on a dependent variable, under controlled conditions.


Indian Dermatol Online J, v.10(5); Sep-Oct 2019

Selection of Control, Randomization, Blinding, and Allocation Concealment

Department of Pharmacology, Rampurhat Government Medical College, Rampurhat, Birbhum, West Bengal, India

Piyush Kumar

1 Department of Dermatology, Katihar Medical College and Hospital, Bihar, India

Rajesh Kumar

2 Department of Dermatology, Grant Medical College and Bombay Hospital Institute of Medical Sciences, Mumbai, Maharashtra, India

Nilay Kanti Das

3 Department of Dermatology, Bankura Sammilani Medical College, Kenduadihi, Bankura, West Bengal, India

Clinical trials that compare treatments must have certain checks in place. Appropriate selection of the “control” group against which the investigational agent is compared is essential to rule out selection bias. Randomization is another step, taken to minimize variability due to “confounders.” With randomization, research participants have an equal chance of being assigned to any treatment group of the study, which generates comparable intervention groups and distributes the confounders. A trial can be “open label” or “blinded.” Blinding keeps the participant and/or the assessing physician unaware of the treatment received, so that bias arising from personal preference or from the subjective component of outcome assessment is eliminated. Concealment of allocation is done as the participant enters the trial; it secures randomization and prevents selection bias.

Introduction

Clinical trials, or interventional studies, should be designed so that they give a comprehensive idea of the effectiveness/efficacy or safety of any new agent introduced for the treatment of a clinical condition. To establish that the improvement (or deterioration) is not happening by chance, the treatment modality must be compared against another modality of treatment (active control) or no treatment (placebo control). Thus, the role of the control is paramount: it decides the level of evidence of a trial and, in turn, the grade of recommendation.

Apart from the control, another important factor that can affect the interpretation of results in any clinical trial is “bias.” Bias can arise while selecting the participants and the controls (selection bias), from confounding factors (confounding bias), and while assessing the outcome (assessment bias). Randomization is the method adopted to eliminate selection and confounding bias. It has two steps: generation of the random sequence and concealment of that sequence from the dispensing physician (allocation concealment). To eliminate assessment bias, the method adopted is blinding, which can be applied at different levels: the trial participant, the assessing physician, and even the statistician analyzing the results.

This article elaborates on these facets of clinical trials and offers practical clues for implementing them.

A. Selection of control- The “control” is used in clinical trials to nullify the effect of known or unknown factors (other than the factor being tested) on the research outcome and hence to increase the reliability of the results. For example, if a new topical medication is shown to be effective in psoriasis patients, the inclusion of a control group allows the investigator to conclude that the new medication is truly effective and that the improvement did not happen by chance.

Controls in case-control studies:

While choosing controls, two principles should be followed:[ 1 ]

  • The control or comparison group should be representative of the source population from which the cases are derived
  • The controls should be selected independently of the exposure, i.e., not chosen on the basis of their exposure status.

For example, to test whether topical corticosteroid use causes unresponsiveness to standard antifungal therapy, it is essential that cases (those who have used a steroid) and controls (those who have not) be chosen from the same socio-cultural strata, to eliminate confounders such as a hot and humid working environment, cleanliness, etc. This prevents selection bias. Thus, a control minimizes the effects of variables other than the variable under evaluation.

Controls in clinical trials:

In making a decision about a new treatment, the control arm is usually the “gold standard treatment” (or the “best available treatment”). Comparison between the “test” arm (or “experimental” arm) and the “control” arm in such clinical studies allows a meaningful assessment of the new treatment relative to the existing one and increases the reliability of the study.

  • Placebo control: A placebo is an inactive substance that looks like the drug or treatment being tested.[ 2 ] A placebo control may be used where no standard treatment exists; where an approved treatment exists, using a placebo control becomes unethical and amounts to substandard care for patients with active disease. Guidelines state that there should be “clinical equipoise” before a placebo-controlled trial is started. Clinical equipoise means that, at the time the trial is designed, no intervention is known to be better for either the control or the experimental group, e.g., a trial of an experimental drug in systemic sclerosis, for which there is no gold-standard therapy.[ 3 ] The use of “active” or “historical” controls (described below) can address this issue. Participants who receive the placebo will not benefit from the trial; this therapeutic misconception should be eliminated during the informed consent process.
  • Dose-response control: A new dose of a known drug makes it a “new drug.”[ 4 ] During its clinical trial, the control is usually the previously used dose, e.g., 10 mg levocetirizine compared with 5 mg levocetirizine
  • Active control: Here, the control is an active drug, usually the standard therapy or a known effective treatment
  • Historical control: This uses data from previously conducted studies and administrative databases; the data may come from a prospective natural history study or from the control group of a previous randomized controlled trial.

B. Randomization

Minimizing unwanted variability is at the core of conducting good research. Much of this variability comes from “confounders,” which can be known or unknown; confounders can generate erroneous results because of the unmeasured effects of such variables. The process by which confounding is reduced is known as “randomization.” With randomization, research participants have an equal chance of being assigned to any treatment group of the study, which generates comparable intervention groups and distributes the confounders. The confounders can therefore be ignored, and the difference in outcome can be attributed to treatment alone. However, if one wants to gain greater experience with a new treatment or drug, one may opt for “unequal randomization,” i.e., randomization in a 2:1 ratio (two-thirds of patients on the new treatment). The power of the study is somewhat reduced [it decreases from 0.95 (for 1:1) to 0.925 (for 2:1)], but the technique is statistically feasible and is especially suited to phase II randomized trials.
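To make the allocation-ratio trade-off concrete, here is a minimal sketch assuming a two-sample t-test framework and the Python statsmodels library; the effect size and total sample size are illustrative assumptions rather than values taken from the article, although with these inputs the computed powers come out close to the 0.95 and 0.925 quoted above.

```python
# Illustrative only: power of a two-sample t-test under 1:1 vs 2:1 allocation,
# holding the total sample size fixed. Effect size and total N are assumptions.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
effect_size = 0.5   # assumed standardized difference (Cohen's d)
total_n = 210       # assumed total number of participants
alpha = 0.05

for label, ratio in [("1:1", 1.0), ("2:1", 0.5)]:
    # nobs1 is the size of the first group; ratio = nobs2 / nobs1
    nobs1 = total_n / (1 + ratio)
    power = analysis.power(effect_size=effect_size, nobs1=nobs1,
                           alpha=alpha, ratio=ratio)
    print(f"{label} allocation ({nobs1:.0f}/{total_n - nobs1:.0f}): power = {power:.3f}")
```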

Benefits of randomization:

  • Balances the treatment groups with respect to baseline variability and known and unknown confounding factors, thus eliminating “confounding bias”
  • Eliminates “selection bias.” Selection bias occurs when the researcher, voluntarily or involuntarily, steers the less sick patients to the treatment he or she feels is better, and vice versa
  • Forms the basis for statistical tests.

How to randomize?

  • Computer-generated random number table: statistical software provides options for equal or unequal allocation, stratification, etc. (a minimal sketch follows this list)
  • Random number table from a statistical textbook
  • For smaller experiments: tossing a coin (heads-control and tails-treatment), rolling a die (≤3-treatment and >3-control), or using a shuffled deck of cards (even-Group A and odd-Group B). However, these methods have largely been replaced by the methods above.
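As a concrete illustration of computer-generated allocation, here is a minimal sketch of simple randomization using only Python's standard library; the group labels, sample size, and seed are hypothetical.

```python
# Simple randomization: each participant is independently assigned to a group
# with equal probability, irrespective of previous assignments.
import random

def simple_randomization(n_participants, groups=("A", "B"), seed=2024):
    """Return one group label per participant."""
    rng = random.Random(seed)  # fixed seed so the sequence can be reproduced and audited
    return [rng.choice(groups) for _ in range(n_participants)]

if __name__ == "__main__":
    allocation = simple_randomization(20)
    print(allocation)
    # With small samples the arms may end up noticeably unequal -- the limitation
    # noted under "Simple randomization" below.
    print({group: allocation.count(group) for group in set(allocation)})
```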

What is not randomization?

  • Alternate assignment: Study participants alternatively assigned to treatment, e.g., Odd numbers go to Treatment A and even numbers go to Treatment B
  • Assignment according to the date of entry to the study, e.g., the first 2 weeks of the month to Group A, the next 2 weeks to Group B
  • Assignment according to the days of the week, e.g., Monday OPD patients to Group A, Wednesday OPD patients to Group B. This gives rise to Berksonian bias.

Techniques for randomization

  • Simple randomization: Randomization according to a single sequence of random assignments is known as simple randomization.[ 5 ] Assignment to the treatment groups is random and not concerned with other variables, e.g., a coin toss or a roll of dice. This is the simplest approach to randomization. In clinical studies with a large sample size (at least 1,000 participants), simple randomization usually balances the number of subjects in each group. However, simple randomization can be problematic in smaller samples, resulting in unequal numbers of participants in the treatment arms

  • Block randomization: Participants are randomized within blocks of a fixed size, each block containing a balanced number of assignments to each group, so that group sizes remain approximately equal throughout recruitment, e.g., block randomization of two treatment groups A and B, with 5 blocks of fixed size 10:

Block 1: A B B A A B B A B A

Block 2: A B B B A B A A B A

Block 3: A B B A B A B A A B

Block 4: A B B B B A A A A B

Block 5: B A B A B A B B A A

  • Stratified randomization: When specific variables are known to influence the outcome, the sample is stratified so that these variables (e.g., age, gender, weight, prognostic status) remain as similar as possible between the treatment groups; this achieves balance on baseline characteristics. First, the stratification variables are identified and strata are created; participants are assigned to their strata, and simple randomization is then applied within each stratum to assign subjects to the groups, e.g., when assessing the results of immunotherapy for viral warts, stratification can be done by type of wart, viz. verruca vulgaris, verruca plana, plantar wart, and condyloma acuminata (see the sketch after this list)
  • Cluster randomization: This method randomizes groups of people instead of individuals and is also known as “group randomization.” Cluster randomization is particularly favored to avoid complaints among people living in close vicinity, e.g., vaccine trials in which all participants from the same locality receive the same vaccine, lifestyle modification studies, and studies involving nutritional interventions. Here, the sampling units are groups and not individuals.
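To show the mechanics of block and stratified randomization together, here is a minimal sketch of permuted-block randomization applied within strata, again in plain Python; the stratum names (borrowed from the wart example above), block size, number of blocks, and seed are illustrative assumptions.

```python
# Permuted-block randomization within strata (illustrative sketch).
# Each stratum gets its own sequence of shuffled, balanced blocks, so the
# treatment groups stay balanced within every stratum as recruitment proceeds.
import random

def blocked_sequence(n_blocks, block_size=4, groups=("A", "B"), rng=None):
    """Generate a randomization sequence made of balanced, shuffled blocks."""
    rng = rng or random.Random()
    sequence = []
    per_group = block_size // len(groups)
    for _ in range(n_blocks):
        block = list(groups) * per_group   # e.g. ["A", "B", "A", "B"]
        rng.shuffle(block)                 # balanced but unpredictable order
        sequence.extend(block)
    return sequence

def stratified_block_randomization(strata, n_blocks=3, block_size=4, seed=7):
    """Return one blocked sequence per stratum."""
    rng = random.Random(seed)
    return {stratum: blocked_sequence(n_blocks, block_size, rng=rng)
            for stratum in strata}

if __name__ == "__main__":
    # Hypothetical strata, after the wart-immunotherapy example above.
    strata = ["verruca vulgaris", "verruca plana", "plantar wart", "condyloma acuminata"]
    for stratum, seq in stratified_block_randomization(strata).items():
        print(f"{stratum}: {' '.join(seq)}")
```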

C. Blinding

A trial can be “open labeled” or “blinded.” By the process of blinding, we make the participant and/or the assessing physician unaware of the treatment he/she is going to receive. Thus, the bias that can creep in owing to personal preference or to the subjective component of outcome assessment (e.g., when a tool like the physician global score is used) can be eliminated. The process has now been extended to include the statistician analyzing the results, to make it foolproof. Blinding thus helps eliminate intentional or unintentional bias, increases the objectivity of results, and ensures the credibility of study conclusions.

Types of blinding:

  • Open-labeled or unblinded: All parties involved in a study are aware of the treatment the participants are receiving. Although blinding is desirable, sometimes it may not be possible or feasible. This type of study design suffers from low credibility but may be acceptable if endpoints are indisputably objective (e.g., survival or death)
  • Single-blind: Participants might drop out of a study or give a false assessment if they come to know that they are receiving “no treatment”; conversely, they might develop a placebo effect if they know they are receiving the “new treatment.” These biases can be eliminated by single-blinding, in which one group of individuals (usually the participants) does not know which intervention they are receiving. Conventionally, single-blind refers to participant-blinded, but logically the blinded group can also be the outcome assessor. Thus, a single-blind trial can be either participant-blind or assessor-blind, and it is better to specify who is blinded instead of simply saying single-blind
  • Double-blind: Like participants, the investigator/observer may influence the results of the study if they are aware of which group is receiving a particular treatment. For example, if the endpoint is subjective (e.g., a physician global scale), they might record a more favorable response for the treatment they prefer, and they might influence participants’ assessment of a particular treatment during follow-up visits. In double-blinding, neither the participant nor the investigator/observer/outcome assessor is aware of the treatment allotted. The investigator is the person carrying out the research; the observer or outcome assessor is the person who assesses the parameters of the study
  • Triple-blind: Triple-blinding is done to eliminate the bias of data analysts. In triple-blinding, the participant, investigator, and the data analyst are unaware of the treatment given.

However, instead of expressing whether the trial is single, double, or triple blinded, it is more pertinent to specify who exactly is going to be blinded.

Masking: It is a term used interchangeably with “blinding” and is usually used by ophthalmologists.

Advantages of blinding:

  • Avoids observation bias. For example, during the evaluation of a subjective score such as the urticaria severity score, blinding prevents the investigator from favoring the test drug
  • Can also reduce the opportunity for bias to enter into the evaluation of the trial results owing to the knowledge of the treatment.

Procedures of blinding a trial:

  • By using identical-looking dosage forms: for example, in a placebo-controlled trial, the placebo should match the active drug in shape, size, color, and odor

Investigational group = Active drug + Placebo

Control group = Placebo + Active control

  • Both the active drugs can be taken out from their packaging and repacked in similar looking opaque containers. The containers can be labeled according to randomization
  • The observer can be blinded by separating the room where the person is dispensing the drug and the person observing the effects of the drug.

Assessment of the efficacy of blinding:

A trial can become unblinded in several ways: untoward effects of the drug, the curiosity of participants or staff, differences in the taste or smell of the drug and the placebo, or a cross-over design; ideal placebos are not always easy to procure or manufacture. The assessment of blinding should therefore be done before the randomization code is broken. The participants, investigators, and staff are asked to guess which treatment each participant received. If about 50% of guesses in each group are correct, blinding has been maintained; if clearly more than 50% are correct, there has been a breach of blinding; if clearly fewer than 50% are correct, a non-admitted breach of blinding should be suspected.
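As a rough illustration of this check, the following sketch tabulates the proportion of correct treatment guesses per arm from hypothetical guess data; the data and the function name are made up for illustration, and this is not a formal blinding index.

```python
# Illustrative check of blinding: proportion of correct treatment guesses per arm.
# Around 50% correct is consistent with preserved blinding; well above 50%
# suggests unblinding; well below 50% suggests guesses were deliberately inverted.
from collections import defaultdict

def correct_guess_rates(records):
    """records: iterable of (arm_actually_received, arm_guessed) pairs."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for actual, guessed in records:
        totals[actual] += 1
        if guessed == actual:
            correct[actual] += 1
    return {arm: correct[arm] / totals[arm] for arm in totals}

if __name__ == "__main__":
    # Hypothetical participant guesses: (actual arm, guessed arm).
    guesses = [("drug", "drug"), ("drug", "placebo"), ("drug", "drug"),
               ("placebo", "placebo"), ("placebo", "drug"), ("placebo", "placebo")]
    for arm, rate in correct_guess_rates(guesses).items():
        print(f"{arm}: {rate:.0%} correct guesses")
```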

Instances when blinding may be broken:

  • The study is completed and data analyzed
  • For an individual patient during an emergency, e.g., a road traffic accident in a participant taking an antihistamine in an urticaria trial, or a participant in a psoriasis trial progressing to erythroderma.

Instances when blinding is not possible or difficult to achieve:

  • A surgical procedure is tested against a medical therapy, e.g., electrosurgery for pyogenic granuloma tested against topical timolol
  • A “sham procedure” is an improvisation that can be used to blind a surgical therapy, e.g., creating a dermal pocket without introducing any warty tissue, as against regular auto-inoculation of a wart, to blind the placebo arm.[ 8 ] Ethical issues limit the use of sham procedures, but in this instance the authors argued that the procedure was needed to rule out a psychological effect, which has a proven role in wart therapy.

What to do if blinding is not possible or ethical:

  • Researchers should ensure that the outcomes being measured are as objective as possible
  • In addition, a duplicate assessment of outcome may be considered and researchers should report the level of agreement achieved
  • Expertise-based trial design- It can be done for surgical procedures, where patients are randomly assigned to different surgeons
  • Partial blinding- Sometimes, independent blinded evaluators may be sufficient to reduce bias
  • Limitations and potential biases due to lack of blinding need to be acknowledged and discussed.

D. Allocation concealment

Concealment of allocation is done as the participant enters the trial. Concealment secures randomization and prevents “selection bias.”

Every researcher tries to prove his or her hypothesis correct. This can lead to conscious or unconscious steering of certain “good” patients to the desired group and of others to the alternate group. If the investigator knows the randomization sequence, such bias can create an imbalance in the study and lead to wrong conclusions. It can be avoided by the following allocation concealment techniques (a small illustrative sketch follows the list):

  • Third-party randomization by phone or pharmacy. In large multi-centric trials, an interactive voice response service is used to ensure allocation concealment across centers
  • Sequentially numbered, opaque, sealed envelope (SNOSE) technique: the randomization group is written on a piece of paper and kept in an opaque sealed envelope labeled with a serial number. The investigator opens the sealed envelope only after the patient has consented to participate and then assigns the treatment group accordingly
  • Sequentially numbered opaque containers: similar to SNOSE, but here, instead of a piece of paper, the medicines are stored in opaque containers according to the randomization, and the dispenser has no way of knowing which medicine is kept in which container.[ 9 ] Thus, the allocation is concealed.
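To illustrate the principle rather than any particular trial system, here is a minimal sketch in which the allocation sequence is generated up front and held by a third-party object, and the next assignment is revealed only once a participant has been enrolled; the class and all names are hypothetical.

```python
# Illustrative sketch of allocation concealment: the investigator never sees the
# full randomization list, only the next assignment once a participant is enrolled.
import random

class ConcealedAllocator:
    """Plays the role of the third party (pharmacy, phone service, or SNOSE stack)."""

    def __init__(self, n_slots, groups=("A", "B"), seed=42):
        rng = random.Random(seed)
        # Pre-generated, sequentially numbered allocation list, kept private.
        self._sequence = [rng.choice(groups) for _ in range(n_slots)]
        self._next_slot = 0

    def allocate(self, participant_id):
        """Reveal one assignment, only for a participant who has already consented."""
        if self._next_slot >= len(self._sequence):
            raise RuntimeError("Randomization list exhausted")
        group = self._sequence[self._next_slot]
        self._next_slot += 1
        return {"slot": self._next_slot, "participant": participant_id, "group": group}

if __name__ == "__main__":
    allocator = ConcealedAllocator(n_slots=4)
    print(allocator.allocate("P-001"))
    print(allocator.allocate("P-002"))
```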

E. Differences between allocation concealment and blinding

  • Purpose: allocation concealment conceals the randomization sequence; blinding makes the participant, the investigator, or both unaware of the treatment received
  • Bias prevented: allocation concealment prevents selection bias; blinding prevents observation bias
  • Timing in the trial: allocation concealment is done when the patient enters the trial (during recruitment); blinding operates after the patient has entered the trial (after recruitment)

To conclude, a randomized controlled trial (RCT) is the gold-standard study design for evaluating any therapeutic method and carries the highest level of evidence (Level Ib). As researchers, we are all interested in conducting RCTs and contributing to scientific knowledge. Choosing the correct control group and avoiding biases are the most important aspects of any RCT. There may be situations where blinding is not possible for operational reasons, but in every trial the effort should be put into randomization, which can eliminate two major biases: selection bias and confounding bias. Proper randomization ensures that baseline confounders are balanced; otherwise, complex statistical methods (e.g., multivariate analysis) are needed to balance them. This article attempts to provide practical tips for researchers interested in clinical trials, so that the data generated are more valid and credible.

Financial support and sponsorship

Conflicts of interest.

There are no conflicts of interest.
