Guide to Experimental Design | Overview, 5 steps & Examples

Published on December 3, 2019 by Rebecca Bevans. Revised on June 21, 2023.

Experiments are used to study causal relationships. You manipulate one or more independent variables and measure their effect on one or more dependent variables.

Experimental design creates a set of procedures to systematically test a hypothesis. A good experimental design requires a strong understanding of the system you are studying.

There are five key steps in designing an experiment:

  • Consider your variables and how they are related
  • Write a specific, testable hypothesis
  • Design experimental treatments to manipulate your independent variable
  • Assign subjects to groups, either between-subjects or within-subjects
  • Plan how you will measure your dependent variable

For valid conclusions, you also need to select a representative sample and control any extraneous variables that might influence your results; doing so minimizes several types of research bias, particularly sampling bias, survivorship bias, and attrition bias over time. If random assignment of participants to control and treatment groups is impossible, unethical, or highly difficult, consider an observational study instead.

Table of contents

  • Step 1: Define your variables
  • Step 2: Write your hypothesis
  • Step 3: Design your experimental treatments
  • Step 4: Assign your subjects to treatment groups
  • Step 5: Measure your dependent variable
  • Other interesting articles
  • Frequently asked questions about experiments

Step 1: Define your variables

You should begin with a specific research question. We will work with two research question examples, one from health sciences and one from ecology:

To translate your research question into an experimental hypothesis, you need to define the main variables and make predictions about how they are related.

Start by simply listing the independent and dependent variables.

Research question | Independent variable | Dependent variable
Phone use and sleep | Minutes of phone use before sleep | Hours of sleep per night
Temperature and soil respiration | Air temperature just above the soil surface | CO2 respired from soil

Then you need to think about possible extraneous and confounding variables and consider how you might control them in your experiment.

Research question | Extraneous variable | How to control
Phone use and sleep | Natural variation in sleep patterns among individuals | Measure the average difference between sleep with phone use and sleep without phone use, rather than the average amount of sleep per treatment group
Temperature and soil respiration | Soil moisture, which also affects respiration and can decrease with increasing temperature | Monitor soil moisture and add water to make sure that soil moisture is consistent across all treatment plots

Finally, you can put these variables together into a diagram. Use arrows to show the possible relationships between variables and include signs to show the expected direction of the relationships.

Diagram of the relationship between variables in a sleep experiment

Here we predict that increasing temperature will increase soil respiration and decrease soil moisture, while decreasing soil moisture will lead to decreased soil respiration.


Step 2: Write your hypothesis

Now that you have a strong conceptual understanding of the system you are studying, you should be able to write a specific, testable hypothesis that addresses your research question.

Research question | Null hypothesis (H0) | Alternate hypothesis (Ha)
Phone use and sleep | Phone use before sleep does not correlate with the amount of sleep a person gets. | Increasing phone use before sleep leads to a decrease in sleep.
Temperature and soil respiration | Air temperature does not correlate with soil respiration. | Increased air temperature leads to increased soil respiration.
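To make the sleep hypothesis concrete, here is a minimal sketch of how it could eventually be tested with a two-sample t-test; it is only an illustration, the sleep values are invented, and scipy is assumed to be available.

```python
# Hypothetical illustration: testing H0 "phone use does not affect sleep"
# against Ha "more phone use before sleep leads to less sleep".
from scipy import stats

no_phone = [8.1, 7.9, 8.4, 7.6, 8.0, 7.8]  # hours of sleep per night (invented)
phone = [7.2, 6.8, 7.5, 7.0, 6.9, 7.3]

# Two-sample t-test; a one-sided p-value matches the directional Ha.
t_stat, p_two_sided = stats.ttest_ind(phone, no_phone)
p_one_sided = p_two_sided / 2 if t_stat < 0 else 1 - p_two_sided / 2
print(f"t = {t_stat:.2f}, one-sided p = {p_one_sided:.3f}")
# A small p-value would lead us to reject H0 in favor of Ha.
```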

The next steps will describe how to design a controlled experiment. In a controlled experiment, you must be able to:

  • Systematically and precisely manipulate the independent variable(s).
  • Precisely measure the dependent variable(s).
  • Control any potential confounding variables.

If your study system doesn’t match these criteria, there are other types of research you can use to answer your research question.

Step 3: Design your experimental treatments

How you manipulate the independent variable can affect the experiment’s external validity – that is, the extent to which the results can be generalized and applied to the broader world.

First, you may need to decide how widely to vary your independent variable.

For example, in the soil respiration experiment you could increase air temperature:

  • just slightly above the natural range for your study region.
  • over a wider range of temperatures to mimic future warming.
  • over an extreme range that is beyond any possible natural variation.

Second, you may need to choose how finely to vary your independent variable. Sometimes this choice is made for you by your experimental system, but often you will need to decide, and this will affect how much you can infer from your results.

For example, in the sleep experiment you could treat phone use as:

  • a categorical variable: either as binary (yes/no) or as levels of a factor (no phone use, low phone use, high phone use).
  • a continuous variable (minutes of phone use measured every night).

Step 4: Assign your subjects to treatment groups

How you apply your experimental treatments to your test subjects is crucial for obtaining valid and reliable results.

First, you need to consider the study size: how many individuals will be included in the experiment? In general, the more subjects you include, the greater your experiment’s statistical power (the likelihood of detecting a real effect when one exists), and the more confidence you can have in your results.
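As a rough illustration of the link between study size and power, the sketch below uses a standard power calculation for a two-group comparison; the effect size, alpha, and power targets are assumptions chosen only for demonstration, and statsmodels is assumed to be installed.

```python
# Illustrative power analysis: subjects needed per group to detect a medium
# effect (Cohen's d = 0.5) with 80% power at a 5% significance level.
from statsmodels.stats.power import TTestIndPower

n_per_group = TTestIndPower().solve_power(effect_size=0.5, alpha=0.05, power=0.8)
print(f"Approximately {n_per_group:.0f} subjects per group")  # roughly 64
```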

Then you need to randomly assign your subjects to treatment groups. Each group receives a different level of the treatment (e.g. no phone use, low phone use, high phone use).

You should also include a control group, which receives no treatment. The control group tells us what would have happened to your test subjects without any experimental intervention.

When assigning your subjects to groups, there are two main choices you need to make:

  • A completely randomized design vs a randomized block design.
  • A between-subjects design vs a within-subjects design.

Randomization

An experiment can be completely randomized or randomized within blocks (aka strata):

  • In a completely randomized design , every subject is assigned to a treatment group at random.
  • In a randomized block design (aka stratified random design), subjects are first grouped according to a characteristic they share, and then randomly assigned to treatments within those groups.

Research question | Completely randomized design | Randomized block design
Phone use and sleep | Subjects are all randomly assigned a level of phone use using a random number generator. | Subjects are first grouped by age, and then phone use treatments are randomly assigned within these groups.
Temperature and soil respiration | Warming treatments are assigned to soil plots at random by using a number generator to generate map coordinates within the study area. | Soils are first grouped by average rainfall, and then treatment plots are randomly assigned within these groups.
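The two schemes in the table can be sketched in a few lines of code; this is only an illustration with invented subjects and age blocks, not part of the original article.

```python
# Completely randomized vs. randomized block assignment (illustrative only).
import random

random.seed(42)
treatments = ["no phone use", "low phone use", "high phone use"]
subjects = {"Ana": 19, "Ben": 22, "Cam": 45, "Dee": 48, "Eli": 67, "Fay": 70}

# Completely randomized design: each subject is assigned a treatment at random.
completely_random = {name: random.choice(treatments) for name in subjects}

# Randomized block design: block subjects by age, then randomize within blocks.
blocks = {}
for name, age in subjects.items():
    key = "18-39" if age < 40 else "40-59" if age < 60 else "60+"
    blocks.setdefault(key, []).append(name)

block_random = {}
for members in blocks.values():
    shuffled = random.sample(treatments, k=len(treatments))
    for name, treatment in zip(members, shuffled):
        block_random[name] = treatment

print(completely_random)
print(block_random)
```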

Sometimes randomization isn’t practical or ethical, so researchers create partially random or even non-random designs. An experimental design where treatments aren’t randomly assigned is called a quasi-experimental design.

Between-subjects vs. within-subjects

In a between-subjects design (also known as an independent measures design or classic ANOVA design), individuals receive only one of the possible levels of an experimental treatment.

In medical or social research, you might also use matched pairs within your between-subjects design to make sure that each treatment group contains the same variety of test subjects in the same proportions.

In a within-subjects design (also known as a repeated measures design), every individual receives each of the experimental treatments consecutively, and their responses to each treatment are measured.

Within-subjects or repeated measures can also refer to an experimental design where an effect emerges over time, and individual responses are measured over time in order to measure this effect as it emerges.

Counterbalancing (randomizing or reversing the order of treatments among subjects) is often used in within-subjects designs to ensure that the order of treatment application doesn’t influence the results of the experiment.

Research question | Between-subjects (independent measures) design | Within-subjects (repeated measures) design
Phone use and sleep | Subjects are randomly assigned a level of phone use (none, low, or high) and follow that level of phone use throughout the experiment. | Subjects are assigned consecutively to zero, low, and high levels of phone use throughout the experiment, and the order in which they follow these treatments is randomized.
Temperature and soil respiration | Warming treatments are assigned to soil plots at random and the soils are kept at this temperature throughout the experiment. | Every plot receives each warming treatment (1, 3, 5, 8, and 10°C above ambient temperatures) consecutively over the course of the experiment, and the order in which they receive these treatments is randomized.
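Counterbalancing the treatment order in a within-subjects design can be done with a simple rotation of the levels across subjects. The sketch below is a hypothetical illustration using the phone-use levels from the example.

```python
# Counterbalancing: every subject gets all three levels, but the order is
# rotated so that no single ordering dominates the experiment.
levels = ["none", "low", "high"]
subjects = ["S1", "S2", "S3", "S4", "S5", "S6"]

orders = {}
for i, subject in enumerate(subjects):
    rotation = i % len(levels)
    orders[subject] = levels[rotation:] + levels[:rotation]

for subject, order in orders.items():
    print(subject, "->", " then ".join(order))
```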


Step 5: Measure your dependent variable

Finally, you need to decide how you’ll collect data on your dependent variable outcomes. You should aim for reliable and valid measurements that minimize research bias or error.

Some variables, like temperature, can be objectively measured with scientific instruments. Others may need to be operationalized to turn them into measurable observations.

For example, to measure hours of sleep per night you could:

  • Ask participants to record what time they go to sleep and get up each day.
  • Ask participants to wear a sleep tracker.

How precisely you measure your dependent variable also affects the kinds of statistical analysis you can use on your data.

Experiments are always context-dependent, and a good experimental design will take into account all of the unique considerations of your study system to produce information that is both valid and relevant to your research question.

Other interesting articles

If you want to know more about statistics, methodology, or research bias, make sure to check out some of our other articles with explanations and examples.

  • Student’s t-distribution
  • Normal distribution
  • Null and Alternative Hypotheses
  • Chi square tests
  • Confidence interval
  • Cluster sampling
  • Stratified sampling
  • Data cleansing
  • Reproducibility vs Replicability
  • Peer review
  • Likert scale

Research bias

  • Implicit bias
  • Framing effect
  • Cognitive bias
  • Placebo effect
  • Hawthorne effect
  • Hindsight bias
  • Affect heuristic

Frequently asked questions about experiments

Experimental design means planning a set of procedures to investigate a relationship between variables. To design a controlled experiment, you need:

  • A testable hypothesis
  • At least one independent variable that can be precisely manipulated
  • At least one dependent variable that can be precisely measured

When designing the experiment, you decide:

  • How you will manipulate the variable(s)
  • How you will control for any potential confounding variables
  • How many subjects or samples will be included in the study
  • How subjects will be assigned to treatment levels

Experimental design is essential to the internal and external validity of your experiment.

The key difference between observational studies and experimental designs is that a well-done observational study does not influence the responses of participants, while experiments do have some sort of treatment condition applied to at least some participants by random assignment.

A confounding variable, also called a confounder or confounding factor, is a third variable in a study examining a potential cause-and-effect relationship.

A confounding variable is related to both the supposed cause and the supposed effect of the study. It can be difficult to separate the true effect of the independent variable from the effect of the confounding variable.

In your research design, it’s important to identify potential confounding variables and plan how you will reduce their impact.

In a between-subjects design, every participant experiences only one condition, and researchers assess group differences between participants in various conditions.

In a within-subjects design, each participant experiences all conditions, and researchers test the same participants repeatedly for differences between conditions.

The word “between” means that you’re comparing different conditions between groups, while the word “within” means you’re comparing different conditions within the same group.

An experimental group, also known as a treatment group, receives the treatment whose effect researchers wish to study, whereas a control group does not. They should be identical in all other ways.



Exploring the Art of Experimental Design: A Step-by-Step Guide for Students and Educators

Experimental Design for Students

Experimental design is a key method used in subjects like biology, chemistry, physics, psychology, and social sciences. It helps us figure out how different factors affect what we're studying, whether it's plants, chemicals, physical laws, human behavior, or how society works. Basically, it's a way to set up experiments so we can test ideas, see what happens, and make sense of our results. It's super important for students and researchers who want to answer big questions in science and understand the world better. Experimental design skills can be applied in situations ranging from problem solving to data analysis; they are wide reaching and can frequently be applied outside the classroom.

The teaching of these skills is a very important part of science education, but it is often overlooked when the focus is on teaching content. As science educators, we have all seen the benefits practical work has for student engagement and understanding. However, with the time constraints placed on the curriculum, the time needed for students to develop these experimental research design and investigative skills can get squeezed out. Too often students get a ‘recipe’ to follow, which doesn’t allow them to take ownership of their practical work.

From a very young age, children start to think about the world around them. They ask questions and then use observations and evidence to answer them. Students tend to have intelligent, interesting, and testable questions that they love to ask. As educators, we should be working towards encouraging these questions and, in turn, nurturing this natural curiosity about the world around them.

Teaching the design of experiments and letting students develop their own questions and hypotheses takes time. These materials have been created to scaffold and structure the process to allow teachers to focus on improving the key ideas in experimental design. Allowing students to ask their own questions, write their own hypotheses, and plan and carry out their own investigations is a valuable experience for them. This will lead to students having more ownership of their work. When students carry out the experimental method for their own questions, they reflect on how scientists have historically come to understand how the universe works.

Experimental Design

Take a look at the printer-friendly pages and worksheet templates below!

What are the Steps of Experimental Design?

Embarking on the journey of scientific discovery begins with mastering the experimental design steps. This foundational process is essential for formulating experiments that yield reliable and insightful results, guiding researchers and students alike through the detailed planning, experimental research design, and execution of their studies. By leveraging an experimental design template, participants can ensure the integrity and validity of their findings. Whether it's through designing a scientific experiment or engaging in experimental design activities, the aim is to foster a deep understanding of the fundamentals: How should experiments be designed? What are the key experimental design steps? How can you design your own experiment?

This is an exploration of the key experimental method steps, experimental design ideas, and ways to integrate the design of experiments into student projects. Student projects can benefit greatly from supplemental worksheets, and we will also provide resources such as worksheets aimed at teaching experimental design effectively. Let’s dive into the essential stages that underpin the process of designing an experiment, equipping learners with the tools to explore their scientific curiosity.

1. Question

This is a key part of the scientific method and the experimental design process. Students enjoy coming up with questions. Formulating questions is a deep and meaningful activity that can give students ownership over their work. A great way of getting students to visualize their research questions is to use a mind map storyboard.

Free Customizable Experimental Design in Science Questions Spider Map

Ask students to think of any questions they want to answer about the universe or get them to think about questions they have about a particular topic. All questions are good questions, but some are easier to test than others.

2. Hypothesis

A hypothesis is often described as an educated guess, but it should be a statement that can be tested scientifically. At the end of the experiment, look back to see whether the results support the hypothesis or not.

Forming good hypotheses can be challenging for students. It is important to remember that the hypothesis is not a research question; it is a testable statement. One way of forming a hypothesis is to phrase it as an “if... then...” statement. This certainly isn't the only or best way to form a hypothesis, but it can be a very easy formula for students to use when first starting out.

An “if... then...” statement requires students to identify the variables first, and that may change the order in which they complete the stages of the visual organizer. After identifying the dependent and independent variables, the hypothesis then takes the form if [change in independent variable], then [change in dependent variable].

For example, if an experiment were looking for the effect of caffeine on reaction time, the independent variable would be amount of caffeine and the dependent variable would be reaction time. The “if, then” hypothesis could be: If you increase the amount of caffeine taken, then the reaction time will decrease.

3. Explanation of Hypothesis

What led you to this hypothesis? What is the scientific background behind your hypothesis? Depending on age and ability, students can use their prior knowledge to explain why they have chosen their hypotheses, or alternatively, research the topic using books or the internet. This could also be a good time to discuss with students what a reliable source is.

For example, students may reference previous studies showing the alertness effects of caffeine to explain why they hypothesize caffeine intake will reduce reaction time.

4. Prediction

The prediction is slightly different to the hypothesis. A hypothesis is a testable statement, whereas the prediction is more specific to the experiment. In the discovery of the structure of DNA, the hypothesis proposed that DNA has a helical structure. The prediction was that the X-ray diffraction pattern of DNA would be an X shape.

Students should formulate a prediction that is a specific, measurable outcome based on their hypothesis. Rather than just stating "caffeine will decrease reaction time," students could predict that "drinking 2 cans of soda (90mg caffeine) will reduce average reaction time by 50 milliseconds compared to drinking no caffeine."

5. Identification of Variables

Below is an example of a Discussion Storyboard that can be used to get your students talking about variables in experimental design.

Experimental Design in Science Discussion Storyboard with Students

The three types of variables you will need to discuss with your students are dependent, independent, and controlled variables. To keep this simple, refer to these as "what you are going to measure", "what you are going to change", and "what you are going to keep the same". With more advanced students, you should encourage them to use the correct vocabulary.

Dependent variables are what is measured or observed by the scientist. These measurements are often repeated, because repeating measurements makes your data more reliable.

The independent variable is the variable that scientists decide to change to see what effect it has on the dependent variable. Only one is changed at a time, because otherwise it would be difficult to figure out which variable is causing any change you observe.

Controlled variables are quantities or factors that scientists want to remain the same throughout the experiment. They are controlled to remain constant, so as to not affect the dependent variable. Controlling these allows scientists to see how the independent variable affects the dependent variable within the experimental group.

Use this example below in your lessons, or delete the answers and set it as an activity for students to complete on Storyboard That.

Research question: How temperature affects the amount of sugar able to be dissolved in water
Independent variable: Water temperature (5 different samples at 10°C, 20°C, 30°C, 40°C, and 50°C)
Dependent variable: The amount of sugar that can be dissolved in the water, measured in teaspoons
Controlled variables:

Identifying Variables Storyboard with Pictures | Experimental Design Process St

6. Risk Assessment

Ultimately this must be signed off on by a responsible adult, but it is important to get students to think about how they will keep themselves safe. In this part, students should identify potential risks and then explain how they are going to minimize them. An activity to help students develop these skills is to get them to identify and manage risks in different situations. Using the storyboard below, have students complete the second column of the T-chart by asking, "What is the risk?", and then explaining how they could manage that risk. This storyboard could also be projected for a class discussion.

Risk Assessment Storyboard for Experimental Design in Science

7. Materials

In this section, students will list the materials they need for the experiments, including any safety equipment that they have highlighted as needing in the risk assessment section. This is a great time to talk to students about choosing tools that are suitable for the job. You are going to use a different tool to measure the width of a hair than to measure the width of a football field!

8. General Plan and Diagram

It is important to talk to students about reproducibility. They should write a procedure that would allow their experimental method to be reproduced easily by another scientist. The easiest and most concise way for students to do this is by making a numbered list of instructions. A useful activity here could be getting students to explain how to make a cup of tea or a sandwich. Act out the process, pointing out any steps they’ve missed.

For English Language Learners and students who struggle with written English, students can describe the steps in their experiment visually using Storyboard That.

Not every experiment will need a diagram, but some plans will be greatly improved by including one. Have students focus on producing clear and easy-to-understand diagrams that illustrate the experimental group.

For example, a procedure to test the effect of sunlight on plant growth using a completely randomized design could detail the following steps (a short random-assignment sketch follows the list):

  • Select 10 similar seedlings of the same age and variety
  • Prepare 2 identical trays with the same soil mixture
  • Place 5 plants in each tray; label one set "sunlight" and one set "shade"
  • Position sunlight tray by a south-facing window, and shade tray in a dark closet
  • Water both trays with 50 mL water every 2 days
  • After 3 weeks, remove plants and measure heights in cm
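The random assignment implied by the completely randomized design can be done by shuffling; the sketch below is a hypothetical illustration with invented seedling labels, not part of the original lesson.

```python
# Randomly split 10 seedlings between the "sunlight" and "shade" trays,
# as a completely randomized design requires. Labels are invented.
import random

random.seed(7)
seedlings = [f"seedling_{i}" for i in range(1, 11)]
random.shuffle(seedlings)

sunlight_tray = seedlings[:5]
shade_tray = seedlings[5:]
print("Sunlight:", sunlight_tray)
print("Shade:", shade_tray)
```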

9. Carry Out Experiment

Once their procedure is approved, students should carefully carry out their planned experiment, following their written instructions. As data is collected, students should organize the raw results in tables, graphs, photos or drawings. This creates clear documentation for analyzing trends.

Some best practices for data collection include:

  • Record quantitative data numerically with units
  • Note qualitative observations with detailed descriptions
  • Capture the setup through illustrations or photos
  • Write observations of unexpected events
  • Identify data outliers and sources of error

For example, in the plant growth experiment, students could record:

Group | Plant ID | Start Height | End Height
Sunlight | 1 | 5 cm | 18 cm
Sunlight | 2 | 4 cm | 17 cm
Sunlight | 3 | 5 cm | 19 cm
Shade | 1 | 6 cm | 9 cm
Shade | 2 | 4 cm | 8 cm

They would also describe observations like leaf color change or directional bending visually or in writing.

It is crucial that students practice safe science procedures. Adult supervision is required for experimentation, along with proper risk assessment.

Well-documented data collection allows for deeper analysis after experiment completion to determine whether hypotheses and predictions were supported.
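To show what that deeper analysis might look like, here is a minimal sketch that summarizes the example table above by computing each plant's growth and the mean growth per group; it is an illustration only, not a full statistical treatment.

```python
# Summarize the example plant-growth data: growth = end height - start height (cm).
records = [
    ("Sunlight", 1, 5, 18), ("Sunlight", 2, 4, 17), ("Sunlight", 3, 5, 19),
    ("Shade", 1, 6, 9), ("Shade", 2, 4, 8),
]

growth = {}
for group, plant_id, start, end in records:
    growth.setdefault(group, []).append(end - start)

for group, values in growth.items():
    mean = sum(values) / len(values)
    print(f"{group}: mean growth {mean:.1f} cm over 3 weeks")
# Sunlight plants grew about 13.3 cm on average versus 3.5 cm in the shade,
# which would support the hypothesis that sunlight increases growth.
```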

Completed Examples

Editable Scientific Investigation Design Example: Moldy Bread

Resources and Experimental Design Examples

Using visual organizers is an effective way to get your students working as scientists in the classroom.

There are many ways to use these investigation planning tools to scaffold and structure students' work while they are working as scientists. Students can complete the planning stage on Storyboard That using the text boxes and diagrams, or you could print them off and have students complete them by hand. Another great way to use them is to project the planning sheet onto an interactive whiteboard and work through how to complete the planning materials as a group. Project it onto a screen and have students write their answers on sticky notes and put their ideas in the correct section of the planning document.

Very young learners can still start to think as scientists! They have loads of questions about the world around them and you can start to make a note of these in a mind map. Sometimes you can even start to ‘investigate’ these questions through play.

The foundation resource is intended for elementary students or students who need more support. It is designed to follow exactly the same process as the higher resource, but made slightly easier. The key difference between the two resources is the level of detail students are required to think about and the technical vocabulary used. For example, it is important that students identify variables when they are designing their investigations. In the higher version, students not only have to identify the variables, but also make other comments, such as how they are going to measure the dependent variable or whether they will use a completely randomized design. As well as the difference in scaffolding between the two levels of resources, you may want to further differentiate by how the learners are supported by teachers and assistants in the room.

Students could also be encouraged to make their experimental plan easier to understand by using graphics, and this could also be used to support ELLs.

Customizable Foundation Experimental Design Steps T Chart Template

Students need to be assessed on their science inquiry skills alongside the assessment of their knowledge. Not only will this let students focus on developing their skills, it will also allow them to use the assessment information in a way that helps them improve their science skills. Using Quick Rubric, you can create a quick and easy assessment framework and share it with students so they know how to succeed at every stage. As well as providing formative assessment that will drive learning, this can also be used to assess student work at the end of an investigation and to set targets for the next time they plan their own investigation. The rubrics have been written in a way that allows students to access them easily, so they can be shared with students as they work through the planning process and know what a good experimental design looks like.

The rubrics use three performance bands: Proficient, Emerging, and Beginning (worth 13, 7, and 0 points in one rubric, and 11, 5, and 0 points in the other).

Printable Resources

Return to top

Print Ready Experimental Design Idea Sheet

Related Activities

Chemical Reactions Experiment Worksheet

Additional Worksheets

If you're looking to add additional projects or continue to customize worksheets, take a look at several template pages we've compiled for you below. Each worksheet can be copied and tailored to your projects or students! Students can also be encouraged to create their own if they want to try organizing information in an easy to understand way.

  • Lab Worksheets
  • Discussion Worksheets
  • Checklist Worksheets

Related Resources

  • Scientific Method Steps
  • Science Discussion Storyboards
  • Developing Critical Thinking Skills

How to Teach Students the Design of Experiments

Encourage questioning and curiosity

Foster a culture of inquiry by encouraging students to ask questions about the world around them.

Formulate testable hypotheses

Teach students how to develop hypotheses that can be scientifically tested. Help them understand the difference between a hypothesis and a question.

Provide scientific background

Help students understand the scientific principles and concepts relevant to their hypotheses. Encourage them to draw on prior knowledge or conduct research to support their hypotheses.

Identify variables

Teach students about the three types of variables (dependent, independent, and controlled) and how they relate to experimental design. Emphasize the importance of controlling variables and measuring the dependent variable accurately.

Plan and diagram the experiment

Guide students in developing a clear and reproducible experimental procedure. Encourage them to create a step-by-step plan or use visual diagrams to illustrate the process.

Carry out the experiment and analyze data

Support students as they conduct the experiment according to their plan. Guide them in collecting data in a meaningful and organized manner. Assist them in analyzing the data and drawing conclusions based on their findings.

Frequently Asked Questions about Experimental Design for Students

What are some common experimental design tools and techniques that students can use?

Common experimental design tools and techniques that students can use include random assignment, control groups, blinding, replication, and statistical analysis. Students can also use observational studies, surveys, and experiments with natural or quasi-experimental designs. They can also use data visualization tools to analyze and present their results.

How can experimental design help students develop critical thinking skills?

Experimental design helps students develop critical thinking skills by encouraging them to think systematically and logically about scientific problems. It requires students to analyze data, identify patterns, and draw conclusions based on evidence. It also helps students to develop problem-solving skills by providing opportunities to design and conduct experiments to test hypotheses.

How can experimental design be used to address real-world problems?

Experimental design can be used to address real-world problems by identifying variables that contribute to a particular problem and testing interventions to see if they are effective in addressing the problem. For example, experimental design can be used to test the effectiveness of new medical treatments or to evaluate the impact of social interventions on reducing poverty or improving educational outcomes.

What are some common experimental design pitfalls that students should avoid?

Common experimental design pitfalls that students should avoid include failing to control variables, using biased samples, relying on anecdotal evidence, and failing to measure dependent variables accurately. Students should also be aware of ethical considerations when conducting experiments, such as obtaining informed consent and protecting the privacy of research subjects.



19+ Experimental Design Examples (Methods + Types)


Ever wondered how scientists discover new medicines, psychologists learn about behavior, or even how marketers figure out what kind of ads you like? Well, they all have something in common: they use a special plan or recipe called an "experimental design."

Imagine you're baking cookies. You can't just throw random amounts of flour, sugar, and chocolate chips into a bowl and hope for the best. You follow a recipe, right? Scientists and researchers do something similar. They follow a "recipe" called an experimental design to make sure their experiments are set up in a way that the answers they find are meaningful and reliable.

Experimental design is the roadmap researchers use to answer questions. It's a set of rules and steps that researchers follow to collect information, or "data," in a way that is fair, accurate, and makes sense.


Long ago, people didn't have detailed game plans for experiments. They often just tried things out and saw what happened. But over time, people got smarter about this. They started creating structured plans—what we now call experimental designs—to get clearer, more trustworthy answers to their questions.

In this article, we'll take you on a journey through the world of experimental designs. We'll talk about the different types, or "flavors," of experimental designs, where they're used, and even give you a peek into how they came to be.

What Is Experimental Design?

Alright, before we dive into the different types of experimental designs, let's get crystal clear on what experimental design actually is.

Imagine you're a detective trying to solve a mystery. You need clues, right? Well, in the world of research, experimental design is like the roadmap that helps you find those clues. It's like the game plan in sports or the blueprint when you're building a house. Just like you wouldn't start building without a good blueprint, researchers won't start their studies without a strong experimental design.

So, why do we need experimental design? Think about baking a cake. If you toss ingredients into a bowl without measuring, you'll end up with a mess instead of a tasty dessert.

Similarly, in research, if you don't have a solid plan, you might get confusing or incorrect results. A good experimental design helps you ask the right questions (think critically), decide what to measure (come up with an idea), and figure out how to measure it (test it). It also helps you consider things that might mess up your results, like outside influences you hadn't thought of.

For example, let's say you want to find out if listening to music helps people focus better. Your experimental design would help you decide things like: Who are you going to test? What kind of music will you use? How will you measure focus? And, importantly, how will you make sure that it's really the music affecting focus and not something else, like the time of day or whether someone had a good breakfast?

In short, experimental design is the master plan that guides researchers through the process of collecting data, so they can answer questions in the most reliable way possible. It's like the GPS for the journey of discovery!

History of Experimental Design

Around 350 BCE, people like Aristotle were trying to figure out how the world works, but they mostly just thought really hard about things. They didn't test their ideas much. So while they were super smart, their methods weren't always the best for finding out the truth.

Fast forward to the Renaissance (14th to 17th centuries), a time of big changes and lots of curiosity. People like Galileo started to experiment by actually doing tests, like rolling balls down inclined planes to study motion. Galileo's work was cool because he combined thinking with doing. He'd have an idea, test it, look at the results, and then think some more. This approach was a lot more reliable than just sitting around and thinking.

Now, let's zoom ahead to the 18th and 19th centuries. This is when people like Francis Galton, an English polymath, started to get really systematic about experimentation. Galton was obsessed with measuring things. Seriously, he even tried to measure how good-looking people were! His work helped create the foundations for a more organized approach to experiments.

Next stop: the early 20th century. Enter Ronald A. Fisher, a brilliant British statistician. Fisher was a game-changer. He came up with ideas that are like the bread and butter of modern experimental design.

Fisher championed the concept of the "control group"—that's a group of people or things that don't get the treatment you're testing, so you can compare them to those who do. He also stressed the importance of "randomization," which means assigning people or things to different groups by chance, like drawing names out of a hat. This makes sure the experiment is fair and the results are trustworthy.

Around the same time, American psychologists like John B. Watson and B.F. Skinner were developing "behaviorism." They focused on studying things that they could directly observe and measure, like actions and reactions.

Skinner even built boxes—called Skinner Boxes—to test how animals like pigeons and rats learn. Their work helped shape how psychologists design experiments today. Watson performed a very controversial experiment, the Little Albert experiment, that helped describe behavior through conditioning—in other words, how people learn to behave the way they do.

In the later part of the 20th century and into our time, computers have totally shaken things up. Researchers now use super powerful software to help design their experiments and crunch the numbers.

With computers, they can simulate complex experiments before they even start, which helps them predict what might happen. This is especially helpful in fields like medicine, where getting things right can be a matter of life and death.

Also, did you know that experimental designs aren't just for scientists in labs? They're used by people in all sorts of jobs, like marketing, education, and even video game design! Yes, someone probably ran an experiment to figure out what makes a game super fun to play.

So there you have it—a quick tour through the history of experimental design, from Aristotle's deep thoughts to Fisher's groundbreaking ideas, and all the way to today's computer-powered research. These designs are the recipes that help people from all walks of life find answers to their big questions.

Key Terms in Experimental Design

Before we dig into the different types of experimental designs, let's get comfy with some key terms. Understanding these terms will make it easier for us to explore the various types of experimental designs that researchers use to answer their big questions.

Independent Variable : This is what you change or control in your experiment to see what effect it has. Think of it as the "cause" in a cause-and-effect relationship. For example, if you're studying whether different types of music help people focus, the kind of music is the independent variable.

Dependent Variable : This is what you're measuring to see the effect of your independent variable. In our music and focus experiment, how well people focus is the dependent variable—it's what "depends" on the kind of music played.

Control Group : This is a group of people who don't get the special treatment or change you're testing. They help you see what happens when the independent variable is not applied. If you're testing whether a new medicine works, the control group would take a fake pill, called a placebo, instead of the real medicine.

Experimental Group : This is the group that gets the special treatment or change you're interested in. Going back to our medicine example, this group would get the actual medicine to see if it has any effect.

Randomization : This is like shaking things up in a fair way. You randomly put people into the control or experimental group so that each group is a good mix of different kinds of people. This helps make the results more reliable.

Sample : This is the group of people you're studying. They're a "sample" of a larger group that you're interested in. For instance, if you want to know how teenagers feel about a new video game, you might study a sample of 100 teenagers.

Bias : This is anything that might tilt your experiment one way or another without you realizing it. Like if you're testing a new kind of dog food and you only test it on poodles, that could create a bias because maybe poodles just really like that food and other breeds don't.

Data : This is the information you collect during the experiment. It's like the treasure you find on your journey of discovery!

Replication : This means doing the experiment more than once to make sure your findings hold up. It's like double-checking your answers on a test.

Hypothesis : This is your educated guess about what will happen in the experiment. It's like predicting the end of a movie based on the first half.

Steps of Experimental Design

Alright, let's say you're all fired up and ready to run your own experiment. Cool! But where do you start? Well, designing an experiment is a bit like planning a road trip. There are some key steps you've got to take to make sure you reach your destination. Let's break it down:

  • Ask a Question : Before you hit the road, you've got to know where you're going. Same with experiments. You start with a question you want to answer, like "Does eating breakfast really make you do better in school?"
  • Do Some Homework : Before you pack your bags, you look up the best places to visit, right? In science, this means reading up on what other people have already discovered about your topic.
  • Form a Hypothesis : This is your educated guess about what you think will happen. It's like saying, "I bet this route will get us there faster."
  • Plan the Details : Now you decide what kind of car you're driving (your experimental design), who's coming with you (your sample), and what snacks to bring (your variables).
  • Randomization : Remember, this is like shuffling a deck of cards. You want to mix up who goes into your control and experimental groups to make sure it's a fair test.
  • Run the Experiment : Finally, the rubber hits the road! You carry out your plan, making sure to collect your data carefully.
  • Analyze the Data : Once the trip's over, you look at your photos and decide which ones are keepers. In science, this means looking at your data to see what it tells you.
  • Draw Conclusions : Based on your data, did you find an answer to your question? This is like saying, "Yep, that route was faster," or "Nope, we hit a ton of traffic."
  • Share Your Findings : After a great trip, you want to tell everyone about it, right? Scientists do the same by publishing their results so others can learn from them.
  • Do It Again? : Sometimes one road trip just isn't enough. In the same way, scientists often repeat their experiments to make sure their findings are solid.

So there you have it! Those are the basic steps you need to follow when you're designing an experiment. Each step helps make sure that you're setting up a fair and reliable way to find answers to your big questions.
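To tie the steps together, here is a toy simulation of the breakfast-and-test-scores question from step 1; every number is invented, and the point is simply to show why running the experiment more than once (the "Do It Again?" step) makes a finding more trustworthy.

```python
# Toy simulation: does breakfast improve test scores? (All numbers invented.)
import random

def run_experiment(n_per_group=30):
    # Pretend breakfast adds about 5 points on average, with individual variation.
    breakfast = [random.gauss(80, 8) for _ in range(n_per_group)]
    no_breakfast = [random.gauss(75, 8) for _ in range(n_per_group)]
    return sum(breakfast) / n_per_group - sum(no_breakfast) / n_per_group

random.seed(1)
differences = [run_experiment() for _ in range(5)]  # replicate the experiment 5 times
print([round(d, 1) for d in differences])
# Each replication gives a slightly different estimate of the effect;
# consistently positive differences across runs build confidence in the finding.
```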

Let's get into examples of experimental designs.

1) True Experimental Design


In the world of experiments, the True Experimental Design is like the superstar quarterback everyone talks about. Born out of the early 20th-century work of statisticians like Ronald A. Fisher, this design is all about control, precision, and reliability.

Researchers carefully pick an independent variable to manipulate (remember, that's the thing they're changing on purpose) and measure the dependent variable (the effect they're studying). Then comes the magic trick—randomization. By randomly putting participants into either the control or experimental group, scientists make sure their experiment is as fair as possible.

No sneaky biases here!

True Experimental Design Pros

The pros of True Experimental Design are like the perks of a VIP ticket at a concert: you get the best and most trustworthy results. Because everything is controlled and randomized, you can feel pretty confident that the results aren't just a fluke.

True Experimental Design Cons

However, there's a catch. Sometimes, it's really tough to set up these experiments in a real-world situation. Imagine trying to control every single detail of your day, from the food you eat to the air you breathe. Not so easy, right?

True Experimental Design Uses

The fields that get the most out of True Experimental Designs are those that need super reliable results, like medical research.

When scientists were developing COVID-19 vaccines, they used this design to run clinical trials. They had control groups that received a placebo (a harmless substance with no effect) and experimental groups that got the actual vaccine. Then they measured how many people in each group got sick. By comparing the two, they could say, "Yep, this vaccine works!"

So next time you read about a groundbreaking discovery in medicine or technology, chances are a True Experimental Design was the VIP behind the scenes, making sure everything was on point. It's been the go-to for rigorous scientific inquiry for nearly a century, and it's not stepping off the stage anytime soon.

2) Quasi-Experimental Design

So, let's talk about the Quasi-Experimental Design. Think of this one as the cool cousin of True Experimental Design. It wants to be just like its famous relative, but it's a bit more laid-back and flexible. You'll find quasi-experimental designs when it's tricky to set up a full-blown True Experimental Design with all the bells and whistles.

Quasi-experiments still play with an independent variable, just like their stricter cousins. The big difference? They don't use randomization. It's like wanting to divide a bag of jelly beans equally between your friends, but you can't quite do it perfectly.

In real life, it's often not possible or ethical to randomly assign people to different groups, especially when dealing with sensitive topics like education or social issues. And that's where quasi-experiments come in.

Quasi-Experimental Design Pros

Even though they lack full randomization, quasi-experimental designs are like the Swiss Army knives of research: versatile and practical. They're especially popular in fields like education, sociology, and public policy.

For instance, when researchers wanted to figure out if the Head Start program, aimed at giving young kids a "head start" in school, was effective, they used a quasi-experimental design. They couldn't randomly assign kids to go or not go to preschool, but they could compare kids who did with kids who didn't.

Quasi-Experimental Design Cons

Of course, quasi-experiments come with their own bag of pros and cons. On the plus side, they're easier to set up and often cheaper than true experiments. But the flip side is that they're not as rock-solid in their conclusions. Because the groups aren't randomly assigned, there's always that little voice saying, "Hey, are we missing something here?"

Quasi-Experimental Design Uses

Quasi-Experimental Design gained traction in the mid-20th century. Researchers were grappling with real-world problems that didn't fit neatly into a laboratory setting. Plus, as society became more aware of ethical considerations, the need for flexible designs increased. So, the quasi-experimental approach was like a breath of fresh air for scientists wanting to study complex issues without a laundry list of restrictions.

In short, if True Experimental Design is the superstar quarterback, Quasi-Experimental Design is the versatile player who can adapt and still make significant contributions to the game.

3) Pre-Experimental Design

Now, let's talk about the Pre-Experimental Design. Imagine it as the beginner's skateboard you get before you try out for all the cool tricks. It has wheels, it rolls, but it's not built for the professional skatepark.

Similarly, pre-experimental designs give researchers a starting point. They let you dip your toes in the water of scientific research without diving in head-first.

So, what's the deal with pre-experimental designs?

Pre-Experimental Designs are the basic, no-frills versions of experiments. Researchers still mess around with an independent variable and measure a dependent variable, but they skip over the whole randomization thing and often don't even have a control group.

It's like baking a cake but forgetting the frosting and sprinkles; you'll get some results, but they might not be as complete or reliable as you'd like.

Pre-Experimental Design Pros

Why use such a simple setup? Because sometimes, you just need to get the ball rolling. Pre-experimental designs are great for quick-and-dirty research when you're short on time or resources. They give you a rough idea of what's happening, which you can use to plan more detailed studies later.

A good example of this is early studies on the effects of screen time on kids. Researchers couldn't control every aspect of a child's life, but they could easily ask parents to track how much time their kids spent in front of screens and then look for trends in behavior or school performance.

Pre-Experimental Design Cons

But here's the catch: pre-experimental designs are like that first draft of an essay. It helps you get your ideas down, but you wouldn't want to turn it in for a grade. Because these designs lack the rigorous structure of true or quasi-experimental setups, they can't give you rock-solid conclusions. They're more like clues or signposts pointing you in a certain direction.

Pre-Experimental Design Uses

This type of design became popular in the early stages of various scientific fields. Researchers used them to scratch the surface of a topic, generate some initial data, and then decide if it's worth exploring further. In other words, pre-experimental designs were the stepping stones that led to more complex, thorough investigations.

So, while Pre-Experimental Design may not be the star player on the team, it's like the practice squad that helps everyone get better. It's the starting point that can lead to bigger and better things.

4) Factorial Design

Now, buckle up, because we're moving into the world of Factorial Design, the multi-tasker of the experimental universe.

Imagine juggling not just one, but multiple balls in the air—that's what researchers do in a factorial design.

In Factorial Design, researchers are not satisfied with just studying one independent variable. Nope, they want to study two or more at the same time to see how they interact.

It's like cooking with several spices to see how they blend together to create unique flavors.

Factorial Design became the talk of the town with the rise of computers. Why? Because this design produces a lot of data, and computers are the number crunchers that help make sense of it all. So, thanks to our silicon friends, researchers can study complicated questions like, "How do diet AND exercise together affect weight loss?" instead of looking at just one of those factors.
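As a hedged illustration of what a small factorial layout looks like, the sketch below crosses diet and exercise in a 2x2 design, computes the mean outcome in each cell, and checks a simple interaction contrast; all values are invented.

```python
# Toy 2x2 factorial example: diet (yes/no) crossed with exercise (yes/no).
# Outcome = kilograms of weight lost; the numbers are invented for illustration.
data = {
    ("diet", "exercise"): [5.8, 6.2, 6.0],
    ("diet", "no exercise"): [3.1, 2.9, 3.0],
    ("no diet", "exercise"): [2.2, 2.0, 2.4],
    ("no diet", "no exercise"): [0.5, 0.7, 0.6],
}

means = {cell: sum(vals) / len(vals) for cell, vals in data.items()}
for cell, mean in means.items():
    print(cell, round(mean, 2))

# Interaction: is the benefit of exercise larger when dieting than when not?
effect_with_diet = means[("diet", "exercise")] - means[("diet", "no exercise")]
effect_without_diet = means[("no diet", "exercise")] - means[("no diet", "no exercise")]
print("interaction contrast:", round(effect_with_diet - effect_without_diet, 2))
```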

Factorial Design Pros

This design's main selling point is its ability to explore interactions between variables. For instance, maybe a new study drug works really well for young people but not so great for older adults. A factorial design could reveal that age is a crucial factor, something you might miss if you only studied the drug's effectiveness in general. It's like being a detective who looks for clues not just in one room but throughout the entire house.

Factorial Design Cons

However, factorial designs have their own bag of challenges. First off, they can be pretty complicated to set up and run. Imagine coordinating a four-way intersection with lots of cars coming from all directions—you've got to make sure everything runs smoothly, or you'll end up with a traffic jam. Similarly, researchers need to carefully plan how they'll measure and analyze all the different variables.

Factorial Design Uses

Factorial designs are widely used in psychology to untangle the web of factors that influence human behavior. They're also popular in fields like marketing, where companies want to understand how different aspects like price, packaging, and advertising influence a product's success.

And speaking of success, the factorial design has been a hit since statisticians like Ronald A. Fisher (yep, him again!) expanded on it in the early-to-mid 20th century. It offered a more nuanced way of understanding the world, proving that sometimes, to get the full picture, you've got to juggle more than one ball at a time.

So, if True Experimental Design is the quarterback and Quasi-Experimental Design is the versatile player, Factorial Design is the strategist who sees the entire game board and makes moves accordingly.

5) Longitudinal Design


Alright, let's take a step into the world of Longitudinal Design. Picture it as the grand storyteller, the kind who doesn't just tell you about a single event but spins an epic tale that stretches over years or even decades. This design isn't about quick snapshots; it's about capturing the whole movie of someone's life or a long-running process.

You know how you might take a photo every year on your birthday to see how you've changed? Longitudinal Design is kind of like that, but for scientific research.

With Longitudinal Design, instead of measuring something just once, researchers come back again and again, sometimes over many years, to see how things are going. This helps them understand not just what's happening, but why it's happening and how it changes over time.

This design really started to shine in the latter half of the 20th century, when researchers began to realize that some questions can't be answered in a hurry. Think about studies that look at how kids grow up, or research on how a certain medicine affects you over a long period. These aren't things you can rush.

The famous Framingham Heart Study, started in 1948, is a prime example. It's been studying heart health in a small town in Massachusetts for decades, and the findings have shaped what we know about heart disease.

Longitudinal Design Pros

So, what's to love about Longitudinal Design? First off, it's the go-to for studying change over time, whether that's how people age or how a forest recovers from a fire.

Longitudinal Design Cons

But it's not all sunshine and rainbows. Longitudinal studies take a lot of patience and resources. Plus, keeping track of participants over many years can be like herding cats—difficult and full of surprises.

Longitudinal Design Uses

Despite these challenges, longitudinal studies have been key in fields like psychology, sociology, and medicine. They provide the kind of deep, long-term insights that other designs just can't match.

So, if the True Experimental Design is the superstar quarterback, and the Quasi-Experimental Design is the flexible athlete, then the Factorial Design is the strategist, and the Longitudinal Design is the wise elder who has seen it all and has stories to tell.

6) Cross-Sectional Design

Now, let's flip the script and talk about Cross-Sectional Design, the polar opposite of the Longitudinal Design. If Longitudinal is the grand storyteller, think of Cross-Sectional as the snapshot photographer. It captures a single moment in time, like a selfie that you take to remember a fun day. Researchers using this design collect all their data at one point, providing a kind of "snapshot" of whatever they're studying.

In a Cross-Sectional Design, researchers look at multiple groups all at the same time to see how they're different or similar.

This design rose to popularity in the mid-20th century, mainly because it's so quick and efficient. Imagine wanting to know how people of different ages feel about a new video game. Instead of waiting for years to see how opinions change, you could just ask people of all ages what they think right now. That's Cross-Sectional Design for you—fast and straightforward.

You'll find this type of research everywhere from marketing studies to healthcare. For instance, you might have heard about surveys asking people what they think about a new product or political issue. Those are usually cross-sectional studies, aimed at getting a quick read on public opinion.

Cross-Sectional Design Pros

So, what's the big deal with Cross-Sectional Design? Well, it's the go-to when you need answers fast and don't have the time or resources for a more complicated setup.

Cross-Sectional Design Cons

Remember, speed comes with trade-offs. While you get your results quickly, those results are stuck in time. They can't tell you how things change or why they're changing, just what's happening right now.

Cross-Sectional Design Uses

Because they're so quick and simple, cross-sectional studies often serve as the first step in research. They give scientists an idea of what's going on so they can decide if it's worth digging deeper. In that way, they're a bit like a movie trailer, giving you a taste of the action to see if you're interested in seeing the whole film.

So, in our lineup of experimental designs, if True Experimental Design is the superstar quarterback and Longitudinal Design is the wise elder, then Cross-Sectional Design is like the speedy running back—fast, agile, but not designed for long, drawn-out plays.

7) Correlational Design

Next on our roster is the Correlational Design, the keen observer of the experimental world. Imagine this design as the person at a party who loves people-watching. They don't interfere or get involved; they just observe and take mental notes about what's going on.

In a correlational study, researchers don't change or control anything; they simply observe and measure how two variables relate to each other.

The correlational design has roots in the early days of psychology and sociology. Pioneers like Sir Francis Galton used it to study how qualities like intelligence or height could be related within families.

This design is all about asking, "Hey, when this thing happens, does that other thing usually happen too?" For example, researchers might study whether students who have more study time get better grades or whether people who exercise more have lower stress levels.
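
As a quick illustration, here's a minimal sketch in Python with made-up numbers showing how that kind of relationship is usually measured: a correlation coefficient. SciPy's `pearsonr` returns both the coefficient and a p-value.

```python
from scipy.stats import pearsonr

# Hypothetical data: weekly study hours and exam grades for ten students.
study_hours = [2, 4, 5, 6, 7, 8, 9, 10, 11, 12]
exam_grades = [55, 60, 62, 70, 68, 75, 78, 80, 85, 88]

r, p_value = pearsonr(study_hours, exam_grades)
print(f"correlation r = {r:.2f}, p = {p_value:.4f}")
# r close to +1 means the two variables rise together; it still says nothing
# about which one (if either) causes the other.
```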

One of the most famous correlational studies you might have heard of is the link between smoking and lung cancer. Back in the mid-20th century, researchers started noticing that people who smoked a lot also seemed to get lung cancer more often. They couldn't say smoking caused cancer—that would require a true experiment—but the strong correlation was a red flag that led to more research and eventually, health warnings.

Correlational Design Pros

This design is great at showing that two (or more) things are related. Correlational designs can signal that more detailed research is needed on a topic, and they can reveal patterns or possible causes that we otherwise might not have noticed.

Correlational Design Cons

But here's where you need to be careful: correlational designs can be tricky. Just because two things are related doesn't mean one causes the other. That's like saying, "Every time I wear my lucky socks, my team wins." Well, it's a fun thought, but those socks aren't really controlling the game.

Correlational Design Uses

Despite this limitation, correlational designs are popular in psychology, economics, and epidemiology, to name a few fields. They're often the first step in exploring a possible relationship between variables. Once a strong correlation is found, researchers may decide to conduct more rigorous experimental studies to examine cause and effect.

So, if the True Experimental Design is the superstar quarterback and the Longitudinal Design is the wise elder, the Factorial Design is the strategist, and the Cross-Sectional Design is the speedster, then the Correlational Design is the clever scout, identifying interesting patterns but leaving the heavy lifting of proving cause and effect to the other types of designs.

8) Meta-Analysis

Last but not least, let's talk about Meta-Analysis, the librarian of experimental designs.

If other designs are all about creating new research, Meta-Analysis is about gathering up everyone else's research, sorting it, and figuring out what it all means when you put it together.

Imagine a jigsaw puzzle where each piece is a different study. Meta-Analysis is the process of fitting all those pieces together to see the big picture.

The concept of Meta-Analysis started to take shape in the late 20th century, when computers became powerful enough to handle massive amounts of data. It was like someone handed researchers a super-powered magnifying glass, letting them examine multiple studies at the same time to find common trends or results.

You might have heard of the Cochrane Reviews in healthcare. These are big collections of meta-analyses that help doctors and policymakers figure out what treatments work best based on all the research that's been done.

For example, if ten different studies show that a certain medicine helps lower blood pressure, a meta-analysis would pull all that information together to give a more accurate answer.
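
Here's a minimal, hypothetical sketch of the core idea in Python: each study contributes an effect estimate (say, the drop in blood pressure) plus a standard error, and the studies are pooled with inverse-variance weights so that more precise studies count for more. Real meta-analyses use dedicated tools and also check for heterogeneity and publication bias.

```python
import numpy as np

# Hypothetical results from five studies of the same blood-pressure drug:
# mean reduction in mmHg and its standard error for each study.
effects = np.array([4.0, 5.5, 3.2, 6.1, 4.8])
std_errors = np.array([1.2, 2.0, 0.9, 2.5, 1.5])

# Fixed-effect (inverse-variance) pooling: weight = 1 / variance.
weights = 1.0 / std_errors**2
pooled = np.sum(weights * effects) / np.sum(weights)
pooled_se = np.sqrt(1.0 / np.sum(weights))

print(f"pooled effect: {pooled:.2f} mmHg "
      f"(95% CI {pooled - 1.96 * pooled_se:.2f} to {pooled + 1.96 * pooled_se:.2f})")
```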

Meta-Analysis Pros

The beauty of Meta-Analysis is that it can provide really strong evidence. Instead of relying on one study, you're looking at the whole landscape of research on a topic.

Meta-Analysis Cons

However, it does have some downsides. For one, Meta-Analysis is only as good as the studies it includes. If those studies are flawed, the meta-analysis will be too. It's like baking a cake: if you use bad ingredients, it doesn't matter how good your recipe is—the cake won't turn out well.

Meta-Analysis Uses

Despite these challenges, meta-analyses are highly respected and widely used in many fields like medicine, psychology, and education. They help us make sense of a world that's bursting with information by showing us the big picture drawn from many smaller snapshots.

So, in our all-star lineup, if True Experimental Design is the quarterback and Longitudinal Design is the wise elder, the Factorial Design is the strategist, the Cross-Sectional Design is the speedster, and the Correlational Design is the scout, then the Meta-Analysis is like the coach, using insights from everyone else's plays to come up with the best game plan.

9) Non-Experimental Design

Now, let's talk about a player who's a bit of an outsider on this team of experimental designs—the Non-Experimental Design. Think of this design as the commentator or the journalist who covers the game but doesn't actually play.

In a Non-Experimental Design, researchers are like reporters gathering facts, but they don't interfere or change anything. They're simply there to describe and analyze.

Non-Experimental Design Pros

So, what's the deal with Non-Experimental Design? Its strength is in description and exploration. It's really good for studying things as they are in the real world, without changing any conditions.

Non-Experimental Design Cons

Because a non-experimental design doesn't manipulate variables, it can't prove cause and effect. It's like a weather reporter: they can tell you it's raining, but they can't tell you why it's raining.

The downside? Since researchers aren't controlling variables, it's hard to rule out other explanations for what they observe. It's like hearing one side of a story—you get an idea of what happened, but it might not be the complete picture.

Non-Experimental Design Uses

Non-Experimental Design has always been a part of research, especially in fields like anthropology, sociology, and some areas of psychology.

For instance, if you've ever heard of studies that describe how people behave in different cultures or what teens like to do in their free time, that's often Non-Experimental Design at work. These studies aim to capture the essence of a situation, like painting a portrait instead of taking a snapshot.

One well-known example you might have heard about is the Kinsey Reports from the 1940s and 1950s, which described sexual behavior in men and women. Researchers interviewed thousands of people but didn't manipulate any variables like you would in a true experiment. They simply collected data to create a comprehensive picture of the subject matter.

So, in our metaphorical team of research designs, if True Experimental Design is the quarterback and Longitudinal Design is the wise elder, Factorial Design is the strategist, Cross-Sectional Design is the speedster, Correlational Design is the scout, and Meta-Analysis is the coach, then Non-Experimental Design is the sports journalist—always present, capturing the game, but not part of the action itself.

10) Repeated Measures Design


Time to meet the Repeated Measures Design, the time traveler of our research team. If this design were a player in a sports game, it would be the one who keeps revisiting past plays to figure out how to improve the next one.

Repeated Measures Design is all about studying the same people or subjects multiple times to see how they change or react under different conditions.

The idea behind Repeated Measures Design isn't new; it's been around since the early days of psychology and medicine. You could say it's a cousin to the Longitudinal Design, but instead of looking at how things naturally change over time, it focuses on how the same group reacts to different things.

Imagine a study looking at how a new energy drink affects people's running speed. Instead of comparing one group that drank the energy drink to another group that didn't, a Repeated Measures Design would have the same group of people run multiple times—once with the energy drink, and once without. This way, you're really zeroing in on the effect of that energy drink, making the results more reliable.
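
Here's a minimal sketch of how that comparison might be analyzed in Python, assuming hypothetical run times: because the same runners appear in both conditions, a paired test on the within-person differences is the natural choice.

```python
from scipy.stats import ttest_rel

# Hypothetical 5 km run times (minutes) for the same eight runners,
# once without the energy drink and once with it.
without_drink = [25.1, 27.3, 24.8, 30.2, 26.5, 28.0, 25.9, 29.4]
with_drink    = [24.6, 26.9, 24.5, 29.5, 26.4, 27.1, 25.2, 28.8]

t_stat, p_value = ttest_rel(without_drink, with_drink)
print(f"paired t = {t_stat:.2f}, p = {p_value:.4f}")
# A small p-value suggests the within-person change isn't just noise.
```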

Repeated Measures Design Pros

The strong point of Repeated Measures Design is that it's super focused. Because it uses the same subjects, you don't have to worry about differences between groups messing up your results.

Repeated Measures Design Cons

But the downside? Well, people can get tired or bored if they're tested too many times, which might affect how they respond.

Repeated Measures Design Uses

A famous example of this design is the "Little Albert" experiment, conducted by John B. Watson and Rosalie Rayner in 1920. In this study, a young boy was exposed to a white rat and other stimuli several times to see how his emotional responses changed. Though the ethical standards of this experiment are often criticized today, it was groundbreaking in understanding conditioned emotional responses.

In our metaphorical lineup of research designs, if True Experimental Design is the quarterback and Longitudinal Design is the wise elder, Factorial Design is the strategist, Cross-Sectional Design is the speedster, Correlational Design is the scout, Meta-Analysis is the coach, and Non-Experimental Design is the journalist, then Repeated Measures Design is the time traveler—always looping back to fine-tune the game plan.

11) Crossover Design

Next up is Crossover Design, the switch-hitter of the research world. If you're familiar with baseball, you'll know a switch-hitter is someone who can bat both right-handed and left-handed.

In a similar way, Crossover Design allows subjects to experience multiple conditions, flipping them around so that everyone gets a turn in each role.

This design is like the utility player on our team—versatile, flexible, and really good at adapting.

The Crossover Design has its roots in medical research and has been popular since the mid-20th century. It's often used in clinical trials to test the effectiveness of different treatments.

Crossover Design Pros

The neat thing about this design is that each participant serves as their own control group, which cuts down on the "noise" that comes from individual differences; since each person experiences all conditions, it's easier to see real effects. Imagine you're testing two new kinds of headache medicine. Instead of giving one type to one group and another type to a different group, you'd give both kinds to the same people, but at different times.

Crossover Design Cons

The catch is that this design assumes the first condition leaves no lasting effect by the time you switch to the second one, and that isn't always true. If the first treatment has a long-lasting (carryover) effect, it can muddy the results of the second treatment, which is why crossover studies often build in a "washout" period between conditions.

Crossover Design Uses

A well-known example of Crossover Design is in studies that look at the effects of different types of diets—like low-carb vs. low-fat diets. Researchers might have participants follow a low-carb diet for a few weeks, then switch them to a low-fat diet. By doing this, they can more accurately measure how each diet affects the same group of people.
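
Under simplifying assumptions (no carryover, made-up numbers), here's a sketch in Python of the basic crossover comparison: every participant provides a measurement under both diets, so the treatment effect is estimated from within-person differences, whichever order the diets came in.

```python
import numpy as np

# Hypothetical weight change (kg) for six participants who followed both diets
# in two periods (half did low-carb first, half did low-fat first).
low_carb = np.array([-2.1, -1.8, -2.5, -1.2, -2.9, -1.6])
low_fat  = np.array([-1.4, -1.1, -1.9, -0.8, -2.2, -1.0])

within_person_diff = low_carb - low_fat   # each person is their own control
print("mean extra loss on low-carb:", round(within_person_diff.mean(), 2), "kg")
print("spread of that difference:  ", round(within_person_diff.std(ddof=1), 2), "kg")
```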

In our team of experimental designs, if True Experimental Design is the quarterback and Longitudinal Design is the wise elder, Factorial Design is the strategist, Cross-Sectional Design is the speedster, Correlational Design is the scout, Meta-Analysis is the coach, Non-Experimental Design is the journalist, and Repeated Measures Design is the time traveler, then Crossover Design is the versatile utility player—always ready to adapt and play multiple roles to get the most accurate results.

12) Cluster Randomized Design

Meet the Cluster Randomized Design, the team captain of group-focused research. In our imaginary lineup of experimental designs, if other designs focus on individual players, then Cluster Randomized Design is looking at how the entire team functions.

This approach is especially common in educational and community-based research, and it's been gaining traction since the late 20th century.

Here's how Cluster Randomized Design works: instead of assigning individual people to different conditions, researchers assign entire groups, or "clusters." These could be schools, neighborhoods, or even entire towns. This lets you see how an intervention works in a real-world setting.

Imagine you want to see if a new anti-bullying program really works. Instead of selecting individual students, you'd introduce the program to a whole school or maybe even several schools, and then compare the results to schools without the program.
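
Here's a minimal sketch in Python (with invented school names) of what that assignment step looks like: whole schools, not individual students, get shuffled into the program or the comparison condition.

```python
import random

# Hypothetical list of participating schools (the "clusters").
schools = ["Oakwood", "Riverside", "Hillcrest", "Maplewood",
           "Lakeview", "Sunnydale", "Brookfield", "Pinehurst"]

random.seed(7)          # fixed seed so the example is reproducible
random.shuffle(schools)

half = len(schools) // 2
assignment = {
    "anti-bullying program": schools[:half],
    "usual practice":        schools[half:],
}
for condition, cluster_list in assignment.items():
    print(condition, "->", cluster_list)
```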

Cluster Randomized Design Pros

Why use Cluster Randomized Design? Well, sometimes it's just not practical to assign conditions at the individual level. For example, you can't really have half a school following a new reading program while the other half sticks with the old one; that would be way too confusing! Cluster randomization gets around this problem by assigning each whole "cluster" to a single condition.

Cluster Randomized Design Cons

There's a downside, too. Because entire groups are assigned to each condition, there's a risk that the groups might be different in some important way that the researchers didn't account for. That's like having one sports team that's full of veterans playing against a team of rookies; the match wouldn't be fair.

Cluster Randomized Design Uses

A famous example is the research conducted to test the effectiveness of different public health interventions, like vaccination programs. Researchers might roll out a vaccination program in one community but not in another, then compare the rates of disease in both.

In our metaphorical research team, if True Experimental Design is the quarterback, Longitudinal Design is the wise elder, Factorial Design is the strategist, Cross-Sectional Design is the speedster, Correlational Design is the scout, Meta-Analysis is the coach, Non-Experimental Design is the journalist, Repeated Measures Design is the time traveler, and Crossover Design is the utility player, then Cluster Randomized Design is the team captain—always looking out for the group as a whole.

13) Mixed-Methods Design

Say hello to Mixed-Methods Design, the all-rounder or the "Renaissance player" of our research team.

Mixed-Methods Design uses a blend of both qualitative and quantitative methods to get a more complete picture, just like a Renaissance person who's good at lots of different things. It's like being good at both offense and defense in a sport; you've got all your bases covered!

Mixed-Methods Design is a fairly new kid on the block, becoming more popular in the late 20th and early 21st centuries as researchers began to see the value in using multiple approaches to tackle complex questions. It's the Swiss Army knife in our research toolkit, combining the best parts of other designs to be more versatile.

Here's how it could work: Imagine you're studying the effects of a new educational app on students' math skills. You might use quantitative methods like tests and grades to measure how much the students improve—that's the 'numbers part.'

But you also want to know how the students feel about math now, or why they think they got better or worse. For that, you could conduct interviews or have students fill out journals—that's the 'story part.'

Mixed-Methods Design Pros

So, what's the scoop on Mixed-Methods Design? The strength is its versatility and depth; you're not just getting numbers or stories, you're getting both, which gives a fuller picture.

Mixed-Methods Design Cons

But, it's also more challenging. Imagine trying to play two sports at the same time! You have to be skilled in different research methods and know how to combine them effectively.

Mixed-Methods Design Uses

A high-profile example of Mixed-Methods Design is research on climate change. Scientists use numbers and data to show temperature changes (quantitative), but they also interview people to understand how these changes are affecting communities (qualitative).

In our team of experimental designs, if True Experimental Design is the quarterback, Longitudinal Design is the wise elder, Factorial Design is the strategist, Cross-Sectional Design is the speedster, Correlational Design is the scout, Meta-Analysis is the coach, Non-Experimental Design is the journalist, Repeated Measures Design is the time traveler, Crossover Design is the utility player, and Cluster Randomized Design is the team captain, then Mixed-Methods Design is the Renaissance player—skilled in multiple areas and able to bring them all together for a winning strategy.

14) Multivariate Design

Now, let's turn our attention to Multivariate Design, the multitasker of the research world.

If our lineup of research designs were like players on a basketball court, Multivariate Design would be the player dribbling, passing, and shooting all at once. This design doesn't just look at one or two things; it looks at several variables simultaneously to see how they interact and affect each other.

Multivariate Design is like baking a cake with many ingredients. Instead of just looking at how flour affects the cake, you also consider sugar, eggs, and milk all at once. This way, you understand how everything works together to make the cake taste good or bad.

Multivariate Design has been a go-to method in psychology, economics, and social sciences since the latter half of the 20th century. With the advent of computers and advanced statistical software, analyzing multiple variables at once became a lot easier, and Multivariate Design soared in popularity.

Multivariate Design Pros

So, what's the benefit of using Multivariate Design? Its power lies in its complexity. By studying multiple variables at the same time, you can get a really rich, detailed understanding of what's going on.

Multivariate Design Cons

But that complexity can also be a drawback. With so many variables, it can be tough to tell which ones are really making a difference and which ones are just along for the ride.

Multivariate Design Uses

Imagine you're a coach trying to figure out the best strategy to win games. You wouldn't just look at how many points your star player scores; you'd also consider assists, rebounds, turnovers, and maybe even how loud the crowd is. A Multivariate Design would help you understand how all these factors work together to determine whether you win or lose.

A well-known example of Multivariate Design is in market research. Companies often use this approach to figure out how different factors—like price, packaging, and advertising—affect sales. By studying multiple variables at once, they can find the best combination to boost profits.

In our metaphorical research team, if True Experimental Design is the quarterback, Longitudinal Design is the wise elder, Factorial Design is the strategist, Cross-Sectional Design is the speedster, Correlational Design is the scout, Meta-Analysis is the coach, Non-Experimental Design is the journalist, Repeated Measures Design is the time traveler, Crossover Design is the utility player, Cluster Randomized Design is the team captain, and Mixed-Methods Design is the Renaissance player, then Multivariate Design is the multitasker—juggling many variables at once to get a fuller picture of what's happening.

15) Pretest-Posttest Design

Let's introduce Pretest-Posttest Design, the "Before and After" superstar of our research team. You've probably seen those before-and-after pictures in ads for weight loss programs or home renovations, right?

Well, this design is like that, but for science! Pretest-Posttest Design checks out what things are like before the experiment starts and then compares that to what things are like after the experiment ends.

This design is one of the classics, a staple in research for decades across various fields like psychology, education, and healthcare. It's so simple and straightforward that it has stayed popular for a long time.

In Pretest-Posttest Design, you measure your subject's behavior or condition before you introduce any changes—that's your "before" or "pretest." Then you do your experiment, and after it's done, you measure the same thing again—that's your "after" or "posttest."
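
Here's a minimal sketch in Python, with invented quiz scores, of the basic before-and-after comparison: compute each person's gain and check whether the average gain is clearly above zero.

```python
import numpy as np
from scipy.stats import ttest_1samp

# Hypothetical multiplication quiz scores (out of 20) for the same ten students.
pretest  = np.array([11, 9, 14, 8, 12, 10, 13, 7, 15, 9])
posttest = np.array([14, 12, 15, 11, 13, 12, 16, 9, 17, 12])

gains = posttest - pretest
t_stat, p_value = ttest_1samp(gains, 0.0)   # is the average gain different from zero?
print(f"average gain: {gains.mean():.1f} points, t = {t_stat:.2f}, p = {p_value:.4f}")
# A real study would also need a comparison group, since students might
# improve simply from practice or from getting older.
```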

Pretest-Posttest Design Pros

What makes Pretest-Posttest Design special? It's pretty easy to understand and doesn't require fancy statistics.

Pretest-Posttest Design Cons

But there are some pitfalls. For example, what if the kids in the math example below get better at multiplication simply because they're older, or because they've already seen the test once? Those kinds of changes would make it hard to tell whether the program itself is really effective.

Pretest-Posttest Design Uses

Let's say you're a teacher and you want to know if a new math program helps kids get better at multiplication. First, you'd give all the kids a multiplication test—that's your pretest. Then you'd teach them using the new math program. At the end, you'd give them the same test again—that's your posttest. If the kids do better on the second test, you might conclude that the program works.

One famous use of Pretest-Posttest Design is in evaluating the effectiveness of driver's education courses. Researchers will measure people's driving skills before and after the course to see if they've improved.

16) Solomon Four-Group Design

Next up is the Solomon Four-Group Design, the "chess master" of our research team. This design is all about strategy and careful planning. Named after Richard L. Solomon who introduced it in the 1940s, this method tries to correct some of the weaknesses in simpler designs, like the Pretest-Posttest Design.

Here's how it rolls: The Solomon Four-Group Design uses four different groups to test a hypothesis. Two groups get a pretest, then one of them receives the treatment or intervention, and both get a posttest. The other two groups skip the pretest, and only one of them receives the treatment before they both get a posttest.

Sound complicated? It's like playing 4D chess; you're thinking several moves ahead!
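
If it helps to see the layout, here's a minimal sketch in Python showing the four groups as a simple table (the group numbering and printout are just illustrative):

```python
# The Solomon Four-Group layout: crossing "gets a pretest?" with "gets the treatment?"
groups = [
    {"group": 1, "pretest": True,  "treatment": True,  "posttest": True},
    {"group": 2, "pretest": True,  "treatment": False, "posttest": True},
    {"group": 3, "pretest": False, "treatment": True,  "posttest": True},
    {"group": 4, "pretest": False, "treatment": False, "posttest": True},
]

for g in groups:
    print(f"Group {g['group']}: pretest={g['pretest']!s:5} "
          f"treatment={g['treatment']!s:5} posttest={g['posttest']}")
# Comparing groups 1 vs 3 (and 2 vs 4) shows whether taking the pretest
# itself changed how people responded.
```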

Solomon Four-Group Design Pros

What's the big advantage of the Solomon Four-Group Design? It provides really robust results because, with its four groups, it can show not only whether the treatment works but also whether the pretest itself is influencing the outcome.

Solomon Four-Group Design Cons

The downside? It's a lot of work and requires a lot of participants, making it more time-consuming and costly.

Solomon Four-Group Design Uses

Let's say you want to figure out if a new way of teaching history helps students remember facts better. Two classes take a history quiz (pretest), then one class uses the new teaching method while the other sticks with the old way. Both classes take another quiz afterward (posttest).

Meanwhile, two more classes skip the initial quiz, and then one uses the new method before both take the final quiz. Comparing all four groups will give you a much clearer picture of whether the new teaching method works and whether the pretest itself affects the outcome.

The Solomon Four-Group Design is less commonly used than simpler designs but is highly respected for its ability to control for more variables. It's a favorite in educational and psychological research where you really want to dig deep and figure out what's actually causing changes.

17) Adaptive Designs

Now, let's talk about Adaptive Designs, the chameleons of the experimental world.

Imagine you're a detective, and halfway through solving a case, you find a clue that changes everything. You wouldn't just stick to your old plan; you'd adapt and change your approach, right? That's exactly what Adaptive Designs allow researchers to do.

In an Adaptive Design, researchers can make changes to the study as it's happening, based on early results. In a traditional study, once you set your plan, you stick to it from start to finish.

Adaptive Design Pros

This method is particularly useful in fast-paced or high-stakes situations, like developing a new vaccine in the middle of a pandemic. The ability to adapt can save both time and resources, and more importantly, it can save lives by getting effective treatments out faster.

Adaptive Design Cons

But Adaptive Designs aren't without their drawbacks. They can be very complex to plan and carry out, and there's always a risk that the changes made during the study could introduce bias or errors.

Adaptive Design Uses

Adaptive Designs are most often seen in clinical trials, particularly in the medical and pharmaceutical fields.

For instance, if a new drug is showing really promising results, the study might be adjusted to give more participants the new treatment instead of a placebo. Or if one dose level is showing bad side effects, it might be dropped from the study.

The best part is, these changes are pre-planned. Researchers lay out in advance what changes might be made and under what conditions, which helps keep everything scientific and above board.

In terms of applications, besides their heavy usage in medical and pharmaceutical research, Adaptive Designs are also becoming increasingly popular in software testing and market research. In these fields, being able to quickly adjust to early results can give companies a significant advantage.

Adaptive Designs are like the agile startups of the research world—quick to pivot, keen to learn from ongoing results, and focused on rapid, efficient progress. However, they require a great deal of expertise and careful planning to ensure that the adaptability doesn't compromise the integrity of the research.

18) Bayesian Designs

Next, let's dive into Bayesian Designs, the data detectives of the research universe. Named after Thomas Bayes, an 18th-century statistician and minister, this design doesn't just look at what's happening now; it also takes into account what's happened before.

Imagine if you were a detective who not only looked at the evidence in front of you but also used your past cases to make better guesses about your current one. That's the essence of Bayesian Designs.

Bayesian Designs are like detective work in science. As you gather more clues (or data), you update your best guess on what's really happening. This way, your experiment gets smarter as it goes along.

In the world of research, Bayesian Designs are most notably used in areas where you have some prior knowledge that can inform your current study. For example, if earlier research shows that a certain type of medicine usually works well for a specific illness, a Bayesian Design would include that information when studying a new group of patients with the same illness.
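
Here's a minimal, hypothetical sketch of that idea in Python using a Beta-Binomial update: prior studies suggest the medicine helps roughly 70% of patients, and new trial data nudges that belief up or down. The numbers are invented purely for illustration.

```python
# Prior belief from earlier research: roughly 70% of patients respond.
# Encoded as a Beta(14, 6) distribution (mean 14 / (14 + 6) = 0.70).
prior_successes, prior_failures = 14, 6

# New (hypothetical) trial data: 18 of 25 new patients respond.
new_successes, new_failures = 18, 7

# Bayesian updating for a Beta prior with binomial data is just addition.
post_successes = prior_successes + new_successes
post_failures = prior_failures + new_failures

prior_mean = prior_successes / (prior_successes + prior_failures)
post_mean = post_successes / (post_successes + post_failures)
print(f"prior estimate of response rate:     {prior_mean:.2f}")
print(f"updated estimate after new evidence: {post_mean:.2f}")
```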

Bayesian Design Pros

One of the major advantages of Bayesian Designs is their efficiency. Because they use existing data to inform the current experiment, often fewer resources are needed to reach a reliable conclusion.

Bayesian Design Cons

However, they can be quite complicated to set up and require a deep understanding of both statistics and the subject matter at hand.

Bayesian Design Uses

Bayesian Designs are highly valued in medical research, finance, environmental science, and even in Internet search algorithms. Their ability to continually update and refine hypotheses based on new evidence makes them particularly useful in fields where data is constantly evolving and where quick, informed decisions are crucial.

Here's a real-world example: In the development of personalized medicine, where treatments are tailored to individual patients, Bayesian Designs are invaluable. If a treatment has been effective for patients with similar genetics or symptoms in the past, a Bayesian approach can use that data to predict how well it might work for a new patient.

This type of design is also increasingly popular in machine learning and artificial intelligence. In these fields, Bayesian Designs help algorithms "learn" from past data to make better predictions or decisions in new situations. It's like teaching a computer to be a detective that gets better and better at solving puzzles the more puzzles it sees.

19) Covariate Adaptive Randomization

old person and young person

Now let's turn our attention to Covariate Adaptive Randomization, which you can think of as the "matchmaker" of experimental designs.

Picture a soccer coach trying to create the most balanced teams for a friendly match. They wouldn't just randomly assign players; they'd take into account each player's skills, experience, and other traits.

Covariate Adaptive Randomization is all about creating the most evenly matched groups possible for an experiment.

In traditional randomization, participants are allocated to different groups purely by chance. This is a pretty fair way to do things, but it can sometimes lead to unbalanced groups.

Imagine if all the professional-level players ended up on one soccer team and all the beginners on another; that wouldn't be a very informative match! Covariate Adaptive Randomization fixes this by using important traits or characteristics (called "covariates") to guide the randomization process.
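
Here's a deliberately simplified sketch in Python of one common flavor, minimization: as each new participant arrives, they're assigned to whichever arm currently has fewer people with the same covariate level (here, age group), with ties broken at random. Real trials balance several covariates at once and usually keep a random element in every assignment.

```python
import random

random.seed(1)
counts = {"treatment": {"young": 0, "old": 0},
          "control":   {"young": 0, "old": 0}}

def assign(age_group):
    """Send the participant to the arm with fewer people in their age group."""
    if counts["treatment"][age_group] < counts["control"][age_group]:
        arm = "treatment"
    elif counts["treatment"][age_group] > counts["control"][age_group]:
        arm = "control"
    else:
        arm = random.choice(["treatment", "control"])  # tie -> coin flip
    counts[arm][age_group] += 1
    return arm

incoming = ["young", "old", "young", "young", "old", "old", "young", "old"]
for person in incoming:
    print(person, "->", assign(person))
print(counts)
```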

Covariate Adaptive Randomization Pros

The benefits of this design are pretty clear: it aims for balance and fairness, making the final results more trustworthy.

Covariate Adaptive Randomization Cons

But it's not perfect. It can be complex to implement and requires a deep understanding of which characteristics are most important to balance.

Covariate Adaptive Randomization Uses

This design is particularly useful in medical trials. Let's say researchers are testing a new medication for high blood pressure. Participants might have different ages, weights, or pre-existing conditions that could affect the results.

Covariate Adaptive Randomization would make sure that each treatment group has a similar mix of these characteristics, making the results more reliable and easier to interpret.

In practical terms, this design is often seen in clinical trials for new drugs or therapies, but its principles are also applicable in fields like psychology, education, and social sciences.

For instance, in educational research, it might be used to ensure that classrooms being compared have similar distributions of students in terms of academic ability, socioeconomic status, and other factors.

Covariate Adaptive Randomization is like the careful matchmaker of the group, ensuring that everyone has an equal opportunity to show their true capabilities, thereby making the collective results as reliable as possible.

20) Stepped Wedge Design

Let's now focus on the Stepped Wedge Design, a thoughtful and cautious member of the experimental design family.

Imagine you're trying out a new gardening technique, but you're not sure how well it will work. You decide to apply it to one section of your garden first, watch how it performs, and then gradually extend the technique to other sections. This way, you get to see its effects over time and across different conditions. That's basically how Stepped Wedge Design works.

In a Stepped Wedge Design, all participants or clusters start off in the control group, and then, at different times, they 'step' over to the intervention or treatment group. This creates a wedge-like pattern over time where more and more participants receive the treatment as the study progresses. It's like rolling out a new policy in phases, monitoring its impact at each stage before extending it to more people.
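
A minimal sketch in Python of what that rollout schedule looks like (cluster and period counts are made up): 0 means a cluster is still in the control condition, 1 means it has "stepped" over to the intervention.

```python
clusters = ["Clinic A", "Clinic B", "Clinic C", "Clinic D"]
periods = 5  # measurement periods, including a baseline where no one is treated

# Cluster i steps over to the intervention at period i + 1 (staggered rollout).
schedule = {
    cluster: [1 if period > i else 0 for period in range(periods)]
    for i, cluster in enumerate(clusters)
}

print("period:   ", list(range(periods)))
for cluster, row in schedule.items():
    print(f"{cluster:10}", row)
# Each row switches from 0s to 1s at a different time, forming the "wedge".
```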

Stepped Wedge Design Pros

The Stepped Wedge Design offers several advantages. Firstly, it suits interventions that are expected to do more good than harm, since no group is asked to go without the treatment forever, which makes it ethically appealing.

Secondly, it's useful when resources are limited and it's not feasible to roll out a new treatment to everyone at once. Lastly, because everyone eventually receives the treatment, it can be easier to get buy-in from participants or organizations involved in the study.

Stepped Wedge Design Cons

However, this design can be complex to analyze because it has to account for both the time factor and the changing conditions in each 'step' of the wedge. And like any study where participants know they're receiving an intervention, there's the potential for the results to be influenced by the placebo effect or other biases.

Stepped Wedge Design Uses

This design is particularly useful in health and social care research. For instance, if a hospital wants to implement a new hygiene protocol, it might start in one department, assess its impact, and then roll it out to other departments over time. This allows the hospital to adjust and refine the new protocol based on real-world data before it's fully implemented.

In terms of applications, Stepped Wedge Designs are commonly used in public health initiatives, organizational changes in healthcare settings, and social policy trials. They are particularly useful in situations where an intervention is being rolled out gradually and it's important to understand its impacts at each stage.

21) Sequential Design

Next up is Sequential Design, the dynamic and flexible member of our experimental design family.

Imagine you're playing a video game where you can choose different paths. If you take one path and find a treasure chest, you might decide to continue in that direction. If you hit a dead end, you might backtrack and try a different route. Sequential Design operates in a similar fashion, allowing researchers to make decisions at different stages based on what they've learned so far.

In a Sequential Design, the experiment is broken down into smaller parts, or "sequences." After each sequence, researchers pause to look at the data they've collected. Based on those findings, they then decide whether to stop the experiment because they've got enough information, or to continue and perhaps even modify the next sequence.
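
Here's a heavily simplified sketch in Python of that "look, then decide" loop, using simulated data and a fixed stopping threshold. Real sequential trials use specially adjusted boundaries (for example, O'Brien-Fleming) so that repeated looks don't inflate the false-positive rate.

```python
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(3)
treatment_all = rng.normal(1.0, 2.0, 120)   # hypothetical outcomes with a real benefit
control_all   = rng.normal(0.0, 2.0, 120)

stop_threshold = 0.01   # simplified; real designs adjust this for each look
for n in (30, 60, 90, 120):                 # planned interim looks
    t_stat, p = ttest_ind(treatment_all[:n], control_all[:n])
    print(f"after {n} per group: p = {p:.4f}")
    if p < stop_threshold:
        print("evidence is strong enough -- stop the trial early")
        break
else:
    print("ran to the planned maximum sample size")
```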

Sequential Design Pros

One of the great things about Sequential Design is its efficiency. Because you're making data-driven decisions along the way, and only continuing the experiment if the data suggests it's worth doing so, you can often reach conclusions more quickly and with fewer resources.

Sequential Design Cons

However, it requires careful planning and expertise to ensure that these "stop or go" decisions are made correctly and without bias.

Sequential Design Uses

This design is often used in clinical trials involving new medications or treatments. For example, if early results show that a new drug has significant side effects, the trial can be stopped before more people are exposed to it.

On the flip side, if the drug is showing promising results, the trial might be expanded to include more participants or to extend the testing period.

Beyond healthcare and medicine, Sequential Design is also popular in quality control in manufacturing, environmental monitoring, and financial modeling. In these areas, being able to make quick decisions based on incoming data can be a big advantage.

Think of Sequential Design as the nimble athlete of experimental designs, capable of quick pivots and adjustments to reach the finish line in the most effective way possible. But just like an athlete needs a good coach, this design requires expert oversight to make sure it stays on the right track.

22) Field Experiments

Last but certainly not least, let's explore Field Experiments—the adventurers of the experimental design world.

Picture a scientist leaving the controlled environment of a lab to test a theory in the real world, like a biologist studying animals in their natural habitat or a social scientist observing people in a real community. These are Field Experiments, and they're all about getting out there and gathering data in real-world settings.

Field Experiments embrace the messiness of the real world, unlike laboratory experiments, where everything is controlled down to the smallest detail. This makes them both exciting and challenging.

Field Experiment Pros

On one hand, Field Experiments offer real-world relevance: the results often give us a better understanding of how things actually work outside the lab.

Field Experiment Cons

On the other hand, the lack of control can make it harder to tell exactly what's causing what, and there are added challenges like accounting for outside factors and the ethical considerations of intervening in people's lives without their knowledge. Yet, despite these challenges, Field Experiments remain a valuable tool for researchers who want to understand how theories play out in the real world.

Field Experiment Uses

Let's say a school wants to improve student performance. In a Field Experiment, they might change the school's daily schedule for one semester and keep track of how students perform compared to another school where the schedule remained the same.

Because the study is happening in a real school with real students, the results could be very useful for understanding how the change might work in other schools. But since it's the real world, lots of other factors—like changes in teachers or even the weather—could affect the results.

Field Experiments are widely used in economics, psychology, education, and public policy. For example, you might have heard of the "broken windows" idea from the 1980s, which drew on field research into how small signs of disorder, like broken windows or graffiti, could encourage more serious crime in neighborhoods. That work had a big impact on how cities think about crime prevention.

From the foundational concepts of control groups and independent variables to the sophisticated layouts like Covariate Adaptive Randomization and Sequential Design, it's clear that the realm of experimental design is as varied as it is fascinating.

We've seen that each design has its own special talents, ideal for specific situations. Some designs, like the Classic Controlled Experiment, are like reliable old friends you can always count on.

Others, like Sequential Design, are flexible and adaptable, making quick changes based on what they learn. And let's not forget the adventurous Field Experiments, which take us out of the lab and into the real world to discover things we might not see otherwise.

Choosing the right experimental design is like picking the right tool for the job. The method you choose can make a big difference in how reliable your results are and how much people will trust what you've discovered. And as we've learned, there's a design to suit just about every question, every problem, and every curiosity.

So the next time you read about a new discovery in medicine, psychology, or any other field, you'll have a better understanding of the thought and planning that went into figuring things out. Experimental design is more than just a set of rules; it's a structured way to explore the unknown and answer questions that can change the world.


Experimental Design – Types, Methods, Guide

Experimental Design

Experimental design is a process of planning and conducting scientific experiments to investigate a hypothesis or research question. It involves carefully designing an experiment that can test the hypothesis, and controlling for other variables that may influence the results.

Experimental design typically includes identifying the variables that will be manipulated or measured, defining the sample or population to be studied, selecting an appropriate method of sampling, choosing a method for data collection and analysis, and determining the appropriate statistical tests to use.

Types of Experimental Design

Here are the different types of experimental design:

Completely Randomized Design

In this design, participants are randomly assigned to one of two or more groups, and each group is exposed to a different treatment or condition.

Randomized Block Design

This design involves dividing participants into blocks based on a specific characteristic, such as age or gender, and then randomly assigning participants within each block to one of two or more treatment groups.

Factorial Design

In a factorial design, participants are randomly assigned to one of several groups, each of which receives a different combination of two or more independent variables.

Repeated Measures Design

In this design, each participant is exposed to all of the different treatments or conditions, either in a random order or in a predetermined order.

Crossover Design

This design involves randomly assigning participants to one of two or more treatment groups, with each group receiving one treatment during the first phase of the study and then switching to a different treatment during the second phase.

Split-plot Design

In a split-plot design, one factor is applied to larger units ("whole plots") and a second factor is applied to smaller units ("subplots") nested within them. It is often used when one variable is harder or more costly to change than the other, and it borrows blocking ideas to control for other variables.

Nested Design

This design involves grouping participants within larger units, such as schools or households, and then randomly assigning these units to different treatment groups.

Laboratory Experiment

Laboratory experiments are conducted under controlled conditions, which allows for greater precision and accuracy. However, because laboratory conditions are not always representative of real-world conditions, the results of these experiments may not be generalizable to the population at large.

Field Experiment

Field experiments are conducted in naturalistic settings and allow for more realistic observations. However, because field experiments are not as controlled as laboratory experiments, they may be subject to more sources of error.

Experimental Design Methods

Experimental design methods refer to the techniques and procedures used to design and conduct experiments in scientific research. Here are some common experimental design methods:

Randomization

This involves randomly assigning participants to different groups or treatments to ensure that any observed differences between groups are due to the treatment and not to other factors.

Control Group

The use of a control group is an important experimental design method that involves having a group of participants that do not receive the treatment or intervention being studied. The control group is used as a baseline to compare the effects of the treatment group.

Blinding

This involves keeping participants, researchers, or both unaware of which treatment group participants are in, in order to reduce the risk of bias in the results.

Counterbalancing

This involves systematically varying the order in which participants receive treatments or interventions in order to control for order effects.
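
As a small illustration, here's a sketch in Python of one simple way to counterbalance: generate every possible ordering of the conditions and cycle participants through those orders. The condition and participant labels are hypothetical.

```python
from itertools import permutations, cycle

conditions = ["A", "B", "C"]                    # three treatments to counterbalance
orders = cycle(permutations(conditions))        # all 6 possible orders, reused in turn

participants = [f"P{i}" for i in range(1, 13)]  # 12 hypothetical participants
for person, order in zip(participants, orders):
    print(person, "->", " then ".join(order))
```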

Replication

Replication involves conducting the same experiment with different samples or under different conditions to increase the reliability and validity of the results.

Factorial Design

This experimental design method involves manipulating multiple independent variables simultaneously to investigate their combined effects on the dependent variable.

Blocking

This involves dividing participants into subgroups or blocks based on specific characteristics, such as age or gender, in order to reduce the risk of confounding variables.

Data Collection Methods

Experimental design data collection methods are techniques and procedures used to collect data in experimental research. Here are some common experimental design data collection methods:

Direct Observation

This method involves observing and recording the behavior or phenomenon of interest in real time. It may involve the use of structured or unstructured observation, and may be conducted in a laboratory or naturalistic setting.

Self-report Measures

Self-report measures involve asking participants to report their thoughts, feelings, or behaviors using questionnaires, surveys, or interviews. These measures may be administered in person or online.

Behavioral Measures

Behavioral measures involve measuring participants’ behavior directly, such as through reaction time tasks or performance tests. These measures may be administered using specialized equipment or software.

Physiological Measures

Physiological measures involve measuring participants’ physiological responses, such as heart rate, blood pressure, or brain activity, using specialized equipment. These measures may be invasive or non-invasive, and may be administered in a laboratory or clinical setting.

Archival Data

Archival data involves using existing records or data, such as medical records, administrative records, or historical documents, as a source of information. These data may be collected from public or private sources.

Computerized Measures

Computerized measures involve using software or computer programs to collect data on participants’ behavior or responses. These measures may include reaction time tasks, cognitive tests, or other types of computer-based assessments.

Video Recording

Video recording involves recording participants’ behavior or interactions using cameras or other recording equipment. This method can be used to capture detailed information about participants’ behavior or to analyze social interactions.

Data Analysis Methods

Experimental design data analysis methods refer to the statistical techniques and procedures used to analyze data collected in experimental research. Here are some common experimental design data analysis methods:

Descriptive Statistics

Descriptive statistics are used to summarize and describe the data collected in the study. This includes measures such as mean, median, mode, range, and standard deviation.

Inferential Statistics

Inferential statistics are used to make inferences or generalizations about a larger population based on the data collected in the study. This includes hypothesis testing and estimation.

Analysis of Variance (ANOVA)

ANOVA is a statistical technique used to compare means across two or more groups in order to determine whether there are significant differences between the groups. There are several types of ANOVA, including one-way ANOVA, two-way ANOVA, and repeated measures ANOVA.
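
For instance, a one-way ANOVA comparing three groups can be run in a few lines of Python with SciPy; the scores below are made up for illustration.

```python
from scipy.stats import f_oneway

# Hypothetical test scores for three different teaching methods.
method_a = [78, 82, 75, 80, 85, 79]
method_b = [72, 70, 68, 75, 71, 69]
method_c = [88, 85, 90, 84, 87, 86]

f_stat, p_value = f_oneway(method_a, method_b, method_c)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
# A small p-value says at least one group mean differs; follow-up (post hoc)
# tests are needed to find out which groups differ from which.
```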

Regression Analysis

Regression analysis is used to model the relationship between two or more variables in order to determine the strength and direction of the relationship. There are several types of regression analysis, including linear regression, logistic regression, and multiple regression.
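
Here's a minimal example of simple linear regression in Python with SciPy, using hypothetical data: it estimates the slope and intercept of the line relating weekly exercise hours to a stress score.

```python
from scipy.stats import linregress

# Hypothetical data: weekly hours of exercise and a stress score (0-100).
exercise_hours = [0, 1, 2, 3, 4, 5, 6, 7]
stress_scores  = [72, 68, 65, 60, 58, 52, 50, 47]

result = linregress(exercise_hours, stress_scores)
print(f"slope = {result.slope:.2f} stress points per hour of exercise")
print(f"intercept = {result.intercept:.2f}, r-squared = {result.rvalue**2:.3f}")
```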

Factor Analysis

Factor analysis is used to identify underlying factors or dimensions in a set of variables. This can be used to reduce the complexity of the data and identify patterns in the data.

Structural Equation Modeling (SEM)

SEM is a statistical technique used to model complex relationships between variables. It can be used to test complex theories and models of causality.

Cluster Analysis

Cluster analysis is used to group similar cases or observations together based on similarities or differences in their characteristics.

Time Series Analysis

Time series analysis is used to analyze data collected over time in order to identify trends, patterns, or changes in the data.

Multilevel Modeling

Multilevel modeling is used to analyze data that is nested within multiple levels, such as students nested within schools or employees nested within companies.

Applications of Experimental Design 

Experimental design is a versatile research methodology that can be applied in many fields. Here are some applications of experimental design:

  • Medical Research: Experimental design is commonly used to test new treatments or medications for various medical conditions. This includes clinical trials to evaluate the safety and effectiveness of new drugs or medical devices.
  • Agriculture: Experimental design is used to test new crop varieties, fertilizers, and other agricultural practices. This includes randomized field trials to evaluate the effects of different treatments on crop yield, quality, and pest resistance.
  • Environmental science: Experimental design is used to study the effects of environmental factors, such as pollution or climate change, on ecosystems and wildlife. This includes controlled experiments to study the effects of pollutants on plant growth or animal behavior.
  • Psychology: Experimental design is used to study human behavior and cognitive processes. This includes experiments to test the effects of different interventions, such as therapy or medication, on mental health outcomes.
  • Engineering: Experimental design is used to test new materials, designs, and manufacturing processes in engineering applications. This includes laboratory experiments to test the strength and durability of new materials, or field experiments to test the performance of new technologies.
  • Education: Experimental design is used to evaluate the effectiveness of teaching methods, educational interventions, and programs. This includes randomized controlled trials to compare different teaching methods or evaluate the impact of educational programs on student outcomes.
  • Marketing: Experimental design is used to test the effectiveness of marketing campaigns, pricing strategies, and product designs. This includes experiments to test the impact of different marketing messages or pricing schemes on consumer behavior.

Examples of Experimental Design 

Here are some examples of experimental design in different fields:

  • Example in Medical research: A study that investigates the effectiveness of a new drug treatment for a particular condition. Patients are randomly assigned to either a treatment group or a control group, with the treatment group receiving the new drug and the control group receiving a placebo. The outcomes, such as improvement in symptoms or side effects, are measured and compared between the two groups.
  • Example in Education research: A study that examines the impact of a new teaching method on student learning outcomes. Students are randomly assigned to either a group that receives the new teaching method or a group that receives the traditional teaching method. Student achievement is measured before and after the intervention, and the results are compared between the two groups.
  • Example in Environmental science: A study that tests the effectiveness of a new method for reducing pollution in a river. Two sections of the river are selected, with one section treated with the new method and the other section left untreated. The water quality is measured before and after the intervention, and the results are compared between the two sections.
  • Example in Marketing research: A study that investigates the impact of a new advertising campaign on consumer behavior. Participants are randomly assigned to either a group that is exposed to the new campaign or a group that is not. Their behavior, such as purchasing or product awareness, is measured and compared between the two groups.
  • Example in Social psychology: A study that examines the effect of a new social intervention on reducing prejudice towards a marginalized group. Participants are randomly assigned to either a group that receives the intervention or a control group that does not. Their attitudes and behavior towards the marginalized group are measured before and after the intervention, and the results are compared between the two groups.

When to use Experimental Research Design 

Experimental research design should be used when a researcher wants to establish a cause-and-effect relationship between variables. It is particularly useful when studying the impact of an intervention or treatment on a particular outcome.

Here are some situations where experimental research design may be appropriate:

  • When studying the effects of a new drug or medical treatment: Experimental research design is commonly used in medical research to test the effectiveness and safety of new drugs or medical treatments. By randomly assigning patients to treatment and control groups, researchers can determine whether the treatment is effective in improving health outcomes.
  • When evaluating the effectiveness of an educational intervention: An experimental research design can be used to evaluate the impact of a new teaching method or educational program on student learning outcomes. By randomly assigning students to treatment and control groups, researchers can determine whether the intervention is effective in improving academic performance.
  • When testing the effectiveness of a marketing campaign: An experimental research design can be used to test the effectiveness of different marketing messages or strategies. By randomly assigning participants to treatment and control groups, researchers can determine whether the marketing campaign is effective in changing consumer behavior.
  • When studying the effects of an environmental intervention: Experimental research design can be used to study the impact of environmental interventions, such as pollution reduction programs or conservation efforts. By randomly assigning locations or areas to treatment and control groups, researchers can determine whether the intervention is effective in improving environmental outcomes.
  • When testing the effects of a new technology: An experimental research design can be used to test the effectiveness and safety of new technologies or engineering designs. By randomly assigning participants or locations to treatment and control groups, researchers can determine whether the new technology is effective in achieving its intended purpose.

How to Conduct Experimental Research

Here are the steps to conduct Experimental Research:

  • Identify a Research Question: Start by identifying a research question that you want to answer through the experiment. The question should be clear, specific, and testable.
  • Develop a Hypothesis: Based on your research question, develop a hypothesis that predicts the relationship between the independent and dependent variables. The hypothesis should be clear and testable.
  • Design the Experiment: Determine the type of experimental design you will use, such as a between-subjects design or a within-subjects design. Also, decide on the experimental conditions, such as the number of independent variables, the levels of the independent variable, and the dependent variable to be measured.
  • Select Participants: Select the participants who will take part in the experiment. They should be representative of the population you are interested in studying.
  • Randomly Assign Participants to Groups: If you are using a between-subjects design, randomly assign participants to groups to control for individual differences.
  • Conduct the Experiment: Conduct the experiment by manipulating the independent variable(s) and measuring the dependent variable(s) across the different conditions.
  • Analyze the Data: Analyze the data using appropriate statistical methods to determine whether the independent variable(s) have a significant effect on the dependent variable(s) (a minimal sketch of this step follows this list).
  • Draw Conclusions: Based on the data analysis, draw conclusions about the relationship between the independent and dependent variables. If the results support the hypothesis, conclude that the evidence is consistent with it; if they do not, the hypothesis is not supported (in formal terms, you either reject or fail to reject the null hypothesis; you never prove a hypothesis).
  • Communicate the Results: Finally, communicate the results of the experiment through a research report or presentation. Include the purpose of the study, the methods used, the results obtained, and the conclusions drawn.
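The sketch below, referenced in the "Analyze the Data" step, walks through random assignment, measurement, and a simple analysis for a two-group (between-subjects) design using an independent-samples t-test from SciPy. The sample size, the simulated scores, and the alpha level of 0.05 are illustrative assumptions, not prescriptions.

```python
# A minimal sketch of the "assign randomly, measure, analyze" workflow.
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=1)

# Randomly assign 40 participants to treatment or control (20 each).
participant_ids = rng.permutation(40)
treatment_ids, control_ids = participant_ids[:20], participant_ids[20:]

# Conduct the experiment (simulated here) and measure the dependent variable.
control_scores = rng.normal(loc=50.0, scale=10.0, size=control_ids.size)
treatment_scores = rng.normal(loc=57.0, scale=10.0, size=treatment_ids.size)  # assumed true effect

# Analyze the data with an appropriate statistical test.
t_stat, p_value = stats.ttest_ind(treatment_scores, control_scores)

# Draw conclusions at a conventional alpha of 0.05.
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
print("Effect detected" if p_value < 0.05 else "No effect detected")
```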

Purpose of Experimental Design 

The purpose of experimental design is to control and manipulate one or more independent variables to determine their effect on a dependent variable. Experimental design allows researchers to systematically investigate causal relationships between variables, and to establish cause-and-effect relationships between the independent and dependent variables. Through experimental design, researchers can test hypotheses and make inferences about the population from which the sample was drawn.

Experimental design provides a structured approach to designing and conducting experiments, ensuring that the results are reliable and valid. By carefully controlling for extraneous variables that may affect the outcome of the study, experimental design allows researchers to isolate the effect of the independent variable(s) on the dependent variable(s), and to minimize the influence of other factors that may confound the results.

Experimental design also allows researchers to generalize their findings to the larger population from which the sample was drawn. By randomly selecting participants and using statistical techniques to analyze the data, researchers can make inferences about the larger population with a high degree of confidence.

Overall, the purpose of experimental design is to provide a rigorous, systematic, and scientific method for testing hypotheses and establishing cause-and-effect relationships between variables. Experimental design is a powerful tool for advancing scientific knowledge and informing evidence-based practice in various fields, including psychology, biology, medicine, engineering, and social sciences.

Advantages of Experimental Design 

Experimental design offers several advantages in research. Here are some of the main advantages:

  • Control over extraneous variables: Experimental design allows researchers to control for extraneous variables that may affect the outcome of the study. By manipulating the independent variable and holding all other variables constant, researchers can isolate the effect of the independent variable on the dependent variable.
  • Establishing causality: Experimental design allows researchers to establish causality by manipulating the independent variable and observing its effect on the dependent variable. This allows researchers to determine whether changes in the independent variable cause changes in the dependent variable.
  • Replication : Experimental design allows researchers to replicate their experiments to ensure that the findings are consistent and reliable. Replication is important for establishing the validity and generalizability of the findings.
  • Random assignment: Experimental design often involves randomly assigning participants to conditions. This helps to ensure that individual differences between participants are evenly distributed across conditions, which increases the internal validity of the study.
  • Precision : Experimental design allows researchers to measure variables with precision, which can increase the accuracy and reliability of the data.
  • Generalizability : If the study is well-designed, experimental design can increase the generalizability of the findings. By controlling for extraneous variables and using random assignment, researchers can increase the likelihood that the findings will apply to other populations and contexts.

Limitations of Experimental Design

Experimental design has some limitations that researchers should be aware of. Here are some of the main limitations:

  • Artificiality : Experimental design often involves creating artificial situations that may not reflect real-world situations. This can limit the external validity of the findings, or the extent to which the findings can be generalized to real-world settings.
  • Ethical concerns: Some experimental designs may raise ethical concerns, particularly if they involve manipulating variables that could cause harm to participants or if they involve deception.
  • Participant bias : Participants in experimental studies may modify their behavior in response to the experiment, which can lead to participant bias.
  • Limited generalizability: The conditions of the experiment may not reflect the complexities of real-world situations. As a result, the findings may not be applicable to all populations and contexts.
  • Cost and time : Experimental design can be expensive and time-consuming, particularly if the experiment requires specialized equipment or if the sample size is large.
  • Researcher bias : Researchers may unintentionally bias the results of the experiment if they have expectations or preferences for certain outcomes.
  • Lack of feasibility : Experimental design may not be feasible in some cases, particularly if the research question involves variables that cannot be manipulated or controlled.

About the author: Muhammad Hassan (Researcher, Academic Writer, Web developer)


A Quick Guide to Experimental Design | 5 Steps & Examples

Published on 11 April 2022 by Rebecca Bevans. Revised on 5 December 2022.

Experiments are used to study causal relationships . You manipulate one or more independent variables and measure their effect on one or more dependent variables.

Experimental design means creating a set of procedures to systematically test a hypothesis . A good experimental design requires a strong understanding of the system you are studying. 

There are five key steps in designing an experiment:

  • Consider your variables and how they are related
  • Write a specific, testable hypothesis
  • Design experimental treatments to manipulate your independent variable
  • Assign subjects to groups, either between-subjects or within-subjects
  • Plan how you will measure your dependent variable

For valid conclusions, you also need to select a representative sample and control any extraneous variables that might influence your results. If random assignment of participants to control and treatment groups is impossible, unethical, or highly difficult, consider an observational study instead.

Table of contents

  • Step 1: Define your variables
  • Step 2: Write your hypothesis
  • Step 3: Design your experimental treatments
  • Step 4: Assign your subjects to treatment groups
  • Step 5: Measure your dependent variable
  • Frequently asked questions about experimental design

You should begin with a specific research question . We will work with two research question examples, one from health sciences and one from ecology:

To translate your research question into an experimental hypothesis, you need to define the main variables and make predictions about how they are related.

Start by simply listing the independent and dependent variables .

Research question Independent variable Dependent variable
Phone use and sleep Minutes of phone use before sleep Hours of sleep per night
Temperature and soil respiration Air temperature just above the soil surface CO2 respired from soil

Then you need to think about possible extraneous and confounding variables and consider how you might control  them in your experiment.

Extraneous variable How to control
Phone use and sleep Variation in sleep patterns among individuals. Measure the average difference between sleep with phone use and sleep without phone use rather than the average amount of sleep per treatment group.
Temperature and soil respiration Soil moisture also affects respiration, and moisture can decrease with increasing temperature. Monitor soil moisture and add water to make sure that soil moisture is consistent across all treatment plots.

Finally, you can put these variables together into a diagram. Use arrows to show the possible relationships between variables and include signs to show the expected direction of the relationships.

Diagram of the relationship between variables in a sleep experiment

Here we predict that increasing temperature will increase soil respiration and decrease soil moisture, while decreasing soil moisture will lead to decreased soil respiration.


Now that you have a strong conceptual understanding of the system you are studying, you should be able to write a specific, testable hypothesis that addresses your research question.

Null hypothesis (H0) Alternate hypothesis (Ha)
Phone use and sleep Phone use before sleep does not correlate with the amount of sleep a person gets. Increasing phone use before sleep leads to a decrease in sleep.
Temperature and soil respiration Air temperature does not correlate with soil respiration. Increased air temperature leads to increased soil respiration.

The next steps will describe how to design a controlled experiment . In a controlled experiment, you must be able to:

  • Systematically and precisely manipulate the independent variable(s).
  • Precisely measure the dependent variable(s).
  • Control any potential confounding variables.

If your study system doesn’t match these criteria, there are other types of research you can use to answer your research question.

How you manipulate the independent variable can affect the experiment’s external validity – that is, the extent to which the results can be generalised and applied to the broader world.

First, you may need to decide how widely to vary your independent variable. In the soil-warming example, for instance, you could increase air temperature:

  • just slightly above the natural range for your study region.
  • over a wider range of temperatures to mimic future warming.
  • over an extreme range that is beyond any possible natural variation.

Second, you may need to choose how finely to vary your independent variable. Sometimes this choice is made for you by your experimental system, but often you will need to decide, and this will affect how much you can infer from your results. In the phone use example, for instance, you could treat phone use as:

  • a categorical variable: either as binary (yes/no) or as levels of a factor (no phone use, low phone use, high phone use).
  • a continuous variable (minutes of phone use measured every night).

How you apply your experimental treatments to your test subjects is crucial for obtaining valid and reliable results.

First, you need to consider the study size: how many individuals will be included in the experiment? In general, the more subjects you include, the greater your experiment’s statistical power, which determines how much confidence you can have in your results.
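As a rough illustration of how study size relates to statistical power, the sketch below uses the statsmodels power module to estimate how many subjects per group would be needed under assumed values for effect size (Cohen's d = 0.5), significance level (0.05), and target power (0.80); none of these numbers come from the examples in this guide.

```python
# A minimal sketch of choosing a study size from a target statistical power.
# The assumed effect size, alpha, and power are illustrative choices.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
n_per_group = analysis.solve_power(effect_size=0.5, alpha=0.05, power=0.80,
                                    alternative="two-sided")
print(f"Subjects needed per group: {n_per_group:.0f}")  # roughly 64 per group
```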

Then you need to randomly assign your subjects to treatment groups . Each group receives a different level of the treatment (e.g. no phone use, low phone use, high phone use).

You should also include a control group , which receives no treatment. The control group tells us what would have happened to your test subjects without any experimental intervention.

When assigning your subjects to groups, there are two main choices you need to make:

  • A completely randomised design vs a randomised block design.
  • A between-subjects design vs a within-subjects design.

Randomisation

An experiment can be completely randomised or randomised within blocks (aka strata):

  • In a completely randomised design , every subject is assigned to a treatment group at random.
  • In a randomised block design (aka stratified random design), subjects are first grouped according to a characteristic they share, and then randomly assigned to treatments within those groups.
Completely randomised design Randomised block design
Phone use and sleep Subjects are all randomly assigned a level of phone use using a random number generator. Subjects are first grouped by age, and then phone use treatments are randomly assigned within these groups.
Temperature and soil respiration Warming treatments are assigned to soil plots at random by using a number generator to generate map coordinates within the study area. Soils are first grouped by average rainfall, and then treatment plots are randomly assigned within these groups.
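A minimal sketch of the two approaches in the examples above, applied to the phone use study: participants are either assigned completely at random, or first grouped into age bands and then randomised within each band. The participant records, age bands, and treatment labels are illustrative assumptions.

```python
# Completely randomised vs randomised block (stratified) assignment.
import random

random.seed(7)
treatments = ["no phone use", "low phone use", "high phone use"]
participants = [{"id": i, "age_band": random.choice(["18-29", "30-49", "50+"])}
                for i in range(12)]

# Completely randomised design: shuffle everyone, deal treatments out in turn.
shuffled = random.sample(participants, k=len(participants))
completely_randomised = {p["id"]: treatments[i % 3] for i, p in enumerate(shuffled)}

# Randomised block design: group by age band first, then randomise within blocks.
blocked = {}
for band in ["18-29", "30-49", "50+"]:
    block = [p for p in participants if p["age_band"] == band]
    random.shuffle(block)
    for i, p in enumerate(block):
        blocked[p["id"]] = treatments[i % 3]

print(completely_randomised)
print(blocked)
```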

Sometimes randomisation isn’t practical or ethical, so researchers create partially random or even non-random designs. An experimental design where treatments aren’t randomly assigned is called a quasi-experimental design.

Between-subjects vs within-subjects

In a between-subjects design (also known as an independent measures design or classic ANOVA design), individuals receive only one of the possible levels of an experimental treatment.

In medical or social research, you might also use matched pairs within your between-subjects design to make sure that each treatment group contains the same variety of test subjects in the same proportions.

In a within-subjects design (also known as a repeated measures design), every individual receives each of the experimental treatments consecutively, and their responses to each treatment are measured.

Within-subjects or repeated measures can also refer to an experimental design where an effect emerges over time, and individual responses are measured over time in order to measure this effect as it emerges.

Counterbalancing (randomising or reversing the order of treatments among subjects) is often used in within-subjects designs to ensure that the order of treatment application doesn’t influence the results of the experiment.

Between-subjects (independent measures) design Within-subjects (repeated measures) design
Phone use and sleep Subjects are randomly assigned a level of phone use (none, low, or high) and follow that level of phone use throughout the experiment. Subjects are assigned consecutively to zero, low, and high levels of phone use throughout the experiment, and the order in which they follow these treatments is randomised.
Temperature and soil respiration Warming treatments are assigned to soil plots at random and the soils are kept at this temperature throughout the experiment. Every plot receives each warming treatment (1, 3, 5, 8, and 10°C above ambient temperatures) consecutively over the course of the experiment, and the order in which they receive these treatments is randomised.
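To make counterbalancing concrete, the sketch below generates every possible order of three treatments and rotates subjects through those orders, so that no single order dominates a within-subjects experiment. The treatment names and the number of subjects are illustrative assumptions.

```python
# A minimal sketch of counterbalancing treatment order in a within-subjects design.
from itertools import permutations

treatments = ["no phone use", "low phone use", "high phone use"]
orders = list(permutations(treatments))      # 6 possible orders for 3 treatments

subjects = [f"S{i:02d}" for i in range(1, 13)]
schedule = {subj: orders[i % len(orders)] for i, subj in enumerate(subjects)}

for subj, order in schedule.items():
    print(subj, "->", " then ".join(order))
```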

Finally, you need to decide how you’ll collect data on your dependent variable outcomes. You should aim for reliable and valid measurements that minimise bias or error.

Some variables, like temperature, can be objectively measured with scientific instruments. Others may need to be operationalised to turn them into measurable observations. For example, hours of sleep could be operationalised in different ways:

  • Ask participants to record what time they go to sleep and get up each day.
  • Ask participants to wear a sleep tracker.

How precisely you measure your dependent variable also affects the kinds of statistical analysis you can use on your data.

Experiments are always context-dependent, and a good experimental design will take into account all of the unique considerations of your study system to produce information that is both valid and relevant to your research question.

Experimental designs are a set of procedures that you plan in order to examine the relationship between variables that interest you.

To design a successful experiment, first identify:

  • A testable hypothesis
  • One or more independent variables that you will manipulate
  • One or more dependent variables that you will measure

When designing the experiment, first decide:

  • How your variable(s) will be manipulated
  • How you will control for any potential confounding or lurking variables
  • How many subjects you will include
  • How you will assign treatments to your subjects

The key difference between observational studies and experiments is that, done correctly, an observational study will never influence the responses or behaviours of participants. Experimental designs will have a treatment condition applied to at least a portion of participants.

A confounding variable, also called a confounder or confounding factor, is a third variable in a study examining a potential cause-and-effect relationship.

A confounding variable is related to both the supposed cause and the supposed effect of the study. It can be difficult to separate the true effect of the independent variable from the effect of the confounding variable.

In your research design, it’s important to identify potential confounding variables and plan how you will reduce their impact.

In a between-subjects design, every participant experiences only one condition, and researchers assess group differences between participants in various conditions.

In a within-subjects design, each participant experiences all conditions, and researchers test the same participants repeatedly for differences between conditions.

The word ‘between’ means that you’re comparing different conditions between groups, while the word ‘within’ means you’re comparing different conditions within the same group.


Enago Academy

Experimental Research Design — 6 mistakes you should never make!


Since their school days, students have performed scientific experiments whose results illustrate and support the laws and theorems of science. These experiments are laid on a strong foundation of experimental research designs.

An experimental research design helps researchers execute their research objectives with more clarity and transparency.

In this article, we will not only discuss the key aspects of experimental research designs but also the issues to avoid and problems to resolve while designing your research study.


What Is Experimental Research Design?

Experimental research design is a framework of protocols and procedures created to conduct experimental research with a scientific approach, using two sets of variables. The first set of variables acts as a constant and is used to measure the differences in the second set. Quantitative research is the best-known example of an experimental research method.

Experimental research helps a researcher gather the necessary data for making better research decisions and determining the facts of a research study.

When Can a Researcher Conduct Experimental Research?

A researcher can conduct experimental research in the following situations —

  • When time is an important factor in establishing a relationship between the cause and effect.
  • When there is an invariable or never-changing behavior between the cause and effect.
  • Finally, when the researcher wishes to understand the importance of the cause and effect.

Importance of Experimental Research Design

To publish significant results, choosing a quality research design forms the foundation on which the research study is built. Moreover, an effective research design helps establish quality decision-making procedures, structures the research to lead to easier data analysis, and addresses the main research question. Therefore, it is essential to devote undivided attention and time to creating an experimental research design before beginning the practical experiment.

By creating a research design, a researcher is also giving oneself time to organize the research, set up relevant boundaries for the study, and increase the reliability of the results. Through all these efforts, one could also avoid inconclusive results. If any part of the research design is flawed, it will reflect on the quality of the results derived.

Types of Experimental Research Designs

Based on the methods used to collect data in experimental studies, the experimental research designs are of three primary types:

1. Pre-experimental Research Design

A pre-experimental research design is used when a group, or several groups, are kept under observation after factors thought to cause change have been applied. The pre-experimental design helps researchers understand whether further investigation of the observed groups is necessary.

Pre-experimental research is of three types —

  • One-shot Case Study Research Design
  • One-group Pretest-posttest Research Design
  • Static-group Comparison

2. True Experimental Research Design

A true experimental research design relies on statistical analysis to support or refute a researcher’s hypothesis. It is one of the most accurate forms of research because it provides specific scientific evidence. Furthermore, out of all the types of experimental designs, only a true experimental design can establish a cause-effect relationship within a group. However, in a true experiment, a researcher must satisfy these three factors —

  • There is a control group that is not subjected to changes and an experimental group that will experience the changed variables
  • A variable that can be manipulated by the researcher
  • Random distribution of the variables

This type of experimental research is commonly observed in the physical sciences.

3. Quasi-experimental Research Design

The word "quasi" means "resembling" or "similar to." A quasi-experimental design is similar to a true experimental design. However, the difference between the two is the assignment of the control group. In this research design, an independent variable is manipulated, but the participants of a group are not randomly assigned. This type of research design is used in field settings where random assignment is either irrelevant or not required.

The classification of the research subjects, conditions, or groups determines the type of research design to be used.


Advantages of Experimental Research

Experimental research allows you to test your idea in a controlled environment before taking the research to clinical trials. Moreover, it provides the best method to test your theory because of the following advantages:

  • Researchers have firm control over variables to obtain results.
  • Experimental research is not limited to a particular subject area; anyone can implement it for research purposes.
  • The results are specific.
  • After the results are analyzed, research findings from the same dataset can be repurposed for similar research ideas.
  • Researchers can identify the cause and effect of the hypothesis and further analyze this relationship to determine in-depth ideas.
  • Experimental research makes an ideal starting point. The collected data could be used as a foundation to build new research ideas for further studies.

6 Mistakes to Avoid While Designing Your Research

There is no order to this list, and any one of these issues can seriously compromise the quality of your research. You could refer to the list as a checklist of what to avoid while designing your research.

1. Invalid Theoretical Framework

Researchers often fail to check whether their hypothesis is logically testable. If your research design is not grounded in basic assumptions or postulates, it is fundamentally flawed and you need to rework your research framework.

2. Inadequate Literature Study

Without a comprehensive research literature review , it is difficult to identify and fill the knowledge and information gaps. Furthermore, you need to clearly state how your research will contribute to the research field, either by adding value to the pertinent literature or challenging previous findings and assumptions.

3. Insufficient or Incorrect Statistical Analysis

Statistical results are among the most trusted forms of scientific evidence. The ultimate goal of a research experiment is to gain valid and sustainable evidence. Therefore, incorrect statistical analysis could affect the quality of any quantitative research.

4. Undefined Research Problem

This is one of the most basic aspects of research design. The research problem statement must be clear, and to achieve that, you must set the framework for developing research questions that address the core problems.

5. Research Limitations

Every study has some limitations. You should anticipate and incorporate those limitations into your conclusion, as well as into the basic research design. Include a statement in your manuscript about any perceived limitations, and how you considered them while designing your experiment and drawing the conclusion.

6. Ethical Implications

The most important yet less talked about topic is the ethical issue. Your research design must include ways to minimize any risk for your participants and also address the research problem or question at hand. If you cannot manage the ethical norms along with your research study, your research objectives and validity could be questioned.

Experimental Research Design Example

In an experimental design, a researcher gathers plant samples and then randomly assigns half the samples to photosynthesize in sunlight and the other half to be kept in a dark box without sunlight, while controlling all the other variables (nutrients, water, soil, etc.).

By comparing their outcomes in biochemical tests, the researcher can confirm that the changes in the plants were due to the sunlight and not the other variables.

Experimental research is often the final form of a study in the research process and is considered to provide conclusive, specific results. However, it is not suited to every research question: it involves a lot of resources, time, and money, and it is not easy to conduct unless a foundation of prior research has been built. Even so, it is widely used in research institutes and commercial industries because it yields some of the most conclusive results within the scientific approach.

Have you worked on research designs? How was your experience creating an experimental design? What difficulties did you face? Do write to us or comment below and share your insights on experimental research designs!

Frequently Asked Questions

Randomization is important in experimental research because it helps ensure unbiased results. It also allows the cause-and-effect relationship to be measured in the particular group of interest.

Experimental research design lays the foundation of a study and structures the research to establish a quality decision-making process.

There are 3 types of experimental research designs. These are pre-experimental research design, true experimental research design, and quasi experimental research design.

The differences between an experimental and a quasi-experimental design are: 1. In quasi-experimental research, assignment to groups is non-random, unlike in a true experimental design, where assignment is random. 2. Experimental research always has a control group; a control group may not always be present in quasi-experimental research.

Experimental research establishes a cause-effect relationship by testing a theory or hypothesis using experimental groups or control variables. In contrast, descriptive research describes a study or a topic by defining the variables under it and answering the questions related to the same.



Experimental Design: Types, Examples & Methods

By Saul Mcleod, PhD (Editor-in-Chief for Simply Psychology) and Olivia Guy-Evans, MSc (Associate Editor for Simply Psychology)

Experimental design refers to how participants are allocated to different groups in an experiment. Types of design include repeated measures, independent groups, and matched pairs designs.

Probably the most common way to design an experiment in psychology is to divide the participants into two groups, the experimental group and the control group, and then introduce a change to the experimental group, not the control group.

The researcher must decide how he/she will allocate their sample to the different experimental groups.  For example, if there are 10 participants, will all 10 participants participate in both groups (e.g., repeated measures), or will the participants be split in half and take part in only one group each?

Three types of experimental designs are commonly used:

1. Independent Measures

Independent measures design, also known as between-groups, is an experimental design where different participants are used in each condition of the independent variable. This means that each condition of the experiment includes a different group of participants.

This should be done by random allocation, ensuring that each participant has an equal chance of being assigned to one group.

Independent measures involve using two separate groups of participants, one in each condition. For example:


  • Con: More people are needed than with the repeated measures design (i.e., more time-consuming).
  • Pro: Avoids order effects (such as practice or fatigue) as people participate in one condition only. If a person is involved in several conditions, they may become bored, tired, and fed up by the time they come to the second condition, or become wise to the requirements of the experiment!
  • Con: Differences between participants in the groups may affect results, for example, variations in age, gender, or social background. These differences are known as participant variables (i.e., a type of extraneous variable).
  • Control: After the participants have been recruited, they should be randomly assigned to their groups. This should ensure the groups are similar, on average (reducing participant variables).

2. Repeated Measures Design

Repeated Measures design is an experimental design where the same participants participate in each independent variable condition.  This means that each experiment condition includes the same group of participants.

Repeated Measures design is also known as within-groups or within-subjects design.

  • Pro: As the same participants are used in each condition, participant variables (i.e., individual differences) are reduced.
  • Con: There may be order effects. Order effects refer to the order of the conditions affecting the participants’ behavior. Performance in the second condition may be better because the participants know what to do (i.e., practice effect). Or their performance might be worse in the second condition because they are tired (i.e., fatigue effect). This limitation can be controlled using counterbalancing.
  • Pro: Fewer people are needed as they participate in all conditions (i.e., saves time).
  • Control: To combat order effects, the researcher counter-balances the order of the conditions for the participants. Alternating the order in which participants perform in different conditions of an experiment.

Counterbalancing

Suppose we used a repeated measures design in which all of the participants first learned words in “loud noise” and then learned them in “no noise.”

We expect the participants to learn better in “no noise” because of order effects, such as practice. However, a researcher can control for order effects using counterbalancing.

The sample would be split into two groups: experimental (A) and control (B).  For example, group 1 does ‘A’ then ‘B,’ and group 2 does ‘B’ then ‘A.’ This is to eliminate order effects.

Although order effects occur for each participant, they balance each other out in the results because they occur equally in both groups.
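A minimal sketch of this two-condition counterbalance: the sample is shuffled and split in half, with one half taking the conditions in one order and the other half in the reverse order. The participant labels and group size are illustrative assumptions.

```python
# Counterbalancing two conditions by reversing the order for half the sample.
import random

random.seed(2)
participants = [f"P{i:02d}" for i in range(1, 21)]
random.shuffle(participants)

half = len(participants) // 2
group_1 = participants[:half]   # order: loud noise -> no noise
group_2 = participants[half:]   # order: no noise -> loud noise

print("Group 1 (loud noise first):", group_1)
print("Group 2 (no noise first):", group_2)
```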


3. Matched Pairs Design

A matched pairs design is an experimental design where pairs of participants are matched in terms of key variables, such as age or socioeconomic status. One member of each pair is then placed into the experimental group and the other member into the control group.

One member of each matched pair must be randomly assigned to the experimental group and the other to the control group.


  • Con: If one participant drops out, you lose two participants’ data.
  • Pro: Reduces participant variables because the researcher has tried to pair up the participants so that each condition has people with similar abilities and characteristics.
  • Con: Very time-consuming trying to find closely matched pairs.
  • Pro: It avoids order effects, so counterbalancing is not necessary.
  • Con: Impossible to match people exactly unless they are identical twins!
  • Control: Members of each pair should be randomly assigned to conditions. However, this does not solve all these problems.

Experimental design refers to how participants are allocated to an experiment’s different conditions (or IV levels). There are three types:

1. Independent measures / between-groups : Different participants are used in each condition of the independent variable.

2. Repeated measures /within groups : The same participants take part in each condition of the independent variable.

3. Matched pairs : Each condition uses different participants, but they are matched in terms of important characteristics, e.g., gender, age, intelligence, etc.

Learning Check

Read about each of the experiments below. For each experiment, identify (1) which experimental design was used; and (2) why the researcher might have used that design.

1. To compare the effectiveness of two different types of therapy for depression, depressed patients were assigned to receive either cognitive therapy or behavior therapy for a 12-week period.

The researchers attempted to ensure that the patients in the two groups had similar severity of depressed symptoms by administering a standardized test of depression to each participant, then pairing them according to the severity of their symptoms.

2. To assess the difference in reading comprehension between 7 and 9-year-olds, a researcher recruited each group from a local primary school. They were given the same passage of text to read and then asked a series of questions to assess their understanding.

3. To assess the effectiveness of two different ways of teaching reading, a group of 5-year-olds was recruited from a primary school. Their level of reading ability was assessed, and then they were taught using scheme one for 20 weeks.

At the end of this period, their reading was reassessed, and a reading improvement score was calculated. They were then taught using scheme two for a further 20 weeks, and another reading improvement score for this period was calculated. The reading improvement scores for each child were then compared.

4. To assess the effect of organization on recall, a researcher randomly assigned student volunteers to two conditions.

Condition one attempted to recall a list of words that were organized into meaningful categories; condition two attempted to recall the same words, randomly grouped on the page.

Experiment Terminology

Ecological validity

The degree to which an investigation represents real-life experiences.

Experimenter effects

These are the ways that the experimenter can accidentally influence the participant through their appearance or behavior.

Demand characteristics

The clues in an experiment that lead the participants to think they know what the researcher is looking for (e.g., the experimenter’s body language).

Independent variable (IV)

The variable the experimenter manipulates (i.e., changes). It is assumed to have a direct effect on the dependent variable.

Dependent variable (DV)

The variable the experimenter measures. This is the outcome (i.e., the result) of a study.

Extraneous variables (EV)

All variables which are not independent variables but could affect the results (DV) of the experiment. Extraneous variables should be controlled where possible.

Confounding variables

Variable(s) that have affected the results (DV), apart from the IV. A confounding variable could be an extraneous variable that has not been controlled.

Random Allocation

Randomly allocating participants to independent variable conditions means that all participants should have an equal chance of taking part in each condition.

The principle of random allocation is to avoid bias in how the experiment is carried out and limit the effects of participant variables.

Order effects

Changes in participants’ performance due to their repeating the same or similar test more than once. Examples of order effects include:

(i) practice effect: an improvement in performance on a task due to repetition, for example, because of familiarity with the task;

(ii) fatigue effect: a decrease in performance of a task due to repetition, for example, because of boredom or tiredness.



Statistics By Jim

Making statistics intuitive

Experimental Design: Definition and Types

By Jim Frost

What is Experimental Design?

An experimental design is a detailed plan for collecting and using data to identify causal relationships. Through careful planning, the design of experiments allows your data collection efforts to have a reasonable chance of detecting effects and testing hypotheses that answer your research questions.

An experiment is a data collection procedure that occurs in controlled conditions to identify and understand causal relationships between variables. Researchers can use many potential designs. The ultimate choice depends on their research question, resources, goals, and constraints. In some fields of study, researchers refer to experimental design as the design of experiments (DOE). Both terms are synonymous.


Ultimately, the design of experiments helps ensure that your procedures and data will evaluate your research question effectively. Without an experimental design, you might waste your efforts in a process that, for many potential reasons, can’t answer your research question. In short, it helps you trust your results.

Learn more about Independent and Dependent Variables.

Design of Experiments: Goals & Settings

Experiments occur in many settings, ranging from psychology, the social sciences, and medicine to physics, engineering, and the industrial and service sectors. Typically, experimental goals are to discover a previously unknown effect, confirm a known effect, or test a hypothesis.

Effects represent causal relationships between variables. For example, in a medical experiment, does the new medicine cause an improvement in health outcomes? If so, the medicine has a causal effect on the outcome.

An experimental design’s focus depends on the subject area and can include the following goals:

  • Understanding the relationships between variables.
  • Identifying the variables that have the largest impact on the outcomes.
  • Finding the input variable settings that produce an optimal result.

For example, psychologists have conducted experiments to understand how conformity affects decision-making. Sociologists have performed experiments to determine whether ethnicity affects the public reaction to staged bike thefts. These experiments map out the causal relationships between variables, and their primary goal is to understand the role of various factors.

Conversely, in a manufacturing environment, the researchers might use an experimental design to find the factors that most effectively improve their product’s strength, identify the optimal manufacturing settings, and do all that while accounting for various constraints. In short, a manufacturer’s goal is often to use experiments to improve their products cost-effectively.

In a medical experiment, the goal might be to quantify the medicine’s effect and find the optimum dosage.

Developing an Experimental Design

Developing an experimental design involves planning that maximizes the potential to collect data that is both trustworthy and able to detect causal relationships. Specifically, these studies aim to see effects when they exist in the population the researchers are studying, preferentially favor causal effects, isolate each factor’s true effect from potential confounders, and produce conclusions that you can generalize to the real world.

To accomplish these goals, experimental designs carefully manage data validity and reliability, and internal and external experimental validity. When your experiment is valid and reliable, you can expect your procedures and data to produce trustworthy results.

An excellent experimental design involves the following:

  • Lots of preplanning.
  • Developing experimental treatments.
  • Determining how to assign subjects to treatment groups.

The remainder of this article focuses on how experimental designs incorporate these essential items to accomplish their research goals.

Learn more about Data Reliability vs. Validity and Internal and External Experimental Validity.

Preplanning, Defining, and Operationalizing for Design of Experiments

A literature review is crucial for the design of experiments.

This phase of the design of experiments helps you identify critical variables, know how to measure them while ensuring reliability and validity, and understand the relationships between them. The review can also help you find ways to reduce sources of variability, which increases your ability to detect treatment effects. Notably, the literature review allows you to learn how similar studies designed their experiments and the challenges they faced.

Operationalizing a study involves taking your research question, using the background information you gathered, and formulating an actionable plan.

This process should produce a specific and testable hypothesis using data that you can reasonably collect given the resources available to the experiment.

  • Null hypothesis : The jumping exercise intervention does not affect bone density.
  • Alternative hypothesis : The jumping exercise intervention affects bone density.

To learn more about this early phase, read Five Steps for Conducting Scientific Studies with Statistical Analyses.

Formulating Treatments in Experimental Designs

In an experimental design, treatments are variables that the researchers control. They are the primary independent variables of interest. Researchers administer the treatment to the subjects or items in the experiment and want to know whether it causes changes in the outcome.

As the name implies, a treatment can be medical in nature, such as a new medicine or vaccine. But it’s a general term that applies to other things such as training programs, manufacturing settings, teaching methods, and types of fertilizers. I helped run an experiment where the treatment was a jumping exercise intervention that we hoped would increase bone density. All these treatment examples are things that potentially influence a measurable outcome.

Even when you know your treatment generally, you must carefully consider the amount. How large of a dose? If you’re comparing three different temperatures in a manufacturing process, how far apart are they? For my bone mineral density study, we had to determine how frequently the exercise sessions would occur and how long each lasted.

How you define the treatments in the design of experiments can affect your findings and the generalizability of your results.

Assigning Subjects to Experimental Groups

A crucial decision for all experimental designs is determining how researchers assign subjects to the experimental conditions—the treatment and control groups. The control group is often, but not always, the lack of a treatment. It serves as a basis for comparison by showing outcomes for subjects who don’t receive a treatment. Learn more about Control Groups .

How your experimental design assigns subjects to the groups affects how confident you can be that the findings represent true causal effects rather than mere correlation caused by confounders. Indeed, the assignment method influences how you control for confounding variables. This is the difference between correlation and causation .

Imagine a study finds that vitamin consumption correlates with better health outcomes. As a researcher, you want to be able to say that vitamin consumption causes the improvements. However, with the wrong experimental design, you might only be able to say there is an association. A confounder, and not the vitamins, might actually cause the health benefits.
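The following small simulation illustrates that point: an assumed, unmeasured trait (labelled "health consciousness" here purely for illustration) drives both vitamin use and health outcomes, so the two variables correlate even though vitamins have no causal effect in the simulated data.

```python
# A minimal simulation of a confounder creating correlation without causation.
import numpy as np

rng = np.random.default_rng(seed=3)
n = 5_000

health_consciousness = rng.normal(size=n)                   # unmeasured confounder
takes_vitamins = (health_consciousness + rng.normal(size=n)) > 0
health_outcome = 2.0 * health_consciousness + rng.normal(size=n)  # no vitamin effect

r = np.corrcoef(takes_vitamins.astype(float), health_outcome)[0, 1]
print(f"Correlation between vitamin use and health: {r:.2f}")  # clearly positive
```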

Let’s explore some of the ways to assign subjects in design of experiments.

Completely Randomized Designs

A completely randomized experimental design randomly assigns all subjects to the treatment and control groups. You simply take each participant and use a random process to determine their group assignment. You can flip coins, roll a die, or use a computer. Randomized experiments must be prospective studies because they need to be able to control group assignment.

Random assignment in the design of experiments helps ensure that the groups are roughly equivalent at the beginning of the study. This equivalence at the start increases your confidence that any differences you see at the end were caused by the treatments. The randomization tends to equalize confounders between the experimental groups and, thereby, cancels out their effects, leaving only the treatment effects.

For example, in a vitamin study, the researchers can randomly assign participants to either the control or vitamin group. Because the groups are approximately equal when the experiment starts, if the health outcomes are different at the end of the study, the researchers can be confident that the vitamins caused those improvements.

Statisticians consider randomized experimental designs to be the best for identifying causal relationships.

If you can’t randomly assign subjects but want to draw causal conclusions about an intervention, consider using a quasi-experimental design.

Learn more about Randomized Controlled Trials and Random Assignment in Experiments.

Randomized Block Designs

Nuisance factors are variables that can affect the outcome, but they are not the researcher’s primary interest. Unfortunately, they can hide or distort the treatment results. When experimenters know about specific nuisance factors, they can use a randomized block design to minimize their impact.

This experimental design takes subjects with a shared “nuisance” characteristic and groups them into blocks. The participants in each block are then randomly assigned to the experimental groups. This process allows the experiment to control for known nuisance factors.

Blocking in the design of experiments reduces the impact of nuisance factors on experimental error. The analysis assesses the effects of the treatment within each block, which removes the variability between blocks. The result is that blocked experimental designs can reduce the impact of nuisance variables, increasing the ability to detect treatment effects accurately.

Suppose you’re testing various teaching methods. Because grade level likely affects educational outcomes, you might use grade level as a blocking factor. To use a randomized block design for this scenario, divide the participants by grade level and then randomly assign the members of each grade level to the experimental groups.

A standard guideline for an experimental design is to “Block what you can, randomize what you cannot.” Use blocking for a few primary nuisance factors. Then use random assignment to distribute the unblocked nuisance factors equally between the experimental conditions.
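Here is a minimal sketch of that guideline for the teaching-method scenario above: students are blocked by grade level and then randomly assigned to teaching methods within each block. The data frame, grade values, and method names are illustrative assumptions.

```python
# A minimal sketch of a randomized block design: block by grade level,
# then randomly assign teaching methods within each block.
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=11)
students = pd.DataFrame({
    "student_id": range(24),
    "grade": rng.choice([3, 4, 5], size=24),
})

methods = ["method_A", "method_B"]

assigned_blocks = []
for grade, block in students.groupby("grade"):
    # Shuffle the block, then alternate treatments so each block is balanced.
    block = block.sample(frac=1, random_state=int(grade)).copy()
    block["teaching_method"] = [methods[i % 2] for i in range(len(block))]
    assigned_blocks.append(block)

assigned = pd.concat(assigned_blocks).sort_values("student_id")
print(assigned.to_string(index=False))
```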

You can also use covariates to control nuisance factors. Learn about Covariates: Definition and Uses.

Observational Studies

In some experimental designs, randomly assigning subjects to the experimental conditions is impossible or unethical. The researchers simply can’t assign participants to the experimental groups. However, they can observe them in their natural groupings, measure the essential variables, and look for correlations. These observational studies are also known as quasi-experimental designs. Retrospective studies must be observational in nature because they look back at past events.

Imagine you’re studying the effects of depression on an activity. Clearly, you can’t randomly assign participants to the depression and control groups. But you can observe participants with and without depression and see how their task performance differs.

Observational studies let you perform research when you can’t control the treatment. However, quasi-experimental designs increase the problem of confounding variables. For this design of experiments, correlation does not necessarily imply causation. While special procedures can help control confounders in an observational study, you’re ultimately less confident that the results represent causal findings.

Learn more about Observational Studies .

For a good comparison, learn about the differences and tradeoffs between Observational Studies and Randomized Experiments .

Between-Subjects vs. Within-Subjects Experimental Designs

When you think of the design of experiments, you probably picture a treatment and control group. Researchers assign participants to only one of these groups, so each group contains entirely different subjects than the other groups. Analysts compare the groups at the end of the experiment. Statisticians refer to this method as a between-subjects, or independent measures, experimental design.

In a between-subjects design , you can have more than one treatment group, but each subject is exposed to only one condition, the control group or one of the treatment groups.

A potential downside to this approach is that differences between groups at the beginning can affect the results at the end. As you’ve read earlier, random assignment can reduce those differences, but it is imperfect. There will always be some variability between the groups.

In a  within-subjects experimental design , also known as repeated measures, subjects experience all treatment conditions and are measured for each. Each subject acts as their own control, which reduces variability and increases the statistical power to detect effects.

In this experimental design, you minimize pre-existing differences between the experimental conditions because they all contain the same subjects. However, the order of treatments can affect the results. Beware of practice and fatigue effects. Learn more about Repeated Measures Designs .

Between-subjects design | Within-subjects design
Assigned to one experimental condition | Participates in all experimental conditions
Requires more subjects | Requires fewer subjects
Differences between subjects in the groups can affect the results | Uses the same subjects in all conditions
No order-of-treatment effects | Order of treatments can affect results

Design of Experiments Examples

For example, a bone density study has three experimental groups—a control group, a stretching exercise group, and a jumping exercise group.

In a between-subjects experimental design, scientists randomly assign each participant to one of the three groups.

In a within-subjects design, all subjects experience the three conditions sequentially while the researchers measure bone density repeatedly. The researchers can vary the order of treatments across participants (counterbalancing) to help reduce order effects.
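A minimal sketch of such counterbalancing, using the three hypothetical condition labels below; it cycles through all possible treatment orders so that order effects are spread across participants rather than confounded with one sequence.

```python
from itertools import permutations, cycle

conditions = ["control", "stretching", "jumping"]   # hypothetical condition names
orders = cycle(permutations(conditions))             # all 6 possible sequences

participants = [f"P{i}" for i in range(1, 7)]
schedule = {p: next(orders) for p in participants}

for person, order in schedule.items():
    print(person, "->", " then ".join(order))
```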

Matched Pairs Experimental Design

A matched pairs experimental design is a between-subjects study that uses pairs of similar subjects. Researchers use this approach to reduce pre-existing differences between experimental groups. It’s yet another design of experiments method for reducing sources of variability.

Researchers identify variables likely to affect the outcome, such as demographics. When they pick a subject with a set of characteristics, they try to locate another participant with similar attributes to create a matched pair. Scientists randomly assign one member of a pair to the treatment group and the other to the control group.
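A toy sketch of that pairing-and-randomizing step (not from the original article), assuming a single numeric matching variable such as age; in practice researchers usually match on several characteristics at once.

```python
import random

def matched_pairs(subjects, seed=None):
    """Sort subjects on the matching variable, pair off neighbors, and randomly
    assign one member of each pair to treatment and the other to control."""
    rng = random.Random(seed)
    ordered = sorted(subjects, key=lambda s: s["age"])
    treatment, control = [], []
    for a, b in zip(ordered[0::2], ordered[1::2]):
        pair = [a, b]
        rng.shuffle(pair)
        treatment.append(pair[0]["id"])
        control.append(pair[1]["id"])
    return treatment, control

# Hypothetical subjects matched on age
people = [{"id": "A", "age": 23}, {"id": "B", "age": 24},
          {"id": "C", "age": 41}, {"id": "D", "age": 40}]
print(matched_pairs(people, seed=7))
```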

On the plus side, this process creates two similar groups, and it does not create treatment order effects. While matched pairs do not produce the perfectly matched groups of a within-subjects design (which uses the same subjects in all conditions), the approach aims to reduce variability between groups relative to a standard between-subjects study.

On the downside, finding matched pairs is very time-consuming. Additionally, if one member of a matched pair drops out, the other subject must leave the study too.

Learn more about Matched Pairs Design: Uses & Examples .

Another consideration is whether you’ll use a cross-sectional design (one point in time) or a longitudinal study to track changes over time.

A case study is a research method that often serves as a precursor to a more rigorous experimental design by identifying research questions, variables, and hypotheses to test. Learn more about What is a Case Study? Definition & Examples .

In conclusion, the design of experiments is extremely sensitive to subject area concerns and the time and resources available to the researchers. Developing a suitable experimental design requires balancing a multitude of considerations. A successful design is necessary to obtain trustworthy answers to your research question and to have a reasonable chance of detecting treatment effects when they exist.


15 Experimental Design Examples


Experimental design involves manipulating an independent variable and measuring its effect on a dependent variable. It is a central feature of the scientific method.

A simple example of an experimental design is a clinical trial, where research participants are placed into control and treatment groups in order to determine the degree to which an intervention in the treatment group is effective.

There are three categories of experimental design . They are:

  • Pre-Experimental Design: Testing the effects of the independent variable on a single participant or a small group of participants (e.g. a case study).
  • Quasi-Experimental Design: Testing the effects of the independent variable on a group of participants who aren’t randomly assigned to treatment and control groups (e.g. purposive sampling).
  • True Experimental Design: Testing the effects of the independent variable on a group of participants who are randomly assigned to treatment and control groups in order to infer causality (e.g. clinical trials).

A good research student can look at a design’s methodology and correctly categorize it. Below are some typical examples of experimental designs, with their type indicated.

Experimental Design Examples

The following are examples of experimental design (with their type indicated).

1. Action Research in the Classroom

Type: Pre-Experimental Design

A teacher wants to know if a small group activity will help students learn how to conduct a survey. So, they test the activity out on a few of their classes and make careful observations regarding the outcome.

The teacher might observe that the students respond well to the activity and seem to be learning the material quickly.

However, because there was no comparison group of students that learned how to do a survey with a different methodology, the teacher cannot be certain that the activity is actually the best method for teaching that subject.

2. Study on the Impact of an Advertisement

An advertising firm has assigned two of their best staff to develop a quirky ad about eating a brand’s new breakfast product.

The team puts together an unusual skit that involves characters enjoying the breakfast while engaged in silly gestures and zany background music. The ad agency doesn’t want to spend a great deal of money on the ad just yet, so the commercial is shot with a low budget. The firm then shows the ad to a small group of people just to see their reactions.

Afterwards they determine that the ad had a strong impact on viewers so they move forward with a much larger budget.

3. Case Study

A medical doctor has a hunch that an old treatment regimen might be effective in treating a rare illness.

The treatment has never been used in this manner before. So, the doctor applies the treatment to two of their patients with the illness. After several weeks, the results seem to indicate that the treatment is not causing any change in the illness. The doctor concludes that there is no need to continue the treatment or conduct a larger study with a control condition.

4. Fertilizer and Plant Growth Study

A farmer is exploring the effects of different nutrient combinations on plant growth, so she does a small experiment.

Instead of spending a lot of time and money applying the different mixes to acres of land and waiting several months to see the results, she decides to apply the fertilizer to some small plants in the lab.

After several weeks, it appears that the plants are responding well. They are growing rapidly and producing dense branching. She shows the plants to her colleagues and they all agree that further testing is needed under better controlled conditions .

5. Mood States Study

A team of psychologists is interested in studying how mood affects altruistic behavior. They are undecided, however, on how to put the research participants in a bad mood, so they try out a few pilot studies.

They try one suggestion and make a 3-minute video that shows sad scenes from famous heart-wrenching movies.

They then recruit a few people to watch the clips and measure their mood states afterwards.

The results indicate that people were put in a negative mood, but since there was no control group, the researchers cannot be 100% confident in the clip’s effectiveness.

6. Math Games and Learning Study

Type: Quasi-Experimental Design

Two teachers have developed a set of math games that they think will make learning math more enjoyable for their students. They decide to test out the games on their classes.

So, for two weeks, one teacher has all of her students play the math games. The other teacher uses the standard teaching techniques. At the end of the two weeks, all students take the same math test. The results indicate that students that played the math games did better on the test.

Although the teachers would like to say the games were the cause of the improved performance, they cannot be 100% sure because the study lacked random assignment . There are many other differences between the groups that played the games and those that did not.

Learn More: Random Assignment Examples

7. Economic Impact of Policy

An economic policy institute has decided to test the effectiveness of a new policy on the development of small business. The institute identifies two cities in a third-world country for testing.

The two cities are similar in terms of size, economic output, and other characteristics. The city in which the new policy was implemented showed a much higher growth of small businesses than the other city.

Although the two cities were similar in many ways, the researchers must be cautious in their conclusions. There may be other differences between the two cities, besides the policy, that affected small business growth.

8. Parenting Styles and Academic Performance

Psychologists want to understand how parenting style affects children’s academic performance.

So, they identify a large group of parents who have one of four parenting styles: authoritarian, authoritative, permissive, or neglectful. The researchers then compare the grades of each group and discover that children raised with the authoritative parenting style had better grades than the other three groups. Although these results may seem convincing, it turns out that parents who use the authoritative parenting style also tend to have higher socioeconomic status and can afford to provide their children with more intellectually enriching activities, like summer STEAM camps.

9. Movies and Donations Study

Will the type of movie a person watches affect the likelihood that they donate to a charitable cause? To answer this question, a researcher decides to solicit donations at the exit point of a large theatre.

He chooses to study two types of movies: action-hero and murder mystery. After collecting donations for one month, he tallies the results. Patrons that watched the action-hero movie donated more than those that watched the murder mystery. Can you think of why these results could be due to something other than the movie?

10. Gender and Mindfulness Apps Study

Researchers decide to conduct a study on whether men or women benefit more from mindfulness. So, they recruit office workers at all levels of management in large corporations.

Then, they divide the research sample up into males and females and ask the participants to use a mindfulness app once each day for at least 15 minutes.

At the end of three weeks, the researchers give all the participants a questionnaire that measures stress and also take swabs from their saliva to measure stress hormones.

The results indicate that women responded much better to the apps than men, showing lower stress levels on both measures.

Unfortunately, it is difficult to conclude that women respond to apps better than men because the researchers could not randomly assign participants to gender. This means that there may be extraneous variables that are causing the results.

11. Eyewitness Testimony Study

Type: True Experimental Design

To study how leading questions affect the memories of eyewitnesses and produce retroactive interference, Loftus and Palmer (1974) conducted a simple experiment consistent with true experimental design.

Research participants all watched the same short video of two cars having an accident. Each was then randomly assigned to be asked one of two versions of a question about the accident.

Half of the participants were asked the question “How fast were the two cars going when they smashed into each other?” and the other half were asked “How fast were the two cars going when they contacted each other?”

Participants’ estimates were affected by the wording of the question. Participants that responded to the question with the word “smashed” gave much higher estimates than participants that responded to the word “contacted.”

12. Sports Nutrition Bars Study

A company wants to test the effects of their sports nutrition bars. So, they recruited students on a college campus to participate in their study. The students were randomly assigned to either the treatment condition or control condition.

Participants in the treatment condition ate two nutrition bars. Participants in the control condition ate two similar looking bars that tasted nearly identical, but offered no nutritional value.

One hour after consuming the bars, participants ran on a treadmill at a moderate pace for 15 minutes. The researchers recorded their speed, breathing rates, and level of exhaustion.

The results indicated that participants that ate the nutrition bars ran faster, breathed more easily, and reported feeling less exhausted than participants that ate the non-nutritious bar.

13. Clinical Trials

Medical researchers often use true experiments to assess the effectiveness of various treatment regimens. For a simplified example: people from the population are randomly selected to participate in a study on the effects of a medication on heart disease.

Participants are randomly assigned to either receive the medication or nothing at all. Three months later, all participants are contacted and they are given a full battery of heart disease tests.

The results indicate that participants that received the medication had significantly lower levels of heart disease than participants that received no medication.

14. Leadership Training Study

A large corporation wants to improve the leadership skills of its mid-level managers. The HR department has developed two programs, one online and the other in-person in small classes.

HR randomly selects 120 employees to participate and then randomly assigns them to one of three conditions: one-third are assigned to the online program, one-third to the in-class version, and one-third are put on a waiting list.

The training lasts for 6 weeks and 4 months later, supervisors of the participants are asked to rate their staff in terms of leadership potential. The supervisors were not informed about which of their staff participated in the program.

The results indicated that the in-person participants received the highest ratings from their supervisors. The online class participants came in second, followed by those on the waiting list.

15. Reading Comprehension and Lighting Study

Different wavelengths of light may affect cognitive processing. To put this hypothesis to the test, a researcher randomly assigned students on a college campus to read a history chapter in one of three lighting conditions: natural sunlight, artificial yellow light, and standard fluorescent light.

At the end of the chapter all students took the same exam. The researcher then compared the scores on the exam for students in each condition. The results revealed that natural sunlight produced the best test scores, followed by yellow light and fluorescent light.

Therefore, the researcher concludes that natural sunlight improves reading comprehension.
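If the exam scores from the three lighting groups were compared statistically, a one-way ANOVA would be a natural choice. The sketch below uses made-up scores and SciPy; it illustrates the comparison only and is not the original study's analysis.

```python
from scipy.stats import f_oneway

# Hypothetical exam scores for each lighting condition (made-up numbers)
sunlight    = [88, 92, 85, 90, 87]
yellow      = [80, 84, 79, 83, 81]
fluorescent = [75, 78, 74, 77, 76]

f_stat, p_value = f_oneway(sunlight, yellow, fluorescent)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
# A small p-value would be consistent with lighting affecting scores,
# though it would not by itself prove a causal effect on comprehension.
```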

See Also: Experimental Study vs Observational Study

Experimental design is a central feature of scientific research. When done using true experimental design, causality can be inferred, which allows researchers to provide evidence that an independent variable affects a dependent variable. This is necessary in just about every field of research, and especially in the medical sciences.



8.1 Experimental design: What is it and when should it be used?

Learning Objectives

  • Define experiment
  • Identify the core features of true experimental designs
  • Describe the difference between an experimental group and a control group
  • Identify and describe the various types of true experimental designs

Experiments are an excellent data collection strategy for social workers wishing to observe the effects of a clinical intervention or social welfare program. Understanding what experiments are and how they are conducted is useful for all social scientists, whether they actually plan to use this methodology or simply aim to understand findings from experimental studies. An experiment is a method of data collection designed to test hypotheses under controlled conditions. In social scientific research, the term experiment has a precise meaning and should not be used to describe all research methodologies.


Experiments have a long and important history in social science. Behaviorists such as John Watson, B. F. Skinner, Ivan Pavlov, and Albert Bandura used experimental design to demonstrate the various types of conditioning. Using strictly controlled environments, behaviorists were able to isolate a single stimulus as the cause of measurable differences in behavior or physiological responses. The foundations of social learning theory and behavior modification are found in experimental research projects. Moreover, behaviorist experiments brought psychology and social science away from the abstract world of Freudian analysis and towards empirical inquiry, grounded in real-world observations and objectively-defined variables. Experiments are used at all levels of social work inquiry, including agency-based experiments that test therapeutic interventions and policy experiments that test new programs.

Several kinds of experimental designs exist. In general, designs considered to be true experiments contain three basic key features:

  • random assignment of participants into experimental and control groups
  • a “treatment” (or intervention) provided to the experimental group
  • measurement of the effects of the treatment in a post-test administered to both groups

Some true experiments are more complex.  Their designs can also include a pre-test and can have more than two groups, but these are the minimum requirements for a design to be a true experiment.

Experimental and control groups

In a true experiment, the effect of an intervention is tested by comparing two groups: one that is exposed to the intervention (the experimental group , also known as the treatment group) and another that does not receive the intervention (the control group ). Importantly, participants in a true experiment need to be randomly assigned to either the control or experimental groups. Random assignment uses a random number generator or some other random process to assign people into experimental and control groups. Random assignment is important in experimental research because it helps to ensure that the experimental group and control group are comparable and that any differences between the experimental and control groups are due to random chance. We will address more of the logic behind random assignment in the next section.

Treatment or intervention

In an experiment, the independent variable is typically receipt of the intervention being tested—for example, a therapeutic technique, prevention program, or access to some service or support. It is less common in social work research, but social science research may also use a stimulus, rather than an intervention, as the independent variable. For example, an electric shock or a reading about death might be used as a stimulus to provoke a response.

In some cases, it may be immoral to withhold treatment completely from a control group within an experiment. If you recruited two groups of people with severe addiction and only provided treatment to one group, the other group would likely suffer. For these cases, researchers use a control group that receives “treatment as usual.” Experimenters must clearly define what treatment as usual means. For example, a standard treatment in substance abuse recovery is attending Alcoholics Anonymous or Narcotics Anonymous meetings. A substance abuse researcher conducting an experiment may use twelve-step programs in their control group and use their experimental intervention in the experimental group. The results would show whether the experimental intervention worked better than normal treatment, which is useful information.

The dependent variable is usually the intended effect the researcher wants the intervention to have. If the researcher is testing a new therapy for individuals with binge eating disorder, their dependent variable may be the number of binge eating episodes a participant reports. The researcher likely expects her intervention to decrease the number of binge eating episodes reported by participants. Thus, she must, at a minimum, measure the number of episodes that occur after the intervention, which is the post-test .  In a classic experimental design, participants are also given a pretest to measure the dependent variable before the experimental treatment begins.

Types of experimental design

Let’s put these concepts in chronological order so we can better understand how an experiment runs from start to finish. Once you’ve collected your sample, you’ll need to randomly assign your participants to the experimental group and control group. In a common type of experimental design, you will then give both groups your pretest, which measures your dependent variable, to see what your participants are like before you start your intervention. Next, you will provide your intervention, or independent variable, to your experimental group, but not to your control group. Many interventions last a few weeks or months to complete, particularly therapeutic treatments. Finally, you will administer your post-test to both groups to observe any changes in your dependent variable. What we’ve just described is known as the classical experimental design and is the simplest type of true experimental design. All of the designs we review in this section are variations on this approach. Figure 8.1 visually represents these steps.

Figure 8.1: Steps in the classic experimental design: sampling, random assignment, pretest, intervention, posttest.
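To make that sequence concrete, here is a small, purely hypothetical simulation of the classic design in Python: random assignment, a pretest, an intervention applied only to the experimental group, a posttest, and a comparison of average change scores. All numbers and the effect size are invented for illustration.

```python
import random
import statistics

def run_classic_experiment(n=40, effect=2.0, seed=0):
    """Sketch of the classic design: random assignment, pretest, intervention
    for the experimental group only, posttest, then compare average change."""
    rng = random.Random(seed)
    people = list(range(n))
    rng.shuffle(people)
    experimental, control = people[: n // 2], people[n // 2:]

    pretest = {p: rng.gauss(50, 5) for p in people}
    posttest = {}
    for p in people:
        boost = effect if p in experimental else 0.0   # hypothetical intervention effect
        posttest[p] = pretest[p] + boost + rng.gauss(0, 1)

    change = lambda group: statistics.mean(posttest[p] - pretest[p] for p in group)
    return change(experimental), change(control)

print(run_classic_experiment())
```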

An interesting example of experimental research can be found in Shannon K. McCoy and Brenda Major’s (2003) study of people’s perceptions of prejudice. In one portion of this multifaceted study, all participants were given a pretest to assess their levels of depression. No significant differences in depression were found between the experimental and control groups during the pretest. Participants in the experimental group were then asked to read an article suggesting that prejudice against their own racial group is severe and pervasive, while participants in the control group were asked to read an article suggesting that prejudice against a racial group other than their own is severe and pervasive. Clearly, these were not meant to be interventions or treatments to help depression, but were stimuli designed to elicit changes in people’s depression levels. Upon measuring depression scores during the post-test period, the researchers discovered that those who had received the experimental stimulus (the article citing prejudice against their same racial group) reported greater depression than those in the control group. This is just one of many examples of social scientific experimental research.

In addition to classic experimental design, there are two other ways of designing experiments that are considered to fall within the purview of “true” experiments (Babbie, 2010; Campbell & Stanley, 1963).  The posttest-only control group design is almost the same as classic experimental design, except it does not use a pretest. Researchers who use posttest-only designs want to eliminate testing effects , in which participants’ scores on a measure change because they have already been exposed to it. If you took multiple SAT or ACT practice exams before you took the real one you sent to colleges, you’ve taken advantage of testing effects to get a better score. Considering the previous example on racism and depression, participants who are given a pretest about depression before being exposed to the stimulus would likely assume that the intervention is designed to address depression. That knowledge could cause them to answer differently on the post-test than they otherwise would. In theory, as long as the control and experimental groups have been determined randomly and are therefore comparable, no pretest is needed. However, most researchers prefer to use pretests in case randomization did not result in equivalent groups and to help assess change over time within both the experimental and control groups.

Researchers wishing to account for testing effects but also gather pretest data can use a Solomon four-group design. In the Solomon four-group design , the researcher uses four groups. Two groups are treated as they would be in a classic experiment—pretest, experimental group intervention, and post-test. The other two groups do not receive the pretest, though one receives the intervention. All groups are given the post-test. Table 8.1 illustrates the features of each of the four groups in the Solomon four-group design. By having one set of experimental and control groups that complete the pretest (Groups 1 and 2) and another set that does not complete the pretest (Groups 3 and 4), researchers using the Solomon four-group design can account for testing effects in their analysis.

Table 8.1 Solomon four-group design

Group | Pretest | Intervention | Posttest
Group 1 | X | X | X
Group 2 | X |  | X
Group 3 |  | X | X
Group 4 |  |  | X

Solomon four-group designs are challenging to implement in the real world because they are time- and resource-intensive. Researchers must recruit enough participants to create four groups and implement interventions in two of them.

Overall, true experimental designs are sometimes difficult to implement in a real-world practice environment. It may be impossible to withhold treatment from a control group or randomly assign participants in a study. In these cases, pre-experimental and quasi-experimental designs–which we  will discuss in the next section–can be used.  However, the differences in rigor from true experimental designs leave their conclusions more open to critique.

Experimental design in macro-level research

You can imagine that social work researchers may be limited in their ability to use random assignment when examining the effects of governmental policy on individuals.  For example, it is unlikely that a researcher could randomly assign some states to implement decriminalization of recreational marijuana and some states not to in order to assess the effects of the policy change.  There are, however, important examples of policy experiments that use random assignment, including the Oregon Medicaid experiment. In the Oregon Medicaid experiment, the wait list for Medicaid in Oregon was so long that state officials conducted a lottery to see who from the wait list would receive coverage (Baicker et al., 2013).  Researchers used the lottery as a natural experiment that included random assignment. People selected to receive Medicaid were the experimental group, and those remaining on the wait list were the control group. There are some practical complications with macro-level experiments, just as with other experiments.  For example, the ethical concern with using people on a wait list as a control group exists in macro-level research just as it does in micro-level research.

Key Takeaways

  • True experimental designs require random assignment.
  • Control groups do not receive an intervention, and experimental groups receive an intervention.
  • The basic components of a true experiment include a pretest, posttest, control group, and experimental group.
  • Testing effects may cause researchers to use variations on the classic experimental design.
Glossary

  • Classic experimental design- uses random assignment, an experimental and control group, as well as pre- and posttesting
  • Control group- the group in an experiment that does not receive the intervention
  • Experiment- a method of data collection designed to test hypotheses under controlled conditions
  • Experimental group- the group in an experiment that receives the intervention
  • Posttest- a measurement taken after the intervention
  • Posttest-only control group design- a type of experimental design that uses random assignment, and an experimental and control group, but does not use a pretest
  • Pretest- a measurement taken prior to the intervention
  • Random assignment-using a random process to assign people into experimental and control groups
  • Solomon four-group design- uses random assignment, two experimental and two control groups, pretests for half of the groups, and posttests for all
  • Testing effects- when a participant’s scores on a measure change because they have already been exposed to it
  • True experiments- a group of experimental designs that contain independent and dependent variables, pretesting and post testing, and experimental and control groups


Foundations of Social Work Research Copyright © 2020 by Rebecca L. Mauldin is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License , except where otherwise noted.



Introduction to Experimental Design


We will cover the fundamentals of designing experiments (i.e., picking interventions) for the purpose of learning a structural causal model. We will begin by reviewing what graphical information can be learned from interventions. Then, we will discuss basic aspects of different settings for experimental design, including the distinction between passive and active settings, possible constraints on the interventions, and the difference between noisy and noiseless settings. After establishing basic nomenclature, we will spend the bulk of our time on a survey of strategies for passive and active experimental design in the noiseless setting, emphasizing general techniques for obtaining theoretical guarantees. We will conclude with a discussion of “targeted” experimental design, in which case the learning objective may be more specific than completely learning a structural causal model, and review the potential complexity benefits.


5 Exciting Ways to Teach Experimental Design Without Lecturing


For many students, science is synonymous with experiments. Performing experiments can be the most exciting part of learning science for students of all levels. 

Experimental design, though, is a completely different story. It’s a lot more challenging to create experiments than to perform them. Yet experimental design is an essential skill for students planning to take STEM courses.

At first, experimental design can be a daunting topic for students to learn. Thankfully, there are creative ways to introduce it in the classroom. Here are five such techniques.

1. Use Interactive Demonstrations of Experimental Design

Experiments are meant to be interactive, and experimental design should be as well. With that in mind, it’s a good idea to teach experimental design through interactive demonstrations, which show students how to create their own experimental procedures in a much more hands-on way.

Students can design their own experiments in the lab, where they can see what happens when they carry out their proposed procedures. If they are not satisfied, they can make changes right away.

Computerized lab simulations also work well. In a simulation, students can manipulate tools, equipment, and reagents from a safe virtual environment. There is no risk of dangerous spills, contamination, or equipment damage if students make mistakes. With that, they will be more encouraged to explore different experimental designs.

2. Make the Topic Fun with Games and Activities

Contrary to popular belief, games can make the learning process much more effective for students. Games allow students to learn while having fun. Most of the time, they are not even aware that they’re learning as they play, yet that learning often shows up in later assessments.

Games can include simulations like the Experimental Design virtual lab from Labster. This simulation will teach students the principles of the scientific method as well as how to design effective experiments that satisfy the objectives of their study. The virtual lab will also let them identify experimental variables and test which ones they need to manipulate.

Experimental design in action, planning and analysis virtual lab.

Discover Labster's Experimental Design virtual lab!

3. Integrate Technology in Teaching This Topic

Using various technologies to teach this topic is one of the best ways to make students more interested. Videos, interactive presentations, apps, and games can make experimental design a lot more engaging to learn.

Videos, for example, can help students master the concepts behind good experimental design. Students can watch videos at their own pace and repeat them as many times as they need to. This way, they can master the topic more easily.

Interactive presentations can help students achieve the same thing. Students can go through these presentations and repeat certain parts as needed. Similarly, this tool can facilitate better mastery of the topic.

Apps can provide an extra layer of learning outside the classroom. As long as students have their smartphones in hand, they can play around with learning apps. With these, students can study this topic anywhere they may be.

4. Introduce Possible Careers for Inspiration

Students are more engaged in studying topics they know they can use in the future. You can achieve this by introducing careers that use experimental design on a regular basis. These include:

  • Research scientist
  • Pharmaceutical chemist
  • Chemical engineer
  • Biotechnology researcher
  • Cosmetic scientist

This is not an exhaustive list. Most careers in STEM fields use experimental design often at work. So if your students want to explore STEM careers, you can encourage them to master experimental design. 

5. Connect It to Real-World Applications

Tell students how they can use experimental design in real life. This is an excellent way to increase their engagement in the topic. 

Experimental design makes use of principles of the scientific method. These can not only be applied in the lab, but also in other aspects of life.

At home, for example, they can design experiments to solve real-life problems. For instance, a student who has an ant infestation at home can experiment with different mixtures to ward off ants. Through these home experiments, the student can find the best solution to remove ants from their house.

Final Thoughts

Experimental design may be a complicated topic, but teaching it does not have to be that hard. Methods like games, interactive demonstrations, simulations, and career introductions can make students more interested and engaged in learning. 

Experimental model with a mouse and different objects in a virtual lab.

In particular, a simulation like Labster’s Experimental Design virtual lab is effective in helping students master this topic. This virtual lab puts students in charge of a pharmaceutical scientist designing experiments to test the effects of an unknown drug. Students can enhance their knowledge of the scientific method, as well as gain valuable skills in creating experiments.

Try our free 30-day All Access Educator's Pass today and play the Experimental Design simulation alongside 300+ other virtual labs!



Exploring Experimental Design: Using hands-on activities to learn about experimental design

Science and Children—January 1999


Chemistry Education Research and Practice

Enabling general chemistry students to take part in experimental design activities.


a School of Integrative Biological and Chemical Sciences, University of Texas Rio Grande Valley, 1201 W. University Drive, Edinburg, TX, USA E-mail: [email protected]

In this study, we analyzed how general chemistry students generated experimental designs corresponding to general chemistry questions such as the ones typically found in general chemistry textbooks. We found that students were very successful in including experimental design aspects that were explicitly mentioned in the general chemistry questions, but less successful in including other experimental design aspects. We also analyzed the outcomes of students engaging in the counterpart process – expressing general chemistry laboratory experiments as typical general chemistry questions. We found that students were very successful in considering the various components associated with expressing the experiments when considering each component one at a time, but less successful when considering the various components at the same time. Considerations and suggestions for implementing these types of activities to enable a wide variety of general chemistry students to take part in experimental design are discussed. Implications for research and teaching, including a consideration of ChatGPT, are also presented.


J. Scoggin and K. C. Smith, Chem. Educ. Res. Pract., 2023, 24, 1229. DOI: 10.1039/D3RP00088E


Critical Thinking in Science

Author: Daniell DiFrancesca
Level: Middle School
Content Area: General Science

Part 1: Introduction to Experimental Design

  • Part 2: The Story of Pi
  • Part 3: Experimenting with pH
  • Part 4: Water Quality
  • Part 5: Change Over Time
  • Part 6: Cells
  • Part 7: Microbiology and Infectious Disease
  • About the Author


Introduction:

Students will learn and apply experimental design vocabulary while practicing their critical thinking skills in an inquiry-based experiment. This lesson is written using the 5E Learning Model.

Learning Outcomes:

  • Students will define and apply the experimental design vocabulary.
  • Students will use the experimental design graphic organizer to plan an investigation.
  • Students will design and complete their own scientific experiment.

Curriculum Alignment:


1.01 Identify and create questions and hypotheses that can be answered through scientific investigations.

1.02 Develop appropriate experimental procedures for:

  • Given questions.
  • Student generated questions.

1.04 Analyze variables in scientific investigations:

  • Identify dependent and independent.
  • Use of a control.
  • Manipulate.
  • Describe relationships between.
  • Define operationally.

1.05 Analyze evidence to:

  • Explain observations.
  • Make inferences and predictions.
  • Develop the relationship between evidence and explanation.

1.06 Use mathematics to gather, organize, and present quantitative data resulting from scientific investigations:

  • Measurement.
  • Analysis of data.
  • Prediction models.

1.08 Use oral and written language to:

  • Communicate findings.
  • Defend conclusions of scientific investigations.
  • Describe strengths and weaknesses of claims, arguments, and/or data

Classroom Time Required:

Approximately 6 class periods (~50 minutes each) are needed; however, some activities can be assigned as homework to reduce the time spent in class.

Materials Needed:

  • Overhead transparency of Experimental Design Graphic Organizer
  • Student copies of: Experimental Design Graphic Organizer, Vocabulary Graphic Organizer, Explore worksheet, Explain worksheet
  • Supplies for experiment: Dixie drinking cups, Pepsi and Coke (~1.5 to 2 ounces per student), and possibly ice to keep the soda cold
  • Copies of worksheet, Overhead, Small drinking cups (2 per student), Pepsi and Coke
  • Dictionaries or definition sheets for the vocabulary words

Technology Resources:

  • Overhead Projector

Pre-Activities/ Activities:

  • What are the “rules” for designing an experiment?
  • Teacher and class will discuss the following questions:
  • Is there a specific way to design an experiment? (Try to get them to mention the scientific method and discuss any “holes” in it.)
  • Are there rules scientists follow when designing an experiment?
  • Are all experiments designed the same?
  • What kinds of experiments have you done on your own? (Good things to discuss are cooking, testing sports techniques, trying to fix things, etc. Try to relate experimentation to their everyday life.)
  • Review an experiment and answer questions.
  • Students will read a description of an experiment and answer questions about the design of the experiment without using the vocabulary. (See Worksheet 1)
  • Vocabulary introduction and application
  • Students will define the experimental design vocabulary using the graphic organizer (See Worksheet 2).
  • Independent variable, Dependent variable, Control, Constant, Hypothesis, Qualitative observation, Quantitative observation, Inference (Definitions available)
  • Students will review the worksheet from the explore section and match the vocabulary to the pieces of the experiment. Review answers with the class.
  • Students will read a second experiment description and identify the pieces of the experiment using their vocabulary definitions (See Worksheet 3).
  • Introduce Experimental Design Graphic Organizer (EDGO) and complete class designed experiment.
  • The teacher should review the EXAMPLE PEPSI VS COKE EDGO for any ideas or questions
  • Use overhead projector to review the blank EDGO and complete as a class (See Worksheet 4)
  • Tell the class that you are going to do the Pepsi Coke Challenge. The question they need to answer is: Can girls taste the difference between Pepsi and Coke better than the boys?
  • As a class, plan the Pepsi versus Coke experiment. This is a good time to discuss double-blind studies and why it is important to make this a double-blind study. Students can look at the results within their own class as well as the whole team.
  • This is also a good chance to test multiple variables. You do not need to let students know this, but if the data chart also records things like age, frequency of drinking soda (daily, weekly, monthly, rarely), ability to roll the tongue, or anything else they think might be interesting, the results can be analyzed for each variable.
  • I removed labels from the bottles and labeled them A and B. I used a different labeling system for under the cups so the students did not see a pattern (numbered cups were Pepsi, lettered cups were Coke).
  • I recorded the data and organized the tasting while students completed other work in their seats. Two students at a time tasted the soda and I recorded data. You could also have a volunteer who is not participating help with this.
  • Check for students who do not want to drink soda as well as any dietary needs such as diet soda.
  • Do not verify guesses until all of the classes have completed the experiment.
  • How can you accurately remember the pieces of an experiment?
  • Write a poem about four of the vocabulary words.
  • Write a song about four of the vocabulary words.
  • Create a memorization tool for four of the vocabulary words.
  • Make a poster about four of the vocabulary words.

Teachers should evaluate these choices to ensure they show an understanding of the vocabulary.

Assessment:

See Evaluate piece of Activities Section.

Modifications:

  • A different experiment can be designed in the Elaborate section.
  • EDGO can be edited for any motor skill deficiencies by making it larger, or making it available to be typed on.
  • All basic modifications can be used for these activities.

Alternative Assessments:

  • Make necessary adjustments for different experiments.

Critical Vocabulary

  • Independent Variable- the part of the experiment that is controlled or changed by the experimenter
  • Dependent Variable- the part of the experiment that is observed or measured to gather data; changes because of the independent variable
  • Control- standard of comparison in the experiment; level of the independent variable that is left in the natural state, unchanged
  • Constant- part of the experiment that is kept the same to avoid affecting the independent variable
  • Hypothesis- educated guess or prediction about the experimental results
  • Qualitative observation- word observations such as color or texture
  • Quantitative observation- number observations or measurements
  • Inference- attempt to explain the observations

This is the first lesson in the Critical Thinking in Science Unit. The other lessons continue using the vocabulary and Experimental Design Graphic Organizer while teaching the 8th grade content. Students are designing their own experiments to improve their ability to approach problems and questions scientifically. By developing their ability to reason through problems they are becoming critical thinkers.



Designing Activities to Teach Higher-Order Skills: How Feedback and Constraint Affect Learning of Experimental Design

  • Denise Pope
  • Joel K. Abraham
  • Kerry J Kim
  • Susan Maruca
  • Jennifer Palacio

*Address correspondence to: Eli Meir (E-mail: [email protected]).

SimBiotic Software, Missoula, MT 59801


Graduate School, University of Massachusetts, Amherst, MA 01003

Biological Science, California State University, Fullerton, Fullerton, CA 92831

Division of Continuing Education, Harvard University, Cambridge, MA 02138

Active learning approaches to biology teaching, including simulation-based activities, are known to enhance student learning, especially of higher-order skills; nonetheless, there are still many open questions about what features of an activity promote optimal learning. Here we designed three versions of a simulation-based tutorial called Understanding Experimental Design that asks students to design experiments and collect data to test their hypotheses. The three versions vary the experimental design task along the axes of feedback and constraint, where constraint measures how much choice students have in performing a task. Using a variety of assessments, we ask whether each of those features affects student learning of experimental design. We find that feedback has a direct positive effect on learning. We further find that small changes in constraint have only subtle and mostly indirect effects on learning. This work suggests that designers of tools for teaching higher-order skills should strive to include feedback to increase impact and may feel freer to vary the degree of constraint within a range to optimize for other features such as the ability to provide immediate feedback and time-on-task.

INTRODUCTION

With the current emphasis on teaching complex, higher-order skills (American Association for the Advancement of Science, 2011; NGSS Lead States, 2013), and a large body of research that students learn such skills better through active-learning approaches ( Freeman et al. , 2014 ), it is still an open question what types of active learning are best suited to maximize learning ( Behar-Horenstein and Niu, 2011 ; Freeman et al ., 2014 ). A wide range of classroom activities classified as active learning have been shown effective, but they have many different features ( Table 1 ). The literature contains categorizations of active learning approaches, such as by the degree of scaffolding (e.g., Buck et al. , 2008 ), or along a scale of constructivism (e.g., Arthurs and Kreager, 2017). As designers of educational tools, we consider characteristics that might make an activity effective in classrooms (e.g., McConnell et al. , 2017 ), while not requiring too much instructor effort to be practical for larger classes ( Momsen et al. , 2010 ). For instance, computer simulations are known to be effective ( Rutten et al. , 2012 ), but it is often unclear exactly which aspect(s) of a simulation-based learning environment makes it effective and studies often lack data on how specific features such as feedback impact effectiveness ( Chernikova et al. , 2020 ). For this study, we abstracted three features that have been hypothesized as important.

Table 1. A selection of activities used in large-enrollment biology classes to introduce more active learning, and some broad characteristics typical of each. See definitions in the text for Constraint and Feedback as used here.

Activity | Constraint | Feedback | Instructor effort (per student)
Polling questions | High | Immediate | Low
Think-pair-share | High | Immediate | Low
Case studies | Intermediate | Immediate to None | High
Written or oral presentations | Low | Delayed | High
Concept-mapping | Intermediate to Low | Delayed to None | High
Intelligent tutoring systems | High | Immediate | Low
Computer-based model construction/simulation exploration | Low | Delayed to None | Low to High
Highly structured simulations | High to Intermediate | Immediate to None | Low
Traditional “hands-on” labs | High to Intermediate | Delayed to None | Low to High

Some inquiry activities afford students little freedom of choice, which we term here a “constraint” on the students' own exploration (after Scalise and Gifford, 2006 ). As an example, computer-based questioning systems designed to help students solidify knowledge on a topic (e.g., Urry et al. , 2017 ) are often highly constrained, consisting of multiple-choice or similar format questions, with limited answer options. By contrast, examples of low-constraint activities include building models in a simulation environment (e.g., Klopfer et al. , 2009 ; Bodine et al. , 2020 ), researching and making written or oral presentations, or other activities where there are many paths available for students to take (even if they have highly scaffolded instructions guiding them). The degree and type of constraint, on their own, can affect learning ( Meir et al. , 2019 ; Puntambekar et al. , 2020).

Another characteristic that differs among active-learning activities is the availability of feedback. Feedback can have a major influence on student learning ( Hattie and Timperley, 2007 ; Shute, 2008 ), but there are mixed results on when and where feedback is most effective ( Kingston and Nash, 2011 ; McMillan et al ., 2013 ; Van der Kleij et al. , 2015 ; Wisniewski et al. , 2020 ). To help tease apart how feedback influences learning, different authors have proposed categorizing feedback along multiple axes. Proposed categories include immediate versus delayed feedback; the level at which the feedback is aimed (e.g., task vs. process vs. self-regulation); and whether the feedback simply provides the correct answer, explains the rationale for that answer, or provides guidance for what the student should try next ( Hattie and Timperley, 2007 ; Shute, 2008 ; Brooks et al. , 2019 ). Germane to this study, there are indications that the type and timing of feedback can interact with the type of task the student is completing. The optimal timing of feedback (immediate vs. delayed) is still under debate and may be related to whether tasks are aimed at lower- or higher-order thinking ( Van der Kleij et al. , 2015 ). Elaborated feedback, where an explanation is provided, has sometimes but not always been shown to be more effective than simply providing the correct answer ( Van der Kleij et al. , 2015 ), and its effect may depend on whether task items are highly constrained like multiple choice or lower constraint (LC) constructed-response items ( Wang et al. , 2019 ). Much of this prior work postulates that the features of feedback that make it effective, especially for higher-order tasks, are those that help students reflect on their understanding in ways that improve their future performance (e.g., Maier et al. , 2016 ; Brooks et al. , 2019 ). Most importantly for this work, the bulk of previous research has examined feedback in the context of either very constrained tasks such as multiple choice questions ( Van der Kleij et al. , 2015 ; Zhu et al. , 2020 ) or, less commonly, lightly constrained tasks such as constructed responses (e.g., Wang et al. , 2019 ; Zhu et al. , 2020 ), but rarely in the context of tasks with intermediate constraint, such as the type of simulation-based teaching tool we explore here.

Finally, while it does not directly impact student learning, the effort involved in preparing and providing feedback or scoring for each student has a large influence on whether instructors adopt a particular type of activity, particularly for large-enrollment classes, which are typical of many introductory-level science courses ( Momsen et al. , 2010 ; McConnell et al. , 2017 ). In Table 1 , we summarize these three features for a range of activities that are commonly used in science classes.

In considering how these three characteristics might theoretically influence the effectiveness of activities ( Nehm, 2019 ), we note that the presence, type, and timing of feedback are often dependent on the amount of constraint, as is the per-student instructor effort. That is, providing timely feedback is often only possible when an activity is highly constrained, or at least thoughtfully constrained at some intermediate level ( Meir, 2022 ). Activities where the interaction is highly constrained, such as through multiple choice questions, can easily provide immediate feedback to the learner, with a low level of per-student instructor effort. Low-constraint activities, such as open-ended simulation environments or written or oral presentations typically do not have immediate explicit feedback because there is often no feasible way to provide such feedback, other than in classroom settings where the feedback comes from the teacher responding to student discussion (e.g., Brooks et al. , 2019 ). Instead, they may have implicit feedback (e.g., the student-built model does or doesn’t run or behave as expected), or limited explicit feedback (e.g., the audience asks good questions and/or provides a few thoughts on the presentation), but the bulk of the specific feedback comes to the student with a delay of days or weeks, if ever, once the instructor has a chance to review and assess the work of all of the students (thus, much higher per-student instructor effort).

Many activities fall between these extremes of high constraint with immediate feedback or low constraint with delayed/no feedback. Case studies, for instance, are often guided more lightly than worksheets of practice problems but are more structured than an open-ended research project. Feedback for a case study might be given immediately if on a computer, for instance, if some or all of the questions are in formats that can be algorithmically scored ( Clarke-Midura et al. , 2018 ; Magalhaes et al. , 2020 ). Or, feedback might be given with a short delay on a worksheet-based case study when groups of students working in a discussion section might periodically have a conversation with the teaching assistant, with a longer delay if the worksheet is turned in for grading, or even never, if the students complete the worksheet and receive points for it without specific detailed feedback.

There is evidence of both learning enhancement and barriers to learning at different positions on the feedback and constraint axes. Timely feedback often leads to more effective and efficient learning but can also be used by students as a crutch, or to game the system by relying too much on feedback rather than thinking through the question themselves (reviewed in: Hattie and Timperley, 2007 ; Baker, 2011 ). Different degrees of constraint can similarly be beneficial or detrimental to learning by, for instance, challenging students too little, too much, or just enough given their current skill level ( Colburn, 2000 ; Sweller et al. , 2007; Meir et al. , 2019 ; Meir, 2022 ). Here we ask how these two axes of active learning, feedback and degree of constraint, may affect the learning of experimental design, a skill that is complex, difficult, and core to biology (and all sciences).

Experimental design is a difficult higher-order skill

One of the most fundamental skills for students in biology–and indeed all science classes–is designing a good experiment (American Association for the Advancement of Science, 2011; NGSS Lead States, 2013). Experimental process is at the heart of science, yet students often miss important aspects of both the design and implementation of experiments ( Dasgupta et al. , 2017 ; Woolley et al. , 2018 ; Pelaez et al. , 2022 ). Because of this, we chose experimental design as our focal skill for this study. Many aspects of experimental design are challenging to students across all levels of study (e.g., Kuhn et al. , 2009 ; Brownell et al. , 2014 ; summarized in: Schwichow et al. , 2016 ; Dasgupta et al. , 2017 ; Pelaez et al ., 2017 ). From this broad literature, we extracted a set of 17 learning outcomes, listed in the Supplemental Materials (Supplemental Table S1), that we used in a backward design process when writing both the learning tutorials and the assessment items in this study. We do not focus further on these learning outcomes here because, while our specific research questions are about experimental design, the purpose of this study is to illuminate how feedback and constraint may affect the learning of higher-order skills more generally.

This study centers on a simulation-based learning tutorial called Understanding Experimental Design (UED) written for students in undergraduate biology classes ( Pope et al. , 2016 ). In addition to targeting experimental design learning outcomes (Supplemental Materials), we also designed UED to explicitly test ideas about the role of feedback and constraint in enabling student learning. As authors of open-ended simulation-based learning tutorials that often target higher-order skills, we were frustrated that we were not able to provide immediate, specific feedback to students based on their open-ended explorations using our simulations. This is a common problem as it is hard to provide immediate feedback on LC, open-ended activities, particularly in larger classes. Because of this, much of the research on the effects of feedback is done with more constrained tasks such as multiple-choice questions or memorization of lists ( Van der Kleij et al. , 2015 ). We wondered whether a lack of direct feedback reduced student learning efficiency in complex, open-ended tasks. Our premise was that adding some constraints to an open-ended simulation might allow us to provide specific, immediate feedback to students, while still preserving the exploratory aspect of a simulation environment.

Research Questions

Does immediate feedback (enabled by constraint) improve student learning of experimental design?

Does the degree of constraint (higher or lower) impact student learning of experimental design?

Description of the UED tutorial

The version of UED used in this study was the third major revision of an experimental design tutorial, based on extensive student testing and prior research studies with earlier versions ( Abraham et al. , 2009 ). The evolution and justification of choices made in the tutorial and its predecessors are elaborated on elsewhere ( Clarke-Midura et al. , 2018 ; Meir, 2022 ). Here we provide a brief description of UED as it was presented to participants in this study.

Students are given the following scenario. The town of Idyllic has an endemic species called “Simploids” that are beloved by town residents but have recently been getting sick and dying. Students are tasked by the town with doing experiments to discover the cause of the sickness, with two potential causes suspected (parasites and herbicide). The tutorial is split into two sections, which differ in their objectives and level of constraint. This split was based on earlier work on this project (unpublished data) which showed that students’ experimental designs, and their ability to discuss and rationalize those designs with the language of science, were often poorly correlated. We thus aimed one section at teaching the terminology and concepts of good experimental design, and the other on designing, implementing, and interpreting simulated experiments – in other words, the first focused on developing students’ declarative and conceptual knowledge and the second on developing their procedural knowledge ( Ambrose et al. , 2010 , pp. 18–19).

Section 1 is a scaffolded lesson on experimental design that provides students with the building blocks for good experiments and the language used to describe them. Among the concepts covered are systematic variation, scope of inference, independent and dependent (or outcome) variables, treatment and control groups, potentially confounding variables, and replication. Students are assessed on their understanding as they progress through the section with 19 formative assessment items. Most questions in this section use the high-constraint multiple choice format; five use multiple-select, numerical, and other formats that are less constrained than multiple choice but more constrained than open response, termed “intermediate constraint” or “IC” ( Scalise and Gifford, 2006 ; Meir et al. , 2019 ).

In Section 2 students design and conduct their own experiments using a simulation of the Simploids. After being guided to choose a hypothesis and plan their experimental design, the heart of the section is an interactive control panel where students design and run their experiment ( Figure 1A ). They can choose to use up to eight study plots through an “Add Study Plot” button. They must decide how many Simploids and plants (the food of the Simploids) to place in each plot, and whether to include herbicide and/or parasites in each. Once they start running the simulation, they must decide how long to run their experiment ( Figure 1B ). Students can see the health status of the Simploids throughout the simulation (there are three states – healthy, sick, and dead), and after its completion, an interactive data table lets them view their results. They can adjust the design of their experiment, and complete as many runs as they choose. Each run can last up to 28 d, in 7-d increments. This section also has students answer questions in formats with intermediate degrees of constraint (for instance, constructing sentences from fill-in-the-blank choices [LabLibs: Meir, 2019]) and several short-answer questions on their experimental design plans and conclusions. All questions, except the short-answer format and the interactive data table, provide immediate feedback to students.

FIGURE 1. Experimental design activity in UED. (A) The panel students use to design their experiment. Students can use up to eight study plots. Each has sliders for selecting the number of Simploids and plants, and checkboxes for whether to include herbicide or parasites. A “Check My Setup” button near the bottom provides feedback on the current design. (B) Once ready, clicking “Conduct Test” at the top of the design interface switches to an interface allowing students to run the simulation and collect data. The simulation uses individual-based models. Buttons at the bottom let students choose how long to run and when to collect data.

Section 2 includes two different experiments, modeling for students that the experimental process is often iterative. After carrying out one or more experimental runs to test one hypothesis for the cause of the disease and drawing a conclusion (“Initial Experiment”), students are asked to consider what they still don’t know and design and carry out a second experiment to expand on their knowledge (“Follow-Up Experiment”). For example, they may choose in the Follow-Up Experiment to test the other putative cause of the disease, or test both potential causes simultaneously (they are strongly encouraged in the Initial Experiment to test only one variable). The underlying simulation is complex enough that almost all students have room to learn more about the system after the Initial Experiment.

In the Initial Experiment, students can choose to receive immediate, specific feedback on their experimental design before running their experiment, through a “Check My Setup” button (which they can use as often as they wish). In terms of the major feedback classification systems mentioned in the introduction, this feedback is mostly at the task and process level (how and why to use certain experimental design concepts), elaborated (with explanations), and a mix of what should be done, how to do it, and where to go next. For instance, if a student has no variation between their plots, they receive feedback that reads “To draw conclusions from an experiment you need to create systematic variation so that you can make comparisons between plots. It doesn’t look like there is any variation between your plots. In particular, you should vary the independent variable that you specified in your hypothesis.” The focus of the feedback is on process rather than on whether the student got the task right or wrong. The words “systematic variation” and “independent variable” link to definitions of those terms (setting expectations), there is an indication of whether the student performed as expected (“it doesn’t look like …”) and the student is given suggestions for where to go next (“you should vary …”). This feedback has the hallmarks of types of feedback that have been successful in other studies ( Brooks et al. , 2019 ).

Our algorithms provide feedback on four aspects of students’ experimental design: whether their design systematically varied variables across plots; whether they had appropriate controls for each variable; whether their experiment matched their hypothesis; and whether they had appropriate replication. We did not provide feedback within the experimental design area about two “natural history” related aspects of the experiment: whether they ran the experiment long enough to see the disease progression; and whether they included enough plants to feed all the Simploids in a plot. They should have been able to determine good values for those two parameters from natural history information about the Simploids and the disease progression provided earlier in the section, and from observing the progress of their experiments and examining their experimental results (a form of indirect feedback). The Follow-Up Experiment does not include the “Check My Setup” button, because it is intended as a near-transfer assessment where students can apply what they learn from the feedback (both direct and indirect) that they received on the Initial Experiment.
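To make these design checks concrete, the sketch below illustrates how checks of this kind could be implemented. It is not the tutorial's actual algorithm; the plot representation, helper names, and message wording are our own assumptions, and plot sizes and run durations are omitted for brevity.

```python
# Hypothetical sketch of "Check My Setup"-style design checks; NOT the UED source code.
# Plot structure, function names, and messages are illustrative assumptions.
from collections import Counter
from dataclasses import dataclass

@dataclass
class Plot:
    herbicide: bool
    parasites: bool

def check_setup(plots, hypothesis_var):
    """Return feedback strings for a proposed design.

    hypothesis_var is "herbicide" or "parasites", the variable named in the
    student's hypothesis.
    """
    feedback = []
    treatments = [(p.herbicide, p.parasites) for p in plots]

    # 1. Systematic variation: at least two distinct treatment combinations.
    if len(set(treatments)) < 2:
        feedback.append("Create systematic variation so you can compare plots.")

    # 2. Appropriate controls: for each varied factor there should be matched
    #    plots with and without it, holding the other factor at the same level.
    for var, other in (("herbicide", "parasites"), ("parasites", "herbicide")):
        with_var = {getattr(p, other) for p in plots if getattr(p, var)}
        without_var = {getattr(p, other) for p in plots if not getattr(p, var)}
        if with_var and not (with_var & without_var):
            feedback.append(f"Add a control plot that differs only in {var}.")

    # 3. Match to hypothesis: the hypothesized variable must actually be varied.
    if len({getattr(p, hypothesis_var) for p in plots}) < 2:
        feedback.append("Vary the independent variable named in your hypothesis.")

    # 4. Replication: each treatment combination should appear in more than one plot.
    if any(count < 2 for count in Counter(treatments).values()):
        feedback.append("Replicate each treatment in at least two plots.")

    return feedback
```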

See https://simbio.com/content/understanding-experimental-design for a video introduction to UED. The released version of the UED tutorial (with small modifications from the versions used in this study) is available for evaluation purposes from SimBiotic Software by writing to info@simbio.com and referencing this paper.

Three experimental versions of UED

To separately measure the effects of feedback and of constraint on student learning, our study compared three versions of UED. All versions included the same Section 1 of the tutorial, with different versions of Section 2. One version, which we call IC, With Feedback (ICWF), constrained students in the design activity by only allowing them to select presence or absence of parasites and herbicide in each study plot (using radio buttons), and only allowing addition of Simploids and plants by increments of 10 (using sliders; Figure 2A ). This ICWF version includes the “Check My Setup” button which students are free to click at any time while doing the Initial Experiment to receive feedback about their current design. This ICWF version is equivalent to the full version of the tutorial as described in the previous section.

FIGURE 2. Three versions of UED. (A) ICWF has sliders and checkboxes for determining contents of each study plot and includes a “Check My Setup” button. (B) ICNF is identical to ICWF but has no “Check My Setup” button. (C) LCNF has students add contents to each study plot with drag and drop for more flexibility and has no “Check My Setup” button.

A second version, which we called IC, No Feedback (ICNF), was identical to the ICWF version, except the “Check My Setup” button was hidden, so no feedback was available in the Initial Experiment ( Figure 2B ). We thus could compare students who completed the ICWF version of the tutorial with students who completed the ICNF version to test for the effect of feedback, without a change in constraint.

A third version, which we called LC, No Feedback (LCNF), also lacked the “Check My Setup” button. In addition, rather than radio buttons and sliders, students controlled all four parameters (Simploid population, plant population, parasite, and herbicide) by placing those items in the study plots with the computer’s mouse ( Figure 2C ). They could place multiples of an item with a click and drag (like drawing a rectangle in a drawing program). Students thus had finer control over the number of each item – rather than presence or absence, or multiples of 10, they could place, say, five units of herbicide, three units of parasite, 24 Simploids, and 32 plants, opening up other possible experiments such as testing for dose effects. Students using the LCNF version could create plots with the same parameter settings as were available in the IC versions, as well as many other combinations. Another difference is that the simulation ran in 1-d, rather than 7-d, increments (the maximum duration was still 28 d). Thus, the students in this treatment had more choices for their experimental design, but each experiment required a bit more effort. We note that this is a relatively small reduction in constraint, and many constraints remain (there are still only four variables available, only eight study plots, etc.). By comparing students who completed the ICNF version to those who completed the LCNF version, we were able to isolate the effect of a small change in constraint, without a change in feedback.

An ideal experimental design would also include a LC With Feedback condition. However, we were not able to create that version because our feedback algorithms require a more constrained number of combinations to provide accurate feedback.

The Follow-Up Experiment in each version – the second round of experiments where students are encouraged to test a second hypothesis – was similar to the Initial Experiment in most ways, including the level of constraint. However, the “Check My Setup” button was not present in the Follow-Up Experiment in any of the versions. We intended to use students’ designs in the Follow-Up Experiments as a performance-based comparison of experimental design ability by treatment.

Student testing of UED and prior versions.

We used extensive think-aloud interviews to check the clarity of all UED activities and questions and their fidelity to our intent. These were conducted as part of an iterative design-research process, starting with another SimBio module called “Darwinian Snails”. That module was first extended with a section on experimental design. Through several more iterations, that section was split into its own module, a tutorial on principles of good design was added, and the storyline was changed to discuss the fictional Simploids. We conducted over 80 student interviews throughout this process to gather the data that drove the iterations, and conducted another three specifically with the LCNF version of the tutorial. All interviews were with undergraduate students recruited from introductory biology classes at colleges and universities around Boston, MA, ranging from research universities to undergraduate-focused colleges to community colleges, both public and private.

Measures of experimental design

For the study presented here, we used four assessments of student learning, and three samples of students, to address our research questions ( Tables 2 and 3 ). This section provides a brief overview of the assessments; the next provides an overview of the study samples.

Assessments and populations used to address research questions. See text for descriptions of each assessment and population. In the table, S1 and S2 refer to Sections 1 and 2 of UED.

Comparison | Inference
Samples: Split-Class & Individual
  Pre to Post paired comparisons, within each population | Effect of UED S1 on declarative & conceptual knowledge of experimental design
Samples: Split-Class & Larger-Scale
  Post mean scores between populations | Generalizability of results
Samples: Split-Class & Individual
  ICWF to ICNF Experimental Design scores | Effect of feedback on experimental design skills in UED S2
  ICNF to LCNF Experimental Design scores | Effect of constraint on experimental design skills in UED S2
  ICWF to ICNF to LCNF Biology scores | Effects of feedback and constraint on experimental design skills in UED S2
Samples: Split-Class & Larger-Scale
  ICWF Experimental Design scores between populations | Generalizability of results
Samples: Individual
  Pre to Post paired comparisons, all treatments combined | Effect of completing UED on experimental design skills in transfer task
  ICWF to ICNF to LCNF Pre to Post change | Effect of feedback and constraint on experimental design skills in UED S2
Samples: Individual
  Pre to Post paired comparisons, all treatments combined | Effect of completing UED on declarative & conceptual knowledge of experimental design
Data sources by sample studied

Sample | Individual | Split-Class | Larger-Scale
Versions of UED tested | ICWF, ICNF, LCNF | ICWF, ICNF, LCNF | ICWF
MV-EDAT and interview (Total ED score; ED elements) | ICWF (n = 11); ICNF (n = 14); LCNF (n = 14) | — | —
EDCT Pre-Section 1 | X (n = 41) | X (n = 165) | —
EDCT Post-Section 1 | X (n = 41) | X (n = 165) | X (n = 1292)
UED Section 2 Experimental Designs (Experimental Score; Biology Score) | ICWF (n = 11); ICNF (n = 14); LCNF (n = 14) | ICWF (n = 52); ICNF (n = 64); LCNF (n = 44) | ICWF (n = 648)

a UED is the UED tutorial, with versions ICWF, ICNF, and LCNF. MV-EDAT is the Multiple Variable Experimental Design Ability Test and EDCT is the Experimental Design Concepts Test (see text for details of these tests). Sample sizes in parentheses – sizes listed for each treatment were those used for treatment comparisons.

Screening survey.

To stratify students in one of the study sets (see below) by prior understanding of experimental design concepts, we asked each interested participant to fill out an online prescreening survey. In addition to asking several demographic and logistical questions about their availability, the survey contained a nine-question conceptual assessment (see Supplemental Material B) which was used to split students into high, medium, and low performing sets. As this screening survey was used only for this purpose, and for only one set of study subjects, we spent minimal effort on validation and do not discuss those survey results further in this paper.

Experimental Design Concepts Test (EDCT).

To assess student understanding of the experimental design concepts and vocabulary that the first section of UED was designed to convey (i.e., declarative and conceptual knowledge), we wrote an assessment we call the Experimental Design Concepts Test (EDCT). While several other assessments of competence in experimental design exist (e.g., Sirum and Humburg 2011 ; Gobert et al. , 2013 ; Dasgupta et al. , 2017 ; Deane et al. , 2017 ), none published at the time had sufficient coverage of the learning outcomes addressed in UED while also being amenable to autoscoring. The EDCT consisted of 14 multiple choice questions; 11 had four answer choices and the last three had two answer choices. All questions were written to address one of the learning outcomes we were targeting (Supplemental Material Section A). The Supplemental Materials (Supplemental Material Section C2) present several lines of validity evidence that the EDCT was measuring student performance on the focal outcomes.

For the three experimental versions of UED, we placed the EDCT before and after Section 1 to measure learning gains from that highly constrained portion of the tutorial. As Section 1 of UED was identical for all treatments, we also used EDCT results to check for any preexisting differences in performance between treatment groups.

Multiple Variable Experimental Design Ability Test (MV-EDAT).

Distinguishing true learning of experimental design from learning confined to the specific context of the UED tutorial required a performance-based assessment independent of the tutorial. For this purpose, we looked for a pre/post assessment of experimental design procedural knowledge that was open-ended and could capture many of the skills that UED was designed to teach. We started with the EDAT ( Sirum and Humburg, 2011 ) and the Expanded EDAT ( Brownell et al. , 2014 ), which prompt students with a real-world scenario and ask them to design an experiment to address the challenge posed in the prompt (e.g., testing the validity of claims that a supplement has a specific impact on human performance). The prompts in those assessments were not well suited to this study, so we built our own derived prompts, which we call the MV-EDAT because they include more variables than the original EDAT. There are two versions of the MV-EDAT, one called “Lizard” and the other “Fish”, after the species used in each prompt (see Supplemental Material Section D for more on the prompts and the logic for creating them). Students answered the prompts by drawing and/or writing on the paper that included the prompt.

The MV-EDAT prompts were followed by semistructured interviews to more completely document student declarative and conceptual knowledge. The interview started by asking them to describe the experiment they had designed on paper, and then followed up with questions designed to probe their understanding of experimental design concepts (interview script available on request). To assess their declarative knowledge, some questions asked them to identify elements of their experiment (e.g., “Which is your control group?”); to probe their conceptual knowledge, other questions asked them to explain a concept (e.g., “How do you define control group?”). The interviews allowed us to disentangle the procedural knowledge that students draw on when designing an experiment (i.e., the Apply, Analyze, and Create levels of Bloom’s taxonomy) from the declarative and conceptual knowledge that students can cite when prompted (i.e., the Remember and Understand levels of Bloom’s).

Administration and scoring of the MV-EDAT.

Two of the authors [D.P. and J.P.] conducted all the student interviews involving the MV-EDAT. To the extent possible we blinded interviewers and those scoring the interviews to the treatment for each student interviewed. We scored the students’ MV-EDAT experimental designs using both the descriptions they wrote on the paper with the MV-EDAT prompt, and also their verbal descriptions at the start of each interview. For each element of their experimental design where their written and verbal responses differed, they received the higher of the two scores. The experiments described on paper and/or verbally were scored on the presence of six different elements: 1) Uses systematic variation, 2) Addresses hypothesis, 3) Includes replication, 4) Includes variables held constant, 5) Includes dependent variable, and 6) Includes experiment duration. Each of these elements was scored on a scale from 0–2, with 0 being absence (i.e., no systematic variation, no replication, no mention of duration, etc.), 1 being incomplete or partially correct expression of the element and 2 being full and correct expression of the element. We also calculated a total experimental design score by summing all six elements (for a total possible score of 12). Using a randomization test (see below), we found no significant difference between MV-EDAT prompts so we consider the two prompts to be equivalent.
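As a concrete illustration of this tallying step (the element names follow the rubric above; the data layout and function name are our own, hypothetical), a total score could be computed as follows:

```python
# Illustrative only: element names follow the rubric in the text; the dictionary
# layout and function name are assumptions, not the study's scoring code.
ELEMENTS = ["systematic_variation", "addresses_hypothesis", "replication",
            "variables_held_constant", "dependent_variable", "duration"]

def mv_edat_total(written_scores, verbal_scores):
    """Each argument maps element name -> score in {0, 1, 2}. For each element,
    take the higher of the written and verbal scores, then sum (maximum 12)."""
    return sum(max(written_scores.get(e, 0), verbal_scores.get(e, 0)) for e in ELEMENTS)
```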

We separately scored eight of the probing interview questions intended to further explore student declarative (four questions) and conceptual (four questions) knowledge of experimental design (we did not analyze all probing questions because the semistructured nature of the interview meant that not all questions were asked consistently of all students). We used a rubric to score responses to the probing questions on the degree of expert-like response, from 0 (no evidence of understanding) to 2 (more complete evidence of understanding), with 1 indicating partial evidence of understanding.

Two team members independently scored every student’s experimental design and response to probing questions using the rubrics described above, and then we discussed and came to consensus on each score. See Supplemental Materials (Supplemental Material D2) for more detail on administering and scoring the MV-EDAT, including measures taken to blind interviewers and scorers to treatments.

Analysis of experimental designs within UED.

We scored each experimental design that students ran within UED on four measures. An “Experimental Score” combines three factors that are important to a well-designed experiment. Students received one point if their design had systematic variation between experimental plots; one point if they had appropriate control plot(s); and one point if they had full replication of all plots or ½ point for replication of some but not all treatments (e.g., replicating the experimental but not control treatment). The Experimental Score for a student could thus run between 0–3. Crucially, in the ICWF treatment these were all items where the “Check My Setup” button provided specific feedback when their design scored less than one on any of these.

A “Biology Score,” which combines two factors that are specific to the biological example presented. Students received one point for having sufficient plants in each plot so the Simploids did not starve, and one point for running the simulation long enough that the disease could take its course. The Biology Score for a student could thus run between 0–2. Crucially, the “Check My Setup” button did not offer direct feedback on either of these items, but students had the information about the number of plants necessary, and by observing the course of their simulations, they had information available to infer when they had made these errors (i.e., without sufficient plants, all Simploids in the plot died immediately, and if they had the disease, Simploids first appeared sick before they died).

A “Match Hypothesis Score,” which tallies whether the experiment the student designed tests the hypothesis they had chosen earlier in Section 2. In the Initial Experiment, students had a choice of two hypotheses (herbicide or parasites as the causal factor for the disease). In the Follow-Up Experiment, they could also choose “a combination of herbicide and parasites.” The Match Hypothesis Score was 1 if they varied the variable(s) in their hypothesis and no others, 0.5 if they varied the variable(s) in their hypothesis and others, and 0 if they did not vary the variable(s) in their hypothesis. For the Follow-Up Experiment, we only scored the Match Hypothesis for the Split-Class students, because the hypothesis for the Follow-Up Experiment was an open-response item in the Individual and Larger-Scale studies.

An “Experimental Complexity Score” which measures whether the student attempted to manipulate zero, one, or two independent variables.

See the Supplemental Materials (Supplemental Material E) for more details on scoring of those items.
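For illustration, the sketch below shows one way the first three of these scores could be tallied from a recorded design. It is a minimal sketch under our own assumptions about how a design is represented, not the scoring code used in the study.

```python
# Hypothetical tally of the in-tutorial scores described above; the argument
# names and design representation are illustrative assumptions.
def experimental_score(has_systematic_variation, has_controls, replication):
    """replication is 'full', 'partial', or 'none'; score ranges 0-3."""
    replication_points = {"full": 1.0, "partial": 0.5, "none": 0.0}[replication]
    return int(has_systematic_variation) + int(has_controls) + replication_points

def biology_score(enough_plants, ran_long_enough):
    """Score ranges 0-2."""
    return int(enough_plants) + int(ran_long_enough)

def match_hypothesis_score(varied_vars, hypothesis_vars):
    """1 if only the hypothesized variable(s) were varied, 0.5 if they were varied
    along with others, 0 if they were not varied at all."""
    varied, hypothesized = set(varied_vars), set(hypothesis_vars)
    if not hypothesized <= varied:
        return 0.0
    return 1.0 if varied == hypothesized else 0.5
```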

We looked at the Experimental Score for the first design students ran in the Initial and in the Follow-Up Experiments, the last design they ran in each, and the best design (i.e., highest-scoring) they ran in each. The differences between first, last, and best were small and the patterns were the same regardless of which we chose, so here we report all results using the first design students ran in the Follow-Up Experiment, and in the few cases where results differed between experiments, the best design students ran in the Initial Experiment. We chose the best design in the Initial Experiment as we thought students might learn from earlier runs of the experiment (or the feedback they received, in the case of the ICWF treatment) before running a good design, while we chose the first design in the Follow-Up Experiment as we thought by that point in the tutorial, students were less likely to be engaging in exploratory learning and more likely to attempt to directly design the experiment they wanted to run.

Faculty produced good experimental designs.

To validate our interpretation of scoring the UED experimental designs, we asked five biology faculty, whom we assume have relatively expert knowledge of the experimental design process (especially relative to students), to go through UED, including the experimental design tasks. Four of the five faculty produced experimental designs we would have scored as perfect. The fifth produced a design we would have scored as perfect on “Experimental Score” but would have been deducted a point on “Biology Score,” as they included more Simploids than plants, causing the Simploids to die from starvation.

Given that faculty (our “experts”) generally produced designs that our algorithms scored as perfect, we performed statistics on the Experimental Score and the Biology Score by comparing students who received a perfect score versus students who didn’t, lumping together all nonperfect scores. This way we were judging the proportion of “expert-like” experimental designs in each sample.

Study samples

This study includes three distinct samples of students. Each sample facilitated the collection of some types of data but not others, so drawing robust conclusions required combining the insights gained from all three samples. This section describes each sample and the purpose of including it in the study.

Individual comparison of UED versions.

1. Recruitment and randomization: Interested students took the Screening Survey online and were invited for in-person interviews. These students were randomized between UED treatments within the low, medium, and high strata determined by the Screening Survey.

2. First interview session:

a. Students completed the MV-EDAT, including the follow-on interview.

b. Students completed Section 1 of UED (the lesson on experimental design vocabulary and concepts), including the EDCT as a pre- and posttest around this first section. Their actions in the tutorial were recorded with screen recording. They were given unlimited time to complete this section of the tutorial. This completed the first interview session.

3. Second interview session:

a. Students completed Section 2 of UED (in which they designed and carried out simulated experiments), again with screen recording. They were given unlimited time to complete this section.

b. After completion, students were given a second version of the MV-EDAT (using a different prompt), including the follow-up interview.

A total of 42 students participated in the interviews, 14 per treatment. Three students in the ICWF treatment were removed from analysis because they never requested any feedback on their experimental designs in Section 2 and therefore could not be used to test for the effects of feedback, leaving a total of 39 students in the study.

Split-class comparison of UED versions.

The Individual sample was by necessity limited in size because each interview required extensive time and resources. To expand the data available to address the research questions we conducted a split-class study within an introductory biology class at a western U.S. masters-granting institution. The class consisted of 11 sections of around 24 students ( N = 259 total), with each section receiving one of the three UED versions. The sections were split across two lecture instructors and five lab teaching assistants. For practical reasons related to the class structure and the software architecture, we needed to assign each section to a treatment rather than randomizing treatments across all students. With only 11 sections, randomizing treatment by section was likely to lead to the introduction of confounding factors. Rather, when assigning the sections to different versions of the UED tutorial, we worked to balance sections across lecture instructors, lab TAs, and lab start times to minimize additional variation between treatment groups (e.g., TAs with two sections would have each section assigned to a different treatment). Data from this sample came from the EDCT and from student designs within the UED tutorial ( Table 2 ).

The UED tutorial was delivered through SimBio’s SimUText System, a robust and widely-used software package for distributing simulation-based biology teaching materials ( https://simbio.com ). The UED tutorial was assigned as homework, with credit given only for completion of the tutorial, not for correctness of responses within the tutorial. Before the UED assignment, another simulation-based module covering topics in evolution (SimBio’s Evolutionary Evidence) was assigned, so students had already overcome any logistical challenges to accessing the SimUText System. During the initial subscription process to the SimUText System students were asked for consent to participate in research. When discussing results from the EDCT assessment, we include only data from consenting students who completed all questions of the EDCT both before and after completing Section 1 of UED. When comparing experiments designed in Section 2 of UED between treatments, we include only data from consenting students who completed both the Initial Experiment and the Follow-Up Experiment in Section 2. Students in the ICWF (with feedback) treatment who never requested feedback were removed so data from that treatment only included students who received at least one piece of feedback on their design. Some students completed both EDCT assessments, but not both experimental design activities and vice versa, so the students analyzed for those two data sets overlap but are not identical. We refer to this sample as our “Split-Class” sample (see Table 3 for sample sizes).

Larger-scale testing of UED.

To probe the generality of our results, we provided the ICWF version of UED to 27 classes at 14 institutions during the 2016/17 school year (total N = 1348 consenting students). This version differed from those used above in that the EDCT was placed only as a posttest after Section 1, not as a pretest (in the interest of saving class time for instructors volunteering their classes). Of these 27 classes, we dropped three classes that had fewer than 10 consenting students each, along with any student who did not answer all questions on the EDCT, leaving a total of 24 classes and 1292 consenting students. We included all these students when analyzing EDCT scores. Two of these classes did not use Section 2 of UED, and several others had fewer than 10 students completing Section 2 of UED. Across the remaining 17 classes, 648 individual students fully completed Section 2. We include only these 648 students when analyzing experimental designs from the Initial and Follow-Up Experiments in Section 2. Data from this sample came from the EDCT posttest and from student designs within the UED tutorial, but unlike the other samples, there was only a single treatment for all students (the ICWF version). Thus, this sample cannot be used to test for treatment effects.

These classes came from a variety of institutions, including one community college, five liberal arts colleges, two master’s-granting, and six doctoral-granting institutions. Four classes were for upper-level biology students, one was introductory environmental science, and the rest were either majors or nonmajors introductory biology, with the number of consenting students ranging from 11–154 per class. Faculty were recruited through webinars offered via SimBio’s mailing list, and only institutions whose own IRB committees approved the study were included in the data used here. Faculty were asked to assign UED for credit in their course, but otherwise were free to use it as they wished. We collected some demographic information from students, including whether they used the tutorial as part of a group or individually, and whether they used it in or outside of class time. We refer to this sample as our “Larger-Scale” sample.

To summarize, we had three experimental samples: Individual, Split-Class, and Larger-Scale, that each provided different subsets of data relating to our research questions. Tables 2 and 3 provide an overview of what data was available from each sample.

We collected a large amount of data from each student in the Individual sample, but statistically the sample is small, particularly when divided into the three treatments. We thus chose to analyze data from the Individual sample using a combination of randomization tests and standard parametric tests. Randomization tests can perform better than conventional statistics for small samples and when it is unknown whether the data have a normal distribution ( Hesterberg et al. , 2003 ; Craig and Fisher, 2019 ). We wrote randomization tests in Python (attached: UEDRandomization.py). For comparisons where we ran both parametric and randomization tests, the results were similar, and we chose to report only the parametric results for conciseness.

To check for homogeneity of the samples, we compared the ratio of the variance of the preassessment scores to that of the postassessment scores, and it was suitably low (ratio = 1.66; Craig and Fisher, 2019 ). We used permutations of the data, as is appropriate when testing for significance ( Hesterberg et al. , 2003 ), with 10,000 permutations per test. Each permutation took the measured values and redistributed them randomly between students (see below). Where test statistics are two-tailed, we took the absolute value of each calculated statistic. For a statistical significance level of 0.05, we interpret a result as significant if < 5% of the permutation data sets have a test statistic value equal to or higher than that of the actual data set.

To compare single values between groups, such as comparing MV-EDAT scores between the Lizard and Fish prompts, scores were randomly redistributed between the Lizard and Fish categories during each randomization, and we used the t statistic to compare the groups.
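A minimal sketch of a two-group permutation test of this kind is shown below. It is written independently of the attached UEDRandomization.py; the function name and defaults are ours.

```python
# Sketch of a two-sample permutation test using the t statistic; written
# independently of UEDRandomization.py, so names and defaults are illustrative.
import numpy as np
from scipy import stats

def permutation_t_test(group_a, group_b, n_perm=10_000, seed=0):
    """Two-tailed p value: the fraction of label permutations whose |t| is at
    least as large as the observed |t|."""
    rng = np.random.default_rng(seed)
    pooled = np.concatenate([group_a, group_b])
    n_a = len(group_a)
    observed = abs(stats.ttest_ind(group_a, group_b).statistic)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)                     # randomly reassign group labels
        t = stats.ttest_ind(pooled[:n_a], pooled[n_a:]).statistic
        if abs(t) >= observed:
            exceed += 1
    return exceed / n_perm
```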

To compare pre- to postassessment scores on the MV-EDAT, we pooled all pre- and postassessment scores, randomly redistributed them between the pre- and postcategories, and then recalculated pre to post differences. Although no statistical difference was found between the Lizard and Fish prompts, we also controlled for differences in prompt difficulty by conducting a second randomization test where values were randomly redistributed only within the same prompt type (i.e., scores on the Lizard prompt were randomly assigned to other Lizard prompts and same for Fish). To control for differences among individual students, we separately tried a third randomization test where pre/post values were only redistributed within each student (i.e., each student had their actual pre- and postscores randomly mixed, but scores were not mixed between students). Neither controlling for interview prompt nor controlling for student differences changed our results, so for simplicity we report only the results for randomizing fully across all students and both prompt types.
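The within-student variant corresponds to a paired (sign-flip) permutation test, sketched below under the same caveat that the implementation details are our own.

```python
# Sketch of the within-student restriction: each student's pre and post scores
# are swapped (or kept) at random, preserving the pairing. Illustrative only.
import numpy as np

def paired_permutation_test(pre, post, n_perm=10_000, seed=0):
    """Two-tailed p value for the mean pre-to-post difference."""
    rng = np.random.default_rng(seed)
    diffs = np.asarray(post, dtype=float) - np.asarray(pre, dtype=float)
    observed = abs(diffs.mean())
    flips = rng.choice([-1.0, 1.0], size=(n_perm, diffs.size))  # swap or keep each pair
    perm_means = np.abs((flips * diffs).mean(axis=1))
    return float((perm_means >= observed).mean())
```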

We performed all parametric analyses in RStudio version 2023.06.0+421 ( R Core Team, 2023 ). We used repeated-measures ANOVA (RMANOVA) to compare Split-Class and Individual pre- to post performance on the EDCT, and Individual pre- to post performance on the MV-EDAT. We also compared pretest data across the three treatments in the Split Class sample using a single factor ANOVA. We checked all data for normality using Shapiro-Wilk tests and Q–Q plots and, in the case of repeated measures, checked for sphericity using Mauchly’s test of sphericity.

Differences on binary variables (whether a student received a perfect score on the Experimental and Biology Scores) were tested using Fisher’s exact test in R 3.1.1.
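As an example, the feedback comparison of Experimental Scores on the Follow-Up Experiment (perfect vs. nonperfect counts from Table 5) corresponds to a 2 × 2 test like the following; we show a scipy equivalent here rather than R's fisher.test, for consistency with the other sketches.

```python
# Equivalent 2x2 Fisher's exact test in Python (the study used R's fisher.test);
# counts are the perfect/nonperfect Experimental Scores from Table 5, Follow-Up Expt.
from scipy.stats import fisher_exact

table = [[40, 52 - 40],   # ICWF: perfect, nonperfect
         [29, 64 - 29]]   # ICNF: perfect, nonperfect
odds_ratio, p_value = fisher_exact(table, alternative="two-sided")
print(f"odds ratio = {odds_ratio:.1f}, p = {p_value:.4f}")
```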

We calculated normalized change scores from pre to post assessments according to Theobald and Freeman (2014) . Whenever we ran multiple analyses on the same data, we adjusted the alpha level using a Bonferroni correction. We calculated effect size as generalized eta-squared (η²g; Lakens, 2013 ).
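For reference, a sketch of a per-student normalized change calculation is below; it assumes the commonly used formulation in which gains are scaled by the room to improve and losses by the room to decline, which may differ in detail from the exact variant we applied.

```python
# Sketch of per-student normalized change under an assumed, commonly used formula.
def normalized_change(pre, post, max_score):
    """Return normalized change in [-1, 1], or None when undefined
    (pre == post == 0 or pre == post == max_score)."""
    if pre == post:
        return None if pre in (0, max_score) else 0.0
    if post > pre:
        return (post - pre) / (max_score - pre)
    return (post - pre) / pre
```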

Students were comparable in initial experimental design conceptual knowledge across all samples

To check for any differences between treatments within a sample, we compared student performance on the two pre-UED assessments – the EDCT and MV-EDAT. There was no significant difference between EDCT scores of students in the three treatments of either the Individual or the Split-Class sample (F(2, 162) = 0.766, p > 0.05). The sample size precludes us from knowing whether there might have been smaller undetected differences, but we have no evidence of any large differences between participants either within or between samples (the latter from unpublished data).

We do not have EDCT pretest data for the Larger-Scale sample, but we compared posttest (after UED section 1) EDCT data. Both Split-Class and Individual samples were in the middle of the range of the scores seen with the Larger-Scale classes ( Figure 3 ). We saw no clear trends in the larger-scale EDCT data with class level, institution type, or class size (data not shown).

FIGURE 3. Scores on the EDCT from Individual and Split-Class samples. Students in the Individual and Split-Class samples completed EDCT pre- and post-UED Section 1; Larger-Scale sample students completed it post only. Individual and Split-Class samples: Centerlines show the medians; box limits indicate the 25th and 75th percentiles; whiskers extend to the 5th and 95th percentiles; values outside these percentiles are shown as individual points; means are indicated by “+” symbol. The Individual and Split-Class samples were comparable in their pre scores ( p = 0.17) and both showed very small increases between pre and post.

We also compared preassessment scores on the MV-EDAT between treatments in the Individual sample. Although preassessment scores were a bit higher in the ICWF treatment than the others (4.0 ICWF; 2.6 ICNF; 3.1 LCNF), none of the differences were significant ( p > 0.17 for all comparisons with randomization test; unpublished data).

Higher constraint first section of UED has almost no effect on students’ conceptual knowledge of experimental design

To assess student learning from the higher constraint first section of UED, we compared students’ pre- and posttest scores on the EDCT (taken before and after Section 1 of UED) using a repeated-measures ANOVA. There were some minor departures from normality in the data, but the assumption of sphericity was met in both the Split-Class and Individual samples. While posttest scores were slightly higher in both the Individual and the Split-Class samples, the differences were very small based on any standard interpretation of effect size ( Table 4 ; Figure 3 ). Treatment had a small but significant effect across both time points in the Split-Class sample, but we did not find evidence of a treatment effect in the Individual sample. We did not see a significant interaction between treatment and time in either sample.

Summary of repeated-measures ANOVA (RMANOVA) tests for main and interaction effects of treatment and time on student performance on the EDCT and MV-EDAT in the Individual and Split-Class samples

Sample | Data | Test | Effect | F statistic | p value | Effect size (η²g)
Individual | EDCT | RMANOVA | Treatment | F(2, 24) = 0.052 | >0.05 | 0.003
Individual | EDCT | RMANOVA | Time | F(1, 12) = 9.042 | | 0.01
Individual | EDCT | RMANOVA | Treatment × Time | F(2, 24) = 1.363 | >0.05 | 0.004
Individual | MV-EDAT | RMANOVA | Treatment | F(2, 20) = 3.693 | | 0.119
Individual | MV-EDAT | RMANOVA | Time | F(1, 10) = 21.656 | | 0.278
Individual | MV-EDAT | RMANOVA | Treatment × Time | F(2, 20) = 0.347 | >0.05 | 0.007
Split-Class | EDCT | RMANOVA | Treatment | F(2, 94) = 22.1 | | 0.04
Split-Class | EDCT | RMANOVA | Time | F(1, 47) = 10.546 | | 0.009
Split-Class | EDCT | RMANOVA | Treatment × Time | F(2, 94) = 1.678 | >0.05 | 0.002

a EDCT = Experimental Design Concepts Test; MV-EDAT = Multiple Variable Experimental Design Ability Test; η²g = generalized eta-squared; bold p -values are significant at the 0.05 level.

b p value corrected due to violation of sphericity; before correction, p was < 0.05.

Students’ experimental design skills improve after using UED, as measured by the MV-EDAT transfer task

In the Individual sample, we saw overall improvement in students’ experimental design skills after completing UED, as measured on the MV-EDAT independent experimental design task ( Table 4 ). The data violated both normality and sphericity assumptions. Depending on the sphericity correction, the treatment effect is or is not significant; we took the conservative approach and consider it not significant ( Table 4 ). The experiments that the students designed on paper and described to interviewers were scored on a 0–2 scale on six experimental design elements, for a maximum possible score of 12. The Experimental Design score for students summed across all treatments improved significantly, from an average of 4.8 on the preassessment to an average of 7.8 on the postassessment, a large effect size (η²g = 0.28, or Cohen’s d = 1.0) ( Table 4 ; Figure 4 ). Most students (30 of 39) showed an increased score from pre to post. When examined independently, we saw student improvement within each of the three treatments (data not shown), so no one treatment was driving this effect.

FIGURE 4. Scores on the MV-EDAT in the Individual population ( n = 39), before and after completing the UED tutorial. Experiments that the students designed based on the MV-EDAT prompts were scored for six design elements on a 0–2 scale, for a maximum possible score of 12. The total score for students in all treatments improved significantly ( p < 0.01 using randomization test; effect size η²g = 0.278). First panel: paired pre- and postscores for each student in all three treatments ( n = 39); 30 students scored higher on the postassessment, two showed no change, and scores decreased for seven students. Line thickness indicates the number of students with each pre-post score (e.g., two students’ scores increased from 1 on the preassessment to 6 on the post; four students’ scores decreased from 8 to 7). Second panel: normalized change (mean 0.41, indicated by + symbol); centerline at the median; box limits at 25th and 75th percentiles; whiskers extend to 5th and 95th percentiles.

Three of the individual experimental design elements in the MV-EDAT – Uses Systematic Variation, Addresses Hypothesis, and Includes Replication – likewise, after Bonferroni correction (α = 0.008), showed highly significant improvement between pre- and postassessments for students in all treatments ( Figure 5 ; p < 0.001 for each using randomization tests). The other three elements – Includes Variables Held Constant ( p = 0.055), Includes Dependent Variable ( p = 0.25), and Includes Experiment Duration ( p = 0.068) – were not significant at the 0.05 level.

Complexity of designed experiments decreased.

To probe how students improved their experimental design skills by completing UED, we looked in more depth at the experimental design elements that showed improvement. The elements Includes Replication and Addresses Hypothesis are both relatively straightforward – that is, after completing UED, more students replicated all treatments in their MV-EDAT designs, and did a better job addressing the stated hypothesis. The change in the Systematic Variation element was more nuanced. We scored students well on Systematic Variation if they changed only one variable per treatment, except for where they explicitly test two variables and their interactions, and they included appropriate control(s). We intentionally included three potential explanatory variables in the MV-EDAT prompts, so that students could choose to test more than one variable in the experiments they designed. They could do this in one of two ways – either design three parallel experiments (one for each variable), or a single experiment that included all three variables. The latter is a more challenging experimental design, because this would require treatments with all combinations of the three variables for a fully balanced design.

On the preassessment, 79% of students designed experiments to test more than one variable ( Figure 6 ). Of these, the majority (77%) attempted to test multiple variables in a single experiment, rather than parallel experiments. This changed in the postassessment, where fewer students (51%) designed experiments to test more than one variable, with only half of these testing multiple variables in a single experiment. Thus, some of the improvement in the Systematic Variation score is because students chose to test only a single variable in the experiment they designed. But even among those students testing more than one variable in the post assessment, there was evidence of improvement in their experimental design. On the preassessment, the mean Systematic Variation score of those students who attempted to test only one variable was 1.75, compared with a mean of 0.84 for those who attempted to test more than one variable. On the postassessment, the difference was half as much – an average of 1.84 for those who attempted to test one variable, compared with 1.45 for those who attempted to test more than one ( Figure 6 ).

Probing interview questions showed no net change in declarative or conceptual knowledge.

After students described the experiment they designed based on the MV-EDAT prompts, we followed up with probing questions that were designed to elicit their declarative and conceptual knowledge. Students did not show any apparent net change in their responses to the eight probing interview questions that we analyzed. For most of the questions, most student answers in the preassessment were scored as one (partial evidence of understanding) or two (more complete evidence of understanding), with very few scores of zero, and the proportions of student scores changed very little in the postassessment. This suggests that the students interviewed came in with a fairly good baseline level of declarative and conceptual knowledge, but this contrasts with their lower level of procedural knowledge, as assessed by the actual experiments they designed in response to the MV-EDAT prompts. For example, most students (over 80% in both pre- and postassessment) could tell us when asked what they would measure, or what variables they would hold constant between groups in their experiment, but about 40% of students did not explicitly include a dependent variable or potentially confounding variables in their description (either verbal or written) of their experimental design, even on the postassessment.

Comparisons between treatments required a larger sample and different analyses

The strong and significant overall improvement in performance on the MV-EDAT across all treatments indicates that something about the UED tutorial is working to help students improve on these skills. Addressing our research questions about the roles of feedback and constraint, however, required a comparison between treatments. Ideally, we would have compared changes in MV-EDAT scores between treatments. The small sample size in the Individual sample, though, precludes statistically distinguishing between factors that might be leading to that improvement. So, with the context that a far-transfer assessment shows improved experimental design skill, we focus the rest of our analysis on the experiments students performed within UED. Crucially, the Split-Class sample provides larger data sets from the in-tutorial experimental designs, and we can thus use Split-Class data to better test the roles of feedback and constraint.

Feedback affects students’ experimental design practices

To test the effect of feedback on aiding student learning, we compared the ICWF and ICNF treatments in the Split-Class sample. The only difference between those UED versions was the presence or absence of feedback in the Initial Experiment that students perform. To compare, we examined the number of students who received a perfect Experimental Score on their Follow-Up Experiments, and separately the number who received perfect Biology Scores.

Students in the ICWF treatment were much more likely to have perfect in-tutorial Experimental Scores than those in the ICNF treatment, supporting the inference that feedback aids student learning of experimental design ( Table 5 ; Figure 7B ; odds ratio = 4.0).

TABLE 5. Comparing in-module experiments between treatments a

Comparison | Treatment 1 (perfect/total) | Treatment 2 (perfect/total) | p value | Odds ratio
Experimental Score, Follow-Up Expt. | ICWF (40/52) | ICNF (29/64) | p < 0.01 | 4.0
Biology Score, Initial Expt. | ICWF (46/52) | ICNF (43/64) | p = 0.008 | 3.7
Biology Score, Follow-Up Expt. | ICWF (45/52) | ICNF (49/64) | p = 0.23 | 2.0
Experimental Score, Follow-Up Expt. | ICNF (29/64) | LCNF (16/44) | p = 0.43 | 1.5
Biology Score, Initial Expt. | ICNF (43/64) | LCNF (19/44) | p = 0.02 | 2.7
Biology Score, Follow-Up Expt. | ICNF (49/64) | LCNF (27/44) | p = 0.13 | 2.0

a Comparisons are of students who received perfect scores on each component of experimental design, shown as a ratio to the total number of students per sample. Feedback tests compare the ICWF and ICNF treatments; constraint tests compare ICNF and LCNF. Comparisons use Fisher’s exact test.
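As a concrete illustration of the comparisons in Table 5, the first row can be reproduced from the reported counts with Fisher’s exact test in R (the analysis environment cited for this study). This is a minimal sketch built only from the perfect/total counts above, not the authors’ analysis script, and the object names are ours.

```r
# Perfect vs. non-perfect Experimental Scores on the Follow-Up Experiment,
# Split-Class sample, from Table 5: ICWF 40/52 perfect, ICNF 29/64 perfect.
counts <- matrix(c(40, 52 - 40,   # ICWF: perfect, not perfect
                   29, 64 - 29),  # ICNF: perfect, not perfect
                 nrow = 2, byrow = TRUE,
                 dimnames = list(treatment = c("ICWF", "ICNF"),
                                 score = c("perfect", "not perfect")))

fisher.test(counts)
# The simple cross-product odds ratio is (40 * 35) / (12 * 29), roughly 4.0,
# matching the value in the table; fisher.test() reports a conditional estimate
# that is close but not identical, and the p-value corresponds to the
# p = 0.0006 given in the Figure 7 legend.
```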

Students did not receive direct feedback on the components of the Biology Score when making their in-tutorial experimental designs, but they could learn appropriate settings by observing the simulation. In the Split-Class sample, we saw a significant difference on Biology Score between ICWF and ICNF in the Initial Experiment, but not in the Follow-Up Experiment ( Table 5 ).

We see similar patterns in the Individual sample where a higher proportion of students in the ICWF treatment had perfect Experimental Scores than in the ICNF treatment ( Table 5 ; Figure 7A ; odds ratio = 3.4). There were no patterns in that sample for Biology Score.

Change in constraint has little effect on student experimental design practices

To test the effect of constraint on aiding student learning, we compared the ICNF and LCNF treatments in the Split-Class sample. The only difference between these treatments was the degree of constraint (IC; LC) imposed on the experimental design activity. More Split-Class students had perfect Experimental Scores in the ICNF treatment than in the LCNF treatment, but the difference was not significant ( Table 5 ; Figure 7B ).

When looking at Biology Score in the Split-Class sample, a higher proportion of those in the ICNF treatment incorporated good natural history into their designs than those in the LCNF treatment. This difference was significant on the Initial but not the Follow-Up Experiment ( Table 5 ).

Again, the Individual sample showed a similar pattern. A higher proportion of ICNF students had perfect Experimental Scores compared with LCNF ( Table 5 ; Figure 7A ). There were no patterns for Biology Score.

We thus have no evidence that the difference in constraint affects students’ learning of core experimental design skills, but there may be an effect on students’ ability to properly incorporate details of the experimental system into a good experimental design.

Constraint and feedback have small effects on time-on-task

One might imagine that time on task would be different between the treatments, and this could affect our results. Median time-on-task for the second, open-ended section of UED varied a bit between treatments in the Individual sample (for which we have the most precise data on time spent on the section). Students in the ICWF treatment took a median of 58 min to complete the section, as compared with 61 min for ICNF and 66 min for LCNF. We assume that most of this time difference happened within the experimental design activity, the only part that was different between treatments. This interpretation is supported by the fact that students in the ICWF treatment tended to do fewer experimental runs (combining all runs for both the Initial and Follow-Up Experiments) than students in the other two treatments, in both the Individual (ICWF: 2.73 ± 1.19 SD, ICNF: 3.36 ± 1.45, LCNF: 3.21 ± 1.63) and Split-Class (ICWF: 2.29 ± 0.59, ICNF: 2.52 ± 0.96, LCNF: 2.89 ± 1.48) samples. Thus, higher constraint and feedback likely aided students in completing the key design activity more quickly and enabled them to learn from the feedback directly rather than relying on trial and error.
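For readers who want to reproduce this kind of summary from tutorial log data, the sketch below shows one way to compute per-treatment medians of time-on-task and means with standard deviations of run counts in base R. The data frame, column names, and values are hypothetical stand-ins; the actual log format is not specified here.

```r
# Hypothetical log summary: one row per student, with treatment,
# minutes spent on Section 2, and total experimental runs.
logs <- data.frame(
  treatment = c("ICWF", "ICNF", "LCNF", "ICWF", "ICNF", "LCNF"),
  minutes   = c(55, 63, 70, 61, 59, 72),
  runs      = c(2, 3, 4, 3, 4, 3)
)

# Median time-on-task and mean +/- SD of run counts, by treatment.
aggregate(cbind(minutes, runs) ~ treatment, data = logs,
          FUN = function(x) c(median = median(x), mean = mean(x), sd = sd(x)))
```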

Split-Class and Individual students perform similarly to students in other classes

The Individual sample was relatively small, and all participants came from one metro area. The Split-Class sample was all from one class at one school, albeit in a different metro area. To draw general conclusions from those results, it would be useful to have evidence that those students were not outliers in their experimental design learning. The Larger-Scale sample provides some evidence for this.

The Larger-Scale sample came from 17 classes at institutions around the United States. All students used the ICWF version of UED, so we cannot use these data to probe treatment effects, but we can compare the performance of this sample to the ICWF treatment in the two other samples. In the Larger-Scale sample, 66% of students designed an experiment that received the maximum Experimental Score of three (in individual classes, the proportion of students with scores of three ranged from 30–83%), and 75% received the maximum Biology Score of two (classes ranged from 56–100%) on the first Follow-Up Experiment they ran. The values for the equivalent ICWF treatments in the Split-Class sample (Experimental 69%; Biology 80%) are squarely in the middle of these ranges, and the Individual sample values (Experimental 80%; Biology 80%) are higher but also within the range, indicating that neither of those samples is an outlier in its performance on these assessments.

Correlations between assessments are present but vary in strength for different elements

We looked for correlations between the results from the three assessments we used in this study as an indication of whether they were measuring skills similarly. We did this knowing that there are deliberate differences in what each assessment measures.

The MV-EDAT and the in-tutorial experimental designs assess overlapping but not identical skills. Both assess student ability to use systematic variation, to replicate their treatments, and to address their chosen hypothesis, so we compared those specific skills between the two assessments. Of these, the two assessments are most parallel in their presentation and scoring for the skill of replication.

We found a significant correlation between the MV-EDAT and the in-tutorial experiments for replication: over half the students (23/39) showed similar skill levels (high, medium, or low) on both assessments ( p < 0.01, Fisher’s exact test).

By contrast, for systematic variation there was little correlation between MV-EDAT and in-tutorial design performance. The MV-EDAT had many more degrees of freedom (e.g., the prompt suggested three possible causal variables, and there were no suggestions about how to group or house the animals being tested).

There was also no significant correlation for whether students could fully match their hypothesis to their experimental design ( p = 0.22, Fisher’s exact test, comparing pretutorial MV-EDAT and Match Hypothesis score for the Initial Experiment). This is somewhat confounded, though, because in the tutorial, the page where students choose their first hypothesis is separated by several pages and multiple activities from where they conduct their Initial Experiment, while in the MV-EDAT there was no separation. We do note that in both the MV-EDAT (pretutorial) and the Initial in-tutorial experiment, most but not all students were able to fully match their hypothesis to their experiment in both Individual and Split-Class samples (MV-EDAT pretutorial = 61%; Individual Initial Experiment = 50%; Split-Class Initial Experiment = 58%), indicating some consistency in this skill between the assessments.

The questions in the EDCT were written to target the learning outcomes focused on in Section 1 of the UED tutorial, which address declarative/conceptual knowledge of experimental design, mostly at the lower levels of Bloom’s Taxonomy (Remember and Understand), while the experiments in Section 2 were designed to assess students’ procedural knowledge of these concepts (i.e., the Apply and Create levels of Bloom’s). We compared total EDCT posttest score against a sum of each student’s Experimental, Biology, and Match Hypothesis scores on their experiments in Section 2. As scores on the in-tutorial experiments varied by treatment, we did separate comparisons against each treatment, using data from the Split-Class samples. Correlation between the EDCT and in-tutorial scores ranged from low (r² = 0.3) for ICWF to virtually nonexistent for the other two treatments.
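As a sketch of how such a comparison could be computed, the R snippet below correlates a hypothetical vector of EDCT posttest totals with summed in-tutorial scores for one treatment group. The vectors and their values are illustrative stand-ins, not the study data.

```r
# Hypothetical scores for one treatment group (e.g., ICWF).
edct_post   <- c(14, 16, 12, 17, 15, 13, 18, 16)         # EDCT posttest totals
in_tutorial <- c(5.0, 5.5, 4.0, 6.0, 5.5, 4.5, 6.0, 5.0) # Experimental + Biology + Match Hypothesis

# Proportion of variance in the in-tutorial score explained by the EDCT (r^2).
fit <- lm(in_tutorial ~ edct_post)
summary(fit)$r.squared

# Equivalent Pearson correlation coefficient.
cor(edct_post, in_tutorial)
```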

In the Individual sample, we compared posttest EDCT and MV-EDAT. There was no correlation between scores on those assessments.

DISCUSSION

Active learning is widely accepted as good practice in science education, following the many studies showing active approaches to be superior to passive approaches in teaching ( Freeman et al. , 2014 ). But simply claiming to use active learning practices does not guarantee improved learning ( Andrews et al. , 2011 ), and certain active learning practices are more effective than, or better when combined with, others (e.g., Nehm et al. , 2022 ). Research questions have now moved beyond comparing active and passive learning to asking what particular aspects of an active learning approach lead to effective learning. Pushing this research forward is particularly important for complex concepts and skills, which are harder to measure and thus less likely to have developed clear guidance. In this study, we aimed to probe which design features of an activity to teach a complex biological skill – experimental design – led to increased learning. In particular, we look at the effects of feedback and constraint on learning in a digital inquiry-driven tutorial called UED.

The UED tutorial is effective at teaching experimental design skills, with large learning gains among college biology students (generalized eta squared η²g = 0.28; Cohen’s d effect size = 1.0) as measured by our independent assessment of experimental design ability, the MV-EDAT (derived from the EDAT; Sirum and Humburg, 2011 ). We do not have a comparison group of students who did not use UED in this study, so we draw no conclusions about whether UED is more or less effective than an equivalent use of time with another experimental design activity. We do note, though, that the learning gains we measured compare favorably to other activities designed to teach the experimental design process. Using the similar Expanded-EDAT assessment, Brownell et al. (2014) report an effect size of Cohen’s d = 0.36 from a one-class-period pencil-and-paper experimental design activity, while semester-long courses with a focus on experimental process report effect sizes (measured with Cohen’s d ) from 0.38 ( Abdullah et al. , 2015 ) to 0.99 ( Shanks et al. , 2017 ). Thus, it seems likely that UED is at least as effective as other possible activities designed to teach similar skills.
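For clarity about what the paired pre/post effect sizes represent, the sketch below computes a paired-sample Cohen’s d (the average-SD variant, d_av, discussed in Lakens, 2013) from hypothetical pre- and posttest MV-EDAT totals in R. This illustrates the calculation only; the values are invented, it is not the study’s analysis code, and other paired-d variants (e.g., d_z) would give somewhat different numbers.

```r
# Hypothetical paired MV-EDAT totals (six elements scored 0-2, so 0-12)
# for the same students before and after completing UED.
pre  <- c(5, 6, 4, 7, 5, 6, 3, 8)
post <- c(8, 9, 7, 10, 7, 9, 6, 11)

# Cohen's d using the average of the pre and post SDs (d_av in Lakens, 2013).
d_av <- (mean(post) - mean(pre)) / ((sd(pre) + sd(post)) / 2)
d_av

# A paired t-test on the same hypothetical data, for reference.
t.test(post, pre, paired = TRUE)
```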

Within the context of demonstrating that UED is an effective activity for learning experimental design skills, our study tried to tease out what features were most responsible for these learning gains, focusing on constraint and feedback. The metrics we have for this are not perfect. Ideally, we would have used our MV-EDAT data to compare treatments, but the interviews were too time-intensive to provide large enough sample sizes to allow for robust comparisons. We, therefore, draw many of our conclusions from comparing the experiments that students designed within Section 2 of the tutorial in the three treatments of the Split-Class sample. We argue that between-treatment differences in student performance on the in-tutorial experimental design tasks in this sample likely reflect differences in learning. This argument is supported by a correlation between the in-tutorial experimental designs and the MV-EDAT data on at least one measure (Replication), and some similarity in a second measure (Match Hypothesis), indicating that these two assessments measure related, though not identical, skills.

Feedback contributes to learning within an intermediate constraint activity

Our Split-Class data clearly show that students who received specific feedback on the experimental design task designed better experiments in UED, as measured by their Experimental Scores, which summed their scores for systematic variation, appropriate controls, and replication of all treatments in the experiments designed within the tutorial. The Experimental Score was higher in the ICWF treatment, where students received feedback, than in the ICNF treatment, where they did not ( Figure 7 ). We saw this higher Experimental Score in the Follow-Up Experiment, where neither treatment provided any feedback, so it was not just a function of students responding directly to the immediate feedback they received, but represented learning that lasted at least a short time and transferred to a similar activity.

Students did not receive feedback on the aspects of their design that went into the Biology Score (supplying enough plants to feed the Simploids and running the experiment long enough to see the disease progress), and although the students in the ICWF treatment had higher Biology Scores in the Initial Experiment, in the Follow-Up Experiments there were no differences between the ICWF and ICNF treatments. This lack of difference on the Follow-Up Experiment supports the impact of feedback on performance, because students in the two treatments ended with equivalent scores for an aspect of the experimental design where we did not provide direct feedback.

That feedback helps students is not a surprise, especially on learning higher-order tasks ( Van der Kleij et al. , 2015 ). While our study was not designed to test different types of feedback, our effect size was large compared with many other studies of feedback effectiveness ( Van der Kleij et al. , 2015 ; Wisniewski et al. , 2020 ). These results thus lend support to previous research indicating that immediate, elaborated feedback that includes information on what to do next is effective (e.g., Brooks et al. , 2019 ; Wisniewski et al. , 2020 ), especially for higher-order tasks ( Van der Kleij et al. , 2015 ). More novel to this study is showing that those characteristics of feedback remain effective for automated feedback on higher-order, intermediate constraint activities such as the simulation-based activities in UED, where providing feedback is challenging and few previous studies have been conducted. The idea for the UED experimental design activity originated with an earlier activity on natural selection called Darwinian Snails ( Abraham et al. , 2009 ; Clarke-Midura et al. , 2018 ), which had a much less constrained experimental design activity without any feedback. In our iterative testing of various learning modules with individual students and classes, we have found that by adding some constraints to an activity while still retaining much of the open-ended nature of a learning environment, as we have done in UED, we are able to devise algorithms to automatically provide specific feedback ( Meir, 2022 ). Our results here show this is worth doing, as the feedback clearly improves student performance and, we infer, student learning. In particular, there is a larger difference in student scores on the transfer task (the MV-EDAT) on skills for which explicit feedback was provided in the tutorial (e.g., Systematic Variation, Includes Replication, and Addresses Hypothesis) compared with those for which they did not receive feedback ( Figure 5 ).

FIGURE 5. Scores for the six individual experimental design elements of the MV-EDAT in the Individual sample ( n = 39), before and after completing UED. Bars show the proportion of students in all treatments combined scoring 0 (light gray), 1 (medium gray), or 2 (black) on that design element pre and post. The scores for students in all treatments were significantly higher ( p < 0.001) on the postassessment compared with the preassessment for three of the individual design elements – Systematic Variation, Includes Replication, and Addresses Hypothesis; the other three elements – Includes Variables Held Constant ( p = 0.055), Includes Dependent Variable ( p = 0.25), and Includes Duration ( p = 0.068) – did not change significantly.

FIGURE 6. Systematic Variation score by number of variables tested in the MV-EDAT. Students are divided into those who designed experiments testing a single variable (Tested 1) or more than one variable (Tested >1), and the bars show the proportion of students scoring 0 (light gray), 1 (medium gray), or 2 (black) on Systematic Variation in the pre- and postassessment. More students ( n = 31) tried to test multiple variables on the preassessment than on the postassessment ( n = 20), and those who attempted to test more than one variable scored lower on Systematic Variation. The gap in Systematic Variation scores between students manipulating one versus multiple variables was twice as large in the preassessment as in the postassessment, suggesting that students learned to test fewer variables and/or, when testing multiple variables, to do so more systematically.

Small changes in constraint have little effect on learning

When teaching complex scientific skills, how much freedom should one provide students within a learning environment? There are arguments in favor of both heavily constraining the student experience to make it easy for students to perform the skill (“structured inquiry” as defined by Colburn, 2000 ; Sweller et al. , 2007 ), and on the other extreme providing a largely discovery-based approach where students are given a space in which to explore and discover largely for themselves (“open inquiry”). Many activities provide constraint intermediate between those extremes, but even within the intermediate range, it’s not clear where on that axis learning best takes place.

Here we tested how varying constraints within an intermediate range might affect learning, and for the most part, we find little effect. We would consider both ICNF and LCNF treatments to be intermediate in degree of constraint compared with other activities we and other groups have designed. Within this range, we see only minor effects on learning. While there was a trend towards Experimental and Biology Scores being lower in LCNF, aside from Biology Score in the Initial Experiment, the difference was not significant. It is possible that with a larger sample those trends would have risen to significance, but from our results we conclude that even if an effect is there, it is not large.

This contrasts with other data comparing a much broader range of constraint on question types. In a separate, related study, we compared questions written in open-ended (short answer) versus intermediate constraint formats, for instance filling in blanks of sentences from predefined sets of words and phrases (“LabLibs”; Meir et al. , 2019 ). There, we found that student learning increased when they were asked questions in an intermediate constraint format compared with the same material using open-ended questions, in some cases even without specific feedback on their answers. Other research groups have also opted for intermediate levels of constraint in both learning and assessment environments (e.g., Blanchard et al. , 2010 ; Gobert et al. , 2012 ). Anecdotally, when we watched students use the much lower constraint experimental design activity in the Darwinian Snails module from which UED evolved, it appeared that many (perhaps most) did not take full advantage of that environment to truly explore. This aligns with data showing students using intermediate constraint questions can express their thinking with more clarity than in essay questions ( Meir et al. , 2019 ). Constraint may guide students towards more productive thinking and exploration in open environments. Our results here are consistent with previous work showing intermediate constraint activities are helpful in promoting learning, but do not offer much evidence in favor of the hypothesis that level of constraint matters.

Instead, we take these results as an invitation to consider other factors when developing learning tools. Rather than worrying about direct effects of different levels of constraint on learning, the primary considerations within the broad intermediate region may be indirect effects on other aspects of the environment, such as the ability to provide feedback and learning efficiency. To promote discovery learning, one might try to target the least constrained environment for which one can still devise algorithms to provide good feedback. In this light, as algorithms for providing feedback improve, it would be interesting to repeat the experiments we have done here comparing intermediate and low constraint environments where both have immediate, specific feedback. In the other direction, we note that students in the LCNF treatment took longer to complete the tutorial, without any evidence of greater learning. Assuming time on task tracks with extraneous cognitive load, one might also reasonably increase the constraint on the environment to maximize learning efficiency ( Paas and Van Merrienboer, 1993 ). Increasing constraint enough to provide automated feedback and scoring benefits instructors as well, allowing them to use activities in larger classes with less effort ( Table 1 ). Thus, instructional designers might set constraint higher or lower to maximize learning based on other factors that depend on the constraint, without worrying that the degree of constraint itself will impact learning.

Results may apply across a broad range of undergraduate students

While we were not able to conduct controlled comparisons in more than one class, we were able to compare data from the ICWF version of UED gathered from a broad range of classes across the United States. There were wide variations among student scores between classes, as one might expect. But the results from our Individual and Split-Class samples fit well within the range seen in other classes on both in-tutorial experimental design and scores on the EDCT (our test of conceptual knowledge), suggesting that our results here may apply to a broad range of students. Anecdotally, we have subsequently heard from instructors who feel their students did better on other experimental design tasks in their classes after having completed the UED tutorial, supporting this conclusion. The variation between classes may indicate that different populations would benefit from different amounts of feedback and constraint. As we argue elsewhere ( Meir, 2022 ), changing the level of constraint in the learning environment might be a particularly powerful lever to adjust activities to maximize student learning for different populations.

Performance-based assessment is important for complex knowledge and skills

We have a few lines of evidence of a disconnect between students’ declarative/conceptual knowledge of experimental design (i.e., defining terms and explaining concepts) and their procedural knowledge (i.e., designing an experiment). For example, there was a relatively high baseline performance on both the EDCT and interview probing questions, and relatively little change in performance on those assessments from pre to post (EDCT – Figure 3 ; interview questions – unpublished data). On the other hand, we have solid evidence of students improving in their procedural knowledge, as assessed by the MV-EDAT ( Figures 4 and 5 ) after going through the process of designing and running an experiment (as in Section 2 of UED). We also have evidence that immediate feedback, in particular, improved their procedural knowledge and experimental design skills, as measured by the in-tutorial experimental designs ( Figure 7 ). Overall, we saw more change in experimental design skills than in declarative and conceptual knowledge. It is also noteworthy that the two performance-based assessments (the MV-EDAT and the in-tutorial experimental designs) had some correlation, but we saw low to no correlation between the EDCT and either performance-based assessment.

FIGURE 7. Experimental Scores for the Follow-Up Experiments designed by students in Section 2 of the UED tutorial, by treatment. The components of the Experimental Score were: systematic variation (score 0 or 1); appropriate controls (score 0 or 1); and replication (score 0, 0.5, or 1), for a total possible score of 3. Effect of feedback: a higher proportion of students in the ICWF treatment achieved perfect Experimental Scores of 3 compared with the ICNF treatment in both the Individual (A) and Split-Class (B) sample. The difference is significant in the Split-Class sample ( p = 0.0006 with Fisher’s exact test; odds ratio = 4.0) and follows the same pattern in the Individual sample (sample was underpowered for statistics; odds ratio = 3.4). Effect of levels of constraint: more students in the ICNF treatment achieved perfect scores compared with the LCNF treatment, but the difference was not significant.

This is not to suggest that teaching declarative and conceptual knowledge is unnecessary for developing higher-level skills. Indeed, we learned from earlier iterations of experimental design activities that without establishing a common baseline of vocabulary and a review of basic conceptual principles, students were not always able to benefit from feedback that relied on this declarative and conceptual knowledge. But, if improving performance is the goal, performance-based activities are important for building complex skills, and procedural knowledge is best assessed with a performance-based assessment such as the in-tutorial experiment or the MV-EDAT (or other versions of the EDAT).

There are numerous studies showing that highly constrained assessments such as multiple-choice tests miss aspects of understanding and skills that less constrained assessments capture (e.g., Nehm et al. , 2012 ; Beggrow et al. , 2014 ; Hubbard et al. , 2017 ; Uhl et al. , 2021 ). Designing performance-based assessments with intermediate degrees of constraint may have benefits. Asking students to complete tasks with some constraints, such as the experimental design tasks in UED, may help gauge student skill level and pinpoint exactly where a student is confused in ways that higher constraint assessments, and potentially even low-constraint assessments, cannot, while also allowing those assessments to be autoscored ( Hubbard et al. , 2017 ; Meir, 2022 ).

Study limitations leave open other interpretations

Our own experimental design includes some inherent limitations, so the conclusions we reach above come with caveats.

While we validated the EDCT in several ways, this study was the first time it was used and there is a reasonable chance that it simply is not sensitive enough to distinguish large changes in understanding in the samples in our study. The Wright map, for instance, indicates that many items on the EDCT were easy for the samples we studied.

This study was also the first time that our revised version of the EDAT (the MV-EDAT) was used. We designed the MV-EDAT to fit our assessment needs, creating prompts with nonhuman contexts and deliberately suggesting more than one independent variable that could be tested, in order to assess how students deal with the realistic scenario of designing experiments with multiple putative causative independent variables. We also decided to implement the MV-EDAT with accompanying probing questions because we wanted to assess students’ declarative and conceptual understanding, which may not be apparent from what they volunteer on paper if they leave key concepts out of their experimental design description (e.g., what variables they would hold constant between treatments). Based on our results, we can recommend the use of the MV-EDAT prompts we designed in contexts where they might be useful (e.g., in a class focused on nonhuman rather than human biology, and when particularly interested in how students face the challenge of designing experiments with multiple independent variables). However, we did not see an added benefit to pairing the MV-EDAT with interviews designed to probe student declarative and conceptual knowledge. This level of knowledge is well suited to constrained choice tests, like the EDCT we developed. When implementing any version of the EDAT (e.g., the original EDAT, the Expanded EDAT, or our MV-EDAT), researchers or instructors should understand that procedural knowledge is what is being tested. Our experience with the interviews shows that a student omitting an element from their experimental design does not mean they cannot identify, define, or explain that element. Our experience simply reinforces the importance of identifying what level of knowledge you are interested in assessing and choosing the appropriate assessment for that level.

We also acknowledge that many of our conclusions comparing among treatments may be limited because they are based on an assessment task within the activity itself, rather than the more independent MV-EDAT. Given the consistency on the within-tutorial assessments between the Individual and Split-Class samples, we think it likely that, had we been able to devote the resources to complete the full interview protocol with a larger number of students, we would have seen the same results in the MV-EDAT, based on nonsignificant trends in those data. It is certainly possible, however, that feedback only affected students’ ability to complete the experimental design tasks within UED, and did not have a differential effect on their ability to transfer that learning to the other context represented in the MV-EDAT. Supporting our interpretation, though, is that the element scored most similarly between the in-tutorial exercises and the MV-EDAT (Replication) was correlated between the two assessments.

CONCLUSIONS

While there is no doubt that active learning approaches are critical for mastering core scientific skills and knowledge, the phrases “active learning,” “student-centered teaching,” and other similar language encompass a broad range of activities. To determine which approaches within that range are most effective in different situations requires experiments that test alternatives of how to design those activities ( Freeman et al. , 2014 ). Here we show that two key axes upon which learning activities can vary, feedback and constraint, are both likely to be important in maximizing learning of a core skill in biological science, although for different reasons. We show that immediate, specific feedback is highly effective for helping students learn. Our data suggests that some variation in constraint, at least within the intermediate range, may not have a large direct effect on learning. But because constraint allows feedback and has other indirect effects, degree of constraint is useful to consider as a way of maximizing learning through other avenues. While our research focused on experimental design skills, we suggest these results may also be applicable to the teaching of other skills of similar complexity.

ACKNOWLEDGMENTS

This multifaceted study would not have been possible without help from many people on and off the project team. As part of the team at various times, we greatly appreciate the intellectual contributions made by Jody Clarke-Midura, Jenna Conversano, and Eric Klopfer, administrative help from Evelyne Tschibelu and Carol Burke, and the many people at SimBiotic Software involved in putting together the special versions of UED used here, including Steve Allison-Bunell, Eric Harris, Josh Quick, Derek Stal, Eleanor Steinberg, and Jennifer Wallner. We also thank our advisory board of Ross Nehm, Ryan Baker, and Kathryn Perez. We thank the editor for a heroic statistical review and two anonymous reviewers who provided feedback that greatly improved the manuscript. Outside the team we extend a special thanks to the many instructors and students who took part in the study. This material is based upon work supported in part by the National Science Foundation under Grant No. 1227245 and while serving at the National Science Foundation.

  • Abdullah, C., Parris, J., Lie, R., Guzdar, A., & Tour, E. ( 2015 ). Critical analysis of primary literature in a master’s-level class: Effects on self-efficacy and science-process skills . CBE—Life Sciences Education , 14 (3), ar34.  https://doi.org/10.1187/cbe.14-10-0180 Medline ,  Google Scholar
  • Abraham, J. K., Meir, E., Perry, J., Herron, J. C., Maruca, S., & Stal, D. ( 2009 ). Addressing undergraduate student misconceptions about natural selection with an interactive simulated laboratory . Evolution: Education and Outreach , 2 (3), 393–404.  https://doi.org/10.1007/s12052-009-0142-3 Google Scholar
  • American Association for the Advancement of Science . ( 2011 ). Vision and change in undergraduate biology education: A call to action . Washington, DC: AAAS. Google Scholar
  • Ambrose, S. A., Bridges, M. W., DiPietro, M., Lovett, M. C., & Norman, M. K. ( 2010 ). How learning works: Seven research-based principles for smart teaching . San Francisco, CA: Jossey-Bass. Google Scholar
  • Andrews, T. M., Leonard, M. J., Colgrove, C. A., & Kalinowski, S. T. ( 2011 ). Active learning not associated with student learning in a random sample of college biology courses. CBE—Life Sciences Education , 10 (4), 394–405.  https://doi.org/10.1187/cbe.11-07-0061 Link ,  Google Scholar
  • Arthurs, L. A., & Kreager, B. Z. ( 2017 ). An integrative review of in-class activities that enable active learning in college science classroom settings , International Journal of Science Education , 39 (15), 2073–2091, https://doi.org/10.1080/09500693.2017.1363925 Google Scholar
  • Baker, R. J. D. ( 2011 ). Gaming the system: A retrospective look . Philipp Comput J , 6 (2011), 9–13. Google Scholar
  • Behar-Horenstein, L. S., & Niu, L. ( 2011 ). Teaching critical thinking skills in higher education: A review of the literature . J. College Teaching and Learning , 8 (2), 25–42.  https://doi.org/10.19030/tlc.v8i2.3554 Google Scholar
  • Beggrow, E. P., Ha, M., Nehm, R. H., Pearl, D., & Boone, W. J. ( 2014 ). Assessing scientific practices using machine-learning methods: How closely do they match clinical interview performance? Journal of Science Education and Technology , 23 , 160–182.  https://doi.org/10.1007/s10956-013-9461-9 Google Scholar
  • Blanchard, M. R., Southerland, S. A., Osborne, J. W., Sampson, V. D., Annetta, L. A., & Granger, E. M. ( 2010 ). Is inquiry possible in light of accountability? A quantitative comparison of the relative effectiveness of guided inquiry and verification laboratory instruction . Science Education , 94 , 577–616. Google Scholar
  • Bodine, E. N., Panoff, R. M., Voit, E. O., & Weisstein, A. E. ( 2020 ). Agent-based modeling and simulation in mathematics and biology education . Bulletin of Mathematical Biology , 82 , 101.  https://doi.org/10.1007/s11538-020-00778-z Medline ,  Google Scholar
  • Brooks, C., Carroll, A., Gillies, R. M., & Hattie, J. ( 2019 ). A matrix of feedback for learning . Australian Journal of Teacher Education , 44 (4), 14–32. http://dx.doi.org/10.14221/ajte.2018v44n4.2 Google Scholar
  • Brownell, S. E., Wenderoth, M. P., Theobald, R., Okoroafor, N., Koval, M., Freeman, S. , … & Crowe, A. J. ( 2014 ). How students think about experimental design: novel conceptions revealed by in-class activities . BioScience , 64 (2), 125–137.  https://doi.org/10.1093/biosci/bit016 Google Scholar
  • Buck, L. B., Bretz, S. L., & Towns, M. H. ( 2008 ). Characterizing the level of inquiry in the undergraduate laboratory . J. College Science Teaching , 38 , 52–58. Google Scholar
  • Chernikova, O., Heitzmann, N., Stadler, M., Holtzberger, D., Seidel, T., & Fischer, F. ( 2020 ). Simulation-based learning in higher education: A meta-analysis . Review of Educational Research , 90 (4), 499–541. https://doi.org/10.3102/0034654320933544 Google Scholar
  • Clarke-Midura, J., Pope, D. S., Maruca, S., Abraham, J. K., & Meir, E. ( 2018 ). Iterative design of a simulation-based module for teaching evolution by natural selection . Evolution Education & Outreach , 11 (4), 1–17.  https://doi.org/10.1186/s12052-018-0078-6 Google Scholar
  • Colburn, A. ( 2000 ). An inquiry primer . Science Scope , 23 (6), 42–44. Google Scholar
  • Craig, A. R., & Fisher, W. W. ( 2019 ). Randomization tests as alternative analysis methods for behavior-analytic data . Journal of the Experimental Analysis of Behavior , 111 (2), 309–328.  https://doi.org/10.1002/jeab.500 Medline ,  Google Scholar
  • Dasgupta, A. P., Anderson, T. R., & Pelaez, N. ( 2017 ). Development and validation of a rubric for diagnosing students’ experimental design knowledge and difficulties . CBE—Life Sciences Education , 13 (2), 265–284.  https://doi.org/10.1187/cbe.13-09-0192 Google Scholar
  • Deane, T., Nomme, K., Jeffery, E., Pollock, C., & Birol, G. ( 2017 ). Development of the biological experimental design concept inventory (BEDCI) . CBE—Life Sciences Education , 13 (3), 540–551.  https://doi.org/10.1187/cbe.13-11-0218 Google Scholar
  • Freeman, S., Eddy, S. L., McDonough, M., Smith, M. K., Okoroafor, N., Jordt, H., & Wenderoth, M. P. ( 2014 ). Active learning increases student performance in science, engineering, and mathematics . Proc Natl Acad Sci USA , 111 (23), 8410–8415.  https://doi.org/10.1073/pnas.1319030111 Medline ,  Google Scholar
  • Gobert, J. D., Sao Pedro, M. A., Baker, R. S. J.D., Toto, E., & Montalvo, O. ( 2012 ). Leveraging educational data mining for real-time performance assessment of scientific inquiry skills within microworlds . Journal of Educational Data Mining , 4 (1), 104–143. Google Scholar
  • Gobert, J. D., Sao Pedro, M., Raziuddin, J., & Baker, R. S. ( 2013 ). From log files to assessment metrics: measuring students' science inquiry skills using educational data mining . Journal of the Learning Sciences , 22 (4), 521–563.  https://doi.org/10.1080/10508406.2013.837391 Google Scholar
  • Hattie, J., & Timperley, H. ( 2007 ). The power of feedback . Review of Educational Research , 77 (1), 81–112. Google Scholar
  • Hesterberg, T. C., Moore, D. S., Monaghan, S., Clipson, A., & Epstein, R. ( 2003 ). Bootstrap methods and permutation tests . In: The Practice of Business Statistics (pp. 16.1–16.57). New York, NY: W.H. Freeman & Co. Google Scholar
  • Hubbard, J. K., Potts, M. A., & Couch, B. A. ( 2017 ). How question types reveal student thinking: An experimental comparison of multiple–true–false and free-response formats . CBE—Life Sciences Education , 16 (2), ar26. https://doi.org/10.1187/cbe.16-12-0339 Medline ,  Google Scholar
  • Kingston, N., & Nash, B. ( 2011 ). Formative assessment: A meta-analysis and a call for research . Educational Measurement: Issues and Practice , 30 (4), 28–37.  https://doi.org/10.1111/j.1745-3992.2011.00220.x Google Scholar
  • Klopfer, E., Scheintaub, H., Huang, W., Wendel, D., & Roque, R. ( 2009 ). The simulation cycle: Combining games, simulations, engineering and science using StarLogo TNG . E-Learning , 6 (1), 71–96. Google Scholar
  • Kuhn, D., Pease, M., & Wirkala, C. ( 2009 ). Coordinating the effects of multiple variables: A skill fundamental to scientific thinking . Journal of Experimental Child Psychology , 103 (3), 268–284.  https://doi.org/10.1016/j.jecp.2009.01.009 Medline ,  Google Scholar
  • Lakens, D. ( 2013 ). Calculating and reporting effect sizes to facilitate cumulative science: A practical primer for t-tests and ANOVAs . Front Psychol , 4 , 863. https://doi.org/10.3389/fpsyg.2013.00863 Medline ,  Google Scholar
  • Magalhaes, P., Ferreira, D., Cunha, J., & Rosario, P. ( 2020 ). Online vs traditional homework: A systematic review on the benefits to students’ performance . Computers and Education , 152 , 103869. www.sciencedirect.com/science/article/abs/pii/S0360131520300695 Google Scholar
  • Maier, U., Wolf, N., & Randler, C. ( 2016 ). Effects of a computer-assisted formative assessment intervention based on multiple-tier diagnostic items and different feedback types . Computers & Education , 95 (2016), 85–98.  http://dx.doi.org/10.1016/j.compedu.2015.12.002 Google Scholar
  • McConnell, D. A., Chapman, L., Czajka, D., Jones, J. P., Ryker, K. D., & Wiggen, J. ( 2017 ). Instructional Utility and Learning Efficacy of Common Active Learning Strategies . Journal of Geoscience Education , 65 , 604–625. Google Scholar
  • McMillan, J. H., Venable, J. C., & Varier, D. ( 2013 ). Studies of the effect of formative assessment on student achievement: So much more is needed . Practical Assessment Research & Evaluation , 18 (2), 1–15. Google Scholar
  • Meir, E. ( 2022 ). Strategies for targeting the learning of complex skills like experimentation to different student levels: The intermediate constraint hypothesis . In Pelaez, N. J.Gardner, S. M.Anderson, T. R. (Eds.), Trends in Teaching Experimentation in Life Sciences (pp. 523–545). Cham, Switzerland: Springer Nature Switzerland AG. Google Scholar
  • Meir, E., Wendel, D., Pope, D. S., Hsiao, L., Chen, D., & Kim, K. J. ( 2019 ). Are intermediate constraint question formats useful for evaluating student thinking and promoting learning in formative assessments? Computers & Education , 141 , 103606. https://doi.org/10.1016/j.compedu.2019.103606 Google Scholar
  • Momsen, J. L., Long, T. M., Wyse, S. A., & Ebert-May, D. ( 2010 ). Just the facts? Introductory undergraduate biology courses focus on low-level cognitive skills . CBE—Life Sciences Education , 9 , 435–440.  https://doi.org/10.1187/cbe.10-01-0001 Link ,  Google Scholar
  • Nehm, R. H. ( 2019 ). Biology education research: Building integrative frameworks for teaching and learning about living systems . Discip Interdscip Sci Educ Res , 1 , 15 (2019).  https://doi.org/10.1186/s43031-019-0017-6 Google Scholar
  • Nehm, R. H., Beggrow, E. P., Opfer, J. E., & Ha, M. ( 2012 ). Reasoning about natural selection: Diagnosing contextual competency using the ACORNS instrument . American Biology Teacher , 74 , 92–98. Google Scholar
  • Nehm, R. H., Finch, S. J., & Sbeglia, G. C. ( 2022 ). Is active learning enough? The contributions of misconception-focused instruction and active-learning dosage on student learning of evolution . BioScience , 72 (11), 1105–1117.  https://doi.org/10.1093/biosci/biac073 Google Scholar
  • NGSS Lead States . ( 2013 ). Next Generation Science Standards: For States, By States . Washington, DC: The National Academies Press. Google Scholar
  • Paas, F. G. W.C., & Van Merrienboer, J. J. G.V. ( 1993 ). The efficiency of instructional conditions: An approach to combine mental effort and performance measures . Human Factors , 35 (4), 737–743. Google Scholar
  • Pelaez, N. J., Anderson, T. R., Gardner, S. M., Yin, Y., Abraham, J. K., Bartlett, E. , … & Stevens, M. ( 2017 ). The basic competencies of biological experimentation: Concept-skill statements . PIBERG Instructional Innovation Materials , Paper, 4. Retrieved from http://docs.lib.purdue.edu/pibergiim/4 Google Scholar
  • Pelaez, N. J., Gardner, S. M., & Anderson, T. R. ( 2022 ). The problem with teaching experimentation: Development and use of a framework to define fundamental competencies for biological experimentation . In Pelaez, N. J.Gardner, S. M.Anderson, T. R. (Eds.), Trends in Teaching Experimentation in Life Sciences (pp. 3–27). Cham, Switzerland: Springer Nature Switzerland AG. Google Scholar
  • Pope, D., Maruca, S., Palacio, J., Meir, E., & Herron, J. ( 2016 ). Understanding Experimental Design . Missoula, MT: SimBiotic Software. Simbio.com Google Scholar
  • Puntambekar, S., Gnesdilow, D., Dornfeld Tissenbaum, C., Narayanan, N. H., & Rebello, N. S. ( 2020 ). Supporting middle school students’ science talk: A comparison of physical and virtual labs . J Research Science Teaching , 58 , 392–419.  https://doi.org/10.1002/tea.21664 Google Scholar
  • R Core Team ( 2023 ). R: A language and environment for statistical computing . Vienna, Austria: R Foundation for Statistical Computing. Retrieved from www.R-project.org/ Google Scholar
  • Rutten, N., van Joolingen, W. R., & van der Veen, J. T. ( 2012 ). The learning effects of computer simulations in science education . Computers & Education , 58 , 136–153. doi: 10.1016/j.compedu.2011.07.017 Google Scholar
  • Scalise, K., & Gifford, B. ( 2006 ). Computer-based assessment in e-learning: A framework for constructing “intermediate constraint” questions and tasks for technology platforms . Journal of Technology, Learning, and Assessment , 4 (6), 1–46. http://files.eric.ed.gov/fulltext/EJ843857.pdf Google Scholar
  • Schwichow, M., Croker, S., Zimmerman, C., Höffler, T., & Härtig, H. ( 2016 ). Teaching the control-of-variables strategy: A meta-analysis . Developmental Review , 39 , 37–63 http://dx.doi.org/10.1016/j.dr.2015.12.001 Google Scholar
  • Shute, V. J. ( 2008 ). Focus on formative feedback . Review of Educational Research , 78 (1), 153–189.  https://doi.org/10.3102/0034654307313795 Google Scholar
  • Sirum, K., & Humburg, J. ( 2011 ). The Experimental Design Ability Test (EDAT) . Bioscene , 37 , 8–16. Google Scholar
  • Shanks, R. A., Robertson, C. L., Haygood, C. S., Herdliksa, A. M., Herdliska, H. R., & Lloyd, S. A. ( 2017 ). Measuring and advancing experimental design ability in an introductory course without altering existing lab curriculum . J Microbiology & Biology Education , 18 (1), 1–8.  https://doi.org/10.1128/jmbe.v18i1.1194 Google Scholar
  • Sweller, J., Kirschner, P. A., & Clark, R. E. ( 2007 ). Why minimally guided teaching techniques do not work: A reply to commentaries . Educational Psychologist , 42 , 115–121. Google Scholar
  • Theobald, R., & Freeman, S. ( 2014 ). Is it the intervention or the students? Using linear regression to control for student characteristics in undergraduate STEM education research . CBE—Life Sciences Education , 13 , 41–48.  https://doi.org/10.1187/cbe-13-07-0136 Link ,  Google Scholar
  • Uhl, J. D., Sripathi, K. N., Meir, E., Merrill, J., Urban-Lurain, M., & Haudek, K. C. ( 2021 ). Automated writing assessments measure undergraduate learning after completion of a computer-based cellular respiration tutorial . CBE—Life Sciences Education , 20 (3), ar33.  https://doi.org/10.1187/cbe.20-06-0122 Medline ,  Google Scholar
  • Urry, L. A., Cain, M. L., Wasserman, S. A., Minorsky, P. V., & Reece, J. B. ( 2017 ). Mastering Biology . New York, NY: Pearson Education Inc. Google Scholar
  • Van der Kleij, F. M., Feskens, R. C. W., & Eggen, T. J. H.M. ( 2015 ). Effects of feedback in a computer-based learning environment on students’ learning outcomes: A meta-analysis . Review of Education Research , 85 (4), 1–37.  https://doi.org/10.3102/0034654314564881 Google Scholar
  • Wang, Z., Gong, S. Y., Xu, S., & Hu, X. E. ( 2019 ). Elaborated feedback and learning: Examining cognitive and motivational influences . Computers & Education , 136 (2019), 130–140.  https://doi.org/10.1016/j.compedu.2019.04.003 Google Scholar
  • Wisniewski, B., Zierer, K., & Hattie, J. ( 2020 ). The power of feedback revisited: A meta-analysis of educational feedback research . Frontiers in Psychology , 10 , 3087. https://doi.org/10.3389/fpsyg.2019.03087 Medline ,  Google Scholar
  • Woolley, J. S., Deal, A. M., Green, J., Hathenbruck, F., Kurtz, S. A., Park, T. K. H. , … & Jensen, J. L. ( 2018 ). Undergraduate students demonstrate common false scientific reasoning strategies . Thinking Skills and Creativity , 27 , 101–113.  https://doi.org/10.1016/j.tsc.2017.12.004 Google Scholar
  • Zhu, M., Liu, O. L., & Lee, H. S. ( 2020 ). The effect of automated feedback on revision behavior and learning gains in formative assessment of scientific argument writing . Computers and Education , 143 (2020), 103668.  https://doi.org/10.1016/j.compedu.2019.103668 Google Scholar


Submitted: 11 August 2022 Revised: 18 September 2023 Accepted: 3 October 2023

© 2024 E. Meir et al. CBE—Life Sciences Education © 2024 The American Society for Cell Biology. This article is distributed by The American Society for Cell Biology under license from the author(s). It is available to the public under an Attribution–Noncommercial–Share Alike 4.0 Unported Creative Commons License (http://creativecommons.org/licenses/by-nc-sa/4.0).

  • Original Article
  • Open access
  • Published: 04 June 2024

Morality matters: social psychological perspectives on how and why CSR activities and communications affect stakeholders’ support - experimental design evidence for the mediating role of perceived organizational morality comparing WEIRD (UK) and non-WEIRD (Russia) country

  • Tatiana Chopova   ORCID: orcid.org/0000-0003-0249-4879 1 ,
  • Naomi Ellemers   ORCID: orcid.org/0000-0001-9810-1165 1 &
  • Elena Sinelnikova 2  

International Journal of Corporate Social Responsibility , volume 9, Article number: 10 (2024)


Companies’ communications about Corporate Social Responsibility (CSR) have become increasingly prevalent, yet the psychological reasons why those communications might lead to positive reactions of the general public are not fully understood. Building on theories on impression formation and social evaluation, we assess how CSR communications affect perceived morality and competence of a company. We theorize that the organization’s CSR activities would positively impact on perceived organizational morality rather than on perceived organizational competence and that this increase in perceived organizational morality leads to an increase in stakeholders’ support. Two experimental design studies show support for our theorizing. We cross-validated the robustness and generality of the prediction in two countries with different business practices (UK ( N  = 203), Russia ( N  = 96)). We demonstrated that while the general perceptions of companies and CSR differ between the UK and Russia, the underlying psychological mechanisms work in a similar fashion. By testing our predictions in western, educated, industrialized, rich, and democratic (WEIRD) and in non-WEIRD countries, we also extend current socio-psychological insights on the social evaluation of others. We discuss theoretical and practical implications.

Introduction

Almost every day on the news, people read about positive actions of various companies, such as promoting diversity or working on environmentally friendly production solutions (Corporate Social Responsibility, or CSR, activities). People are becoming increasingly aware of the importance of CSR, including addressing environmental issues (Sabherwal et al., 2021 ). Corporate communications about these types of activities are increasingly prevalent, and they have become an important topic in academic research across different disciplines (e.g. Aguinis & Glavas, 2012 ). Will such news affect your perceptions of the company, and why? While there is a large body of evidence suggesting that you would be positively affected by such corporate communications, the reasons why this is the case are not fully understood (Aguinis & Glavas, 2012 ; Jamali & Karam, 2018 ; Simpson & Aprim, 2018 ).

In the present research, we address the identified research need and we contribute to the current literature in several ways. First, we apply the insights from Social Identity Theory (Tajfel, 1974 ; Tajfel & Turner, 1979 , 1986 ) and theories on social evaluation of others (Abele et al., 2021 ; Abele & Wojciszke, 2007 ; Hack et al., 2013 ; Wojciszke et al., 1998 ) to explain the relationship between CSR activities and stakeholders’ reactions (i.e. reactions of actual or potential employees or customers of a company), thus extending prior micro- or individual-level CSR literature (Aguinis & Glavas, 2019 ; Jamali & Karam, 2018 ). By applying theories of social evaluation to people’s assessments of companies, we extend the emerging theory on how people develop impressions of non-human subjects (Ashforth et al., 2020 ; Epley et al., 2007 ; Gawronski et al., 2018 ). Second, we provide empirical evidence for our theorizing by conducting experimental design studies in two countries (Russia and the UK) with different business practices (e.g. Russia is ranked near the bottom of the corruption index offered by Transparency International (137 out of 180 countries), while the UK is ranked 12 out of 180), which can impact the development and perceptions of CSR. We propose and demonstrate that while country-specific conditions can indeed influence both the types of CSR activities (Awuah, et al., 2021 ; Ervits, 2021 ) and stakeholders’ reactions to CSR activities (Grabner-Kräuter et al., 2020 ; Jamali & Karam, 2018 ), the socio-psychological mechanisms explaining the relationship between CSR and stakeholders’ support work in a similar fashion in two countries with different business practices (Cuddy et al., 2009 ). Finally, in the social psychological and organizational behavior literature there are growing concerns about the potential lack of generalizability of study results, as most of the theory is supported by empirical evidence obtained in Western, Educated, Industrialized, Rich, Democratic (WEIRD) countries (Cheon et al., 2020 ; Henrich et al., 2010b ). This is particularly problematic since WEIRD-based research accounts for over 90% of psychological research, while only 12% of the world lives in WEIRD countries (Henrich et al., 2010a ). Thus, by explicitly testing our theorizing in both WEIRD and non-WEIRD samples, we extend current socio-psychological insights on the social evaluation of others.

Morality and competence as key dimensions for social evaluation of others

Individuals assess others on the basis of two key dimensions. Although different approaches have emphasized slightly different aspects of these dimensions and use different labels, the two key dimensions can generally be interpreted as referring to task ability (competence/agency) vs. interpersonal intentions (morality/communion/warmth) (Fiske et al., 2007 ; Goodwin et al., 2014 ; Leach et al., 2007 ; Wojciszke, 1994 ). We know that those key dimensions capture distinct behavioral features of various targets (Wojciszke, 1994 ).

Importantly, researchers have started to apply dimensions of social evaluation of other human targets to the emerging theory on how people develop impressions of non-human subjects such as companies and brands (Kervyn et al., 2012 ; Shea & Hawn, 2019 ). Similarly, we apply those two dimensions of social evaluation to people’s perceptions of companies, thus building on this latest trend in the organizational behavior literature to leverage on the findings from social psychology as people tend to anthropomorphize non-human targets, including organizations (Ashforth et al., 2020 ; Epley et al., 2007 ).

We know that, generally speaking, CSR activities imply that a company is focusing on something above and beyond what is strictly required by law (McWilliams & Siegel, 2001 ). One of the recognized key goals of a company is to make a profit. When organizations engage in CSR, this generally cannot be explained by profit-making motives or by legal requirements. Examples of CSR activities include introducing additional measures to attract minority groups or better accommodating employees or customers with disabilities. Behaving responsibly is generally seen as ethical (Carroll, 2016 ; Mitnick et al., 2023 ) or ‘morally good’, and hence this might improve the perceived morality of a company. To date, the specific relationship between displays of CSR and perceptions of organizational morality, or perceived trustworthiness (Leach et al., 2015 ), of companies has mainly been established with survey-based studies (e.g., Ellemers et al., 2011 ; Farooq et al., 2014 ; Hillenbrand, et al., 2013 ). Accordingly, we would expect that learning about companies’ CSR activities would increase the perceived organizational morality of a company. We use experimental design studies that allow us to draw causal conclusions (Shadish et al., 2002 ), thus providing a strong test of our prediction. Our work speaks to the classic admonition that in research there is “no causation without manipulation” (Holland, 1986 ).

Hypothesis 1: Learning about companies’ CSR activities would increase the perceived organizational morality of a company.

Organizational morality as a source of stakeholders’ support

The fact that morality and competence, as two key dimensions of impression formation, account for over 80% of the variance in our impressions of others (Wojciszke et al., 1998) means that any information that positively affects either of these dimensions should result in a more positive overall impression of the evaluated target. Since we apply morality and competence to the evaluation of companies, this implies that any information about a company that positively affects either of these dimensions should result in a more positive overall impression of the company, or an overall increase in stakeholders’ support for the company. In a business context, competence is clearly important. It seems evident that if a company is perceived as more competent, for example because it has better products than its competitors, then such a company would receive more support from customers and would be better positioned to attract and retain employees. Why an increase in perceived organizational morality would also positively affect stakeholders’ support in a business context can be explained by Social Identity Theory (Tajfel, 1974; Tajfel & Turner, 1979, 1986).

Based on Social Identity Theory (Tajfel, 1974; Tajfel & Turner, 1979, 1986), it has been argued and shown that the perceived characteristics of an organization determine its subjective attractiveness and drive the willingness of individuals to associate with that organization (Ashforth & Mael, 1989; Ellemers et al., 2004; Haslam et al., 2009; Haslam et al., 2000). Furthermore, people tend to identify with companies not only as employees but also as consumers (Fennis & Pruyn, 2007; MacInnis & Folkes, 2017; Stokburger-Sauer et al., 2012; Tuškej et al., 2013). Over the years, research inspired mostly by social identity reasoning has demonstrated that morality is particularly important for our assessment of other people, especially when these others somehow relate to the self (Abele et al., 2021; Goodwin et al., 2014; Leach et al., 2007; Wojciszke et al., 1998). Recent theory posits that both employees and customers tend to evaluate companies by interpersonal standards (Ashforth et al., 2020). This means that, since both employees and consumers tend to identify with companies, even in a business context the perceived morality of an organization should affect the evaluations of companies by both employees and customers. Moreover, perceptions of organizational morality have been found to be at least as important as perceptions of organizational competence in attracting and committing the support of relevant stakeholders (van Prooijen & Ellemers, 2015; van Prooijen et al., 2018). Thus, we propose that in business contexts as well, an increase in perceived organizational morality should lead to an increase in the desire to associate the self with the company, i.e. to increased intentions to buy the company’s products or to work for the company. Since we argue that CSR activities enhance the perceived morality of the company (Hypothesis 1), we also propose that the perceived morality of the company should mediate the relationship between learning that a company is engaged in CSR activities and stakeholders’ support for this company.

Hypothesis 2: We predict that informing participants about CSR activities of a company should increase stakeholders’ support for that company.

Hypothesis 3:   Perceived organizational morality is a mediator for the relationship between CSR activities and stakeholders’ support.

CSR perceptions in Russia

The examination of CSR in developing countries is an emerging field of study (Boubakri et al., 2021; Jamali & Mirshak, 2007; Khojastehpour & Jamali, 2021; Kolk & van Tulder, 2010). The economic and institutional differences between developing and developed countries raise questions about the applicability of some general CSR findings to emerging market contexts and make this a topic worthy of investigation (Jamali & Karam, 2018). For example, prior work demonstrates that differences in economic inequality can affect how people behave in business contexts (König et al., 2020). Research also shows that cultural traditions can affect stakeholders’ reactions to CSR (Wang et al., 2018). Similarly, differences in business practices related to different levels of perceived corruption between countries can result in different CSR approaches (Barkemeyer et al., 2018), which might mean that people have different views and perceptions of CSR in a country with a relatively high level of corruption (e.g. Russia) than in a country with a relatively low level of corruption (e.g. the UK).

Even within the limited research field focused on CSR in developing countries, some regions or countries have received more attention than others. While in recent years CSR researchers have examined the situation in China and Africa, even meriting review articles (Idemudia, 2011; Moon & Shen, 2010), CSR in the developing economies of Central and Eastern Europe, and Russia in particular, which experienced a radical redevelopment of economic and corporate governance systems (Aluchna et al., 2020; Tkachenko & Pervukhina, 2020), has attracted minimal research effort. So far, not surprisingly, there is some evidence that the forms of CSR visible in Central and Eastern Europe and in Russia are affected by the historical socialist or central-planning legacy (Fifka & Pobizhan, 2014; Koleva et al., 2010). For example, during Soviet times, Russian companies used to take care of their employees by providing kindergartens and health and recreation facilities, which was valuable to employees in the absence of a public social security system (Fifka & Pobizhan, 2014). Thus, in the past, Russian companies were strong in what can be considered CSR activities directed towards their employees. On the other hand, historically, Russian companies did not view customers or clients as important stakeholders to consider in their business decisions and CSR activities (Alon et al., 2010; Fifka & Pobizhan, 2014). While historical circumstances suggest that there might be differences in CSR approaches between the UK and Russia, the limited amount of available research does not reveal whether Russians perceive CSR differently from their UK-based counterparts. For example, one study looking at the attitudes of Russian managers towards CSR concluded that, in contrast to Western managers, Russian managers do not view CSR as a positive way to influence consumers’ perceptions of a company (Kuznetsov et al., 2009). On the other hand, a different line of research revealed that many Russian firms do provide some CSR information to external stakeholders (Preuss & Barkemeyer, 2011). This suggests that the managers of at least those companies think that providing such information might somehow be beneficial for their companies.

In sum, the limited amount of research about CSR in Russia does not tell us how Russians would perceive CSR activities. Thus, we turn to insights about basic social psychological mechanisms that are likely to play a role across different countries and contexts to inform our expectations about stakeholders’ perceptions of CSR activities in Russia.

We note that morality and competence are among the few social psychological concepts that have been tested in multiple countries. In fact, some of the first conclusions about morality and competence were drawn based on Polish samples (Wojciszke, 1994; Wojciszke et al., 1998). These two dimensions were later tested in the US context (Cuddy et al., 2007; Fiske et al., 2002), in a Dutch context (Leach et al., 2007), and in Polish and German settings (Abele & Wojciszke, 2007). An impressive cross-cultural collaboration showed the applicability of these two key dimensions across ten nations, including Spain, Germany, France, the UK, Japan, and South Korea (Cuddy et al., 2009).

While those dimensions have not yet been tested in Russia, we argue, based on the robust evidence for the cross-cultural relevance of these two dimensions of impression formation, that they should be equally applicable in the UK and Russian contexts. Thus, we propose that while there are multiple factors that could make the evaluation of CSR activities differ between the UK and Russia (Jamali & Karam, 2018; Jamali & Mirshak, 2007), the underlying psychological process would be the same in Russia as in the UK. Consequently, we argue that we will also find support for our theorizing in the Russian sample, providing further empirical support for Hypotheses 1, 2 and 3.

Current research

In two experimental studies, we assessed how CSR communications of a company affected perceived morality, perceived competence and stakeholders’ support for the company (as a customer or prospective employee). In both studies, we focused on evaluations of companies by the general public. Members of the general public are the key audience companies try to reach (e.g., as prospective clients, employees, or investors) when communicating about their CSR activities. Perceptions of the general public have been shown to be a good predictor of key positive outcomes for companies (e.g. an increase in shareholder value, Raithel & Schwaiger, 2015). In Study 1, we tested our hypotheses in the UK. In Study 2 (Russia), we replicated the results of Study 1. We cross-validated the robustness and generality of the predicted relations between CSR, perceived morality and stakeholders’ support by examining whether they would hold across these two very different business contexts.

This research was pre-approved by the University’s Ethics Committee.

Participants and design

All participants for Study 1 were based in the UK and were approached via Prolific. Of the 249 participants who completed the survey, we retained 203 (127 female; M age = 36, SD = 12; M work experience = 15 years, SD = 12) after excluding participants who failed an attention check (participants were asked to tick a certain number and to indicate whether they had read about Company A or Company X). Note that when we re-ran the analyses including all participants who completed the questionnaire, the main patterns remained the same.

Participants were randomly divided into two groups. Both groups received some neutral company information: “Company A is a mid-size IT advisory company based in the UK. It delivers websites, web-based IT systems, and computing as a service. It also provides information technology, research and consulting services.”

Thereafter, the control group proceeded directly to the dependent variables. The experimental condition group first read that the company was engaged in CSR activities (via a short press release about CSR activities). It was stated that Company A had issued a CSR report detailing the company’s progress on environmental, social and governance initiatives. No specific reason for engaging in CSR activities was stated. After receiving this information, the participants proceeded to the dependent variables. Finally, all participants were thanked, debriefed and compensated.

Dependent variables

We assessed morality and competence with the items developed by Leach et al. (2007). We asked participants the following question: “We would like to get an impression of how you view Company A. Please have a look at the list of various traits and rate to what extent you view Company A as…” Items comprising this scale were presented to participants in a randomized order. Factor analysis confirmed that these items indicate morality and competence as two distinct constructs, in line with Leach et al. (2007): morality (3 items: honest, trustworthy, sincere; α = 0.91) and competence (3 items: intelligent, competent, skillful; α = 0.86).
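
To illustrate how such scale reliabilities can be computed, the sketch below calculates Cronbach’s alpha for three correlated item ratings. The data, sample size and variable names are simulated for illustration only; this is not the authors’ analysis code.

```python
# Minimal sketch (illustration only): Cronbach's alpha for a three-item scale
# such as the morality items of Leach et al. (2007). Data are simulated.
import numpy as np
import pandas as pd

def cronbach_alpha(items: pd.DataFrame) -> float:
    """alpha = k/(k-1) * (1 - sum of item variances / variance of the summed score)."""
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1).sum()
    total_variance = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances / total_variance)

# Simulated 7-point ratings: three items driven by one shared latent impression
rng = np.random.default_rng(42)
latent = rng.normal(5, 1, size=200)
ratings = pd.DataFrame({
    item: np.clip(np.round(latent + rng.normal(0, 0.6, size=200)), 1, 7)
    for item in ["honest", "trustworthy", "sincere"]
})
print(f"Cronbach's alpha = {cronbach_alpha(ratings):.2f}")  # high, since items share one signal
```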

We evaluated the support of various stakeholders, such as clients and employees (i.e. stakeholders’ support for a company), using the following questions: ‘Please rate your intentions to buy products/services of Company A’ and ‘Please imagine you can apply for a job in Company A. Do you feel motivated to work for Company A?’ (α = 0.81). These two items evaluated the support of two key types of stakeholders: potential customers/clients and potential employees. These two types of stakeholders are often the focus of CSR research (e.g. Baskentli et al., 2019; Bauman & Skitka, 2012). We utilized a 7-point Likert-type scale ranging from 1 (strongly disagree) to 7 (strongly agree), asking participants to indicate how well each item reflected their own position. A 7-point Likert-type scale was used to measure participants’ reactions in all studies unless stated otherwise.

To guard against capitalization on chance, we conducted a MANOVA with communication about CSR activities of Company A (yes/no) as the between-subjects variable and morality, competence and stakeholders’ support as dependent variables, which revealed a significant multivariate effect, F(3,200) = 5.20, p = 0.002. We then examined the univariate effects on morality, competence and stakeholders’ support separately.
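
For readers who want to see the structure of this analysis, the sketch below reproduces the same logic (a one-way MANOVA across the three dependent variables, followed by univariate follow-ups) on simulated data. Group sizes, effect sizes and variable names are hypothetical; the reported analyses were not produced with this code.

```python
# Minimal sketch (simulated data, not the study data): one-way MANOVA across three DVs,
# followed by univariate ANOVAs, mirroring the structure of the reported analysis.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.multivariate.manova import MANOVA
from statsmodels.stats.anova import anova_lm

rng = np.random.default_rng(0)
n = 100  # hypothetical group size
df = pd.DataFrame({
    "condition":  ["csr"] * n + ["control"] * n,
    "morality":   np.r_[rng.normal(5.1, 1.0, n), rng.normal(4.7, 1.1, n)],
    "competence": np.r_[rng.normal(5.3, 0.9, n), rng.normal(5.3, 0.9, n)],
    "support":    np.r_[rng.normal(5.1, 1.0, n), rng.normal(4.8, 1.2, n)],
})

# Multivariate test across all three dependent variables at once
print(MANOVA.from_formula("morality + competence + support ~ condition", data=df).mv_test())

# Univariate follow-ups, one ANOVA per dependent variable
for dv in ["morality", "competence", "support"]:
    print(dv)
    print(anova_lm(smf.ols(f"{dv} ~ condition", data=df).fit()))
```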

Morality and competence

Consistent with Hypothesis 1, participants who read that Company A was engaged in CSR activities viewed Company A as more moral (morality M csr = 5.07, SD = 0.97) than participants who didn’t read anything about CSR activities of Company A (morality M no csr = 4.68, SD = 1.07), F(1, 202) = 7.70, p = 0.006. The effect of the experimental condition on competence was not significant, F(1,202) = 0.02, p = 0.89. These results show that the experimental manipulation improved the perceived morality of the company. The fact that we did not find an effect of our experimental manipulation on perceived competence shows that CSR information does not just improve the general impression people have of the company. If that were the case, we would have expected improved perceptions of both morality and competence. This is not what we observed. Instead, our manipulation only improved the perceived morality of the company.

Stakeholders’ support

The univariate effect on stakeholders’ support was significant, F(1,202) = 5.54, p = 0.02. Consistent with Hypothesis 2, participants who read that Company A was engaged in CSR activities expressed higher stakeholders’ support for Company A (M csr = 5.14, SD = 1.04) than participants who didn’t read about CSR activities of Company A (M no csr = 4.76, SD = 1.23).

We then assessed whether the effect of the experimental condition on stakeholders’ support for Company A was mediated by perceived morality. The temporal order in our experimental design allows us to infer mediation through morality (Shea & Hawn, 2019). A mediation model analysis was conducted using the PROCESS macro (Hayes, 2017) for SPSS, based on 10,000 bootstrap resamples.
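
The PROCESS macro tests the indirect effect with a percentile bootstrap: the a-path (condition to morality) multiplied by the b-path (morality to support, controlling for condition), with a confidence interval taken from the resampled distribution. The sketch below shows that logic on simulated data; it is an illustration of the procedure, not the authors’ SPSS code, and the variable names and effect sizes are hypothetical.

```python
# Minimal sketch (simulated data): percentile-bootstrap test of the indirect effect a*b,
# i.e. the logic behind a simple mediation model (comparable to PROCESS Model 4).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 200
condition = rng.integers(0, 2, n)                          # 0 = control, 1 = CSR communication
morality = 4.7 + 0.4 * condition + rng.normal(0, 1.0, n)   # mediator
support = 1.0 + 0.1 * condition + 0.7 * morality + rng.normal(0, 1.0, n)  # outcome

def indirect_effect(x, m, y):
    a = sm.OLS(m, sm.add_constant(x)).fit().params[1]                        # X -> M
    b = sm.OLS(y, sm.add_constant(np.column_stack([x, m]))).fit().params[2]  # M -> Y, controlling X
    return a * b

boot = []
indices = np.arange(n)
for _ in range(2000):  # the paper uses 10,000 resamples; fewer here to keep the example fast
    s = rng.choice(indices, size=n, replace=True)
    boot.append(indirect_effect(condition[s], morality[s], support[s]))

lower, upper = np.percentile(boot, [2.5, 97.5])
print(f"indirect effect = {indirect_effect(condition, morality, support):.3f}, "
      f"95% CI [{lower:.3f}, {upper:.3f}]")  # a CI excluding zero is consistent with mediation
```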

As depicted in Fig. 1, communications about CSR activities indirectly influenced stakeholders’ support through their effect on the perceived morality of the company. Participants who read about CSR activities perceived Company A to be more moral, and they also showed more support for the company. The confidence interval for the indirect effect was above 0. Thus, in line with predictions, the analysis supported our reasoning that morality (b = 0.286, SE = 0.108; CI = LL: 0.095, UL: 0.515; 10,000 bootstrap resamples) accounts for the relationship between CSR activities and stakeholders’ support. The results are thus consistent with Hypothesis 3, that morality mediates the relationship between CSR activities and stakeholders’ support.

Fig. 1 Mediation model, Study 1. Path c is the total effect; it shows that there is an effect of X on Y that may be mediated. Path c′ is the direct effect; the mediator M is sometimes called an intervening or process variable. Mediation is indicated because X no longer affects Y once M (perceived organizational morality) has been controlled, making path c′ statistically non-significant

All participants in Study 2 were based in Russia. One of the co-authors approached Psychology and Applied Psychology students from a university to participate in the research. One hundred and eighteen participants completed the quantitative part of the study, of whom twenty-two failed the attention check (which asked participants to tick a certain number and to indicate whether they had read about Company A or Company X). When we re-ran the analyses including all participants, the main patterns remained the same. The final sample used to analyze the quantitative data for this study consisted of 96 participants (80% female), M age = 21 (SD = 2.7), M work experience = 2 years (SD = 2.9).

Similar to Study 1, participants were randomly assigned to the control and experimental groups. Both groups received the same information as in Study 1; we only changed the description to specify that the company was a Russian company, to fit this specific context. Participants in the experimental group read a short text about CSR and information about Company A being active in CSR; as in Study 1, this was presented as a press release from Company A. Participants in both groups then completed the dependent variables. The participants received no monetary compensation.

Morality and competence

We assessed perceptions of organizational morality (α = 0.84) and competence (α = 0.76) with the items used in Study 1 (Leach et al., 2007).

We decided to expand on the two items we used in Study 1 by adding two supplementary questions. We evaluated stakeholders’ support for the company with the following items: ‘Please imagine that you are a client of Company A. How likely is it that you would purchase Company A’s products?’, ‘How likely is it that you would want to recommend Company A’s products?’, ‘Please imagine that you can apply for a job at Company A. Would you feel motivated to apply for a job at Company A?’, ‘Would you feel motivated to work for Company A?’ (ɑ = 0.86).

We conducted a MANOVA with communication about CSR activities of Company A (yes/no) as the between-subjects variable and morality, competence and stakeholders’ support as dependent variables. This revealed a significant multivariate effect of the experimental manipulation, F(3,93) = 2.73, p = 0.048. We then examined the univariate effects on morality, competence and stakeholders’ support separately.

Consistent with Hypothesis 1, participants who read that Company A was engaged in CSR activities viewed Company A as more moral (morality M csr = 4.51, SD = 0.95) than participants who didn’t read anything about CSR activities of Company A (morality M no csr = 4.00, SD = 1.17), F(1, 95) = 5.30, p = 0.024. As in Study 1, the effect of the experimental condition on competence was not significant, F(1,95) = 1.11, p = 0.30, countering the alternative explanation that information about CSR activities improves the overall impression of the company.

The univariate effect on stakeholders’ support was significant, F(1,95) = 5.30, p = 0.024. Consistent with Hypothesis 2, participants who had read that Company A was engaged in CSR activities expressed higher stakeholders’ support for Company A (M csr = 4.67, SD = 1.21) than participants who didn’t read about CSR activities of Company A (M no csr = 4.10, SD = 1.22).

A mediation model analysis was conducted using PROCESS macro (Hayes, 2017 ) for SPSS based on 10,000 bootstrap resamples.

The model shows that communications about CSR activities indirectly influenced stakeholders’ support through their effect on the perceived morality of the company. Participants who read about CSR activities perceived Company A to be more moral, and they also showed more support for the company. The confidence interval for the indirect effect was above 0. Thus, in line with predictions, the analysis supported our reasoning that morality (b = 0.28, SE = 0.13; CI = LL: 0.05, UL: 0.58; 10,000 bootstrap resamples) accounts for the relationship between CSR activities and stakeholders’ support. The results are thus consistent with Hypothesis 3, that morality mediates the relationship between CSR activities and stakeholders’ support.

Cross-country comparison: additional analysis comparing the results of Study 1 (the UK) and Study 2 (Russia)

To check whether the hypothesized effects are robust across both national contexts, we additionally compared the results of the two studies.

We conducted a 2 × 2 MANOVA with the CSR experimental condition (CSR communication vs. control) and country (the UK vs. Russia) as the between-subjects variables and perceived morality, competence and stakeholders’ support as dependent variables. This revealed significant multivariate main effects of country (F(3,296) = 9.01, p < 0.001) and of the CSR experimental condition (F(3,296) = 6.57, p < 0.001). There was no interaction effect (F(3,296) = 0.23, p = 0.88), indicating that the experimental manipulation had parallel effects in both countries; that is, the theorized processes worked similarly in the UK and Russia.
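
Structurally, this cross-country analysis is the same MANOVA as in each single study, with country added as a second factor and a condition-by-country interaction term. The sketch below shows that setup on simulated data; the country offsets, effect sizes and column names are invented for illustration and do not reproduce the study data.

```python
# Minimal sketch (simulated data): 2 x 2 factorial MANOVA with condition, country,
# and their interaction; a non-significant interaction term is consistent with
# parallel CSR effects in both countries.
import numpy as np
import pandas as pd
from statsmodels.multivariate.manova import MANOVA

rng = np.random.default_rng(2)
rows = []
for country, country_offset in [("UK", 0.0), ("Russia", -0.5)]:      # hypothetical offset
    for condition, csr_effect in [("control", 0.0), ("csr", 0.4)]:    # hypothetical CSR effect
        n = 50
        rows.append(pd.DataFrame({
            "country": country,
            "condition": condition,
            "morality":   rng.normal(4.7 + country_offset + csr_effect, 1.0, n),
            "competence": rng.normal(5.3 + country_offset, 0.9, n),
            "support":    rng.normal(4.8 + country_offset + csr_effect, 1.2, n),
        }))
both_samples = pd.concat(rows, ignore_index=True)

print(MANOVA.from_formula(
    "morality + competence + support ~ condition * country", data=both_samples
).mv_test())
```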

At the univariate level, the effect of country was significant for morality (F(1,298) = 23.23, p < 0.001), stakeholders’ support (F(1,298) = 13.35, p < 0.001), and competence (F(1,298) = 5.32, p = 0.02). The relevant means show that participants in the UK perceived the company as more moral (M UK = 4.87, SD = 1.04; M Russia = 4.23, SD = 1.10) and more competent (M UK = 5.30, SD = 0.96; M Russia = 5.01, SD = 0.98) than participants in Russia. UK participants also expressed more support for the company (M UK = 4.95, SD = 1.16; M Russia = 4.40, SD = 1.23) than Russian participants. This shows that there were differences in people’s perceptions between the two countries, with UK perceptions being overall more positive than those of the Russian participants.

At the univariate level, across the two national samples, the effect of the CSR experimental condition was significant for morality (F(1,298) = 12.32, p = 0.001) and for stakeholders’ support (F(1,298) = 9.60, p = 0.002). There was no significant univariate effect on competence (F(1,298) = 0.86, p = 0.34).

The relevant means show that in the experimental condition participants perceived the company as more moral (M csr = 4.90, SD = 0.99; M control = 4.45, SD = 1.15) than in the control condition. They also expressed more support for the company (M csr = 5.00, SD = 1.11, M control = 4.56, SD = 1.23) in the experimental condition compared to the control condition.

These results provide support for Hypotheses 1 and 2. We show that, regardless of the overall difference in evaluations between the countries, the manipulation had the same effect in both countries: there was an overall main effect of the manipulation and no interaction effect.

Mediation analysis

As a next step, we carried out a mediation analysis with the participants from both studies combined. The confidence interval for the indirect effect was above 0. Thus, in line with predictions, the analysis supported our reasoning that morality (b = 0.298, SE = 0.087; CI = LL: 0.1348, UL: 0.478; 10,000 bootstrap resamples) accounts for the relationship between CSR activities and stakeholders’ support. The results are thus consistent with Hypothesis 3, that morality mediates the relationship between CSR activities and stakeholders’ support.

Theoretical contributions

Several theoretical implications follow from our work. First, building on Social Identity Theory (Tajfel, 1974 ; Tajfel & Turner, 1979 , 1986 ) and theories on social evaluation of others (Abele & Wojciszke, 2007 ; Hack et al., 2013 ; Wojciszke, et al., 1998 ), we theorize and demonstrate in two experimental design studies that learning that a company is engaged in CSR activities leads to an increase in perceived morality of that company. The perceived organizational morality, in turn, increases stakeholders’ support. Thus, we also expand current understanding of the mechanisms which impact the relationship between CSR and stakeholders’ support (Aguinis & Glavas, 2012 ; Hillenbrand et al., 2013 ). By applying theories of social evaluation to people’s assessments of companies, we extend the emerging theory on how people develop impressions of non-human subjects (Ashforth et al., 2020 ; Epley et al., 2007 ; Gawronski et al., 2018 ; Mishina et al., 2012 ).

Second, our work extends current insights on strategic CSR and international management. We test our theorizing in two different countries: the UK and Russia. Most CSR work to date has been carried out in a single country context (Lim et al., 2018 ). As companies become more global, there is an increased demand for more cross-country CSR research (Scherer & Palazzo, 2011 ), which we address in the present research.

Furthermore, experiment-based CSR research is often dominated by WEIRD samples (e.g. De Vries et al., 2015; Ellemers et al., 2011; Chopova & Ellemers, 2023; see also Ellemers & Chopova, 2021). We, on the other hand, test our theorizing in two countries with different business practices, which can affect the development and perception of CSR. We find mean-level differences between the perceptions reported by participants in the two countries, showing that, overall, our study participants in Russia were more critical and less supportive of the company than participants in the UK. Responding to the call to devote more academic attention to CSR in developing countries (Jamali & Karam, 2018; Jamali & Mirshak, 2007), we were able to demonstrate that the impact of CSR on perceived organizational morality and stakeholders’ support remains the same across study samples obtained in the UK and Russia.

Furthermore, we address the identified need in social psychology to test support for general theory in both WEIRD and non-WEIRD countries, as most current research is carried out in WEIRD countries, while most of the world’s population lives in non-WEIRD countries (Henrich et al., 2010a). While it is encouraging that some recent work has aimed to address this issue (Pagliaro et al., 2021), such attempts remain rare. Thus, we extend current insights in social psychology on morality as a key dimension of social judgment by demonstrating that SIT (Tajfel, 1974; Tajfel & Turner, 1979, 1986) and theories on the social evaluation of others (Abele & Wojciszke, 2007; Hack et al., 2013; Wojciszke et al., 1998) are also applicable in a non-WEIRD country.

Practical implications

Our work also has clear practical implications. First, experimental research is key to understanding which practical interventions can alter stakeholders’ responses to a company. Thus, we provide strong evidence that communicating about CSR enhances perceived organizational morality and stakeholders’ support.

Second, there is some evidence in the literature that morality is not always seen by businesses as important for CSR communications (Norberg, 2018). Our research shows that managers should not shy away from explaining that companies engage in CSR for moral or ethical reasons. These observations are also supported by a different line of work, which showed that focusing solely on the business case was detrimental to managers’ inclinations to engage in CSR, as these managers experienced weaker moral emotions when confronted with ethical problems (Hafenbradl & Waeger, 2017). Our recommendations are also in line with the reported evolution of the CSR concept in the literature and the view that business interests can go together with sustainability efforts (Porter & Kramer, 2011; Porter & Kramer, 2018; Latapí Agudelo et al., 2019; Matten & Moon, 2020).

Finally, there seems to be a notion among some practitioners that CSR might be less important in emerging economies. For example, in 2016, the Netherlands Enterprise Agency, commissioned by the Ministry of Foreign Affairs of the Netherlands, published a fact sheet about Corporate Social Responsibility (CSR) in Russia for companies wishing to work in the Russian Federation. It stated that “there is still limited support for CSR in [Russian] society”. This sweeping statement does not specify what is meant by “society”, or how this conclusion was reached. We hope that our work can inspire practitioners working in developing countries, and in Russia in particular, to take note that while there can be differences in perceptions of CSR between countries, CSR activities and the perceived moral image of a company are important for stakeholders’ support.

Limitations

In this research, we see that Russian participants, in general, evaluated the company more negatively than UK-based participants. We have not addressed why this could be the case, which can be seen as a limitation; however, this was not the focus of our research. Nevertheless, we demonstrated that shifts in perceived morality are possible through specific communications, regardless of higher vs. lower levels of overall perceived morality. In fact, we propose that the fact that this causal relationship could be demonstrated in both countries, despite the significant differences in evaluations between them, speaks to the strength of the mechanisms we examine in our research.

Furthermore, we used an “unknown” mid-size IT consultancy company as the basis for the experimental studies. It can be argued that people generally are less likely to have strong views about IT consultancy companies, which can perhaps be seen as a limitation, as people usually do have views of, and associations with, certain industries or products (e.g. banking, tobacco, Coca-Cola). In response, we would highlight that our aim was to show how the processes work in general. Thus, we explicitly chose a company that people are less likely to have preconceived views about.

Future directions

In this research, we specifically focused on a company with a relatively neutral image with respect to CSR. It is known that some industries, such as the financial sector or tobacco, are negatively evaluated by the general public, in the moral domain in particular. We know that a negative moral image is more difficult to repair, and it is particularly problematic for people working in those types of industries (Ashforth & Kreiner, 2014; Chopova & Ellemers, 2023). Moral disengagement (Bandura, 1999) can be a potential response of current investors and employees to the experience of social identity threat when the moral standing of their organization or their professional group is called into question. Future research might want to study how CSR communications affect perceived morality and stakeholders’ support in industries with an a priori negative moral image (Hadani, 2023).

We apply prior social psychological findings to non-human targets, building on the fact that humans can anthropomorphize non-human targets (Ashforth et al., 2020; Epley et al., 2007). In our work, we used a broad definition of CSR, including both human-focused (e.g. employee-focused) and non-human-focused (e.g. environmental protection) activities, which, we hope, improves the generalizability of our findings. We showed that this broad CSR definition leads to an increase in perceived organizational morality. Future research might want to study the extent to which the type of CSR activity affects perceptions of organizational morality. Historically, Western religious and ethical thinking was mainly human-centric, and human actions affecting non-humans were not perceived as morally relevant (Pandey et al., 2013). Hence, it is possible that people tend to see human-focused CSR activities as more moral than environmentally focused activities. Additionally, prior work has shown that people differ in their personal tendency to anthropomorphize non-human targets (Waytz et al., 2010). Further research might want to examine to what extent this variable moderates the relationship between learning that a company is engaged in CSR activities, perceived organizational morality, and stakeholders’ support.

Our paper has multiple implications for the CSR and social psychological literatures. Namely, we demonstrate in two experimental design studies that corporate CSR communications lead to an increase in perceived organizational morality, which in turn leads to an increase in stakeholders’ support. Building on the social psychological literature, we explain the processes underlying this relationship. We show that morality is a relevant dimension for the evaluation of companies by stakeholders, thus extending prior findings about the importance of morality for evaluations of human targets to non-human targets. We empirically test our theory in both a WEIRD (the UK) and a non-WEIRD (Russia) country. We believe that our findings are particularly relevant in the current context, where various politicians and media suggest that psychological differences are too large to allow comparing people from a country such as the UK with people from Russia. While we only focus on CSR perceptions and subsequent stakeholders’ support, our work suggests that in this area the underlying psychological mechanisms work in a similar fashion in both countries.

Availability of data and materials

The data are available at the university repository (public access can be requested).

Abbreviations

SD: Standard deviation

CI: Confidence interval

LL: Lower limit

UL: Upper limit

Abele, A. E., Ellemers, N., Fiske, S. T., Koch, A., & Yzerbyt, V. (2021). Navigating the social world: Toward an integrated framework for evaluating self, individuals, and groups. Psychological Review, 128 (2), 290.


Abele, A. E., & Wojciszke, B. (2007). Agency and Communion From the Perspective of Self Versus Others. Journal of Personality & Social Psychology, 93 (5), 751–763. https://doi.org/10.1037/0022-3514.93.5.751

Aguinis, H., & Glavas, A. (2012). What We Know and Don’t Know About Corporate Social Responsibility: A Review and Research Agenda. Journal of Management, 38 (4), 932–968. https://doi.org/10.1177/0149206311436079

Aguinis, H., & Glavas, A. (2019). On Corporate Social Responsibility, Sensemaking, and the Search for Meaningfulness Through Work. Journal of Management, 45 (3), 1057–1086. https://doi.org/10.1177/0149206317691575

Alon, I., Lattemann, C., Fetscherin, M., Li, S., & Schneider, A. M. (2010). Usage of public corporate communications of social responsibility in Brazil, Russia, India and China (BRIC). International Journal of Emerging Markets, 5(1), 6–22. https://doi.org/10.1108/17468801011018248

Aluchna, M., Idowu, S.O., Tkachenko, I. (2020). Exploring the Issue of Corporate Governance in Central Europe and Russia: An Introduction. In: Aluchna, M., Idowu, S.O., Tkachenko, I. (eds) Corporate Governance in Central Europe and Russia. CSR, Sustainability, Ethics & Governance. Springer, Cham. https://doi.org/10.1007/978-3-030-39504-9_1

Ashforth, B. E., & Mael, F. (1989). Social Identity Theory and the Organization. Academy of Management Review, 14(1), 20–39.

Ashforth, B. E., Schinoff, B. S., & Brickson, S. L. (2020). “My company is friendly”, “mine’s a rebel”: Anthropomorphism and shifting organizational identity from “what” to “who”. Academy of Management Review, 45(1), 29–57. https://doi.org/10.5465/amr.2016.0496

Ashforth, B. E., & Kreiner, G. E. (2014). Dirty work and dirtier work: Differences in countering physical, social, and moral stigma. Management and Organization Review, 10(1), 81–108.

Awuah, L. S., Amoako, K. O., Yeboah, S., Marfo, E. O., & Ansu-Mensah, P. (2021). Corporate Social Responsibility (CSR): Motivations and challenges of a Multinational Enterprise (MNE) subsidiary’s engagement with host communities in Ghana. International Journal of Corporate Social Responsibility, 6 (1), 1–13.

Bandura, A. (1999). Moral disengagement in the perpetration of inhumanities. Personality and Social Psychology Review, 3 (3), 193–209.

Barkemeyer, R., Preuss, L., & Ohana, M. (2018). Developing country firms and the challenge of corruption: Do company commitments mirror the quality of national-level institutions? Journal of Business Research, 90 (April), 26–39. https://doi.org/10.1016/j.jbusres.2018.04.025

Baskentli, S., Sen, S., Du, S., & Bhattacharya, C. B. (2019). Consumer reactions to corporate social responsibility: The role of CSR domains. Journal of Business Research, 95, 502–513. https://doi.org/10.1016/j.jbusres.2018.07.046

Bauman, C. W., & Skitka, L. J. (2012). Corporate social responsibility as a source of employee satisfaction. Research in Organizational Behavior, 32 , 63–86. https://doi.org/10.1016/j.riob.2012.11.002

Boubakri, N., El Ghoul, S., Guedhami, O., & Wang, H. H. (2021). Corporate social responsibility in emerging market economies: Determinants, consequences, and future research directions. Emerging Markets Review, 46 , 100758.

Carroll, A. B. (2016). Carroll’s pyramid of CSR: Taking another look. International Journal of Corporate Social Responsibility, 1 (1), 3. https://doi.org/10.1186/s40991-016-0004-6

Cheon, B. K., Melani, I., & Hong, Y. Y. (2020). How USA-Centric Is Psychology? An Archival Study of Implicit Assumptions of Generalizability of Findings to Human Nature Based on Origins of Study Samples. Social Psychological and Personality Science . https://doi.org/10.1177/1948550620927269

Chopova, T., & Ellemers, N. (2023). The importance of morality for collective self-esteem and motivation to engage in socially responsible behavior at work among professionals in the finance industry. Business Ethics, the Environment & Responsibility., 32 (1), 401–414.

Cuddy, A. J. C., Fiske, S. T., Kwan, V. S. Y., Glick, P., Demoulin, S., Leyens, J.-P., & Ziegler, R. (2009). Stereotype content model across cultures: Towards universal similarities and some differences. The British Journal of Social Psychology, 48(Pt 1), 1–33. https://doi.org/10.1348/014466608X314935

Cuddy, A. J. C., Fiske, S. T., & Glick, P. (2007). The BIAS map: Behaviors from intergroup affect and stereotypes. Journal of Personality and Social Psychology, 92 (4), 631–648. https://doi.org/10.1037/0022-3514.92.4.631

De Vries, G., Terwel, B. W., Ellemers, N., & Daamen, D. D. L. (2015). Sustainability or profitability? How communicated motives for environmental policy affect public perceptions of corporate greenwashing. Corporate Social Responsibility and Environmental Management, 22 (3), 142–154. https://doi.org/10.1002/csr.1327

Ellemers, N., Kingma, L., van de Burgt, J., & Barreto, M. (2011). Corporate Social Responsibility as a Source of Organizational Morality, Employee Commitment and Satisfaction. Journal of Organisational Moral Psychology, 1(2), 97–124.


Epley, N., Waytz, A., & Cacioppo, J. T. (2007). On seeing human: A three-factor theory of anthropomorphism. Psychological Review, 114(4), 864–886. https://doi.org/10.1037/0033-295X.114.4.864

Ellemers, N., & Chopova, T. (2021). The social responsibility of organizations: Perceptions of organizational morality as a key mechanism explaining the relation between CSR activities and stakeholder support. Research in Organizational Behavior, 41 , 100156.

Ellemers, N., De Gilder, D., & Haslam, S. A. (2004). Motivating individuals and groups at work: A social identity perspective on leadership and group performance. Academy of Management Review, 29 (3), 459–478. https://doi.org/10.5465/AMR.2004.13670967

Ervits, I. (2021). CSR reporting by Chinese and Western MNEs: Patterns combining formal homogenization and substantive differences. International Journal of Corporate Social Responsibility, 6 (1), 1–24.

Farooq, O., Payaud, M., Merunka, D., & Valette-Florence, P. (2014). The Impact of Corporate Social Responsibility on Organizational Commitment: Exploring Multiple Mediation Mechanisms. Journal of Business Ethics, 125 (4), 563–580. https://doi.org/10.1007/s10551-013-1928-3

Fennis, B. M., & Pruyn, A. T. H. (2007). You are what you wear: Brand personality influences on consumer impression formation. Journal of Business Research, 60 (6), 634–639. https://doi.org/10.1016/j.jbusres.2006.06.013

Fifka, M. S., & Pobizhan, M. (2014). An institutional approach to corporate social responsibility in Russia. Journal of Cleaner Production, 82 , 192–201. https://doi.org/10.1016/j.jclepro.2014.06.091

Fiske, S. T., Cuddy, A. J. C., & Glick, P. (2007). Universal dimensions of social cognition: Warmth and competence. Trends in Cognitive Sciences, 11 (2), 77–83. https://doi.org/10.1016/j.tics.2006.11.005

Fiske, S. T., Cuddy, A. J. C., Glick, P., & Xu, J. (2002). A model of (often mixed) stereotype content: Competence and warmth respectively follow from perceived status and competition. Journal of Personality and Social Psychology, 82 (6), 878–902. https://doi.org/10.1037/0022-3514.82.6.878

Gawronski, B., Rydell, R. J., De Houwer, J., Brannon, S. M., Ye, Y., Vervliet, B., & Hu, X. (2018). Contextualized attitude change. Advances in Experimental Social Psychology (1st ed., Vol. 57). Elsevier. https://doi.org/10.1016/bs.aesp.2017.06.001

Goodwin, G. P., Piazza, J., & Rozin, P. (2014). Moral character predominates in person perception and evaluation. Journal of Personality and Social Psychology, 106 (1), 148–168. https://doi.org/10.1037/a0034726

Grabner-Kräuter, S., Breitenecker, R. J., & Tafolli, F. (2020). Exploring the relationship between employees’ CSR perceptions and intention to emigrate: Evidence from a developing country. Business Ethics, the Environment & Responsibility, 30 , 87–102.

Hack, T., Goodwin, S. A., & Fiske, S. T. (2013). Warmth Trumps Competence in Evaluations of Both Ingroup and Outgroup. International Journal of Science, Commerce and Humanities, 1(6), 99–105.

Hadani, M. (2023). The impact of trustworthiness on the association of corporate social responsibility and irresponsibility on legitimacy. Journal of Management Studies, 61 (4), 1266-1294.

Hafenbradl, S., & Waeger, D. (2017). Ideology and the micro-foundations of CSR: Why executives believe in the business case for CSR and how this affects their CSR engagements. Academy of Management Journal, 60(4), 1582–1606. https://doi.org/10.5465/amj.2014.0691

Haslam, S. A., Jetten, J., Postmes, T., & Haslam, C. (2009). Social identity, health and well-being: An emerging agenda for applied psychology. Applied Psychology, 58 (1), 1–23. https://doi.org/10.1111/j.1464-0597.2008.00379.x

Haslam, S. A., Powell, C., & Turner, J. C. (2000). Social identity, self-categorization, and work motivation: Rethinking the contribution of the group to positive and sustainable organisational outcomes. Applied Psychology, 49 (3), 319–339. https://doi.org/10.1111/1464-0597.00018

Hayes, A. F. (2017). Introduction to mediation, moderation, and conditional process analysis: A regression-based approach. Guilford Publications.

Henrich, J., Heine, S. J., & Norenzayan, A. (2010a). Most people are not WEIRD. Nature, 466(7302), 29. https://doi.org/10.1038/466029a

Henrich, J., Heine, S. J., & Norenzayan, A. (2010b). The weirdest people in the world? Behavioral and Brain Sciences, 33 (2–3), 61–83. https://doi.org/10.1017/S0140525X0999152X

Hillenbrand, C., Money, K., & Ghobadian, A. (2013). Unpacking the Mechanism by which Corporate Responsibility Impacts Stakeholder Relationships. British Journal of Management, 24 (1), 127–146. https://doi.org/10.1111/j.1467-8551.2011.00794.x

Holland, P. W. (1986). Statistics and causal inference. Journal of the American Statistical Association, 81 (396), 945–960. https://doi.org/10.1080/01621459.1986.10478354

Idemudia, U. (2011). Corporate social responsibility and developing countries: Moving the critical CSR research agenda in Africa forward. Progress in Development Studies, 11 (1), 1–18. https://doi.org/10.1177/146499341001100101

Jamali, D., & Karam, C. (2018). Corporate Social Responsibility in Developing Countries as an Emerging Field of Study. International Journal of Management Reviews, 20 (1), 32–61. https://doi.org/10.1111/ijmr.12112

Jamali, D., & Mirshak, R. (2007). Corporate Social Responsibility (CSR): Theory and practice in a developing country context. Journal of Business Ethics, 72 (3), 243–262. https://doi.org/10.1007/s10551-006-9168-4

Kervyn, N., Fiske, S. T., & Malone, C. (2012). Brands as intentional agents framework: How perceived intentions and ability can map brand perception. Journal of Consumer Psychology, 22(2). https://doi.org/10.1016/j.jcps.2011.09.006

Khojastehpour, M., & Jamali, D. (2021). Institutional complexity of host country and corporate social responsibility: Developing vs developed countries. Social Responsibility Journal, 17 (5), 593–612.

Koleva, P., Rodet-Kroichvili, N., David, P., & Marasova, J. (2010). Is corporate social responsibility the privilege of developed market economies? Some evidence from Central and Eastern Europe. International Journal of Human Resource Management, 21(2), 274–293. https://doi.org/10.1080/09585190903509597

Kolk, A., & van Tulder, R. (2010). International business, corporate social responsibility and sustainable development. International Business Review, 19 (2), 119–125. https://doi.org/10.1016/j.ibusrev.2009.12.003

König, C. J., Langer, M., Fell, C. B., Pathak, R. D., Bajwa, N. ul H., Derous, E., & Ziem, M. (2020). Economic Predictors of Differences in Interview Faking Between Countries: Economic Inequality Matters, Not the State of Economy. Applied Psychology, 70(3), 1360–1379. https://doi.org/10.1111/apps.12278

Kuznetsov, A., Kuznetsova, O., & Warren, R. (2009). CSR and the legitimacy of business in transition economies: The case of Russia. Scandinavian Journal of Management, 25(1), 37–45. https://doi.org/10.1016/j.scaman.2008.11.008

Latapí Agudelo, M. A., Jóhannsdóttir, L., & Davídsdóttir, B. (2019). A literature review of the history and evolution of corporate social responsibility. International Journal of Corporate Social Responsibility, 4 (1), 1–23.

Leach, C. W., Ellemers, N., & Barreto, M. (2007). Group virtue: The importance of morality (vs. competence and sociability) in the positive evaluation of in-groups. Journal of Personality and Social Psychology, 93(2), 234–249. https://doi.org/10.1037/0022-3514.93.2.234

Leach, C. W., Bilali, R., & Pagliaro, S. (2015). Groups and Morality. APA Handbook of Personality and Social Psychology, 2 , 123–149.

Lim, R. E., Sung, Y. H., & Lee, W. N. (2018). Connecting with global consumers through corporate social responsibility initiatives: A cross-cultural investigation of congruence effects of attribution and communication styles. Journal of Business Research, 88 (February), 11–19. https://doi.org/10.1016/j.jbusres.2018.03.002

MacInnis, D. J., & Folkes, V. S. (2017). Humanizing brands: When brands seem to be like me, part of me, and in a relationship with me. Journal of Consumer Psychology, 27 (3), 355–374. https://doi.org/10.1016/j.jcps.2016.12.003

Matten, D., & Moon, J. (2020). Reflections on the 2018 decade award: The meaning and dynamics of corporate social responsibility. Academy of Management Review, 45 (1), 7–28.

McWilliams, A., & Siegel, D. (2001). Profit maximizing corporate social responsibility. Academy of Management Review, 26(4), 504-505.

Mishina, Y., Block, E. S., & Mannor, M. (2012). The path dependence of organizational reputation: How social judgment influences assessments of capability and character. Strategic Management Journal, 33(5), 459–477.

Mitnick, B. M., Windsor, D., & Wood, D. J. (2023). Moral CSR. Business & Society, 62 (1), 192–220.

Moon, J., & Shen, X. (2010). CSR in China research: Salience, focus and nature. Journal of Business Ethics, 94 (4), 613–629. https://doi.org/10.1007/s10551-009-0341-4

Norberg, P. (2018). Bankers Bashing Back: Amoral CSR Justifications. Journal of Business Ethics, 147, 401–418. https://doi.org/10.1007/s10551-015-2965-x

Pagliaro, S., Sacchi, S., Pacilli, M. G., Brambilla, M., Lionetti, F., Bettache, K., ... & Zubieta, E. (2021). Trust predicts COVID-19 prescribed and discretionary behavioral intentions in 23 countries. PloS one, 16(3), e0248334.

Pandey, N., Rupp, D. E., & Thornton, M. (2013). The morality of corporate environmental sustainability: A psychological and philosophical perspective. In A. H. Huffman & S. R. Klein (Eds.), Green organizations: Driving change with I-O psychology. New York, NY: Routledge.

Porter, M. E., & Kramer, M. R. (2011). Creating shared value. Harvard Business Review, January–February.

Porter, M. E., & Kramer, M. R. (2018). Creating shared value: How to reinvent capitalism—And unleash a wave of innovation and growth. Managing sustainable business: An executive education case and textbook (pp. 323–346). Springer, Netherlands.

Preuss, L., & Barkemeyer, R. (2011). CSR priorities of emerging economy firms: Is Russia a different shape of BRIC? Corporate Governance, 11 (4), 371–385. https://doi.org/10.1108/14720701111159226

Raithel, S., & Schwaiger, M. (2015). The effects of corporate reputation perceptions of the general public on shareholder value. Strategic Management Journal, 36 (6), 945–956.

Sabherwal, A., Ballew, M. T., van Der Linden, S., Gustafson, A., Goldberg, M. H., Maibach, E. W., ... & Leiserowitz, A. (2021). The Greta Thunberg Effect: Familiarity with Greta Thunberg predicts intentions to engage in climate activism in the United States. Journal of Applied Social Psychology.

Scherer, A. G., & Palazzo, G. (2011). The New Political Role of Business in a Globalized World: A Review of a New Perspective on CSR and its Implications for the Firm, Governance, and Democracy. Journal of Management Studies, 48(4). https://doi.org/10.1111/j.1467-6486.2010.00950.x

Shadish, W. R., Cook, T. D., & Campbell, D. T. (2002). Experimental and quasi-experimental designs for generalized causal inference. Boston: Houghton Mifflin.

Shea, C. T., & Hawn, O. (2019). Microfoundations of Corporate Social Responsibility and Irresponsibility. Academy of Management Journal, 62 (5), 1609–1642.

Simpson, S. N. Y., & Aprim, E. K. (2018). Do corporate social responsibility practices of firms attract prospective employees? Perception of university students from a developing country. International Journal of Corporate Social Responsibility, 3 (1), 1–11.

Stokburger-Sauer, N., Ratneshwar, S., & Sen, S. (2012). Drivers of consumer-brand identification. International Journal of Research in Marketing, 29 (4), 406–418. https://doi.org/10.1016/j.ijresmar.2012.06.001

Tajfel, H., & Turner, J. C. (1979). An integrative theory of intergroup conflict. In W. G. Austin & S. Worchel (Eds.), The social psychology of intergroup relations (pp. 33–47). Monterey, CA: Brooks/Cole.

Tajfel, H. (1974). Social Identity and Intergroup Behaviour. Information (international Social Science Council), 13 (2), 65–93.

Tajfel, H., & Turner, J. C. (1986). The Social Identity Theory of Intergroup Behavior. In S. Worchel & W. G. Austin (Eds.), Psychology of Intergroup Relation (pp. 7–24). Hall Publishers.

Tkachenko, I., & Pervukhina, I. (2020). Stakeholder value assessment: Attaining company-stakeholder relationship synergy. Corporate Governance in Central Europe and Russia: Framework, Dynamics, and Case Studies from Practice , 89–105.

Tuškej, U., Golob, U., & Podnar, K. (2013). The role of consumer-brand identification in building brand relationships. Journal of Business Research, 66 (1), 53–59. https://doi.org/10.1016/j.jbusres.2011.07.022

van Prooijen, A.-M., & Ellemers, N. (2015). Does it pay to be moral? How indicators of morality and competence enhance organizational and work team attractiveness. British Journal of Management, 26 (2), 225–236. https://doi.org/10.1111/1467-8551.12055

van Prooijen, A.-M., Ellemers, N., Van der Lee, R., & Scheepers, D. (2018). What seems attractive may not always work well: Evaluative and cardiovascular responses to morality and competence levels in decision-making teams. Group Processes and Intergroup Relations. https://doi.org/10.1177/1368430216653814

Wang, X., Li, F., & Sun, Q. (2018). Confucian ethics, moral foundations, and shareholder value perspectives: An exploratory study. Business Ethics, 27 (3), 260–271. https://doi.org/10.1111/beer.12186

Waytz, A., Cacioppo, J., & Epley, N. (2010). Who sees human? The stability and importance of individual differences in anthropomorphism. Perspectives on Psychological Science, 5 (3), 219–232.

Wojciszke, B. (1994). Multiple meanings of behavior: Construing actions in terms of competence or morality. Journal of Personality and Social Psychology, 67(2), 222–232.

Wojciszke, B., Bazinska, R., & Jaworski, M. (1998). On the Dominance of Moral Categories in Impression Formation. Personality and Social Psychology Bulletin . https://doi.org/10.1177/01461672982412001


Acknowledgements

NWO Spinoza grant to Naomi Ellemers.

Author information

Authors and affiliations.

University of Utrecht, Utrecht, Netherlands

Tatiana Chopova & Naomi Ellemers

St. Petersburg State Transport University, Saint Petersburg, Russia

Elena Sinelnikova


Contributions

The authors contributed equally to the paper. Dr Chopova is the corresponding author.

Corresponding author

Correspondence to Tatiana Chopova .

Ethics declarations

Ethics approval and consent to participate.

The research has been approved by the Ethics Committee of the University of Utrecht. The research complies with institutional standards on research involving Human Participants and Informed Consent.

Competing interests

We have no conflicts of interest to disclose.

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Company A is engaged in Corporate Social Responsibility (CSR) activities. Its CSR activities are focused on the role the company plays in the community where it operates, on the company’s impact on the environment and on creating a diverse workforce. Please see below the extract from the latest press release about Company A’s Corporate Social Responsibility activities (Fig. 2).

Fig. 2 CSR text in the UK study. We used similar text in the Russian study (in Russian)

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .


About this article

Cite this article.

Chopova, T., Ellemers, N. & Sinelnikova, E. Morality matters: social psychological perspectives on how and why CSR activities and communications affect stakeholders’ support - experimental design evidence for the mediating role of perceived organizational morality comparing WEIRD (UK) and non-WEIRD (Russia) country. Int J Corporate Soc Responsibility 9 , 10 (2024). https://doi.org/10.1186/s40991-024-00088-w


Received : 13 March 2023

Accepted : 16 January 2024

Published : 04 June 2024

DOI : https://doi.org/10.1186/s40991-024-00088-w


Keywords

  • Impression formation
  • Social evaluation
  • WEIRD and non-WEIRD countries
  • Stakeholders
  • Corporate communications

experimental design activities

Experimental analysis of an innovative geo-exchange system installed on the island of Ischia, in southern Italy

  • Massarotti, N.

The need to use efficient and eco-friendly air conditioning systems is a fundamental requirement for today's society. In this context, low and medium enthalpy geothermal energy plays a crucial role. The research and development activities carried out in the present work have made it possible to successfully design and analyze an innovative technology capable of supplying thermal energy to environments without pumping fluid from the subsoil. The proposed system is based on the use of a downhole heat exchanger (DHE). The authors have developed an innovative system for the sustainable use of low and medium enthalpy geothermal energy, installed on the island of Ischia, near Napoli, in southern Italy. The proposed system is based on the use of an ad hoc designed DHE, capable of optimizing the heat transfer with the subsoil, without the need to withdraw fluid from the aquifer. A numerical finite element model was developed to study the interaction between the DHE, the well and the surrounding aquifer. The experimental set-up consists of the heat exchanger and an above ground system, necessary to test the efficiency of the exchanger. The DHE is inserted inside a geothermal well made with a steel casing, equipped with a filtering section in correspondence of the DHE, in order to increase the heat transfer performance due to increased convection with the surrounding aquifer. The experimental data show that the DHE allows to exchange more than 40 kW with the ground, obtaining overall heat transfer coefficient values larger than 450 W/m 2 K.



Title: Unpacking Approaches to Learning and Teaching Machine Learning in K-12 Education: Transparency, Ethics, and Design Activities

Abstract: In this conceptual paper, we review existing literature on artificial intelligence/machine learning (AI/ML) education to identify three approaches to how learning and teaching ML could be conceptualized. One of them, a data-driven approach, emphasizes providing young people with opportunities to create data sets, train, and test models. A second approach, learning algorithm-driven, prioritizes learning about how the learning algorithms or engines behind ML models work. In addition, we identify efforts within a third approach that integrates the previous two. In our review, we focus on how the approaches: (1) glassbox and blackbox different aspects of ML, (2) build on learner interests and provide opportunities for designing applications, and (3) integrate ethics and justice. In the discussion, we address the challenges and opportunities of current approaches and suggest future directions for the design of learning activities.
Subjects: Computers and Society (cs.CY)
ACM classes: K.3.0
Cite as: arXiv:2406.03480 [cs.CY]
Journal reference: The 19th WiPSCE Conference on Primary and Secondary Computing Education Research 2024
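As a concrete illustration of the data-driven approach described in the abstract above, in which learners create a data set, train a model, and test it, here is a minimal Python sketch. It assumes scikit-learn and uses an invented toy data set; it is not code from the paper.

    # Illustrative "data-driven" ML activity: build a tiny labelled data set,
    # train a model on part of it, and test it on held-out examples.
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.metrics import accuracy_score

    # Hypothetical data a class might collect:
    # [hours of daylight, waterings per week] -> plant grew (1) or not (0)
    X = [[2, 1], [3, 2], [8, 3], [9, 2], [1, 0], [7, 4], [4, 1], [10, 3]]
    y = [0, 0, 1, 1, 0, 1, 0, 1]

    # Hold out a quarter of the examples so the model is judged on data it has not seen
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

    model = DecisionTreeClassifier().fit(X_train, y_train)
    print("Test accuracy:", accuracy_score(y_test, model.predict(X_test)))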


IMAGES

  1. Scientific Method And Experimental Design Worksheet

  2. 42 fundamentals of experimental design worksheet

  3. Experimental Design Worksheet

  4. Experimental Design Activity Bundle by Morpho Science

  5. 15 Experimental Design Examples (2024)

  6. Experimental Design Worksheet Scientific Method

VIDEO

  1. Singapore Design Week 2017: A March of Design

  2. AIGADEC AshleyTaylorDoriGriffin Expanse 1

  3. Design of Experiments (DOE) Tutorial for Beginners

  4. Needs of Experimental Design

  5. Promotion and Design Activities of Forest Products- Bayındır Fatih İlkokulu, İzmir- Sema Yağcıoğlu

  6. What is Experimental design and it's basic principles Explain in hindi

COMMENTS

  1. Guide to Experimental Design

    Table of contents. Step 1: Define your variables. Step 2: Write your hypothesis. Step 3: Design your experimental treatments. Step 4: Assign your subjects to treatment groups. Step 5: Measure your dependent variable. Other interesting articles. Frequently asked questions about experiments.

  2. Experimental Design Steps & Activities

    Place 5 plants in each tray; label one set "sunlight" and one set "shade". Position the sunlight tray by a south-facing window and the shade tray in a dark closet. Water both trays with 50 mL of water every 2 days. After 3 weeks, remove the plants and measure their heights in cm. (A minimal analysis sketch for this design appears after this list.)

  3. 19+ Experimental Design Examples (Methods + Types)

    1) True Experimental Design. In the world of experiments, the True Experimental Design is like the superstar quarterback everyone talks about. Born out of the early 20th-century work of statisticians like Ronald A. Fisher, this design is all about control, precision, and reliability.

  4. Experimental Design

    Experimental Design. Experimental design is a process of planning and conducting scientific experiments to investigate a hypothesis or research question. It involves carefully designing an experiment that can test the hypothesis, and controlling for other variables that may influence the results. Experimental design typically includes ...

  5. A Quick Guide to Experimental Design

    A good experimental design requires a strong understanding of the system you are studying. There are five key steps in designing an experiment: Consider your variables and how they are related. Write a specific, testable hypothesis. Design experimental treatments to manipulate your independent variable.

  6. Experimental Research Designs: Types, Examples & Advantages

    There are 3 types of experimental research designs: pre-experimental, true experimental, and quasi-experimental. The assignment of the control group in quasi-experimental research is non-random, unlike in a true experimental design, where it is random. ...

  7. Experimental Design: Types, Examples & Methods

    Three types of experimental designs are commonly used: 1. Independent Measures. Independent measures design, also known as between-groups, is an experimental design where different participants are used in each condition of the independent variable. This means that each condition of the experiment includes a different group of participants.

  8. Experimental Design: Definition and Types

    An experimental design is a detailed plan for collecting and using data to identify causal relationships. Through careful planning, the design of experiments allows your data collection efforts to have a reasonable chance of detecting effects and testing hypotheses that answer your research questions. An experiment is a data collection ...

  9. Experimental Design for Advanced Science Projects

    Terik Daly, an accomplished experimenter and a Science Buddies volunteer, summarized the importance of experimental design and data analysis by stating: "Data analysis for an advanced science project involves more than bar graphs and scatter plots, it should involve statistically minded exploratory data analysis and inference.

  10. Experimental Design Step by Step: A Practical Guide for Beginners

    The aim of the present tutorial is to introduce the experimental design to beginners, by providing the theoretical basic principles as well as a practical guide to use the most common designs, from the experimental plan to the final optimization. Response surface methodology will be discussed, and the main terms related to model computation and ...

  11. 15 Experimental Design Examples (2024)

    15 Experimental Design Examples. By Chris Drew (PhD) | October 9, 2023. Experimental design involves testing an independent variable against a dependent variable. It is a central feature of the scientific method. A simple example of an experimental design is a clinical trial, where research participants are placed into control and treatment ...

  12. Experimental Design Activities

    Experimental Design Activities. This series of activities is designed for students to collaborate and come up with something unique and interesting. For many of the experimental questions, students will work in pairs or groups to come up with their own experimental design. If the school has resources and there is enough time, students ...

  13. 8.1 Experimental design: What is it and when should it be used?

    Experimental group- the group in an experiment that receives the intervention; Posttest- a measurement taken after the intervention; Posttest-only control group design- a type of experimental design that uses random assignment, and an experimental and control group, but does not use a pretest; Pretest- a measurement taken prior to the intervention

  14. 5 Ways to Make Experimental Design A More Approachable Topic

    3. Make it more interactive with classroom activities. Engaging your students in small classroom activities can be another way to help them understand the idea of experimental designs. Educators can help their students navigate through the problem scientifically. Here's a stepwise example.

  15. Introduction to Experimental Design

    We will cover the fundamentals of designing experiments (i.e., picking interventions) for the purpose of learning a structural causal model. We will begin by reviewing what graphical information can be learned from interventions. Then, we will discuss basic aspects of different settings for experimental design, including the distinction between passive and active settings, possible constraints ...

  16. PDF Experimental Design Activities

    Experimental Design Activities These series of activities are designed for students to collaborate together to come ... For many of the experimental questions, students will work in pairs or groups to come up with their own experimental design. If the school has resources and there is enough time, students can test their designs. If students ...

  17. 5 Exciting Ways to Teach Experimental Design Without Lecturing

    At first, experimental design can be a daunting topic for students to learn. Thankfully, there are creative ways to introduce them in the classroom. Here are five examples of such techniques. 1. Use Interactive Demonstrations of Experimental Design. Experiments are meant to be interactive. Experimental design should be as well.

  18. Exploring Experimental Design: Using hands-on activities to ...

    This article presents the basic concepts of experimental design through a series of hands-on activities. This allowed students to focus on one or two fundamental concepts during each activity, reducing the likelihood of confusion. Even though the primary goal was to teach students how to conduct, analyze, and report an experiment, they also developed stronger problem-solving skills and ...

  19. Introducing students to experimental design skills

    Hane (2007) reported that even university students' understanding of experimental design was improved by inquiry activities when they had increasing levels of responsibility for designing experiments or observational studies. Concepts of experimental design were also emphasized in inquiry-based components of the course lecture.

  20. Enabling general chemistry students to take part in experimental design

    Considerations and suggestions for implementing these types of activities to enable a wide variety of general chemistry students to take part in experimental design are discussed. Implications for research and teaching, including a consideration of ChatGPT, are also presented.

  21. Relationship between creative thinking and experimental design thinking

    Specifically, experimental design thinking is a kind of scientific subject thinking connected with experimental design activities in the field of scientific learning. For example, Wang (2000) pointed that experimental design thinking refers to the comprehensive consideration before implementing the experiment in order to accurately represent ...

  22. Part 1: Introduction to Experimental Design

    1.01 Identify and create questions and hypotheses that can be answered through scientific investigations. 1.02 Develop appropriate experimental procedures for: Given questions. Student generated questions. 1.04 Analyze variables in scientific investigations: Identify dependent and independent. Use of a control.

  23. Designing Activities to Teach Higher-Order Skills: How Feedback and

    Active learning approaches to biology teaching, including simulation-based activities, are known to enhance student learning, especially of higher-order skills; nonetheless, there are still many open questions about what features of an activity promote optimal learning. Here we designed three versions of a simulation-based tutorial called Understanding Experimental Design that asks students to ...

  24. Morality matters: social psychological perspectives on how and why CSR

    We theorize that the organization's CSR activities would have a positive impact on perceived organizational morality rather than on perceived organizational competence, and that this increase in perceived organizational morality leads to an increase in stakeholders' support. Two experimental design studies show support for our theorizing. We ...

  25. Experimental analysis of an innovative geo-exchange system installed on

    The need to use efficient and eco-friendly air conditioning systems is a fundamental requirement for today's society. In this context, low and medium enthalpy geothermal energy plays a crucial role. The research and development activities carried out in the present work have made it possible to successfully design and analyze an innovative technology capable of supplying thermal energy to ...

  26. [2406.03480] Unpacking Approaches to Learning and Teaching Machine

    Abstract: In this conceptual paper, we review existing literature on artificial intelligence/machine learning (AI/ML) education to identify three approaches to how learning and teaching ML could be conceptualized. One of them, a data-driven approach, emphasizes providing young people with opportunities to create data sets, train, and test models.
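Returning to the sunlight-versus-shade protocol sketched in comment 2 above, the Python sketch below shows one way the resulting heights could be compared. Only the design (two groups of 5 plants, heights in cm after 3 weeks) comes from that snippet; the height values are invented placeholders, and SciPy is assumed for the t-test.

    # Hypothetical analysis of the sunlight-vs-shade plant experiment.
    # The heights are made-up placeholder values, not real data.
    from statistics import mean
    from scipy import stats

    sunlight = [14.2, 15.1, 13.8, 16.0, 14.9]  # cm after 3 weeks (hypothetical)
    shade = [9.5, 10.2, 8.8, 11.0, 9.9]        # cm after 3 weeks (hypothetical)

    # Independent-samples t-test comparing the two treatment groups
    t, p = stats.ttest_ind(sunlight, shade)
    print(f"Mean difference: {mean(sunlight) - mean(shade):.1f} cm, t = {t:.2f}, p = {p:.4f}")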