
Statistics By Jim

Making statistics intuitive

Experimental Design: Definition and Types

By Jim Frost

What is Experimental Design?

An experimental design is a detailed plan for collecting and using data to identify causal relationships. Through careful planning, the design of experiments allows your data collection efforts to have a reasonable chance of detecting effects and testing hypotheses that answer your research questions.

An experiment is a data collection procedure that occurs in controlled conditions to identify and understand causal relationships between variables. Researchers can use many potential designs. The ultimate choice depends on their research question, resources, goals, and constraints. In some fields of study, researchers refer to experimental design as the design of experiments (DOE). Both terms are synonymous.


Ultimately, the design of experiments helps ensure that your procedures and data will evaluate your research question effectively. Without an experimental design, you might waste your efforts in a process that, for many potential reasons, can’t answer your research question. In short, it helps you trust your results.

Learn more about Independent and Dependent Variables.

Design of Experiments: Goals & Settings

Experiments occur in many settings, ranging from psychology and the social sciences to medicine, physics, engineering, and the industrial and service sectors. Typically, experimental goals are to discover a previously unknown effect, confirm a known effect, or test a hypothesis.

Effects represent causal relationships between variables. For example, in a medical experiment, does the new medicine cause an improvement in health outcomes? If so, the medicine has a causal effect on the outcome.

An experimental design’s focus depends on the subject area and can include the following goals:

  • Understanding the relationships between variables.
  • Identifying the variables that have the largest impact on the outcomes.
  • Finding the input variable settings that produce an optimal result.

For example, psychologists have conducted experiments to understand how conformity affects decision-making. Sociologists have performed experiments to determine whether ethnicity affects the public reaction to staged bike thefts. These experiments map out the causal relationships between variables, and their primary goal is to understand the role of various factors.

Conversely, in a manufacturing environment, the researchers might use an experimental design to find the factors that most effectively improve their product’s strength, identify the optimal manufacturing settings, and do all that while accounting for various constraints. In short, a manufacturer’s goal is often to use experiments to improve their products cost-effectively.

In a medical experiment, the goal might be to quantify the medicine’s effect and find the optimum dosage.

Developing an Experimental Design

Developing an experimental design involves planning that maximizes the potential to collect data that is both trustworthy and able to detect causal relationships. Specifically, these studies aim to detect effects when they exist in the population the researchers are studying, favor causal explanations over alternatives, isolate each factor’s true effect from potential confounders, and produce conclusions that you can generalize to the real world.

To accomplish these goals, experimental designs carefully manage data validity and reliability, and internal and external experimental validity. When your experiment is valid and reliable, you can expect your procedures and data to produce trustworthy results.

An excellent experimental design involves the following:

  • Lots of preplanning.
  • Developing experimental treatments.
  • Determining how to assign subjects to treatment groups.

The remainder of this article focuses on how experimental designs incorporate these essential items to accomplish their research goals.

Learn more about Data Reliability vs. Validity and Internal and External Experimental Validity.

Preplanning, Defining, and Operationalizing for Design of Experiments

Developing an experimental design begins with a literature review, which is crucial for the design of experiments.

This phase helps you identify critical variables, know how to measure them while ensuring reliability and validity, and understand the relationships between them. The review can also help you find ways to reduce sources of variability, which increases your ability to detect treatment effects. Notably, the literature review allows you to learn how similar studies designed their experiments and the challenges they faced.

Operationalizing a study involves taking your research question, using the background information you gathered, and formulating an actionable plan.

This process should produce a specific and testable hypothesis using data that you can reasonably collect given the resources available to the experiment. For example, in the bone density study described later in this article:

  • Null hypothesis: The jumping exercise intervention does not affect bone density.
  • Alternative hypothesis: The jumping exercise intervention affects bone density.
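Expressed symbolically, comparing the jumping group with the control group (a minimal sketch, where μ denotes the population mean bone density under each condition):

```latex
H_0: \mu_{\text{jumping}} = \mu_{\text{control}}
\qquad \text{vs.} \qquad
H_1: \mu_{\text{jumping}} \neq \mu_{\text{control}}
```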

To learn more about this early phase, read Five Steps for Conducting Scientific Studies with Statistical Analyses.

Formulating Treatments in Experimental Designs

In an experimental design, treatments are variables that the researchers control. They are the primary independent variables of interest. Researchers administer the treatment to the subjects or items in the experiment and want to know whether it causes changes in the outcome.

As the name implies, a treatment can be medical in nature, such as a new medicine or vaccine. But it’s a general term that applies to other things such as training programs, manufacturing settings, teaching methods, and types of fertilizers. I helped run an experiment where the treatment was a jumping exercise intervention that we hoped would increase bone density. All these treatment examples are things that potentially influence a measurable outcome.

Even when you know your treatment generally, you must carefully consider the amount. How large of a dose? If you’re comparing three different temperatures in a manufacturing process, how far apart are they? For my bone mineral density study, we had to determine how frequently the exercise sessions would occur and how long each lasted.

How you define the treatments in the design of experiments can affect your findings and the generalizability of your results.

Assigning Subjects to Experimental Groups

A crucial decision for all experimental designs is determining how researchers assign subjects to the experimental conditions—the treatment and control groups. The control group often, but not always, receives no treatment. It serves as a basis for comparison by showing outcomes for subjects who don’t receive the treatment. Learn more about Control Groups.

How your experimental design assigns subjects to the groups affects how confident you can be that the findings represent true causal effects rather than mere correlation caused by confounders. Indeed, the assignment method influences how you control for confounding variables. This is the difference between correlation and causation.

Imagine a study finds that vitamin consumption correlates with better health outcomes. As a researcher, you want to be able to say that vitamin consumption causes the improvements. However, with the wrong experimental design, you might only be able to say there is an association. A confounder, and not the vitamins, might actually cause the health benefits.

Let’s explore some of the ways to assign subjects in design of experiments.

Completely Randomized Designs

A completely randomized experimental design randomly assigns all subjects to the treatment and control groups. You simply take each participant and use a random process to determine their group assignment. You can flip coins, roll a die, or use a computer. Randomized experiments must be prospective studies because they need to be able to control group assignment.
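As a minimal sketch of the computer-based version of this random process (the subject labels and group sizes here are illustrative, not from the article):

```python
# Completely randomized design: 10 of 20 hypothetical subjects are drawn
# fully at random for the treatment group; the rest form the control group.
import random

subjects = [f"subject_{i}" for i in range(1, 21)]
treatment = set(random.sample(subjects, k=10))  # 10 drawn fully at random
groups = {s: ("treatment" if s in treatment else "control")
          for s in subjects}
print(groups)
```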

Random assignment in the design of experiments helps ensure that the groups are roughly equivalent at the beginning of the study. This equivalence at the start increases your confidence that any differences you see at the end were caused by the treatments. The randomization tends to equalize confounders between the experimental groups and, thereby, cancels out their effects, leaving only the treatment effects.

For example, in a vitamin study, the researchers can randomly assign participants to either the control or vitamin group. Because the groups are approximately equal when the experiment starts, if the health outcomes are different at the end of the study, the researchers can be confident that the vitamins caused those improvements.

Statisticians consider randomized experimental designs to be the best for identifying causal relationships.

If you can’t randomly assign subjects but want to draw causal conclusions about an intervention, consider using a quasi-experimental design.

Learn more about Randomized Controlled Trials and Random Assignment in Experiments.

Randomized Block Designs

Nuisance factors are variables that can affect the outcome, but they are not the researcher’s primary interest. Unfortunately, they can hide or distort the treatment results. When experimenters know about specific nuisance factors, they can use a randomized block design to minimize their impact.

This experimental design takes subjects with a shared “nuisance” characteristic and groups them into blocks. The participants in each block are then randomly assigned to the experimental groups. This process allows the experiment to control for known nuisance factors.

Blocking in the design of experiments reduces the impact of nuisance factors on experimental error. The analysis assesses the effects of the treatment within each block, which removes the variability between blocks. The result is that blocked experimental designs can reduce the impact of nuisance variables, increasing the ability to detect treatment effects accurately.

Suppose you’re testing various teaching methods. Because grade level likely affects educational outcomes, you might use grade level as a blocking factor. To use a randomized block design for this scenario, divide the participants by grade level and then randomly assign the members of each grade level to the experimental groups.
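A sketch of that grade-level blocking, with hypothetical student names; the random assignment happens separately within each block:

```python
# Randomized block design: shuffle the students within each grade-level
# block, then alternate treatment labels so each block is split evenly.
import random

blocks = {
    "grade_3": ["Ava", "Ben", "Cara", "Dan"],
    "grade_4": ["Eli", "Fay", "Gus", "Hana"],
}
methods = ["method_A", "method_B"]

assignment = {}
for grade, students in blocks.items():
    shuffled = random.sample(students, k=len(students))
    for i, student in enumerate(shuffled):
        assignment[student] = methods[i % len(methods)]
print(assignment)
```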

A standard guideline for an experimental design is to “Block what you can, randomize what you cannot.” Use blocking for a few primary nuisance factors. Then use random assignment to distribute the unblocked nuisance factors equally between the experimental conditions.

You can also use covariates to control nuisance factors. Learn about Covariates: Definition and Uses.

Observational Studies

In some experimental designs, randomly assigning subjects to the experimental conditions is impossible or unethical. The researchers simply can’t assign participants to the experimental groups. However, they can observe them in their natural groupings, measure the essential variables, and look for correlations. These observational studies are also known as quasi-experimental designs. Retrospective studies must be observational in nature because they look back at past events.

Imagine you’re studying the effects of depression on an activity. Clearly, you can’t randomly assign participants to the depression and control groups. But you can observe participants with and without depression and see how their task performance differs.

Observational studies let you perform research when you can’t control the treatment. However, quasi-experimental designs increase the problem of confounding variables. For this design of experiments, correlation does not necessarily imply causation. While special procedures can help control confounders in an observational study, you’re ultimately less confident that the results represent causal findings.

Learn more about Observational Studies.

For a good comparison, learn about the differences and tradeoffs between Observational Studies and Randomized Experiments.

Between-Subjects vs. Within-Subjects Experimental Designs

When you think of the design of experiments, you probably picture a treatment and control group. Researchers assign participants to only one of these groups, so each group contains entirely different subjects than the other groups. Analysts compare the groups at the end of the experiment. Statisticians refer to this method as a between-subjects, or independent measures, experimental design.

In a between-subjects design, you can have more than one treatment group, but each subject is exposed to only one condition—the control group or one of the treatment groups.

A potential downside to this approach is that differences between groups at the beginning can affect the results at the end. As you’ve read earlier, random assignment can reduce those differences, but it is imperfect. There will always be some variability between the groups.

In a within-subjects experimental design, also known as repeated measures, subjects experience all treatment conditions and are measured for each. Each subject acts as their own control, which reduces variability and increases the statistical power to detect effects.

In this experimental design, you minimize pre-existing differences between the experimental conditions because they all contain the same subjects. However, the order of treatments can affect the results. Beware of practice and fatigue effects. Learn more about Repeated Measures Designs.

Between-Subjects Design | Within-Subjects Design
Assigned to one experimental condition | Participates in all experimental conditions
Requires more subjects | Requires fewer subjects
Differences between subjects in the groups can affect the results | Uses the same subjects in all conditions
No order-of-treatment effects | Order of treatments can affect results

Design of Experiments Examples

Consider a bone density study with three experimental groups—a control group, a stretching exercise group, and a jumping exercise group.

In a between-subjects experimental design, scientists randomly assign each participant to one of the three groups.

In a within-subjects design, all subjects experience the three conditions sequentially while the researchers measure bone density repeatedly. The procedure can switch the order of treatments for the participants to help reduce order effects.
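One way to implement that order switching is counterbalancing, rotating subjects through the possible treatment orders. A minimal sketch (subject labels hypothetical):

```python
# Counterbalance a within-subjects design: cycle through all 3! = 6
# possible orders of the three conditions across the subjects.
from itertools import cycle, permutations

conditions = ["control", "stretching", "jumping"]
orders = cycle(permutations(conditions))
subjects = [f"subject_{i}" for i in range(1, 13)]
schedule = {s: next(orders) for s in subjects}  # 2 subjects per order
print(schedule)
```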

Matched Pairs Experimental Design

A matched pairs experimental design is a between-subjects study that uses pairs of similar subjects. Researchers use this approach to reduce pre-existing differences between experimental groups. It’s yet another design of experiments method for reducing sources of variability.

Researchers identify variables likely to affect the outcome, such as demographics. When they pick a subject with a set of characteristics, they try to locate another participant with similar attributes to create a matched pair. Scientists randomly assign one member of a pair to the treatment group and the other to the control group.
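A minimal sketch of this pairing-then-randomizing process, matching on a single hypothetical variable (age); real studies typically match on several attributes at once:

```python
# Matched pairs: sort by the matching variable so neighbors are similar,
# pair them up, then randomly assign one member of each pair to treatment.
import random

subjects = [("s1", 21), ("s2", 22), ("s3", 35),
            ("s4", 34), ("s5", 50), ("s6", 51)]  # (id, age)
subjects.sort(key=lambda s: s[1])
pairs = [subjects[i:i + 2] for i in range(0, len(subjects), 2)]

assignment = {}
for a, b in pairs:
    treated = random.choice([a, b])  # coin flip within each pair
    for member in (a, b):
        assignment[member[0]] = ("treatment" if member is treated
                                 else "control")
print(assignment)
```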

On the plus side, this process creates two similar groups, and it doesn’t create treatment order effects. While matched pairs do not produce the perfectly matched groups of a within-subjects design (which uses the same subjects in all conditions), the approach aims to reduce variability between groups relative to a standard between-subjects study.

On the downside, finding matched pairs is very time-consuming. Additionally, if one member of a matched pair drops out, the other subject must leave the study too.

Learn more about Matched Pairs Design: Uses & Examples.

Another consideration is whether you’ll use a cross-sectional design (one point in time) or a longitudinal study to track changes over time.

A case study is a research method that often serves as a precursor to a more rigorous experimental design by identifying research questions, variables, and hypotheses to test. Learn more about What is a Case Study? Definition & Examples.

In conclusion, the design of experiments is extremely sensitive to subject area concerns and the time and resources available to the researchers. Developing a suitable experimental design requires balancing a multitude of considerations. A successful design is necessary to obtain trustworthy answers to your research question and to have a reasonable chance of detecting treatment effects when they exist.


Wiley Online Library


Statistical Design and Analysis of Experiments: With Applications to Engineering and Science

About This Book

  • Features numerous examples using actual engineering and scientific studies.
  • Presents statistics as an integral component of experimentation from the planning stage to the presentation of the conclusions.
  • Deep and concentrated experimental design coverage, with equivalent but separate emphasis on the analysis of data from the various designs.
  • Topics can be implemented by practitioners and do not require a high level of training in statistics.
  • New edition includes new and updated material and computer output.

"...can really provide useful information for the intended audience..." ( Zentralblatt Math , Vol. 1029, 2004)

“...a practitioner’s guide to statistical methods for designing and analyzing experiments...” ( Quarterly of Applied Mathematics , Vol. LXI, No. 3, September 2003)

"...a perfect desktop reference..." ( Technometrics , Vol. 45, No. 3, August 2003)

Author Bios

RICHARD F. GUNST, PhD, is a professor in the Department of Statistical Science at Southern Methodist University in Dallas, Texas.

JAMES L. HESS, PhD, is Staff Vice President, Operations, at Leggett & Platt Inc. in Carthage, Missouri.

Table of Contents

  • Frontmatter
  • Statistics in Engineering and Science
  • Fundamentals of Statistical Inference
  • Inferences on Means and Standard Deviations
  • Statistical Principles in Experimental Design
  • Factorial Experiments in Completely Randomized Designs
  • Analysis of Completely Randomized Designs
  • Fractional Factorial Experiments
  • Analysis of Fractional Factorial Experiments
  • Experiments in Randomized Block Designs
  • Analysis of Designs with Random Factor Levels
  • Nested Designs
  • Special Designs for Process Improvement
  • Analysis of Nested Designs and Designs for Process Improvement
  • Linear Regression with One Predictor Variable
  • Linear Regression with Several Predictor Variables
  • Linear Regression with Factors and Covariates as Predictors
  • Designs and Analyses for Fitting Response Surfaces
  • Model Assessment
  • Variable Selection Techniques
  • Appendix: Statistical Tables
  • Index



J Athl Train, 45(1), Jan–Feb 2010

Study/Experimental/Research Design: Much More Than Statistics

Kenneth L. Knight

Brigham Young University, Provo, UT

Background:

The purpose of study, experimental, or research design in scientific manuscripts has changed significantly over the years. It has evolved from an explanation of the design of the experiment (ie, data gathering or acquisition) to an explanation of the statistical analysis. This practice makes “Methods” sections hard to read and understand.

Objective:

To clarify the difference between study design and statistical analysis, to show the advantages of a properly written study design on article comprehension, and to encourage authors to correctly describe study designs.

Description:

The role of study design is explored from the introduction of the concept by Fisher through modern-day scientists and the AMA Manual of Style. At one time, when experiments were simpler, the study design and statistical design were identical or very similar. With the complex research that is common today, which often includes manipulating variables to create new variables and the multiple (and different) analyses of a single data set, data collection is very different from statistical design. Thus, both a study design and a statistical design are necessary.

Advantages:

Scientific manuscripts will be much easier to read and comprehend. A proper experimental design serves as a road map to the study methods, helping readers to understand more clearly how the data were obtained and, therefore, assisting them in properly analyzing the results.

Study, experimental, or research design is the backbone of good research. It directs the experiment by orchestrating data collection, defines the statistical analysis of the resultant data, and guides the interpretation of the results. When properly described in the written report of the experiment, it serves as a road map to readers, 1 helping them negotiate the “Methods” section, and, thus, it improves the clarity of communication between authors and readers.

A growing trend is to equate study design with only the statistical analysis of the data. The design statement typically is placed at the end of the “Methods” section as a subsection called “Experimental Design” or as part of a subsection called “Data Analysis.” This placement, however, equates experimental design and statistical analysis, minimizing the effect of experimental design on the planning and reporting of an experiment. This linkage is inappropriate, because some of the elements of the study design that should be described at the beginning of the “Methods” section are instead placed in the “Statistical Analysis” section or, worse, are absent from the manuscript entirely.

Have you ever interrupted your reading of the “Methods” to sketch out the variables in the margins of the paper as you attempt to understand how they all fit together? Or have you jumped back and forth from the early paragraphs of the “Methods” section to the “Statistics” section to try to understand which variables were collected and when? These efforts would be unnecessary if a road map at the beginning of the “Methods” section outlined how the independent variables were related, which dependent variables were measured, and when they were measured. When they were measured is especially important if the variables used in the statistical analysis were a subset of the measured variables or were computed from measured variables (such as change scores).

The purpose of this Communications article is to clarify the purpose and placement of study design elements in an experimental manuscript. Adopting these ideas may improve your science and surely will enhance the communication of that science. These ideas will make experimental manuscripts easier to read and understand and, therefore, will allow them to become part of readers' clinical decision making.

WHAT IS A STUDY (OR EXPERIMENTAL OR RESEARCH) DESIGN?

The terms study design, experimental design, and research design are often thought to be synonymous and are sometimes used interchangeably in a single paper. Avoid doing so. Use the term that is preferred by the style manual of the journal for which you are writing. Study design is the preferred term in the AMA Manual of Style, 2 so I will use it here.

A study design is the architecture of an experimental study 3 and a description of how the study was conducted, 4 including all elements of how the data were obtained. 5 The study design should be the first subsection of the “Methods” section in an experimental manuscript (see the Table). “Statistical Design” or, preferably, “Statistical Analysis” or “Data Analysis” should be the last subsection of the “Methods” section.

Table. Elements of a “Methods” Section

The “Study Design” subsection describes how the variables and participants interacted. It begins with a general statement of how the study was conducted (eg, crossover trials, parallel, or observational study). 2 The second element, which usually begins with the second sentence, details the number of independent variables or factors, the levels of each variable, and their names. A shorthand way of doing so is with a statement such as “A 2 × 4 × 8 factorial guided data collection.” This tells us that there were 3 independent variables (factors), with 2 levels of the first factor, 4 levels of the second factor, and 8 levels of the third factor. Following is a sentence that names the levels of each factor: for example, “The independent variables were sex (male or female), training program (eg, walking, running, weight lifting, or plyometrics), and time (2, 4, 6, 8, 10, 15, 20, or 30 weeks).” Such an approach clearly outlines for readers how the various procedures fit into the overall structure and, therefore, enhances their understanding of how the data were collected. Thus, the design statement is a road map of the methods.
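To make that shorthand concrete, the 2 × 4 × 8 statement corresponds to 64 factor-level combinations. A sketch enumerating them, with the levels taken from the example above:

```python
# Enumerate the cells of the 2 x 4 x 8 factorial design described above.
from itertools import product

sex = ["male", "female"]
training = ["walking", "running", "weight lifting", "plyometrics"]
time_weeks = [2, 4, 6, 8, 10, 15, 20, 30]

cells = list(product(sex, training, time_weeks))
print(len(cells))   # 64 treatment combinations guided data collection
print(cells[0])     # ('male', 'walking', 2)
```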

The dependent (or measurement or outcome) variables are then named. Details of how they were measured are not given at this point in the manuscript but are explained later in the “Instruments” and “Procedures” subsections.

Next is a paragraph detailing who the participants were and how they were selected, placed into groups, and assigned to a particular treatment order, if the experiment was a repeated-measures design. And although not a part of the design per se, a statement about obtaining written informed consent from participants and institutional review board approval is usually included in this subsection.

The nuts and bolts of the “Methods” section follow, including such things as equipment, materials, protocols, etc. These are beyond the scope of this commentary, however, and so will not be discussed.

The last part of the “Methods” section and last part of the “Study Design” section is the “Data Analysis” subsection. It begins with an explanation of any data manipulation, such as how data were combined or how new variables (eg, ratios or differences between collected variables) were calculated. Next, readers are told of the statistical measures used to analyze the data, such as a mixed 2 × 4 × 8 analysis of variance (ANOVA) with 2 between-groups factors (sex and training program) and 1 within-groups factor (time of measurement). Researchers should state and reference the statistical package and procedure(s) within the package used to compute the statistics. (Various statistical packages perform analyses slightly differently, so it is important to know the package and specific procedure used.) This detail allows readers to judge the appropriateness of the statistical measures and the conclusions drawn from the data.

STATISTICAL DESIGN VERSUS STATISTICAL ANALYSIS

Avoid using the term statistical design. Statistical methods are only part of the overall design. The term gives too much emphasis to the statistics, which are important, but only one of many tools used in interpreting data and only part of the study design:

The most important issues in biostatistics are not expressed with statistical procedures. The issues are inherently scientific, rather than purely statistical, and relate to the architectural design of the research, not the numbers with which the data are cited and interpreted. 6

Stated another way, “The justification for the analysis lies not in the data collected but in the manner in which the data were collected.” 3 “Without the solid foundation of a good design, the edifice of statistical analysis is unsafe.” 7 (pp4–5)

The intertwining of study design and statistical analysis may have been caused (unintentionally) by R.A. Fisher, “… a genius who almost single-handedly created the foundations for modern statistical science.” 8 Most research did not involve statistics until Fisher invented the concepts and procedures of ANOVA (in 1921) 9,10 and experimental design (in 1935). 11 His books became standard references for scientists in many disciplines. As a result, many ANOVA books were titled Experimental Design (see, for example, Edwards 12), and ANOVA courses taught in psychology and education departments included the words experimental design in their course titles.

Before the widespread use of computers to analyze data, designs were much simpler, and often there was little difference between study design and statistical analysis. So combining the 2 elements did not cause serious problems. This is no longer true, however, for 3 reasons: (1) Research studies are becoming more complex, with multiple independent and dependent variables. The procedures sections of these complex studies can be difficult to understand if your only reference point is the statistical analysis and design. (2) Dependent variables are frequently measured at different times. (3) How the data were collected is often not directly correlated with the statistical design.

For example, assume the goal is to determine the strength gain in novice and experienced athletes as a result of 3 strength training programs. Rate of change in strength is not a measurable variable; rather, it is calculated from strength measurements taken at various time intervals during the training. So the study design would be a 2 × 2 × 3 factorial with independent variables of time (pretest or posttest), experience (novice or advanced), and training (isokinetic, isotonic, or isometric) and a dependent variable of strength. The statistical design, however, would be a 2 × 3 factorial with independent variables of experience (novice or advanced) and training (isokinetic, isotonic, or isometric) and a dependent variable of strength gain. Note that data were collected according to a 3-factor design but were analyzed according to a 2-factor design and that the dependent variables were different. So a single design statement, usually a statistical design statement, would not communicate which data were collected or how. Readers would be left to figure out on their own how the data were collected.
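A minimal sketch of that distinction with hypothetical numbers: strength is collected pre and post (the study design's time factor), while the analysis operates on the computed gain:

```python
# Data are collected per the 2 x 2 x 3 study design (time x experience x
# training), but analyzed per the 2 x 3 statistical design on 'gain'.
records = [
    # (experience, training, pre_strength, post_strength) -- hypothetical
    ("novice",   "isokinetic", 100.0, 118.0),
    ("advanced", "isotonic",   140.0, 149.0),
    ("novice",   "isometric",  105.0, 116.0),
]
analysis_rows = [(exp, trn, post - pre)
                 for exp, trn, pre, post in records]
print(analysis_rows)  # the dependent variable is now strength gain
```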

MULTIVARIATE RESEARCH AND THE NEED FOR STUDY DESIGNS

With the advent of electronic data gathering and computerized data handling and analysis, research projects have increased in complexity. Many projects involve multiple dependent variables measured at different times, and, therefore, multiple design statements may be needed for both data collection and statistical analysis. Consider, for example, a study of the effects of heat and cold on neural inhibition. The variables of Hmax and Mmax are measured 3 times each: before, immediately after, and 30 minutes after a 20-minute treatment with heat or cold. Muscle temperature might be measured each minute before, during, and after the treatment. Although the minute-by-minute data are important for graphing temperature fluctuations during the procedure, only 3 temperatures (time 0, time 20, and time 50) are used for statistical analysis. A single dependent variable, the Hmax:Mmax ratio, is computed to illustrate neural inhibition. Again, a single statistical design statement would tell little about how the data were obtained. And in this example, separate design statements would be needed for temperature measurements and Hmax:Mmax measurements.

As stated earlier, drawing conclusions from the data depends more on how the data were measured than on how they were analyzed. 3,6,7,13 So a single study design statement (or multiple such statements) at the beginning of the “Methods” section acts as a road map to the study and, thus, increases scientists' and readers' comprehension of how the experiment was conducted (ie, how the data were collected). Appropriate study design statements also increase the accuracy of conclusions drawn from the study.

CONCLUSIONS

The goal of scientific writing, or any writing, for that matter, is to communicate information. Including 2 design statements or subsections in scientific papers—one to explain how the data were collected and another to explain how they were statistically analyzed—will improve the clarity of communication and bring praise from readers. To summarize:

  • Purge from your thoughts and vocabulary the idea that experimental design and statistical design are synonymous.
  • Study or experimental design plays a much broader role than simply defining and directing the statistical analysis of an experiment.
  • A properly written study design serves as a road map to the “Methods” section of an experiment and, therefore, improves communication with the reader.
  • Study design should include a description of the type of design used, each factor (and each level) involved in the experiment, and the time at which each measurement was made.
  • Clarify when the variables involved in data collection and data analysis are different, such as when data analysis involves only a subset of a collected variable or a resultant variable from the mathematical manipulation of 2 or more collected variables.

Acknowledgments

Thanks to Thomas A. Cappaert, PhD, ATC, CSCS, CSE, for suggesting the link between R.A. Fisher and the melding of the concepts of research design and statistics.


Statistical Design and Analysis of Biological Experiments

Chapter 1: Principles of Experimental Design

1.1 Introduction

The validity of conclusions drawn from a statistical analysis crucially hinges on the manner in which the data are acquired, and even the most sophisticated analysis will not rescue a flawed experiment. Planning an experiment and thinking about the details of data acquisition is so important for a successful analysis that R. A. Fisher—who single-handedly invented many of the experimental design techniques we are about to discuss—famously wrote

To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. (Fisher 1938)

(Statistical) design of experiments provides the principles and methods for planning experiments and tailoring the data acquisition to an intended analysis. Design and analysis of an experiment are best considered as two aspects of the same enterprise: the goals of the analysis strongly inform an appropriate design, and the implemented design determines the possible analyses.

The primary aim of designing experiments is to ensure that valid statistical and scientific conclusions can be drawn that withstand the scrutiny of a determined skeptic. Good experimental design also considers that resources are used efficiently, and that estimates are sufficiently precise and hypothesis tests adequately powered. It protects our conclusions by excluding alternative interpretations or rendering them implausible. Three main pillars of experimental design are randomization, replication, and blocking, and we will flesh out their effects on the subsequent analysis as well as their implementation in an experimental design.

An experimental design is always tailored towards predefined (primary) analyses and an efficient analysis and unambiguous interpretation of the experimental data is often straightforward from a good design. This does not prevent us from doing additional analyses of interesting observations after the data are acquired, but these analyses can be subjected to more severe criticisms and conclusions are more tentative.

In this chapter, we provide the wider context for using experiments in a larger research enterprise and informally introduce the main statistical ideas of experimental design. We use a comparison of two samples as our main example to study how design choices affect an analysis, but postpone a formal quantitative analysis to the next chapters.

1.2 A Cautionary Tale

For illustrating some of the issues arising in the interplay of experimental design and analysis, we consider a simple example. We are interested in comparing the enzyme levels measured in processed blood samples from laboratory mice, when the sample processing is done either with a kit from vendor A or a kit from competitor B. For this, we take 20 mice and randomly select 10 of them for sample preparation with kit A, while the blood samples of the remaining 10 mice are prepared with kit B. The experiment is illustrated in Figure 1.1 A and the resulting data are given in Table 1.1.

Table 1.1: Measured enzyme levels from samples of twenty mice. Samples of ten mice each were processed using a kit of vendor A and B, respectively.
A 8.96 8.95 11.37 12.63 11.38 8.36 6.87 12.35 10.32 11.99
B 12.68 11.37 12.00 9.81 10.35 11.76 9.01 10.83 8.76 9.99

One option for comparing the two kits is to look at the difference in average enzyme levels, and we find an average level of 10.32 for vendor A and 10.66 for vendor B. We would like to interpret their difference of -0.34 as the difference due to the two preparation kits and conclude whether the two kits give equal results or if measurements based on one kit are systematically different from those based on the other kit.
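These summary numbers can be reproduced directly from Table 1.1; the t-test at the end is one common analysis for such a two-group comparison (scipy assumed available), not the only option:

```python
# Reproduce the group means and their difference from Table 1.1.
from statistics import mean
from scipy.stats import ttest_ind

kit_a = [8.96, 8.95, 11.37, 12.63, 11.38, 8.36, 6.87, 12.35, 10.32, 11.99]
kit_b = [12.68, 11.37, 12.00, 9.81, 10.35, 11.76, 9.01, 10.83, 8.76, 9.99]

print(round(mean(kit_a), 2))                # 10.32
print(round(mean(kit_b), 2))                # 10.66
print(round(mean(kit_a) - mean(kit_b), 2))  # -0.34
t_stat, p_value = ttest_ind(kit_a, kit_b)   # pooled-variance two-sample t-test
```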

Such interpretation, however, is only valid if the two groups of mice and their measurements are identical in all aspects except the sample preparation kit. If we use one strain of mice for kit A and another strain for kit B, any difference might also be attributed to inherent differences between the strains. Similarly, if the measurements using kit B were conducted much later than those using kit A, any observed difference might be attributed to changes in, e.g., mice selected, batches of chemicals used, device calibration, or any number of other influences. None of these competing explanations for an observed difference can be excluded from the given data alone, but good experimental design allows us to render them (almost) arbitrarily implausible.

A second aspect for our analysis is the inherent uncertainty in our calculated difference: if we repeat the experiment, the observed difference will change each time, and this will be more pronounced for a smaller number of mice, among others. If we do not use a sufficient number of mice in our experiment, the uncertainty associated with the observed difference might be too large, such that random fluctuations become a plausible explanation for the observed difference. Systematic differences between the two kits, of practically relevant magnitude in either direction, might then be compatible with the data, and we can draw no reliable conclusions from our experiment.

In each case, the statistical analysis—no matter how clever—was doomed before the experiment was even started, while simple ideas from statistical design of experiments would have provided correct and robust results with interpretable conclusions.

1.3 The Language of Experimental Design

By an experiment we understand an investigation where the researcher has full control over selecting and altering the experimental conditions of interest, and we only consider investigations of this type. The selected experimental conditions are called treatments. An experiment is comparative if the responses to several treatments are to be compared or contrasted. The experimental units are the smallest subdivision of the experimental material to which a treatment can be assigned. All experimental units given the same treatment constitute a treatment group. Especially in biology, we often compare treatments to a control group to which some standard experimental conditions are applied; a typical example is using a placebo for the control group, and different drugs for the other treatment groups.

The values observed are called responses and are measured on the response units; these are often identical to the experimental units but need not be. Multiple experimental units are sometimes combined into groupings or blocks, such as mice grouped by litter, or samples grouped by batches of chemicals used for their preparation. More generally, we call any grouping of the experimental material (even with group size one) a unit.

In our example, we selected the mice, used a single sample per mouse, deliberately chose the two specific vendors, and had full control over which kit to assign to which mouse. In other words, the two kits are the treatments and the mice are the experimental units. We took the measured enzyme level of a single sample from a mouse as our response, and samples are therefore the response units. The resulting experiment is comparative, because we contrast the enzyme levels between the two treatment groups.


Figure 1.1: Three designs to determine the difference between two preparation kits A and B based on four mice. A: One sample per mouse. Comparison between averages of samples with same kit. B: Two samples per mouse treated with the same kit. Comparison between averages of mice with same kit requires averaging responses for each mouse first. C: Two samples per mouse each treated with different kit. Comparison between two samples of each mouse, with differences averaged.

In this example, we can coalesce experimental and response units, because we have a single response per mouse and cannot distinguish a sample from a mouse in the analysis, as illustrated in Figure 1.1 A for four mice. Responses from mice with the same kit are averaged, and the kit difference is the difference between these two averages.

By contrast, if we take two samples per mouse and use the same kit for both samples, then the mice are still the experimental units, but each mouse now groups the two response units associated with it. Now, responses from the same mouse are first averaged, and these averages are used to calculate the difference between kits; even though eight measurements are available, this difference is still based on only four mice (Figure 1.1 B).

If we take two samples per mouse, but apply each kit to one of the two samples, then the samples are both the experimental and response units, while the mice are blocks that group the samples. Now, we calculate the difference between kits for each mouse, and then average these differences (Figure 1.1 C).
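The three estimators differ only in what gets averaged when. A sketch with hypothetical responses for the four mice of Figure 1.1:

```python
# Three ways to estimate the kit difference (designs A, B, C above).
from statistics import mean

# A: one sample per mouse, two mice per kit.
kit_a, kit_b = [10.1, 11.0], [10.8, 11.5]
diff_a = mean(kit_a) - mean(kit_b)

# B: two samples per mouse, same kit; average per mouse first.
mice_a = [[10.1, 10.3], [11.0, 10.8]]
mice_b = [[10.9, 10.7], [11.5, 11.3]]
diff_b = mean(mean(m) for m in mice_a) - mean(mean(m) for m in mice_b)

# C: two samples per mouse, one per kit; average the per-mouse differences.
paired = [(10.1, 10.9), (11.0, 11.5)]  # (kit A sample, kit B sample)
diff_c = mean(a - b for a, b in paired)
print(diff_a, diff_b, diff_c)
```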

If we only use one kit and determine the average enzyme level, then this investigation is still an experiment, but is not comparative.

To summarize, the design of an experiment determines the logical structure of the experiment; it consists of (i) a set of treatments (the two kits); (ii) a specification of the experimental units (animals, cell lines, samples) (the mice in Figure 1.1 A,B and the samples in Figure 1.1 C); (iii) a procedure for assigning treatments to units; and (iv) a specification of the response units and the quantity to be measured as a response (the samples and associated enzyme levels).

1.4 Experiment Validity

Before we embark on the more technical aspects of experimental design, we discuss three components for evaluating an experiment’s validity: construct validity, internal validity, and external validity. These criteria are well-established in areas such as educational and psychological research, and have more recently been discussed for animal research (Würbel 2017), where experiments are increasingly scrutinized for their scientific rationale and their design and intended analyses.

1.4.1 Construct Validity

Construct validity concerns the choice of the experimental system for answering our research question. Is the system even capable of providing a relevant answer to the question?

Studying the mechanisms of a particular disease, for example, might require careful choice of an appropriate animal model that shows a disease phenotype and is accessible to experimental interventions. If the animal model is a proxy for drug development for humans, biological mechanisms must be sufficiently similar between animal and human physiologies.

Another important aspect of the construct is the quantity that we intend to measure (the measurand), and its relation to the quantity or property we are interested in. For example, we might measure the concentration of the same chemical compound once in a blood sample and once in a highly purified sample, and these constitute two different measurands, whose values might not be comparable. Often, the quantity of interest (e.g., liver function) is not directly measurable (or even quantifiable) and we measure a biomarker instead. For example, pre-clinical and clinical investigations may use concentrations of proteins or counts of specific cell types from blood samples, such as the CD4+ cell count used as a biomarker for immune system function.

1.4.2 Internal Validity

The internal validity of an experiment concerns the soundness of the scientific rationale, statistical properties such as precision of estimates, and the measures taken against risk of bias. It refers to the validity of claims within the context of the experiment. Statistical design of experiments plays a prominent role in ensuring internal validity, and we briefly discuss the main ideas before providing the technical details and an application to our example in the subsequent sections.

Scientific Rationale and Research Question

The scientific rationale of a study is (usually) not immediately a statistical question. Translating a scientific question into a quantitative comparison amenable to statistical analysis is no small task and often requires careful consideration. It is a substantial, if non-statistical, benefit of using experimental design that we are forced to formulate a precise-enough research question and decide on the main analyses required for answering it before we conduct the experiment. For example, the question: is there a difference between placebo and drug? is insufficiently precise for planning a statistical analysis and determine an adequate experimental design. What exactly is the drug treatment? What should the drug’s concentration be and how is it administered? How do we make sure that the placebo group is comparable to the drug group in all other aspects? What do we measure and what do we mean by “difference?” A shift in average response, a fold-change, change in response before and after treatment?

The scientific rationale also enters the choice of a potential control group to which we compare responses. The quote

The deep, fundamental question in statistical analysis is ‘Compared to what?’ (Tufte 1997)

highlights the importance of this choice.

There are almost never enough resources to answer all relevant scientific questions. We therefore define a few questions of highest interest, and the main purpose of the experiment is answering these questions in the primary analysis. This intended analysis drives the experimental design to ensure relevant estimates can be calculated and have sufficient precision, and tests are adequately powered. This does not preclude us from conducting additional secondary analyses and exploratory analyses, but we are not willing to enlarge the experiment to ensure that strong conclusions can also be drawn from these analyses.

Risk of Bias

Experimental bias is a systematic difference in response between experimental units in addition to the difference caused by the treatments. The experimental units in the different groups are then not equal in all aspects other than the treatment applied to them. We saw several examples in Section 1.2.

Minimizing the risk of bias is crucial for internal validity and we look at some common measures to eliminate or reduce different types of bias in Section 1.5.

Precision and Effect Size

Another aspect of internal validity is the precision of estimates and the expected effect sizes. Is the experimental setup, in principle, able to detect a difference of relevant magnitude? Experimental design offers several methods for answering this question based on the expected heterogeneity of samples, the measurement error, and other sources of variation: power analysis is a technique for determining the number of samples required to reliably detect a relevant effect size and provide estimates of sufficient precision. More samples yield more precision and more power, but we have to be careful that replication is done at the right level: simply measuring a biological sample multiple times as in Figure 1.1 B yields more measured values, but is pseudo-replication for analyses. Replication should also ensure that the statistical uncertainties of estimates can be gauged from the data of the experiment itself, without additional untestable assumptions. Finally, the technique of blocking, shown in Figure 1.1 C, can remove a substantial proportion of the variation and thereby increase power and precision if we find a way to apply it.
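As a sketch of such a power analysis (statsmodels assumed available; the standardized effect size of 0.5 is a hypothetical input, not a value from the text):

```python
# Samples per group needed to detect a standardized difference of d = 0.5
# with 80% power at a 5% significance level (two-sample t-test setting).
from statsmodels.stats.power import TTestIndPower

n_per_group = TTestIndPower().solve_power(effect_size=0.5, alpha=0.05,
                                          power=0.80,
                                          alternative='two-sided')
print(round(n_per_group))  # roughly 64 per group
```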

1.4.3 External Validity

The external validity of an experiment concerns its replicability and the generalizability of inferences. An experiment is replicable if its results can be confirmed by an independent new experiment, preferably by a different lab and researcher. Experimental conditions in the replicate experiment usually differ from the original experiment, which provides evidence that the observed effects are robust to such changes. A much weaker condition on an experiment is reproducibility, the property that an independent researcher draws equivalent conclusions based on the data from this particular experiment, using the same analysis techniques. Reproducibility requires publishing the raw data, details on the experimental protocol, and a description of the statistical analyses, preferably with accompanying source code. Many scientific journals subscribe to reporting guidelines to ensure reproducibility and these are also helpful for planning an experiment.

A main threat to replicability and generalizability are too tightly controlled experimental conditions, when inferences only hold for a specific lab under the very specific conditions of the original experiment. Introducing systematic heterogeneity and using multi-center studies effectively broadens the experimental conditions and therefore the inferences for which internal validity is available.

For systematic heterogeneity, experimental conditions are systematically altered in addition to the treatments, and treatment differences estimated for each condition. For example, we might split the experimental material into several batches and use a different day of analysis, sample preparation, batch of buffer, measurement device, and lab technician for each batch. A more general inference is then possible if effect size, effect direction, and precision are comparable between the batches, indicating that the treatment differences are stable over the different conditions.

In multi-center experiments, the same experiment is conducted in several different labs and the results compared and merged. Multi-center approaches are very common in clinical trials and often necessary to reach the required number of patient enrollments.

Generalizability of randomized controlled trials in medicine and animal studies can suffer from overly restrictive eligibility criteria. In clinical trials, patients are often included or excluded based on co-medications and co-morbidities, and the resulting sample of eligible patients might no longer be representative of the patient population. For example, Travers et al. (2007) used the eligibility criteria of 17 randomized controlled trials of asthma treatments and found that, out of 749 patients, only a median of 6% (45 patients) would be eligible for an asthma-related randomized controlled trial. This puts a question mark on the relevance of the trials’ findings for asthma patients in general.

1.5 Reducing the Risk of Bias

1.5.1 Randomization of Treatment Allocation

If systematic differences other than the treatment exist between our treatment groups, then the effect of the treatment is confounded with these other differences and our estimates of treatment effects might be biased.

We remove such unwanted systematic differences from our treatment comparisons by randomizing the allocation of treatments to experimental units. In a completely randomized design, each experimental unit has the same chance of being subjected to any of the treatments, and any differences between the experimental units other than the treatments are distributed over the treatment groups. Importantly, randomization is the only method that also protects our experiment against unknown sources of bias: we do not need to know all or even any of the potential differences and yet their impact is eliminated from the treatment comparisons by random treatment allocation.

Randomization has two effects: (i) differences unrelated to treatment become part of the ‘statistical noise’ rendering the treatment groups more similar; and (ii) the systematic differences are thereby eliminated as sources of bias from the treatment comparison.

Randomization transforms systematic variation into random variation.

In our example, a proper randomization would select 10 out of our 20 mice fully at random, such that each mouse has the same chance (10/20 = 1/2) of being assigned to kit A. These ten mice are then assigned to kit A, and the remaining mice to kit B. This allocation is entirely independent of the treatments and of any properties of the mice.

To ensure random treatment allocation, some kind of random process needs to be employed. This can be as simple as shuffling a pack of 10 red and 10 black cards or using a software-based random number generator. Randomization is slightly more difficult if the number of experimental units is not known at the start of the experiment, such as when patients are recruited for an ongoing clinical trial (sometimes called rolling recruitment ), and we want to have reasonable balance between the treatment groups at each stage of the trial.
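As an illustration, here is a minimal Python sketch of this completely randomized allocation for the mouse example; the mouse labels and the fixed seed are ours, chosen only to make the example reproducible.

```python
import random

mice = [f"mouse_{i:02d}" for i in range(1, 21)]  # the 20 experimental units

random.seed(1)                               # fix the seed for reproducibility
kit_A = random.sample(mice, 10)              # draw 10 mice fully at random
kit_B = [m for m in mice if m not in kit_A]  # the remaining 10 receive kit B

print(sorted(kit_A))
print(sorted(kit_B))
```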

Seemingly random assignments “by hand” are usually no less complicated than fully random assignments, but are always inferior. If surprising results ensue from the experiment, such assignments are subject to unanswerable criticism and suspicion of unwanted bias. Even worse are systematic allocations; they can only remove bias from known causes, and immediately raise red flags under the slightest scrutiny.

The Problem of Undesired Assignments

Even with a fully random treatment allocation procedure, we might end up with an undesirable allocation. For our example, the treatment group of kit A might—just by chance—contain mice that are all bigger or more active than those in the other treatment group. Statistical orthodoxy recommends using the design nevertheless, because only full randomization guarantees valid estimates of residual variance and unbiased estimates of effects. This argument, however, concerns the long-run properties of the procedure and seems of little help in this specific situation. Why should we care if the randomization yields correct estimates under replication of the experiment, if the particular experiment is jeopardized?

Another solution is to create a list of all possible allocations that we would accept and randomly choose one of these allocations for our experiment. The analysis should then reflect this restriction in the possible randomizations, which often renders this approach difficult to implement.

The most pragmatic method is to reject highly undesirable designs and compute a new randomization ( Cox 1958 ) . Undesirable allocations are unlikely to arise for large sample sizes, and we might accept a small bias in estimation for small sample sizes, when uncertainty in the estimated treatment effect is already high. In this approach, whenever we reject a particular outcome, we must also be willing to reject the outcome if we permute the treatment level labels. If we reject eight big and two small mice for kit A, then we must also reject two big and eight small mice. We must also be transparent and report a rejected allocation, so that critics may come to their own conclusions about potential biases and their remedies.
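The following sketch illustrates this rejection-and-rerandomization idea; the body weights and the cutoff of two grams are hypothetical, and the rule is deliberately symmetric in the two groups, so that an allocation and its label-permuted counterpart are rejected together.

```python
import random

def mean_weight(group, weights):
    return sum(weights[m] for m in group) / len(group)

def acceptable(group_a, group_b, weights, max_diff=2.0):
    # Symmetric rejection rule: the groups may not differ in mean
    # body weight by more than max_diff grams, in either direction.
    return abs(mean_weight(group_a, weights) - mean_weight(group_b, weights)) <= max_diff

rng = random.Random(7)
mice = [f"mouse_{i:02d}" for i in range(1, 21)]
weights = {m: rng.gauss(25, 3) for m in mice}   # simulated body weights in grams

attempts = 0
while True:
    attempts += 1
    group_a = rng.sample(mice, 10)
    group_b = [m for m in mice if m not in group_a]
    if acceptable(group_a, group_b, weights):
        break

print(f"allocation accepted after {attempts} draw(s)")
```

Any rejected draws would, as noted above, be reported alongside the final allocation.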

1.5.2 Blinding

Bias in treatment comparisons is also introduced if treatment allocation is random, but responses cannot be measured entirely objectively, or if knowledge of the assigned treatment affects the response. In clinical trials, for example, patients might react differently when they know they are on a placebo treatment, an effect known as cognitive bias . In animal experiments, caretakers might report more abnormal behavior for animals on a more severe treatment. Cognitive bias can be eliminated by concealing the treatment allocation from technicians or participants of a clinical trial, a technique called single-blinding .

If response measures are partially based on professional judgement (such as a clinical scale), the patient or physician might unconsciously report lower scores for a placebo treatment, a phenomenon known as observer bias . Its removal requires double blinding , where treatment allocations are additionally concealed from the experimentalist.

Blinding requires randomized treatment allocation to begin with, and substantial effort might be needed to implement it. Drug companies, for example, have to go to great lengths to ensure that a placebo looks, tastes, and feels similar enough to the actual drug. Additionally, blinding is often done by coding the treatment conditions and samples, and effect sizes and statistical significance are calculated before the code is revealed.
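A simple way to implement such coding is to let a third party hold the key that links sample codes to treatments, as in this hypothetical sketch (sample names and treatments are invented):

```python
import random

samples    = ['s01', 's02', 's03', 's04', 's05', 's06']
treatments = ['drug', 'drug', 'drug', 'placebo', 'placebo', 'placebo']

rng = random.Random(3)
codes = [f'code_{i:03d}' for i in rng.sample(range(1000), len(samples))]

# The key stays with a third party until the analysis is complete;
# the analyst only ever sees the (sample, code) pairs.
blinding_key  = dict(zip(codes, treatments))
coded_samples = list(zip(samples, codes))
print(coded_samples)
```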

In clinical trials, double-blinding creates a conflict of interest: the attending physicians do not know which patient received which treatment, so an accumulation of side-effects cannot be linked to any treatment. For this reason, clinical trials have a data monitoring committee that is not involved in the final analysis and performs interim analyses of efficacy and safety at predefined intervals. If severe problems are detected, the committee might recommend altering or aborting the trial. The same might happen if one treatment already shows overwhelming evidence of superiority, such that it becomes unethical to withhold this treatment from the other patients.

1.5.3 Analysis Plan and Registration

An often overlooked source of bias has been termed the researcher degrees of freedom or garden of forking paths in the data analysis. For any set of data, there are many different options for its analysis: some results might be considered outliers and discarded, assumptions are made about error distributions and appropriate test statistics, and different covariates might be included in a regression model. Often, multiple hypotheses are investigated and tested, and analyses are done separately on various (overlapping) subgroups. Hypotheses formed after looking at the data require additional care in their interpretation; almost never will \(p\) -values for these ad hoc or post hoc hypotheses be statistically justifiable. Many different measured response variables invite fishing expeditions , where patterns in the data are sought without an underlying hypothesis. Only reporting those sub-analyses that gave ‘interesting’ findings invariably leads to biased conclusions and is called cherry-picking or \(p\) -hacking (or much less flattering names).

The statistical analysis is always part of a larger scientific argument, and we should consider the necessary computations in relation to building our scientific argument about the interpretation of the data. In addition to the statistical calculations, this interpretation requires substantial subject-matter knowledge and includes (many) non-statistical arguments. Two quotes highlight that experiment and analysis are a means to an end and not ends in themselves.

There is a boundary in data interpretation beyond which formulas and quantitative decision procedures do not go, where judgment and style enter. ( Abelson 1995 )
Often, perfectly reasonable people come to perfectly reasonable decisions or conclusions based on nonstatistical evidence. Statistical analysis is a tool with which we support reasoning. It is not a goal in itself. ( Bailar III 1981 )

There is often a grey area between exploiting researcher degrees of freedom to arrive at a desired conclusion, and creative yet informed analyses of data. One way to navigate this area is to distinguish between exploratory studies and confirmatory studies . The former have no clearly stated scientific question, but are used to generate interesting hypotheses by identifying potential associations or effects that are then further investigated. Conclusions from these studies are very tentative and must be reported honestly as such. In contrast, standards are much higher for confirmatory studies, which investigate a specific predefined scientific question. Analysis plans and pre-registration of an experiment are accepted means for demonstrating lack of bias due to researcher degrees of freedom, and separating primary from secondary analyses allows emphasizing the main goals of the study.

Analysis Plan

The analysis plan is written before conducting the experiment and details the measurands and estimands, the hypotheses to be tested together with a power and sample size calculation, a discussion of relevant effect sizes, detection and handling of outliers and missing data, as well as steps for data normalization such as transformations and baseline corrections. If a regression model is required, its factors and covariates are outlined. Particularly in biology, handling measurements below the limit of quantification and saturation effects require careful consideration.
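For instance, the power and sample size calculation for comparing two group means might look as follows; the effect size, significance level, and target power are hypothetical planning values, and the sketch uses the statsmodels package.

```python
from statsmodels.stats.power import TTestIndPower

# Hypothetical planning values: a standardized effect size of 0.8
# (Cohen's d), a 5% significance level, and 80% target power.
analysis = TTestIndPower()
n_per_group = analysis.solve_power(effect_size=0.8, alpha=0.05, power=0.80,
                                   alternative='two-sided')
print(f"required sample size per group: {n_per_group:.1f}")  # about 26
```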

In the context of clinical trials, the problem of estimands has become a recent focus of attention. An estimand is the target of a statistical estimation procedure, for example the true average difference in enzyme levels between the two preparation kits. A main problem in many studies is post-randomization events that can change the estimand, even if the estimation procedure remains the same. For example, if kit B fails to produce usable samples for measurement in five out of ten cases because the enzyme level was too low, while kit A could handle these enzyme levels perfectly fine, then this might severely exaggerate the observed difference between the two kits. Similar problems arise in drug trials, when some patients stop taking one of the drugs due to side-effects or other complications.

Registration

Registration of experiments is an even more stringent measure used in conjunction with an analysis plan and is becoming standard in clinical trials. Here, information about the trial, including the analysis plan, the procedure to recruit patients, and stopping criteria, is registered in a public database. Publications based on the trial then refer to this registration, such that reviewers and readers can compare what the researchers intended to do and what they actually did. Similar portals for pre-clinical and translational research are also available.

1.6 Notes and Summary

The problem of measurements and measurands is further discussed for statistics in Hand ( 1996 ) and specifically for biological experiments in Coxon, Longstaff, and Burns ( 2019 ) . A general review of methods for handling missing data is Dong and Peng ( 2013 ) . The different roles of randomization are emphasized in Cox ( 2009 ) .

Two well-known reporting guidelines are the ARRIVE guidelines for animal research ( Kilkenny et al. 2010 ) and the CONSORT guidelines for clinical trials ( Moher et al. 2010 ) . Guidelines describing the minimal information required for reproducing experimental results have been developed for many types of experimental techniques, including microarrays (MIAME), RNA sequencing (MINSEQE), metabolomics (MSI) and proteomics (MIAPE) experiments; the FAIRSHARE initiative provides a more comprehensive collection ( Sansone et al. 2019 ) .

The problems of experimental design in animal experiments and particularly translational research are discussed in Couzin-Frankel ( 2013 ) . Multi-center studies are now considered for these investigations, and using a second laboratory already increases reproducibility substantially ( Richter et al. 2010 ; Richter 2017 ; Voelkl et al. 2018 ; Karp 2018 ) and allows standardizing the treatment effects ( Kafkafi et al. 2017 ) . First attempts are reported of using designs similar to clinical trials ( Llovera and Liesz 2016 ) . Exploratory-confirmatory research and external validity for animal studies are discussed in Kimmelman, Mogil, and Dirnagl ( 2014 ) and Pound and Ritskes-Hoitinga ( 2018 ) . Further information on pilot studies is found in Moore et al. ( 2011 ) , Sim ( 2019 ) , and Thabane et al. ( 2010 ) .

The deliberate use of statistical analyses and their interpretation for supporting a larger argument was called statistics as principled argument ( Abelson 1995 ) . Employing useless statistical analysis without reference to the actual scientific question is surrogate science ( Gigerenzer and Marewski 2014 ) and adaptive thinking is integral to meaningful statistical analysis ( Gigerenzer 2002 ) .

In an experiment, the investigator has full control over the experimental conditions applied to the experimental material. The experimental design gives the logical structure of an experiment: the units describing the organization of the experimental material, the treatments and their allocation to units, and the response. Statistical design of experiments includes techniques to ensure internal validity of an experiment, and methods to make inference from experimental data efficient.


Experimental Design – Types, Methods, Guide


Experimental design is a process of planning and conducting scientific experiments to investigate a hypothesis or research question. It involves carefully designing an experiment that can test the hypothesis, and controlling for other variables that may influence the results.

Experimental design typically includes identifying the variables that will be manipulated or measured, defining the sample or population to be studied, selecting an appropriate method of sampling, choosing a method for data collection and analysis, and determining the appropriate statistical tests to use.

Types of Experimental Design

Here are the different types of experimental design:

Completely Randomized Design

In this design, participants are randomly assigned to one of two or more groups, and each group is exposed to a different treatment or condition.

Randomized Block Design

This design involves dividing participants into blocks based on a specific characteristic, such as age or gender, and then randomly assigning participants within each block to one of two or more treatment groups.
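A minimal sketch of such a block randomization, with hypothetical participants blocked by age group:

```python
import random

# Hypothetical participants grouped into blocks by age.
blocks = {
    'young': ['p01', 'p02', 'p03', 'p04'],
    'old':   ['p05', 'p06', 'p07', 'p08'],
}

rng = random.Random(42)
allocation = {}
for block, members in blocks.items():
    shuffled = members[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    # Within each block, half go to treatment and half to control.
    for p in shuffled[:half]:
        allocation[p] = 'treatment'
    for p in shuffled[half:]:
        allocation[p] = 'control'

print(allocation)
```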

Factorial Design

In a factorial design, participants are randomly assigned to one of several groups, each of which receives a different combination of two or more independent variables.

Repeated Measures Design

In this design, each participant is exposed to all of the different treatments or conditions, either in a random order or in a predetermined order.

Crossover Design

This design involves randomly assigning participants to one of two or more treatment groups, with each group receiving one treatment during the first phase of the study and then switching to a different treatment during the second phase.

Split-plot Design

In a split-plot design, one factor (often one that is hard to change) is applied to larger experimental units, called whole plots, and a second factor is applied to subunits within each whole plot, with randomization carried out separately at the two levels.

Nested Design

This design involves grouping participants within larger units, such as schools or households, and then randomly assigning these units to different treatment groups.

Laboratory Experiment

Laboratory experiments are conducted under controlled conditions, which allows for greater precision and accuracy. However, because laboratory conditions are not always representative of real-world conditions, the results of these experiments may not be generalizable to the population at large.

Field Experiment

Field experiments are conducted in naturalistic settings and allow for more realistic observations. However, because field experiments are not as controlled as laboratory experiments, they may be subject to more sources of error.

Experimental Design Methods

Experimental design methods refer to the techniques and procedures used to design and conduct experiments in scientific research. Here are some common experimental design methods:

Randomization

This involves randomly assigning participants to different groups or treatments to ensure that any observed differences between groups are due to the treatment and not to other factors.

Control Group

The use of a control group is an important experimental design method that involves having a group of participants that do not receive the treatment or intervention being studied. The control group is used as a baseline to compare the effects of the treatment group.

Blinding

Blinding involves keeping participants, researchers, or both unaware of which treatment group participants are in, in order to reduce the risk of bias in the results.

Counterbalancing

This involves systematically varying the order in which participants receive treatments or interventions in order to control for order effects.
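One common scheme is complete counterbalancing, where every possible treatment order is used equally often; a sketch with hypothetical participants and treatments:

```python
from itertools import permutations

treatments = ['A', 'B', 'C']
orders = list(permutations(treatments))        # all 6 possible orders

# Cycle through the orders so each is used equally often.
participants = [f'p{i:02d}' for i in range(1, 13)]
schedule = {p: orders[i % len(orders)] for i, p in enumerate(participants)}
print(schedule['p01'], schedule['p07'])        # the cycle repeats every 6 participants
```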

Replication

Replication involves conducting the same experiment with different samples or under different conditions to increase the reliability and validity of the results.

Factorial Design

This experimental design method involves manipulating multiple independent variables simultaneously to investigate their combined effects on the dependent variable.

Blocking

This involves dividing participants into subgroups or blocks based on specific characteristics, such as age or gender, in order to reduce the risk of confounding variables.

Data Collection Method

Experimental design data collection methods are techniques and procedures used to collect data in experimental research. Here are some common experimental design data collection methods:

Direct Observation

This method involves observing and recording the behavior or phenomenon of interest in real time. It may involve the use of structured or unstructured observation, and may be conducted in a laboratory or naturalistic setting.

Self-report Measures

Self-report measures involve asking participants to report their thoughts, feelings, or behaviors using questionnaires, surveys, or interviews. These measures may be administered in person or online.

Behavioral Measures

Behavioral measures involve measuring participants’ behavior directly, such as through reaction time tasks or performance tests. These measures may be administered using specialized equipment or software.

Physiological Measures

Physiological measures involve measuring participants’ physiological responses, such as heart rate, blood pressure, or brain activity, using specialized equipment. These measures may be invasive or non-invasive, and may be administered in a laboratory or clinical setting.

Archival Data

Archival data involves using existing records or data, such as medical records, administrative records, or historical documents, as a source of information. These data may be collected from public or private sources.

Computerized Measures

Computerized measures involve using software or computer programs to collect data on participants’ behavior or responses. These measures may include reaction time tasks, cognitive tests, or other types of computer-based assessments.

Video Recording

Video recording involves recording participants’ behavior or interactions using cameras or other recording equipment. This method can be used to capture detailed information about participants’ behavior or to analyze social interactions.

Data Analysis Method

Experimental design data analysis methods refer to the statistical techniques and procedures used to analyze data collected in experimental research. Here are some common experimental design data analysis methods:

Descriptive Statistics

Descriptive statistics are used to summarize and describe the data collected in the study. This includes measures such as mean, median, mode, range, and standard deviation.
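For example, with hypothetical scores from two groups, pandas can produce these summaries in one line:

```python
import pandas as pd

# Hypothetical posttest scores for a treatment and a control group.
df = pd.DataFrame({
    'group': ['treatment'] * 5 + ['control'] * 5,
    'score': [78, 85, 92, 88, 81, 70, 74, 69, 77, 72],
})
print(df.groupby('group')['score'].describe())  # count, mean, std, quartiles, ...
```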

Inferential Statistics

Inferential statistics are used to make inferences or generalizations about a larger population based on the data collected in the study. This includes hypothesis testing and estimation.

Analysis of Variance (ANOVA)

ANOVA is a statistical technique used to compare means across two or more groups in order to determine whether there are significant differences between the groups. There are several types of ANOVA, including one-way ANOVA, two-way ANOVA, and repeated measures ANOVA.
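A one-way ANOVA on hypothetical data for three groups, using scipy:

```python
from scipy import stats

# Hypothetical outcome measurements for three treatment groups.
group1 = [4.1, 5.2, 4.8, 5.5, 4.9]
group2 = [5.9, 6.3, 5.7, 6.8, 6.1]
group3 = [4.5, 4.9, 5.1, 4.7, 5.3]

f_stat, p_value = stats.f_oneway(group1, group2, group3)
print(f"one-way ANOVA: F = {f_stat:.2f}, p = {p_value:.4f}")
```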

Regression Analysis

Regression analysis is used to model the relationship between two or more variables in order to determine the strength and direction of the relationship. There are several types of regression analysis, including linear regression, logistic regression, and multiple regression.
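A simple linear regression on hypothetical data, using the statsmodels library:

```python
import statsmodels.api as sm

# Hypothetical data: does hours of instruction predict test score?
hours  = [1, 2, 3, 4, 5, 6, 7, 8]
scores = [52, 55, 61, 64, 70, 72, 78, 83]

X = sm.add_constant(hours)        # add an intercept term
model = sm.OLS(scores, X).fit()
print(model.params)               # estimated intercept and slope
print(model.pvalues)
```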

Factor Analysis

Factor analysis is used to identify underlying factors or dimensions in a set of variables. This can be used to reduce the complexity of the data and identify patterns in the data.

Structural Equation Modeling (SEM)

SEM is a statistical technique used to model complex relationships between variables. It can be used to test complex theories and models of causality.

Cluster Analysis

Cluster analysis is used to group similar cases or observations together based on similarities or differences in their characteristics.

Time Series Analysis

Time series analysis is used to analyze data collected over time in order to identify trends, patterns, or changes in the data.

Multilevel Modeling

Multilevel modeling is used to analyze data that is nested within multiple levels, such as students nested within schools or employees nested within companies.
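A random-intercept model for hypothetical students nested within schools, again using statsmodels; all data below are simulated for illustration:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulate hypothetical nested data: 20 students in each of 10 schools.
rng = np.random.default_rng(0)
n_schools, n_students = 10, 20
df = pd.DataFrame({
    'school': np.repeat(np.arange(n_schools), n_students),
    'treatment': np.tile([0, 1], n_schools * n_students // 2),
})
school_effect = rng.normal(0, 2, n_schools)[df['school']]
df['score'] = 70 + 5 * df['treatment'] + school_effect + rng.normal(0, 5, len(df))

# A random intercept per school accounts for the nesting.
model = smf.mixedlm('score ~ treatment', df, groups=df['school']).fit()
print(model.summary())
```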

Applications of Experimental Design 

Experimental design is a versatile research methodology that can be applied in many fields. Here are some applications of experimental design:

  • Medical Research: Experimental design is commonly used to test new treatments or medications for various medical conditions. This includes clinical trials to evaluate the safety and effectiveness of new drugs or medical devices.
  • Agriculture : Experimental design is used to test new crop varieties, fertilizers, and other agricultural practices. This includes randomized field trials to evaluate the effects of different treatments on crop yield, quality, and pest resistance.
  • Environmental science: Experimental design is used to study the effects of environmental factors, such as pollution or climate change, on ecosystems and wildlife. This includes controlled experiments to study the effects of pollutants on plant growth or animal behavior.
  • Psychology : Experimental design is used to study human behavior and cognitive processes. This includes experiments to test the effects of different interventions, such as therapy or medication, on mental health outcomes.
  • Engineering : Experimental design is used to test new materials, designs, and manufacturing processes in engineering applications. This includes laboratory experiments to test the strength and durability of new materials, or field experiments to test the performance of new technologies.
  • Education : Experimental design is used to evaluate the effectiveness of teaching methods, educational interventions, and programs. This includes randomized controlled trials to compare different teaching methods or evaluate the impact of educational programs on student outcomes.
  • Marketing : Experimental design is used to test the effectiveness of marketing campaigns, pricing strategies, and product designs. This includes experiments to test the impact of different marketing messages or pricing schemes on consumer behavior.

Examples of Experimental Design 

Here are some examples of experimental design in different fields:

  • Example in Medical research : A study that investigates the effectiveness of a new drug treatment for a particular condition. Patients are randomly assigned to either a treatment group or a control group, with the treatment group receiving the new drug and the control group receiving a placebo. The outcomes, such as improvement in symptoms or side effects, are measured and compared between the two groups.
  • Example in Education research: A study that examines the impact of a new teaching method on student learning outcomes. Students are randomly assigned to either a group that receives the new teaching method or a group that receives the traditional teaching method. Student achievement is measured before and after the intervention, and the results are compared between the two groups.
  • Example in Environmental science: A study that tests the effectiveness of a new method for reducing pollution in a river. Two sections of the river are selected, with one section treated with the new method and the other section left untreated. The water quality is measured before and after the intervention, and the results are compared between the two sections.
  • Example in Marketing research: A study that investigates the impact of a new advertising campaign on consumer behavior. Participants are randomly assigned to either a group that is exposed to the new campaign or a group that is not. Their behavior, such as purchasing or product awareness, is measured and compared between the two groups.
  • Example in Social psychology: A study that examines the effect of a new social intervention on reducing prejudice towards a marginalized group. Participants are randomly assigned to either a group that receives the intervention or a control group that does not. Their attitudes and behavior towards the marginalized group are measured before and after the intervention, and the results are compared between the two groups.

When to use Experimental Research Design 

Experimental research design should be used when a researcher wants to establish a cause-and-effect relationship between variables. It is particularly useful when studying the impact of an intervention or treatment on a particular outcome.

Here are some situations where experimental research design may be appropriate:

  • When studying the effects of a new drug or medical treatment: Experimental research design is commonly used in medical research to test the effectiveness and safety of new drugs or medical treatments. By randomly assigning patients to treatment and control groups, researchers can determine whether the treatment is effective in improving health outcomes.
  • When evaluating the effectiveness of an educational intervention: An experimental research design can be used to evaluate the impact of a new teaching method or educational program on student learning outcomes. By randomly assigning students to treatment and control groups, researchers can determine whether the intervention is effective in improving academic performance.
  • When testing the effectiveness of a marketing campaign: An experimental research design can be used to test the effectiveness of different marketing messages or strategies. By randomly assigning participants to treatment and control groups, researchers can determine whether the marketing campaign is effective in changing consumer behavior.
  • When studying the effects of an environmental intervention: Experimental research design can be used to study the impact of environmental interventions, such as pollution reduction programs or conservation efforts. By randomly assigning locations or areas to treatment and control groups, researchers can determine whether the intervention is effective in improving environmental outcomes.
  • When testing the effects of a new technology: An experimental research design can be used to test the effectiveness and safety of new technologies or engineering designs. By randomly assigning participants or locations to treatment and control groups, researchers can determine whether the new technology is effective in achieving its intended purpose.

How to Conduct Experimental Research

Here are the steps to conduct Experimental Research:

  • Identify a Research Question : Start by identifying a research question that you want to answer through the experiment. The question should be clear, specific, and testable.
  • Develop a Hypothesis: Based on your research question, develop a hypothesis that predicts the relationship between the independent and dependent variables. The hypothesis should be clear and testable.
  • Design the Experiment : Determine the type of experimental design you will use, such as a between-subjects design or a within-subjects design. Also, decide on the experimental conditions, such as the number of independent variables, the levels of the independent variable, and the dependent variable to be measured.
  • Select Participants: Select the participants who will take part in the experiment. They should be representative of the population you are interested in studying.
  • Randomly Assign Participants to Groups: If you are using a between-subjects design, randomly assign participants to groups to control for individual differences.
  • Conduct the Experiment : Conduct the experiment by manipulating the independent variable(s) and measuring the dependent variable(s) across the different conditions.
  • Analyze the Data: Analyze the data using appropriate statistical methods to determine if there is a significant effect of the independent variable(s) on the dependent variable(s).
  • Draw Conclusions: Based on the data analysis, draw conclusions about the relationship between the independent and dependent variables. If the results support the hypothesis, it is retained; if they do not, it is rejected.
  • Communicate the Results: Finally, communicate the results of the experiment through a research report or presentation. Include the purpose of the study, the methods used, the results obtained, and the conclusions drawn.

Purpose of Experimental Design 

The purpose of experimental design is to control and manipulate one or more independent variables to determine their effect on a dependent variable. Experimental design allows researchers to systematically investigate causal relationships between variables, and to establish cause-and-effect relationships between the independent and dependent variables. Through experimental design, researchers can test hypotheses and make inferences about the population from which the sample was drawn.

Experimental design provides a structured approach to designing and conducting experiments, ensuring that the results are reliable and valid. By carefully controlling for extraneous variables that may affect the outcome of the study, experimental design allows researchers to isolate the effect of the independent variable(s) on the dependent variable(s), and to minimize the influence of other factors that may confound the results.

Experimental design also allows researchers to generalize their findings to the larger population from which the sample was drawn. By randomly selecting participants and using statistical techniques to analyze the data, researchers can make inferences about the larger population with a high degree of confidence.

Overall, the purpose of experimental design is to provide a rigorous, systematic, and scientific method for testing hypotheses and establishing cause-and-effect relationships between variables. Experimental design is a powerful tool for advancing scientific knowledge and informing evidence-based practice in various fields, including psychology, biology, medicine, engineering, and social sciences.

Advantages of Experimental Design 

Experimental design offers several advantages in research. Here are some of the main advantages:

  • Control over extraneous variables: Experimental design allows researchers to control for extraneous variables that may affect the outcome of the study. By manipulating the independent variable and holding all other variables constant, researchers can isolate the effect of the independent variable on the dependent variable.
  • Establishing causality: Experimental design allows researchers to establish causality by manipulating the independent variable and observing its effect on the dependent variable. This allows researchers to determine whether changes in the independent variable cause changes in the dependent variable.
  • Replication : Experimental design allows researchers to replicate their experiments to ensure that the findings are consistent and reliable. Replication is important for establishing the validity and generalizability of the findings.
  • Random assignment: Experimental design often involves randomly assigning participants to conditions. This helps to ensure that individual differences between participants are evenly distributed across conditions, which increases the internal validity of the study.
  • Precision : Experimental design allows researchers to measure variables with precision, which can increase the accuracy and reliability of the data.
  • Generalizability : If the study is well-designed, experimental design can increase the generalizability of the findings. By controlling for extraneous variables and using random assignment, researchers can increase the likelihood that the findings will apply to other populations and contexts.

Limitations of Experimental Design

Experimental design has some limitations that researchers should be aware of. Here are some of the main limitations:

  • Artificiality : Experimental design often involves creating artificial situations that may not reflect real-world situations. This can limit the external validity of the findings, or the extent to which the findings can be generalized to real-world settings.
  • Ethical concerns: Some experimental designs may raise ethical concerns, particularly if they involve manipulating variables that could cause harm to participants or if they involve deception.
  • Participant bias : Participants in experimental studies may modify their behavior in response to the experiment, which can lead to participant bias.
  • Limited generalizability: The conditions of the experiment may not reflect the complexities of real-world situations. As a result, the findings may not be applicable to all populations and contexts.
  • Cost and time : Experimental design can be expensive and time-consuming, particularly if the experiment requires specialized equipment or if the sample size is large.
  • Researcher bias : Researchers may unintentionally bias the results of the experiment if they have expectations or preferences for certain outcomes.
  • Lack of feasibility : Experimental design may not be feasible in some cases, particularly if the research question involves variables that cannot be manipulated or controlled.



Statistical Methods for Experimental Research in Education and Psychology

  • © 2019
  • Jimmie Leppink (ORCID: https://orcid.org/0000-0002-8713-1374 ), Hull York Medical School, University of York, York, UK

  • Focuses on experimental research
  • Uses examples from a wide variety of statistical software, including emerging zero-cost Open Source packages such as JASP and Jamovi
  • Bridges the two disciplines education and psychology in common theory, experimental designs, and statistical methods
  • Provides statistical analysis plans that fit a wide range of experimental research questions and designs
  • Unites traditional and emerging approaches to statistical testing and estimation

Part of the book series: Springer Texts in Education (SPTE)


About this book

This book focuses on experimental research in two disciplines that have a lot of common ground in terms of theory, experimental designs used, and methods for the analysis of experimental research data: education and psychology. Although the methods covered in this book are also frequently used in many other disciplines, including sociology and medicine, the examples in this book come from contemporary research topics in education and psychology. Various statistical packages, commercial and zero-cost Open Source ones, are used.

The goal of this book is neither to cover all possible statistical methods out there nor to focus on a particular statistical software package. There are many excellent statistics textbooks on the market that present both basic and advanced concepts at an introductory level and/or provide a very detailed overview of options in a particular statistical software programme. This is not yet another book in that genre.  

The core theme of this book is a heuristic called the question-design-analysis bridge: there is a bridge connecting research questions and hypotheses, experimental design and sampling procedures, and common statistical methods in that context. Each statistical method is discussed in the concrete context of a set of research questions with directed (one-sided) or undirected (two-sided) hypotheses and an experimental setup in line with these questions and hypotheses. Therefore, the titles of the chapters in this book do not include any names of statistical methods such as ‘analysis of variance’ or ‘analysis of covariance’. In a total of seventeen chapters, this book covers a wide range of research questions that call for experimental designs and statistical methods, fairly basic or more advanced.



Table of contents (17 chapters)

The chapters, authored by Jimmie Leppink, cover: the question-design-analysis bridge; statistical testing and estimation; measurement and quality criteria; dealing with missing data; types of outcome variables (dichotomous, multicategory nominal, ordinal, and quantitative); types of comparisons (common approaches to multiple testing, directed hypotheses and planned comparisons, two-way and three-way factorial designs, and factor-covariate combinations); and multilevel designs (interaction between participants, two or more raters, and group-by-time interactions).

Bibliographic information

  • Book title: Statistical Methods for Experimental Research in Education and Psychology
  • Author: Jimmie Leppink
  • Series title: Springer Texts in Education
  • DOI: https://doi.org/10.1007/978-3-030-21241-4
  • Publisher: Springer Cham
  • eBook packages: Education, Education (R0)
  • Copyright: The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2019
  • Softcover ISBN: 978-3-030-21240-7 (published 12 June 2019)
  • eBook ISBN: 978-3-030-21241-4 (published 30 May 2019)
  • Series ISSN: 2366-7672
  • Series E-ISSN: 2366-7680
  • Edition number: 1
  • Number of pages: XXVI, 289
  • Number of illustrations: 9 b/w illustrations, 48 illustrations in colour
  • Topics: Educational Psychology; Learning & Instruction; Statistics for Social Sciences, Humanities, Law; Experimental Psychology



10 Experimental research

Experimental research—often considered to be the ‘gold standard’ in research designs—is one of the most rigorous of all research designs. In this design, one or more independent variables are manipulated by the researcher (as treatments), subjects are randomly assigned to different treatment levels (random assignment), and the results of the treatments on outcomes (dependent variables) are observed. The unique strength of experimental research is its internal validity (causality) due to its ability to link cause and effect through treatment manipulation, while controlling for the spurious effects of extraneous variables.

Experimental research is best suited for explanatory research—rather than for descriptive or exploratory research—where the goal of the study is to examine cause-effect relationships. It also works well for research that involves a relatively limited and well-defined set of independent variables that can either be manipulated or controlled. Experimental research can be conducted in laboratory or field settings. Laboratory experiments , conducted in laboratory (artificial) settings, tend to be high in internal validity, but this comes at the cost of low external validity (generalisability), because the artificial (laboratory) setting in which the study is conducted may not reflect the real world. Field experiments are conducted in field settings such as in a real organisation, and are high in both internal and external validity. But such experiments are relatively rare, because of the difficulties associated with manipulating treatments and controlling for extraneous effects in a field setting.

Experimental research can be grouped into two broad categories: true experimental designs and quasi-experimental designs. Both designs require treatment manipulation, but while true experiments also require random assignment, quasi-experiments do not. Sometimes, we also refer to non-experimental research, which is not really a research design, but an all-inclusive term that includes all types of research that do not employ treatment manipulation or random assignment, such as survey research, observational research, and correlational studies.

Basic concepts

Treatment and control groups. In experimental research, some subjects are administered one or more experimental stimulus called a treatment (the treatment group ) while other subjects are not given such a stimulus (the control group ). The treatment may be considered successful if subjects in the treatment group rate more favourably on outcome variables than control group subjects. Multiple levels of experimental stimulus may be administered, in which case, there may be more than one treatment group. For example, in order to test the effects of a new drug intended to treat a certain medical condition like dementia, if a sample of dementia patients is randomly divided into three groups, with the first group receiving a high dosage of the drug, the second group receiving a low dosage, and the third group receiving a placebo such as a sugar pill (control group), then the first two groups are experimental groups and the third group is a control group. After administering the drug for a period of time, if the condition of the experimental group subjects improved significantly more than the control group subjects, we can say that the drug is effective. We can also compare the conditions of the high and low dosage experimental groups to determine if the high dose is more effective than the low dose.

Treatment manipulation. Treatments are the unique feature of experimental research that sets this design apart from all other research methods. Treatment manipulation helps control for the ‘cause’ in cause-effect relationships. Naturally, the validity of experimental research depends on how well the treatment was manipulated. Treatment manipulation must be checked using pretests and pilot tests prior to the experimental study. Any measurements conducted before the treatment is administered are called pretest measures , while those conducted after the treatment are posttest measures .

Random selection and assignment. Random selection is the process of randomly drawing a sample from a population or a sampling frame. This approach is typically employed in survey research, and ensures that each unit in the population has a positive chance of being selected into the sample. Random assignment, however, is a process of randomly assigning subjects to experimental or control groups. This is a standard practice in true experimental research to ensure that treatment groups are similar (equivalent) to each other and to the control group prior to treatment administration. Random selection is related to sampling, and is therefore more closely related to the external validity (generalisability) of findings. However, random assignment is related to design, and is therefore most related to internal validity. It is possible to have both random selection and random assignment in well-designed experimental research, but quasi-experimental research involves neither random selection nor random assignment.

Threats to internal validity. Although experimental designs are considered more rigorous than other research methods in terms of the internal validity of their inferences (by virtue of their ability to control causes through treatment manipulation), they are not immune to internal validity threats. Some of these threats to internal validity are described below, within the context of a study of the impact of a special remedial math tutoring program for improving the math abilities of high school students.

History threat is the possibility that the observed effects (dependent variables) are caused by extraneous or historical events rather than by the experimental treatment. For instance, students’ post-remedial math score improvement may have been caused by their preparation for a math exam at their school, rather than the remedial math program.

Maturation threat refers to the possibility that observed effects are caused by natural maturation of subjects (e.g., a general improvement in their intellectual ability to understand complex concepts) rather than the experimental treatment.

Testing threat is a threat in pre-post designs where subjects’ posttest responses are conditioned by their pretest responses. For instance, if students remember their answers from the pretest evaluation, they may tend to repeat them in the posttest exam. Not conducting a pretest can help avoid this threat.

Instrumentation threat , which also occurs in pre-post designs, refers to the possibility that the difference between pretest and posttest scores is not due to the remedial math program, but due to changes in the administered test, such as the posttest having a higher or lower degree of difficulty than the pretest.

Mortality threat refers to the possibility that subjects may be dropping out of the study at differential rates between the treatment and control groups due to a systematic reason, such that the dropouts were mostly students who scored low on the pretest. If the low-performing students drop out, the results of the posttest will be artificially inflated by the preponderance of high-performing students.

Regression threat —also called a regression to the mean—refers to the statistical tendency of a group’s overall performance to regress toward the mean during a posttest rather than in the anticipated direction. For instance, if subjects scored high on a pretest, they will have a tendency to score lower on the posttest (closer to the mean) because their high scores (away from the mean) during the pretest were possibly a statistical aberration. This problem tends to be more prevalent in non-random samples and when the two measures are imperfectly correlated.

Two-group experimental designs

In the standard design notation used in the figures below, R denotes random assignment, O an observation or measurement, and X the treatment.

Pretest-posttest control group design . In this design, subjects are randomly assigned to treatment and control groups, subjected to an initial (pretest) measurement of the dependent variables of interest, the treatment group is administered a treatment (representing the independent variable of interest), and the dependent variables measured again (posttest). The notation of this design is shown in Figure 10.1.

Figure 10.1: Pretest-posttest control group design

Statistical analysis of this design involves a simple analysis of variance (ANOVA) between the treatment and control groups. The pretest-posttest design handles several threats to internal validity, such as maturation, testing, and regression, since these threats can be expected to influence both treatment and control groups in a similar (random) manner. The selection threat is controlled via random assignment. However, additional threats to internal validity may exist. For instance, mortality can be a problem if there are differential dropout rates between the two groups, and the pretest measurement may bias the posttest measurement—especially if the pretest introduces unusual topics or content.
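One common way to carry out this comparison is a one-way ANOVA on the gain scores (posttest minus pretest); with two groups, this is equivalent to a two-sample t-test. The data in this sketch are hypothetical.

```python
from scipy import stats

# Hypothetical pretest/posttest scores for randomly assigned groups.
treat_pre,   treat_post   = [60, 58, 63, 61, 59], [72, 70, 74, 71, 69]
control_pre, control_post = [61, 59, 62, 60, 58], [63, 62, 64, 61, 60]

treat_gain   = [post - pre for pre, post in zip(treat_pre, treat_post)]
control_gain = [post - pre for pre, post in zip(control_pre, control_post)]

# One-way ANOVA on the gain scores; with two groups, F = t^2.
f_stat, p_value = stats.f_oneway(treat_gain, control_gain)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
```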

Posttest-only control group design . This design is a simpler version of the pretest-posttest design where pretest measurements are omitted. The design notation is shown in Figure 10.2.

Figure 10.2: Posttest-only control group design

The treatment effect is measured simply as the difference in the posttest scores between the two groups:

\[E = (O_{1} - O_{2})\,.\]

The appropriate statistical analysis of this design is also a two-group analysis of variance (ANOVA). The simplicity of this design makes it more attractive than the pretest-posttest design in terms of internal validity. This design controls for maturation, testing, regression, selection, and pretest-posttest interaction, though the mortality threat may continue to exist.

Covariance design . Measurements of dependent variables may be influenced by extraneous variables called covariates , which are not of central interest to the study but should nevertheless be controlled so that the effect of the independent variable can be detected more accurately. In a covariance design, such a covariate (for instance, a pretest measure that is not itself the dependent variable) is recorded in addition to the treatment and the posttest.

Because the pretest measure is not a measurement of the dependent variable, but rather a covariate, the treatment effect is measured as the difference in the posttest scores between the treatment and control groups as:

\[E = (O_{1} - O_{2})\,.\]

Due to the presence of covariates, the appropriate statistical analysis of this design is a two-group analysis of covariance (ANCOVA). This design has all the advantages of the posttest-only design, but with improved internal validity due to the control of covariates. Covariance designs can also be extended to pretest-posttest control group designs.
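ANCOVA can be fitted as a linear model in which the posttest is regressed on the group indicator while adjusting for the covariate; a sketch with hypothetical data, using statsmodels:

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical covariance-design data: 'pre' is the covariate.
df = pd.DataFrame({
    'group': ['treatment'] * 5 + ['control'] * 5,
    'pre':   [60, 58, 63, 61, 59, 61, 59, 62, 60, 58],
    'post':  [72, 70, 74, 71, 69, 63, 62, 64, 61, 60],
})

# Posttest regressed on group, adjusting for the pretest covariate.
model = smf.ols('post ~ C(group) + pre', data=df).fit()
print(model.summary().tables[1])   # adjusted group effect and covariate slope
```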

Factorial designs

Two-group designs are inadequate if your research requires manipulation of two or more independent variables (treatments). In such cases, you would need four or higher-group designs. Such designs, quite popular in experimental research, are commonly called factorial designs. Each independent variable in this design is called a factor , and each subdivision of a factor is called a level . Factorial designs enable the researcher to examine not only the individual effect of each treatment on the dependent variables (called main effects), but also their joint effect (called interaction effects).

The simplest factorial design is a \(2 \times 2\) design, with two factors, each at two levels. For instance, a study of learning outcomes might cross instructional type (classroom versus online) with instructional time (1.5 versus 3 hours/week), yielding four treatment combinations.

In a factorial design, a main effect is said to exist if the dependent variable shows a significant difference between multiple levels of one factor, at all levels of other factors. No change in the dependent variable across factor levels is the null case (baseline), from which main effects are evaluated. In the above example, you may see a main effect of instructional type, instructional time, or both on learning outcomes. An interaction effect exists when the effect of differences in one factor depends upon the level of a second factor. In our example, if the effect of instructional type on learning outcomes is greater for three hours/week of instructional time than for one and a half hours/week, then we can say that there is an interaction effect between instructional type and instructional time on learning outcomes. Note that interaction effects dominate the interpretation: when an interaction effect is significant, it is not meaningful to interpret the main effects on their own.
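A two-way ANOVA with an interaction term makes this concrete; the \(2 \times 2\) data below are hypothetical, and the interaction row of the resulting table should be inspected before either main effect is interpreted.

```python
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

# Hypothetical 2x2 factorial data: instructional type x instructional time.
df = pd.DataFrame({
    'itype': ['online'] * 4 + ['classroom'] * 4,
    'time':  ['1.5h', '1.5h', '3h', '3h'] * 2,
    'score': [64, 66, 71, 73, 68, 70, 82, 85],
})

# Fit the full factorial model, including the interaction.
model = smf.ols('score ~ C(itype) * C(time)', data=df).fit()
print(anova_lm(model, typ=2))
```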

Hybrid experimental designs

Hybrid designs are those that are formed by combining features of more established designs. Three such hybrid designs are the randomised block design, the Solomon four-group design, and the switched replication design.

Randomised block design. This is a variation of the posttest-only or pretest-posttest control group design where the subject population can be grouped into relatively homogeneous subgroups (called blocks ) within which the experiment is replicated. For instance, if you want to replicate the same posttest-only design among university students and full-time working professionals (two homogeneous blocks), subjects in both blocks are randomly split between the treatment group (receiving the same treatment) and the control group (see Figure 10.5). The purpose of this design is to reduce the ‘noise’ or variance in data that may be attributable to differences between the blocks so that the actual effect of interest can be detected more accurately.

Figure 10.5: Randomised block design

Solomon four-group design . In this design, the sample is divided into two treatment groups and two control groups. One treatment group and one control group receive the pretest, and the other two groups do not. This design represents a combination of posttest-only and pretest-posttest control group design, and is intended to test for the potential biasing effect of pretest measurement on posttest measures that tends to occur in pretest-posttest designs, but not in posttest-only designs. The design notation is shown in Figure 10.6.

Figure 10.6: Solomon four-group design

Switched replication design . This is a two-group design implemented in two phases with three waves of measurement. The treatment group in the first phase serves as the control group in the second phase, and the control group in the first phase becomes the treatment group in the second phase, as illustrated in Figure 10.7. In other words, the original design is repeated or replicated temporally with treatment/control roles switched between the two groups. By the end of the study, all participants will have received the treatment either during the first or the second phase. This design is most feasible in organisational contexts where organisational programs (e.g., employee training) are implemented in a phased manner or are repeated at regular intervals.

Figure 10.7: Switched replication design

Quasi-experimental designs

Quasi-experimental designs are almost identical to true experimental designs, but lacking one key ingredient: random assignment. For instance, one entire class section or one organisation is used as the treatment group, while another section of the same class or a different organisation in the same industry is used as the control group. This lack of random assignment potentially results in groups that are non-equivalent, such as one group possessing greater mastery of certain content than the other group, say by virtue of having a better teacher in a previous semester, which introduces the possibility of selection bias . Quasi-experimental designs are therefore inferior to true experimental designs in internal validity due to the presence of a variety of selection-related threats such as selection-maturation threat (the treatment and control groups maturing at different rates), selection-history threat (the treatment and control groups being differentially impacted by extraneous or historical events), selection-regression threat (the treatment and control groups regressing toward the mean between pretest and posttest at different rates), selection-instrumentation threat (the treatment and control groups responding differently to the measurement), selection-testing (the treatment and control groups responding differently to the pretest), and selection-mortality (the treatment and control groups demonstrating differential dropout rates). Given these selection threats, it is generally preferable to avoid quasi-experimental designs to the greatest extent possible.

In addition, there are quite a few unique non-equivalent designs without corresponding true experimental design cousins. Some of the more useful of these designs are discussed next.

Regression discontinuity (RD) design. This is a non-equivalent pretest-posttest design where subjects are assigned to the treatment or control group based on a cut-off score on a preprogram measure. For instance, patients who are severely ill may be assigned to a treatment group to test the efficacy of a new drug or treatment protocol, while those who are mildly ill are assigned to the control group. In another example, students who are lagging behind on standardised test scores may be selected for a remedial curriculum program intended to improve their performance, while those who score high on such tests are not selected for the remedial program.

RD design

Because of the use of a cut-off score, it is possible that the observed results may be a function of the cut-off score rather than the treatment, which introduces a new threat to internal validity. However, using the cut-off score also ensures that limited or costly resources are distributed to people who need them the most, rather than randomly across a population, while simultaneously allowing a quasi-experimental treatment. The control group scores in the RD design do not serve as a benchmark for comparing treatment group scores, given the systematic non-equivalence between the two groups. Rather, if there is no discontinuity between pretest and posttest scores in the control group, but such a discontinuity persists in the treatment group, then this discontinuity is viewed as evidence of the treatment effect.
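
A hedged simulation sketch may make the logic concrete: assign by a cut-off, then estimate the discontinuity by regressing the posttest on the centred pretest plus a treatment indicator. The cut-off value, effect size, and use of statsmodels below are illustrative assumptions, not part of the text:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 200
pretest = rng.uniform(0, 100, n)
cutoff = 50.0                                  # hypothetical cut-off score
treated = (pretest < cutoff).astype(float)     # low scorers receive the program

# Simulated posttest: linear in the pretest plus a 10-point treatment effect
posttest = 20 + 0.6 * pretest + 10 * treated + rng.normal(0, 5, n)

# Regress the posttest on the centred pretest and a treatment indicator;
# the coefficient on `treated` estimates the jump (discontinuity) at the cut-off
X = sm.add_constant(np.column_stack([pretest - cutoff, treated]))
model = sm.OLS(posttest, X).fit()
print(model.params)   # [intercept, pretest slope, estimated treatment effect]
```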

Proxy pretest design. This design, shown in Figure 10.11, looks very similar to the standard pretest-posttest non-equivalent groups design (NEGD), with one critical difference: the pretest score is collected after the treatment is administered. A typical application of this design is when a researcher is brought in to test the efficacy of a program (e.g., an educational program) after the program has already started and pretest data is not available. Under such circumstances, the best option for the researcher is often to use a different prerecorded measure, such as students’ grade point average before the start of the program, as a proxy for pretest data. A variation of the proxy pretest design is to use subjects’ posttest recollection of pretest data, which may be subject to recall bias, but nevertheless may provide a measure of perceived gain or change in the dependent variable.

Figure 10.11. Proxy pretest design

Separate pretest-posttest samples design. This design is useful if it is not possible to collect pretest and posttest data from the same subjects for some reason. As shown in Figure 10.12, there are four groups in this design, but two groups come from a single non-equivalent group, while the other two groups come from a different non-equivalent group. For instance, say you want to test customer satisfaction with a new online service that is implemented in one city but not in another. In this case, customers in the first city serve as the treatment group and those in the second city constitute the control group. If it is not possible to obtain pretest and posttest measures from the same customers, you can measure customer satisfaction at one point in time, implement the new service program, and measure customer satisfaction (with a different set of customers) after the program is implemented. Customer satisfaction is also measured in the control group at the same times as in the treatment group, but without the new program implementation. The design is not particularly strong, because you cannot examine changes in any specific customer’s satisfaction score before and after the implementation; you can only compare average customer satisfaction scores. Despite the lower internal validity, this design may still be a useful way of collecting quasi-experimental data when pretest and posttest data are not available from the same subjects.

Figure 10.12. Separate pretest-posttest samples design

An interesting variation of the non-equivalent dependent variable (NEDV) design is a pattern-matching NEDV design, which employs multiple outcome variables and a theory that explains how much each variable will be affected by the treatment. The researcher can then examine if the theoretical prediction is matched in actual observations. This pattern-matching technique—based on the degree of correspondence between theoretical and observed patterns—is a powerful way of alleviating internal validity concerns in the original NEDV design.

NEDV design

Perils of experimental research

Experimental research is one of the most difficult of research designs, and should not be taken lightly. This type of research is often beset with a multitude of methodological problems. First, though experimental research requires theories for framing hypotheses for testing, much of current experimental research is atheoretical. Without theories, the hypotheses being tested tend to be ad hoc, possibly illogical, and meaningless. Second, many of the measurement instruments used in experimental research are not tested for reliability and validity, and are incomparable across studies. Consequently, results generated using such instruments are also incomparable. Third, experimental research often uses inappropriate research designs, such as irrelevant dependent variables, no interaction effects, no experimental controls, and non-equivalent stimuli across treatment groups. Findings from such studies tend to lack internal validity and are highly suspect. Fourth, the treatments (tasks) used in experimental research may be diverse, incomparable, and inconsistent across studies, and sometimes inappropriate for the subject population. For instance, undergraduate student subjects are often asked to pretend that they are marketing managers and to perform a complex budget allocation task in which they have no experience or expertise. The use of such inappropriate tasks introduces new threats to internal validity (i.e., a subject’s performance may be an artefact of the content or difficulty of the task setting), generates findings that are uninterpretable and meaningless, and makes integration of findings across studies impossible.

The design of proper experimental treatments is a very important task in experimental design, because the treatment is the raison d’être of the experimental method, and must never be rushed or neglected. To design an adequate and appropriate task, researchers should use prevalidated tasks if available, conduct treatment manipulation checks to check for the adequacy of such tasks (by debriefing subjects after performing the assigned task), conduct pilot tests (repeatedly, if necessary), and if in doubt, use tasks that are simple and familiar for the respondent sample rather than tasks that are complex or unfamiliar.

In summary, this chapter introduced key concepts in the experimental design research method and introduced a variety of true experimental and quasi-experimental designs. Although these designs vary widely in internal validity, designs with less internal validity should not be overlooked and may sometimes be useful under specific circumstances and empirical contingencies.

Social Science Research: Principles, Methods and Practices (Revised edition) Copyright © 2019 by Anol Bhattacherjee is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License , except where otherwise noted.



Experimental design


Data for statistical studies are obtained by conducting either experiments or surveys. Experimental design is the branch of statistics that deals with the design and analysis of experiments. The methods of experimental design are widely used in the fields of agriculture, medicine, biology, marketing research, and industrial production.

In an experimental study, variables of interest are identified. One or more of these variables, referred to as the factors of the study, are controlled so that data may be obtained about how the factors influence another variable referred to as the response variable, or simply the response. As a case in point, consider an experiment designed to determine the effect of three different exercise programs on the cholesterol level of patients with elevated cholesterol. Each patient is referred to as an experimental unit, the response variable is the cholesterol level of the patient at the completion of the program, and the exercise program is the factor whose effect on cholesterol level is being investigated. Each of the three exercise programs is referred to as a treatment.

Three of the more widely used experimental designs are the completely randomized design, the randomized block design, and the factorial design. In a completely randomized experimental design, the treatments are randomly assigned to the experimental units. For instance, applying this design method to the cholesterol-level study, the three types of exercise program (treatment) would be randomly assigned to the experimental units (patients).

The use of a completely randomized design will yield less precise results when factors not accounted for by the experimenter affect the response variable. Consider, for example, an experiment designed to study the effect of two different gasoline additives on the fuel efficiency, measured in miles per gallon (mpg), of full-size automobiles produced by three manufacturers. Suppose that 30 automobiles, 10 from each manufacturer, were available for the experiment. In a completely randomized design the two gasoline additives (treatments) would be randomly assigned to the 30 automobiles, with each additive being assigned to 15 different cars. Suppose that manufacturer 1 has developed an engine that gives its full-size cars a higher fuel efficiency than those produced by manufacturers 2 and 3. A completely randomized design could, by chance, assign gasoline additive 1 to a larger proportion of cars from manufacturer 1. In such a case, gasoline additive 1 might be judged to be more fuel efficient when in fact the difference observed is actually due to the better engine design of automobiles produced by manufacturer 1. To prevent this from occurring, a statistician could design an experiment in which both gasoline additives are tested using five cars produced by each manufacturer; in this way, any effects due to the manufacturer would not affect the test for significant differences due to gasoline additive. In this revised experiment, each of the manufacturers is referred to as a block, and the experiment is called a randomized block design. In general, blocking is used in order to enable comparisons among the treatments to be made within blocks of homogeneous experimental units.

Factorial experiments are designed to draw conclusions about more than one factor, or variable. The term factorial is used to indicate that all possible combinations of the factors are considered. For instance, if there are two factors with a levels for factor 1 and b levels for factor 2, the experiment will involve collecting data on a × b treatment combinations. The factorial design can be extended to experiments involving more than two factors and experiments involving partial factorial designs.
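
As a minimal sketch, the full set of treatment combinations can be enumerated directly; the factor names and levels below are hypothetical:

```python
from itertools import product

# Hypothetical example: factor 1 has a = 2 levels, factor 2 has b = 3 levels
factor_1 = ["low", "high"]                  # a = 2 levels
factor_2 = ["dose_1", "dose_2", "dose_3"]   # b = 3 levels

combinations = list(product(factor_1, factor_2))
print(len(combinations))   # a * b = 6 treatment combinations
for combo in combinations:
    print(combo)
```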

A computational procedure frequently used to analyze the data from an experimental study employs a statistical procedure known as the analysis of variance. For a single-factor experiment, this procedure uses a hypothesis test concerning equality of treatment means to determine if the factor has a statistically significant effect on the response variable. For experimental designs involving multiple factors, a test for the significance of each individual factor as well as interaction effects caused by one or more factors acting jointly can be made. Further discussion of the analysis of variance procedure is contained in the subsequent section.
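
As an illustration, a single-factor analysis of variance can be carried out with scipy; the cholesterol-reduction figures below are invented for the sketch:

```python
from scipy import stats

# Invented cholesterol reductions under three exercise programs (treatments)
program_a = [12.1, 9.8, 14.2, 11.5, 10.9]
program_b = [8.4, 7.9, 9.1, 10.2, 8.8]
program_c = [15.3, 13.8, 16.1, 14.9, 15.7]

# Hypothesis test for equality of the three treatment means
f_stat, p_value = stats.f_oneway(program_a, program_b, program_c)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
```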

Regression and correlation analysis

Regression analysis involves identifying the relationship between a dependent variable and one or more independent variables. A model of the relationship is hypothesized, and estimates of the parameter values are used to develop an estimated regression equation. Various tests are then employed to determine if the model is satisfactory. If the model is deemed satisfactory, the estimated regression equation can be used to predict the value of the dependent variable given values for the independent variables.

In simple linear regression, the model used to describe the relationship between a single dependent variable y and a single independent variable x is y = β₀ + β₁x + ε. β₀ and β₁ are referred to as the model parameters, and ε is a probabilistic error term that accounts for the variability in y that cannot be explained by the linear relationship with x. If the error term were not present, the model would be deterministic; in that case, knowledge of the value of x would be sufficient to determine the value of y.

In multiple regression analysis, the model for simple linear regression is extended to account for the relationship between the dependent variable y and p independent variables x₁, x₂, …, xₚ. The general form of the multiple regression model is y = β₀ + β₁x₁ + β₂x₂ + … + βₚxₚ + ε. The parameters of the model are β₀, β₁, …, βₚ, and ε is the error term.

Either a simple or multiple regression model is initially posed as a hypothesis concerning the relationship among the dependent and independent variables. The least squares method is the most widely used procedure for developing estimates of the model parameters. For simple linear regression, the least squares estimates of the model parameters β₀ and β₁ are denoted b₀ and b₁. Using these estimates, an estimated regression equation is constructed: ŷ = b₀ + b₁x. The graph of the estimated regression equation for simple linear regression is a straight-line approximation to the relationship between y and x.
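
To make the computation concrete, here is a minimal sketch of the closed-form least squares estimates; the function name and the small data set are illustrative, not from the article:

```python
import numpy as np

def least_squares(x, y):
    """Closed-form least squares estimates b0, b1 for simple linear regression."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    b0 = y.mean() - b1 * x.mean()
    return b0, b1

# Invented stress scores (x) and blood pressure readings (y)
x = [50, 55, 60, 65, 70]
y = [66, 70, 72, 74, 77]
b0, b1 = least_squares(x, y)
print(f"y-hat = {b0:.2f} + {b1:.2f} x")
```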

As an illustration of regression analysis and the least squares method, suppose a university medical centre is investigating the relationship between stress and blood pressure. Assume that both a stress test score and a blood pressure reading have been recorded for a sample of 20 patients. The data are shown graphically in Figure 4, called a scatter diagram. Values of the independent variable, stress test score, are given on the horizontal axis, and values of the dependent variable, blood pressure, are shown on the vertical axis. The line passing through the data points is the graph of the estimated regression equation: ŷ = 42.3 + 0.49x. The parameter estimates, b₀ = 42.3 and b₁ = 0.49, were obtained using the least squares method.

A primary use of the estimated regression equation is to predict the value of the dependent variable when values for the independent variables are given. For instance, given a patient with a stress test score of 60, the predicted blood pressure is 42.3 + 0.49(60) = 71.7. The values predicted by the estimated regression equation are the points on the line in Figure 4, and the actual blood pressure readings are represented by the points scattered about the line. The difference between the observed value of y and the value of y predicted by the estimated regression equation is called a residual. The least squares method chooses the parameter estimates such that the sum of the squared residuals is minimized.

A commonly used measure of the goodness of fit provided by the estimated regression equation is the coefficient of determination. Computation of this coefficient is based on the analysis of variance procedure that partitions the total variation in the dependent variable, denoted SST, into two parts: the part explained by the estimated regression equation, denoted SSR, and the part that remains unexplained, denoted SSE.

The measure of total variation, SST, is the sum of the squared deviations of the dependent variable about its mean: Σ(y − ȳ)². This quantity is known as the total sum of squares. The measure of unexplained variation, SSE, is referred to as the residual sum of squares. For the data in Figure 4, SSE is the sum of the squared distances from each point in the scatter diagram to the estimated regression line: Σ(y − ŷ)². SSE is also commonly referred to as the error sum of squares. A key result in the analysis of variance is that SSR + SSE = SST.

The ratio r² = SSR/SST is called the coefficient of determination. If the data points are clustered closely about the estimated regression line, the value of SSE will be small and SSR/SST will be close to 1. Using r², whose values lie between 0 and 1, provides a measure of goodness of fit; values closer to 1 imply a better fit. A value of r² = 0 implies that there is no linear relationship between the dependent and independent variables.
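
A short sketch of this partition in code (the function name is ours; it assumes observed values y and fitted values ŷ from some regression):

```python
import numpy as np

def r_squared(y, y_hat):
    """Coefficient of determination from observed (y) and fitted (y_hat) values."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    sst = np.sum((y - y.mean()) ** 2)   # total sum of squares
    sse = np.sum((y - y_hat) ** 2)      # residual (error) sum of squares
    ssr = sst - sse                     # explained sum of squares: SSR + SSE = SST
    return ssr / sst
```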

When expressed as a percentage, the coefficient of determination can be interpreted as the percentage of the total sum of squares that can be explained using the estimated regression equation. For the stress-level research study, the value of r² is 0.583; thus, 58.3% of the total sum of squares can be explained by the estimated regression equation ŷ = 42.3 + 0.49x. For typical data found in the social sciences, values of r² as low as 0.25 are often considered useful. For data in the physical sciences, r² values of 0.60 or greater are frequently found.

In a regression study, hypothesis tests are usually conducted to assess the statistical significance of the overall relationship represented by the regression model and to test for the statistical significance of the individual parameters. The statistical tests used are based on the following assumptions concerning the error term: (1) ε is a random variable with an expected value of 0, (2) the variance of ε is the same for all values of x, (3) the values of ε are independent, and (4) ε is a normally distributed random variable.

The mean square due to regression, denoted MSR, is computed by dividing SSR by a number referred to as its degrees of freedom; in a similar manner, the mean square due to error, MSE, is computed by dividing SSE by its degrees of freedom. An F-test based on the ratio MSR/MSE can be used to test the statistical significance of the overall relationship between the dependent variable and the set of independent variables. In general, large values of F = MSR/MSE support the conclusion that the overall relationship is statistically significant. If the overall model is deemed statistically significant, statisticians will usually conduct hypothesis tests on the individual parameters to determine if each independent variable makes a significant contribution to the model.
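
As a hedged sketch, the F statistic and its p value can be computed from the sums of squares and their degrees of freedom; the helper below assumes a standard OLS setup with n observations and p predictors:

```python
from scipy import stats

def overall_f_test(ssr, sse, n, p):
    """F test for overall significance of a regression with n observations and p predictors."""
    msr = ssr / p               # mean square due to regression
    mse = sse / (n - p - 1)     # mean square due to error
    f = msr / mse
    p_value = stats.f.sf(f, p, n - p - 1)   # upper-tail probability of the F distribution
    return f, p_value
```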


Choosing the Right Statistical Test | Types & Examples

Published on January 28, 2020 by Rebecca Bevans . Revised on June 22, 2023.

Statistical tests are used in hypothesis testing. They can be used to:

  • determine whether a predictor variable has a statistically significant relationship with an outcome variable.
  • estimate the difference between two or more groups.

Statistical tests assume a null hypothesis of no relationship or no difference between groups. Then they determine whether the observed data fall outside of the range of values predicted by the null hypothesis.

If you already know what types of variables you’re dealing with, you can use the flowchart to choose the right statistical test for your data.

Statistical tests flowchart


Statistical tests work by calculating a test statistic – a number that describes how much the relationship between variables in your test differs from the null hypothesis of no relationship.

It then calculates a p value (probability value). The p value estimates how likely it is that you would see the difference described by the test statistic if the null hypothesis of no relationship were true.

If the value of the test statistic is more extreme than the statistic calculated from the null hypothesis, then you can infer a statistically significant relationship between the predictor and outcome variables.

If the value of the test statistic is less extreme than the one calculated from the null hypothesis, then you can infer no statistically significant relationship between the predictor and outcome variables.
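
For example, an independent-samples t test returns exactly these two quantities; the data below are invented for the sketch:

```python
from scipy import stats

# Invented measurements from two groups
group_a = [5.1, 4.8, 6.0, 5.5, 5.9, 4.7]
group_b = [6.2, 6.8, 5.9, 7.1, 6.5, 6.9]

t_stat, p_value = stats.ttest_ind(group_a, group_b)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")  # a small p value counts against the null
```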

You can perform statistical tests on data that have been collected in a statistically valid manner – either through an experiment, or through observations made using probability sampling methods.

For a statistical test to be valid, your sample size needs to be large enough to approximate the true distribution of the population being studied.

To determine which statistical test to use, you need to know:

  • whether your data meets certain assumptions.
  • the types of variables that you’re dealing with.

Statistical assumptions

Statistical tests make some common assumptions about the data they are testing:

  • Independence of observations (a.k.a. no autocorrelation): the observations/variables you include in your test are not related (for example, multiple measurements of a single test subject are not independent, while measurements of multiple different test subjects are independent).
  • Homogeneity of variance: the variance within each group being compared is similar among all groups. If one group has much more variation than others, it will limit the test’s effectiveness.
  • Normality of data: the data follow a normal distribution (a.k.a. a bell curve). This assumption applies only to quantitative data.

If your data do not meet the assumptions of normality or homogeneity of variance, you may be able to perform a nonparametric statistical test, which allows you to make comparisons without any assumptions about the data distribution.

If your data do not meet the assumption of independence of observations, you may be able to use a test that accounts for structure in your data (repeated-measures tests or tests that include blocking variables).
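
As a sketch, the normality and equal-variance assumptions can be probed with standard tests in scipy; the data are invented, and these checks are one common approach rather than the only one:

```python
from scipy import stats

group_a = [5.1, 4.8, 6.0, 5.5, 5.9, 4.7]
group_b = [6.2, 6.8, 5.9, 7.1, 6.5, 6.9]

# Shapiro-Wilk: null hypothesis that each group is normally distributed
print(stats.shapiro(group_a))
print(stats.shapiro(group_b))

# Levene: null hypothesis of equal variances across the groups
print(stats.levene(group_a, group_b))
```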

Types of variables

The types of variables you have usually determine what type of statistical test you can use.

Quantitative variables represent amounts of things (e.g. the number of trees in a forest). Types of quantitative variables include:

  • Continuous (aka ratio variables): represent measures and can usually be divided into units smaller than one (e.g. 0.75 grams).
  • Discrete (aka integer variables): represent counts and usually can’t be divided into units smaller than one (e.g. 1 tree).

Categorical variables represent groupings of things (e.g. the different tree species in a forest). Types of categorical variables include:

  • Ordinal : represent data with an order (e.g. rankings).
  • Nominal : represent group names (e.g. brands or species names).
  • Binary : represent data with a yes/no or 1/0 outcome (e.g. win or lose).

Choose the test that fits the types of predictor and outcome variables you have collected (if you are doing an experiment, these are the independent and dependent variables). Consult the tables below to see which test best matches your variables.

Parametric tests usually have stricter requirements than nonparametric tests, and are able to make stronger inferences from the data. They can only be conducted with data that adheres to the common assumptions of statistical tests.

The most common types of parametric test include regression tests, comparison tests, and correlation tests.

Regression tests

Regression tests look for cause-and-effect relationships . They can be used to estimate the effect of one or more continuous variables on another variable.

Test Research question example
Simple linear regression What is the effect of income on longevity?
Multiple linear regression What is the effect of income and minutes of exercise per day on longevity?
Logistic regression What is the effect of drug dosage on the survival of a test subject?

Comparison tests

Comparison tests look for differences among group means . They can be used to test the effect of a categorical variable on the mean value of some other characteristic.

T-tests are used when comparing the means of precisely two groups (e.g., the average heights of men and women). ANOVA and MANOVA tests are used when comparing the means of more than two groups (e.g., the average heights of children, teenagers, and adults).

Test Research question example
Paired t-test What is the effect of two different test prep programs on the average exam scores for students from the same class?
Independent t-test What is the difference in average exam scores for students from two different schools?
ANOVA What is the difference in average pain levels among post-surgical patients given three different painkillers?
MANOVA What is the effect of flower species on petal length, petal width, and stem length?

Correlation tests

Correlation tests check whether variables are related without hypothesizing a cause-and-effect relationship.

These can be used to test whether two variables you want to use in (for example) a multiple regression test are autocorrelated.

Test Research question example
Pearson’s r How are latitude and temperature related?

Non-parametric tests don’t make as many assumptions about the data, and are useful when one or more of the common statistical assumptions are violated. However, the inferences they make aren’t as strong as with parametric tests.

Non-parametric test Use in place of…
Spearman’s r Pearson’s r
Sign test One-sample t-test
Kruskal–Wallis H ANOVA
ANOSIM MANOVA
Wilcoxon rank-sum test Independent t-test
Wilcoxon signed-rank test Paired t-test
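
As one concrete pairing from the table, the Wilcoxon rank-sum (Mann–Whitney U) test can stand in for an independent t test when its assumptions are in doubt; a sketch with invented data:

```python
from scipy import stats

group_a = [5.1, 4.8, 6.0, 5.5, 5.9, 4.7]
group_b = [6.2, 6.8, 5.9, 7.1, 6.5, 6.9]

# Rank-based alternative to the independent t test
u_stat, p_value = stats.mannwhitneyu(group_a, group_b)
print(f"U = {u_stat}, p = {p_value:.4f}")
```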

This flowchart helps you choose among parametric tests. For nonparametric alternatives, check the table above.

Choosing the right statistical test

Frequently asked questions about statistical tests

Statistical tests commonly assume that:

  • the data are normally distributed
  • the groups that are being compared have similar variance
  • the data are independent

If your data does not meet these assumptions you might still be able to use a nonparametric statistical test, which has fewer requirements but also makes weaker inferences.

A test statistic is a number calculated by a statistical test. It describes how far your observed data is from the null hypothesis of no relationship between variables or no difference among sample groups.

The test statistic tells you how different two or more groups are from the overall population mean , or how different a linear slope is from the slope predicted by a null hypothesis . Different test statistics are used in different statistical tests.

Statistical significance is a term used by researchers to state that it is unlikely their observations could have occurred under the null hypothesis of a statistical test. Significance is usually denoted by a p value, or probability value.

Statistical significance is arbitrary – it depends on the threshold, or alpha value, chosen by the researcher. The most common threshold is p < 0.05, which means that the data is likely to occur less than 5% of the time under the null hypothesis .

When the p -value falls below the chosen alpha value, then we say the result of the test is statistically significant.

Quantitative variables are any variables where the data represent amounts (e.g. height, weight, or age).

Categorical variables are any variables where the data represent groups. This includes rankings (e.g. finishing places in a race), classifications (e.g. brands of cereal), and binary outcomes (e.g. coin flips).

You need to know what type of variables you are working with to choose the right statistical test for your data and interpret your results .

Discrete and continuous variables are two types of quantitative variables :

  • Discrete variables represent counts (e.g. the number of objects in a collection).
  • Continuous variables represent measurable amounts (e.g. water volume or weight).



The Beginner's Guide to Statistical Analysis | 5 Steps & Examples

Statistical analysis means investigating trends, patterns, and relationships using quantitative data . It is an important research tool used by scientists, governments, businesses, and other organisations.

To draw valid conclusions, statistical analysis requires careful planning from the very start of the research process . You need to specify your hypotheses and make decisions about your research design, sample size, and sampling procedure.

After collecting data from your sample, you can organise and summarise the data using descriptive statistics . Then, you can use inferential statistics to formally test hypotheses and make estimates about the population. Finally, you can interpret and generalise your findings.

This article is a practical introduction to statistical analysis for students and researchers. We’ll walk you through the steps using two research examples. The first investigates a potential cause-and-effect relationship, while the second investigates a potential correlation between variables.


To collect valid data for statistical analysis, you first need to specify your hypotheses and plan out your research design.

Writing statistical hypotheses

The goal of research is often to investigate a relationship between variables within a population . You start with a prediction, and use statistical analysis to test that prediction.

A statistical hypothesis is a formal way of writing a prediction about a population. Every research prediction is rephrased into null and alternative hypotheses that can be tested using sample data.

While the null hypothesis always predicts no effect or no relationship between variables, the alternative hypothesis states your research prediction of an effect or relationship.

  • Null hypothesis: A 5-minute meditation exercise will have no effect on math test scores in teenagers.
  • Alternative hypothesis: A 5-minute meditation exercise will improve math test scores in teenagers.
  • Null hypothesis: Parental income and GPA have no relationship with each other in college students.
  • Alternative hypothesis: Parental income and GPA are positively correlated in college students.

Planning your research design

A research design is your overall strategy for data collection and analysis. It determines the statistical tests you can use to test your hypothesis later on.

First, decide whether your research will use a descriptive, correlational, or experimental design. Experiments directly influence variables, whereas descriptive and correlational studies only measure variables.

  • In an experimental design , you can assess a cause-and-effect relationship (e.g., the effect of meditation on test scores) using statistical tests of comparison or regression.
  • In a correlational design , you can explore relationships between variables (e.g., parental income and GPA) without any assumption of causality using correlation coefficients and significance tests.
  • In a descriptive design , you can study the characteristics of a population or phenomenon (e.g., the prevalence of anxiety in U.S. college students) using statistical tests to draw inferences from sample data.

Your research design also concerns whether you’ll compare participants at the group level or individual level, or both.

  • In a between-subjects design , you compare the group-level outcomes of participants who have been exposed to different treatments (e.g., those who performed a meditation exercise vs those who didn’t).
  • In a within-subjects design , you compare repeated measures from participants who have participated in all treatments of a study (e.g., scores from before and after performing a meditation exercise).
  • In a mixed (factorial) design , one variable is altered between subjects and another is altered within subjects (e.g., pretest and posttest scores from participants who either did or didn’t do a meditation exercise).
Example: Experimental research design
First, you’ll take baseline test scores from participants. Then, your participants will undergo a 5-minute meditation exercise. Finally, you’ll record participants’ scores from a second math test. In this experiment, the independent variable is the 5-minute meditation exercise, and the dependent variable is the math test score from before and after the intervention.

Example: Correlational research design
In a correlational study, you test whether there is a relationship between parental income and GPA in graduating college students. To collect your data, you will ask participants to fill in a survey and self-report their parents’ incomes and their own GPA.

Measuring variables

When planning a research design, you should operationalise your variables and decide exactly how you will measure them.

For statistical analysis, it’s important to consider the level of measurement of your variables, which tells you what kind of data they contain:

  • Categorical data represents groupings. These may be nominal (e.g., gender) or ordinal (e.g. level of language ability).
  • Quantitative data represents amounts. These may be on an interval scale (e.g. test score) or a ratio scale (e.g. age).

Many variables can be measured at different levels of precision. For example, age data can be quantitative (8 years old) or categorical (young). If a variable is coded numerically (e.g., level of agreement from 1–5), it doesn’t automatically mean that it’s quantitative instead of categorical.

Identifying the measurement level is important for choosing appropriate statistics and hypothesis tests. For example, you can calculate a mean score with quantitative data, but not with categorical data.

In a research study, along with measures of your variables of interest, you’ll often collect data on relevant participant characteristics.

Variable Type of data
Age Quantitative (ratio)
Gender Categorical (nominal)
Race or ethnicity Categorical (nominal)
Baseline test scores Quantitative (interval)
Final test scores Quantitative (interval)
Parental income Quantitative (ratio)
GPA Quantitative (interval)

Population vs sample

In most cases, it’s too difficult or expensive to collect data from every member of the population you’re interested in studying. Instead, you’ll collect data from a sample.

Statistical analysis allows you to apply your findings beyond your own sample as long as you use appropriate sampling procedures . You should aim for a sample that is representative of the population.

Sampling for statistical analysis

There are two main approaches to selecting a sample.

  • Probability sampling: every member of the population has a chance of being selected for the study through random selection.
  • Non-probability sampling: some members of the population are more likely than others to be selected for the study because of criteria such as convenience or voluntary self-selection.

In theory, for highly generalisable findings, you should use a probability sampling method. Random selection reduces sampling bias and ensures that data from your sample is actually typical of the population. Parametric tests can be used to make strong statistical inferences when data are collected using probability sampling.

But in practice, it’s rarely possible to gather the ideal sample. While non-probability samples are more likely to be biased, they are much easier to recruit and collect data from. Non-parametric tests are more appropriate for non-probability samples, but they result in weaker inferences about the population.

If you want to use parametric tests for non-probability samples, you have to make the case that:

  • your sample is representative of the population you’re generalising your findings to.
  • your sample lacks systematic bias.

Keep in mind that external validity means that you can only generalise your conclusions to others who share the characteristics of your sample. For instance, results from Western, Educated, Industrialised, Rich and Democratic samples (e.g., college students in the US) aren’t automatically applicable to all non-WEIRD populations.

If you apply parametric tests to data from non-probability samples, be sure to elaborate on the limitations of how far your results can be generalised in your discussion section .

Create an appropriate sampling procedure

Based on the resources available for your research, decide on how you’ll recruit participants.

  • Will you have resources to advertise your study widely, including outside of your university setting?
  • Will you have the means to recruit a diverse sample that represents a broad population?
  • Do you have time to contact and follow up with members of hard-to-reach groups?

Example: Sampling (experimental study)
Your participants are self-selected by their schools. Although you’re using a non-probability sample, you aim for a diverse and representative sample.

Example: Sampling (correlational study)
Your main population of interest is male college students in the US. Using social media advertising, you recruit senior-year male college students from a smaller subpopulation: seven universities in the Boston area.

Calculate sufficient sample size

Before recruiting participants, decide on your sample size either by looking at other studies in your field or using statistics. A sample that’s too small may be unrepresentative of the population, while a sample that’s too large will be more costly than necessary.

There are many sample size calculators online. Different formulas are used depending on whether you have subgroups or how rigorous your study should be (e.g., in clinical research). As a rule of thumb, a minimum of 30 units per subgroup is necessary.

To use these calculators, you have to understand and input these key components:

  • Significance level (alpha): the risk of rejecting a true null hypothesis that you are willing to take, usually set at 5%.
  • Statistical power : the probability of your study detecting an effect of a certain size if there is one, usually 80% or higher.
  • Expected effect size : a standardised indication of how large the expected result of your study will be, usually based on other similar studies.
  • Population standard deviation: an estimate of the population parameter based on a previous study or a pilot study of your own.
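
As an illustration, these components can be plugged into a power analysis; a sketch using statsmodels, where the medium effect size is a hypothetical input:

```python
from statsmodels.stats.power import TTestIndPower

# Hypothetical inputs: medium effect size (d = 0.5), alpha = 0.05, power = 0.80
n_per_group = TTestIndPower().solve_power(effect_size=0.5, alpha=0.05, power=0.80)
print(round(n_per_group))   # required participants per group (about 64)
```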

Once you’ve collected all of your data, you can inspect them and calculate descriptive statistics that summarise them.

Inspect your data

There are various ways to inspect your data, including the following:

  • Organising data from each variable in frequency distribution tables .
  • Displaying data from a key variable in a bar chart to view the distribution of responses.
  • Visualising the relationship between two variables using a scatter plot .

By visualising your data in tables and graphs, you can assess whether your data follow a skewed or normal distribution and whether there are any outliers or missing data.

A normal distribution means that your data are symmetrically distributed around a center where most values lie, with the values tapering off at the tail ends.

Mean, median, mode, and standard deviation in a normal distribution

In contrast, a skewed distribution is asymmetric and has more values on one end than the other. The shape of the distribution is important to keep in mind because only some descriptive statistics should be used with skewed distributions.

Extreme outliers can also produce misleading statistics, so you may need a systematic approach to dealing with these values.

Calculate measures of central tendency

Measures of central tendency describe where most of the values in a data set lie. Three main measures of central tendency are often reported:

  • Mode : the most popular response or value in the data set.
  • Median : the value in the exact middle of the data set when ordered from low to high.
  • Mean : the sum of all values divided by the number of values.

However, depending on the shape of the distribution and level of measurement, only one or two of these measures may be appropriate. For example, many demographic characteristics can only be described using the mode or proportions, while a variable like reaction time may not have a mode at all.

Calculate measures of variability

Measures of variability tell you how spread out the values in a data set are. Four main measures of variability are often reported:

  • Range : the highest value minus the lowest value of the data set.
  • Interquartile range : the range of the middle half of the data set.
  • Standard deviation : the average distance between each value in your data set and the mean.
  • Variance : the square of the standard deviation.

Once again, the shape of the distribution and level of measurement should guide your choice of variability statistics. The interquartile range is the best measure for skewed distributions, while standard deviation and variance provide the best information for normal distributions.
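
These measures are straightforward to compute; a minimal sketch with invented scores:

```python
import numpy as np

scores = np.array([68.0, 72.0, 75.0, 61.0, 80.0, 77.0, 69.0, 74.0])  # invented data

print("mean:", scores.mean())
print("median:", np.median(scores))
print("range:", scores.max() - scores.min())
q1, q3 = np.percentile(scores, [25, 75])
print("interquartile range:", q3 - q1)
print("standard deviation:", scores.std(ddof=1))   # sample standard deviation
print("variance:", scores.var(ddof=1))             # square of the standard deviation
```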

Using your table, you should check whether the units of the descriptive statistics are comparable for pretest and posttest scores. For example, are the variance levels similar across the groups? Are there any extreme values? If there are, you may need to identify and remove extreme outliers in your data set or transform your data before performing a statistical test.

Statistic Pretest scores Posttest scores
Mean 68.44 75.25
Standard deviation 9.43 9.88
Variance 88.96 97.96
Range 36.25 45.12
Sample size (n) 30

From this table, we can see that the mean score increased after the meditation exercise, and the variances of the two scores are comparable. Next, we can perform a statistical test to find out if this improvement in test scores is statistically significant in the population.

Example: Descriptive statistics (correlational study)
After collecting data from 653 students, you tabulate descriptive statistics for annual parental income and GPA.

It’s important to check whether you have a broad range of data points. If you don’t, your data may be skewed towards some groups more than others (e.g., high academic achievers), and only limited inferences can be made about a relationship.

Statistic Parental income (USD) GPA
Mean 62,100 3.12
Standard deviation 15,000 0.45
Variance 225,000,000 0.16
Range 8,000–378,000 2.64–4.00
Sample size (n) 653

A number that describes a sample is called a statistic, while a number describing a population is called a parameter. Using inferential statistics, you can make conclusions about population parameters based on sample statistics.

Researchers often use two main methods (simultaneously) to make inferences in statistics.

  • Estimation: calculating population parameters based on sample statistics.
  • Hypothesis testing: a formal process for testing research predictions about the population using samples.

You can make two types of estimates of population parameters from sample statistics:

  • A point estimate : a value that represents your best guess of the exact parameter.
  • An interval estimate : a range of values that represent your best guess of where the parameter lies.

If your aim is to infer and report population characteristics from sample data, it’s best to use both point and interval estimates in your paper.

You can consider a sample statistic a point estimate for the population parameter when you have a representative sample (e.g., in a wide public opinion poll, the proportion of a sample that supports the current government is taken as the population proportion of government supporters).

There’s always error involved in estimation, so you should also provide a confidence interval as an interval estimate to show the variability around a point estimate.

A confidence interval uses the standard error and the z score from the standard normal distribution to convey where you’d generally expect to find the population parameter most of the time.
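
For example, a 95% confidence interval for a mean can be computed from the standard error and the z score, as described; the data below are invented:

```python
import numpy as np
from scipy import stats

scores = np.array([68.0, 72.0, 75.0, 61.0, 80.0, 77.0, 69.0, 74.0])  # invented sample

mean = scores.mean()
se = scores.std(ddof=1) / np.sqrt(len(scores))   # standard error of the mean
z = stats.norm.ppf(0.975)                        # z score for a 95% interval
print(mean - z * se, mean + z * se)              # interval estimate for the population mean
```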

Hypothesis testing

Using data from a sample, you can test hypotheses about relationships between variables in the population. Hypothesis testing starts with the assumption that the null hypothesis is true in the population, and you use statistical tests to assess whether the null hypothesis can be rejected or not.

Statistical tests determine where your sample data would lie on an expected distribution of sample data if the null hypothesis were true. These tests give two main outputs:

  • A test statistic tells you how much your data differs from the null hypothesis of the test.
  • A p value tells you the likelihood of obtaining your results if the null hypothesis is actually true in the population.

Statistical tests come in three main varieties:

  • Comparison tests assess group differences in outcomes.
  • Regression tests assess cause-and-effect relationships between variables.
  • Correlation tests assess relationships between variables without assuming causation.

Your choice of statistical test depends on your research questions, research design, sampling method, and data characteristics.

Parametric tests

Parametric tests make powerful inferences about the population based on sample data. But to use them, some assumptions must be met, and only some types of variables can be used. If your data violate these assumptions, you can perform appropriate data transformations or use alternative non-parametric tests instead.

A regression models the extent to which changes in a predictor variable result in changes in one or more outcome variables.

  • A simple linear regression includes one predictor variable and one outcome variable.
  • A multiple linear regression includes two or more predictor variables and one outcome variable.

Comparison tests usually compare the means of groups. These may be the means of different groups within a sample (e.g., a treatment and control group), the means of one sample group taken at different times (e.g., pretest and posttest scores), or a sample mean and a population mean.

  • A t test is for exactly 1 or 2 groups when the sample is small (30 or less).
  • A z test is for exactly 1 or 2 groups when the sample is large.
  • An ANOVA is for 3 or more groups.

The z and t tests have subtypes based on the number and types of samples and the hypotheses:

  • If you have only one sample that you want to compare to a population mean, use a one-sample test .
  • If you have paired measurements (within-subjects design), use a dependent (paired) samples test .
  • If you have completely separate measurements from two unmatched groups (between-subjects design), use an independent (unpaired) samples test .
  • If you expect a difference between groups in a specific direction, use a one-tailed test .
  • If you don’t have any expectations for the direction of a difference between groups, use a two-tailed test .

The only parametric correlation test is Pearson’s r. The correlation coefficient (r) tells you the strength of a linear relationship between two quantitative variables.

However, to test whether the correlation in the sample is strong enough to be important in the population, you also need to perform a significance test of the correlation coefficient, usually a t test, to obtain a p value. This test uses your sample size to calculate how much the correlation coefficient differs from zero in the population.

You use a dependent-samples, one-tailed t test to assess whether the meditation exercise significantly improved math test scores. The test gives you:

  • a t value (test statistic) of 3.00
  • a p value of 0.0028

Although Pearson’s r is a test statistic, it doesn’t tell you anything about how significant the correlation is in the population. You also need to test whether this sample correlation coefficient is large enough to demonstrate a correlation in the population.

A t test can also determine how significantly a correlation coefficient differs from zero based on sample size. Since you expect a positive correlation between parental income and GPA, you use a one-sample, one-tailed t test. The t test gives you:

  • a t value of 3.08
  • a p value of 0.001
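
This relationship is t = r√(n − 2)/√(1 − r²). The sketch below back-calculates a hypothetical r consistent with the reported t and n; the article itself does not give the correlation coefficient:

```python
import numpy as np
from scipy import stats

n = 653      # sample size from the running example
r = 0.12     # hypothetical: back-calculated from the reported t, not given in the text

t = r * np.sqrt(n - 2) / np.sqrt(1 - r**2)
p_one_tailed = stats.t.sf(t, df=n - 2)
print(f"t = {t:.2f}, p = {p_one_tailed:.4f}")   # roughly t = 3.08, p = 0.001
```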

The final step of statistical analysis is interpreting your results.

Statistical significance

In hypothesis testing, statistical significance is the main criterion for forming conclusions. You compare your p value to a set significance level (usually 0.05) to decide whether your results are statistically significant or non-significant.

Statistically significant results are considered unlikely to have arisen solely due to chance. There is only a very low chance of such a result occurring if the null hypothesis is true in the population.

Example: Interpret your results (experimental study)
You compare your p value of 0.0028 to your significance threshold of 0.05. Since the p value is below this threshold, you can reject the null hypothesis. This means that you believe the meditation intervention, rather than random factors, directly caused the increase in test scores.

Example: Interpret your results (correlational study)
You compare your p value of 0.001 to your significance threshold of 0.05. With a p value under this threshold, you can reject the null hypothesis. This indicates a statistically significant correlation between parental income and GPA in male college students.

Note that correlation doesn’t always mean causation, because there are often many underlying factors contributing to a complex variable like GPA. Even if one variable is related to another, this may be because of a third variable influencing both of them, or indirect links between the two variables.

Effect size

A statistically significant result doesn’t necessarily mean that there are important real life applications or clinical outcomes for a finding.

In contrast, the effect size indicates the practical significance of your results. It’s important to report effect sizes along with your inferential statistics for a complete picture of your results. You should also report interval estimates of effect sizes if you’re writing an APA style paper .

Example: Effect size (experimental study)
With a Cohen’s d of 0.72, there’s medium to high practical significance to your finding that the meditation exercise improved test scores.

Example: Effect size (correlational study)
To determine the effect size of the correlation coefficient, you compare your Pearson’s r value to Cohen’s effect size criteria.
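
Cohen’s d has several variants; a minimal sketch of one common pooled-standard-deviation version (the function name and data handling are ours, and the inputs would be the pre- and post-meditation scores):

```python
import numpy as np

def cohens_d(pre, post):
    """Cohen's d from two sets of scores, using the pooled standard deviation."""
    pre, post = np.asarray(pre, float), np.asarray(post, float)
    pooled_sd = np.sqrt((pre.var(ddof=1) + post.var(ddof=1)) / 2)
    return (post.mean() - pre.mean()) / pooled_sd
```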

Decision errors

Type I and Type II errors are mistakes made in research conclusions. A Type I error means rejecting the null hypothesis when it’s actually true, while a Type II error means failing to reject the null hypothesis when it’s false.

You can aim to minimise the risk of these errors by selecting an optimal significance level and ensuring high power . However, there’s a trade-off between the two errors, so a fine balance is necessary.

Frequentist versus Bayesian statistics

Traditionally, frequentist statistics emphasises null hypothesis significance testing and always starts with the assumption of a true null hypothesis.

However, Bayesian statistics has grown in popularity as an alternative approach in the last few decades. In this approach, you use previous research to continually update your hypotheses based on your expectations and observations.

A Bayes factor compares the relative strength of evidence for the null versus the alternative hypothesis, rather than drawing a conclusion about whether or not to reject the null hypothesis.

Hypothesis testing is a formal procedure for investigating our ideas about the world using statistics. It is used by scientists to test specific predictions, called hypotheses , by calculating how likely it is that a pattern or relationship between variables could have arisen by chance.

The research methods you use depend on the type of data you need to answer your research question .

  • If you want to measure something or test a hypothesis , use quantitative methods . If you want to explore ideas, thoughts, and meanings, use qualitative methods .
  • If you want to analyse a large amount of readily available data, use secondary data. If you want data specific to your purposes with control over how they are generated, collect primary data.
  • If you want to establish cause-and-effect relationships between variables , use experimental methods. If you want to understand the characteristics of a research subject, use descriptive methods.

Statistical analysis is the main method for analysing quantitative research data. It uses probabilities and models to test predictions about a population from sample data.


Open access | Published: 02 September 2024

Saving lives with statistics

Jo Røislien
Department of Research, The Norwegian Air Ambulance Foundation, Oslo, Norway; Faculty of Health Sciences, University of Stavanger, Stavanger, Norway

Scandinavian Journal of Trauma, Resuscitation and Emergency Medicine, volume 32, Article number: 79 (2024)

Healthcare is awash with numbers, and figuring out what knowledge these numbers might hold is worthwhile in order to improve patient care. Numbers allow for objective mathematical analysis of the information at hand, but while mathematics is objective by design, our choice of mathematical approach in a given situation is not. In prehospital and critical care, numbers stem from a wide range of different sources and situations, be it experimental setups, observational data or data registries, and what constitutes a “good” statistical analysis can be unclear. A well-crafted statistical analysis can help us see things our eyes cannot, and find patterns where our brains come short, ultimately contributing to changing clinical practice and improving patient outcome. With increasingly advanced research questions and research designs, traditional statistical approaches are often inadequate, and being able to properly merge statistical competence with clinical know-how is essential in order to arrive at not only correct, but also valuable and usable research results. By marrying clinical know-how with rigorous statistical analysis we can accelerate the field of prehospital and critical care.

Statistics deals with numbers, not people, and is more concerned with group averages than individual patients. Yet, the use of statistics in medicine has been hailed as one of the most important medical developments of the last 1000 years [ 1 ].

Healthcare is full of numbers, and figuring out what knowledge these numbers hold is worthwhile in order to improve patient care. However, the human brain and our sensory apparatus are not particularly well suited for dealing with abstract numbers – particularly percentages, probabilities and other ratio concepts [ 2 , 3 ] – and we need tools to help make sense of it all.

The invention of the microscope was a revolution: suddenly we could see things that had previously been invisible to us. With the recent increase of numerical data in society, we need new tools to see what our eyes cannot. “Mathematics is biology’s next microscope – only better,” Cohen wrote in 2004 [ 4 ].

In a world awash with numbers, mathematical competence is key. By marrying clinical know-how with rigorous statistical analysis we can accelerate the field of prehospital and critical care.

Numbers allow for objective mathematical analysis of the information at hand. However, while mathematics is objective by design – two plus two equals four regardless of where or when or by whom the calculation is performed – our choice of mathematical approach in a given situation is not. Applying the mathematics of straight lines is of limited value if your problem is one of circles and curves.

In prehospital and critical care, numbers stem from a wide range of different sources and situations, and what constitutes a “good” statistical analysis can be unclear. A combination of mathematical and clinical know-how is needed.

Experiments

Experiments are part of the scientific bedrock. Randomized controlled trials can assess a causal association between two factors, as the rest of the world is zeroed out by design, and the accompanying numbers can be analyzed using simple statistical tests. However, with increasingly complicated research questions and designs, even the analysis of experimental setups is not necessarily straightforward.

In a project comparing different methods for transporting trauma patients, Hyldmo et al. experimented with cadavers, meticulously measuring neck rotation and movement [ 5 , 6 ]. For ethical reasons the experiment called for a non-traditional setup, and analyses using traditional statistical tests were inconclusive. But when the specific structure of the experiment was taken into account in the statistical models, associations that had been hidden came to light, and the project could give concrete advice on the transportation of trauma patients.
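The published analysis is not reproduced here, but the general idea, letting the model reflect repeated measurements on the same cadaver through a random intercept rather than treating all rows as independent, can be sketched with statsmodels’ MixedLM on invented data; every variable name and number below is hypothetical:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(5)
cadavers = np.repeat(np.arange(8), 6)                  # 8 cadavers x 6 trials each
method = np.tile(["lateral", "logroll"], 24)           # two transport methods
motion = (rng.normal(10, 2, 48)                        # trial-level noise
          + np.where(method == "lateral", -1.0, 0.0)   # hypothetical method effect
          + rng.normal(0, 1.5, 8)[cadavers])           # cadaver-level shift

df = pd.DataFrame({"motion": motion, "method": method, "cadaver": cadavers})

# Fixed effect of method, random intercept per cadaver
model = smf.mixedlm("motion ~ method", df, groups=df["cadaver"]).fit()
print(model.summary())
```

Ignoring the grouping would treat 48 correlated measurements as 48 independent ones, which is exactly the kind of mismatch between model and experiment that can bury a real effect.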

Observational data

When applying statistical methods to analyze our data, we simultaneously impose strict assumptions on the numbers at hand – be it the assumption of symmetrically distributed data, linear associations, or others. If these assumptions are not sufficiently correct, we prevent the numbers from freely communicating what information they actually hold.

The protein fibrinogen is vital to the body’s built-in blood stopping mechanism, and clinical guidelines state that when fibrinogen levels drop below a certain threshold mortality increases, and one should act upon it [ 7 ]. However, when looking at observational data on fibrinogen levels and mortality, the numbers seem to tell a slightly different story [ 8 ]. Standard regression models assume linearity in the data. While the association between two variables is provably linear on a small enough scale [ 9 ], making linear regression an often suitable – and common – statistical analysis, this does not necessarily hold true on larger scales. Applying a more flexible statistical approach with fewer assumptions – Generalized Additive Models (GAM) rather than Generalized Linear Models (GLM) – to the fibrinogen data shows that the critical value for what should be considered too low a fibrinogen level is substantially higher than indicated in the guidelines [ 8 ]. Applying more advanced statistical methodology directly impacts the analytical results, and the accompanying clinical conclusions.
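A minimal sketch of this GAM-versus-linear comparison, on simulated data (not the original fibrinogen data) and assuming the pygam package is installed:

```python
import numpy as np
import statsmodels.api as sm
from pygam import LinearGAM, s

# Simulated, deliberately nonlinear "dose-response" data
rng = np.random.default_rng(7)
x = rng.uniform(0, 5, 300)
y = 3 * np.exp(-x) + rng.normal(0, 0.2, 300)

linear = sm.OLS(y, sm.add_constant(x)).fit()     # assumes a straight line
gam = LinearGAM(s(0)).fit(x.reshape(-1, 1), y)   # smooth, data-driven shape

# The two models disagree most where the curvature is strongest
grid = np.linspace(0, 5, 100).reshape(-1, 1)
print("linear fit at x = 0.5:", linear.predict([1, 0.5]))
print("GAM fit at x = 0.5:   ", gam.predict(grid)[10])
```

On data like these, the straight-line fit smears the steep initial drop across the whole range, which is the statistical analogue of reading the wrong threshold off a fibrinogen curve.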

Data registries

More frequent use of data registries provides large amounts of healthcare data, without having to set up experiments or collect observational data from scratch. However, data registries are not designed to answer specific research questions, and results must be evaluated accordingly.

The trauma registry at Oslo University Hospital in Norway holds thousands of individual events. Plotting them on a timeline reveals a seasonal pattern [ 10 ], with more trauma admissions in summer than in winter. However, there’s not much we can do about seasonal changes. Seasonality just is. But with changing seasons comes changing weather, and weather matters. Replacing the generic phenomenon “seasons” with daily factors like “hours of sunlight” and “amount of rain” [ 11 ] results in a statistical model that is not only significantly better [ 10 ], but also allows for action. Rather than planning ER work schedules weeks in advance, the statistical model implies that it would be more cost-efficient to ask meteorologists for estimates of sunlight and precipitation a few days ahead, calculate the expected number of trauma incidents, and staff up accordingly. More staff on sunny days, fewer when it rains.

Or – maybe not. While a statistical model with weather variables as predictors might be objectively better than mere seasonal effects, it would also result in markedly poorer quality of life for the healthcare personnel involved, having their work schedule decided by short-term weather forecasts. Statistician George Box has said that “All models are wrong, but some are useful” [ 12 ]. Even a “good” statistical analysis is not necessarily useful.
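Purely as an illustration of the modelling step itself, the weather-based idea can be sketched as a Poisson regression of daily admission counts on weather variables; all data and variable names below are simulated, not the registry’s:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# One simulated year of daily weather and trauma admissions
rng = np.random.default_rng(3)
days = pd.DataFrame({
    "sun_hours": rng.uniform(0, 12, 365),
    "rain_mm": rng.exponential(2.0, 365),
})
lam = np.exp(1.0 + 0.08 * days["sun_hours"] - 0.05 * days["rain_mm"])
days["admissions"] = rng.poisson(lam)

# Poisson GLM: expected daily count as a function of weather
weather = smf.glm("admissions ~ sun_hours + rain_mm", data=days,
                  family=sm.families.Poisson()).fit()
print(weather.summary().tables[1])

# Expected admissions on a sunny, dry day
new_day = pd.DataFrame({"sun_hours": [10], "rain_mm": [0]})
print("expected admissions:", weather.predict(new_day).iloc[0])
```

Whether such a model should drive staffing is, as the text argues, a separate question from whether it fits the data well.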

Statistical analysis has two ingredients: mathematics and context. Mathematics is often the easy part: it’s either right or wrong. The real world, however, is rarely black or white, but tends to be shades of grey, and the accompanying statistical analysis will be shades of right and shades of wrong. A well-crafted statistical analysis can help us see things our eyes cannot, and find patterns where our brains come short, ultimately contributing to changing clinical practice and improving patient outcome. Being able to properly merge statistical competence with clinical know-how is essential in order to arrive at not only correct, but also valuable and usable research results.

Can statistics save lives? Indeed. But only when the mathematical and contextual side of statistics work together.

Data availability

No datasets were generated or analysed during the current study.

References

1. Looking back on the millennium in medicine. N Engl J Med. 2000;342(1):42–9.
2. Røislien J, Johnsen AL. Those troublesome fractions. Tidsskr Nor Laegeforen. 2024;144(7).
3. Reyna VF, Nelson WL, Han PK, Dieckmann NF. How numeracy influences risk comprehension and medical decision making. Psychol Bull. 2009;135(6):943–73.
4. Cohen JE. Mathematics is biology’s next microscope, only better; biology is mathematics’ next physics, only better. PLoS Biol. 2004;2(12):e439.
5. Hyldmo PK, Horodyski MB, Conrad BP, Dubose DN, Røislien J, Prasarn M, et al. Safety of the lateral trauma position in cervical spine injuries: a cadaver model study. Acta Anaesthesiol Scand. 2016;60(7):1003–11.
6. Hyldmo PK, Horodyski M, Conrad BP, Aslaksen S, Røislien J, Prasarn M, et al. Does the novel lateral trauma position cause more motion in an unstable cervical spine injury than the logroll maneuver? Am J Emerg Med. 2017;35(11):1630–5.
7. Spahn DR, Bouillon B, Cerny V, Coats TJ, Duranteau J, Fernández-Mondéjar E, et al. Management of bleeding and coagulopathy following major trauma: an updated European guideline. Crit Care. 2013;17(2):R76.
8. Hagemo JS, Stanworth S, Juffermans NP, Brohi K, Cohen M, Johansson PI, et al. Prevalence, predictors and outcome of hypofibrinogenaemia in trauma: a multicentre observational study. Crit Care. 2014;18(2):R52.
9. Feigenbaum L. Brook Taylor and the method of increments. Arch Hist Exact Sci. 1985;34(1):1–140.
10. Røislien J, Søvik S, Eken T. Seasonality in trauma admissions – are daylight and weather variables better predictors than general cyclic effects? PLoS ONE. 2018;13(2):e0192568.
11. Bhattacharyya T, Millham FH. Relationship between weather and seasonal factors and trauma admission volume at a level I trauma center. J Trauma Acute Care Surg. 2001;51(1).
12. Box GEP. Robustness in the strategy of scientific model building. In: Launer RL, Wilkinson GN, editors. Robustness in statistics. Academic Press; 1979. pp. 201–36.


This Comment provides a summary of the Scandinavian Journal of Trauma, Resuscitation and Emergency Medicine Honorary Lecture given at the Oslo HEMS Conference 2023.


Keywords: Statistical analysis; Prehospital care


Emotional Intelligence Profiles and Cyber-Victimization in Secondary School Students: A Multilevel Analysis

The research examined how different profiles of emotional intelligence (attention, clarity, and emotional regulation) act as protective or risk factors against cyber-victimization, taking into account individual and behavioral variables such as gender, sexual orientation, self-esteem, social anxiety, Internet risk, and parental control among high school students (11–18 years).

1. Introduction

The study tested the following hypotheses:

  • The variability of average cyber-victimization across EI strata is significant and non-zero: different emotional intelligence profiles imply different levels of cyber-victimization risk.
  • This difference can be explained by the individual characteristics of the subjects. Based on previous evidence, the following was expected:
    2.1. Inadequate levels of emotional intelligence increase the probability of cyber-victimization [ 25 ].
    2.2. Gender is related to the degree of cyber-victimization [ 36 ].
    2.3. Cyber-victimization increases when sexual orientation is non-heterosexual [ 37 ].
    2.4. Between the ages of 11 and 18, older youth are more likely to be cyber-victims.
    2.5. Low self-esteem increases the risk of cyber-victimization.
    2.6. High social anxiety increases the risk of being cyber-victimized.
    2.7. The risk of cyber-victimization increases with risky Internet behavior.
    2.8. Low parental control increases the likelihood of cyber-victimization.

2. Materials and Methods

2.1. Participants and Procedures
2.2. Instruments
  2.2.1. Emotional Intelligence
  2.2.2. Cyber-Victimization
  2.2.3. Risk Factors Questionnaire for Cyber-Victimization [ 41 ]
  2.2.4. Peer Bullying Questionnaire – Bullying Behavior Scale [ 42 ]
2.3. Variables
2.4. Statistical Analysis
  2.4.1. Statistical Description of the Variables and Justification for Cross-Stratification
  2.4.2. Analysis Procedure: the unconditional means (null) model; the importance of predictive variables in cyber-victimization; random-intercept (means-as-outcomes) models; the random-coefficients (slopes-as-outcomes) model; and the model of random intercepts (means) and coefficients (slopes) as outcomes
  2.4.3. Measurement of Changes in Reporting Criteria
  2.4.4. Characterization of the EI Strata

3. Results

3.1. Justification for Cross-Stratification (TMMS-G3)
3.2. Unconditional Means Model
3.3. Importance of Independent Variables in Cyber-Victimization
3.4. Random-Intercept Models (Main-Effects Averages as Outcomes)
3.5. Model of Random Coefficients (Slopes) as Outcomes (Model 3)
3.6. Model of Random Intercepts (Means) and Coefficients (Slopes) as Outcomes
3.7. Characterization of the Emotional Intelligence (EI) Strata

4. Discussion

Institutional Review Board Statement · Informed Consent Statement · Data Availability Statement · Conflicts of Interest

References

1. Rueda, P.; Pérez-Romero, N.; Cerezo, M.V.; Fernández-Berrocal, P. The role of Emotional Intelligence in Adolescent Bullying: A Systematic Review. Psicol. Educ. 2022, 28, 53–59.
2. Martínez-Monteagudo, M.C.; Delgado, B.; García-Fernández, J.M.; Rubio, E. Cyberbullying, Aggressiveness, and Emotional Intelligence in Adolescence. Int. J. Environ. Res. Public Health 2019, 16, 5079.
3. Calmaestra, J.; Rodríguez-Hidalgo, A.J.; Mero-Delgado, O.; Solera, E. Cyberbullying in Adolescents from Ecuador and Spain: Prevalence and Differences in Gender, School Year and Ethnic-Cultural Background. Sustainability 2020, 12, 4597.
4. Alvarez-García, D.; Dobarro, A.; Núñez, J.C. Validez y fiabilidad del Cuestionario de cibervictimización en estudiantes de Secundaria. Aula Abierta 2015, 43, 32–38.
5. Nixon, C.L. Current perspectives: The impact of cyberbullying on adolescent health. Adolesc. Health Med. Ther. 2022, 5, 143–158.
6. Pérez-Gómez, M.A.; Echazarreta-Soler, C.; Audebert, M.; Sánchez-Miret, C. El ciberacoso como elemento articulador de las nuevas violencias digitales: Métodos y contextos. Commun. Pap. 2020, 9, 43–58.
7. Wiertsema, M.; Vrijen, C.; Van der Ploeg, R.; Sentse, M.; Kretschmer, T. Bullying perpetration and social status in the peer group: A meta-analysis. J. Adolesc. 2023, 95, 34–55.
8. Perren, S.; Corcoran, L.; Cowie, H.; Dehue, F.; García, D.J.; Mc Guckin, C.; Sevcikova, A.; Tsatsou, P.; Völlink, T. Tackling Cyberbullying: Review of Empirical Evidence Regarding Successful Responses by Students, Parents, and Schools. Int. J. Confl. Violence 2012, 6, 283–293.
9. Cañas, E.; Estévez, E.; Martínez-Monteagudo, M.C.; Delgado, B. Emotional adjustment in victims and perpetrators of cyberbullying and traditional bullying. Soc. Psychol. Educ. 2020, 23, 917–942.
10. Angoff, H.D.; Barnhart, W.R. Bullying and Cyberbullying among LGBQ and Heterosexual Youth from an Intersectional Perspective: Findings from the 2017 National Youth Risk Behavior Survey. J. Sch. Violence 2021, 20, 274–286.
11. Garaigordobil, M.; Larrain, E. Bullying and cyberbullying in LGBT adolescents: Prevalence and effects on mental health. Comunicar 2020, 62, 79–90.
12. Patchin, J.W.; Hinduja, S. Cyberbullying Among Tweens in the United States: Prevalence, Impact, and Helping Behaviors. J. Early Adolesc. 2021, 42, 027243162110367.
13. Lei, H.; Mao, W.; Cheong, C.M.; Wen, Y.; Gui, Y.; Cai, Z. The relationship between self-esteem and cyberbullying: A meta-analysis of children and youth students. Curr. Psychol. 2020, 39, 830–842.
14. Núñez, A.; Álvarez-García, D.; Pérez-Fuentes, M.C. Anxiety and self-esteem in cyber-victimization profiles of adolescents. Comunicar 2021, 67, 43–54.
15. Zhu, C.; Huang, S.; Evans, R.; Zhang, W. Cyberbullying Among Adolescents and Children: A Comprehensive Review of the Global Situation, Risk Factors, and Preventive Measures. Front. Public Health 2021, 9, 634909.
16. Martín-Criado, J.M.; Casas, J.A.; Ortega-Ruíz, R.; Del-Rey, R. Parental supervision and victims of cyberbullying: Influence of the use of social networks and online extimacy. Rev. Psicodidáctica 2021, 26, 161–168.
17. Martínez-Martínez, A.M.; Roith, C.; Aguilar-Parra, J.M.; Manzano-León, A.; Rodríguez-Ferrer, J.M.; López-Liria, R. Relationship between Emotional Intelligence, Victimization, and Academic Achievement in High School Students. Soc. Sci. 2022, 11, 247.
18. Wright, M.F.; Wachs, S. The buffering effect of parent social support in the longitudinal associations between cyber polyvictimization and academic outcomes. Soc. Psychol. Educ. 2021, 24, 1145–1161.
19. Carmona-Rojas, M.; Ortega-Ruíz, R.; Romera-Félix, E.M. Bullying and cyberbullying, what do they have in common and what not? A latent class analysis. Ann. Psychol. 2023, 39, 435–445.
20. Ortega, R.; Elipe, P.; Mora-Merchán, J.A.; Genta, M.L.; Brighi, A.; Guarini, A.; Smith, P.K.; Thompson, F.; Tippett, N. The Emotional Impact of Bullying and Cyberbullying on Victims: A European Cross-National Study. Aggress. Behav. 2012, 38, 342–356.
21. Quintana-Orts, C.; Rey, L.; Chamizo-Nieto, M.T.; Worthington, E.L. A Serial Mediation Model of the Relationship between Cybervictimization and Cyberaggression: The Role of Stress and Unforgiveness Motivations. Int. J. Environ. Res. Public Health 2020, 17, 7966.
22. Graham, R.; Wood, F.R. Associations between cyberbullying victimization and deviant health risk behaviors. Soc. Sci. J. 2019, 56, 183–188.
23. Micklewright, D.; Parry, D.; Robinson, T.; Deacon, G.; Renfree, A.; St Clair Gibson, A.; Matthews, W.J. Risk perception influences athletic pacing strategy. Med. Sci. Sports Exerc. 2015, 47, 1026–1037.
24. Quintana-Orts, C.; Rey, L.; Mérida-López, S.; Extremera, N. What bridges the gap between emotional intelligence and suicide risk in victims of bullying? A moderated mediation study. J. Affect. Disord. 2019, 245, 798–805.
25. García, L.; Quintana-Orts, C.; Rey, L. Cibervictimización y satisfacción vital en adolescentes: La inteligencia emocional como variable mediadora. Rev. Psicol. Clín. Niños Adolesc. 2020, 7, 38–45.
26. Martínez-Martínez, A.M.; López-Liria, R.; Aguilar-Parra, J.M.; Trigueros, R.; Morales-Gazquez, M.J.; Rocamora-Pérez, P. Relationship between Emotional Intelligence, Cybervictimization, and Academic Performance in Secondary School Students. Int. J. Environ. Res. Public Health 2020, 17, 7717.
27. Menabò, L.; Skrzypiec, G.; Slee, P.; Guarini, A. Victimization and cybervictimization: The role of school factors. J. Adolesc. 2024, 96, 598–611.
28. Extremera, N.; Fernández-Berrocal, P. Emotional Intelligence as predictor of mental, social, and physical health in university students. Span. J. Psychol. 2006, 9, 45–51.
29. Fernández-Berrocal, P.; Extremera, N.; Ramos, N. Validity and reliability of the Spanish modified version of the Trait Meta-Mood Scale. Psychol. Rep. 2004, 94, 751–755.
30. González, R.; Custodio, J.B.; Abal, F.J.P. Psychometric properties of the Trait Meta-Mood Scale-24 in Argentinian university students. Psicogente 2020, 23, 1–26.
31. Salovey, P.; Mayer, J.D. Emotional Intelligence. Imagin. Cogn. Personal. 1990, 9, 185–211.
32. Mestre, J.M.; Guil, R.; Lopes, P.N.; Salovey, P.; Gil-Olarte, P. Emotional Intelligence and social and academic adaptation to school. Psicothema 2006, 18, 112–117.
33. Taramuel-Villacreces, J.A.; Zapata-Achi, V.H. Aplicación del test TMMS-24 para el análisis y descripción de la Inteligencia Emocional considerando la influencia del sexo. Rev. Publicando 2017, 4, 162–181.
34. Guerra-Bustamante, J.; Yuste-Tosina, R.; López-Ramos, V.M.; Mendo-Lázaro, S. The Modelling Effect of Emotional Competence on Cyberbullying Profiles. Ann. Psychol. 2021, 37, 202–209.
35. Grommisch, G.; Hinton, J.D.X.; Hollenstein, T.; Koval, P.; Gleeson, J.; Kuppens, P.; Lischetzke, T. Modeling Individual Differences in Emotion Regulation Repertoire in Daily Life With Multilevel Latent Profile Analysis. Emotion 2020, 20, 1462–1474.
36. Lozano-Blasco, R.; Quilez-Robres, A.; Latorre-Cosculluela, C. Sex, age and cyber-victimization: A meta-analysis. Comput. Hum. Behav. 2023, 139, 107491.
37. Ojeda, M.; Espino, E.; Elipe, P.; Del-Rey, R. Even if they don’t say it to you, it hurts too: Internalized homonegativity in LGTBQ+ cyberbullying among adolescents. Comunicar 2023, 75, 21–34.
38. Evans, C.R.; Williams, D.R.; Onnela, J.P.; Subramanian, S.V. A multilevel approach to modeling health inequalities at the intersection of multiple social identities. Soc. Sci. Med. 2018, 203, 64–73.
39. Giuffrè, M.; Shung, D.L. Harnessing the power of synthetic data in healthcare: Innovation, application, and privacy. npj Digit. Med. 2023, 6, 186.
40. Botella-Ausina, J.; Sánchez-Meca, J. Meta-Análisis en Ciencias Sociales y de la Salud; Síntesis: Madrid, Spain, 2015.
41. Alvarez-Garcia, D.; Nuñez-Perez, J.C.; Dobarro-Gonzalez, A.; Rodriguez-Perez, C. Factores de riesgo asociados a la cibervictimización en la adolescencia. Int. J. Clin. Health Psychol. 2015, 15, 226–235.
42. Magaz, A.M.; Chorot, P.; Santed, M.A.; Valiente, R.M.; Sandín, B. Evaluación del bullying como victimización: Estructura, fiabilidad y validez del Cuestionario de Acoso entre Iguales (CAI). Rev. Psicopatol. Psicol. Clín. 2016, 21, 77–95.
43. Enders, C.K.; Tofighi, D. Centering Predictor Variables in Cross-Sectional Multilevel Models: A New Look at an Old Issue. Psychol. Methods 2007, 12, 121–138.
44. Peugh, J.L.; Enders, C.K. Using the SPSS Mixed Procedure to Fit Cross-Sectional and Longitudinal Multilevel Models. Educ. Psychol. Meas. 2005, 65, 717–741.
45. Austin, P.C.; Merlo, J. Intermediate and advanced topics in multilevel logistic regression analysis. Stat. Med. 2017, 36, 3257–3277.
46. Martínez-Garrido, C.; Murillo, F.J. Programas para la realización de Modelos Multinivel. Un análisis comparativo entre MLwiN, HLM, SPSS y Stata. REUNIDO 2014, 14, 1–24.
47. Arango-Botero, D.; Hernández-Barajas, F.; Valencia-Arias, A. Misspecification in Generalized Linear Mixed Models and Its Impact on the Statistical Wald Test. Appl. Sci. 2023, 13, 977.
48. Huang, S.; Valdivia, D.S. Wald χ² Test for Differential Item Functioning Detection with Polytomous Items in Multilevel Data. Educ. Psychol. Meas. 2023, 84, 530–548.
49. Muthén, B.O.; Satorra, A. Complex Sample Data in Structural Equation Modeling. Sociol. Methodol. 1995, 25, 267–316.
50. Peugh, J.L. A practical guide to multilevel modelling. J. Sch. Psychol. 2010, 48, 85–112.
51. Kish, L. Survey Sampling; John Wiley & Sons: New York, NY, USA, 1965.
52. Lai, M.H.C.; Kwok, O. Examining the Rule of Thumb of Not Using Multilevel Modeling: The “Design Effect Smaller Than Two” Rule. J. Exp. Educ. 2015, 83, 423–438.
53. Shieh, Y.-Y.; Fouladi, R.T. The Effect of Multicollinearity on Multilevel Modeling Parameter Estimates and Standard Errors. Educ. Psychol. Meas. 2003, 63, 951–985.
54. Aggarwal, C.C. Neural Networks and Deep Learning; Springer: Cham, Switzerland, 2018.
55. Pardo, A.; Ruiz, M.A.; San-Martín, R. Cómo ajustar e interpretar modelos multinivel con SPSS. Psicothema 2007, 19, 308–321.
56. Pardo-Merino, A.; Ruiz-Diaz, M.A. Análisis de Datos en Ciencias Sociales y de la Salud III; Síntesis: Madrid, Spain, 2012.
57. Alvarez-Cáceres, R. Estadística Multivariante y No Paramétrica con SPSS: Aplicación a las Ciencias de la Salud; Diaz de Santos: Madrid, Spain, 1995.
58. Constante-Amores, A.; Florenciano-Martínez, E.; Navarro-Asensio, E.; Fernández-Mellizo, M. Factores asociados al abandono universitario. Educ. XX1 2021, 24, 17–44.
59. Field, A. Discovering Statistics Using SPSS for Windows: Advanced Techniques for the Beginner; Sage: New York, NY, USA, 2000.
60. Pardo, A.; San-Martín, R. Análisis de Datos en Ciencias Sociales y de la Salud II; Síntesis: Madrid, Spain, 2010.
61. Álvarez-García, D.; Núñez, A.; Pérez-Fuentes, M.C.; Vallejo, G. Efecto del grupo-clase sobre la cibervictimización en estudiantes de Secundaria: Un análisis multinivel. Rev. Educ. 2022, 397, 153–178.
62. Elipe, P.; Muñoz, M.O.; Del Rey, R. Homophobic Bullying and Cyberbullying: Study of a Silenced Problem. J. Homosex. 2018, 65, 672–686.
63. Romera, E.M.; Luque, R.; Ortega-Ruiz, R.; Gómez-Ortiz, O.; Camacho, A. Positive Peer Perception, Social Anxiety and Classroom Social Adjustment as Risk Factors in Peer Victimization: A Multilevel Study. Psicothema 2022, 34, 110–116.
64. Arrivillaga, C.; Rey, L.; Extremera, N. Perfil emocional de adolescentes en riesgo de un uso problemático de internet. Rev. Psicol. Clín. Niños Adolesc. 2021, 8, 47–53.
65. Gámez-Guadix, M.; Incera, D. Homophobia is online: Sexual victimization and risk on the internet and mental health among bisexual, homosexual, pansexual, asexual, and queer adolescents. Comput. Hum. Behav. 2021, 119, 106728.
66. Liu, M.; Ren, S. Moderating Effect of Emotional Intelligence on the Relationship between Rumination and Anxiety. Curr. Psychol. 2018, 37, 272–279.


Table: Descriptive statistics for the primary sample (N = 1908) and the simulated active dataset (N = 50,000).

Continuous variables:

| Variable | Primary: Mean | SD | Min | Max | Simulated: Mean | SD | Min | Max |
|---|---|---|---|---|---|---|---|---|
| Age | 13.65 | 1.35 | 11 | 18 | 13.65 | 1.35 | 11 | 18 |
| Peer bullying | 43.69 | 9.37 | 39 | 117 | 43.67 | 9.40 | 39 | 117 |
| Parental controls | 13.96 | 5.10 | 7 | 28 | 13.94 | 5.03 | 7 | 28 |
| Self-esteem | 17.21 | 2.50 | 5 | 20 | 17.22 | 2.60 | 5 | 20 |
| School victimization | 9.43 | 3.12 | 6 | 24 | 9.20 | 3.10 | 6 | 24 |
| Training-support | 23.01 | 3.47 | 7 | 28 | 22.94 | 3.31 | 7 | 28 |
| Shyness-social anxiety | 8.43 | 2.67 | 4 | 16 | 8.45 | 2.61 | 4 | 16 |
| Risk behaviors | 9.53 | 3.29 | 5 | 20 | 9.63 | 3.28 | 5 | 20 |
| Academic performance | 6.10 | 1.69 | 0 | 10 | 6.15 | 1.70 | 0 | 10 |
| CBV average | 1.17 | 0.18 | 1 | 3.69 | 1.17 | 0.20 | 1 | 3.43 |

Categorical variables:

| Variable | Category | Primary: n | % | Simulated: n | % |
|---|---|---|---|---|---|
| Gender | boys (1) | 941 | 49.3 | 24,700 | 49.4 |
| | girls (2) | 967 | 50.7 | 25,300 | 50.6 |
| Sexual orientation | heterosexual (1) | 1814 | 95.1 | 47,600 | 95.2 |
| | non-heterosexual (2) | 94 | 4.9 | 2400 | 4.8 |
| TMMS Attention | low (1) | 950 | 49.8 | 24,800 | 49.6 |
| | adequate (2) | 793 | 41.6 | 20,800 | 41.6 |
| | excessive (3) | 165 | 8.6 | 4400 | 8.8 |
| TMMS Clarity | low (1) | 769 | 40.8 | 20,300 | 40.6 |
| | adequate (2) | 895 | 46.9 | 23,500 | 47.0 |
| | excessive (3) | 244 | 12.8 | 6200 | 12.4 |
| TMMS Regulation | low (1) | 779 | 40.8 | 20,250 | 40.5 |
| | adequate (2) | 840 | 44.0 | 22,100 | 44.2 |
| | excessive (3) | 289 | 15.1 | 7650 | 15.3 |
| CBV severity | occasional (1) | 1771 | 92.8 | 46,450 | 92.9 |
| | severe (2) | 137 | 7.2 | 3350 | 7.1 |
Table: Equivalence between the primary and simulated samples, assessed with standardized mean differences (continuous variables) and differences of proportions (categorical variables), each with 95% CIs.

Standardized mean differences:

| Variable | SD estimate | d | Var(d) | 95% CI |
|---|---|---|---|---|
| Age | 1.35 | −0.001 | 0.0003 | −0.041; 0.020 |
| Peer bullying | 9.37 | 0.002 | 0.0003 | −0.024; 0.039 |
| FCR Parental controls | 5.10 | 0.005 | 0.0003 | −0.026; 0.038 |
| FCR Self-esteem | 2.50 | −0.004 | 0.0003 | −0.043; 0.028 |
| FCR School victimization | 3.12 | 0.000 | 0.0003 | −0.029; 0.035 |
| FCR Training-support | 3.47 | 0.000 | 0.0003 | −0.034; 0.033 |
| FCR Shyness-social anxiety | 2.67 | −0.002 | 0.0003 | −0.042; 0.024 |
| FCR Risk behaviors | 3.29 | 0.003 | 0.0003 | −0.037; 0.029 |
| Academic performance | 1.69 | −0.004 | 0.0003 | −0.040; 0.025 |
| CBV average | 0.18 | 0.001 | 0.0003 | −0.033; 0.035 |

Differences of proportions:

| Variable | Category | d | Var(d) | 95% CI |
|---|---|---|---|---|
| Gender | boys (1) | 0.004 | 0.0001 | −0.023; 0.020 |
| | girls (2) | −0.007 | 0.0003 | −0.030; 0.016 |
| Sexual orientation | heterosexual (1) | −0.001 | 0.0000 | −0.008; 0.007 |
| | non-heterosexual (2) | 0.001 | 0.001 | −0.036; 0.046 |
| TMMS Attention | low (1) | −0.002 | 0.000 | −0.023; 0.024 |
| | adequate (2) | 0.002 | 0.000 | −0.024; 0.022 |
| | excessive (3) | 0.003 | 0.0003 | −0.025; 0.035 |
| TMMS Clarity | low (1) | 0.001 | 0.0003 | −0.036; 0.037 |
| | adequate (2) | −0.001 | 0.0003 | −0.015; 0.022 |
| | excessive (3) | −0.001 | 0.0005 | −0.021; 0.031 |
| TMMS Regulation | low (1) | 0.004 | 0.0003 | −0.032; 0.039 |
| | adequate (2) | −0.003 | 0.0003 | −0.038; 0.032 |
| | excessive (3) | −0.001 | 0.0005 | −0.044; 0.025 |
| CBV severity | occasional (1) | −0.002 | 0.0000 | −0.012; 0.011 |
| | severe (2) | 0.002 | 0.0006 | −0.034; 0.035 |
Table: Bivariate regressions of the CBV average (DV) on each predictor, primary sample (N = 1908) versus simulated sample (N = 50,000); B is unstandardized, with 95% CIs.

| IV | Term | Primary: B | 95% CI | Simulated: B | 95% CI |
|---|---|---|---|---|---|
| Age | Constant | 0.835 | 0.824; 0.872 | 0.839 | 0.820; 0.862 |
| | Age | 0.025 | 0.020; 0.031 | 0.024 | 0.022; 0.026 |
| Peer bullying | Constant | 0.447 | 0.412; 0.482 | 0.465 | 0.455; 0.474 |
| | Peer bullying | 0.017 | 0.016; 0.017 | 0.016 | 0.016; 0.016 |
| FCR Parental control | Constant | 1.22 | 1.19; 1.25 | 1.23 | 1.22; 1.23 |
| | Parental control | −0.004 | −0.006; −0.002 | −0.004 | −0.005; −0.004 |
| FCR Self-esteem | Constant | 1.58 | 1.53; 1.65 | 1.58 | 1.57; 1.60 |
| | Self-esteem | −0.024 | −0.028; −0.021 | −0.024 | −0.025; −0.023 |
| FCR School victimization | Constant | 0.819 | 0.795; 0.844 | 0.837 | 0.831; 0.844 |
| | School victimization | 0.037 | 0.035; 0.040 | 0.035 | 0.035; 0.036 |
| FCR Training-support | Constant | 1.43 | 1.37; 1.48 | 1.45 | 1.44; 1.47 |
| | Training | −0.011 | −0.014; −0.009 | −0.013 | −0.013; −0.012 |
| FCR Shyness-social anxiety | Constant | 1.09 | 1.07; 1.12 | 1.105 | 1.09; 1.11 |
| | Social anxiety | 0.008 | 0.005; 0.012 | 0.008 | 0.007; 0.008 |
| FCR Risk behaviors | Constant | 0.943 | 0.918; 0.969 | 0.952 | 0.945; 0.959 |
| | Risk behaviors | 0.024 | 0.021; 0.026 | 0.023 | 0.022; 0.024 |
| Academic performance | Constant | 1.35 | 1.32; 1.38 | 1.35 | 1.34; 1.36 |
| | Academic performance | −0.031 | −0.036 (upper limit not shown in source) | −0.030 | −0.031; −0.028 |
[Table: Characterization of the 27 EI strata (TMMS attention × clarity × regulation combinations). For each stratum it reports the sample size in the primary (N = 1908) and simulated (N = 50,000) datasets and stratum means for the columns Edad, A.Per., P.Cnt, Self-E, Vict, Train., Anx., R.B., P.B., and Reps%. Overall means (Total row): 13.60, 6.21, 14.01, 17.15, 9.26, 22.74, 8.51, 9.26, 6.02, and 22.5%, respectively. Variable-wise ICCs across strata: 0.082, 0.097, 0.096, 0.170, 0.094, 0.075, 0.120, 0.125, 0.120.]
Null (unconditional means) model for cyber-victimization.

Fixed effects:

| Parameter | Estimate | SE | t | Sig. | 95% CI |
|---|---|---|---|---|---|
| Intercept | 1.166 | 0.001 | 1209.738 | 0.000 | 1.164; 1.168 |

Random effects:

| Covariance parameter | Estimate | SE | Wald Z | Sig. | 95% CI |
|---|---|---|---|---|---|
| Residual | 0.046 | 0.000 | 158.071 | 0.000 | 0.046; 0.047 |
| Level I+II effect | 1.363 | 0.371 | 3.674 | <0.001 | 0.800; 2.325 |

ICC = 1.363 / (1.363 + 0.046) = 0.967

Null model fit information for cyber-victimization:

| Description | Value |
|---|---|
| Deviance | −11,330.01 |
| AIC | −11,326.01 |
| BIC | −11,308.37 |
| df (parameters − 1) | 2 |

Coefficient of determination, pseudo-R² (conditional): 0.967
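The unconditional means model behind this ICC can be sketched in Python with statsmodels’ MixedLM: a model with no predictors and a random intercept per EI stratum, from which the ICC is the between-stratum share of the total variance. The data and names below are simulated for illustration, not the study’s:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated data: 27 strata, 70 students each, with a stratum-level shift
rng = np.random.default_rng(11)
strata = np.repeat(np.arange(27), 70)
stratum_effect = rng.normal(0, 0.15, 27)[strata]
cbv = 1.17 + stratum_effect + rng.normal(0, 0.1, strata.size)
df = pd.DataFrame({"cbv": cbv, "stratum": strata})

# Null model: intercept only, random intercept across strata
null = smf.mixedlm("cbv ~ 1", df, groups=df["stratum"]).fit()

var_between = null.cov_re.iloc[0, 0]   # between-stratum variance
var_within = null.scale                # residual (within-stratum) variance
print("ICC =", var_between / (var_between + var_within))
```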
Table: Importance of the predictive variables in cyber-victimization.

| Subject level (L1) | Importance | Standardized | Strata level (L2) | Importance | Standardized |
|---|---|---|---|---|---|
| Peer bullying | 0.43 | 100% | Peer bullying by stratum | 0.12 | 45.6% |
| FCR Parental control | 0.03 | 6.5% | FCR Parental control by stratum | 0.08 | 32.6% |
| FCR Self-esteem | 0.09 | 24% | FCR Self-esteem by stratum | 0.25 | 100% |
| FCR Off-line victimization | 0.14 | 34.1% | FCR Off-line victimization by stratum | 0.24 | 92.9% |
| FCR Training | 0.10 | 24.8% | FCR Training by stratum | 0.06 | 21.6% |
| FCR Anxiety | 0.07 | 16.4% | FCR Anxiety by stratum | 0.19 | 78.1% |
| FCR Risk behaviors | 0.15 | 36.8% | FCR Risk behaviors by stratum | 0.06 | 23.5% |
Model 2 (simplified) for cyber-victimization.

Fixed effects:

| Parameter | Estimate | SE | df | t | Sig. | 95% CI |
|---|---|---|---|---|---|---|
| Intercept | 1.021 | 0.004 | 49,989 | 260.22 | 0.000 | 1.013; 1.029 |
| L2 FCR Self-esteem (centered) | −0.012 | 0.003 | 49,989 | 16.41 | <0.001 | −0.059; −0.047 |
| L1 Sex (1) | 0.001 | 0.002 | 49,988.9 | 0.76 | 0.441 | −0.002; 0.004 |
| L1 Sexual orientation (1) | 0.001 | 0.004 | 49,993 | 0.385 | 0.700 | −0.006; −0.009 |
| L1 Age (centered) | 0.012 | 0.001 | 49,993 | 24.23 | <0.001 | 0.011; 0.013 |
| L1 Academic performance (centered) | −0.011 | 0.000 | 49,993 | −23.59 | <0.001 | −0.012; −0.010 |
| L1 Peer bullying (centered) | 0.006 | 0.000 | 4992.61 | 34.52 | 0.000 | 0.006; 0.006 |

Random effects:

| Covariance parameter | Estimate | SE | Wald Z | Sig. | 95% CI |
|---|---|---|---|---|---|
| Residual | 0.030 | 0.000 | 15,810 | 0.000 | 0.029; 0.030 |
| Level I+II effect | 0.001 | 0.000 | | | |

Model 2 (simplified) fit information for cyber-victimization:

| Description | Value |
|---|---|
| Deviance | −10,582.59 |
| AIC | −10,578.59 |
| BIC | −10,561.04 |
| df (parameters − 1) | 8 |
Model 4 (random intercepts and coefficients) for cyber-victimization.

Fixed effects:

| Parameter | Estimate | SE | t | Sig. | 95% CI |
|---|---|---|---|---|---|
| Intercept | 1.029 | 0.006 | 176.19 | 0.000 | 1.02; 1.04 |
| L2 FCR Self-esteem (centered) | −0.015 | 0.001 | −10.527 | <0.001 | −0.018; −0.013 |
| L1 Sex (1) | 0.001 | 0.006 | 0.78 | 0.432 | −0.002; 0.004 |
| L1 Sexual orientation (1) | −0.001 | 0.006 | 0.18 | 0.855 | −0.010; 0.013 |
| L1 Age (centered) | 0.012 | 0.001 | 24.25 | <0.001 | 0.011; 0.013 |
| L1 Academic performance (centered) | −0.011 | 0.000 | −23.54 | <0.001 | −0.012; −0.010 |
| L1 Peer bullying (centered) | 0.006 | 0.000 | 31.18 | <0.001 | 0.006; 0.007 |
| L2 FCR Self-esteem (centered) × L1 Peer bullying (centered) | −0.007 | 0.002 | −3.08 | 0.005 | −0.012; −0.002 |
| L1 Sexual orientation (1) × L1 Peer bullying (centered) | −0.002 | 0.000 | −5.28 | <0.001 | −0.003; −0.001 |

Model 4 fit information for cyber-victimization:

| Description | Value | Likelihood-ratio tests |
|---|---|---|
| Deviance | −10,906.22 | M0–M4 [SIG.CHISQ (14,704.2, 9)]: 4122.11 − (−10,582.59) = 14,704.7 (Sig. 0.000) |
| AIC | −10,900.22 | M2s–M5 [SIG.CHISQ (323.63, 3)] (Sig. 0.000) |
| BIC | −10,873.89 | |
| df (parameters − 1) | 11 | |
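The likelihood-ratio comparisons in these fit tables follow the standard recipe: the drop in deviance between two nested models is referred to a chi-square distribution with degrees of freedom equal to the number of added parameters. A minimal sketch, using the deviance difference and df reported above:

```python
from scipy import stats

# Deviance difference between nested models and the number of added parameters,
# taken from the fit table above
lr_statistic = 14_704.7
extra_params = 9

# Upper-tail chi-square probability: small p favours the larger model
p_value = stats.chi2.sf(lr_statistic, df=extra_params)
print(f"p = {p_value:.3g}")
```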

Villegas-Lirola, F. Emotional Intelligence Profiles and Cyber-Victimization in Secondary School Students: A Multilevel Analysis. Educ. Sci. 2024, 14, 971. https://doi.org/10.3390/educsci14090971

