Experimental Design: Types, Examples & Methods

Saul McLeod, PhD

Editor-in-Chief for Simply Psychology

BSc (Hons) Psychology, MRes, PhD, University of Manchester

Saul McLeod, PhD., is a qualified psychology teacher with over 18 years of experience in further and higher education. He has been published in peer-reviewed journals, including the Journal of Clinical Psychology.

Olivia Guy-Evans, MSc

Associate Editor for Simply Psychology

BSc (Hons) Psychology, MSc Psychology of Education

Olivia Guy-Evans is a writer and associate editor for Simply Psychology. She has previously worked in healthcare and educational sectors.

Experimental design refers to how participants are allocated to different groups in an experiment. Types of design include repeated measures, independent groups, and matched pairs designs.

Probably the most common way to design an experiment in psychology is to divide the participants into two groups, the experimental group and the control group, and then introduce a change to the experimental group, not the control group.

The researcher must decide how to allocate the sample to the different experimental conditions. For example, if there are 10 participants, will all 10 take part in both conditions (e.g., repeated measures), or will the participants be split in half, with each half taking part in only one condition?

Three types of experimental designs are commonly used:

1. Independent Measures

Independent measures design, also known as between-groups, is an experimental design where different participants are used in each condition of the independent variable. This means that each condition of the experiment includes a different group of participants.

This should be done by random allocation, ensuring that each participant has an equal chance of being assigned to each group.

Independent measures involve using two separate groups of participants, one in each condition.

  • Con: More people are needed than with the repeated measures design (i.e., more time-consuming).
  • Pro: Avoids order effects (such as practice or fatigue) as people participate in one condition only. If a person is involved in several conditions, they may become bored, tired, and fed up by the time they come to the second condition or become wise to the requirements of the experiment!
  • Con: Differences between participants in the groups may affect results, for example, variations in age, gender, or social background. These differences are known as participant variables (i.e., a type of extraneous variable).
  • Control: After the participants have been recruited, they should be randomly assigned to their groups. This should ensure the groups are similar, on average (reducing participant variables).
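
The random assignment described in the Control point above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the participant labels, the condition names, and the helper `randomly_allocate` are hypothetical and do not come from the original article.

```python
import random

def randomly_allocate(participants, conditions=("experimental", "control"), seed=None):
    """Shuffle the participants, then deal them out to the conditions in turn."""
    rng = random.Random(seed)          # a seed makes the allocation reproducible
    shuffled = list(participants)      # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    groups = {condition: [] for condition in conditions}
    for index, person in enumerate(shuffled):
        groups[conditions[index % len(conditions)]].append(person)
    return groups

# Hypothetical sample of 10 participants, as in the example in the text
participants = [f"P{n}" for n in range(1, 11)]
groups = randomly_allocate(participants, seed=42)
for condition, members in groups.items():
    print(condition, members)
```

Shuffling first and then dealing participants out in turn keeps the group sizes equal while still giving every participant the same chance of ending up in either group.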

2. Repeated Measures Design

Repeated Measures design is an experimental design where the same participants participate in each independent variable condition.  This means that each experiment condition includes the same group of participants.

Repeated Measures design is also known as within-groups or within-subjects design.

  • Pro: As the same participants are used in each condition, participant variables (i.e., individual differences) are reduced.
  • Con: There may be order effects. Order effects refer to the order of the conditions affecting the participants’ behavior. Performance in the second condition may be better because the participants know what to do (i.e., practice effect). Or their performance might be worse in the second condition because they are tired (i.e., fatigue effect). This limitation can be controlled using counterbalancing.
  • Pro: Fewer people are needed as they participate in all conditions (i.e., saves time).
  • Control: To combat order effects, the researcher counterbalances the order of the conditions for the participants, alternating the order in which participants perform the different conditions of an experiment.

Counterbalancing

Suppose we used a repeated measures design in which all of the participants first learned words in “loud noise” and then learned them in “no noise.”

We would expect the participants to learn better in “no noise” simply because it came second, owing to order effects such as practice. However, a researcher can control for order effects using counterbalancing.

The sample would be split into two groups, with the experimental condition labeled A and the control condition labeled B. Group 1 does ‘A’ then ‘B,’ and group 2 does ‘B’ then ‘A.’ This balances order effects across the two conditions.

Although order effects occur for each participant, they balance each other out in the results because they occur equally in both groups.
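
A small Python sketch can make this balancing concrete. All numbers here are hypothetical assumptions chosen for illustration: a true score of 10 in condition A (loud noise), 12 in condition B (no noise), and a fixed practice bonus of 3 points for whichever condition is done second. The helper names `counterbalance` and `mean_scores` are likewise made up for this example.

```python
import random

def counterbalance(participants, seed=None):
    """Randomly split participants so half do A then B and half do B then A."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    orders = {}
    for person in shuffled[:half]:
        orders[person] = ("A", "B")
    for person in shuffled[half:]:
        orders[person] = ("B", "A")
    return orders

def mean_scores(orders, base, practice_bonus):
    """Mean score per condition when the second session gains a fixed bonus."""
    totals = {"A": 0.0, "B": 0.0}
    for order in orders.values():
        for position, condition in enumerate(order):
            # position is 0 for the first session and 1 for the second
            totals[condition] += base[condition] + practice_bonus * position
    return {condition: totals[condition] / len(orders) for condition in totals}

participants = [f"P{n}" for n in range(1, 11)]
base = {"A": 10.0, "B": 12.0}  # hypothetical true scores (assumption)

fixed_order = {person: ("A", "B") for person in participants}  # everyone does A first
balanced = counterbalance(participants, seed=1)

print(mean_scores(fixed_order, base, practice_bonus=3.0))  # {'A': 10.0, 'B': 15.0}
print(mean_scores(balanced, base, practice_bonus=3.0))     # {'A': 11.5, 'B': 13.5}
```

With a fixed order, the practice bonus inflates the apparent advantage of B from 2 points to 5. With counterbalancing, each condition receives the bonus for half of the participants, so the difference between the means returns to the assumed true value of 2.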

3. Matched Pairs Design

A matched pairs design is an experimental design where pairs of participants are matched in terms of key variables, such as age or socioeconomic status. One member of each pair is then placed into the experimental group and the other member into the control group.

One member of each matched pair must be randomly assigned to the experimental group and the other to the control group.

  • Con: If one participant drops out, you lose the data of both participants in the pair.
  • Pro: Reduces participant variables because the researcher has tried to pair up the participants so that each condition has people with similar abilities and characteristics.
  • Con: Very time-consuming trying to find closely matched pairs.
  • Pro: It avoids order effects, so counterbalancing is not necessary.
  • Con: Impossible to match people exactly unless they are identical twins!
  • Control: Members of each pair should be randomly assigned to conditions. However, this does not solve all these problems.
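
The matching and random assignment steps can be sketched as follows. This is a minimal illustration, assuming participants are matched on a single numeric score (for example, a pre-test); the scores and the helper `matched_pairs` are hypothetical, and real studies often match on several variables at once.

```python
import random

def matched_pairs(scores, seed=None):
    """Pair participants with adjacent scores, then randomly split each pair."""
    rng = random.Random(seed)
    ranked = sorted(scores, key=scores.get)   # lowest to highest score
    allocation = []
    # Step through the ranking two at a time; with an odd number of
    # participants the highest scorer is left unpaired and excluded.
    for i in range(0, len(ranked) - 1, 2):
        pair = [ranked[i], ranked[i + 1]]
        rng.shuffle(pair)                     # random assignment within the pair
        allocation.append({"experimental": pair[0], "control": pair[1]})
    return allocation

# Hypothetical pre-test scores used as the matching variable
scores = {"P1": 31, "P2": 12, "P3": 25, "P4": 14, "P5": 30, "P6": 24}
for pair in matched_pairs(scores, seed=7):
    print(pair)
```

Sorting on the matching variable and pairing neighbors keeps the two members of each pair as similar as the sample allows, while the shuffle inside each pair preserves random assignment to conditions.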

Experimental design refers to how participants are allocated to an experiment’s different conditions (or IV levels). There are three types:

1. Independent measures / between-groups: Different participants are used in each condition of the independent variable.

2. Repeated measures / within-groups: The same participants take part in each condition of the independent variable.

3. Matched pairs: Each condition uses different participants, but they are matched in terms of important characteristics, e.g., gender, age, intelligence, etc.

Learning Check

Read about each of the experiments below. For each experiment, identify (1) which experimental design was used; and (2) why the researcher might have used that design.

1. To compare the effectiveness of two different types of therapy for depression, depressed patients were assigned to receive either cognitive therapy or behavior therapy for a 12-week period.

The researchers attempted to ensure that the patients in the two groups had similar severity of depressed symptoms by administering a standardized test of depression to each participant, then pairing them according to the severity of their symptoms.

2. To assess the difference in reading comprehension between 7 and 9-year-olds, a researcher recruited each group from a local primary school. They were given the same passage of text to read and then asked a series of questions to assess their understanding.

3. To assess the effectiveness of two different ways of teaching reading, a group of 5-year-olds was recruited from a primary school. Their level of reading ability was assessed, and then they were taught using scheme one for 20 weeks.

At the end of this period, their reading was reassessed, and a reading improvement score was calculated. They were then taught using scheme two for a further 20 weeks, and another reading improvement score for this period was calculated. The reading improvement scores for each child were then compared.

4. To assess the effect of organization on recall, a researcher randomly assigned student volunteers to two conditions.

Condition one attempted to recall a list of words that were organized into meaningful categories; condition two attempted to recall the same words, randomly grouped on the page.

Experiment Terminology

Ecological validity

The degree to which an investigation represents real-life experiences.

Experimenter effects

These are the ways that the experimenter can accidentally influence the participant through their appearance or behavior.

Demand characteristics

The clues in an experiment that lead participants to think they know what the researcher is looking for (e.g., the experimenter’s body language).

Independent variable (IV)

The variable the experimenter manipulates (i.e., changes), which is assumed to have a direct effect on the dependent variable.

Dependent variable (DV)

The variable the experimenter measures. This is the outcome (i.e., the result) of a study.

Extraneous variables (EV)

All variables which are not independent variables but could affect the results (DV) of the experiment. Extraneous variables should be controlled where possible.

Confounding variables

Variable(s) that have affected the results (DV), apart from the IV. A confounding variable could be an extraneous variable that has not been controlled.

Random Allocation

Randomly allocating participants to independent variable conditions means that all participants should have an equal chance of taking part in each condition.

The principle of random allocation is to avoid bias in how the experiment is carried out and limit the effects of participant variables.

Order effects

Changes in participants’ performance due to their repeating the same or similar test more than once. Examples of order effects include:

(i) practice effect: an improvement in performance on a task due to repetition, for example, because of familiarity with the task;

(ii) fatigue effect: a decrease in performance of a task due to repetition, for example, because of boredom or tiredness.

Introduction: Practices, Strategies, and Methodologies of Experimental Control in Historical Perspective

  • Open Access
  • First Online: 27 February 2024

  • Jutta Schickore

Part of the book series: Archimedes (ARIM, volume 71)

The introduction distinguishes four distinct strands in the history of experimental control. The first is the historical development of control practices to stabilize and standardize experimental conditions. The second is the emergence and career of the comparative design in experimentation, understood as a way of generating and securing knowledge of cause-effect relations. The third involves the unfolding, both in philosophy of science and in the sciences themselves, of methodological discussions on control practices and designs in experimental practice. The fourth is the history of the term “(experimental) control.” The introduction describes how the contributions to this volume address these aspects of experimental control.

Control is the hallmark of scientific experimentation. If an experiment is deemed to be lacking in control, it is unlikely to gain traction in the scientific community; arguably, an uncontrolled intervention is not even a genuine experiment. Today, scientific articles routinely mention controls, and handbooks and instruction manuals on methods in the life sciences call for controlled experiments. Evaluating the appropriateness of controls is a core element of successful peer review.

But despite its centrality to modern scientific inquiry, many foundational and historical questions about experimental control remain open. Experimental practice has been studied for decades, but only few analyses of scientific control practices in experimentation exist, Footnote 1 with almost nothing written on controlled experimentation in the longue durée . Footnote 2 We know little about changing expectations for well-controlled experiments or about different kinds of control, experimenters’ interpretations of control, or reasons given for applying controls. There is not even consensus about whether experimental control is an ancient, early modern, or Enlightenment concept, or whether it is a more recent feature of scientific inquiry. Footnote 3 This is, in part, because the concepts “control,” “control experiment,” and “controlled experiment” are polysemous, like “replication” or “significance.” In addition, methodological concepts for experimental practice have until recently received comparatively little scholarly attention.

“Control” has been studied mostly as a broader cultural phenomenon in the Western world. Cultural histories of control focus on ideologies and technologies for governing people, procedures, or systems of machines (Levin 2000b; Derksen 2017). Historical studies of control and science have shown how cultural currents, for better or worse, transformed scientific practices into more rigorous endeavors. Historians of science have noted the increasing importance in science of a quantifying spirit (Frängsmyr et al. 1990) and the values of precision (Wise 1995). They have examined the influence on science of tools such as statistics (Porter 1995; Gigerenzer et al. 1989) and surveillance devices (Foucault 1975, 1979), as well as bureaucratic procedures such as record-keeping, double bookkeeping, and accounting. These authors have argued that institutional changes in science, such as the rise of the university and urban research laboratories, have helped to standardize scientific practice and make it more exact (Tuchman 1993; Dierig 2006). Eighteenth-century sciences of state promoted record-keeping, accounting, and statistical assessments of experimental data (Seppel and Tribe 2017). Nineteenth- and twentieth-century physics and engineering helped to create automated feedback control mechanisms (Bennett 1993), intertwined control and communication systems (Wiener 1948), and “networks of power” (Hughes 1983). They also brought about catastrophic failure of control, as in failed aerospace missions, plane crashes, and collapsing bridges (Schlager 1995). Industrial and technological advancements allowed researchers to engineer the development of living organisms and human heredity (Pauly 1987; Paul 1995), to standardize living things as model organisms for experiments (Rader 2004), and to measure human performance (Rabinbach 1990). The twentieth-century nexus of military, industry, and information technologies enabled wide-ranging control over data and information flow (Galison 2010; Franklin 2015).

Of course, broader socio-political and cultural developments such as industrialization, the institutionalization of university research laboratories, and the expansion of bureaucracies and state administration are impactful. These developments change how practices of research, recording, and record-keeping are organized, as so many authors have demonstrated. But they do not fully determine experimental designs or experimenters’ views on what is considered good and well-controlled or deficient and poorly controlled experimental practice.

This volume shifts the focus from broader socio-political and cultural contexts of control onto practitioners’ methodological strategies of inquiry and experimental design. While acknowledging that broader cultural forces do affect control practices, we contend that these forces only partially shape experimental design and strategy. We identify additional social dimensions of experimental control. On the one hand, identifying experimental conditions, confounders, and solutions to technical problems in experimental design takes time, and unfolds by the activities of multiple individuals or groups. On the other hand, whether an experiment counts as “sufficiently” or even “fully” controlled is not entirely decided by the experimenters themselves, nor can the question be settled by comparing actual experimentation with an abstract standard of the ideal controlled experiment. Footnote 4 The adequacy of control critically depends on the social interactions and negotiations among experimenters and their various interlocutors; as such, the issue is open to revisiting, revision, and renegotiation.

To capture the complicated and multilayered history of experimental control, it is useful to distinguish control strategies, control practices, and methodological ideas about experimental control. Control strategies are general designs and plans to follow in an experiment, like the comparison of an intervention target with a control. Control practices are the concrete actions by which experimenters implement control strategies in particular contexts. These contexts comprise all the resources available to the experimenters, including materials, tools, techniques, local expertise, and institutional opportunities. Methodological ideas are the broader notions of how to study nature and everything in it. They are contained in accounts of control strategies and practices, as the practitioners themselves give them. Footnote 5

Contributions to this volume deal with the details of experimental control practices, as well as with the expectations and perceived obstacles for experimental designs. The chapters are also sensitive to long-term developments of control strategies and methodological ideas. We provide a set of focused studies on control practices, strategies, and ideas that, together, cover a period of more than 300 years, with glimpses back to antiquity and forward to the late twentieth century. We contend that the long-term perspective is productive for understanding experimental methodologies and experimental control in particular. Footnote 6 The chapters offer several examples of how control practices using those strategies and ideas are shaped by local contexts—material-technical, conceptual, and social. Together, they illustrate that control strategies and methodological ideas often remain stable for a long time and change only gradually.

To study controlled experimentation from a historical perspective, we must distinguish at least two notions of control. The first is a broad sense of control as “managing,” “restraining,” or “keeping everything stable except the target system to be intervened upon.” This notion primarily but not exclusively concerns the experiment’s material side—the objects, the setting and environment, and the tools, as well as the guided manipulation or intentional intervention in an otherwise stable situation to see what will happen. Footnote 7

In an uncontrolled situation, experimenters cannot determine the changes resulting from their interventions. To extract information from unwieldy experimental situations, they must standardize instruments and experimental targets and hold fixed the experimental background conditions. They ought also to be free of preconceived opinions and other sources of influence. Experimenters seek to make the experimental setting and background as stable and rigorous as possible because effects, both expected and novel, appear most distinctly against a stable background. Footnote 8 Generally, then, we can consider any aspect of experimental practice from the perspective of control; a key question is how experimenters identify what must be controlled in concrete contexts and how they achieve that control.

There is also a narrower notion of control, referring to comparative experimental designs. Footnote 9 It primarily but not exclusively concerns the experiment’s epistemic side, or the conditions required for the experiment to generate knowledge. Modern scientists typically associate with “control experiment” a particular experimental strategy or design, namely the comparison to a control case. An experimental intervention is compared with a baseline; the target system of the intervention is compared with a similar target system that, unlike the experimental object, was not intervened on (the “control mouse,” say, which did not receive treatment). This strategy encapsulates the requirements for an experiment to be informative about cause-effect relations. Footnote 10

In the narrow sense, comparison to a baseline is needed to find out whether it really was the manipulation of this particular variable that made a difference to the experimental outcome. Footnote 11 Of course, the more similar the experimental situations are, the more informative the comparisons will be. Making informative comparisons thus requires control practices in the broader sense explained above, to ensure that the two experimental settings are stable, save the intervention.

We should avoid confusing the emergence of terms such as “control experiment” and “experimental control” in the scientific literature with the emergence of explicit discussions about control practices and strategies. The terms “control experiment,” “controlled experiment,” and “experimental control” are recent terms. Google Ngram shows a steep increase for “control experiment” in the last decades of the nineteenth century in English, French, and German-language scientific literature. Of course, Ngram is not a rigorous tracker for word usage, but based on its data, we can safely assume that control practices were common long before the term spread in scientific writing. Footnote 12 As our volume demonstrates, discussions about stable experiments antedate the appearance of the term “control” in this literature. Concerns about the adequate management of experimental settings were voiced as soon as experimentation became widespread. Robert Boyle, for one, published two famous essays on “unsucceeding” experiments, where he discussed the obstacles posed by impure chemicals, the variability of body parts in different corpses, and other issues threatening experimental success (Boyle 1999a , b ).

The history of experimental control, then, encompasses four distinct yet related strands. The first is the historical development of control practices to stabilize and standardize experimental conditions. The second is the emergence and career of the comparative design in experimentation, understood as a way of generating and securing knowledge of cause-effect relations. The third involves the unfolding, both in philosophy of science and in the sciences themselves, of methodological discussions on control practices and designs in experimental practice. The fourth is the history of the term “(experimental) control.”

This volume concerns itself most with the first three strands. We do not systematically explore the history of the term “control;” Footnote 13 in fact, several contributions discuss research from before the late nineteenth century. However, precisely because control practices and strategies predate the term “control” in scientific literature, we keep terminological questions in mind as we analyze past experimental reports and methodological discussions. We pay careful attention to the terms past practitioners did use, whatever they were, to describe, explain, and defend control practices and strategies.

The contributions here examine how control practices and comparative designs developed, and include past accounts of critiques and defenses for these practices. Control is a multifaceted and elusive concept, and our volume reflects this. We have not attempted to reduce our discussion to a single definition of “control.” Although this introduction provides some points of orientation for analyzing control practices and strategies, each contributor further explains the concept for specific experimental contexts. The chapters range over different fields, from botany and vision studies, ecology and plant physiology, human physiology and psychology to animal behavior and experimental physics. They cover a period from the early seventeenth to the twentieth century. They examine experiments with complex and sometimes unwieldy objects and elusive phenomena. Chapters deal with studies on learning and judgment; color blindness in animals; auditory perceptions of tones, pitch, and vowel sounds; irregular movements; psychic forces; unobservable elements; and the best “photogenic climate” for promoting photosynthesis. Experiments on such objects and phenomena are hard to design, stabilize, and carry out, and they are often controversial. For this reason, they showcase questions and reflections on control in science particularly well.

The very practice of creating and maintaining a stable experimental situation is old, arguably as old as experimental intervention itself. Over time, experimenters learn what must be managed and tracked in experimental contexts; they seek to localize the phenomena of interest as well as the elements of the experimental setting in order to make interventions more exact. Gradually they develop new tools to do this. Precision instruments, elaborate recording devices, and other technologies available in the last century or two can assist with these tasks. The history of research laboratories can be written as the history of efforts to create highly controlled research environments. Nineteenth-century physicists worked at night or retreated to the lab basement to escape city noise, vibrations from trams, and exuberant students (Hoffmann 2001 ). Today’s scientists turn to specialized construction companies when they need “clean rooms” for research. Footnote 14 All-metal or all-plastic labs are built for research into the impacts of micro-plastics on materials and tissues or on radiation, respectively. Particle physicists dive to recover radiation-free lead from ancient shipwrecks to prevent contaminating their measurements.

Such materials and technologies often make it easier to keep an experimental situation stable and to track interesting changes. Footnote 15 At the same time, however, closer analysis of actual episodes shows that advancements in instrumentation, impressive as they may appear in hindsight, do not guarantee improved control. In fact, obtaining control often becomes more difficult, not least because researchers must learn the instruments’ proper functioning. “The more finely a method of investigation operates, the more complicated the devices used must be,” as Carl Stumpf noted (1926, 8). Footnote 16

Moreover, the history of control is a history of efforts—and efforts can fail. Implementing control strategies often fails, as even the experimenters themselves sometimes admit. Our volume illustrates how difficult it can be to manage an experimental setting, how resourceful some experimenters were in their management, and how they sometimes failed to achieve it despite intense effort. Claudia Cristalli’s researchers of psychic phenomena walk the line between controlling the psychic powers of the “percipients” in their experiments, and preventing them from sensing any phantasms at all. Christoph Hoffmann’s study of color blindness in fish shows how experimenters dealt with the tricky problem of controlling animals’ behavior. Experimenters found different solutions, both difficult to implement and neither completely satisfying. One option was to train the fish—much more challenging to do than training, say, a dog or rat. The other was to design the experimental setting in such a way that the “normal” behavior of the fish was taken into account when the behavior of interest was elicited. But what is the “normal” behavior of fish? And how can it be accommodated in the unnatural environment of a laboratory fish tank?

Other contributions illustrate how experimenters approached the creation and monitoring of an experimental setting. They discuss the multifaceted nature of the associated problems and the obstacles the experimenters had to overcome when attempting to stabilize unwieldy things, such as the irregular movements of microscopic parts, the germination, sprouting, and growth of plants, and auditory perceptions. The contributions describe the solutions they found to these problems. Experimenters tried their best to identify the smallest details of the experimental settings deemed relevant, and sometimes invented remarkably elaborate contraptions to keep them stable.

Caterina Schürch depicts the curious machines with which eighteenth-century plant physiologists tried to electrify plants and seeds with precise doses of electricity. Kärin Nickelsen shows how the nineteenth-century plant physiologist Julius Wiesner designed an artificial environment for his plants: double-walled glass jars, with the space between the walls filled with a solution of iodine in carbon disulphide. Because this liquid layer absorbed all visible light but let heat rays pass, Wiesner could examine the impact of those rays on plant growth. Julia Kursell describes the giant arrangement of tubes Carl Stumpf erected to compare how his experimental subjects perceived natural and machine-generated vowels. She notes that, according to Stumpf, the increased finesse of experimental tasks required ever more complex experimental devices. Cristalli shows how Faraday, attempting to stop participants in table-turning experiments from making involuntary movements, designed a device consisting of a stack of cardboard sheets, arranged like a voltaic pile, with pellets of wax in between. The device would be placed between the hands of the séance participants and the tabletop. The sheets were arranged and marked in such a way that their displacement would indicate hand movements prior to the table’s movement.

These devices often astonish with their ingenuity, but the point is that they are the material realizations of what experimenters recognized as the relevant conditions and potential confounders for their experiments. They are therefore purpose-dependent, as Kursell notes; at the same time, they both constitute and constrain the generation of experimental knowledge. Cristalli’s, Schürch’s, Nickelsen’s and Evan Arnet’s chapters demonstrate this constraint: over time, views about what factors to manipulate, keep fixed, or monitor in controlled experiments might change considerably, even within a single research tradition. While Faraday built tools to control his subjects’ involuntary movements, his American colleague and erstwhile admirer Robert Hare turned to designing machines that would prevent voluntary movements in psychic experiments—in other words, to prevent fraud.

Schürch’s account illustrates a most dramatic change of focus. After decades of carefully controlled experimentation, which supported the view that electrification promotes plant growth, Jan Ingen-Housz showed, using the same control strategies, that it was not electricity but differences in light intensities that affected the plants. He thus re-oriented the entire research program of plant growth, rendering previously “well-controlled” experiments uncontrolled.

Similarly, in maze research on animal learning, later investigators critiqued their predecessors for stabilizing—“controlling for”—the very phenomenon they should have studied, as Arnet’s work illustrates. Nickelsen shows how control practices in photosynthesis research changed fundamentally as the experiments moved from the laboratory to the field. As she observes, the changes were not just practical—measuring natural light is harder than measuring laboratory light—but also conceptual. What mattered was no longer just “daylight,” but a complex set of factors consisting of the specific light individual plant parts received, intensity fluctuations during the day and the season, and so forth. Klodian Coko charts another kind of reorientation in his study of research on Brownian movement. Using the strategy of comparative experimentation, nineteenth-century researchers tried to establish what could and could not be the cause of Brownian movement. Later in the century, Brownian movement itself became evidence for a new kinetic-molecular theory of matter, which changed the understanding of rigor and experimentation.

Several chapters also direct attention to the fact that many experimenters were explicitly concerned with developing coping strategies for “limited beings” (Wimsatt 2007 ) in sub-optimal situations. Researchers faced challenges not only because background factors were difficult or too numerous to monitor, but also because those factors were not immediately observable. Remarkably, the physicist Lord Rayleigh devoted several of his public-facing remarks to the theme of “deficient rigor.” As Vasiliki Christopoulou and Theodore Arabatzis point out, for Rayleigh, the pursuit of absolute (“mathematical”) rigor could even be detrimental to progress in physics. It was in this situation that experimenters insisted on using two or more different experimental techniques to check if both converged on the same outcomes, as detailed in the contributions by Christopoulou and Arabatzis and by Coko.

Notably, experimenters developed strategies to guard against entirely unknown influences on their experiments. The notion that natural phenomena in an experiment might occur and not occur in unforeseeable ways is centuries old. The metaphysical interpretation of this notion has changed dramatically over time (Hacking 1984, 1990), but there was wide and long-standing agreement about how to address it: namely, through multiple repetitions of experimental trials. Both the early seventeenth-century experimenter Scheiner and the late nineteenth-century experimenter Rayleigh gave the idea of multiple repetitions an important role in rigorous experimentation, if for different reasons.

In an early essay on medical experience, the ancient physician and anatomist Galen discussed the possibility that what is seen only once in a patient may not be a regular occurrence, and thus may not be worthy of acceptance and belief. Galen suggested this point in the middle of his attempt to demonstrate that medical practice is not just logos, but also experience. Footnote 17 As part of the argument, Galen alluded to the instability of memory and also noted that medicines work sometimes but not always (Galen 1944). In clinical medicine, at least, one single drug test might not produce reliable results, because “some things are frequent and some are rare” (Galen 1944, 113). It must therefore be repeated several times, and even then, it may not tell us what is usually the case. Footnote 18 Ibn Sīnā (Avicenna) expressed a similar idea in a proposal for rules of drug testing, albeit with a positive spin. He wrote that “the effect of the drug should be the same in all cases or, at least, in most. If that is not the case, the effect is then accidental, because things that occur naturally are always or mostly consistent” (Nasser et al. 2009, 80).

In the early modern period, we encounter this idea frequently, now also in discussions about experimentation beyond drug testing in clinical medicine. Repeating experimental trials several times, indeed “very many times,” became an imperative for rigorous experimentation—in this way, unknown or contingent and accidental influences on experiments could be avoided. Footnote 19 In later centuries it was to become a hallmark of rigorous experimentation that a trial be done more than once or on large samples. Footnote 20 However, as Schürch’s chapter shows, the appropriate number of repetitions remained contested.

Scholars looking for the “first” control experiment in the history of scientific inquiry typically assume, but in most cases tacitly, the narrower notion of “control” as comparative trial. They have found quite early examples for comparative designs in experimental practice. These examples often come from medicine, where it is both vitally and commercially important to discover the efficacy of certain drugs and treatments. The reputation of a practitioner depended on the treatments’ success.

For example, historian of statistics Stephen Stigler finds an instance of comparative experimentation in the Old Testament, in the Book of Daniel (around 164 BCE). Servants on a vegetarian diet are compared with children who eat “the king’s meat”: “And at the end of ten days their countenances appeared fairer and fatter in flesh than all the children which did eat the portion of the king’s meat” (Daniel 1:5–16). Footnote 21

A passage by Athenaeus (200 CE) describes how some convicted criminals had been thrown among asps and survived. It turned out that they had been given lemons prior to their punishment. The next day a piece of lemon was given to one convict but not to another. The one who ate the lemon survived the bites, the other died instantly. Footnote 22 The pseudo-Galenic treatise on theriac describes a trial with a similar design, whereby two birds would be poisoned and only one given an antidote (Leigh 2013). The trial tests the efficacy of medicines: if both animals survived, the tested antidote was recognized to be ineffectual. That experiment was again reported in the Middle Ages, notably by Bernard Gordon (McVaugh 2009).

Another famous ancient example is the legend of Pythagoras. As the story goes, he observed that most combinations of blacksmiths’ hammers generated a harmonious sound when striking anvils at the same time, while some did not. Pythagoras discovered that harmonious sounds were produced by those hammers whose masses were simple ratios of each other, while other hammers made dissonant noises when struck simultaneously. Notably, Ptolemy later criticized the Pythagorean experiment because, to him, it lacked control (Zhmud 2012, 307).

The Pythagorean case is interesting. It clearly has a comparative component, inspecting the sound of hammers whose masses were simple ratios of each other and that of other hammers. But in the historiography of science it does not serve as an example of an early “control experiment.” In fact, the ancient texts have too little information to determine whether it was consciously performed as an experiment compared with a control, whether Pythagoras simply varied the setup, or whether he arrived at his conclusions by observing different blacksmiths at work.

Conscious and explicit implementation of comparative designs appears to become more common in seventeenth- and eighteenth-century experimental practice. In his studies on the generation of insects, Francesco Redi famously compared samples of organic materials, namely “a snake, some fish, some eels of the Arno, and a slice of milk-fed veal in four large, wide-mouthed flasks” (Redi 1909, 33), kept in open and closed containers. The samples were periodically inspected for traces of life. No life developed in closed containers, which Redi took as evidence against the spontaneous generation of maggots from putrefying flesh. Here, the comparative design demonstrates a cause-effect relation through the comparison with a “control.” Redi showed that maggots in open containers were generated by flies’ eggs. Footnote 23

The case of spontaneous generation research illustrates particularly well why it is useful to distinguish between comparative design strategies and a broader notion of control as management of the experimental setting. Redi’s experimental research was not decisive, and after him many other experimenters investigated spontaneous generation. They all contested each other’s experiments and many argued that their opponents had not properly maintained the experimental settings; they also argued that they themselves really had taken the necessary precautions to do so. John T. Needham, for instance, claimed that he could demonstrate the spontaneous generation of animalcules in infusions. He told his readers that he had “neglected no Precaution, even as far as to heat violently in hot Ashes the Body of the Phial; that if any thing existed, even in that little Portion of Air which filled up the Neck, it might be destroy’d, and lose its productive Faculty” (Needham 1748, 638). Notably, he did not report a comparison with a vial that had not been heated in fire. It may have been superfluous to him, because it was obvious that animalcules would appear in it, as so often had been observed. The debates continued throughout the nineteenth century. Experimental designs and interpretations for possible contaminants varied, but the comparative strategy generally remained the strategy of choice. Footnote 24 As Schürch’s contribution shows, in the decades around 1800, experimenters across Western Europe advocated comparative experimental designs.

Reports of comparative trials can be found in many fields, from agriculture to clinical medicine. Footnote 25 A notable but little-studied example is steeping experiments (Pastorino 2022). A comparative experiment by Francis Bacon served as a template for many subsequent experiments on how steeping seeds in various fluids affects plant growth.

Our volume illustrates comparative trial designs in plant physiology, physics, animal behavior studies, and psychology. The episodes exemplify both the conscientious application of these strategies and the obstacles experimenters faced as they attempted to realize well-controlled comparative trials.

The earliest pre-modern reports of experimental trials and comparative designs contain little explicit discussion of control practices and strategies. There are exceptions, of course, especially in medical contexts. I already noted Galen’s writings, and we know that medieval scholars such as Ibn Sīnā developed rules for drug testing (Crombie 1952). Mostly, however, comparative designs were simply described and rarely justified; there was little explicit concern with managing the details of experimental settings. When ancient and medieval authors noted the drug test on two birds, they surely meant to show a test to support the drug’s efficacy, but the argument for the comparative approach often remained implicit. In modern scientific writing, by contrast, we sometimes find detailed discussions and justifications of experimental designs: in controversies about experimental results, in debates about the status of heterodox scientific fields such as research on psychic phenomena, and in situations of uncertainty.

In this volume, Tawrin Baker’s chapter on Scheiner and Christopoulou and Arabatzis’s chapter on Rayleigh epitomize both the scarcity and the abundance of practitioners’ discourse on their control practices and strategies. Scheiner demonstrated to his readers how experimentation could serve as a legitimate check on a theory of vision. He did not expound or defend methodological ideas in detail, although he did focus attention on the process of experimentation. Words and pictures conveyed the experimental setups. Scheiner instructed his readers to make certain experiences and experiments; he discussed the implications for the theory of vision. However, as Baker notes, several issues remained open, such as how often an experiment should be repeated or how one ought to deal with discrepancies. Christopoulou and Arabatzis’s chapter on Rayleigh shows that late-nineteenth-century scientists wrote not only about the details of their experiments but also about experimental control. Experimenters drew attention to how they had re-designed instruments to make their measurements more precise and how they had employed additional instruments to check the quality of their measurements. They often insisted on using two measurement methods to guard against error.

We still know little about the unfolding of methodological discussions in the centuries after Scheiner’s appeal to a variety of experiences and experiments and Boyle’s musings on unwieldy, “uncontrolled” experimental settings and about the practices appropriate for managing and extracting knowledge from these settings. Little is known about the emergence of explicit methodologies for comparative trials. According to some scholars, notably Edwin Boring, it was not until the mid-nineteenth century that we find such explicit methodologies. Boring associated the first methodology of comparative experimental designs with a philosophical text, John Stuart Mill’s System of Logic (Boring 1954). While the contributions to our volume do not tell a comprehensive history of methodological accounts on experimental control, they do suggest that it would be misleading to identify Mill as the sole originator and principal representative of these accounts. Footnote 26 As Schürch’s, Coko’s and Nickelsen’s chapters demonstrate, Mill was one of several early-nineteenth-century commentators on science who urged investigators to keep background conditions constant across trials, to “analyze” the background into different experimental conditions, and to compare the effects of interventions in one setting to another setting left untouched. But a broader history of these developments would still be desirable.

Our volume also shows that reflections about and justifications of control strategies predate modern philosophies of science. From Schürch’s study of late-eighteenth-century plant physiology we learn that, prior to Mill, practitioners not only called for rigorous and properly managed interventions, but also did much more: they reflected on control practices as validation procedures and debated their relative merits, practicality, and limitations. They observed that, to be instructive, comparisons must be made on sufficiently similar experimental subjects in similar situations. At times they disagreed about whether they or their colleagues had done enough to control their experiments. They criticized each other for not making comparative trials, for not controlling the right thing, or for not repeating a trial often enough.

The content of these debates and reflections tells us something about the experimenters’ own understanding of methodological issues concerning control, rigor, reliability, certainty, and failure in experimentation. Christopoulou and Arabatzis’s and Coko’s chapters illustrate this. As many contributors show, satisfactory control of an experiment is, in the end, an intersubjective, iterative achievement. Schürch and Christopoulou and Arabatzis note that experimenters such as Ingen-Housz and Rayleigh call upon others to check the results they themselves had obtained and to contribute additional experiments. Footnote 27 Cristalli charts the decades-long negotiations and re-negotiations among physicists, chemists, and psychologists on experimental practices deemed adequate to study psychic phenomena. The experimenters understood that their projects’ success depended on “controlling” their interlocutors as well. Footnote 28

This volume does not aim to replace earlier systematic discussions in history and philosophy of science on these issues, such as those on epistemological strategies of experimentation (Allan Franklin), tests for error (Deborah Mayo), representing and intervening (Ian Hacking), and how experiments end (Peter Galison). Our volume complements them. In fact, our discussions overlap with these approaches as we trace the history of controls while keeping epistemological strategies of experimentation in mind. We do contend that re-directing attention to control practices, control strategies, and practitioners’ accounts thereof illuminates new aspects of the history of experimental practices.

Control strategies and practices can be viewed as long-term and short-term methodological commitments, along the lines suggested by Peter Galison (1987). Arnet’s contribution to this volume uses this approach. Material and conceptual organizations of experiments vary, as do the identification of target systems, conditions, and confounders. The tools for stabilizing them change as well and are often (but by no means always!) local, context-specific, and relatively short-lived. Modern technologies allow for creative and sometimes intricate solutions to the problems of stabilization, standardization, and tracking. Yet the strategies have long been in place.

Control strategies are persistent. Even in the most complicated settings and with the most elusive phenomena, experimenters try to implement established control strategies as best they can, as shown in Schürch’s study of plant electrification, Coko’s discussion of experiments on Brownian movement, Cristalli’s study of psychic experiments, Kursell’s work on elusive auditory judgments, and Nickelsen’s discussion of plant physiology. Experimenters look for experimental conditions and confounding factors; they vary them to weigh their influence on experimental processes; they probe for error (Mayo 1996); they make their interventions less “fat-handed” (Woodward 2008); they compare situations meant to be similar and assess robustness, presupposing the no-miracle argument (Hacking 1985). At the same time, they develop specific, contextual implementations for these strategies, and they do not always agree on whether a particular implementation is effective.

In doing all this, experimenters face both technical and conceptual challenges. It may take a long time to harness experimental conditions, identify potential confounders, and find suitable techniques for doing so. Solutions to control problems will typically remain less than ideal. Hoffmann’s contribution demonstrates this fragility in control procedures. In debates about spontaneous generation, it took centuries to refine the tools to prevent contaminations from reaching the materials under investigation, and every new tool generated new issues for further exploration. Along the way, the understanding changed regarding the causes, conditions, and potential modifying factors and confounders. New technical challenges arose as a result.

Several chapters show that the implementation of control strategies may generate entirely new technical and conceptual problems for the experimenter, or even produce “surplus findings,” as Kursell writes. Footnote 29 Nickelsen, for instance, tracks changes in both the conceptualization and the logistics of managing background conditions for experiments on the influence of light on plant growth. Christopoulou and Arabatzis suggest that disturbances in physics experiments could become research topics in their own right. Arnet’s work also brings into relief the problematic implications of an over-emphasis on rigor and control. Early mazes were designed as simple systems of tracks in order to minimize environmental cues. But for a more complete understanding of animal learning, later researchers re-introduced precisely those same environmental features. The early mazes embodied a regime of control that stripped animals of certain sensory and environmental cues. Those mazes, however, excluded exactly those features that later researchers thought essential to advanced rodent learning. Footnote 30

Finally, several chapters suggest that it is fruitful to think of experiments as “controls of inferences,” because this perspective also brings out relevant methodological issues and their historical development. As Baker demonstrates, for early modern experimenters coming to grips with their Aristotelian heritage, the role of experiments in scientific inquiry was a crucial issue. In hindsight, studying how they managed this issue can also tell us something about Aristotle’s own ideas on the role of experimentation in empirical inquiry. For eighteenth- and nineteenth-century inquirers, then, the question is not so much whether but how, exactly, experimentation and experimentally generated knowledge can help us to understand nature. Steinle, Coko, Nickelsen, Kursell, and Hoffmann show how intricate the question can be as experiments target unobservable phenomena. As these experiments involve increasingly complicated instruments, hypotheses, assumptions, chains of inferences, and interpretations, the challenges for experimenters increase accordingly.

We place practitioners’ methodologies, experimental designs, strategies of inquiry, and practices of implementation in the center of our analyses. We thereby draw new trajectories and connections in the history of experimental inquiry. We identify lines of experimentation that sometimes turned into models of rigorous experimental design and at other times became targets of criticism. Bacon’s steeping experiments with plant seeds, as analyzed by Pastorino, exemplify a specific kind of comparative experimentation. It would be applied again and again throughout the eighteenth century, not just in plant science but also in other scientific fields. Pythagoras’ hammer experiments too were repeated, or at least repeatedly reported, by several scholars prior to Galileo and Mersenne. In this case, the design was not a model but a point of critique for later scholars.

Our studies on control practices and on their discussion and justification have revealed other lineages and cross-fertilizations—among physics and psychology, physiology, botany and ethology, chemistry, medicine, agriculture, and philosophy. Control practices and strategies are contextual, in that the context determines what is controlled and how to achieve control. But control strategies and at times even control practices are not discipline-specific. The same strategies travel across disciplines, from physics to medicine and physiology to chemistry and back again. Several chapters suggest that the same methodological ideas and control strategies are advocated across national boundaries (see especially Schürch and Coko). Control strategies such as comparative designs and multiple repetitions are relatively stable across historical periods. But they may be justified in different ways at different times and may cease to be justified at all.

With our work, we hope to stimulate broader discussions about the longer-term history of rigorous experimentation: what are the strategies involved in it? And how do debates concerning well-designed experiments unfold in different fields and periods? By our effort we seek to clarify the roles of experimental strategies and methodologies as driving forces for scientific change, and as tools for determining what it means to do—or not to do—good science.

This volume (and its companion, a collection of essays on analysis and synthesis) originated in a Sawyer Seminar at Indiana University Bloomington titled “Rigor: Control, Analysis and Synthesis in Historical and Systematic Perspectives,” which was funded by the Andrew W. Mellon Foundation. Mellon Sawyer Seminars are temporary research centers, gathering together faculty, postdoctoral fellows, and graduate students for in-depth study of a scholarly subject in reading groups, seminars, and workshops. As part of our activities, we organized two international conferences. They brought together scholars in history, philosophy, and social studies of science who examine historical and contemporary dimensions of rigor in experimental practice. The contributors to this volume participated in the second of the Sawyer conferences (March 2022) and reconvened a few months later for an authors’ workshop, at which the draft chapters for this volume were intensely discussed.

Several institutions and individuals helped to make our work possible. We gratefully acknowledge the Mellon Foundation’s generous financial support, and especially the Foundation’s flexibility as we dealt with the challenges of pursuing collaborative scholarship during a pandemic. We are grateful to Director of Foundation Relations Cory Rutz at Indiana University’s Office of the Vice President for Research, for his prompt and efficient assistance in administering the grant. The authors’ workshop took place at the IU Europe Gateway (Berlin) and was funded by a combined grant from the IU College of Arts and Sciences and the College Arts and Humanities Institute. We very much appreciate this support. We are indebted to Jed Buchwald for including our work in the Archimedes series, and to Chris Wilby for his efforts in moving the publication along. A big thank you to our department manager Dana Berg (Department of History and Philosophy of Science and Medicine at IU), office assistant Maggie Herms (IU HPSC), and Andrea Adam Moore (IU Europe Gateway), all of whom helped to organize our conferences and workshops. Finally, we warmly thank the many participants at the two conferences and at the various other Sawyer events for their valuable input, comments, questions, and critique.

This is slowly changing; see Guettinger (2019); Sullivan (2022); Desjardins et al. (2023).

Only the randomized controlled trial has been studied historically and systematically. See Marks (1997, Chap. 5); Worrall (2007); Cartwright (2007); Keating and Cambrosio (2012). For the control group and (double) blind tests, see Kaptchuk (1998); Strong and Frederick (1999, including further references); Dehue (2005); Holman (2020).

For a variety of views, see, for instance, McCartney (1942); Beniger (1986); Levin (2000a, 13–14); Amici (2001).

Two classic studies of how experimenters sought to “control” their audiences are Shapin and Schaffer (1985) and Geison (1995, especially Chap. 5).

These ideas are also articulated in the philosophy of science, of course. In this volume, however, we are concerned mostly with practicing experimenters’ working philosophies.

Some historians have strong reservations about long-term histories “lining up unconnected look-alikes through the ages” (Dehue 2005, 2), or “ahistorical narratives” comparing, for instance, early modern and Victorian experiments “merely because of superficial similarity ‘in the use of controls’” (Strick 2000, 5, commenting on spontaneous generation experiments). Our volume shows that it is possible to write long-term histories without comparing apples to oranges.

These distinctions are inspired by one of the few systematic studies of controlled experimentation, Edwin Boring’s “The Nature and History of Experimental Control” (Boring 1954).

This insight underlies Ludwik Fleck’s and Thomas Kuhn’s accounts of scientific change.

Comparison, Boring noted, “appears in all experimentation because a discoverable fact is a difference or a relation, and a discovered datum has significance only as it is related to a frame of reference, to a relatum” (Boring 1954, 589).

For the epistemic ideal underlying this design, the “perfectly controlled experiment,” see Guala (2005, 65–69).

I keep this characterization vague because I do not want to commit to a specific philosophical understanding of causality here.

Technical terms such as “positive” and “negative” control are even more recent (and outside the timeframe of our volume). They are also poorly understood.

For a brief overview of historical definitions of control, see Levin (2000a, 21–31).

See Holbrook (2009).

See, e.g., Kuch et al. (2020).

The quotation is drawn from Kursell’s chapter in this volume.

Much of the text rebuts the sorites argument, according to which it is impossible to clarify the notion of seeing something “very many times” (see Galen 1944, 124–25). For a reconstruction of the argument, see Kupreeva (2022).

For the Aristotelian notion of the memory of many instances, see Bayer (1997). For its application in the scholastic-mathematical tradition, see Dear (1991).

On repetition and “many, many” trials, see Schickore (2017, chapters 1–3).

A popular passage by Karl Popper expresses this idea: “Every experimental physicist knows those surprising and inexplicable apparent ‘effects’ which in his laboratory can perhaps even be reproduced for some time, but which finally disappear without trace. Of course, no physicist would say in such a case that he had made a scientific discovery (though he might try to rearrange his experiments so as to make the effect reproducible). Indeed the scientifically significant physical effect may be defined as that which can be regularly reproduced by anyone who carries out the appropriate experiment in the way prescribed. No serious physicist would offer for publication, as a scientific discovery, any such ‘occult effect,’ as I propose to call it—one for whose reproduction he could give no instructions. The ‘discovery’ would be only too soon rejected as chimerical, simply because attempts to test it would lead to negative results. (It follows that any controversy over the question whether events which are in principle unrepeatable and unique ever do occur cannot be decided by science: it would be a metaphysical controversy)” (Popper 2002 , 23–24).

This example is also quoted on the website of the Institute for Creation Research as a model for sound experimental design (Treece 1990).

Deipnosophists or Banquet of the Learned, 3.84 d-f:2. The reference is from McCartney (1942, 5–6).

For details on Redi’s experiments, see Parke (2014). Historians of biology as well as science educators regularly cite Redi’s experiments on spontaneous generation as “the first control experiments.”

In his well-known book on Pasteur, Gerald Geison drew on Pasteur’s experiments with infusions to show that the negotiations of what does and does not count as a properly controlled experiment in the spontaneous generation debates turned into battles motivated by political and religious concerns. Geison argues that Pasteur effectively “controlled” his audiences (Geison 1995).

Bertoloni Meli (2009) describes many other comparative experiments from the early modern period. See also Schickore (2021).

Our volume focuses on practitioners’ methodological accounts. However, even in philosophy of science, Mill had predecessors in this regard: Dugald Stewart and John Herschel, for instance, cover territory very similar to Mill’s four methods of experimental inquiry.

For another example of appeals to the community in the struggle to identify the causes of blue milk, see Schickore (2023, 37).

See Schürch’s discussion of Ingen-Housz in this volume, for example.

For another example of how control practices themselves become the object of study, see Landecker (2016).

Researchers today have identified other areas of concern for over-emphasizing rigor and control. One example is over-standardized mice (Engber 2013), and these studies highlight the importance of balancing control with other demands on research design. In public health studies, researchers must overcome barriers for recruitment, attrition, and sample size, which may necessitate lowering the bar for rigor to gather any valuable information at all (Crosby et al. 2010). Thus, the implication of an over-emphasis on rigor may be epistemic, socio-political, or both.

Amici, Raffaele Roncalli. 2001. The History of Italian Parasitology. Veterinary Parasitology 98: 3–30.

Bayer, Greg. 1997. Coming to Know Principles in Posterior Analytics II 19. Apeiron 30: 109–142.

Beniger, James. 1986. The Control Revolution. Technology and Economic Origins of the Information Society. Cambridge, MA: Harvard University Press.

Bennett, S. 1993. A History of Control Engineering. Wiltshire: Redwood Books.

Bertoloni Meli, Domenico. 2009. A Lofty Mountain, Putrefying Flesh, Styptic Water, and Germinating Seeds. In The Accademia del Cimento and its European Context, ed. Marco Beretta, Antonio Clericuzio, and Larry Principe, 121–134. Sagamore Beach: Science History Publications.

Boring, Edwin Garrigues. 1954. The Nature and History of Experimental Control. American Journal of Psychology 67: 573–589.

Boyle, Robert. 1999a. The First Essay, of the Unsuccessfulness of Experiments. In The Works of Robert Boyle, ed. Michael Hunter and Edward B. Davies, 37–56. London: Pickering & Chatto. Original edition, 1661.

Boyle, Robert. 1999b. The Second Essay, of Unsucceeding Experiments. In The Works of Robert Boyle, ed. Michael Hunter and Edward B. Davies, 57–82. London: Pickering & Chatto. Original edition, 1661.

Cartwright, Nancy. 2007. Are RCTs the Gold Standard? BioSocieties 2: 11–20.

Crosby, Richard, Laura F. Salazar, Ralph DiClemente, and Delia Lang. 2010. Balancing Rigor Against the Inherent Limitations of Investigating Hard-to-Reach Populations. Health Education Research 25: 1–5.

Dear, Peter. 1991. Narratives, Anecdotes, and Experiments: Turning Experience into Science in the Seventeenth Century. In The Literary Structure of Scientific Argument: Historical Studies, ed. Peter Dear, 135–163. Philadelphia: University of Pennsylvania Press.

Dehue, Trudy. 2005. History of the Control Group. In Encyclopedia of Statistics in Behavioral Science, ed. Brian S. Everitt and David C. Howell, 829–836. Chichester: Wiley.

Derksen, Maarten. 2017. Histories of Human Engineering: Tact and Technology. Cambridge: Cambridge University Press.

Desjardins, Eric, Derek Oswick, and Craig W. Fox. 2023. On the Ambivalence of Control in Experimental Investigation of Historically Contingent Processes. Journal of the Philosophy of History 17 (1): 130–153.

Dierig, Sven. 2006. Wissenschaft in der Maschinenstadt. Emil Du Bois-Reymond und seine Laboratorien in Berlin. Berlin: Wallstein Verlag.

Engber, Daniel. 2013. The Mouse Trap. How one Rodent Rules the Lab. Slate. http://www.slate.com/articles/health_and_science/the_mouse_trap/2011/11/the_mouse_trap.html.

Foucault, Michel. 1975. The Birth of the Clinic: An Archaeology of Medical Perception. New York: Vintage Books.

———. 1979. Discipline and Punish: The Birth of the Prison. New York: Vintage.

Frängsmyr, Tore, J.L. Heilbron, and Robin E. Rider. 1990. The Quantifying Spirit in the Eighteenth Century. Berkeley, CA: University of California Press.

Franklin, Seb. 2015. Control: Digitality as Cultural Logic. Cambridge, MA: MIT Press.

Galen. 1944. Galen on Medical Experience. First edition of the Arabic ed. New York [etc.]: Pub. for the trustees of the late Sir Henry Wellcome by the Oxford University Press.

Galison, Peter. 1987. How Experiments End. Chicago, IL: University of Chicago Press.

———. 2010. Secrecy in Three Acts. Social Research: An International Quarterly 77: 941–974.

Geison, Gerald. 1995. The Private Science of Louis Pasteur. Princeton, NJ: Princeton University Press.

Gigerenzer, Gerd, Zeno G. Swijtink, Theodore M. Porter, Lorraine Daston, John Beatty, and Lorenz Krüger. 1989. The Empire of Chance: How Probability Changed Science and Everyday Life. Cambridge: Cambridge University Press.

Guala, Francesco. 2005. The Methodology of Experimental Economics. Cambridge/New York: Cambridge University Press.

Guettinger, Stephan. 2019. A New Account of Replication in the Experimental Life Sciences. Philosophy of Science 86: 453–471.

Hacking, Ian. 1984. The Emergence of Probability: A Philosophical Study of Early Ideas about Probability, Induction and Statistical Inference. Cambridge: Cambridge University Press.

———. 1985. Do We See Through a Microscope? In Images of Science, ed. P.M. Churchland and C.A. Hooker, 132–152. Chicago and London.

———. 1990. The Taming of Chance. Cambridge: Cambridge University Press.

Hoffmann, Christoph. 2001. The Design of Disturbance: Physics Institutes and Physics Research in Germany, 1870–1910. Perspectives on Science 9: 173–195.

Holbrook, Daniel. 2009. Controlling Contamination: The Origins of Clean Room Technology. History and Technology 25 (3): 173–191. https://doi.org/10.1080/07341510903083203.

Holman, Bennett. 2020. Humbug, the Council of Pharmacy and Chemistry, and the Origin of “The Blind Test” of Therapeutic Efficacy. In Uncertainty in Pharmacology: Epistemology, Methods and Decisions, ed. Barbara Osimani and A. Lacaze, 397–416. Dordrecht: Springer.

Hughes, Thomas Parke. 1983. Networks of Power: Electrification in Western Society, 1880–1930. Baltimore, MD: Johns Hopkins University Press.

Kaptchuk, Ted. 1998. Intentional Ignorance: A History of Blind Assessment and Placebo Controls in Medicine. Bulletin of the History of Medicine 72: 389–433.

Keating, Peter, and Alberto Cambrosio. 2012. Cancer on Trial. Chicago, IL: University of Chicago Press.

Kuch, Declan, M. Kearnes, and K. Gulson. 2020. The Promise of Precision: Datafication in Medicine, Agriculture and Education. Policy Studies 41 (5): 527–546. https://doi.org/10.1080/01442872.2020.1724384.

Kupreeva, Inna. 2022. Galen’s Empiricist Background: A Study of the Argument in On Medical Experience. In Galen’s Epistemology: Experience, Reason, and Method in Ancient Medicine, ed. Matyáš Havrda and R.J. Hankinson, 32–78. Cambridge: Cambridge University Press.

Landecker, Hannah. 2016. It Is What It Eats: Chemically Defined Media and the History of Surrounds. Studies in History and Philosophy of Biological and Biomedical Sciences 57: 148–160.

Leigh, Robert Adam. 2013. On Theriac to Piso, Attributed to Galen. A critical edition with translation and commentary. Exeter: Department of Classics, University of Exeter.

Levin, Miriam. 2000a. Contexts of Control. In Cultures of Control, ed. Miriam Levin, 13–39. Amsterdam: Harwood Academic Publishers.

———, ed. 2000b. Cultures of Control. Amsterdam: Harwood Academic Publishers.

Marks, Harry M. 1997. The Progress of Experiment. Science and Therapeutic Reform in the United States, 1900–1990. Cambridge: Cambridge University Press.

Mayo, Deborah G. 1996. Error and the Growth of Experimental Knowledge. Chicago: University of Chicago Press.

McCartney, Eugene S. 1942. A Control Experiment in Antiquity. The Classical Weekly 36: 5–6.

McVaugh, Michael. 2009. The ‘Experience-Based’ Medicine of the Thirteenth Century. In Evidence and Interpretation in Studies on Early Science and Medicine, ed. Edith Sylla and William R. Newman, 105–130. Leiden: Brill.

Nasser, M., A. Tibi, and E. Savage-Smith. 2009. Ibn Sīnā’s Canon of Medicine: 11th Century Rules for Assessing the Effects of Drugs. Journal of the Royal Society of Medicine 102: 78–80.

Needham, John Turbervill. 1748. A Summary of Some Late Observations upon the Generation, Composition, and Decomposition of Animal and Vegetable Substances; Communicated in a Letter to Martin Folkes, Esq.; President of the Royal Society, by Mr. Turbervill Needham, Fellow of the Same Society. Philosophical Transactions of the Royal Society of London 45: 615–666.

Parke, Emily C. 2014. Flies from Meat and Wasps from Trees: Reevaluating Francesco Redi’s Spontaneous Generation Experiments. Studies in History and Philosophy of Biological and Biomedical Sciences 45: 34–42.

Pastorino, Cesare. 2022. Francis Bacon’s Controlled Experiments on Seed Steeping and Germination: Their Context, Circulation and Methodological Significance. Paper presented at the conference Control Practices in Historical and Systematic Perspectives, Indiana University Bloomington, March 2022.

Paul, Diane B. 1995. Controlling Human Heredity. Atlantic Highlands, NJ: Humanities Press International.

Pauly, Philip J. 1987. Controlling Life. Jacques Loeb & the Engineering Ideal in Biology. Oxford: Oxford University Press.

Popper, Karl. 2002. The Logic of Scientific Discovery. London and New York: Routledge.

Porter, Theodore M. 1995. Trust in Numbers. The Pursuit of Objectivity in Science and Public Life. Princeton, NJ: Princeton University Press.

Rabinbach, Anson. 1990. The Human Motor: Energy, Fatigue, and the Origins of Modernity. Berkeley, CA: University of California Press.

Rader, Karen. 2004. Making Mice: Standardizing Animals for American Biomedical Research. Princeton, NJ: Princeton University Press.

Redi, Francesco. 1909. Experiments on the Generation of Insects, 1688. Chicago, IL: Open Court.

Schickore, Jutta. 2017. About Method: Experimenters, Snake Venom, and the History of Writing Scientifically. Chicago, IL/London: University of Chicago Press.

———. 2021. Methodological Ideas in Past Experimental Inquiry: Rigor Checks Around 1800. Intellectual History Review. https://doi.org/10.1080/17496977.2021.1974225.

———. 2023. Peculiar Blue Spots: Evidence and Causes around 1800. In Evidence: The Use and Misuse of Data, ed. The American Philosophical Society, 31–55. Philadelphia, PA: American Philosophical Society.

Schlager, Neil. 1995. Breakdown: Deadly Technological Disasters. Detroit, MI: Visible Ink Press.

Seppel, Marten, and Keith Tribe. 2017. Cameralism in Practice: State Administration and Economy in Early Modern Europe. Rochester, NY: Boydell & Brewer.

Shapin, Steven, and Simon Schaffer. 1985. Leviathan and the Air-Pump. Hobbes, Boyle, and the Experimental Life. Princeton, NJ: Princeton University Press.

Strick, James. 2000. Sparks of Life. Darwinism and the Victorian Debates over Spontaneous Generation. Cambridge, MA: Harvard University Press.

Strong, Frederick C., III. 1999. The History of the Double Blind Test and the Placebo. Journal of Pharmacy and Pharmacology 51: 237–238.

Sullivan, Jacqueline A. 2022. Novel Tool Development and the Dynamics of Control: The Rodent Touchscreen Operant Chamber as a Case Study. Philosophy of Science 89 (5): 1–19.

Treece, James W. 1990. Daniel and the Classic Experimental Design. Accessed January 29, 2023. https://www.icr.org/article/daniel-classic-experimental-design .

Tuchman, Arleen M. 1993. Science, Medicine, and the State in Germany. The Case of Baden, 1815–1871. New York: Oxford University Press.

Wiener, Norbert. 1948. Cybernetics, or, Control and communication in the animal and the machine. New York/Paris: J. Wiley, Hermann et Cie.

Wimsatt, William C. 2007. Re-Engineering Philosophy for Limited Beings. Cambridge, MA: Harvard University Press.

Wise, M. Norton, ed. 1995. The Values of Precision. Princeton, NJ: Princeton University Press.

Woodward, James. 2008. Cause and Explanation in Psychiatry: An Interventionist Perspective. In Philosophical Issues in Psychiatry: Explanation, Phenomenology and Nosology, ed. K. Kendler and J. Parnas, 132–184. Baltimore, MD: Johns Hopkins University Press.

Worrall, John. 2007. Why There’s No Cause to Randomize. British Journal for the Philosophy of Science 58: 451–488.

Zhmud, Leonid. 2012. Pythagoras and the Early Pythagoreans. Oxford: Oxford University Press.

Author information

Authors and affiliations.

Department of History and Philosophy of Science and Medicine, Indiana University, Bloomington, IN, USA

Jutta Schickore

Corresponding author

Correspondence to Jutta Schickore.

Editor information

Editors and affiliations.

William R. Newman

Rights and permissions

Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

Copyright information

© 2024 The Author(s)

About this chapter

Schickore, J. (2024). Introduction: Practices, Strategies, and Methodologies of Experimental Control in Historical Perspective. In: Schickore, J., Newman, W.R. (eds) Elusive Phenomena, Unwieldy Things. Archimedes, vol 71. Springer, Cham. https://doi.org/10.1007/978-3-031-52954-2_1

DOI: https://doi.org/10.1007/978-3-031-52954-2_1

Published: 27 February 2024

Publisher Name: Springer, Cham

Print ISBN: 978-3-031-52953-5

Online ISBN: 978-3-031-52954-2

eBook Packages: History, History (R0)
