PsyBlog

Social Psychology Experiments: 10 Of The Most Famous Studies

Ten of the most influential social psychology experiments explain why we sometimes do dumb or irrational things. 

“I have been primarily interested in how and why ordinary people do unusual things, things that seem alien to their natures. Why do good people sometimes act evil? Why do smart people sometimes do dumb or irrational things?” –Philip Zimbardo

Like famous social psychologist Professor Philip Zimbardo (author of The Lucifer Effect: Understanding How Good People Turn Evil), I’m also obsessed with why we do dumb or irrational things.

The answer quite often is because of other people — something social psychologists have comprehensively shown.

Each of the 10 brilliant social psychology experiments below tells a unique, insightful story relevant to all our lives, every day.

Click the link in each social psychology experiment to get the full description and explanation of each phenomenon.

1. Social Psychology Experiments: The Halo Effect

The halo effect is a finding from a famous social psychology experiment.

It is the idea that global evaluations about a person (e.g. she is likeable) bleed over into judgements about their specific traits (e.g. she is intelligent).

It is sometimes called the “what is beautiful is good” principle, or the “physical attractiveness stereotype”.

It is called the halo effect because a halo was often used in religious art to show that a person is good.

2. Cognitive Dissonance

Cognitive dissonance is the mental discomfort people feel when trying to hold two conflicting beliefs in their mind.

People resolve this discomfort by changing their thoughts to align with one of the conflicting beliefs and rejecting the other.

The study provides a central insight into the stories we tell ourselves about why we think and behave the way we do.

3. Robbers Cave Experiment: How Group Conflicts Develop

The Robbers Cave experiment was a famous social psychology experiment on how prejudice and conflict emerged between two groups of boys.

It shows how groups naturally develop their own cultures, status structures and boundaries — and then come into conflict with each other.

For example, each country has its own culture, its government, legal system and it draws boundaries to differentiate itself from neighbouring countries.

One of the reasons the study became so famous is that it appeared to show how groups could be reconciled, how peace could flourish.

The key was the focus on superordinate goals, those stretching beyond the boundaries of the group itself.

4. Social Psychology Experiments: The Stanford Prison Experiment

The Stanford prison experiment was run to find out how people would react to being made a prisoner or prison guard.

The psychologist Philip Zimbardo, who led the Stanford prison experiment, thought ordinary, healthy people would come to behave cruelly, like prison guards, if they were put in that situation, even if it was against their personality.

It has since become a classic social psychology experiment, studied by generations of students and recently coming under a lot of criticism.

5. The Milgram Social Psychology Experiment

The Milgram experiment, led by the well-known psychologist Stanley Milgram in the 1960s, aimed to test people’s obedience to authority.

The results of Milgram’s social psychology experiment, sometimes known as the Milgram obedience study, continue to be both thought-provoking and controversial.

The Milgram experiment discovered people are much more obedient than you might imagine.

Fully 63 percent of the participants continued administering what appeared to be electric shocks to another person while they screamed in agony, begged to stop and eventually fell silent, just because they were told to.

6. The False Consensus Effect

The false consensus effect is a famous social psychological finding that people tend to assume that others agree with them.

It could apply to opinions, values, beliefs or behaviours, but people assume others think and act in the same way as they do.

It is hard for many people to believe the false consensus effect exists because they quite naturally believe they are good ‘intuitive psychologists’, thinking it is relatively easy to predict other people’s attitudes and behaviours.

In reality, people show a number of predictable biases, such as the false consensus effect, when estimating other people’s behaviour and its causes.

7. Social Psychology Experiments: Social Identity Theory

Social identity theory helps to explain why people’s behaviour in groups is fascinating and sometimes disturbing.

People gain part of their self from the groups they belong to and that is at the heart of social identity theory.

The famous theory explains why, as soon as humans are bunched together in groups, we start to do odd things: copy other members of our group, favour members of our own group over others, look for a leader to worship and fight other groups.

8. Negotiation: 2 Psychological Strategies That Matter Most

Negotiation is one of those activities we often engage in without quite realising it.

Negotiation doesn’t just happen in the boardroom, or when we ask our boss for a raise or down at the market, it happens every time we want to reach an agreement with someone.

In a classic, award-winning series of social psychology experiments, Morton Deutsch and Robert Krauss investigated two central factors in negotiation: how we communicate with each other and how we use threats.

9. Bystander Effect And The Diffusion Of Responsibility

The bystander effect in social psychology is the surprising finding that the mere presence of other people inhibits our own helping behaviours in an emergency.

The bystander effect social psychology experiments are mentioned in every psychology textbook and often dubbed ‘seminal’.

This famous social psychology experiment on the bystander effect was inspired by the highly publicised murder of Kitty Genovese in 1964.

It found that in some circumstances, the presence of others inhibits people’s helping behaviours — partly because of a phenomenon called diffusion of responsibility.

10. Asch Conformity Experiment: The Power Of Social Pressure

The Asch conformity experiments, some of the most famous ever done, were a series of social psychology experiments carried out by noted psychologist Solomon Asch.

The Asch conformity experiment reveals how strongly a person’s opinions are affected by people around them.

In fact, the Asch conformity experiment shows that many of us will deny our own senses just to conform with others.

Author: Dr Jeremy Dean

Psychologist Jeremy Dean, PhD, is the founder and author of PsyBlog. He holds a doctorate in psychology from University College London and two other advanced degrees in psychology. He has been writing about scientific research on PsyBlog since 2004.

Most Influential Psychological Experiments in History

While each year thousands and thousands of studies are completed in the many specialty areas of psychology, there are a handful that, over the years, have had a lasting impact in the psychological community as a whole. Some of these were dutifully conducted, keeping within the confines of ethical and practical guidelines. Others pushed the boundaries of human behavior during their psychological experiments and created controversies that still linger to this day. And still others were not designed to be true psychological experiments, but ended up as beacons to the psychological community in proving or disproving theories.

This is a list of the 25 most influential psychological experiments still being taught to psychology students of today.

1. A Class Divided

Study conducted by: Jane Elliott.

Study Conducted in 1968 in an Iowa classroom

Experiment Details: Jane Elliott’s famous experiment was inspired by the assassination of Dr. Martin Luther King Jr. and the inspirational life that he led. The third grade teacher developed an exercise, or better yet, a psychological experiment, to help her Caucasian students understand the effects of racism and prejudice.

Elliott divided her class into two separate groups: blue-eyed students and brown-eyed students. On the first day, she labeled the blue-eyed group as the superior group and from that point forward they had extra privileges, leaving the brown-eyed children to represent the minority group. She discouraged the groups from interacting and singled out individual students to stress the negative characteristics of the children in the minority group. What this exercise showed was that the children’s behavior changed almost instantaneously. The group of blue-eyed students performed better academically and even began bullying their brown-eyed classmates. The brown-eyed group experienced lower self-confidence and worse academic performance. The next day, she reversed the roles of the two groups and the blue-eyed students became the minority group.

At the end of the experiment, the children were so relieved that they were reported to have embraced one another and agreed that people should not be judged based on outward appearances. This exercise has since been repeated many times with similar outcomes.

2. Asch Conformity Study

Study conducted by: Dr. Solomon Asch.

Study Conducted in 1951 at Swarthmore College

Experiment Details: Dr. Solomon Asch conducted a groundbreaking study that was designed to evaluate a person’s likelihood to conform to a standard when there is pressure to do so.

A group of participants were shown pictures with lines of various lengths and were then asked a simple question: Which comparison line matches the length of the reference line? The tricky part of this study was that in each group only one person was a true participant. The others were actors with a script, instructed to give the same wrong answer on most trials. Strikingly, the true participants went along with the majority on about a third of those trials, and roughly three-quarters of them conformed at least once, even though the correct answer was obvious.

The results of this study are important when we study social interactions among individuals in groups. This study is a famous example of the temptation many of us experience to conform to a standard during group situations and it showed that people often care more about being the same as others than they do about being right. It is still recognized as one of the most influential psychological experiments for understanding human behavior.

3. Bobo Doll Experiment

Study conducted by: Dr. Albert Bandura.

Study Conducted between 1961-1963 at Stanford University

In his groundbreaking study, Bandura separated participants into three groups:

  • one was exposed to a video of an adult showing aggressive behavior towards a Bobo doll
  • another was exposed to a video of a passive adult playing with the Bobo doll
  • the third formed a control group

Children watched their assigned video and then were sent to a room with the same doll they had seen in the video (with the exception of those in the control group). What the researcher found was that children exposed to the aggressive model were more likely to exhibit aggressive behavior towards the doll themselves. The other groups showed little imitative aggressive behavior. For those children exposed to the aggressive model, the number of imitative physical aggressions averaged 38.2 for the boys and 12.7 for the girls.

The study also showed that boys exhibited more aggression when exposed to aggressive male models than boys exposed to aggressive female models. When exposed to aggressive male models, the number of aggressive instances exhibited by boys averaged 104. This is compared to 48.4 aggressive instances exhibited by boys who were exposed to aggressive female models.

While the results for the girls show similar findings, the results were less drastic. When exposed to aggressive female models, the number of aggressive instances exhibited by girls averaged 57.7. This is compared to 36.3 aggressive instances exhibited by girls who were exposed to aggressive male models. The results concerning gender differences strongly supported Bandura’s secondary prediction that children will be more strongly influenced by same-sex models. The Bobo Doll Experiment showed a groundbreaking way to study human behavior and its influences.

4. Car Crash Experiment

Study conducted by: Elizabeth Loftus and John Palmer.

Study Conducted in 1974 at the University of Washington

The participants watched slides of a car accident and were asked to describe what had happened as if they were eyewitnesses to the scene. The participants were put into two groups and each group was questioned using different wording such as “how fast was the car driving at the time of impact?” versus “how fast was the car going when it smashed into the other car?” The experimenters found that the use of different verbs affected the participants’ memories of the accident, showing that memory can be easily distorted.

This research suggests that memory can be easily manipulated by questioning technique. This means that information gathered after the event can merge with the original memory, causing incorrect recall or reconstructive memory. The addition of false details to a memory of an event is now referred to as confabulation. This concept has very important implications for the questions used in police interviews of eyewitnesses.

5. Cognitive Dissonance Experiment

Study conducted by: Leon Festinger and James Carlsmith.

Study Conducted in 1957 at Stanford University

Experiment Details: The concept of cognitive dissonance refers to a situation involving conflicting attitudes, beliefs or behaviors. This conflict produces an inherent feeling of discomfort, leading to a change in one of the attitudes, beliefs or behaviors to minimize or eliminate the discomfort and restore balance.

Cognitive dissonance was first investigated by Leon Festinger, after an observational study of a cult that believed that the earth was going to be destroyed by a flood. Out of this study was born an intriguing experiment conducted by Festinger and Carlsmith where participants were asked to perform a series of dull tasks (such as turning pegs in a peg board for an hour). Participants’ initial attitudes toward this task were highly negative.

They were then paid either $1 or $20 to tell a participant waiting in the lobby that the tasks were really interesting. Almost all of the participants agreed to walk into the waiting room and persuade the next participant that the boring experiment would be fun. When the participants were later asked to evaluate the experiment, the participants who were paid only $1 rated the tedious task as more fun and enjoyable than the participants who were paid $20 to lie.

Being paid only $1 is not sufficient incentive for lying and so those who were paid $1 experienced dissonance. They could only overcome that cognitive dissonance by coming to believe that the tasks really were interesting and enjoyable. Being paid $20 provides a sufficient reason for lying, so there is no dissonance to resolve.

6. Fantz’s Looking Chamber

Study conducted by: Robert L. Fantz.

Study Conducted in 1961 at the University of Illinois

Experiment Details: The study conducted by Robert L. Fantz is among the simplest, yet most important in the field of infant development and vision. In 1961, when this experiment was conducted, there were very few ways to study what was going on in the mind of an infant. Fantz realized that the best way was to simply watch the actions and reactions of infants. He understood the fundamental fact that if there is something of interest near humans, they generally look at it.

To test this concept, Fantz set up a display board with two pictures attached. On one was a bulls-eye. On the other was the sketch of a human face. This board was hung in a chamber where a baby could lie safely underneath and see both images. Then, from behind the board, invisible to the baby, he peeked through a hole to watch what the baby looked at. This study showed that a two-month-old baby looked twice as much at the human face as it did at the bulls-eye. This suggests that human babies have some powers of pattern and form selection. Before this experiment it was thought that babies looked out onto a chaotic world of which they could make little sense.

7. Hawthorne Effect

Study conducted by: Henry A. Landsberger.

Study Conducted in 1955 at Hawthorne Works in Chicago, Illinois

Landsberger performed the study by analyzing data from experiments conducted between 1924 and 1932, by Elton Mayo, at the Hawthorne Works near Chicago. The company had commissioned studies to evaluate whether the level of light in a building changed the productivity of the workers. What Mayo found was that the level of light made no difference in productivity. The workers increased their output whenever the amount of light was switched from a low level to a high level, or vice versa.

The researchers noticed a tendency that the workers’ level of efficiency increased when any variable was manipulated. The study showed that the output changed simply because the workers were aware that they were under observation. The conclusion was that the workers felt important because they were pleased to be singled out. They increased productivity as a result. Being singled out was the factor dictating increased productivity, not the changing lighting levels, or any of the other factors that they experimented upon.

The Hawthorne Effect has become one of the hardest inbuilt biases to eliminate or factor into the design of any experiment in psychology and beyond.

8. Kitty Genovese Case

Study conducted by: New York Police Force.

Study Conducted in 1964 in New York City

Experiment Details: The murder case of Kitty Genovese was never intended to be a psychological experiment; however, it ended up having serious implications for the field.

According to a New York Times article, almost 40 neighbors witnessed Kitty Genovese being savagely attacked and murdered in Queens, New York in 1964. Not one neighbor called the police for help. Some reports state that the attacker briefly left the scene and later returned to “finish off” his victim. It was later uncovered that many of these facts were exaggerated. (There were more likely only a dozen witnesses and records show that some calls to police were made).

What this case later became famous for is the “Bystander Effect,” which states that the more bystanders that are present in a social situation, the less likely it is that anyone will step in and help. This effect has led to changes in medicine, psychology and many other areas. One famous example is the way CPR is taught to new learners. All students in CPR courses learn that they must assign one bystander the job of alerting authorities, which minimizes the chances of no one calling for assistance.

9. Learned Helplessness Experiment

Study conducted by: Martin Seligman.

Study Conducted in 1967 at the University of Pennsylvania

Seligman’s experiment involved the ringing of a bell and then the administration of a light shock to a dog. After a number of pairings, the dog reacted to the shock even before it happened. As soon as the dog heard the bell, he reacted as though he’d already been shocked.

During the course of this study something unexpected happened. Each dog was placed in a large crate that was divided down the middle with a low fence. The dog could see and jump over the fence easily. The floor on one side of the fence was electrified, but not on the other side of the fence. Seligman placed each dog on the electrified side and administered a light shock. He expected the dog to jump to the non-shocking side of the fence. In an unexpected turn, the dogs simply lay down.

The hypothesis was that as the dogs learned from the first part of the experiment that there was nothing they could do to avoid the shocks, they gave up in the second part of the experiment. To prove this hypothesis the experimenters brought in a new set of animals and found that dogs with no history in the experiment would jump over the fence.

This condition was described as learned helplessness. A human or animal does not attempt to get out of a negative situation because the past has taught them that they are helpless.

10. Little Albert Experiment

Study conducted by: John B. Watson and Rosalie Rayner.

Study Conducted in 1920 at Johns Hopkins University

The experiment began by placing a white rat in front of the infant, who initially had no fear of the animal. Watson then produced a loud sound by striking a steel bar with a hammer every time little Albert was presented with the rat. After several pairings (the noise and the presentation of the white rat), the boy began to cry and exhibit signs of fear every time the rat appeared in the room. Watson also created similar conditioned reflexes with other common animals and objects (rabbits, a Santa beard, etc.) until Albert feared them all.

This study was taken as evidence that classical conditioning works on humans. One of its most important implications is that adult fears are often connected to early childhood experiences.

11. Magical Number Seven

Study conducted by: George A. Miller.

Study Conducted in 1956 at Harvard University

Experiment Details: Frequently referred to as “Miller’s Law,” the Magical Number Seven experiment holds that the number of objects an average human can hold in working memory is 7 ± 2. This means that human memory capacity typically covers strings of 5 to 9 words or concepts. Miller’s paper on the limits to the capacity for processing information became one of the most highly cited papers in psychology.

The Magical Number Seven Experiment was published in 1956 by cognitive psychologist George A. Miller of Harvard University’s Department of Psychology in Psychological Review. In the article, Miller discussed a concurrence between the limits of one-dimensional absolute judgment and the limits of short-term memory.

In a one-dimensional absolute-judgment task, a person is presented with a number of stimuli that vary on one dimension (such as 10 different tones varying only in pitch). The person responds to each stimulus with a corresponding response (learned before).

Performance is almost perfect up to five or six different stimuli but declines as the number of different stimuli is increased. This means that a human’s maximum performance on one-dimensional absolute judgment can be described as an information store with a maximum capacity of approximately 2 to 3 bits of information, which corresponds to the ability to distinguish between four and eight alternatives.
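The link between bits and alternatives is simple arithmetic: n bits of information are enough to tell apart 2^n equally likely alternatives. The short Python sketch below works through that conversion. The function names are illustrative only and do not come from Miller’s paper.

```python
import math


def alternatives_from_bits(bits: float) -> float:
    """Number of equally likely alternatives that `bits` of information can distinguish."""
    return 2 ** bits


def bits_from_alternatives(n: int) -> float:
    """Bits of information needed to pick one of `n` equally likely alternatives."""
    return math.log2(n)


# 2 to 3 bits corresponds to 4 to 8 alternatives.
for bits in (2, 3):
    print(f"{bits} bits -> {alternatives_from_bits(bits):.0f} alternatives")

# Five or six stimuli, where performance is still near perfect, fall inside that range.
for n in (5, 6):
    print(f"{n} alternatives -> {bits_from_alternatives(n):.2f} bits")
```

Five or six distinguishable stimuli work out to between 2 and 3 bits, which is why the two ways of stating the limit agree.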

12. Pavlov’s Dog Experiment

Study conducted by: Ivan Pavlov.

Study Conducted in the 1890s at the Military Medical Academy in St. Petersburg, Russia

Pavlov began with the simple idea that there are some things that a dog does not need to learn. He observed that dogs do not need to learn to salivate when they see food. This reflex is “hard wired” into the dog. This is an unconditioned response (a stimulus-response connection that required no learning).

Pavlov demonstrated that there are unconditioned responses in the animal by presenting a dog with a bowl of food and then measuring its salivary secretions. In the experiment, Pavlov used a bell as his neutral stimulus. Whenever he gave food to his dogs, he also rang a bell. After a number of repeats of this procedure, he tried the bell on its own. What he found was that the bell on its own now caused an increase in salivation. The dog had learned to associate the bell and the food. This learning created a new behavior. The dog salivated when he heard the bell. Because this response was learned (or conditioned), it is called a conditioned response. The neutral stimulus has become a conditioned stimulus.

This theory came to be known as classical conditioning.

13. Robbers Cave Experiment

Study conducted by: Muzafer and Carolyn Sherif.

Study Conducted in 1954 at the University of Oklahoma

Experiment Details: This experiment, which studied group conflict, is considered by most to be outside the lines of what is considered ethically sound.

In 1954 researchers at the University of Oklahoma assigned 22 eleven- and twelve-year-old boys from similar backgrounds to two groups. The two groups were taken to separate areas of a summer camp facility where they were able to bond as social units. The groups were housed in separate cabins and neither group knew of the other’s existence for an entire week. The boys bonded with their cabin mates during that time. Once the two groups were allowed to have contact, they showed definite signs of prejudice and hostility toward each other even though they had only been given a very short time to develop their social group. To increase the conflict between the groups, the experimenters had them compete against each other in a series of activities. This created even more hostility and eventually the groups refused to eat in the same room. The final phase of the experiment involved turning the rival groups into friends. The fun activities the experimenters had planned, like shooting firecrackers and watching movies, did not initially work, so they created teamwork exercises where the two groups were forced to collaborate. At the end of the experiment, the boys decided to ride the same bus home, demonstrating that conflict can be resolved and prejudice overcome through cooperation.

Many critics have compared this study to Golding’s Lord of the Flies novel as a classic example of prejudice and conflict resolution.

14. Ross’ False Consensus Effect Study

Study conducted by: Lee Ross.

Study Conducted in 1977 at Stanford University

Experiment Details: In 1977, a social psychology professor at Stanford University named Lee Ross conducted an experiment that, in lay terms, focuses on how people can incorrectly conclude that others think the same way they do, or form a “false consensus” about the beliefs and preferences of others. Ross conducted the study in order to outline how the “false consensus effect” functions in humans.

In the first part of the study, participants were asked to read about situations in which a conflict occurred and then were told two alternative ways of responding to the situation. They were asked to do three things:

  • Guess which option other people would choose
  • Say which option they themselves would choose
  • Describe the attributes of the person who would likely choose each of the two options

What the study showed was that most of the subjects believed that other people would do the same as them, regardless of which of the two responses they actually chose themselves. This phenomenon is referred to as the false consensus effect, where an individual thinks that other people think the same way they do when they may not. The second observation coming from this important study is that when participants were asked to describe the attributes of the people who will likely make the choice opposite of their own, they made bold and sometimes negative predictions about the personalities of those who did not share their choice.

15. The Schachter and Singer Experiment on Emotion

Study conducted by: Stanley Schachter and Jerome E. Singer.

Study Conducted in 1962 at Columbia University

Experiment Details: In 1962 Schachter and Singer conducted a groundbreaking experiment to test their theory of emotion.

In the study, a group of 184 male participants were injected with epinephrine, a hormone that induces arousal including increased heartbeat, trembling, and rapid breathing. The research participants were told that they were being injected with a new medication to test their eyesight. The first group of participants was informed of the possible side effects that the injection might cause while the second group of participants was not. The participants were then placed in a room with someone they thought was another participant, but was actually a confederate in the experiment. The confederate acted in one of two ways: euphoric or angry. Participants who had not been informed about the effects of the injection were more likely to feel either happier or angrier than those who had been informed.

What Schachter and Singer were trying to understand was the ways in which cognition or thoughts influence human emotion. Their study illustrates the importance of how people interpret their physiological states, which form an important component of their emotions. Though their cognitive theory of emotional arousal dominated the field for two decades, it has been criticized for two main reasons: the size of the effect seen in the experiment was not that significant and other researchers had difficulties repeating the experiment.

16. Selective Attention / Invisible Gorilla Experiment

Study conducted by: Daniel Simons and Christopher Chabris.

Study Conducted in 1999 at Harvard University

Experiment Details: In 1999 Simons and Chabris conducted their famous awareness test at Harvard University.

Participants in the study were asked to watch a video and count how many passes occurred between basketball players on the white team. The video moves at a moderate pace and keeping track of the passes is a relatively easy task. What many people fail to notice amidst their counting is that in the middle of the test, a person in a gorilla suit walks onto the court and stands in the center before walking off-screen.

The study found that about half of the subjects did not notice the gorilla at all, suggesting that humans often overestimate their ability to effectively multi-task. What the study set out to show is that when people are asked to attend to one task, they focus so strongly on that element that they may miss other important details.

17. Stanford Prison Study

Study conducted by: Philip Zimbardo.

Study Conducted in 1971 at Stanford University

The Stanford Prison Experiment was designed to study behavior of “normal” individuals when assigned a role of prisoner or guard. College students were recruited to participate. They were assigned roles of “guard” or “inmate.”  Zimbardo played the role of the warden. The basement of the psychology building was the set of the prison. Great care was taken to make it look and feel as realistic as possible.

The prison guards were told to run a prison for two weeks. They were told not to physically harm any of the inmates during the study. After a few days, the prison guards became very verbally abusive towards the inmates. Many of the prisoners became submissive to those in authority roles. The Stanford Prison Experiment ultimately had to be ended early because some of the participants displayed troubling signs of breaking down mentally.

Although the experiment was conducted very unethically, many psychologists believe that the findings showed how much human behavior is situational. People will conform to certain roles if the conditions are right. The Stanford Prison Experiment remains one of the most famous psychology experiments of all time.

18. Stanley Milgram Experiment

Study conducted by: Stanley Milgram.

Study Conducted in 1961 at Yale University

Experiment Details: This 1961 study was conducted by Yale University psychologist Stanley Milgram. It was designed to measure people’s willingness to obey authority figures when instructed to perform acts that conflicted with their morals. The study was based on the premise that humans will inherently take direction from authority figures from very early in life.

Participants were told they were participating in a study on memory. They were asked to watch another person (an actor) do a memory test. They were instructed to press a button that gave an electric shock each time the person got a wrong answer. (The actor did not actually receive the shocks, but pretended they did).

Participants were told to play the role of “teacher” and administer electric shocks to “the learner” every time the learner answered a question incorrectly. The experimenters asked the participants to keep increasing the shocks. Most of them obeyed even though the individual completing the memory test appeared to be in great pain and protested. Despite these protests, many participants continued the experiment when the authority figure urged them to. They increased the voltage after each wrong answer until some eventually administered what would have been lethal electric shocks.

This experiment showed that humans are conditioned to obey authority and will usually do so even if it goes against their natural morals or common sense.

19. Surrogate Mother Experiment

Study conducted by: Harry Harlow.

Study Conducted from 1957-1963 at the University of Wisconsin

Experiment Details: In a series of controversial experiments during the late 1950s and early 1960s, Harry Harlow studied the importance of a mother’s love for healthy childhood development.

In order to do this, he separated infant rhesus monkeys from their mothers a few hours after birth and left them to be raised by two “surrogate mothers.” One of the surrogates was made of wire with an attached bottle for food. The other was made of soft terrycloth but lacked food. The researcher found that the baby monkeys spent much more time with the cloth mother than the wire mother, suggesting that affection plays a greater role than sustenance when it comes to childhood development. He also found that the monkeys that spent more time cuddling the soft mother grew up to be healthier.

This experiment showed that love, as demonstrated by physical body contact, is a more important aspect of the parent-child bond than the provision of basic needs. These findings also had implications in the attachment between fathers and their infants when the mother is the source of nourishment.

20. The Good Samaritan Experiment

Study conducted by: John Darley and Daniel Batson.

Study Conducted in 1973 at The Princeton Theological Seminary (Researchers were from Princeton University)

Experiment Details: In 1973, John Darley and Daniel Batson created an experiment to investigate the potential causes that underlie altruistic behavior. The researchers set out three hypotheses they wanted to test:

  • People thinking about religion and higher principles would be no more inclined to show helping behavior than laymen.
  • People in a rush would be much less likely to show helping behavior.
  • People who are religious for personal gain would be less likely to help than people who are religious because they want to gain some spiritual and personal insights into the meaning of life.

Student participants were given some religious teaching and instruction. They were then told to travel from one building to the next. Between the two buildings was a man lying injured and appearing to be in dire need of assistance. The first variable being tested was the degree of urgency impressed upon the subjects, with some being told not to rush and others being informed that speed was of the essence.

The results of the experiment were intriguing, with the haste of the subject proving to be the overriding factor. When the subject was in no hurry, nearly two-thirds of people stopped to lend assistance. When the subject was in a rush, this dropped to one in ten.

People who were on the way to deliver a speech about helping others were nearly twice as likely to help as those delivering other sermons. This showed that the thoughts of the individual were a factor in determining helping behavior. Religious beliefs did not appear to make much difference to the results. Being religious for personal gain, or as part of a spiritual quest, did not appear to have much of an impact on the amount of helping behavior shown.

21. The Halo Effect Experiment

Study conducted by: Richard E. Nisbett and Timothy DeCamp Wilson.

Study Conducted in 1977 at the University of Michigan

Experiment Details: The Halo Effect states that people generally assume that people who are physically attractive are more likely to:

  • be intelligent
  • be friendly
  • display good judgment

Nisbett and DeCamp Wilson created a study to show that people have little awareness of the nature of the Halo Effect. They’re not aware that it influences:

  • their personal judgments
  • the production of a more complex social behavior

In the experiment, college students were the research participants. They were asked to evaluate a psychology instructor as they viewed him in a videotaped interview. The students were randomly assigned to one of two groups. Each group was shown one of two different interviews with the same instructor. The instructor was a native French-speaking Belgian who spoke English with a noticeable accent. In the first video, the instructor presented himself as someone:

  • respectful of his students’ intelligence and motives
  • flexible in his approach to teaching
  • enthusiastic about his subject matter

In the second interview, he presented himself as much more unlikable. He was cold and distrustful toward the students and was quite rigid in his teaching style.

After watching the videos, the subjects were asked to rate the lecturer on:

  • physical appearance
  • mannerisms
  • accent

His mannerisms and accent were kept the same in both versions of the video. The subjects were asked to rate the professor on an 8-point scale ranging from “like extremely” to “dislike extremely.” Some subjects were also told that the researchers were interested in knowing “how much their liking for the teacher influenced the ratings they just made.” Other subjects were asked to identify how much the characteristics they just rated influenced their liking of the teacher.

After responding to the questionnaire, the respondents were puzzled about their reactions to the videotapes and to the questionnaire items. The students had no idea why they gave one lecturer higher ratings. Most said that how much they liked the lecturer had not affected their evaluation of his individual characteristics at all.

The interesting thing about this study is that people can understand the phenomenon, but they are unaware when it is occurring. Without realizing it, humans make judgments. Even when it is pointed out, they may still deny that it is a product of the halo effect phenomenon.

22. The Marshmallow Test

Study conducted by: Walter Mischel.

Study Conducted in 1972 at Stanford University


In Walter Mischel’s 1972 Marshmallow Experiment, children ages four to six were taken into a room where a marshmallow was placed in front of them on a table. Before leaving each of the children alone in the room, the experimenter informed them that they would receive a second marshmallow if the first one was still on the table when the experimenter returned in 15 minutes. The examiner recorded how long each child resisted eating the marshmallow, and later follow-ups examined whether this correlated with the child’s success in adulthood. A small number of the 600 children ate the marshmallow immediately and one-third delayed gratification long enough to receive the second marshmallow.

In follow-up studies, Mischel found that those who deferred gratification were significantly more competent and received higher SAT scores than their peers. This characteristic likely remains with a person for life. While this study seems simplistic, the findings outline some of the foundational differences in individual traits that can predict success.

23. The Monster Study

Study conducted by: Wendell Johnson.

Study Conducted in 1939 at the University of Iowa

Experiment Details: The Monster Study received this negative title due to the unethical methods that were used to determine the effects of positive and negative speech therapy on children.

Wendell Johnson of the University of Iowa selected 22 orphaned children, some with stutters and some without. The children were in two groups. The group of children with stutters was placed in positive speech therapy, where they were praised for their fluency. The non-stutterers were placed in negative speech therapy, where they were disparaged for every mistake in grammar that they made.

As a result of the experiment, some of the children who received negative speech therapy suffered psychological effects and retained speech problems for the rest of their lives. Their experience illustrates the significance of positive reinforcement in education.

The initial goal of the study was to investigate positive and negative speech therapy. However, the implication spanned much further into methods of teaching for young children.

24. Violinist at the Metro Experiment

Study conducted by: Staff at The Washington Post.

Study Conducted in 2007 at a Washington D.C. Metro Train Station


During the study, pedestrians rushed by without realizing that the musician playing at the entrance to the metro stop was Grammy-winning musician Joshua Bell. Two days before playing in the subway, he sold out a theater in Boston where the seats averaged $100. He played one of the most intricate pieces ever written with a violin worth 3.5 million dollars. In the 45 minutes the musician played his violin, only 6 people stopped and stayed for a while. Around 20 gave him money, but continued to walk at their normal pace. He collected $32.

The study and the subsequent article organized by the Washington Post were part of a social experiment looking at the priorities of people.

Gene Weingarten wrote about the social experiment: “In a banal setting at an inconvenient time, would beauty transcend?” Later he won a Pulitzer Prize for his story. Some of the questions the article addresses are:

  • Do we perceive beauty?
  • Do we stop to appreciate it?
  • Do we recognize the talent in an unexpected context?

As it turns out, many of us are not nearly as perceptive to our environment as we might like to think.

25. Visual Cliff Experiment

Study conducted by: Eleanor Gibson and Richard Walk.

Study Conducted in 1959 at Cornell University

Experiment Details: In 1959, psychologists Eleanor Gibson and Richard Walk set out to study depth perception in infants. They wanted to know if depth perception is a learned behavior or if it is something that we are born with. To study this, Gibson and Walk conducted the visual cliff experiment.

They studied 36 infants between the ages of six and 14 months, all of whom could crawl. The infants were placed one at a time on a visual cliff. A visual cliff was created using a large glass table that was raised about a foot off the floor. Half of the glass table had a checker pattern underneath in order to create the appearance of a ‘shallow side.’

In order to create a ‘deep side,’ a checker pattern was created on the floor; this side is the visual cliff. The placement of the checker pattern on the floor creates the illusion of a sudden drop-off. Researchers placed a foot-wide centerboard between the shallow side and the deep side. Gibson and Walk found the following:

  • Nine of the infants did not move off the centerboard.
  • All of the 27 infants who did move crossed into the shallow side when their mothers called them from the shallow side.
  • Three of the infants crawled off the visual cliff toward their mother when called from the deep side.
  • When called from the deep side, the remaining 24 children either crawled to the shallow side or cried because they could not cross the visual cliff and make it to their mother.

What this study helped demonstrate is that depth perception is likely an inborn trait in humans.

Among these experiments and psychological tests, we see boundaries pushed and theories taking on a life of their own. It is through the endless stream of psychological experimentation that we can see simple hypotheses become guiding theories for those in this field. Psychology became a formal field of experimental study in 1879, when Wilhelm Wundt established the first laboratory dedicated solely to psychological research in Leipzig, Germany. Wundt was the first person to refer to himself as a psychologist. Since 1879, psychology has grown into a massive collection of methods of practice.

It’s also a specialty area in the field of healthcare. None of this would have been possible without these and many other important psychological experiments that have stood the test of time.


About the Author

After earning a Bachelor of Arts in Psychology from Rutgers University and then a Master of Science in Clinical and Forensic Psychology from Drexel University, Kristen began a career as a therapist at two prisons in Philadelphia. At the same time she volunteered as a rape crisis counselor, also in Philadelphia. After a few years in the field she accepted a teaching position at a local college where she currently teaches online psychology courses. Kristen began writing in college and still enjoys her work as a writer, editor, professor and mother.


9 of the Most Influential Social Psychology Experiments in History

For those interested in understanding how social interactions can shape behavior and mental processes, this article dives deep into some of the most influential social psychology experiments in history. Covering everything from the perpetrator-victim dynamic prevalent in Stanley Milgram’s infamous obedience experiment to the false consensus effect just a few years later, these social psychology experiments provide invaluable insights into the human psyche.

Through carefully conducted studies such as John Darley and Bibb Latané’s Diffusion of Responsibility Experiment and Nisbett and Wilson’s Halo Effect Experiment, as well as famous experiments such as Albert Bandura’s Bobo Doll Experiment, we are able to better comprehend why people act the way they do in certain situations. Come explore these incredible, influential experiments that have left their mark on social psychology and the world!

To learn more about what social psychology is, check out our article.

The Stanford Prison Experiment, 1971

1.1. Overview

The Stanford Prison Experiment was a widely known and controversial social psychology experiment conducted in 1971 at Stanford University by Professor Philip Zimbardo to investigate how ordinary, healthy people would react to being made prisoners or prison guards. It has since become a classic social psychology experiment and is still studied today. However, the experiment has come under considerable criticism in recent years due to ethical issues.

1.2. Results

Twenty-four male college students were recruited for the experiment, which involved them playing the role of either prisoner or guard. Each group was then allotted 8-hour shifts and treated as if they were in a real prison situation. The prisoners were kept in the makeshift prison set up in the basement of the Psychology Department, where the guards were responsible for ensuring the inmates followed prison regulations. The participants were screened to guarantee they had no mental or physical problems that may have influenced their behavior.

The experiment concluded that it is possible to change the behavior of individuals when they are placed in groups and assigned roles. The study showed how quickly people will conform to expected social roles and how easily ‘ordinary’ people can be transformed from ‘good’ to ‘evil.’ Both the prisoners and guards displayed stereotypical characteristics associated with their assigned roles; the prisoners became emotionally unstable and submissive, whilst the guards became hostile and authoritarian.

1.3. Criticisms

Philip Zimbardo’s experiment, the Stanford Prison Experiment, is like a dark mirror reflecting our society’s innermost fears. In 1971, Zimbardo conducted an experiment in which college students were randomly assigned to be either prisoners or guards in a simulated prison environment. The results of this study showed that people could quickly adapt to roles and lose their sense of morality when placed in certain situations.

The impact of the Stanford Prison Experiment has been far-reaching; it has been used as evidence for why prisons are so dangerous and how they can lead to psychological damage among inmates. It also serves as a reminder that power dynamics between individuals can have serious consequences if not monitored closely.

Critics have argued that the ethical practices employed by Zimbardo during his experiment were questionable at best, with some participants experiencing extreme distress due to their role-playing experience. Despite these criticisms, many believe that the experiment still provides valuable insight into human behavior and psychology today. Its central lesson is how easily our moral compass can become distorted when faced with authority figures or oppressive environments.

The Asch Conformity Experiment, 1951

2.1. Overview

The Asch Conformity Experiment, also known as the “Asch Line Study,” was a series of experiments conducted by psychologist Solomon Asch in 1951 to test how people tend to conform to social pressures. Each session placed an actual participant among a group of actors (the confederates), while a control group consisted of actual participants only.

During the experiment, the participants were asked to view a line on one board and then, using their own judgment, match it to one of three lines on another board. Initially, the actors gave correct answers; however, after a few attempts, they began to give wrong answers intentionally so that the researchers could observe how the participant would respond.

2.2. Results

The results showed that about 75% of the participants conformed to the incorrect majority opinion given by the confederate group at least once, even when it obviously contradicted their own senses. In addition, the control group with only real subjects produced a much lower error rate, with incorrect answers given less than 1% of the time. This demonstrated that it was not the difficulty of the task but rather the presence of an influential social group that caused many participants to deny their own thoughts in order to fit in with the others.

2.3. Criticisms

Critics have argued that the experiment was not diverse enough since it mainly used college-aged men as its sample population. Additionally, because the experiment did not include females, it has been suggested that the results of the experiment cannot be generalized to all genders.

Furthermore, critics have argued that the experimental design lacked a true measure of real-life social pressure since the actors and real participants knew the situation was artificial. Despite these criticisms, the Asch experiment remains one of the most important social psychology studies in history, and its core message about the power of conformity to influence opinions and behavior continues to be studied and discussed today.

We tackled the topic of conformity in our article about social influence.

The Bobo Doll Experiment, 1961

3.1. Overview

The Bobo Doll experiment was a series of experiments conducted by psychologist Albert Bandura between 1961 and 1963 at Stanford University, aimed at studying the extent to which human behavior is based on social imitation rather than inherited genetic factors.

Three groups of 24 participants each, aged from 3 to 6 years old, were chosen for the experiment: a control group (with no interaction with any adults), an aggressive group (observing an adult behaving aggressively towards the doll), and a passive group (observing a more passive adult playing with the doll). The results strongly indicated that children were influenced by watching other people’s behavior and imitated it afterward in their own behavior.

3.2. Results

The study found that the children in the aggressive group were significantly more likely to behave aggressively towards the Bobo doll than those in the passive and control groups. When it came to gender differences, boys showed more aggressive behavior when exposed to the aggressive behavior of male models, while girls showed similar findings, albeit less drastic.

These findings suggest that children can acquire aggressive behavior simply by observing an adult model, without any direct reward or punishment.

3.3. Criticisms

Although the experiment is widely accepted as evidence for the hypothesis that individuals learn behavior by observing others, the Bobo Doll experiment has been criticized in recent years. One key point of criticism is that Bandura’s research neglected to look at positive modeling, such as the modeling of altruism or helpful behavior, and instead focused solely on aggression.

Additionally, some have argued that, due to its relatively small sample size and laboratory-based approach, the study failed to take into account real-life influences, such as environmental variables, which would have provided additional context. Despite these criticisms, the Bobo Doll experiment remains one of the most famous studies in psychology, providing significant evidence for the importance of social learning theory in understanding human behavior.

The Milgram Experiment, 1963

4.1. Overview

The Milgram Experiment was a famous social psychology experiment conducted by Stanley Milgram in the 1960s. Its aim was to test people’s obedience to authority. The study examined how far people would go when an authority figure instructed them to perform acts that conflicted with their morals.

Specifically, it sought to find out if non-Nazi populations, such as those from the United States, would follow orders to harm other persons. One of the motivations for this investigation was the aftermath of World War II, in particular the 1961 trial of Nazi official Adolf Eichmann, who argued in his defense that he was only following orders.

4.2. Results

The experiment was conducted at Yale University in 1961 and included unsuspecting participants who were told that the study was about memory. Participants believed they were participating in a study where they would be required to act as teachers while a confederate (the learner) was on the other side of the wall. Their task was to ask the learner questions and, if they received a wrong answer, press a switch administering a shock, ranging from 15 volts to 450 volts.

The results showed that despite protests and cries from the learner, 65% of the participants continued all the way to the maximum 450-volt shock. Milgram’s experiment suggested that human beings are conditioned to obey authority figures, even when this goes against their natural moral code.

4.3. Criticisms

Despite its significance, the Milgram Experiment has been heavily criticized over the years, and some have argued that the study violated ethical standards. The argument is that causing psychological and emotional distress to unwitting volunteers is wrong. Other critics have argued that role reversals or changes in the lab setting would yield different outcomes and should have been considered.

In response to these criticisms, some scientists have suggested modifying the experiment by reducing the voltage administered to the learner or by conducting newer versions in naturalistic settings. In addition, the way modern studies measure obedience is very different: contemporary research focuses on the motives and reactions participants have after the experiment. These proposed changes would enable researchers to examine the extraneous factors influencing obedience while limiting the harm caused to participants.


The Halo Effect Experiment, 1977

5.1. Overview

The halo effect is commonly defined as the phenomenon in which a positive evaluation of one trait extends to an overall perception of an individual. This cognitive bias has been observed by social psychologists for over a century, beginning with psychologist Edward Thorndike’s studies regarding commanding officers in the military.

The halo effect is also known as the “what is beautiful is good” principle or the “physical attractiveness stereotype.” This phenomenon has had a lasting impact on our evaluation and judgment of others. Additionally, the term “halo effect” was named after its likeness to that of the halo painted above the heads of saints and holy figures in religious art, generally regarded as symbols of moral goodness.

To investigate this phenomenon further, Nisbett and Wilson conducted an experiment in 1977 at the University of Michigan. As research participants, they recruited college students who were asked to watch one of two pre-recorded tapes of a psychology instructor displaying different attitudes: one likable and the other unlikable. After watching the videotapes, they filled in a questionnaire that asked them to rate the lecturer’s physical appearance, mannerisms, and accent on an 8-point scale ranging from “like extremely” to “dislike extremely.”

Nisbett and Wilson’s study showed that despite the lecturer having the same mannerisms and accent in both tapes, the respondents rated him more favorably if his attitude projected a likable demeanor. Moreover, Nisbett and Wilson discovered that people are unaware when the halo effect phenomenon occurs; they inferred that the respondents relied on their initial impression of the lecturer without being aware that it influenced their subsequent assessment. In total, 278 college students participated in the study.

5.2. Results

The results of this study showed that the ratings of the lecturer differed depending on his behavior: those who saw him adopt a likable demeanor in the video gave him higher ratings than those who saw him act in an unlikable manner. Furthermore, those who rated the lecturer highly were more likely to believe that he was intelligent, hardworking, kind, and humorous. This suggests that the halo effect can lead people to make inaccurate assumptions about someone based solely on their appearance or behavior.

Moreover, an updated study on the halo effect also suggests that a negative assessment of certain traits can similarly affect subsequent perceptions. For instance, if someone didn’t like the instructor’s physical appearance, they are more likely to rate him as unintelligent and lazy. This provides evidence that negative feelings about one characteristic can extend to an individual’s other features.

5.3. Criticisms

Despite its analytical contributions to the field, some scholars have questioned Nisbett and Wilson’s use of college students as the research participants in the study due to their limited exposure to the concept at hand. Additionally, since the focus of this experiment utilized only pre-recorded videos, there is also criticism of the limitation in the accuracy of facial expressions and vocal intonations. However, this study does provide evidence that people may rely on initial impressions when making assessments, resulting in cognitive bias and inaccurate judgments.

The False Consensus Effect Experiment, 1974

6.1. Overview

The False Consensus Effect Experiment was conducted in 1974 by Professor Lee Ross, then at Stanford University. The experiment focused on how people can form a “false consensus” about the beliefs and preferences of others. Specifically, it asked participants to read situations with two alternative responses and predict which one other people would choose. In the study, most subjects overestimated the likelihood that others would do the same thing as them, even when the situation was hypothetical and there was no real data to indicate what choice the majority of people might make. Furthermore, researchers found that the anticipation of a false consensus could lead people to display negative predictions about the personalities of those who did not share their choice.

6.2. Results

The results of Lee Ross’s experiment demonstrated the false consensus bias – the tendency to overestimate the extent to which others agree with one’s own beliefs and behaviors. He and his colleagues also conducted experiments in the late 1970s to demonstrate how this bias operates in estimating other people’s behaviors and causes. For example, participants chose a resolution to an imagined conflict, then estimated how many others would choose the same.

In another experiment, students were asked whether they would carry a sign that read “Eat at Joe’s” and then to estimate how many other people would agree to do the same. Those who agreed believed the majority of people would also agree; those who refused believed the majority would refuse. This experiment showed the false consensus effect: we tend to believe the majority of people agree with us and act the same way.


6.3. Criticisms

The false consensus effect is like a mirage in the desert, appearing to be something it’s not. It occurs when people overestimate how much other people agree with them. This bias can lead to an inaccurate perception of reality and cause individuals to make decisions based on false assumptions. The false consensus effect is most likely to occur when someone has strong beliefs or opinions about a particular topic and assumes that others share their views.

The Chameleon Effect Experiment, 1999

7.1. Overview

The Chameleon Effect is a phenomenon of unintentional mirroring. It involves someone mimicking another’s body posture, hand gestures, or even accent without realizing it. In the 1990s, researchers Tanya Chartrand and John Bargh conducted experiments to study this effect, which they named the “Chameleon Effect.”

7.2. Results

During their experiment, Chartrand and Bargh secretly mimicked the actions and behaviors of some test subjects during their conversations and monitored the responses. Those who were mimicked found the researchers more likable than those who were not. The results point to how people often respond subconsciously to others, even when they are not aware of being imitated.

7.3. Criticisms

Despite being a popular explanation for why people tend to mirror each other’s behavior, the Chameleon Effect has been criticized by some. These criticisms mostly stem from the idea that the phenomenon could be used for manipulation and could leave people more vulnerable to conformity and impression management. Additionally, some suggest that people might mirror each other for reasons unrelated to the Chameleon Effect itself, such as insecurity or politeness.

The Diffusion of Responsibility Experiment, 1968

8.1. Overview

The diffusion of responsibility experiment, conducted in 1968, is an iconic social psychology experiment. It is studied to understand the bystander effect, a phenomenon in which people are less likely to offer help or give aid if other bystanders are present.

Bibb Latané and John Darley conducted the experiment to analyze how the presence of others could influence helping behavior. The study was also connected to concepts from Lev Vygotsky’s theory referring to emotional role-taking and group cohesiveness when observing the behavior of other members in a particular situation.

In one of their 1968 studies, participants were asked to answer a questionnaire in a room, either alone or with two other people who were actually confederates of the researchers. Smoke then began to come in from under the door, and the researchers measured how long it took participants to report it.

The results showed that participants who were alone reported the smoke faster than those who sat with two passive others.

8.2. Results

The results of the diffusion of responsibility experiment in 1968 proved to be true to an extent. Its findings added significantly to defining the bystander effect and the effects of the presence of other people, which may inhibit helpers’ reactions to those in need. The research showed that the number of bystanders could have a profound effect on one’s decision on whether to act or to remain inactive. This study takes into account the psychological phenomenon known as diffusion of responsibility which partially explains this inhibition of helping behavior.

8.3. Criticisms

Although the diffusion of responsibility experiment has contributed greatly to our understanding of the bystander effect, there is some criticism surrounding the validity and applicability of the experiment outside the setting in which it was conducted. Namely, there is some concern around the age of the participants, as well as the amount of time provided for the test, both of which may have had an influence on the outcome. In addition, since the experiment was conducted in 1968, there have been numerous changes to society and culture that the authors could not have taken into account at the time, leaving some questions as to its usefulness in current situations.

The Cognitive Dissonance Experiment, 1957

Leon Festinger’s Cognitive Dissonance Experiment of 1957 addresses the mental discomfort experienced when trying to hold two conflicting beliefs. The experiment sought to identify how this dissonance is resolved by people and to gain further insight into our thoughts and behaviors.

In a related study, Stanley Schachter and Jerome E. Singer set out to test the cognitive theory of emotional arousal. They injected participants with epinephrine, a hormone that can produce both physical and psychological side effects, such as increased heart rate, faster breathing, and an elevation in blood pressure.

The study featured a confederate who acted in one of two ways: either pleasant or unpleasant. As participants became aware of the injection’s side effects, their emotions shifted accordingly. This suggested that people interpret their arousal using the cues around them and then act in agreement with that emotion-triggering cognition.

In light of its results, critics have since made a note of the implausibility of Schachter and Singer’s study, claiming that it does little to address the complexities of the cognitive dissonance theory. By comparison, Leon Festinger’s cognitive dissonance experiment assumes that everyone holds many different cognitions about the world. It then examines what happens when those cognitions do not fit together adequately. Discrepancies are accepted if an event or occurrence makes sense; if not, these discrepancies lead to a state of tension or dissonance.

Interested readers may refer to Festinger’s 1959 article, “Cognitive Dissonance,” for a more in-depth exploration of the experiment’s results and implications. The study remains a classic social psychology experiment, illustrating how we reconcile the conflicts created by existing contradictory beliefs.

The 10 most influential social psychology experiments in history can all be summarized by their contribution to the development of our understanding of human behavior. From Milgram’s famous study revealing our willingness to obey authority figures, to Bandura’s Bobo Doll Experiment establishing the power of imitation in learned aggression, to Nisbett and Wilson’s Halo Effect Experiment revealing biases linked to physical attractiveness, these studies provide insights into the power of group dynamics, social pressures, and cognitive bias that continue to inform and impact modern society today.

Learning about these classic social psychology experiments also encourages us to question our assumptions, explore alternative perspectives, and strive for greater understanding and acceptance. Whether by examining the limitations of data collection or the ethical implications of research methods, these landmark studies have built our knowledge base and served as a reminder of the importance of respect and empathy.

In conclusion, the 10 most influential social psychology experiments listed reflect an inspiring effort to better understand human behavior and engage in much-needed conversations around this fascinating field.

Frequently Asked Questions

What were the initial experiments in social psychology?

Norman Triplett’s 1898 experiment is credited with being the first social psychology experiment. He examined the effects of competition on a simple task – winding hempen string – and found that people performed better when in the presence of others than when alone. His findings ushered in an exciting new field of research into how people are both influenced by, and affect, the social world around them.

Why are experiments used in social psychology?

Experiments provide valuable insight into human behavior and allow cause-and-effect relationships to be better understood. They are an essential tool for conducting research in social psychology, allowing the researcher to observe how individuals respond in different situational contexts. Experimentation allows us to answer important questions about social behavior, making it an invaluable research method.

What are social psychology topics?

Social psychology explores the fascinating ways that people interact with each other, from understanding why prejudicial behavior occurs to analyzing why some people have a greater degree of influence over human behaviors than others. Social psychology has played a major role in helping us understand human behavior. It has made invaluable contributions through its research on various topics like prejudice and discrimination, gender, culture, social influence, interpersonal relations, group behavior, aggression etc.

What is an example of a social experiment?

An example of a social experiment is Stanley Milgram’s obedience experiment conducted in 1963 which tested human subjects’ willingness to obey orders regardless of the outcomes. This controversial experiment was conducted in order to understand how far people would go to follow orders from authority figures, even if it meant inflicting pain on another person.

What was the social experiment of the 60s?

The Social Experiment of the 1960s was the Milgram experiment, conducted by Yale University psychologist Stanley Milgram. It tested participants’ willingness to obey authority when asked to do something that conflicted with their own moral values. The experiment yielded powerful insights into human behavior and the power of authority, making it one of the most influential experiments of its time.

15 Famous Experiments and Case Studies in Psychology

Chris Drew (PhD)

Dr. Chris Drew is the founder of the Helpful Professor. He holds a PhD in education and has published over 20 articles in scholarly journals. He is the former editor of the Journal of Learning Development in Higher Education.

Psychology has seen thousands upon thousands of research studies over the years. Most of these studies have helped shape our current understanding of human thoughts, behavior, and feelings.

The psychology case studies in this list are considered classic examples of psychological case studies and experiments, which are still being taught in introductory psychology courses up to this day.

Some studies, however, were so shocking and controversial that you’d probably wonder why they were conducted at all. Imagine participating in an experiment for a small reward or extra class credit, only to be left scarred for life. These kinds of studies, however, paved the way for a more ethical approach to studying psychology and the implementation of research standards such as the use of debriefing in psychology research.

Case Study vs. Experiment

Before we dive into the list of the most famous studies in psychology, let us first review the difference between case studies and experiments.

  • A case study is an in-depth study and analysis of an individual, group, community, or phenomenon. The results of a case study cannot be applied to the whole population, but they can provide insights for further studies.
  • It often uses qualitative research methods such as observations, surveys, and interviews.
  • It is often conducted in real-life settings rather than in controlled environments.
  • An experiment is a type of study done on a sample or group of random participants, the results of which can be generalized to the whole population.
  • It often uses quantitative research methods that rely on numbers and statistics.
  • It is conducted in controlled environments, wherein some things or situations are manipulated.

See Also: Experimental vs Observational Studies

Famous Experiments in Psychology

1. The Marshmallow Experiment

Psychologist Walter Mischel conducted the marshmallow experiment at Stanford University in the 1960s to early 1970s. It was a simple test that aimed to define the connection between delayed gratification and success in life.

The instructions were fairly straightforward: children ages 4-6 were presented a piece of marshmallow on a table and they were told that they would receive a second piece if they could wait for 15 minutes without eating the first marshmallow.

About one-third of the 600 participants succeeded in delaying gratification to receive the second marshmallow. Mischel and his team followed up on these participants in the 1990s, learning that those who had the willpower to wait for a larger reward experienced more success in life in terms of SAT scores and other metrics.

This experiment also supported self-control theory, a theory in criminology that holds that people with greater self-control are less likely to end up in trouble with the law!

The classic marshmallow experiment, however, was debunked in a 2018 replication study done by Tyler Watts and colleagues.

This more recent experiment had a larger group of participants (900) and a better representation of the general population when it comes to race and ethnicity. In this study, the researchers found out that the ability to wait for a second marshmallow does not depend on willpower alone but more so on the economic background and social status of the participants.

2. The Bystander Effect

In 1964, Kitty Genovese was murdered in the neighborhood of Kew Gardens, New York. It was reported that there were up to 38 witnesses and onlookers in the vicinity of the crime scene, but nobody did anything to stop the murder or call for help.

Such a tragedy was the catalyst that inspired social psychologists Bibb Latane and John Darley to describe the phenomenon called the bystander effect or bystander apathy.

Subsequent investigations showed that this story was exaggerated and inaccurate, as there were actually only about a dozen witnesses, at least two of whom called the police. But the case of Kitty Genovese led to various studies that aim to shed light on the bystander phenomenon.

Latane and Darley tested bystander intervention in an experimental study. Participants were asked to answer a questionnaire inside a room, and they would either be alone or with two other participants (who were actually actors or confederates in the study). Smoke would then come out from under the door. The reaction time of participants was tested: how long would it take them to report the smoke to the authorities or the experimenters?

The results showed that participants who were alone in the room reported the smoke faster than participants who were with two passive others. The study suggests that the more onlookers are present in an emergency situation, the less likely someone would step up to help, a social phenomenon now popularly called the bystander effect.

3. Asch Conformity Study

Have you ever made a decision against your better judgment just to fit in with your friends or family? The Asch Conformity Studies will help you understand this kind of situation better.

In this experiment, a group of participants were shown a reference line alongside three numbered lines of different lengths and asked to identify which of the three matched it. However, only one true participant was present in every group and the rest were actors, most of whom gave the wrong answer.

Results showed that many participants went along with the wrong answer, even though they knew which line was correct. When asked why they gave the wrong answer, they said that they didn’t want to be branded as strange or peculiar.

This study goes to show that there are situations in life when people prefer fitting in to being right. It also shows that there is power in numbers: a group’s decision can overwhelm a person and make them doubt their judgment.

4. The Bobo Doll Experiment

The Bobo Doll Experiment was conducted by Dr. Albert Bandura, the proponent of social learning theory.

Back in the 1960s, the Nature vs. Nurture debate was a popular topic among psychologists. Bandura contributed to this discussion by proposing that human behavior is mostly influenced by environmental rather than genetic factors.

In the Bobo Doll Experiment, children were divided into three groups: one group was shown a video in which an adult acted aggressively toward the Bobo Doll, the second group was shown a video in which an adult played with the Bobo Doll, and the third group served as the control group where no video was shown.

The children were then led to a room with different kinds of toys, including the Bobo Doll they had seen in the video. Results showed that children tended to imitate the adults in the video. Those who were presented the aggressive model acted aggressively toward the Bobo Doll, while those who were presented the passive model showed less aggression.

While the Bobo Doll Experiment can no longer be replicated because of ethical concerns, it has laid out the foundations of social learning theory and helped us understand the degree of influence adult behavior has on children.

5. Blue Eye / Brown Eye Experiment

Following the assassination of Martin Luther King Jr. in 1968, third-grade teacher Jane Elliott conducted an experiment in her class. Although not a formal experiment in controlled settings, A Class Divided is a good example of a social experiment to help children understand the concept of racism and discrimination.

The class was divided into two groups: blue-eyed children and brown-eyed children. For one day, Elliott gave preferential treatment to her blue-eyed students, giving them more attention and pampering them with rewards. The next day, it was the brown-eyed students’ turn to receive extra favors and privileges.

As a result, whichever group of students was given preferential treatment performed exceptionally well in class, had higher quiz scores, and recited more frequently; students who were discriminated against felt humiliated, answered poorly in tests, and became uncertain with their answers in class.

This study is now widely taught in sociocultural psychology classes.

6. Stanford Prison Experiment

One of the most controversial and widely cited studies in psychology is the Stanford Prison Experiment, conducted by Philip Zimbardo in the basement of the Stanford psychology building in 1971. The hypothesis was that abusive behavior in prisons is influenced by the personality traits of the prisoners and prison guards.

The participants in the experiment were college students who were randomly assigned as either a prisoner or a prison guard. The prison guards were then told to run the simulated prison for two weeks. However, the experiment had to be stopped in just 6 days.

The prison guards abused their authority and harassed the prisoners through verbal and physical means. The prisoners, on the other hand, showed submissive behavior. Zimbardo decided to stop the experiment because the prisoners were showing signs of emotional and physical breakdown.

Although the experiment wasn’t completed, the results strongly showed that people can easily get into a social role when others expect them to, especially when it’s highly stereotyped.

7. The Halo Effect

Have you ever wondered why toothpastes and other dental products are endorsed in advertisements by celebrities more often than dentists? The Halo Effect is one of the reasons!

The Halo Effect shows how one favorable attribute of a person can gain them positive perceptions in other attributes. In the case of product advertisements, attractive celebrities are also perceived as intelligent and knowledgeable of a certain subject matter even though they’re not technically experts.

The Halo Effect originated in a classic study done by Edward Thorndike in the early 1900s. He asked military commanding officers to rate their subordinates based on different qualities, such as physical appearance, leadership, dependability, and intelligence.

The results showed that high ratings of a particular quality influence the ratings of other qualities, producing a halo effect of overall high ratings. The opposite also applied, which means that a negative rating in one quality also correlated with negative ratings in other qualities.

Experiments on the Halo Effect came in various formats as well, supporting Thorndike’s original theory. This phenomenon suggests that our perception of other people’s overall personality is hugely influenced by a quality that we focus on.

8. Cognitive Dissonance

There are experiences in our lives when our beliefs and behaviors do not align with each other and we try to justify them in our minds. This is cognitive dissonance, which was studied in an experiment by Leon Festinger and James Carlsmith back in 1959.

In this experiment, participants had to go through a series of boring and repetitive tasks, such as spending an hour turning wooden pegs on a board. After completing the tasks, they were then paid either $1 or $20 to tell the next participants that the tasks were extremely fun and enjoyable. Afterwards, participants were asked to rate the experiment. Those who were given $1 rated the experiment as more interesting and fun than those who received $20.

The results showed that those who received a smaller incentive to lie experienced cognitive dissonance — $1 wasn’t enough incentive for that one hour of painstakingly boring activity, so the participants had to justify that they had fun anyway.

Famous Case Studies in Psychology

9. Little Albert

In 1920, behaviourist theorists John Watson and Rosalie Rayner experimented on a 9-month-old baby to test the effects of classical conditioning in instilling fear in humans.

This was such a controversial study that it gained popularity in psychology textbooks and syllabi because it is a classic example of unethical research studies done in the name of science.

In one of the experiments, Little Albert was presented with a harmless stimulus or object, a white rat, which he wasn’t scared of at first. But every time Little Albert would see the white rat, the researchers would play a scary sound of hammer and steel. After about 6 pairings, Little Albert learned to fear the rat even without the scary sound.

Little Albert developed signs of fear to different objects presented to him through classical conditioning. He even generalized his fear to other stimuli not present in the course of the experiment.

10. Phineas Gage

Phineas Gage is such a celebrity in Psych 101 classes, even though the way he rose to popularity began with a tragic accident. He was a resident of Central Vermont and worked in the construction of a new railway line in the mid-1800s. One day, an explosive went off prematurely, sending a tamping iron straight into his face and through his brain.

Fortunately, Gage survived the accident, something that is considered a feat even to this day. He managed to find a job as a stagecoach driver after the accident. However, his family and friends reported that his personality changed so much that “he was no longer Gage” (Harlow, 1868).

New evidence on the case of Phineas Gage has since come to light, thanks to modern scientific studies and medical tests. However, there are still plenty of mysteries revolving around his brain damage and subsequent recovery.

11. Anna O.

Anna O., a social worker and feminist of German Jewish descent, was one of the first patients to receive psychoanalytic treatment.

Her real name was Bertha Pappenheim and she inspired much of Sigmund Freud’s works and books on psychoanalytic theory, although they hadn’t met in person. Their connection was through Joseph Breuer, Freud’s mentor when he was still starting his clinical practice.

Anna O. suffered from paralysis, personality changes, hallucinations, and rambling speech, but her doctors could not find the cause. Joseph Breuer was then called to her house for intervention and he performed psychoanalysis, also called the “talking cure”, on her.

Breuer would tell Anna O. to say anything that came to her mind, such as her thoughts, feelings, and childhood experiences. It was noted that her symptoms subsided by talking things out.

However, Breuer later referred Anna O. to the Bellevue Sanatorium, where she recovered and set out to be a renowned writer and advocate of women and children.

12. Patient HM

H.M., or Henry Gustav Molaison, was a severe amnesiac who had been the subject of countless psychological and neurological studies.

Henry was 27 when he underwent brain surgery to cure the epilepsy that he had been experiencing since childhood. In an unfortunate turn of events, he lost his memory because of the surgery and his brain also became unable to store long-term memories.

He was then regarded as someone living solely in the present, forgetting an experience as soon as it happened and only remembering bits and pieces of his past. Over the years, his amnesia and the structure of his brain had helped neuropsychologists learn more about cognitive functions.

Suzanne Corkin, a researcher, writer, and good friend of H.M., recently published a book about his life. Entitled Permanent Present Tense, this book is both a memoir and a case study following the struggles and joys of Henry Gustav Molaison.

13. Chris Sizemore

Chris Sizemore gained celebrity status in the psychology community when she was diagnosed with multiple personality disorder, now known as dissociative identity disorder.

Sizemore had several alter egos, which included Eve Black, Eve White, and Jane. Various papers about her stated that these alter egos were formed as a coping mechanism against the traumatic experiences she underwent in her childhood.

Sizemore said that although she has succeeded in unifying her alter egos into one dominant personality, there were periods in the past experienced by only one of her alter egos. For example, her husband married her Eve White alter ego and not her.

Her story inspired her psychiatrists to write a book about her, entitled The Three Faces of Eve, which was then turned into a 1957 movie of the same title.

14. David Reimer

When David was just 8 months old, he lost his penis because of a botched circumcision operation.

Psychologist John Money then advised Reimer’s parents to raise him as a girl instead, naming him Brenda. His gender reassignment was supported by subsequent surgery and hormonal therapy.

Money described Reimer’s gender reassignment as a success, but problems started to arise as Reimer was growing up. His boyishness was not completely subdued by the hormonal therapy. When he was 14 years old, he learned about the secrets of his past and he underwent gender reassignment to become male again.

Reimer became an advocate for children undergoing the same difficult situation he had been in. His life story ended when he was 38 as he took his own life.

15. Kim Peek

Kim Peek was the inspiration behind Rain Man, an Oscar-winning movie about an autistic savant character played by Dustin Hoffman.

The movie was released in 1988, a time when autism wasn’t widely known and acknowledged yet. So it was an eye-opener for many people who watched the film.

In reality, Kim Peek was a non-autistic savant. He was exceptionally intelligent despite the brain abnormalities he was born with. He was like a walking encyclopedia, knowledgeable about travel routes, US zip codes, historical facts, and classical music. He also read and memorized approximately 12,000 books in his lifetime.

This list of experiments and case studies in psychology is just the tip of the iceberg! There are still countless interesting psychology studies that you can explore if you want to learn more about human behavior and dynamics.

You can also conduct your own mini-experiment or participate in a study conducted in your school or neighborhood. Just remember that there are ethical standards to follow so as not to repeat the lasting physical and emotional harm done to Little Albert or the Stanford Prison Experiment participants.

Asch, S. E. (1956). Studies of independence and conformity: I. A minority of one against a unanimous majority. Psychological Monographs: General and Applied, 70(9), 1–70. https://doi.org/10.1037/h0093718

Bandura, A., Ross, D., & Ross, S. A. (1961). Transmission of aggression through imitation of aggressive models. The Journal of Abnormal and Social Psychology, 63(3), 575–582. https://doi.org/10.1037/h0045925

Elliott, J., Yale University, WGBH (Television station: Boston, Mass.), & PBS DVD (Firm). (2003). A class divided. New Haven, Conn.: Yale University Films.

Festinger, L., & Carlsmith, J. M. (1959). Cognitive consequences of forced compliance. The Journal of Abnormal and Social Psychology, 58(2), 203–210. https://doi.org/10.1037/h0041593

Haney, C., Banks, W. C., & Zimbardo, P. G. (1973). A study of prisoners and guards in a simulated prison. Naval Research Review, 30, 4–17.

Latane, B., & Darley, J. M. (1968). Group inhibition of bystander intervention in emergencies. Journal of Personality and Social Psychology, 10(3), 215–221. https://doi.org/10.1037/h0026570

Mischel, W. (2014). The Marshmallow Test: Mastering self-control. Little, Brown and Co.

Thorndike, E. (1920). A constant error in psychological ratings. Journal of Applied Psychology, 4, 25–29. http://dx.doi.org/10.1037/h0071663

Watson, J. B., & Rayner, R. (1920). Conditioned emotional reactions. Journal of Experimental Psychology, 3(1), 1.

IResearchNet

Social Psychology Experiments

Social psychology experiments have played a pivotal role in unraveling the intricate tapestry of human behavior, cognition, and emotions within the social context. These experiments represent more than just scientific inquiries; they serve as windows into the fundamental aspects of human nature and the ways in which we interact with others. This article delves into a selection of famous experiments in social psychology, each a milestone in understanding the complexities of human social behavior.

Thesis Statement: The significance of these famous experiments extends far beyond the realm of academia, shaping our understanding of conformity, obedience, group dynamics, morality, and the subconscious biases that influence our decisions and actions. Through these groundbreaking studies, we gain valuable insights into the human condition, prompting us to question, explore, and reflect upon the intricate web of social interactions that define our lives.

Famous Experiments in Social Psychology

The Bennington College study was conducted by sociologist Theodore Newcomb from 1935 until 1939. The study examined the attitudes of students attending the then all-female Bennington College early in the college’s history; indeed, the study began during the first year that the college had a senior class.

Solomon Asch’s Conformity experiments in the 1950s starkly demonstrated the power of conformity on people’s estimation of the length of lines. On over a third of the trials, participants conformed to the majority, even though the majority judgment was clearly wrong. Seventy-five percent of the participants conformed at least once during the experiment.

In Muzafer Sherif’s Robbers Cave experiment (1954), boys were divided into two competing groups to explore how much hostility and aggression would emerge. The study is a classic demonstration of realistic group conflict theory, because the intergroup conflict was induced through competition over resources.

In Leon Festinger’s Cognitive Dissonance experiment, subjects were asked to perform a boring task. They were then divided into two groups and paid different amounts: one group was paid $1 to tell another person that they had enjoyed the task, and the other group was paid $20 to tell the same lie. The first group ($1) later reported liking the task more than the second group ($20) did. With so little external justification for lying, the $1 group justified the lie by changing their previously unfavorable attitudes about the task (Festinger and Carlsmith 1959).

Stanley Milgram’s Obedience to Authority experiment showed how far people would go to obey an authority figure. Following the events of the Holocaust in World War II, Milgram’s experiments of the 1960s and 1970s showed that ordinary American citizens were capable of following orders to the point of causing extreme suffering in an innocent human being.

Albert Bandura’s Bobo Doll experiment has demonstrated how aggression is learned by imitation (Bandura et al. 1961). Bandura’s experimental work was one of the first studies in a long line of research showing how exposure to media violence leads to aggressive behavior in the observers.

In Philip Zimbardo’s Stanford Prison experiment a simulated exercise between student prisoners and guards showed how far people would follow an adopted role. This was an important demonstration of the power of the immediate social situation, and its capacity to overwhelm normal personality traits (Haney et al. 1973).

The Milgram Experiment

Background and Context

The Milgram Experiment, conducted by psychologist Stanley Milgram in the early 1960s, arose in a climate of post-World War II questions about obedience, authority, and moral responsibility. Inspired by the Nuremberg Trials, the 1961 trial of Adolf Eichmann, and the revelation of the atrocities committed by Nazi personnel who claimed to be “just following orders,” Milgram sought to explore the extent to which individuals would obey authority figures, even when doing so conflicted with their own moral beliefs.

Experiment Setup and Procedure

The experiment involved three key roles: the experimenter (authority figure), the teacher (participant), and the learner (an actor). Participants believed they were assisting in a study examining the effects of punishment on learning. The teacher was instructed to administer increasingly severe electric shocks to the learner for incorrect responses in a word-pair memory test. Unbeknownst to the teacher, the learner did not actually receive shocks, but their responses were scripted to simulate distress and pain.

Ethical Concerns and Criticisms

The Milgram Experiment has been widely criticized for its ethical implications. Participants were exposed to significant psychological stress and believed they were causing harm to another person, potentially leading to long-lasting emotional trauma. Critics argue that the experiment lacked proper informed consent, and the debriefing process may not have been sufficient to alleviate the distress experienced by participants.

Major Findings and Their Impact

The Milgram Experiment revealed astonishing results. Contrary to expectations, a large proportion of participants (about 65 percent in the best-known version of the study), under the pressure of the authority figure’s commands, continued to administer shocks up to the maximum, potentially lethal level, even when they were aware of the learner’s distress. This demonstrated the profound influence of authority figures on individual behavior.

The study shed light on the psychology of obedience and the potential for ordinary people to engage in harmful actions under the guise of following orders. Milgram’s findings raised ethical and moral questions about blind obedience and individual responsibility in the face of authority.

The Stanford Prison Experiment

The Stanford Prison Experiment, conducted by psychologist Philip Zimbardo in 1971, stands as one of the most notorious and influential studies in social psychology. Emerging during a tumultuous period in American history marked by social unrest and the questioning of authority, the experiment sought to investigate the psychological dynamics of power, authority, and the consequences of perceived roles within a simulated prison environment.

Description of the Experiment

The experiment involved the transformation of the basement of Stanford University’s psychology department into a mock prison. Volunteers were randomly assigned to play the roles of either guards or prisoners in a simulated prison environment. The participants quickly adapted to their roles, with guards displaying authoritarian behaviors, and prisoners experiencing psychological distress and rebellion. The study was originally intended to last two weeks but was terminated after only six days due to the alarming and unethical behaviors exhibited by both guards and prisoners.

Ethical Controversies

The Stanford Prison Experiment has been mired in ethical controversies. Critics argue that the psychological harm inflicted upon participants was severe, and the lack of proper oversight allowed the study to veer into dangerous territory. Questions have also been raised regarding the informed consent process, as participants were not fully aware of the potential psychological consequences of their involvement.

Key Findings and Implications

Despite its ethical shortcomings, the Stanford Prison Experiment yielded valuable insights into the malleability of human behavior in response to situational factors. It demonstrated how ordinary individuals could quickly adopt abusive and authoritarian roles when placed in positions of power. The study underscored the importance of ethical considerations in psychological research and prompted discussions about the responsibility of researchers to ensure the well-being of participants.

The implications of the study extend beyond academia, offering a cautionary tale about the potential for abuses of power and authority. It has influenced discussions on ethics in research, the psychology of group dynamics, and the understanding of how situational factors can shape behavior.

The Asch Conformity Experiment

Introduction and Historical Context

The Asch Conformity Experiment, conducted by Solomon Asch in the 1950s, remains a seminal study in the field of social psychology. Emerging in the post-World War II era, this experiment aimed to investigate the extent to which individuals conform to group norms and the impact of social pressure on individual decision-making.

Experiment Design and Methodology

In the Asch Conformity Experiment, participants were placed in a group of individuals, with the participant being the only true subject. The group was presented with a simple perceptual task: comparing the length of lines. Participants were asked to state which of several lines was of equal length to a reference line. Unknown to the participant, the other group members were confederates who had been instructed to give incorrect answers in some trials.

During the critical trials, the confederates deliberately provided incorrect answers that contradicted the obvious correct response. The participant, seated at the end of the row, faced the dilemma of whether to conform to the group’s incorrect consensus or assert their own judgment.

Conformity Results and Interpretations

The results of the Asch Conformity Experiment were striking. Despite the obvious correctness of their own judgments, participants frequently succumbed to group pressure and provided incorrect responses to match the consensus of the group. On average, participants conformed to the group’s incorrect answer on about one-third of the critical trials, and about 75 percent of participants conformed at least once.

Asch’s findings underscored the potent influence of social conformity and the willingness of individuals to abandon their own perceptions and judgment in favor of group consensus. He also identified several factors that influenced the likelihood of conformity, such as the size of the majority and the unanimity of the group.

Influence on Social Psychology and Beyond

The Asch Conformity Experiment significantly impacted social psychology by highlighting the powerful role of social influence on human behavior. It prompted further research into group dynamics, conformity, and the psychology of social norms. Asch’s work laid the foundation for studies on topics such as groupthink, normative influence, and the conditions under which individuals are more likely to resist social pressure.

Beyond social psychology, the experiment has practical implications for understanding how conformity operates in everyday life, from peer pressure among adolescents to decision-making in organizations. The study has also been instrumental in discussions about individual autonomy and the tension between conforming to societal expectations and asserting one’s independent judgment.

The Asch Conformity Experiment remains a timeless exploration of the human propensity to conform and the psychological mechanisms at play when individuals navigate the tension between individuality and social cohesion.

The Robbers Cave Experiment

Background and Purpose of the Study

The Robbers Cave Experiment, conducted by psychologist Muzafer Sherif and his colleagues in 1954, was designed to investigate intergroup conflict and cooperation among children. The study emerged during a time when Cold War tensions and conflicts between nations were a prominent backdrop, prompting Sherif to explore the dynamics of group conflict on a smaller scale.

The central purpose of the study was to understand how group identities, competition, and cooperation could influence the attitudes and behaviors of individuals within groups and across groups. It sought to shed light on the origins of intergroup hostility and the potential for reconciliation.

Experimental Design and Procedures

The study took place at Robbers Cave State Park in Oklahoma and involved three phases.

  • Group Formation : In the first phase, a group of 22 boys was divided into two groups, the Rattlers and the Eagles, with no prior knowledge of each other. The boys formed strong group identities through team-building activities and bonding experiences.
  • Intergroup Competition : In the second phase, the two groups were introduced to each other and engaged in competitive activities, such as sports and contests, where rivalries quickly developed. The competition intensified intergroup conflicts, leading to name-calling, vandalism, and hostility.
  • Intervention and Cooperation : To address the escalating conflict, the researchers initiated activities that required the groups to collaborate, such as solving common problems and working together towards common goals. These cooperative experiences aimed to reduce intergroup tensions.

Notable Findings and Insights on Intergroup Conflict

The Robbers Cave Experiment yielded several important findings:

  • Intergroup conflict emerged swiftly when groups were formed and exposed to competition, even among previously unacquainted individuals.
  • The competition exacerbated stereotypes and prejudices between the groups.
  • Cooperation between groups, when introduced strategically, had the potential to reduce hostilities and foster intergroup harmony.
  • The study illustrated the role of superordinate goals (common objectives that transcended group boundaries) in promoting cooperation and reducing conflict.

Practical Applications and Contributions

The Robbers Cave Experiment has had lasting implications in the fields of social psychology and conflict resolution. It provided valuable insights into the dynamics of intergroup conflict and cooperation, shedding light on the processes by which hostility between groups can be both fueled and mitigated.

The concept of superordinate goals, derived from the study, has been widely applied in conflict resolution efforts. By identifying shared objectives that require collaboration across group lines, individuals and societies have been able to bridge divides and work together toward common aims. The study’s lessons have informed strategies for reducing prejudice, improving intergroup relations, and fostering peace in various contexts, including education, organizational management, and international diplomacy.

The Robbers Cave Experiment remains a classic illustration of how group identities and competition can lead to conflict, while also highlighting the potential for cooperation and reconciliation when shared goals and positive intergroup interactions are promoted.

The Zimbardo Stanford Prison Experiment

Overview of the Experiment

The Zimbardo Stanford Prison Experiment, conducted by psychologist Philip Zimbardo in 1971, is a widely recognized and controversial study in the realm of social psychology. The experiment was designed to investigate the psychological effects of perceived power and authority within a simulated prison environment.

In this study, participants were randomly assigned to play the roles of either guards or prisoners in a mock prison set up in the basement of Stanford University’s psychology department. The experiment aimed to explore how individuals, when placed in positions of power or vulnerability, would react and adapt to their roles.

Ethical Considerations and Criticisms

The Zimbardo Stanford Prison Experiment has been marred by significant ethical concerns and criticisms. The study generated intense psychological distress among participants, with the guards exhibiting abusive and authoritarian behaviors, and the prisoners experiencing emotional and psychological harm. The experiment’s duration, initially planned for two weeks, was terminated after only six days due to the extreme and unethical behaviors displayed by participants.

Critics argue that the study lacked proper informed consent, as participants were not fully aware of the potential psychological consequences of their involvement. The absence of proper oversight and safeguards to protect the well-being of participants has been a focal point of ethical critique.

Psychological Effects on Participants

The Zimbardo Stanford Prison Experiment had profound psychological effects on its participants. Guards, assigned to positions of power, quickly adopted authoritarian roles, displaying abusive behaviors toward the prisoners. Prisoners, on the other hand, experienced distress, humiliation, and a sense of powerlessness.

The psychological effects on participants were so severe that the study was terminated prematurely to prevent further harm. Post-experiment interviews revealed that some participants struggled to differentiate between their roles and their true identities, emphasizing the significant impact of situational factors on individual behavior.

Enduring Influence on Social Psychology

Despite its ethical controversies, the Zimbardo Stanford Prison Experiment had a lasting influence on the field of social psychology. It highlighted the malleability of human behavior in response to situational factors and the potential for ordinary individuals to engage in abusive actions when placed in positions of authority.

The study contributed to discussions on ethics in research and the responsibility of researchers to prioritize the well-being of participants. It also prompted further investigations into the psychology of power, authority, and obedience, leading to a deeper understanding of the complexities of human behavior within social contexts.

The Zimbardo Stanford Prison Experiment remains a cautionary tale in the annals of psychology, reminding researchers of the ethical imperative to protect participants and the enduring influence of situational factors on human behavior.

The Little Albert Experiment

Introduction to the Study

The Little Albert Experiment is a classic and ethically controversial study conducted by behaviorist John B. Watson and his graduate student Rosalie Rayner in 1920. The experiment aimed to investigate the process of classical conditioning, particularly the acquisition of phobias and emotional responses in humans.

The study is named after its subject, a 9-month-old boy known as “Little Albert.” It remains a notable case study in the field of psychology due to its ethical concerns and contributions to the understanding of learned behaviors.

Experiment Details and Ethical Concerns

In the Little Albert Experiment, Little Albert was exposed to a white rat, a rabbit, a dog, a monkey, and other stimuli. Initially, he displayed no fear or aversion to these objects. Watson and Rayner then sought to condition an emotional response in Little Albert by pairing the presentation of the white rat with a loud, frightening noise (produced by striking a suspended steel bar with a hammer). As a result of this pairing, Little Albert began to exhibit fear and distress in response to the previously neutral white rat, and the fear generalized to other, similar stimuli.

The ethical concerns surrounding this experiment are significant. Little Albert was not provided with informed consent, and his emotional well-being was disregarded. The study also lacked proper debriefing, and the long-term consequences of Little Albert’s conditioning were not addressed. The ethical standards of today would prohibit such a study from being conducted.

Conditioning Process and Long-Term Implications

The Little Albert Experiment demonstrated the principles of classical conditioning in humans. It illustrated how conditioned emotional responses, such as fear and anxiety, could be acquired through association with previously neutral stimuli. In this case, Little Albert learned to fear the white rat because it had been consistently paired with a loud, frightening noise.

The long-term implications of the study are less clear due to a lack of follow-up research on Little Albert. It remains unknown whether his conditioned fears persisted or how they may have impacted his later development. The study’s ethical shortcomings prevent a comprehensive assessment of its long-term effects.

Contemporary Perspectives on the Study

The Little Albert Experiment is viewed with skepticism and ethical concern from contemporary perspectives. It serves as a reminder of the importance of informed consent, debriefing, and the ethical treatment of research participants in psychological research. Ethical standards in research have evolved significantly since the time of the experiment, emphasizing the need to prioritize the well-being and rights of participants.

While the Little Albert Experiment contributed to the understanding of classical conditioning, it also serves as a cautionary tale about the ethical boundaries of research and the potential consequences of disregarding the psychological well-being of participants. Modern research ethics prioritize the protection and respect of individuals involved in psychological studies, ensuring that similar experiments would not be conducted today.

The Blue-Eyes/Brown-Eyes Exercise

Historical Context and Significance

The Blue-Eyes/Brown-Eyes Exercise is a landmark social experiment conducted by educator and activist Jane Elliott in the late 1960s. The experiment was born out of the civil rights movement in the United States and sought to address issues of racism, discrimination, and prejudice. Against the backdrop of racial tensions and the struggle for civil rights, Elliott designed the exercise to provide a firsthand experience of the effects of discrimination.

Experiment Design and Outcomes

In the Blue-Eyes/Brown-Eyes Exercise, Elliott divided her third-grade students into two groups based on eye color, designating one group as “superior” (those with blue eyes) and the other as “inferior” (those with brown eyes). Over the course of the exercise, Elliott systematically treated the two groups differently, providing privileges to the superior group while subjecting the inferior group to discrimination and negative stereotypes.

The results of the experiment were profound. Children in the inferior group quickly internalized their assigned role and began to exhibit lower self-esteem, diminished academic performance, and a range of negative emotional responses. On the other hand, those in the superior group displayed increased arrogance and a sense of entitlement.

Elliott conducted the exercise over multiple days, reversing the roles on the second day to provide a taste of both sides of discrimination. The exercise aimed to create empathy and understanding among participants by allowing them to personally experience the emotional and psychological impact of discrimination.

Broader Societal Impact and Implications

The Blue-Eyes/Brown-Eyes Exercise had a significant societal impact. It garnered attention in the media and brought issues of racism and discrimination to the forefront of public consciousness. Elliott’s work challenged prevailing beliefs about the nature of prejudice and discrimination, highlighting the role of societal conditioning in perpetuating such attitudes.

The exercise also emphasized the importance of empathy and perspective-taking in combatting racism and prejudice. By allowing participants to experience discrimination firsthand, Elliott aimed to foster greater empathy and understanding among individuals of different racial backgrounds.

Experimentation in Social Psychology

Experimentation Definition

Experimentation, in its simplest form, is a research method used to investigate the presence or absence of a causal relationship between two variables. This method involves systematically manipulating one variable, known as the independent variable, and then assessing the impact or effect of this manipulation on another variable, referred to as the dependent variable. Through experimentation, researchers aim to discern whether changes in the independent variable cause changes in the dependent variable, providing insights into causal relationships within a given phenomenon or context. This systematic and controlled approach allows for rigorous testing of hypotheses and the establishment of cause-and-effect relationships in scientific inquiry.

Importance and Consequences of Experiments

The importance and consequences of experiments in research are closely tied to their unique ability to establish causal relationships. Here are key features of experiments that facilitate the ability to draw causal conclusions and their implications:

  • Establishing Causality: Experiments are highly valuable because they allow researchers to make statements about causality. By systematically manipulating the independent variable and assessing its impact on the dependent variable, researchers can infer that changes in the independent variable cause changes in the dependent variable. This cause-and-effect relationship is central to scientific inquiry and helps uncover the mechanisms underlying various phenomena.
  • Directionality of Relationship: Experiments provide a clear temporal sequence where changes in the independent variable precede the assessment of the dependent variable. This temporal order is crucial for determining the directionality of the relationship between variables. In causal relationships, the cause must precede the effect. Experiments ensure that this criterion is met, enabling researchers to infer the causal direction.
  • Random Assignment: In experiments, participants are randomly assigned to different experimental groups. Random assignment ensures that each participant has an equal chance of being assigned to any experimental condition, creating equivalent groups at the outset. This eliminates the possibility that pre-existing differences between participants could account for observed differences in the dependent variable. Random assignment strengthens the validity of causal claims by minimizing confounding variables.
  • Isolation of Effects: Experiments enable researchers to isolate the effects of the independent variable by controlling all other aspects of the environment. This control ensures that all participants have a similar experience, except for the experimental manipulation. By eliminating extraneous variables, researchers can attribute any observed differences in the dependent variable solely to the independent variable. This isolation of effects enhances the internal validity of the study.
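The random assignment and isolation-of-effects ideas above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the participant IDs, the condition names, the baseline score of 10, and the built-in treatment effect of 2 are all hypothetical and do not come from any real study.

```python
import random
from statistics import mean


def randomly_assign(participants, conditions=("control", "treatment"), seed=None):
    """Shuffle participants, then deal them into conditions in round-robin order.

    Shuffling first gives every participant the same chance of landing in
    any condition; dealing round-robin keeps the group sizes balanced.
    """
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    groups = {condition: [] for condition in conditions}
    for index, participant in enumerate(shuffled):
        groups[conditions[index % len(conditions)]].append(participant)
    return groups


def mean_difference(scores, groups, a="treatment", b="control"):
    """Difference in the mean dependent-variable score between two conditions."""
    return mean(scores[p] for p in groups[a]) - mean(scores[p] for p in groups[b])


# Hypothetical participants (IDs are made up for this sketch).
participants = [f"P{i:02d}" for i in range(1, 21)]
groups = randomly_assign(participants, seed=42)

# Hypothetical dependent variable: everyone starts at a baseline of 10 and
# the manipulation adds 2. Nothing else differs between the groups, which
# is the "isolation of effects" idea in its simplest form.
scores = {p: 10 + (2 if p in groups["treatment"] else 0) for p in participants}

effect = mean_difference(scores, groups)
print(effect)
```

Because the only thing that differs between the two simulated groups is the manipulation, the difference in means recovers the built-in effect of 2. Real data would also contain individual differences and measurement noise, which is why random assignment and a statistical test are needed before a causal conclusion is drawn.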

In summary, experiments are a powerful research method that allows for the establishment of causal relationships in scientific inquiry. Their ability to establish causality, ensure temporal precedence, employ random assignment, and isolate the effects of the independent variable makes experiments a cornerstone of empirical research. Researchers must adhere to these principles to draw valid and reliable conclusions about the causal relationships between variables, advancing our understanding of various phenomena in social psychology and other fields.

Some scholars have questioned the utility of experimentation, noting that the experiments which researchers design sometimes do not resemble the circumstances that people encounter in their everyday lives. However, experimentation is the only research method that allows one to definitively establish the existence of a causal relationship between two or more variables.


Social Psychology Experiments

Social psychology experiments can explain how thoughts, feelings and behaviors are influenced by the presence of others.

Typically, social psychology studies investigate how someone's behavior influences a group's behavior or internal states, such as attitude or self-concept.

Obedience to Authority

"I was only following orders" Legal defence by a Nazi leader at the Nuremberg trial following World War II

The aftermath of World War II led scientists to investigate what made people "follow orders" even when the orders were horrific. The Stanley Milgram Experiment showed that non-Nazi populations would also follow orders to harm other people. It was not a uniquely German phenomenon, as many had thought.

Milgram's Lost Letter Experiment

Classic social psychology experiments are widely used to expose the key elements of aggressive behavior, prejudice and stereotyping. Social group prejudice is manifested in people's unfavorable attitudes towards a particular social group. Stanley Milgram's Lost Letter Experiment further explains this.

Obedience to a Role - Dehumanization

The Abu Ghraib prison episode was yet another example of the power of predefined roles. The Stanford Prison Experiment by Philip Zimbardo demonstrated the powerful effect that our perception of the expectations attached to a role has on our behavior.

Solomon Asch wanted to test how much people are influenced by others' opinions in the Asch Conformity Experiment.

Observational Role Learning

Behaviorists ruled psychology for a long time. They focused on how individuals learn by trial and error. Albert Bandura thought that humans are much more than "learning machines". He argued that we learn from role models, an idea that led to his social cognitive theory. It all started with the Bobo Doll Experiment.

Helping Behavior - Good Samaritan

The story of the Good Samaritan makes you wonder what made the Samaritan help the stranger, and why the priest and the Levite did not. The Good Samaritan Experiment explores the causes of failing to show helping behavior or altruism.

Cognitive Dissonance Experiment

The Cognitive Dissonance Experiment by Leon Festinger assumes that people hold many different cognitions about their world and tests what happens when those cognitions do not fit together. See also the more in-depth article about the Cognitive Dissonance Experiment.

Bystander Effect

The Bystander Apathy Experiment was inspired by the highly publicised murder of Kitty Genovese in 1964.

Groups and Influence On Opinion

Sherif's classic social psychology experiment named Robbers Cave Experiment dealt with in-group relations, out-group relations and intergroup relations.

The Social Judgment Experiment was designed to explore the internal processes of an individual's judgment and intergroup discrimination, how little it takes for people to form into groups, and the degree to which people within a group tend to favour the in-group and discriminate against the out-group.

Halo Effect

The Halo Effect was demonstrated by Nisbett and Wilson's experiment. It fits the situation of Hollywood celebrities where people readily assume that since these people are physically attractive, it also follows that they are intelligent, friendly, and display good judgment as well. This also greatly applies to other well-known people such as politicians.

Wegner's Dream Rebound Experiment

According to studies, suppressed thoughts may resurface later in the form of dreams. Psychologist Daniel M. Wegner demonstrated this in his experiment on the effects of thought suppression.

False Consensus

Everyone has biases, even when estimating other people's behaviors and their causes. One of these is called the false consensus bias. Psychologist Professor Lee Ross conducted studies to show how the false consensus effect operates.

Interpersonal Bargaining

Bargaining is one of the many activities we usually engage in without even realizing it. The Morton Deutsch and Robert Krauss Experiment investigated two central factors in bargaining, namely how we communicate with each other and the use of threats.

Understanding and Belief

Daniel Gilbert and his colleagues put to the test Rene Descartes' and Baruch Spinoza's opposing views on whether belief is automatic or is a separate process that follows understanding. The debate had stood for some 400 years before it was tested experimentally.

Self-Deception

People lie all the time, even to themselves, and surprisingly, it works! This is the finding of the Quattrone and Tversky Experiment, which was published in the Journal of Personality and Social Psychology.

Overjustification Effect

The overjustification effect happens when an external incentive, such as a reward, decreases a person's intrinsic motivation to perform a particular task. Lepper, Greene and Nisbett confirmed this in their field experiment in a nursery school.

Chameleon Effect

Also called unintentional mirroring, the chameleon effect usually applies to people who are getting along so well that they tend to mimic each other's body posture, hand gestures, and speaking accents, among other things. This was confirmed by the Chartrand and Bargh experiments.

Confirmation Bias

Confirmation bias is also known as selective collection of evidence. It is considered an effect of information processing in which people behave so as to make their expectations come true. People tend to favor information that confirms their preconceptions or hypotheses, regardless of whether the information is true or false.

Choice Blindness

Choice blindness refers to ways in which people are blind to their own choices and preferences. Lars Hall and Peter Johansson further explain this phenomenon in their study.

Stereotypes

The Clark Doll Test illustrates the ill effects of stereotyping and racial segregation in America. It illustrated the damage caused by systematic segregation and racism on children's self-perception at the young age of five.

Selective Group Perception

In selective group perception, people tend to actively filter information they think is irrelevant. This effect is demonstrated in Hastorf and Cantril's Case Study: They Saw a Game .

Changing Behaviour When Being Studied

The Hawthorne Effect is the process where human subjects of an experiment change their behavior, simply because they are being studied. This is one of the hardest inbuilt biases to eliminate or factor into the design.


Oskar Blakstad (Oct 10, 2008). Social Psychology Experiments. Retrieved Sep 17, 2024 from Explorable.com: https://explorable.com/social-psychology-experiments

You Are Allowed To Copy The Text

The text in this article is licensed under the Creative Commons-License Attribution 4.0 International (CC BY 4.0) .

This means you're free to copy, share and adapt any parts (or all) of the text in the article, as long as you give appropriate credit and provide a link/reference to this page.

That is it. You don't need our permission to copy the article; just include a link/reference back to this page. You can use it freely (with some kind of link), and we're also okay with people reprinting in publications like books, blogs, newsletters, course-material, papers, wikipedia and presentations (with clear attribution).


LESSWRONG LW

28 social psychology studies from *experiments with people* (frey & gregg, 2017).

I'm reading a very informative and fun book about human social psychology, Experiments With People (2nd ed, 2018).

... 28 social psychological experiments that have significantly advanced our understanding of human social thinking and behavior. Each chapter focuses on the details and implications of a single study, while citing related research and real-life examples along the way.

Here I summarize each chapter so that you can save time. Some results are old news to me, but some were quite surprising. I often skip over the experimental details, such as how the psychologists used ingenious tricks to make sure the participants don't guess the true purposes of the experiments. Refer to originals for details.

The experiments range from the 1950s to the 2010s, and literature from before 1900 is occasionally quoted.

Chapters I find especially interesting are:

  • Chap 14. It lists the many failures of introspection, and raises the question of what consciousness can do.
  • Chap 16. It has significant similarity with superrationality and acausal trade.
  • Chap 20. It warns about how credulous humans are.
  • Chap 27. It is about the human fear of death and the psychological defenses against it.
  • Chap 28. It shows how belief in free will can be motivated by a desire to punish immoral behaviors. Understanding why people believe in free will is necessary for a theory of what is the use of the belief in free will.

Chap 1. Conforming to group norms

Asch conformity experiment , from Opinions and Social Pressure (Asch, 1955)

Video demonstration.

Groups of eight participated in a simple "perceptual" task. In reality, all but one of the participants were actors, and the true focus of the study was about how the remaining participant would react.
Each student viewed a card with a line on it, followed by another with three lines labeled A, B, and C (see accompanying figure). One of these lines was the same as that on the first card, and the other two lines were clearly longer or shorter. Each participant was then asked to say aloud which line matched the length of that on the first card... The actors would always unanimously nominate one comparator, but on certain trials they would give the correct response and on others, an incorrect response. The group was seated such that the real participant always responded last.

It was found that

  • When at least 3 actors unanimously gave the wrong answer, the participant went along about 1/3 of the time.
  • Increasing the number of actors above 3 did not increase compliance.
  • Even when the difference between the lines was 7 inches, there were still some who complied.
  • If at least one actor disagreed with the majority, the participant's compliance decreased.
  • If the fellow dissenter later joined the majority, the participant's compliance rose back to the same level of about 1/3.
  • If the fellow dissenter instead left, the participant's compliance increased only slightly.

There are two reasons for this compliance. One is heuristic about knowledge: the majority is usually more correct. Another is normative: social acceptance matters more than being correct.

The effect of a dissenting minority is notable.

Research finds that, whereas majorities inspire heuristic judgments and often compliance, minorities provoke a more systematic consideration of arguments, and possibly, an internal acceptance of their position. (Nemeth, 1987) Majorities tend to have a greater impact on public conformity, whereas minorities tend to have more effect on private conformity. (Chaiken & Stangor, 1987)

Chap 2. Forced compliance theory and cognitive dissonance

In When Prophecy Fails , the story of a UFO cult was detailed. When the doomsday prophecy failed, most people left, but some became even firmer believers.

(My own example, not appearing in the book.) In Borges's story A Problem, Borges asks: how would Don Quixote react if he killed a man?

Having killed the man, don Quixote cannot allow himself to think that the terrible act is the work of a delirium; the reality of the effect makes him assume a like reality of cause, and don Quixote never emerges from his madness.

This chapter reviews Cognitive consequences of forced compliance (Festinger & Carlsmith, 1959)

  • Participants were asked to do an extremely boring task.
  • Then the experimenter asked the participant to tell the next participant, falsely, that the experiment was fun. Half were paid $1 for this, and the other half $20.
  • A control group was not asked to lie.
  • Then they were nudged to take a survey about how they felt about the experiment.

The result: those paid $1 thought the experiment was fun, those paid $20 thought it was boring, and those who were not asked to lie thought it was very boring.

Festinger explains this by the theory of cognitive dissonance:

  • An attitude (thinking the experiment was boring) and a behavior (saying it was fun) clash, creating an uncomfortable feeling.
  • The participant then is motivated to remove the discomfort by changing the attitude by rationalization (thinking the experiment was actually fun).
  • If the participant was paid $20, then there was no dissonance, as there was a ready explanation of the dissonant behavior.
  • If the participant was paid $1, then there was dissonance, because the participant regarded the lying behavior as mostly voluntary .

An alternative explanation from Self-perception: An alternative interpretation of cognitive dissonance phenomena (Bem, 1967) :

  • We don't form beliefs about ourselves by direct introspection. Instead, we infer them by observing our own behavior.
  • When we behave against our previous self-beliefs, we update those self-beliefs.

See also Chap 14 for more on the lack of introspection.

Current consensus is that both theories are correct, in different situations. The self-perception effect happens when the behavior is mildly different from self-beliefs, and the cognitive dissonance effect happens when the behavior is grossly different.

There are also many complications, such as in Double forced compliance and cognitive dissonance theory (Girandola, 1997), which reported that even when participants performed a boring task and then told others how boring it was, they afterwards still felt the task was more interesting.

There is a lot of ongoing research.

Chap 3. Suffering can create liking

Such curious phenomena as hazing have been studied since The effect of severity of initiation on liking for a group (Aronson & Mills, 1959)

An experiment was conducted to test the hypothesis that persons who undergo an unpleasant initiation to become members of a group increase their liking for the group; that is, they find the group more attractive than do persons who become members without going through a severe initiation.

The group was a made-up thing by the experimenters. It purports to discuss interesting sexual things, but the participants, after finally "joining", would only hear a very boring group discussion about animal sex.

This hypothesis was derived from Festinger's theory of cognitive dissonance. 3 conditions were employed: reading of "embarrassing material" before a group, mildly embarrassing material to be read, no reading. The results clearly verified the hypothesis.

The "embarrassing material" was a list of obscene words. The "mildly embarrassing material" was a list of mildly sexual words.

Result: the very embarrassing ritual increased liking for the group.

Explanation was by the theory of cognitive dissonance: "I have already invested so much to join the group. I must be a fool if the group turned out to be bad! And I'm not a fool."

Cognitive dissonance has been used for brainwashing, persuasion, education, and many other kinds of things.

One of the authors learned from an investigative journalist about how a dodgy car company... had customers unnecessarily wait for hours while their finance deal was supposedly being negotiated upstairs.

Commitment and community: Communes and utopias in sociological perspective (Kanter, 1972) noted that

19th-century utopian cults requiring their members to make significant sacrifices were more successful. For example, cults that had their members surrender all their personal belongings lasted much longer than those that did not.

Some bad investments are continued long after they have become clearly unprofitable. This is the sunk cost fallacy.

Chap 4. Just following orders

The banality of evil is the theory that everyday people can do great evils such as the Holocaust, by simply following orders.

Behavioral study of obedience (Milgram, 1963) reported the famous Milgram experiment. A video recreation is here.

This is a very famous experiment with many followups. There is sufficient material freely online, such as the Wikipedia page. So I won't recount it here.

I was most surprised to learn that personality had very little effect. That is, obedience exhibited by the participants in this experiment was mostly situational , instead of stemming from the personality of the participants.

Chap 5. Bystander apathy effect

The murder of Kitty Genovese stimulated research into the "bystander effect". On March 13, 1964 Genovese was murdered... 38 witnesses watched the stabbings but did not intervene or even call the police until after the attacker fled and Genovese had died...

Bystander intervention in emergencies: Diffusion of responsibility (Darley & Latané, 1968) attributed the lack of help by witnesses to diffusion of responsibility: because each saw others witnessing the same event, they assumed that the others would take responsibility.

This phenomenon has a big literature, and is very popularly known, possibly due to the dramatic stories.

Concerning the original experiment by Darley and Latané, I was again surprised that personality factors had little effect, except one: growing up in a big community is correlated with a lower probability of helping.

Chap 6. The effect of an audience

When people perform a task in the presence of others, they perform better if the task is easy, and worse if the task is hard. One theory is that the presence of others increases physiological arousal, which then enhances performance of simple tasks and impairs performance of hard tasks. There are other theories as well.

In Social enhancement and impairment of performance in the cockroach (Zajonc, 1969) , it is found that this is true even for cockroaches. In the experiment, Zajonc gave cockroaches two possible tasks: going through a straight maze, or a more complex maze. They either did the task alone, or while being watched by others outside (the maze was transparent).

While being watched, cockroaches ran the straight maze faster but the complex maze more slowly. This supports the physiological arousal theory for cockroaches: the effect of an audience can happen without any complex cognitive ability.

However, complex cognition sometimes does play a role in humans. As reported in Social facilitation of dominant responses by the presence of an audience and the mere presence of others (Cottrell et al, 1968), a blindfolded audience does not exert an effect on the performer.

Chap 7. Group conflicts from trivial groups

This chapter begins with the Robbers Cave experiment , which was a study that investigates the realistic conflict theory , which sounds very common-sense:

  • group conflicts and feelings of resentment for other groups arise from conflicting goals and competition over limited resources
  • length and severity of the conflict is based upon the perceived value and shortage of the given resource
  • positive relations can only be restored with goals that require cooperation between groups

Then it recounts the blue eyes-brown eyes experiment. The question, then, is: what is the smallest group difference that still makes a difference? Enter the minimal group paradigm of Experiments in intergroup discrimination (Tajfel, 1970). Participants first took a test on estimating dot numbers, and were then divided into "overestimators" and "underestimators", although in truth the assignment was random. Then, they were given points (convertible to cash) to divide among the groups. Participants favored their own groups significantly more.

In fact, the most favored strategy was to maximize (own group) - (other group), even though it did not maximize (own group). Thus, even the most minimal social groups induced ingroup-outgroup conflict.
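The difference between the two strategies can be made concrete with a small sketch. The allocation options below are hypothetical numbers chosen for illustration, not the matrices used by Tajfel (1970), and the function names are my own.

```python
# Each option is (points to own group, points to other group).
# These values are hypothetical, for illustration only.
OPTIONS = [(19, 25), (17, 21), (15, 17), (13, 13), (11, 9), (9, 5), (7, 1)]


def max_ingroup_profit(options):
    """Pick the option that gives the own group the most points."""
    return max(options, key=lambda o: o[0])


def max_difference(options):
    """Pick the option that maximizes (own group) - (other group)."""
    return max(options, key=lambda o: o[0] - o[1])


if __name__ == "__main__":
    print("Maximize own group:", max_ingroup_profit(OPTIONS))
    print("Maximize difference:", max_difference(OPTIONS))
```

With these made-up numbers, maximizing the difference means accepting 7 points for one's own group instead of 19, just so that the other group gets less.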

The minimal group paradigm has been studied in many ways. It was also found that the outgroup homogeneity effect, that is, "they are all the same; we are diverse", could also arise from minimal groups.

One theoretical explanation is Tajfel and Turner's social identity theory (Tajfel & Turner, 1979), which states that:

  • A person's self-esteem depends on having a good identity.
  • A person's identity has two parts: personal and social.
  • Personal identity is about one's own traits and outcomes.
  • Social identity is derived from social groups and comparison between groups.
  • A person is motivated to improve self-esteem, and thus social identity.
  • Thus, one is motivated to improve the standings of one's ingroups and decrease the standings of one's outgroups.

One piece of supporting evidence is that when a person has more self-esteem, they discriminate less against outgroups (Crocker et al, 1987).

Chap 8. The Good Samaritan Experiment

In the parable of the Good Samaritan ,

a traveller is stripped of clothing, beaten, and left half dead alongside the road. First a priest and then a Levite comes by, but both avoid the man. Finally, a Samaritan happens upon the traveller. Samaritans and Jews despised each other, but the Samaritan helps the injured man.

This inspired an experiment reported in "From Jerusalem to Jericho": A study of situational and dispositional variables in helping behavior (Darley & Batson, 1973). Participants were theology students asked to give a short talk in another building.

People going between two buildings encountered a shabbily dressed person slumped by the side of the road. Subjects in a hurry to reach their destination were more likely to pass by without stopping.

The experiment was 2 x 3: the participant was asked to either give a short talk on the parable of the Good Samaritan, or on an irrelevant topic. They were either very hurried, hurried, or not hurried by the experimenter.
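A 2 x 3 design simply crosses every level of one factor with every level of the other, giving six conditions. A minimal sketch that enumerates them (the labels follow the description above; the variable and function names are my own):

```python
from itertools import product

# Factor 1: topic of the talk (2 levels).
TOPICS = ["Good Samaritan parable", "irrelevant topic"]
# Factor 2: how much the experimenter hurried the participant (3 levels).
HURRY_LEVELS = ["very hurried", "hurried", "not hurried"]


def design_cells(topics, hurry_levels):
    """Return every (topic, hurry level) combination of the factorial design."""
    return list(product(topics, hurry_levels))


if __name__ == "__main__":
    for topic, hurry in design_cells(TOPICS, HURRY_LEVELS):
        print(f"talk on {topic}, {hurry}")
```

Each participant falls into exactly one of the six cells, which lets the effect of hurry be compared while the topic is held fixed, and the other way around.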

Hurrying made a significant difference in the likelihood of their giving the victim help. The topic of the talk had some influence, according to a reanalysis by (Greenwald, 1975), despite the original paper's claim of no influence. Self-reported personality and religiosity made no difference.

The lesson from this as well as many other social psychology experiments is that seemingly trivial situational variables have a greater impact than personality variables, even though people tend to explain behaviors using personality. See The Person and the Situation: Perspectives of Social Psychology (Lee Ross, Richard E. Nisbett, 2011)

Chap 9. External motivation harms internal motivation

Extrinsic motivations are motivations that "come from the outside", such as money, praise, food. Intrinsic motivations are from the inside, such as self-esteem, happiness. Both can motivate behaviors. However, it's interesting that sometimes extrinsic motivations can harm internal motivation.

In Undermining children's intrinsic interest with extrinsic reward: A test of the" overjustification" hypothesis (Lepper et al, 1973) , children are given markers to draw with. Some were told that they would be rewarded with a prize for playing, others got a prize unexpectedly, others were left alone as control group.

After some days, the amounts of time the children spent playing with the markers ranked as follows: got expected prize < control group < got unexpected prize

The book didn't talk much about why the unexpected prize created higher motivation, but I think it is similar to how gambling addiction comes from variable reward .

There are some explanations for why extrinsic reward lowered subsequent motivation. One is the overjustification effect, where external rewards "crowd out" internal rewards:

Once rewards are no longer offered, interest in the activity is lost; prior intrinsic motivation does not return, and extrinsic rewards must be continuously offered as motivation to sustain the activity.

Another explanation is that humans heuristically view means to an end as undesirable. In (Sagotsky et al, 1982) , children were given two activities, playing with crayons and markers. They were equally fun at the beginning, but one group was told that in order to play with crayons, they had to play with markers first. After a while, they became less interested in playing with markers. The other group, the reverse.

I think this is the psychological basis of some ethical intuitions in the style of Kant :

we should never act in such a way that we treat humanity as a means only but always as an end in itself.

A third explanation is that people consider extrinsic rewards a threat to their freedom and autonomy, and thus tend to rebel against them. I saw a news story today about Amazon's program to gamify work. Some complained that it was threatening the workers' autonomy, which is a strange complaint: if gamification actually increases intrinsic motivation for work, doesn't it increase autonomy? Autonomy is freedom to follow one's intrinsic motivation, and thus, if a worker acquires an intrinsic motivation to do a good job, they would have more autonomy.

I think this complaint can be explained as a different kind of autonomy: freedom from prediction. Humans evolved to want to be unpredictable (at least by others), because to be predictable is to be open to manipulation, which often decreases fitness.

Chap 10. Actor-observer asymmetry

Other people did what they did because of who they are. We did what we did because of outside events.

In 1975, parts of the Watergate scandal were recreated in a very dramatic psychology experiment, reported in Ubiquitous Watergate: An attributional analysis (West, 1975).

80 criminology students were asked to meet the experimenter privately for a mysterious reason. There, they were asked to join a burglary team for secret documents in an ad agency. There were four versions presented:

  • The burglary plan was sponsored by a government agency, for secret investigation purposes. Government would provide immunity if caught.
  • Same, but without immunity.
  • The plan was sponsored by a rival ad agency, with $2000 reward.
  • The student was asked to only join a test run of the plan, without stealing anything.

Afterwards, they were debriefed and asked to explain their decision to join/not join.

Separately, 238 psychology students were presented the above situation, and asked to guess what percentage would agree to the plan.

Then, half of the participants were asked, "Suppose John agreed to participate, explain why John agreed."

  • About 45% of participants agreed to join the burglary in the government-with-immunity situation. Otherwise, about 10%.
  • Most students in the second part thought they would not agree to the burglary plan.
  • Students in the first part who agreed to join the burglary explained their behavior as due to the circumstances.
  • Students in the second part explained the hypothetical John's behavior as due to John's personality.

The criminology students were "actors", and the psychology students were "observers". The asymmetry was that actors attributed their behavior to the situation, while observers attributed it to personality. This is the actor-observer asymmetry.

Complications in this asymmetry are noted in The actor-observer asymmetry in attribution: A (surprising) meta-analysis (Malle, 2006) . Malle found that there are two kinds of biases: when the behavior is negative, the actor blames the situation and the observer blames the person. When the behavior is positive, the reverse happens. As such, this can be explained as a self-serving bias.

The authors conclude with a funny note:

it's interesting how athletes often publicly thank the Lord for a personal victory, but do not publicly blame the Lord for a defeat!

Chap 11. We are number 1

They never shout, "They are number 1."

People like to think well of themselves. Even in collectivistic societies, people regard themselves as above average in collectivistic traits, according to Pancultural self-enhancement (Sedikides et al, 2003)

Americans... self-enhanced on individualistic attributes, whereas Japanese... self-enhanced on collectivistic attributes

An experiment is reported in Basking in reflected glory: Three (football) field studies (Cialdini et al, 1976), where students were asked to describe a recent victory or defeat of a university sports team. Before that, half received criticism that lowered their self-esteem, and the others received praise that raised their self-esteem.

The result was that those with higher self-esteem described the sports outcome using "we won" or "we lost" 1/4 of the time. Those with lower self-esteem used "we won" 40% of the time when the team won, but used "we lost" only 14% of the time when the team lost.

The explanation is that people in need of boosts to self-esteem try to BIRG (Basking in reflected glory) and CORF (Cut off from reflected failures). Reflected glory also improves their social standing.

Methods of increasing one's social standing are called impression management , and include:

  • BIRG and CORF, as noted above;
  • ingratiation: we praise and agree with others, so as to be liked;
  • self-handicapping: a student gets drunk before a big test, so that if they fail, they can blame the drunkenness instead of their study ability;
  • exemplification: behave virtuously and make sure others see it.

A lot of these techniques are listed in (Jones and Pittman, 1982) .

Chap 12. Deindividuation

Effects of deindividuation variables on stealing among Halloween trick-or-treaters (Diener et al, 1976) reported an experiment in real life .

The experiment was run on Halloween. An experimenter placed a bowl of candy in her living room for trick-or-treaters. A hidden observer recorded what happened. In one condition, the woman asked the children identification questions such as their names. In the other condition, children were completely anonymous. Some children came individually, others in a group.

In each condition, the woman invited the children in, claimed she had something in the kitchen she had to tend to, and told each child to take only one candy.

Result: being in a group and being anonymous both increased frequency of transgression (taking more than one candy). If the first child to take candies in a group transgressed, other children were also more likely to transgress.

The authors then defined deindividuation as when private self-awareness is reduced.

The truly deindividuated person is alleged to pay scant attention to personal values and moral codes... to be inordinately sensitive to cues in the immediate environment.

One study, The baiting crowd in episodes of threatened suicide (Mann, 1981) , examined 21 cases from newspapers, in which crowds were present when a person threatened to jump off a high place.

Baiting or jeering occurred in 10 of the cases. Analysis of newspaper accounts of the episodes suggested several deindividuation factors that might contribute to the baiting phenomenon: membership in a large crowd, the cover of nighttime, and physical distance between crowd and victim (all factors associated with anonymity).

Two theories of why deindividuation occurs were given. One is that anonymity makes people feel safe to transgress. Another is that (Reicher & Postmes, 1995) people in a crowd would categorize themselves mainly by their social identity, and their behaviors would reflect the group norm rather than their personal norms.

I was disappointed that the authors did not give evolutionary psychological explanations for deindividuation. Humans are among the few animals that wage wars. A deindividuation effect could be an evolutionary adaptation that prepares humans to fight more effectively in a crowd.

Chap 13. Mere exposure effect

People prefer familiar things. Really, that's quite a banal observation. What's delightful about this chapter is the ingenuity of the experiment design.

Think about your own face. You see it as a mirror image (unless you take a selfie), but others see it directly. This means that you are familiar with the mirror image of your face, while others are familiar with the direct image.

This is exploited in Reversed facial images and the mere-exposure hypothesis (Mita et al, 1977) . Couples were separately shown photos of the female one's face, some mirrored, others not. They were asked to pick the one they prefer. The female one preferred the mirrored photo, and the male one preferred the direct photo.

Mere exposure effect is robust in real life and across species. (Grush et al, 1978) found that

previous or media exposure alone successfully predicted 83% of the [US congress election] primary winners

And (Cross et al, 1967) found rats who heard Mozart music in infancy preferred Mozart over Schoenberg as adults, and vice versa.

One possible evolutionary psychological explanation was given: preferring the familiar is safer, and thus more adaptive. The authors warned, however, that it's not so simple, as people also have a preference for mild novelty.

Chap 14. Shortcomings of introspection

This chapter reviews a study that shows a particular instance of introspection failure:

people's ideas about how their minds work stem not from private insights but from public knowledge. Unfortunately, however, this public knowledge is often not accurate. It is based on intuitive theories, widely shared throughout society, that are often mistaken.

The book referenced Verbal reports about causal influences on social judgments: Private access versus public theories (Nisbett & Bellows, 1977), although I find Telling More Than We Can Know: Verbal Reports on Mental Processes (Nisbett & Wilson, 1977) to be better.

Subjects are sometimes (a) unaware of the existence of a stimulus that importantly influenced a response, (b) unaware of the existence of the response, and (c) unaware that the stimulus has affected the response. It is proposed that when people attempt to report on their cognitive processes... they do not do so on the basis of any true introspection. Instead, their reports are based on a priori , implicit causal theories, or judgments about the extent to which a particular stimulus is a plausible cause of a given response.

In the experiment, each subject is given a fictitious application from Jill for a staff job at a crisis center. The applications are identical except for a few attributes of the applicant: attractiveness, intelligence, etc. Then, the subject is asked how much each attribute influenced their decision to accept.

The situation is then described to some observers (who didn't do the job application review), who are asked how much each attribute is correlated with the decision of the subject to accept.

Subjects who read that Jill had once been involved in a serious car accident claimed that the event had made them view her as a more sympathetic person. However, according to the ratings they later gave, this event had exerted no impact... the only exception pertained to ratings of Jill's intelligence. Here, an almost perfect correlation emerged between how subjects' judgments had actually shifted and how much they believed they had shifted. Why so? The researchers argued that there are explicit rules, widely known throughout a culture, for ascribing intelligence to people. Because subjects could readily recognize whether a given factor was relevant to intelligence, they could reliably guess whether they would have taken it into consideration.
The determinations of subjects and observers coincided almost exactly.

There are other introspection failures demonstrated by social psychology. People are unaware of the halo effect at work in their own judgments of others (Nisbett & Wilson, 1977). People are unaware of the source of their own arousal. People are unaware of their own bias even if they know that such a bias exists (Pronin et al, 2002).

In a further twist, introspection can degrade judgment. In (Wilson & Kraft, 1993), participants reported how they felt about their romantic partners. Their expressed feelings correlated well with the duration of the relationship. However, if they introspected on the reasons for their feelings before reporting them, the correlation disappeared.

The authors conclude by suggesting that traveling, by putting oneself into novel situations, would be particularly helpful for one to know oneself.

Chap 15. Self-fulfilling prophecies

Again, a very well-known subject with a lot already written. This chapter reviews Social perception and interpersonal behavior: On the self-fulfilling nature of social stereotypes (Snyder et al, 1977)

Male "perceivers" interacted with female "targets" whom they believed to be physically attractive/unattractive. Tape recordings of each participant's conversational behavior were analyzed by naive observer judges for evidence of behavioral confirmation... targets who were perceived to be physically attractive came to behave in a friendly, likeable, and sociable manner in comparison with targets whose perceivers regarded them as unattractive. It is suggested that theories in cognitive social psychology attend to the ways in which perceivers create the information that they process in addition to the ways that they process that information.

Philosophically, a self-fulfilling prophecy is a prediction about the future that comes true if and only if the prediction is made. Usually, predictions are supposed to be independent of the future they describe. Of course, all useful predictions must affect the future, since the predictor would try to profit from the prediction. However, such effects act on the predictor, not on the predicted.

Social psychologists have found that human behaviors are more influenced by the situation than the personality (as noted in The Person and the Situation book). Snyder et al suggested that, in fact, personality traits are one of those self-fulfilling prophecies.

our believing that others possess certain traits may cause us to behave in certain consistent ways toward them. This may cause them to behave in consistent ways in our presence.

In other words, a lot of the persistence of personality could arise from the fundamental attribution error.

Chap 16. How to live like a predeterminist

So then, God has mercy on whom he chooses to have mercy, and he hardens whom he chooses to harden. -- Romans 9:18, which Calvinists quote a lot.

Suppose an urge to smoke and a propensity to lung cancer are both genetically determined, and smoking does not cause lung cancer, why not smoke? If you feel the urge to smoke, it's already too late.
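The puzzle turns on the difference between a diagnostic and a causal contingency, which a small worked calculation can make concrete. The sketch below is only an illustration: the gene frequency and the conditional probabilities are hypothetical numbers chosen for the example, not figures from any study. It compares the chance of cancer given that someone is observed to smoke with the chance of cancer when smoking is forced from outside.

```python
from fractions import Fraction as F

# Hypothetical numbers, chosen only to illustrate the puzzle.
P_GENE = F(3, 10)                              # P(gene present)
P_SMOKE = {True: F(8, 10), False: F(2, 10)}    # P(smokes | gene)
P_CANCER = {True: F(1, 2), False: F(1, 20)}    # P(cancer | gene); smoking plays no causal role


def p_gene(has_gene):
    """Prior probability of having (or lacking) the gene."""
    return P_GENE if has_gene else 1 - P_GENE


def p_cancer_given_smoking(smokes):
    """Diagnostic view: observing the behavior updates our belief about the gene."""
    def p_behavior(has_gene):
        return P_SMOKE[has_gene] if smokes else 1 - P_SMOKE[has_gene]

    joint = sum(p_gene(g) * p_behavior(g) * P_CANCER[g] for g in (True, False))
    evidence = sum(p_gene(g) * p_behavior(g) for g in (True, False))
    return joint / evidence


def p_cancer_if_forced(smokes):
    """Causal view: forcing the behavior leaves the gene, and so the risk, unchanged."""
    return sum(p_gene(g) * P_CANCER[g] for g in (True, False))


print("P(cancer | observed smoking)     =", p_cancer_given_smoking(True))
print("P(cancer | observed not smoking) =", p_cancer_given_smoking(False))
print("P(cancer | forced to smoke)      =", p_cancer_if_forced(True))
print("P(cancer | forced to abstain)    =", p_cancer_if_forced(False))
```

Under these assumed numbers, smokers are much more likely to have cancer than non-smokers, yet choosing to smoke or abstain changes nothing. Abstaining is good news about one's genes without being a cause of good health.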

Believers of Calvinism think that God has chosen some people to be saved, and others are damned. Those who are favored by God would both be naturally free from the urge to sin in this world, and enjoy paradise after death. Those who are not, would feel the urge to sin in this world, and go to hell after death.

So if a Calvinist feels an urge to sin, it's already too late. Why not sin? Instead, Calvinists keep resisting the urge to sin. Moreover, they deny that they are resisting such urges and insist that they are effortlessly virtuous, which they take as evidence of God's favor.

In Causal versus diagnostic contingencies: On self-deception and on the voter's illusion (Quattrone & Tversky, 1984) two experiments are reported.

In the first one, participants exercised, then were asked to keep their hands in ice water until the pain made them withdraw. Then they were told a version of the lung cancer puzzle: There are two kinds of hearts, type 1 and type 2, caused by unchangeable genetics. A type 1 heart is associated both with health and with a higher tolerance to ice water after exercise. A type 2 heart is associated with early death and a lower tolerance. They then did the ice water test again and tolerated the ice water for longer, even though many of them denied that they were trying to do so.

In the second experiment, subjects encountered one of two theories about the sort of voters who determine the margin of victory in an election. Only one of the theories would enable voting subjects to imagine that they could "induce" other like-minded persons to vote. As predicted, more subjects indicated that they would vote given that theory than given a theory in which the subject's vote would not be diagnostic of the electoral outcome, although the causal impact of the subject's vote is the same under both theories

One explanation is that the unconscious mind deceives the conscious mind, but the authors find this unreasonable, for it still does not explain what motivates the unconscious to deceive. They instead favor Greenwald's theory that people avoid analyzing threatening information in detail, just as we throw away junk mail without reading it closely.

In conclusion, self-deception is not the result of one center of intelligence hoodwinking the other. Rather, it is the result of a low-level screening process that banishes suspicious cognitions before they have the opportunity to be fully entertained by the conscious mind.

Similarity to superrationality and acausal trade.

The behavior of Calvinists is similar to superrationality and acausal trade, in which agents behave in a way that is diagnostic of good outcomes, even if it does not cause good outcomes.

Assuming the superrational player has access to their opponents' source codes/simulations, the superrationality strategy can be justified, but then it would just be usual rationality.

I think normative decision theories are incompatible with sufficiently good prediction. Normative decisions are only defined for agents with apparent free will. An agent apparently has free will only to someone who cannot predict the agent's behavior well. Superrationality and acausal trade both attempt to make a decision theory for agents that are aware that they are too predictable (to themselves or to someone they play with). This is similar to the situation where someone sees the future and then "decides" to rebel against the future. Either they saw the true future and did not rebel, or they did not see the true future at all. It's illogical to say they both saw the future and rebelled against it.

Similar problems happen with Scott Aaronson's solution to Newcomb's paradox (I'm a "Wittgenstein"). A determinist who is self-aware of their determinism would, instead of offering a decision theory ("I should take one box because..."), offer a prediction theory ("I probably would take one box because...").

Chap 17. Partisan perceptions of media bias

People often complain of media biases. People report differently about the same event. Why?

In The hostile media phenomenon: biased perception and perceptions of media bias in coverage of the Beirut massacre (Vallone et al, 1985), the researchers studied how people perceived news about the Beirut massacre:

killing of civilians, mostly Palestinians and Lebanese Shiites... carried out by the militia under the eyes of their Israeli allies.

The researchers took some neutral reports on the event and, as expected, pro-Israel people thought the reports were biased against Israel, while anti-Israel people thought they were biased in favor of Israel.

In a study on biases (Lord et al, 1984), participants avoided bias when given this instruction:

"Ask yourself at each step whether you would have made the same evaluations had exactly the same study produced results on the other side of the issue."

Chap 18. Empathy-altruism hypothesis

This chapter studies several theorized psychological mechanisms of human altruistic action. More evidence that empathy is a source of altruistic motivation (Batson, 1982) reported an experiment on whether people would help a person in need.

It was found that: If (empathy OR guilt), then (helping). That is, people can be motivated to act altruistically by empathy without expectation of gain, or to gain relief from guilt. This argues against the theory of psychological hedonism.

Other potential sources of altruism are collectivism (act for the benefit of a group) and principlism (uphold a principle for its own sake). Effective altruism is one example of principlism based on utilitarianism.

Chap 19. Expanding the self to include the other

A psychological phenomenon of love (close personal relationships, such as lover, best friend) is to include that person in one's self. This involves perceiving, and allocating resources to, that person, in a similar way as to one's self.

Three experiments are described, from Close relationships as including other in the self (Aron, 1991).

When allocating money, participants allocated about the same amount to themselves as to their friend.

Participants were asked to imagine nouns paired with themselves, their mothers, or strangers. They recalled fewer nouns imagined with self or mother than nouns imagined with a stranger, suggesting that the mother was processed more like the self than like a stranger. The authors explained the lower recall by noting that we usually look at strangers directly but see ourselves only upon reflection (literal or not), so it is harder to imagine ourselves than strangers.

When asked to sort a list of adjectives into 4 piles ("true/false about me" and "true/false about my spouse"), participants reacted more slowly to adjectives that were true about one but false about the other. The explanation offered is that differences between one's own and a close other's properties cause dissonance in the same way that holding opposite attitudes within oneself can cause dissonance.

Chap 20. Believing precedes disbelieving

Descartes divided the mind into intellect and will. The intellect writes up potential beliefs about the world; the will then chooses which to endorse. Spinoza said that we believe everything that we happen to understand, and then disbelieve only if we find it necessary. You Can't Not Believe Everything You Read (Gilbert, 1993) presented three experiments that support Spinoza's theory, and discussed its sociological effects.

... we asked subjects in Experiment 1 to play the role of a trial judge and to make sentencing decisions about an ostensibly real criminal defendant. Subjects were given some information about the defendant that was known to be false and were occasionally interrupted [by a distraction task]... We predicted that interruption would cause subjects to continue to believe the false information they accepted on comprehension and that these beliefs would exert a profound influence on their sentencing of the defendant...
Experiments 1 and 2 provide support for the Spinozan hypothesis: When people are prevented from unbelieving the assertions they comprehend... they did not merely recall that such assertions were said to be true, but they actually behaved as though they believed the assertions.

If you want to read more, I have written in detail about this.

Chap 21. Inferred memories

When we recall a memory, that memory is an inference about the past based on a number of clues that we have in the present. It is not necessarily accurate.

An experiment from Women's theories of menstruation and biases in recall of menstrual symptoms (McFarland, 1989) found that when women report their unpleasant emotions day by day, there is no difference between premenstrual, menstrual, and intermenstrual days (they feel equally unpleasant). But when asked to recall how unpleasant those days were, they recall premenstrual and menstrual days as significantly more unpleasant, and intermenstrual days as less unpleasant.

The explanation is that, when recalling, they used intuitive theories about PMS to infer "how it must have felt" instead of "how it actually felt". As a side effect, this also casts doubt on whether PMS actually exists.

Memories can be completely made up, as in repressed memory therapies.

The fact that those inferences about the past are felt as genuine recalls, shows how little conscious introspection can give true knowledge about the self.

Chap 22. Ironic process theory

Try not to think of a polar bear!

The theory of ironic process is that there is a cognitive process, called the intender, which looks for contents that match some desired mental state. There is also a monitor, which notifies consciousness about errant thoughts.

The intender is a costly process, and the monitor is a cheap process, so when one is under cognitive load, the intender doesn't work well, but the monitor still works well, and ironically, trying to not think of something results in thinking of it.

Ironic Processes of Mental Control (Wegner, 1994) reported an experiment. Participants were asked to consciously improve or worsen their moods with happy or sad thoughts. Half were also asked to do a memory task as cognitive load.

Those not under cognitive load were successful in their mood control, while those under cognitive load achieved the opposite.

This suggests that if you are under some cognitive load (such as busy studying), and you want to improve your mood, you should try consciously to feel worse. Also, if you are in a noisy and distracting environment, and want to sleep, you should try to stay awake.

Another experiment showed that people who try to avoid sexist language ironically become more prone to it when under cognitive load. This is true whether or not they are sexist.

Chap 23. Implicit Association Test

In Single-target implicit association tests (ST-IAT) predict voting behavior of decided and undecided voters in Swiss referendums (Raccuia, 2016), implicit association was found to be a weaker, but somewhat independent, predictor of voting behavior than self-reported political orientation.

Other similar methods for probing the unconscious are also studied; the results are new and mixed.

Chap 24. Prospect theory

People don't behave as expectation-maximizers. Instead they are better modelled by prospect theory:

  • Gains and losses are measured compared to a changeable default, instead of an absolute zero.
  • Losses are weighted more than gains, and both have decreasing marginal utilities.
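The two properties above can be sketched as a value function. This is a minimal illustration, not the model as stated in the book: the power-function form and the default parameters (an exponent of 0.88 and a loss-aversion coefficient of 2.25) are assumptions taken from the commonly cited Tversky and Kahneman (1992) estimates, and the function name and arguments are made up for this example.

```python
def prospect_value(outcome, reference=0.0, alpha=0.88, loss_aversion=2.25):
    """Subjective value of an outcome relative to a reference point.

    alpha < 1 gives diminishing sensitivity for both gains and losses.
    loss_aversion > 1 makes a loss hurt more than an equal gain pleases.
    Both defaults are assumed illustrative values, not taken from the text above.
    """
    change = outcome - reference          # gains and losses are relative, not absolute
    if change >= 0:
        return change ** alpha
    return -loss_aversion * (-change) ** alpha


# A gain and a loss of the same size do not cancel out.
print(prospect_value(100), prospect_value(-100))

# The same outcome feels like a loss once the reference point moves above it.
print(prospect_value(100, reference=150))

# Doubling a gain less than doubles its subjective value.
print(prospect_value(200), 2 * prospect_value(100))
```

Shifting `reference` is how the "changeable default" enters: the same final amount is scored as a gain or a loss depending on what the person has come to expect.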

An experiment is reported in The systematic influence of gain- and loss-framed messages on interest in and use of different types of health behavior (Rothman et al, 1999). It found that people used more bacteria-killing mouthwash if they received positive advertising (about maintaining good health), and more disclosing mouthwash (which merely detects dental disease) if they received negative advertising (about the potential disease).

This theory, along with some others, is explained in great detail in Thinking, Fast and Slow (Kahneman, 2011), which I recommend.

Other mental heuristics include mental accounting (Thaler, 1980), with its own set of irrational effects.

Chap 25. Social isolation increases aggression

If you can't join them, beat them: Effects of social exclusion on aggressive behavior (Twenge, 2001)

Social exclusion was manipulated by telling people that they would end up alone later in life or that other participants had rejected them. These manipulations caused participants to behave more aggressively. Excluded people issued a more negative job evaluation against someone who insulted them, blasted a target with higher levels of aversive noise both when the target had insulted them and when no interaction had occurred. However, excluded people were not more aggressive toward someone who issued praise.

In particular,

These responses were specific to social exclusion and were not mediated by emotion.

This was shown by two experimental facts:

Participants who were told they would end up alone later in life or that other participants had rejected them, did not feel worse than average.

Participants who were told they would end up unlucky later in life, did not act more aggressively than average.

Some psychological theories are given. One is self-determination theory, from Self-determination theory and the facilitation of intrinsic motivation, social development, and well-being (Deci and Ryan, 2000), which says that people have three needs:

  • relatedness (to some other people)
  • efficacy (can do important things)
  • autonomy (can control their own future)

Other relevant factors are self-esteem, and stability over time. Stability and level of self-esteem as predictors of anger arousal and hostility (Kernis et al, 1989) found that in feelings of anger and hostility,

unstable high self-esteem > low self-esteem > stable high self-esteem

No evolutionary explanation is given, though. Social exclusion leads to fewer offspring, and aggression only worsens it. An evolutionary psychological explanation would be welcome: either the response has an evolutionary benefit, or it is a side effect of something else.

Chap 26. Social effects of gossiping

Gossip is found to have a prosocial function. The virtues of gossip: Reputational information sharing as prosocial behavior (Feinberg, 2012)

... prosocial gossip, the sharing of negative evaluative information about a target in a way that protects others from antisocial or exploitative behavior.

In the study, they found experimental support for four hypotheses about the function of gossip:

  • prosocial: gossip is motivated by a desire to protect vulnerable people, without promise of material reward.
  • frustration: seeing antisocial behavior makes people feel bad. Prosocial people are more prone to this frustration.
  • relief: gossiping reduces the frustration.
  • deterrence: the threat of gossip makes antisocial people behave more prosocially.

Chap 27. Fear of death

Good news: we will be worm food one day!

Good news for worms, I meant.

Terror management theory argues that the terror of death creates such a profound, subconscious, anxiety, that humans spend their lives denying it in various ways, creating culture, religion, and many other social phenomena in the process.

This chapter reviews the first 4 of the 7 experiments from How sweet it is to be loved by you: the role of perceived regard in the terror management of close relationships (Cox & Arndt, 2012). This paper studies

... whether people turn to close relationships to manage the awareness of mortality because they serve as a source of perceived regard.

Perceived regard means "am I a good person as viewed by someone else?" The paper in particular showed that people who have death on their mind exaggerate how much they think they are loved by a partner. Perceived regard from their own selves, and from average strangers, did not change. Having intense physical pain on the mind also did nothing.

They also found that having death on the mind makes people claim to love their partners more. They theorized that this is mediated by increased perceived regard:

death on the mind -> more perceived regard from their partner -> more love for their partner

Study 4 revealed that activating thoughts of perceived regard from a partner in response to MS [mortality salience] reduced death-thought accessibility. Studies 5 and 6 demonstrated that MS led high relationship contingent self-esteem individuals to exaggerate perceived regard from a partner, and this heightened regard led to greater commitment to one's partner. Study 7 examined attachment style differences and found that after MS, anxious individuals exaggerated how positively their parents see them, whereas secure individuals exaggerated how positively their romantic partners see them. Together, the present results suggest that perceptions of regard play an important role in why people pursue close relationships in the face of existential concerns.

Personal comment: It has been suggested that transhumanism can be analyzed as a religion. Is there value in analyzing transhumanism through terror management theory? At least one paper, Software immortals: Science or faith? (Proudfoot, 2012), has done so. This is important, because if transhumanism is indeed a religion, then the chance is high that it is deluded/unfalsifiable, like most religions have been shown to be.

Also, this would explain why moral nihilism is usually suffered as a mental disease rather than accepted as a working hypothesis. Despite its theoretical simplicity and moderate empirical support, it just doesn't offer any protection against the terror of death.

Chap 28. Motivated belief in free will

Free to punish: A motivated account of free will belief (Clark, 2014)

a key factor promoting belief in free will is a fundamental desire to hold others morally responsible for their wrongful behaviors

Five experiments from the paper are recounted in detail. The authors praised the paper highly for its comprehensiveness.

participants reported greater belief in free will after considering an immoral action than a morally neutral one... due to heightened punitive motivations... reading about others’ immoral behaviors reduced the perceived merit of anti-free-will research... the real-world prevalence of immoral behavior (as measured by crime and homicide rates) predicted free will belief on a country level.
Taken together, these results provide a potential explanation for the strength and prevalence of belief in free will: It is functional for holding others morally responsible and facilitates justifiably punishing harmful members of society.

Personal comment: Instead of philosophically studying whether free will exists, it's more productive to assume it doesn't exist and see what behaviors can be explained. If everything can be explained without free will, then the problem of free will dissolves. Otherwise, we will have pinned down what free will is needed for, and made subsequent studies more focused.

It is also useful to study the human intuitive belief in free will as an important phenomenon about humans, independent of whether the belief is right or wrong. This is analogous to the study of folk psychology and naive physics. See From Uncaused Will to Conscious Choice: The Need to Study, Not Speculate About People’s Folk Concept of Free Will (Monroe, 2009).

the core of people’s concept of free will is a choice that fulfills one’s desires and is free from internal or external constraints. No evidence was found for metaphysical assumptions about dualism or indeterminism.

In the "Afterthoughts", the authors considered what a post-free-will society could be like. I think that such a society's theory of crime and punishment would be more like "because this follows the natural order of things", than "because criminals are morally bad".

Think of the joke about "my brain made me commit the crime":

The criminal: "My brain made me commit the crime." The judge: "My brain made me sentence you."

And now, instead of taking it as a joke, imagine both of them saying it very seriously. That's what I think could be true in the future.

The first edition of this book was published in 2003. In 2005, Ioannidis' paper "Why Most Published Research Findings Are False" started the reproducibility avalanche. How well have these experiments replicated? My university library only has the first edition. I can see from the Amazon preview of the second edition (2017) that the authors address this, but I can't see enough pages to see what their response is. I understand from other sources that priming and ego-depletion have not stood up well.

# 5 Famous Social Psychology Experiments

There are countless social psychology experiments that have been influential. Here, we highlight five powerful experiments in social psychology that have shaped the development of the field.

# 1. Solomon Asch’s Experiments on Conformity

Illustration of 4 participants with three confederates, a representation of how the Asch experiment is based on

Solomon Asch carried out a series of psychological tests known as the Asch Conformity Experiments in the 1950s to find out how much social pressure from the majority group could persuade a person to conform. Asch’s experimental hypothesis centered on how people give in to peer pressure and whether they would disregard their own opinions in order to fit in with the group. In brief, participants are shown several lines of different lengths, and the confederates' answers challenge the participant to either agree or disagree.

The Asch experiment's basic design comprised a subject and a cohort of accomplices. The participants were informed that they would be performing a visual perception task in which they would need to match a given line's length to one of three comparison lines.

Example of how trial stimuli in the Asch Experiment look like where a target line is shown with three choice options

Out of all the participants in each group, only one was truly ‘naive’; the others were ‘confederates’ who were told to provide false answers on purpose for specific trials. Thus, the ‘naive’ participant would be challenged by the ‘confederates’ who provided wrong answers, which placed the ‘naive’ participant in a difficult position.

An example of the experimental procedure from Solomon Asch’s experiment on conformity in 1955. There are 6 confederates pictured and 1 real participant, sitting in the second-to-last seat, all looking at the trial stimuli at the front of the room. Image copyright: Cara Flanagan.

On the critical trials, the confederates would intentionally select the incorrect response. The crucial question was whether the ‘naive’ participant would follow their own accurate assessment or adhere to the false majority opinion. The results demonstrated that even in cases where the right response was evident, a sizable portion of the ‘naive’ participants would agree with the confederate group's inaccurate responses.

The degree of conformity was influenced by several factors:

  • Group Size: Up to a certain point, conformity grew in proportion to the size of the majority. The rate of conformity did not significantly increase after a certain number of confederates.
  • Unanimity: A participant was far less likely to comply if even one other person in the group provided the right response. The pressure to fit in was significantly lessened when there was a dissident voice.
  • Task Difficulty: Participants found it harder to trust their own judgment when the task was more ambiguous or difficult, i.e., when the comparison lines were more similar in length, leading to an increase in conformity.
  • Response Type - Public vs. Private: When participants were required to provide their answers in public, they were more likely to comply, as opposed to when providing answers privately. Thus, one factor that clearly affected conformity was the fear of social rejection.

In summary, the Asch Conformity Experiments, which emphasize the strong influence of social pressure on individual behavior and the propensity to conform even in the face of clear evidence to the contrary, have become classic studies in social psychology.

A preview of the data and Asch Conformity Experiment results recorded in Labvanced can be seen in the image below, such as the values for the presented line heights, choices, and reaction times:

View of the data collected from an online version of the Asch Conformity Experiment conducted with Labvanced.

Set up your psychology experiment today and try out our multi-user features in Labvanced.

# 2. Bobo Doll Experiment by Albert Bandura: Social Learning Theory

Frames from a video and images shown to the children who participated in the Bobo doll experiment. Copyright owner: Albert Bandura.

Social psychologist Albert Bandura carried out a groundbreaking study in 1961 called the Bobo Doll experiment, which made a substantial contribution to our understanding of children's social learning and aggression. Bandura was curious about how children learn to pick up new behaviors by imitation and observation.

In this experiment, children interacted with a life-sized inflatable doll called Bobo while being exposed to adult models who were aggressive and non-aggressive. The conditions of the study were as follows:

  • Aggressive Model Condition: Children witnessed a role model act violently against the Bobo doll. Along with hitting and kicking, the aggressive actions included verbal abuse.
  • Non-Aggressive Model Condition: Children witnessed a role model who did not act aggressively toward the Bobo doll.
  • Control Group: No adult role model was seen interacting with the Bobo doll.

After observing the model in their assigned condition, children were placed in a room with the Bobo doll and other toys. The purpose of the study was to determine whether the children would imitate the violent acts they witnessed.

The Bobo Doll study produced some fascinating results. Compared to the control group and the non-aggressive model group, children who watched the aggressive model were more likely to act aggressively toward the Bobo doll. This finding aligned with Albert Bandura's social learning theory, which postulates that people learn new abilities by observing and imitating the behaviors of others. The girls in the aggressive model condition also responded with more physical aggression when the model was male, but with more verbal aggression when the model was female. The exception to this general pattern was how frequently they punched Bobo, where the effect of the model's gender was reversed. It was also found that boys were more likely than girls to imitate same-sex models.

Our knowledge of the roles that imitation and observational learning play in children's development of aggressive behaviors has greatly increased as a result of Bandura’s Bobo doll study.

# 3. Stanford Prison Experiment by Philip Zimbardo

Experiment participants who had the role of a ‘guard’, pictured walking in the prison yard.

Social psychologist Philip Zimbardo carried out a study at Stanford University in 1971 that is known as the Stanford Prison Experiment. The experiment's goal was to find out how people would act in a prison simulation if they were in positions of power or powerlessness.

Out of 75 volunteers, Zimbardo and his colleagues chose 24 male college students to take part in the study. The participants were divided at random into two groups, guards and inmates, and placed in a mock prison located in the basement of the Stanford psychology building.

The participants were completely absorbed in their parts; guards were deindividuated by being outfitted in sunglasses and uniforms, and inmates were given numbers rather than names. The guards started acting in an abusive and authoritarian manner toward the inmates as a result of the authority that had been bestowed upon them. In response, the inmates displayed symptoms of severe stress and emotional collapse.

The experiment was supposed to last two weeks, but because of the participants' severe psychological distress, it was called off after just six days! The experiment's inherent ethical issues surfaced as a result of the situation getting worse. The study has sparked ethical questions due to issues like incomplete debriefing, intense simulation, and incompletely informed consent. Because the participants' psychological well-being was compromised, Philip Zimbardo’s Stanford Prison Experiment has come under fire on a number of occasions.

In summary, the results of Philip Zimbardo’s Stanford experiment shed light on how even ordinary people can quickly adopt harmful and dangerous behaviors just because of their environment or roles. The Stanford Prison Experiment is frequently brought up in conversations concerning how circumstances can affect behavior and how people can misuse their power when they are in positions of authority.

# 4. Obedience Experiment by Stanley Milgram

The study setup of the Obedience experiment where the experimenter and student are confederates and the teacher who is the participant is instructed to administer shocks.

In the early 1960s, social psychologist Stanley Milgram carried out a number of contentious studies on submission to authority figures. The Milgram Experiment is the most well-known of these studies.

The basic setup of the Obedience Experiment involved three people: the learner (an associate of the experimenter), the teacher (a participant), and the experimenter (an authority figure). The ‘teacher’ participant was informed that the overall aim of the study was to examine the impact of punishment on learning and was directed to give the learner progressively stronger electric shocks each time the learner erred on a memory task. The teacher participants were led to believe that the shocks were real (even though they weren't). This setup was thus a mask for the real aim of the study: to assess to what extent an individual will be obedient to an authority figure, even when their obedience is causing severe harm to others.

As the experiment went on, the learner (the confederate) made deliberate mistakes, and the experimenter (i.e., the authority figure) instructed the participant to intensify the shocks. The switches on the shock generator were labeled with voltage levels ranging from 15 volts to 450 volts, with the upper levels marked as dangerous (danger – severe shock). Thus, the teacher could see how dangerous the high shock levels were and believed they were ‘inflicting’ pain (even though the shocks were not real).

In summary, the key discovery of Milgram's Obedience to Authority experiment was that a sizable fraction of participants kept shocking the confederate even after the confederate showed signs of distress, objected, and finally fell silent. The results showed that a significant number of participants used the shock generator to its maximum capacity, demonstrating a high degree of submission to authority.

Because Stanley Milgram's Obedience study caused participants psychological distress, criticism and questions were raised pointing to its ethical issues. However, the study still managed to shed light on how common people might act dubiously or immorally when directed by an authority figure, offering insightful information about the influence of authority and social conformity.

# 5. The Hawthorne Effect by Henry A. Landsberger

Factory image of the Hawthorne Effect.

The Hawthorne Effect is a phenomenon in which people adjust their behavior when they become aware that they are being watched or observed by others. A set of experiments carried out at the Western Electric Hawthorne Works in Chicago in the 1920s and 1930s led to the naming of this effect. The initial purpose of the studies was to look into how worker productivity and lighting conditions relate to one another. In this context, Elton Mayo also studied how changes in work structure (like rest periods) influenced worker outcomes at the factory.

The data from the Hawthorne studies were later reanalyzed and interpreted in the 1950s by social scientist Henry A. Landsberger. His work, especially the 1958 paper "Hawthorne Revisited," was instrumental in making the Hawthorne Effect concept widely known.

Landsberger came to the conclusion that it was the workers' awareness of being observed/studied that actually explained the observed changes in worker productivity, rather than the lighting conditions as first believed. The workers' motivation and performance improved as a result of the researchers' interest and attention.

Since then, the results of the Hawthorne studies have gained widespread acceptance in organizational behavior, psychology, and social science. The effect emphasizes how crucial social and psychological elements are in shaping behavior, especially in settings like research or the workplace where people may behave differently because they are aware that they are being watched or studied. The Hawthorne Effect is frequently brought up when talking about the difficulties of using human subjects in experiments and research, because it can be difficult to identify and comprehend the underlying causes of observed behavior when subjects are aware they are being observed.

# Social Psychology Experiments Today

While these classic experiments helped establish the field of social psychology by studying complex topics like obedience and conformity, today there are more ethical guidelines that researchers must follow.

Furthermore, with the digitization of the 21st century, online experiments are becoming increasingly popular, allowing participants to complete tasks together using their computers or smartphones.

# References

  • Asch, S. E. (1952). Group forces in the modification and distortion of judgments. In S. E. Asch, Social psychology (pp. 450–501). Prentice-Hall, Inc.
  • Asch, S. E. (1953). Effects of group pressure upon the modification and distortion of judgements. Group dynamics.
  • Asch, S. E. (1956). Studies of independence and conformity: I. A minority of one against a unanimous majority. Psychological monographs: General and applied, 70(9), 1.
  • Bandura, A. (1965). Influence of models' reinforcement contingencies on the acquisition of imitative responses. Journal of personality and social psychology, 1(6), 589.
  • Bandura, A., Ross, D., & Ross, S. A. (1961). Transmission of aggression through imitation of aggressive models. The Journal of Abnormal and Social Psychology, 63(3), 575.
  • Bandura, A., Ross, D., & Ross, S. A. (1963). Imitation of film-mediated aggressive models. The Journal of Abnormal and Social Psychology, 66(1), 3.
  • Bandura, A., & Walters, R. H. (1977). Social learning theory(Vol. 1). Prentice Hall: Englewood cliffs.
  • Landsberger, H. A. (1958). Hawthorne Revisited: Management and the Worker, Its Critics, and Developments in Human Relations in Industry.
  • Milgram, S. (1963). Behavioral study of obedience. The Journal of abnormal and social psychology, 67(4), 371.
  • Milgram, S. (1965). Some conditions of obedience and disobedience to authority. Human relations, 18(1), 57-76.
  • Zimbardo, P. G. (1973). On the ethics of intervention in human psychological research: With special reference to the Stanford prison experiment. Cognition, 2(2), 243–256.
  • Zimbardo, P. G. (1995). The psychology of evil: A situationist perspective on recruiting good people to engage in anti-social acts. Japanese Journal of Social Psychology, 11(2), 125-133.

How Social Psychologists Conduct Their Research

Surveys, observations, and case studies provide necessary data


Social psychology research methods give psychologists a window into the causes of human behavior. Researchers rely on a few well-established methods to study social psychology topics. These methods allow them to test hypotheses and theories as they look for relationships among different variables.

Why do people do the things they do? And why do they sometimes behave differently in groups? These questions are of interest not only to social psychologists, but to teachers, public policy-makers, healthcare administrators, or anyone who has ever watched a news story about a world event and wondered, “Why do people act that way?”

Which type of research is best? This depends largely on the subject the researcher is exploring, the resources available, and the theory or hypothesis being investigated.

Why study social behavior? Since so many "common sense" explanations exist for so many human actions, people sometimes fail to see the value in scientifically studying social behavior. However, it is important to remember that folk wisdom can often be surprisingly inaccurate and that the scientific explanations behind a behavior can be quite shocking.

Stanley Milgram's infamous obedience experiments are examples of how the results of an experiment can defy conventional wisdom.

If you asked most people if they would obey an authority figure even if it meant going against their moral code or harming another individual, they would probably emphatically deny that they would ever do such a thing. Yet Milgram's results revealed that all participants delivered what they believed were painful shocks to another person simply because they were told to do so by an authority figure, with 65% delivering the highest voltage possible.

The scientific method is essential in studying psychological phenomena in an objective, empirical, analytical way. By employing the scientific method, researchers can see cause-and-effect relationships, uncover associations among factors, and generalize the results of their experiments to larger populations.

While common sense might tell us that opposites attract, that birds of a feather flock together, or that absence makes the heart grow fonder, psychologists can put such ideas to the test using various research methods to determine if there is any real truth to such folk wisdom.

The goal of descriptive research is to portray what already exists in a group or population.

One example of this type of research would be an opinion poll to find which political candidate people plan to vote for in an upcoming election. Unlike causal and relational studies, descriptive studies cannot determine if there is a relationship between two variables. They can only describe what exists within a given population.

An example of descriptive research is a survey of people's attitudes toward a particular social issue such as divorce, capital punishment, or gambling laws.
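As a rough illustration of what "describing what exists" looks like in practice, the sketch below tallies answers from an opinion poll and reports each answer's share. The function name `summarize_responses` and the poll answers are invented for this example; they do not come from any real survey.

```python
from collections import Counter


def summarize_responses(responses):
    """Return the share of respondents who gave each answer."""
    counts = Counter(responses)
    total = sum(counts.values())
    # An empty poll yields an empty summary (no division takes place).
    return {answer: count / total for answer, count in counts.items()}


# Hypothetical poll answers, invented for illustration only.
poll_responses = [
    "Candidate A", "Candidate B", "Candidate A", "Undecided", "Candidate A",
    "Candidate B", "Candidate A", "Undecided", "Candidate B", "Candidate A",
]

shares = summarize_responses(poll_responses)
for answer, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{answer}: {share:.0%}")
```

Note that the summary only describes the sample. It says nothing about why people prefer one candidate or whether any two variables are related.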

Types of Descriptive Research

Some of the most commonly used forms of descriptive research utilized by social psychologists include the following.

Surveys are probably one of the most frequently used types of descriptive research. Surveys usually rely on self-report inventories in which people fill out questionnaires about their own behaviors or opinions.

The advantage of the survey method is that it allows social psychology researchers to gather a large amount of data relatively quickly, easily, and cheaply.

The Observational Method

The observational method involves watching people and describing their behavior. Sometimes referred to as field observation, this method can involve creating a scenario in a lab and then watching how people respond or performing naturalistic observation in the subject's own environment.

Each type of observation has its own strengths and weaknesses. Researchers might prefer using observational methods in a lab in order to gain greater control over possible extraneous variables, while others might prefer using naturalistic observation in order to obtain greater ecological validity. However, lab observations tend to be more costly and difficult to implement than naturalistic observations.

Case Studies

A case study involves the in-depth observation of a single individual or group. Case studies can allow researchers to gain insight into things that are very rare or even impossible to reproduce in experimental settings.

The case study of Genie, a young girl who was horrifically abused and deprived of learning language during a critical developmental period, is one example of how a case study can allow social scientists to study phenomena that they otherwise could not reproduce in a lab.

Social psychologists use correlational research to look for relationships between variables. For example, social psychologists might carry out a correlational study looking at the relationship between media violence and aggression. They might collect data on how many hours of aggressive or violent television programs children watch each week and then gather data on how aggressively the children act in lab situations or in naturalistic settings.

Conducting surveys, directly observing behaviors, or compiling research from earlier studies are some of the methods used to gather data for correlational research. While this type of study can help determine if two variables have a relationship, it does not allow researchers to determine if one variable causes changes in another variable.

While the researcher in the previous example on media aggression and violence can use the results of their study to determine if there might be a relationship between the two variables, they cannot say definitively that watching television violence causes aggressive behavior.
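To make the idea of a relationship between two variables concrete, here is a minimal sketch that computes a Pearson correlation coefficient, one common way to quantify such a relationship. The source does not name a specific statistic, so this choice is an assumption, and the viewing hours and aggression scores are invented purely for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical data for eight children, invented for illustration only.
weekly_violent_tv_hours = [0, 2, 3, 5, 6, 8, 10, 12]
aggression_scores = [1, 2, 2, 4, 3, 5, 6, 7]

r = pearson_r(weekly_violent_tv_hours, aggression_scores)
print(f"r = {r:.2f}")
```

A value of r near +1 or -1 indicates a strong linear relationship, and a value near 0 indicates little or none. Even a strong correlation in data like this would not show that television violence causes aggression, because a third factor could drive both.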

Experimental research is the key to uncovering causal relationships between variables. In experimental research, the experimenter randomly assigns participants to one of two groups:

  • The control group: The control group receives no treatment and serves as a baseline.
  • The experimental group: Researchers manipulate the levels of some independent variable in the experimental group and then measure the effects.

Because researchers are able to control the independent variables, experimental research can be used to find causal relationships between variables.
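Random assignment is simple to express in code. The sketch below shuffles a list of participants and splits it in half, so that chance alone decides who ends up in each group. The helper name `randomly_assign` and the participant IDs are hypothetical, and real studies may use more elaborate schemes (for example, balancing groups on age or gender).

```python
import random


def randomly_assign(participants, seed=None):
    """Shuffle participants and split them into control and experimental groups."""
    rng = random.Random(seed)      # a seed makes the assignment reproducible
    shuffled = list(participants)  # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    midpoint = len(shuffled) // 2
    return {
        "control": shuffled[:midpoint],
        "experimental": shuffled[midpoint:],
    }


# Hypothetical participant IDs, invented for illustration only.
participant_ids = [f"P{n:02d}" for n in range(1, 21)]
groups = randomly_assign(participant_ids, seed=42)

print("Control:     ", groups["control"])
print("Experimental:", groups["experimental"])
```

Because every participant has the same chance of landing in either group, pre-existing differences between people tend to even out across groups rather than line up with the treatment.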

So if psychologists wanted to establish a causal relationship between media violence and aggressive behavior, they would want to design an experiment to test this hypothesis. If the hypothesis was that playing violent video games causes players to respond more aggressively in social situations, they would want to randomly assign participants to two groups.

The control group would play a non-violent video game for a predetermined period of time while the experimental group would play a violent game for the same period of time.

Afterward, the participants would be placed in a situation where they would play a game against another opponent. In this game, they could either respond aggressively or non-aggressively. The researchers would then collect data on how often people utilized aggressive responses in this situation and then compare this information with whether these individuals were in the control or experimental group.
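One way the final comparison could be carried out is sketched below. It uses a permutation test on the difference in mean aggressive responses between the two groups. The source does not specify an analysis, so the choice of test is an assumption, and the response counts are invented for illustration.

```python
import random
from statistics import mean


def permutation_test(control, experimental, n_permutations=5000, seed=0):
    """Return the observed mean difference and a two-sided permutation p-value."""
    rng = random.Random(seed)
    observed = mean(experimental) - mean(control)
    pooled = list(control) + list(experimental)
    n_control = len(control)
    extreme = 0
    for _ in range(n_permutations):
        # Reshuffle group labels: if the game had no effect, any split is equally likely.
        rng.shuffle(pooled)
        diff = mean(pooled[n_control:]) - mean(pooled[:n_control])
        if abs(diff) >= abs(observed):
            extreme += 1
    return observed, extreme / n_permutations


# Hypothetical counts of aggressive responses per participant (illustration only).
control_scores = [2, 3, 1, 4, 2, 3, 2, 1, 3, 2]       # played the non-violent game
experimental_scores = [5, 4, 6, 3, 5, 7, 4, 6, 5, 4]  # played the violent game

observed_diff, p_value = permutation_test(control_scores, experimental_scores)
print(f"Mean difference: {observed_diff:.1f} aggressive responses")
print(f"Permutation p-value: {p_value:.4f}")
```

A small p-value means a difference this large would rarely arise from the random split alone. Combined with random assignment, that is what lets researchers attribute the difference to the game played rather than to who happened to be in each group.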

By using the scientific method, designing an experiment, collecting data, and analyzing the results, researchers can then determine if there is a causal relationship between media violence and violent behavior.

Why Social Research Methods Are Important

The study of human behavior is as complex as the behaviors themselves, which is why it is so important for social scientists to utilize empirical methods of selecting participants, collecting data, analyzing their findings, and reporting their results.

Haslam N, Loughnan S, Perry G. Meta-Milgram: An empirical synthesis of the obedience experiments. Voracek M, ed. PLoS ONE. 2014;9(4):e93927. doi:10.1371/journal.pone.0093927

Milgram S. Behavioral study of obedience. The Journal of Abnormal and Social Psychology. 1963;67(4):371-378. doi:10.1037/h0040525

Curtiss S, Fromkin V, Krashen S, Rigler D, Rigler M. The linguistic development of Genie. Language. 1974;50(3):528.

By Kendra Cherry, MSEd

Kendra Cherry, MS, is a psychosocial rehabilitation specialist, psychology educator, and author of the "Everything Psychology Book."


5 Ground-Breaking Social Psychology Experiments


Psychologists often use experiments to answer humanity’s most difficult questions. After the atrocities of Nazi Germany in World War II, many wondered how people could follow the orders to perform such horrific actions.

Yale researcher Stanley Milgram devised an experiment around the following question: “Could it be that [Adolf] Eichmann and his million accomplices in the Holocaust were just following orders? Could we call them all accomplices?”

[Adolf Eichmann was one of the major organizers of the Holocaust.]

He found that other populations would follow orders to harm people despite the orders conflicting with their personal morals. By studying this social phenomenon, scientists were able to surmise that atrocities committed during the war were not endemic to German soldiers, as initially believed.

Throughout history, other psychology experiments have tried to address specific issues to foster better understanding of human behavior.

Here is a look at five notable experiments from the second half of the 20th century to present day:

Bobo Doll Experiment

Conducted in 1961 by scientist Albert Bandura, this experiment sought to prove human behavior was learned through social imitation, rather than inherited genetically. Bandura hypothesized children would mimic an adult’s behavior if they trusted the adult. He chose to use a Bobo doll, a roughly 5-foot-tall inflatable toy weighted at the bottom that bounces back to standing upright after being struck.

One group of children did not witness any adult interaction with the toy. Another group watched an adult behave aggressively. Bandura’s experiment found that children exposed to aggression were more likely to imitate the behavior and boys were three times more likely to mimic violence than girls.

Bystander Effect

In the 1960s, John Darley and Bibb Latané sought to measure how much time elapsed before bystanders reacted and either intervened or ignored the need for help when an emergency situation involving a group or an individual was staged. The researchers were inspired by the murder of Kitty Genovese in 1964, which became infamous after The New York Times reported that there were 38 witnesses to her murder and none of them tried to help.

Although the original account of the Kitty Genovese case was later debunked by the Times itself, it prompted the research that led Darley and Latané to identify the Bystander Effect. They demonstrated that a larger number of bystanders diminished the chances any of them would offer help. The Bystander Effect has continued to be replicated for years.

Halo Effect

The “halo effect” is a famous social psychology finding which suggests global or group evaluations about an individual can influence judgments about a specific trait. For example, a likable person is often perceived to be intelligent. The effect, whose name was originally coined by psychologist Edward Thorndike, is a type of cognitive bias.

In the 1970s, researchers Richard Nisbett and Timothy DeCamp Wilson performed an experiment to demonstrate this by showing two groups of students the same lecture, but changing the demeanor of the lecturer from one group to the next. When the lecturer appeared friendly, students responded more favorably than the group who saw the lecturer who appeared cold and distant.

The Chameleon Effect

Also referred to as “unintentional mirroring,” the Chameleon Effect is believed to be a natural tendency for one person to imitate another person based on how well they get along without any realization that it’s happening. Tanya Chartrand and John Bargh from New York University studied this phenomenon in the 1990s. They interviewed participants individually while affecting different mannerisms throughout the talk to gauge the bond that developed.

During two follow-up talks, the scientists mimicked the posture and other mannerisms of some test subjects. The participants mimicked the scientists more in the first experiment and found the scientists more likable when they mimicked their own mannerisms in the follow-ups. Those participants who weren’t mimicked had a more neutral opinion of the scientists.

The Volkswagen Fun Theory

In 2009, advertising agency DDB Stockholm created an initiative on behalf of car manufacturer Volkswagen. The company came up with the “Fun Theory,” conducting three experiments to see whether people might choose to change behavior and do something based on how much fun it was to do, such as recycling, throwing away trash or taking stairs versus an escalator.

In one instance, a set of stairs next to an escalator was decorated to look like piano keys with accompanying notes for each step a person took while traversing the stairs. The experiment found 66% more people chose the stairs than usual. In another, a trash bin with sound effects when people deposited litter collected more trash than nearby bins.

Though these were part of an advertising campaign rather than a scientific experiment, the results indicate people may be more inclined to perform a task such as taking stairs instead of an escalator if it appears to be fun.


Risepoint maintains this website on behalf of Florida Institute of Technology. Florida Tech maintains responsibility for curriculum, teaching, admissions, tuition, financial aid, accreditation, and all other academic- and instruction-related functions and decisions.


Learning Mind

6 Shocking Social Psychology Experiments That Show How Far People Go to Fit in

  • Post author: Janey Davies, B.A. (Hons)
  • Post published: June 20, 2017
  • Reading time: 7 mins read
  • Post category: Psychology & Mental Health / Uncommon Science

Social psychology experiments can give us great insight into how we think, behave and act.

They help us to explain how our thoughts are influenced by others, how group dynamics work, and how we perceive others.

Here are six of the most important social psychology experiments:

1. The Milgram Experiment

After the atrocities of WW2, scientists wanted to know why a race of people did not speak out, and moreover, why they carried out tasks that were deemed to go against the very fabric of society.

Stanley Milgram (1963) set up an experiment in which participants were told to apply electrical shocks to another participant in another room. What the participants did not know was the person in the other room was in on the experiment and told to scream when the higher levels of power were applied.

Milgram wanted to know how far people would go in obeying an instruction if that instruction meant hurting another person.

Results showed that 65% of participants continued to the highest level of 450 volts. Milgram surmised that people will obey orders if they perceive these orders to be from someone in authority and they can relinquish their responsibility.

2. The Conformity Experiment

The Conformity experiment (1951), one of the most important social psychology experiments, took male students and put them in a room with eight other participants.

These eight were in on the experiment, unbeknown to the male students. The tests were simple enough; three lines of differing lengths were compared to a reference line and the whole group had to pick the line that was the same length.

The right answer was obvious but as the researcher went down the group, all those in on the experiment chose the wrong line. So would the student go along with the group or would they be assertive and choose the correct line?

The results showed that 50% followed the group and gave the wrong answer. Only 25% went against the group and over all the trials the average conformity rate was 33%. This appears to show that our willingness to fit in will override our wish to stand out.

3. The Halo Effect

The ‘halo effect’ is a kind of bias where our evaluation of the person leads us to make assumptions about the rest of their character.

One good example of the halo effect is how we perceive celebrities. They are often portrayed as beautiful, handsome, and wealthy. Because of these characteristics, we are more likely to think they are also funny, intelligent, and kind.

One favourable judgement about a person’s personality tends to bleed over into other judgments that are also favourable.

4. Sherif Robbers Cave Experiment

Muzafer Sherif’s most famous experiment is the ‘Robbers Cave’ study (1954), in which he wanted to understand group dynamics, in particular the conflict, negative prejudices, and stereotypes that arise when groups are competing for resources.

Twenty-two boys were split randomly into two groups and transported to a summer camp, where they were separated with no knowledge of the other group. The boys chose names for their groups; the Rattlers and the Eagles. They spent a week bonding with the members of their group, then a competition stage was introduced.

The two groups met for the first time and competed for resources, prizes, and trophies. Despite the groups having only spent a week together, they were solid in their bonds and immediately started to show prejudice against the other group. At first, it was verbal assaults, then the abuse grew physical.

In the end, the two groups were so aggressive towards each other that researchers had to step in. Even after a two-day cooling off period, the boys were still describing their own group in favourable terms and the other group in less favourable terms.

Results suggest that we have an innate need to be in a group and will behave favourably to our group against others.

5. Stanford Prison Experiment

One of the most well-known social psychology experiments, the Stanford Prison Experiment was devised by Philip Zimbardo in 1971. It was focused on the effects of perceived power, in particular, the struggle between guards and prisoners.

In the experiment, young men were given roles as either guards or prisoners and moved to a prison-like environment in the basement of Stanford University.

It soon became clear that the men given the roles as guards took their roles very seriously and began to abuse the prisoners, both verbally and psychologically. The prisoners appeared to accept their role and accepted the abuse without question. After only six days the situation was so intense that it had to be called off.

The researchers decided that it was the situation we are in that determines our behaviour, and not our individual personalities.

6. How stereotypes affect our judgement

Do we make instant judgements based on stereotypes? In one test, John Bargh (1996) divided 34 participants into three groups and subconsciously ‘programmed’ each group into a different state: rude, polite, or neutral.

In order to do this, the participants were given word puzzles to work out. To instil the different states in the three groups, the answers to each group’s word puzzles were words related to that state; for polite, for instance, the words used were ‘courteous’, ‘patiently’ and ‘behaved’.

When they had finished, they were asked to talk to the lead experimenter, who was deep in conversation with someone. They had a choice whether to interrupt his conversation, wait for him to finish, or walk away.

Of the group that had been programmed with rude words, 64% interrupted the experimenter, compared to just 18% of participants programmed with polite words. The neutral condition recorded 36% interrupting.

The results showed that unconscious cues can lead to a change in our behaviour.

As you see from the above social psychology experiments, human nature is so susceptible to social conditioning that it sometimes makes people do truly crazy things.

References:

  • https://web.stanford.edu/dept/spec_coll/uarch/exhibits/Narration.pdf
  • https://nature.berkeley.edu/ucce50/ag-labor/7article/article35.htm
  • https://psycnet.apa.org/record/1952-00803-001
  • https://muse.jhu.edu/book/1107
  • https://psycnet.apa.org/record/1996-06400-003


Behavioral Psychology Examples: Real-Life Applications of Key Theories

From Pavlov’s salivating dogs to the subtle nudges influencing our daily decisions, behavioral psychology unveils the fascinating ways our minds respond to stimuli, shaping our actions and interactions in the world around us. This captivating field of study has revolutionized our understanding of human behavior, offering insights that span from the classroom to the boardroom, and even the therapist’s couch.

Imagine a world where every action, every decision, and every habit could be understood and potentially influenced. That’s the promise of behavioral psychology, a discipline that has been shaping our understanding of human nature for over a century. But what exactly is behavioral psychology, and why should we care?

At its core, behavioral psychology is the study of how our environment and experiences shape our behavior. It’s a field that seeks to understand why we do what we do, not by peering into the depths of our unconscious mind, but by observing our actions and the consequences that follow.

The roots of behavioral psychology can be traced back to the early 20th century, with pioneers like John B. Watson and B.F. Skinner leading the charge. These trailblazers challenged the prevailing introspective methods of their time, arguing that psychology should focus on observable behaviors rather than internal mental states. Their work laid the foundation for a revolution in psychological thinking, one that continues to influence fields as diverse as education, marketing, and mental health treatment.

But why should we, as everyday individuals, care about behavioral psychology? The answer lies in its profound impact on our daily lives. By understanding the principles of behavioral psychology, we can gain valuable insights into our own actions and motivations. We can learn to recognize the subtle influences that shape our decisions, from the ads we see on social media to the layout of our favorite grocery store. Armed with this knowledge, we can make more informed choices, break bad habits, and even improve our relationships with others.

Classical Conditioning: Pavlov’s Dogs and Beyond

Let’s start our journey into behavioral psychology with one of its most famous experiments: Pavlov’s dogs. Ivan Pavlov, a Russian physiologist, stumbled upon a fascinating phenomenon while studying digestion in dogs. He noticed that his canine subjects would start salivating not just when they saw food, but also when they heard the footsteps of the lab assistant who usually fed them.

This observation led Pavlov to develop the theory of classical conditioning, a process by which a neutral stimulus (like footsteps) becomes associated with a natural response (salivation) through repeated pairings with an unconditioned stimulus (food). It’s a simple yet powerful concept that explains how we learn to associate certain stimuli with specific responses.

But classical conditioning isn’t just about drooling dogs. Its principles are at work all around us, often in ways we don’t even realize. Take, for example, the world of advertising. That catchy jingle you can’t get out of your head? That’s classical conditioning at work, associating a positive feeling with a particular brand or product.

Or consider how classical conditioning plays a role in treating phobias. Through a process called systematic desensitization, therapists can help patients overcome their fears by gradually exposing them to the feared stimulus in a safe, controlled environment. Over time, the patient learns to associate the once-feared object or situation with a sense of calm rather than panic.

The applications of classical conditioning extend far beyond the realms of marketing and therapy. In the world of consumer behavior, companies use these principles to create brand loyalty and influence purchasing decisions. The pleasant aroma wafting from a bakery, the satisfying ‘pop’ of opening a soda can, or the sleek design of a smartphone – all of these sensory experiences are carefully crafted to elicit positive associations and encourage repeat business.

Operant Conditioning: Skinner’s Box and Its Real-World Impact

While Pavlov was busy with his dogs, another behavioral psychologist was about to make waves with a different kind of animal experiment. Enter B.F. Skinner and his famous “Skinner Box.” This simple apparatus, essentially a cage with a lever that dispensed food when pressed, became the cornerstone of Skinner’s theory of operant conditioning.

Operant conditioning, unlike its classical counterpart, focuses on how the consequences of a behavior influence the likelihood of that behavior being repeated. In Skinner’s experiments, rats quickly learned to press the lever more frequently when it resulted in a food reward. This simple principle – that behaviors followed by positive outcomes are more likely to be repeated – forms the basis of much of our understanding of learning and behavior modification.

But operant conditioning isn’t just about rats in boxes. Its principles are at work in our everyday lives, particularly in education and the workplace. Consider the use of positive reinforcement in the classroom. When a teacher praises a student for good work, they’re applying operant conditioning principles to encourage that behavior in the future. Similarly, the concept of “token economies” in schools, where students earn points or stickers for good behavior that can be exchanged for rewards, is a direct application of Skinner’s theories.

In the workplace, operant conditioning principles underpin many performance management systems. Employee bonuses, promotions, and even simple praise from a manager all serve as positive reinforcements that encourage desired behaviors. On the flip side, negative reinforcement – the removal of an unpleasant stimulus following a desired behavior – can also be effective. For instance, a company might offer to waive a monthly fee if customers maintain a certain account balance, encouraging them to keep more money in their accounts.

The power of operant conditioning lies in its versatility. Whether you’re trying to train a dog, motivate a team, or change your own habits, understanding the relationship between behavior and consequences can be a game-changer. It’s a testament to the enduring relevance of behavioral models in psychology, showing how simple principles can have profound effects on complex human behaviors.

Social Learning Theory: Bandura’s Bobo Doll Experiment

As influential as classical and operant conditioning were, they didn’t tell the whole story of human learning. Enter Albert Bandura and his groundbreaking social learning theory, which proposed that we learn not just from direct experiences, but also by observing others.

Bandura’s most famous experiment, known as the Bobo doll experiment, demonstrated this principle in action. In this study, children watched an adult in a room containing a large inflatable doll. Some adults treated the doll aggressively, while others played quietly with other toys and ignored it. When the children were later allowed to play with the doll themselves, those who had observed aggressive behavior were more likely to mimic it, even without any direct reinforcement.

This experiment highlighted the power of observational learning, a concept that has profound implications for understanding behavioral patterns in psychology. It suggests that we don’t need to experience something directly to learn from it – we can acquire new behaviors simply by watching others.

The real-world applications of social learning theory are vast and varied. In child development, it explains how children learn complex behaviors and social norms by observing and imitating their parents, peers, and other role models. This understanding has important implications for parenting and education, highlighting the importance of modeling desired behaviors rather than just instructing.

But social learning isn’t just for kids. Adults, too, continue to learn through observation throughout their lives. Think about how you might pick up a new skill at work by watching a more experienced colleague, or how you might adopt new exercise habits after seeing a friend’s fitness transformation on social media.

The influence of media on behavior is another area where social learning theory has significant relevance. The characters we see on TV, in movies, and on social media all serve as potential models for behavior. This can have both positive and negative effects – while seeing diverse representation in media can broaden our perspectives and inspire positive change, exposure to violent or risky behaviors can potentially lead to imitation, especially in younger viewers.

Understanding social learning theory can help us become more conscious of the influences around us and make more intentional choices about the behaviors we model and the media we consume. It’s a powerful reminder of our capacity to learn and change throughout our lives, not just through direct experience, but through the vast social world around us.

Cognitive Behavioral Therapy: Practical Applications in Mental Health

As behavioral psychology evolved, it began to incorporate elements of cognitive psychology, leading to the development of cognitive behavioral therapy (CBT). This powerful therapeutic approach, which combines behavioral techniques with strategies to change thought patterns, has revolutionized the treatment of many mental health conditions.

At its core, CBT is based on the idea that our thoughts, feelings, and behaviors are all interconnected. By changing one aspect – typically our thoughts or behaviors – we can influence the others, creating a positive cycle of improvement. This approach has proven particularly effective in treating conditions like anxiety and depression, where negative thought patterns often play a significant role.

Let’s consider an example of how CBT might work in practice. Imagine someone with social anxiety who avoids social gatherings due to a fear of being judged. A CBT approach might involve challenging the negative thoughts (“Everyone will think I’m boring”), practicing relaxation techniques to manage physical symptoms of anxiety, and gradually exposing the person to social situations in a controlled way.

The effectiveness of CBT is supported by numerous case studies and research findings. For instance, a review of meta-analyses published in the journal Cognitive Therapy and Research (Hofmann et al., 2012) found that CBT was significantly more effective than control conditions in treating anxiety disorders, with effects maintained at follow-up.

But the applications of CBT extend beyond clinical settings. Many of its principles can be applied in self-help and personal development contexts. Techniques like cognitive restructuring (challenging and changing negative thought patterns) and behavioral activation (increasing engagement in positive activities) can be valuable tools for anyone looking to improve their mental well-being and achieve personal goals.

The success of CBT underscores the power of integrating behavioral and cognitive approaches in understanding human behavior. It shows how changing our actions can change our thoughts, and vice versa, offering a holistic approach to personal growth and mental health improvement.

Behavioral Economics: Nudge Theory and Consumer Behavior

As we venture into the 21st century, behavioral psychology has found new applications in the world of economics, giving rise to the field of behavioral economics. This innovative discipline challenges traditional economic models by recognizing that humans don’t always make rational decisions based on perfect information. Instead, our choices are often influenced by cognitive biases, emotions, and environmental factors.

One of the most influential concepts to emerge from behavioral economics is “nudge theory,” popularized by Richard Thaler and Cass Sunstein. A nudge, in this context, is any aspect of the choice architecture that alters people’s behavior in a predictable way without forbidding any options or significantly changing their economic incentives.

Nudges can take many forms in the real world. For example, placing healthier food options at eye level in a cafeteria can encourage people to make better dietary choices without restricting their freedom to choose less healthy options. Similarly, changing the default option for organ donation from “opt-in” to “opt-out” has been shown to significantly increase donation rates in some countries.

Governments around the world have taken notice of the potential of nudge theory to influence public behavior positively. The UK government, for instance, established a “Nudge Unit” (officially called the Behavioural Insights Team) to apply behavioral science principles to public policy. One of their successful interventions involved changing the wording on tax reminder letters, which increased timely tax payments by several percentage points, translating to millions of pounds in additional revenue.

In the realm of personal finance, behavioral economics offers valuable insights into why we often make irrational financial decisions and how we can combat these tendencies. For instance, the concept of “mental accounting” explains why we might treat a $100 lottery win differently from a $100 salary increase, even though the monetary value is the same. Understanding these biases can help us make more rational financial choices.

The principles of behavioral economics also have profound implications for marketing and consumer behavior. Concepts like loss aversion (we feel losses more keenly than equivalent gains) and the endowment effect (we value things more once we own them) help explain many consumer behaviors and inform marketing strategies.

As we navigate an increasingly complex world of choices, understanding the principles of behavioral economics can empower us to make better decisions and recognize when our choices might be influenced by factors we’re not consciously aware of. It’s a field that bridges the gap between psychology as the behaviorist views it and the real-world complexities of human decision-making.

The Enduring Relevance of Behavioral Psychology

As we’ve journeyed through the landscape of behavioral psychology, from Pavlov’s labs to modern economic policies, one thing becomes clear: the principles uncovered by behavioral psychologists continue to shape our understanding of human behavior in profound ways.

The theories we’ve explored – classical conditioning, operant conditioning, social learning theory, cognitive behavioral therapy, and behavioral economics – each offer unique insights into why we behave the way we do. They provide a toolkit for understanding and potentially influencing behavior, whether in clinical settings, educational institutions, workplaces, or our personal lives.

But perhaps the most exciting aspect of behavioral psychology is its ongoing evolution. As our understanding of the brain and behavior deepens, new applications and refinements of these theories continue to emerge. The integration of behavioral insights with neuroscience, for instance, is opening up new frontiers in our understanding of the biological basis of behavior.

Moreover, the digital age presents both challenges and opportunities for behavioral psychology. On one hand, technologies like smartphones and social media create new avenues for studying and influencing behavior on an unprecedented scale. On the other hand, they raise important ethical questions about privacy and the potential for manipulation.

Looking to the future, behavioral psychology is likely to play an increasingly important role in addressing some of society’s most pressing challenges. From developing more effective treatments for mental health conditions to designing policies that encourage sustainable behaviors, the insights of behavioral psychology will be crucial.

As individuals, understanding the principles of behavioral psychology can empower us to take greater control of our own behaviors and make more informed decisions. By recognizing the various influences on our behavior – from our environment to our thought patterns – we can become more conscious architects of our own lives.

In conclusion, behavioral psychology offers a fascinating lens through which to view human behavior. Its theories and applications touch every aspect of our lives, from the most personal decisions to the broadest societal trends. As we continue to unravel the complexities of human behavior, the insights of behavioral psychology will undoubtedly continue to play a crucial role in shaping our understanding of ourselves and the world around us.

Whether you’re a student of psychology, a professional in a related field, or simply someone curious about human behavior, delving deeper into behavioral psychology concepts can offer valuable insights and practical tools for navigating the complexities of human behavior. After all, in a world where understanding and influencing behavior is increasingly crucial, the principles of behavioral psychology are more relevant than ever.

References:

1. Pavlov, I. P. (1927). Conditioned Reflexes: An Investigation of the Physiological Activity of the Cerebral Cortex. Oxford University Press.

2. Skinner, B. F. (1938). The Behavior of Organisms: An Experimental Analysis. Appleton-Century-Crofts.

3. Bandura, A. (1977). Social Learning Theory. Prentice Hall.

4. Beck, A. T. (1979). Cognitive Therapy and the Emotional Disorders. International Universities Press.

5. Thaler, R. H., & Sunstein, C. R. (2008). Nudge: Improving Decisions about Health, Wealth, and Happiness. Yale University Press.

6. Kahneman, D. (2011). Thinking, Fast and Slow. Farrar, Straus and Giroux.

7. Hofmann, S. G., Asnaani, A., Vonk, I. J., Sawyer, A. T., & Fang, A. (2012). The Efficacy of Cognitive Behavioral Therapy: A Review of Meta-analyses. Cognitive Therapy and Research, 36(5), 427-440.

8. Behavioural Insights Team. (2012). Applying behavioural insights to reduce fraud, error and debt. Cabinet Office, UK Government.

9. Tversky, A., & Kahneman, D. (1974). Judgment under Uncertainty: Heuristics and Biases. Science, 185(4157), 1124-1131.

10. Dolan, P., Hallsworth, M., Halpern, D., King, D., Metcalfe, R., & Vlaev, I. (2012). Influencing behaviour: The mindspace way. Journal of Economic Psychology, 33(1), 264-277.


A comparison of conventional and resampled personal reliability in detecting careless responding

  • Original Manuscript
  • Open access
  • Published: 16 September 2024


  • Philippe Goldammer   ORCID: orcid.org/0000-0002-9914-9897 1 ,
  • Peter Lucas Stöckli   ORCID: orcid.org/0000-0003-3216-4803 1 ,
  • Hubert Annen   ORCID: orcid.org/0000-0003-1508-6276 1 &
  • Annika Schmitz-Wilhelmy   ORCID: orcid.org/0000-0001-8566-3521 2  

Detecting careless responding in survey data is important to ensure the credibility of study findings. Of several available detection methods, personal reliability (PR) is one of the best-performing indices. Curran (2016; Journal of Experimental Social Psychology, 66, 4–19) proposed a resampled version of personal reliability (RPR). Compared to the conventional PR or even–odd consistency, in which just one set of scale halves is used, RPR is based on repeated calculation of PR across several randomly rearranged sets of scale halves. RPR should therefore be less affected than PR by random errors that may occur when a specific set of scale half pairings is used for the PR calculation. In theory, RPR should outperform PR, but it remains unclear whether it in fact does, and under what conditions the potential gain in detection accuracy is the most pronounced. We conducted two studies: a simulation study examined the performance of the conventional PR and RPR in detecting simulated careless responding, and a real data example study analyzed their performance when detecting human-generated careless responding. In both studies, RPR turned out to be a significantly better careless response indicator than PR. The results also revealed that using 25 resamples for the RPR computation is sufficient to obtain the expected gain in detection accuracy over the conventional PR. We therefore recommend using RPR instead of the conventional PR when screening questionnaire data for careless responding.


Likert scale questionnaires are a convenient and routinely used method to assess different psychological phenomena. However, credible results can be obtained by this method only if respondents have answered the questions attentively. Unfortunately, it is rather common that some respondents in the sample rush through the questionnaire without paying attention to the item content and instructions (Goldammer et al., 2020; Meade & Craig, 2012), a response behavior that has been commonly labeled insufficient effort responding (Huang et al., 2012) or careless responding (Meade & Craig, 2012).

Undetected careless responding can have severe consequences. Simulation studies suggest that even small proportions of careless respondents in the data, 10% (Hong et al., 2020; Woods, 2006) or even only 5% (Credé, 2010), can bias results. Moreover, the bias increases with the rate of careless respondents (Hong et al., 2020). In addition, careless responding has a biasing effect on a variety of estimates, such as item covariances (Credé, 2010; Goldammer et al., 2020), item means (Goldammer et al., 2020), reliability estimates (Hong et al., 2020; Huang et al., 2012), factor loadings (Kam & Meyer, 2015; Meade & Craig, 2012), and the testing of construct dimensionality (Arias et al., 2020; Goldammer et al., 2020; Woods, 2006). It is not surprising that research on detecting careless responding has gained considerable attention and that several detection methods have been proposed (for an overview, see Curran, 2016).

Personal reliability

Among the proposed methods, personal reliability (PR; Jackson, 1976), which has also been called individual reliability or even–odd consistency (Curran, 2016; DeSimone et al., 2015), has turned out to be one of the best-performing indices in detecting careless responding (Goldammer et al., 2020, 2024; Huang et al., 2012; Meade & Craig, 2012; Niessen et al., 2016). PR makes use of a very simple logic: Careful respondents should not contradict themselves over the course of a questionnaire and are expected to choose similar response options when they rate items from the same unidimensional scale. Accordingly, a score that is based on half the scale items (e.g., even-numbered items) should correlate positively with a score based on the rest of the scale items (e.g., odd-numbered items). If a questionnaire includes at least three such pairs of scale halves, PR may then be computed as the within-person correlation across the two vectors of scale half pairings (Curran, 2016; DeSimone et al., 2015).

The computational principle of PR may be further illustrated with a simplified example. Let us assume that researchers have gathered data for two persons who completed a personality questionnaire measuring the five broad personality dimensions openness, conscientiousness, extraversion, agreeableness, and emotional stability, and that each of the five broad dimensions was measured with four items on a unidirectionally keyed Likert scale with six response options. Table 1 shows the simulated response protocols for these two persons. The protocol of person 1 was simulated such that it mimics a careful response pattern, and the protocol of person 2 was simulated such that it mimics a careless (i.e., random uniform) response pattern.

To obtain PR, the researchers proceed as follows. First, they build halves for each of the five dimensions and calculate an average score for these halves. In our example, the researchers calculated scores for the even-numbered and the odd-numbered items and rearranged the data from wide to long format, such that each person has five row entries and the two scale halves are represented as two separate columns or vectors (see Table 2). Based on this data format, PR can be calculated for each person by correlating the two vectors of scale halves. If desired, these PR values can also be corrected with the Spearman–Brown prophecy formula (\(\text{PR}_{\text{SB}} = \frac{k \cdot \text{PR}}{1 + (k - 1) \cdot \text{PR}}\); Brown, 1910; Spearman, 1910), in which \(k\) is the factor of scale reduction (in our case, \(k\) equals 2). In our example, the researchers would have obtained a \(\text{PR}_{\text{SB}}\) of .525 for person 1 and a \(\text{PR}_{\text{SB}}\) of .178 for person 2. Based on these \(\text{PR}_{\text{SB}}\) values, the researchers then conclude (unaware of our data-generating process) that person 2 has responded on the personality questionnaire and its scales in a far more inconsistent manner than person 1. After consulting the literature for commonly applied PR cut scores (e.g., below .3; DeSimone et al., 2015; Zickar & Keith, 2023), the researchers would have even begun to doubt the responding effort of person 2.
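The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the assumption that responses are stored facet by facet with an equal number of items per facet, and the example respondent are all hypothetical and are not taken from Table 1 or Table 2.

```python
import numpy as np

def spearman_brown(r, k=2):
    """Spearman-Brown correction: k * r / (1 + (k - 1) * r)."""
    return k * r / (1 + (k - 1) * r)

def personal_reliability(responses, items_per_facet):
    """Even-odd personal reliability for one respondent.

    responses: item responses ordered facet by facet (assumed layout).
    items_per_facet: number of items per facet (assumed equal across facets).
    """
    x = np.asarray(responses, dtype=float).reshape(-1, items_per_facet)
    odd_half = x[:, 0::2].mean(axis=1)   # items 1, 3, ... of each facet
    even_half = x[:, 1::2].mean(axis=1)  # items 2, 4, ... of each facet
    # Within-person correlation across the facet-level half scores
    return np.corrcoef(odd_half, even_half)[0, 1]

# Hypothetical respondent: 5 facets x 4 items, 6-point scale
example = [5, 5, 6, 5,  2, 1, 2, 2,  4, 4, 3, 4,  3, 3, 3, 2,  6, 5, 6, 6]
pr = personal_reliability(example, items_per_facet=4)
pr_sb = spearman_brown(pr, k=2)
```

If one of the two half-score vectors is constant (for example, a respondent who picks the same option throughout), the correlation is undefined and `np.corrcoef` returns `nan`; how such cases should be handled is not specified here.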

Resampled personal reliability

Calculating the PR by using even–odd scale half pairs has been the standard for many years (Curran, 2016, pp. 9–10). Nevertheless, it is an arbitrary choice for building the scale halves that may be as good or bad as any other set of scale halves. To overcome this arbitrariness, Curran (2016, pp. 9–10) proposed using a resampled version of personal reliability, which he called resampled individual reliability (RIR) (here, resampled personal reliability, RPR). This entails calculating PR with different sets of scale halves and summarizing the results to an overall PR measure, for instance by building the arithmetic mean of the PR values (i.e., computing a mean-based RPR).

The principle of the RPR may be again illustrated using our simulated response protocols. In Table 3 we show the scores of three different random sets of scale half pairings (out of \(3^5\) potential pairs) and the resulting PR and RPR values for carefully responding person 1 and carelessly responding person 2. Based on the PR and RPR values in Table 3, the following observations and conclusions can be made. First, the PRs vary depending on the sets of scale half pairings that were used for computation, from .811 to .866 for person 1 and from –.472 to .263 for person 2 (Table 3). Second, if researchers use only a single set of scale half pairings, they may be unlucky with their choice and end up with a PR measure that only poorly reflects the true state of the respondent’s responding effort. For instance, if PR had been calculated based only on the second random split of items, the researchers would have obtained a \(\text{PR}_{\text{SB}}\) of .263 for person 2 and in turn may have not doubted the person’s responding effort, as they should have or would have when using another set of scale half pairings. Third, by taking the average of PR values that were calculated across different sets of scale half pairings (and thus calculating RPR), the researchers may obtain not only a measure that is less arbitrary in its computation but also one that is less affected by “sampling error” (i.e., fluctuation in PR values that occurs when using different sets of scale half pairings). Eventually, this reduced amount of sampling error in RPR is also why RPR has been suspected to be a more precise careless response indicator than the conventional PR measure (Curran, 2016, pp. 9–10; Ward & Meade, 2023, p. 587), which is based on only a single set of scale half pairings, typically a set of even–odd scale half pairs.
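A mean-based RPR along these lines might be sketched as follows. The function name, the data layout (responses ordered facet by facet, equal items per facet), the within-facet shuffling scheme, and the example respondent are assumptions made for illustration; the default of 25 resamples mirrors the number the abstract reports as sufficient. No Spearman–Brown correction is applied in this sketch.

```python
import numpy as np

def resampled_personal_reliability(responses, items_per_facet,
                                   n_resamples=25, seed=0):
    """Mean of PR values over randomly rearranged scale halves (sketch)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(responses, dtype=float).reshape(-1, items_per_facet)
    half = items_per_facet // 2
    pr_values = []
    for _ in range(n_resamples):
        # Shuffle the item order independently within each facet (row)
        shuffled = rng.permuted(x, axis=1)
        first_half = shuffled[:, :half].mean(axis=1)
        second_half = shuffled[:, half:].mean(axis=1)
        pr_values.append(np.corrcoef(first_half, second_half)[0, 1])
    # Mean-based summary of the PR values across resamples
    return float(np.mean(pr_values))

# Hypothetical respondent: 5 facets x 4 items, 6-point scale
example = [5, 5, 6, 5,  2, 1, 2, 2,  4, 4, 3, 4,  3, 3, 3, 2,  6, 5, 6, 6]
rpr = resampled_personal_reliability(example, items_per_facet=4)
```

Averaging over several random splits is what reduces the split-to-split fluctuation described above; a single call with `n_resamples=1` reduces to one arbitrary split.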

Purpose of current studies and research questions

Unfortunately, up to now, no study that we know of has systematically examined whether RPR really outperforms PR and under what conditions the potential gain in detection accuracy is the most pronounced (e.g., Zickar & Keith, 2023). For this reason, we conducted a simulation study, and we reanalyzed the data of Goldammer et al.’s (2020) experimental study on careless responding. The simulation study allowed us to examine the performance of PR and RPR across a multitude of conditions that could not be easily set up in a real experiment. However, a simulation study also entails using simulated careless response protocols, which may only partially reflect human-generated careless response protocols. Niessen et al. (2016), for instance, found the detection rates of several careless response indices to be remarkably lower when human-generated careless response data were used than when computer-generated random data were used. Reanalyzing Goldammer et al.’s (2020) experimental data therefore allowed us to examine the utility of PR and RPR under conditions that are closer to real human careless responding in surveys and thus under conditions in which careless responding may be harder to detect. In sum, our two studies address the following three research questions:

Research question 1: Does RPR outperform the conventional PR (i.e., even–odd consistency)?

Research question 2: Under what conditions is the gain in detection accuracy that is expected when using RPR instead of PR the most pronounced?

Research question 3: Does the performance of RPR and PR differ when human instead of computer-generated careless response patterns need to be detected?

The simulation study

General design

To generate the data in our simulation study, we used latent factor models (Jöreskog, 1969), in which an observed item response can be described as a function of the underlying latent factor, the latent factor’s loading on the observed item, and an item-specific error variance. We selected this model type because it allowed us to simulate response protocols of multi-facet surveys and to manipulate parameters that we thought were most likely to have an effect on the detection accuracy of PR and/or RPR. In sum, we examined the performance of PR and RPR across 36 (3 × 3 × 2 × 2) conditions with 100 replications for each condition. For all these data-generating processes, we used Stata 18 (StataCorp, 2023). Table 4 provides an overview of the manipulated and fixed parameters in our simulation study.
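The 36 conditions result from fully crossing the four manipulated parameters described in the next section. The following Python sketch only enumerates that design grid; the variable names are hypothetical, and the study itself was run in Stata 18, not Python.

```python
from itertools import product

# Levels of the four manipulated parameters, as described in the text
n_facets = (5, 15, 30)
item_error = ("low", "mediocre", "high")
careless_pattern = ("random uniform", "invariant")
severity = ("full", "partial")

# Fully crossed design: 3 x 3 x 2 x 2 conditions
conditions = list(product(n_facets, item_error, careless_pattern, severity))
n_replications = 100  # replications per condition

print(len(conditions))                   # 36
print(len(conditions) * n_replications)  # 3600 simulated data sets in total
```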

Manipulated parameters

We manipulated four parameters in our simulation study. First, we examined the performance of PR and RPR across three facet (latent factor) sample sizes: 5, 15, and 30. Because the number of items was fixed at 4 per facet, the item sample sizes for the three facet conditions were 20, 60, and 120. These facet (and item) sample sizes were chosen because they reflected the facet (and item) sample sizes of typical short and comprehensive personality questionnaires (Donnellan et al., 2006; Johnson, 2014; Soto & John, 2017). The condition with five facets may be taken as an example of a typical short personality questionnaire in which only five facets are assessed, such as the Mini-IPIP (Donnellan et al., 2006), which has 20 items measuring five broad personality traits. Furthermore, the condition with five facets represents a situation in which the number of available facets was only barely above the required minimum (i.e., 3) that is necessary to calculate the PR measures. In contrast, the conditions with 15 and 30 facets were clearly above the required minimum of facets. These conditions may be taken as examples of comprehensive personality questionnaires in which several facets are assessed, such as the Big Five Inventory-2 (Soto & John, 2017), which has 60 items measuring 15 facets, or the IPIP-NEO-120 (Johnson, 2014), with 120 items measuring 30 facets. Because the number of facets (or scale half pairings) acts as sample size when computing personal reliability measures, more precise careless responding estimates can be expected when more facets and thus larger sample sizes are used (see DeSimone et al., 2015, p. 175). We expected PR and RPR to be more accurate indicators of careless responding when 15 or even 30 facets were used for computation than when only five facets were used.

Second, we manipulated the extent of item-specific error per facet (latent factor) across three levels: low, mediocre, and high. In the condition with low item-specific error per facet, the error variances were low in all items measuring each facet (i.e., normally distributed errors with N[0, 0.5] for all items of each factor). In the condition with mediocre item-specific error per facet, the error variances were low for one half of the items of each facet (i.e., normally distributed errors with N[0, 0.5]) and high for the other half of the items of each facet (i.e., normally distributed errors with N[0, 1.5]). In the condition with high item-specific error per facet, the error variances were high in all items measuring each facet (i.e., normally distributed errors with N[0, 1.5] for all items of each factor). An unstandardized factor loading of 1 corresponded to a standardized factor loading of .77 if the item-specific error variance was set to 0.5 and corresponded to a standardized factor loading of .40 if the item-specific error variance was set to 1.5. These levels of item-specific error were chosen because they reflect typical levels of measurement quality that may be achieved in confirmatory factor models: In the ideal case (i.e., low item-specific error condition), the latent factor explains the majority of variance in each of the construct items, which is given when the standardized loading is above .7 (Kline, 2016, p. 301). In applied settings, however, the more common case (i.e., mediocre item-specific error condition) is that only some of the items have standardized loadings above .7. In the worst case (i.e., high item-specific error condition), all item loadings are only around .4, which has also been suggested as the minimum value for a factor loading to be considered meaningful (Brown, 2015, p. 115). With an increasing extent of item-specific error per facet, the error in the scores of the scale half pairings and the resulting PR should increase as well. Thus, we expected PR and RPR to be more accurate indicators of careless responding when the extent of item-specific error per facet was low than when it was mediocre or even high.

Third, we examined the performance of PR and RPR across two patterns or types of careless responding: random uniform and invariant. These two careless response patterns were chosen because they represented two commonly applied strategies that survey participants may use when completing a questionnaire carelessly (Ward & Meade, 2023 , pp. 581–582), and they have also been used in recent simulation studies on careless responding (Hong et al., 2020 ; Wind & Wang, 2023 ). In the random uniform careless responding pattern, all six response options (1 to 6) in the simulated response protocols had an equal probability of occurrence. In the case of the invariant careless responding pattern, however, only the two response options “4” and “5” were randomly simulated for the careless response protocols with an equal probability of occurrence. To obtain these invariant random response options, we drew from a normal distribution with N (4.5, 0.2) and rounded the drawn values to the next integer. Random uniform careless response patterns are therefore characterized by a large intra-facet response variance and invariant careless response patterns by a relatively small one. In turn, random uniform careless response patterns should therefore go along with far more response inconsistencies than the invariant careless response patterns, especially as all items were unidirectionally (i.e., positively) keyed in our simulation study. Thus, we expected PR and RPR to be more accurate in detecting random uniform than in detecting invariant careless response patterns.
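The two careless response patterns described above can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' simulation code: the function names are hypothetical, the value 0.2 in N(4.5, 0.2) is assumed to be a standard deviation, and the clamp to the options 4 and 5 is a guard added here rather than something the text describes.

```python
import random

def uniform_careless(n_items, rng):
    """Random uniform pattern: each option 1..6 is equally likely."""
    return [rng.randint(1, 6) for _ in range(n_items)]

def invariant_careless(n_items, rng, mean=4.5, sd=0.2):
    """Invariant pattern: draw from N(4.5, 0.2) and round to an integer.

    Assumption: 0.2 is treated as a standard deviation. The clamp to 4..5
    is a safeguard of this sketch, not part of the described procedure.
    """
    draws = (round(rng.gauss(mean, sd)) for _ in range(n_items))
    return [min(5, max(4, d)) for d in draws]

rng = random.Random(1)
uniform_protocol = uniform_careless(20, rng)      # 5 facets x 4 items
invariant_protocol = invariant_careless(20, rng)
```

Because the normal distribution is centered on 4.5, rounding yields the options 4 and 5 with equal probability, which is what keeps the intra-facet response variance of the invariant pattern small.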

Fourth, we manipulated the severity of careless responding in the careless response protocols across two levels: full and partial careless responding. These two levels of severity were chosen because they represent two types of careless respondents that studies on careless responding (Bowling et al., 2021; Meade & Craig, 2012; Ward & Meade, 2023; Yu & Cheng, 2019) have typically reported: participants who are unmotivated from the beginning and complete the whole questionnaire carelessly, and participants who lose motivation while completing the survey and answer the items carelessly after the “change-point” (Bowling et al., 2021; Yu & Cheng, 2019). In the full careless responding condition, all item responses of every careless response protocol were replaced with simulated careless responses. In the partial careless responding condition, 50% of the item responses of the response protocol were randomly selected and replaced with simulated careless responses (see Footnote 4). Generally, response protocols in which all item responses were simulated to be given carelessly tend to be more easily spotted than protocols in which only some of the item responses were simulated to be given carelessly (Hong et al., 2020; Meade & Craig, 2012). We expected PR and RPR to be more accurate in detecting full than in detecting partial careless responding.
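The difference between the two severity levels comes down to which item positions are overwritten. The following sketch illustrates this under the assumption that the 50% of items are selected uniformly at random across the whole protocol; `make_careless` and the placeholder protocols are hypothetical names used only for illustration.

```python
import random

def make_careless(careful, careless, severity, rng):
    """Replace all ("full") or a random 50% ("partial") of the careful responses."""
    n = len(careful)
    if severity == "full":
        replaced = range(n)
    elif severity == "partial":
        replaced = rng.sample(range(n), n // 2)
    else:
        raise ValueError("severity must be 'full' or 'partial'")
    protocol = list(careful)          # copy so the careful protocol is untouched
    for i in replaced:
        protocol[i] = careless[i]
    return protocol

rng = random.Random(7)
careful = [3] * 20     # placeholder careful protocol
careless = [6] * 20    # placeholder careless responses
partial = make_careless(careful, careless, "partial", rng)
full = make_careless(careful, careless, "full", rng)
```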

Summary of hypotheses

In sum, we expected PR and RPR to be more accurate when the conditions were favorable for them—conditions in which their computation is less affected by error (i.e., sampling or measurement error) and conditions in which the careless response patterns are easier to spot: when many facets/scale half pairings (e.g., 30) can be used for computation; when the extent of item-specific error per facet is low; when a random uniform careless response pattern should be detected; and when a careless response protocol should be detected in which all item responses are simulated to be given carelessly.

Moreover, because RPR is less affected by fluctuations than the conventional PR measure, which is based on only a single set of scale half pairings (e.g., Curran, 2016), we expected RPR to outperform PR to a larger degree in conditions in which the fluctuations of the individual sets of scale half pairings and their PR values are large: when fewer scale half pairings (e.g., only five) are used for computation; when the extent of item-specific error per facet is large; and when careless response patterns that are harder to spot (e.g., invariant and partial careless response patterns) are to be detected.

Fixed parameters

In all of our simulation conditions, the following parameters were held constant (see Table 4 ). First, we always drew samples with 300 observations and randomly defined 30% of them as careless respondents (whose careful response protocols were later on replaced with the careless response patterns). We held the sample size and the percentage of careless respondents in the sample constant across conditions, because we did not expect these two parameters to have an effect on the performance of the PR measures, which have the (in)consistency of the individual response protocols and not deviations from a normative response pattern as their primary focus (Goldammer et al., 2020 ; 2024 ). A sample size of 300 and a percentage of 30% careless respondents in the sample was chosen because comparable values had been used in previous simulation studies (Hong et al., 2020 ; Wind & Wang, 2023 ).

Second, we always drew the samples from a multivariate normal facet (latent factor) correlation matrix in which the facet correlations were set to .3 and the facet means and standard deviations were set to 3.5 and 0.7. As for sample size and percentage of careless respondents in the sample, the facet correlation was held constant across conditions, because we did not expect this parameter to have a large impact on the performance of the PR measures. Compared to the between-facet correlation, we considered the within-facet correlation (i.e., correlations among items in each facet) as more important for the performance of the PR measures, which is why we manipulated the extent of item-specific error per facet instead. A facet correlation of .3 was chosen because it approximated facet and domain correlations reported in validation studies (e.g., Soto & John, 2017 ) reasonably well, and by drawing samples from a multivariate normal facet correlation matrix, we followed previous simulation studies that used similar parameter settings for the data-generating process (Hong et al., 2020 ; Wind & Wang, 2023 ).

Third, for each of the drawn facets (latent factors), we always generated four positively keyed items, and for each of these four items the factor loading was set to 1. The resulting continuous item scores were then rounded to the next integer, which allowed us to obtain a categorical response format in which the item values ranged from 1 to 6. If rounded item values fell below 1, they were recoded to 1, and if rounded item values were greater than 6, they were recoded to 6. We chose to use four items per facet and a categorical six-point response format because we considered these settings to be a reasonable approximation of the typical questionnaire format used in psychological assessments. By generating positively keyed items, we followed previous simulation studies that used similar parameter settings for the data-generating process (Hong et al., 2020; Wind & Wang, 2023). Lastly, the factor loadings for the items were held constant across conditions, because we already manipulated the measurement precision in the facets by using increasing numbers of less reliable items per facet.
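Taken together, the fixed parameters describe a simple data-generating process for the careful protocols. A minimal sketch, using only the Python standard library, is given below. Several details are assumptions: 0.5 and 1.5 are treated as error variances, the low-error items are placed first within each facet in the mediocre condition, and the equally correlated facets are produced with a shared random component (which yields pairwise correlations of .3) instead of an explicit multivariate normal draw.

```python
import math
import random

ERROR_LEVELS = {                      # error variance per item of a facet
    "low": [0.5, 0.5, 0.5, 0.5],
    "mediocre": [0.5, 0.5, 1.5, 1.5],  # assumption: which items are noisy
    "high": [1.5, 1.5, 1.5, 1.5],
}

def simulate_careful(n_facets, error_vars, rng, r=0.3, mean=3.5, sd=0.7):
    """One careful protocol with len(error_vars) items per facet."""
    # A shared standard-normal component makes every facet pair correlate at r.
    shared = rng.gauss(0, 1)
    protocol = []
    for _ in range(n_facets):
        unique = rng.gauss(0, 1)
        facet = mean + sd * (math.sqrt(r) * shared + math.sqrt(1 - r) * unique)
        for var in error_vars:
            raw = facet + rng.gauss(0, math.sqrt(var))    # loading fixed at 1
            protocol.append(min(6, max(1, round(raw))))   # round, recode to 1..6
    return protocol

rng = random.Random(42)
sample = [simulate_careful(15, ERROR_LEVELS["mediocre"], rng) for _ in range(300)]
```

With 15 facets this produces 300 protocols of 60 items each, matching the sample size and the 15-facet condition described in the text.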

Measures of personal reliability

In addition to the conventional PR measure (i.e., even–odd consistency), we calculated three RPR versions that were based on a different number of independently drawn resamples (i.e., 25, 50, 100). Calculating these three RPR versions allowed us to examine their relative performance and thus to gain insights on the number of RPR resamples that is necessary to achieve the expected gain in detection accuracy over the conventional PR measure. The values of the conventional PR as well as those of the three RPR versions were all corrected with the Spearman–Brown prophecy formula.
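To make the distinction between the two kinds of measure concrete, the sketch below computes an even–odd PR and a mean-based RPR for a single response protocol. It is a simplified illustration rather than the authors' implementation: the function names are hypothetical, the Spearman–Brown correction 2r / (1 + r) is assumed to be applied to each individual PR value before averaging, and resamples with an undefined correlation are simply skipped.

```python
import random

def pearson(x, y):
    """Pearson correlation; returns None when one vector has no variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / (sxx * syy) ** 0.5

def spearman_brown(r):
    """Step a half-length correlation up to full length: 2r / (1 + r)."""
    return -1.0 if r <= -1 else 2 * r / (1 + r)

def personal_reliability(protocol, splits):
    """Correlate the two half scores across facets, then correct the result.

    protocol: one list of item responses per facet
    splits:   per facet, a pair (item indices of half A, item indices of half B)
    """
    half_a = [sum(f[i] for i in a) / len(a) for f, (a, _) in zip(protocol, splits)]
    half_b = [sum(f[i] for i in b) / len(b) for f, (_, b) in zip(protocol, splits)]
    r = pearson(half_a, half_b)
    return None if r is None else spearman_brown(r)

def even_odd_pr(protocol):
    """Conventional PR: odd-numbered versus even-numbered items of each facet."""
    splits = [(list(range(0, len(f), 2)), list(range(1, len(f), 2))) for f in protocol]
    return personal_reliability(protocol, splits)

def resampled_pr(protocol, n_resamples, rng):
    """Mean PR over randomly drawn sets of scale half pairings."""
    values = []
    for _ in range(n_resamples):
        splits = []
        for f in protocol:
            idx = list(range(len(f)))
            rng.shuffle(idx)
            splits.append((idx[: len(idx) // 2], idx[len(idx) // 2:]))
        pr = personal_reliability(protocol, splits)
        if pr is not None:
            values.append(pr)
    return sum(values) / len(values) if values else None

rng = random.Random(3)
protocol = [[4, 5, 4, 4], [2, 2, 3, 2], [5, 6, 5, 5], [1, 2, 1, 1], [3, 3, 4, 3]]
pr = even_odd_pr(protocol)
rpr25 = resampled_pr(protocol, 25, rng)
```

The only difference between the two measures in this sketch is the choice of splits: one fixed even–odd split for PR, and the average over many random splits for RPR.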

Outcome measures and analytical procedure

We used two classification accuracy measures as outcomes in our simulation study: the area under the receiver operating characteristic curve (AUC) and the sensitivity at a false-positive rate of 5%. The AUC statistic varies between 0 and 1 and can be interpreted as the probability that a randomly chosen carelessly responding individual has a higher score on a careless response index than a randomly chosen carefully responding individual (Pepe, 2000 ). Thus, an index can be considered as effective in detecting carelessly responding participants if its AUC is significantly larger than 0.5 (Lasko et al., 2005 ; Streiner & Cairney, 2007 ). In contrast, a slightly different picture is obtained by sensitivity at a false-positive rate of 5%. This measure indicates the percentage of careless response protocols that can be detected with a given index if we accept that 5% of the normal response protocols are falsely classified as being careless (e.g., Huang et al., 2012 , p. 106).
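Both outcome measures can be illustrated with a short, self-contained sketch. It assumes an index on which higher scores indicate more careless responding (for PR or RPR, where low values signal carelessness, the negated value could be used). The function names are hypothetical, and this is not a reimplementation of the ROC regression used in the study.

```python
def auc(careless_scores, careful_scores):
    """P(careless score > careful score), counting ties as one half."""
    wins = 0.0
    for c in careless_scores:
        for k in careful_scores:
            wins += 1.0 if c > k else 0.5 if c == k else 0.0
    return wins / (len(careless_scores) * len(careful_scores))

def sensitivity_at_fpr(careless_scores, careful_scores, fpr=0.05):
    """Share of careless protocols flagged when at most `fpr` of careful ones are."""
    ranked = sorted(careful_scores, reverse=True)
    allowed = int(fpr * len(ranked))   # number of careful protocols we may flag
    cutoff = ranked[allowed]           # flag scores strictly above this value
    flagged = sum(s > cutoff for s in careless_scores)
    return flagged / len(careless_scores), cutoff

# Toy scores (hypothetical): 100 careful and 4 careless protocols.
careful_scores = list(range(100))
careless_scores = [100, 101, 97, 50]
area = auc(careless_scores, careful_scores)
sens, cutoff = sensitivity_at_fpr(careless_scores, careful_scores)
```

An AUC of 0.5 corresponds to chance-level separation, whereas the sensitivity describes performance at one specific operating point of the ROC curve, which is why the two measures can paint slightly different pictures.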

We obtained these two outcome measures by running nonparametric receiver operating characteristic (ROC) regression models (using the Stata command rocreg with tie correction and bootstrapping) in which the PR measures were entered as independent variables and the protocol classification (careful vs. careless) as a dependent variable. The AUCs and sensitivities were then used in two different ways to examine the performance of PR and the three RPR versions. For one, we calculated averaged AUCs and sensitivities across the 100 replications of each condition. For another, we determined the effect size of our manipulated parameters on the AUCs and sensitivities. We therefore ran analysis of variance (ANOVA) models for each of the four PR measures and the two outcome measures, in which the manipulated parameters were entered as categorical independent variables and the AUC or the sensitivity as a dependent variable.
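The fully crossed design can be enumerated directly, which makes the number of conditions and replications explicit. The variable names below are illustrative only.

```python
from itertools import product

N_FACETS = (5, 15, 30)
ERROR = ("low", "mediocre", "high")
PATTERN = ("uniform", "invariant")
SEVERITY = ("full", "partial")
REPLICATIONS = 100

# 3 x 3 x 2 x 2 fully crossed conditions
conditions = list(product(N_FACETS, ERROR, PATTERN, SEVERITY))

# One (condition, replication) pair per simulated sample
runs = [(cond, rep) for cond in conditions for rep in range(REPLICATIONS)]
```

This yields 36 conditions and 3600 replications in total, the figures reported for the simulation study.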

The averaged condition-specific AUC values are displayed in Table 5, the averaged condition-specific sensitivity values in Table 6, and the averaged condition-specific cutoff values at a false-positive rate of 5% in Table 7. In addition, Table 8 shows the effect sizes of the manipulated factors and their interactions on the AUCs and sensitivities of the conventional PR measure and the three RPR versions. When analyzing the simulation data, we considered effects as potentially meaningful only if they were significant at p < .001 and had an effect size of η² > .01.
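As a reminder of what the effect size expresses, the sketch below computes a classical η² for a single factor as the between-group sum of squares divided by the total sum of squares. Whether the reported values are classical or partial η² is not stated in this section, so the sketch should be read as an illustration of the general idea only; the data are made up.

```python
def eta_squared(groups):
    """One-way eta squared: between-group SS divided by total SS."""
    values = [v for g in groups for v in g]
    grand = sum(values) / len(values)
    ss_total = sum((v - grand) ** 2 for v in values)
    ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
    return ss_between / ss_total

# Hypothetical AUC values from replications at two levels of one factor
example = eta_squared([[1, 2], [3, 4]])
```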

Impact of manipulated factors on the detection accuracy of the PR measures

As expected, the conventional PR measure and the three RPR versions were more accurate in detecting careless responding in conditions in which their computation was less affected by error (i.e., conditions in which more facets and facets with lower item-specific error were used) and in conditions in which the careless response patterns were easier to spot (i.e., conditions in which full careless responding and uniform random careless response patterns should be detected).

With η² values of .89 and above, the error per facet and the number of facets had the strongest main effects on the AUC and sensitivity of the four PR measures (see Table 8). Thus, higher AUC and sensitivity values could be obtained for the conventional PR measure and the RPR versions if more facets and facets with a low item-specific error were used for computation. With η² values of .42 and above, the severity of the careless responding in the response protocols (i.e., full vs. partial careless responding) also had a strong main effect on the AUC and sensitivity of the four PR measures (see Table 8). Thus, higher AUC and sensitivity values could be obtained for the conventional PR measure and the three RPR versions if full rather than partial careless responding was to be detected. Compared to the other manipulated factors, the smallest main effect, with η² values of .04 (see Table 8), was observed for type of careless responding (i.e., invariant vs. uniform random), indicating that invariant careless responding was harder to detect than uniform random responding.

In addition to these main effects, there were also some interesting two-way and three-way interaction effects between the manipulated factors (see Table 8 ). For instance, the error per facet interacted with the number of facets (with η 2 values that ranged from .10 to .63) and the severity of careless responding in the response protocols (with η 2 values that ranged from .05 to .15), and these three factors were all part of a three-way interaction (with η 2 values that ranged from .07 to .17). The three-way interaction indicated that the decrease in AUC and sensitivity that occurred when shifting from low to high error per facet was more pronounced when five instead of 30 facets were used, especially when partial instead of full careless responding should be detected. In Fig. 1 , this three-way interaction is displayed for the case when the AUC of the RPR with 25 resamples (RPR25) is used as outcome. However, this interaction pattern was comparable to the pattern that we observed when the sensitivity of RPR25 was used and when AUC or sensitivity values of other PR measures were used as outcome.

Figure 1

Three-way interaction between error per facet, number of facets, and severity of careless responding when predicting AUC of RPR25. Note . Results were based on 3600 replications, F (4, 3564) = 90.98, p  < .001, η 2  = .09. AUC of RPR25 = area under the receiver operating characteristic curve for the mean-based resampled personal reliability with 25 sets of scale half pairings (higher values indicate a higher probability of detecting careless responding); Low = low item error per facet with normally distributed errors with N (0, 0.5) for all items of each factor; Mediocre = mediocre item error per facet with normally distributed errors with N (0, 0.5) for one half of the items of each factor and normally distributed errors with N (0, 1.5) for the other half of the items of each factor; High = high item error per facet with normally distributed errors with N (0, 1.5) for all items of each factor; Full careless responding = all item responses of the response protocol are simulated to be given carelessly; Partial careless responding = 50% of the item responses of the response protocol were randomly selected and replaced with simulated careless responses

For another, the severity of careless responding in the response protocols interacted with the type of careless responding (with η 2 values that ranged from .06 to .11). This two-way interaction indicated the following: When all item responses of the response protocols were simulated to be given carelessly, invariant and uniform random careless responding could be detected equally well. However, when only 50% of the items of the response protocols were randomly selected and replaced with simulated careless responses, invariant careless responding was harder to detect than uniform random careless responding. In Fig. 2 , this two-way interaction is displayed for the case when the AUC of the RPR with 25 resamples (RPR25) is used as outcome. However, this interaction pattern was comparable to the pattern that we observed when the sensitivity of RPR25 was used and when AUC or sensitivity values of other PR measures were used as outcome.

Figure 2

Two-way interaction between severity of careless responding and type of careless responding when predicting AUC of RPR25. Note . Results were based on 3600 replications, F (4, 3564) = 455.77, p  < .001, η 2  = .11. AUC of RPR25 = area under the receiver operating characteristic curve for the mean-based resampled personal reliability with 25 sets of scale half pairings (higher values indicate a higher probability of detecting careless responding); Full careless responding = all item responses of the response protocol were simulated to be given carelessly; Partial careless responding = 50% of the item responses of the response protocol were randomly selected and replaced with simulated careless responses; Invariant careless responding = only the two response options “4” and “5” were randomly simulated for the careless response protocols with an equal probability of occurrence; Uniform careless responding = all six response options (1 to 6) in the simulated response protocols had an equal probability of occurrence

Comparison of PR and RPR across simulation conditions

To examine whether the conventional PR measure and the three RPR versions performed differently across the simulation conditions, we inspected the indices shown in Tables 5 and 6 and additionally ran ANOVA models with a data set in long format in which the four PR measures were represented as an additional predictor (with four levels) and their replication-specific AUC and sensitivity values as four separate row entries listed under a single AUC and a single sensitivity column.

With an η² value of .39 in the case of AUC and an η² value of .14 in the case of sensitivity, the PR predictor had a substantial impact on the two outcome measures. In line with our expectation, this main effect indicated that the three RPR measures generally performed better in detecting careless responding than the conventional PR measure.

Across all conditions, AUC improved from .789 to .833–.834, and sensitivity from .494 to .558–.561 when using an RPR version instead of PR. When using an RPR version instead of the conventional PR measure, the largest gain in the AUC (+.078–.079) occurred if five facets with mediocre item-specific error were used and invariant full careless responding was to be detected, and the smallest gain in AUC (+.001) occurred if 30 facets with low item-specific error were used and uniform random full careless responding was to be detected. Similarly, when using an RPR version instead of the conventional PR measure, the largest gain in sensitivity (+.178–.183) occurred if five facets with mediocre item-specific error were used and invariant full careless responding was to be detected, and the smallest gain in sensitivity (+.004) occurred if 30 facets with low item-specific error were used and invariant full careless responding was to be detected.

Bonferroni-corrected pairwise comparisons between the PR measures further revealed no significant differences in detection performance (i.e., AUC and sensitivity) between the three RPR versions. Thus, using 25 resamples for the RPR computation was sufficient to obtain the expected gain in detection accuracy over the conventional PR measure, and using more resamples (i.e., 50 or 100) was not associated with an additional improvement in AUC and sensitivity values. Footnote 5

In addition to these main effects, there was an interesting pattern of two-way and three-way interaction effects between the PR predictor and the other manipulated factors. The PR predictor interacted with the number of facets (η² = .05) and the error per facet (η² = .03) when predicting AUC, and these three factors were all part of a three-way interaction when predicting AUC (η² = .03) and sensitivity (η² = .03). This three-way interaction indicated that the gain in detection accuracy that occurred when changing from PR to RPR25 (for instance) was more pronounced when five instead of 30 facets were used, especially when the error per facet was low instead of high. In Fig. 3, this three-way interaction is displayed for the case when AUC was used as outcome. However, this interaction pattern was comparable to the one that we observed when sensitivity was used as outcome. Thus, our expectation that the RPR would outperform the PR to a larger degree in conditions in which the fluctuations in the individual sets of scale half pairings and their PR values are large was at least supported for a specific set of simulation conditions.

Figure 3

Three-way interaction between type of personal reliability measure, number of facets, and error per facet when predicting AUC. Note . The four personal reliability (PR) measures were calculated in each of the 3600 replications. To examine the interaction, a data set in long format was used in which the AUCs of the PR measures were listed as four separate rows under a single AUC column. Unadjusted test statistic is equal to F (12, 14,256) = 33.60, p  < .001. Adjusted test statistic that is based on cluster-robust SE with replication id as cluster variable is equal to F (12, 3599) = 94.86, p  < .001. The effect size for this interaction is equal to η 2  = .03 (using ordinary least-squares estimation). PR = conventional personal reliability (i.e., even–odd consistency); AUC = area under the receiver operating characteristic curve (higher values indicate a higher probability of detecting careless responding); RPR25 = resampled personal reliability that is based on 25 sets of scale half pairings; RPR50 = resampled personal reliability that is based on 50 sets of scale half pairings; RPR100 = resampled personal reliability that is based on 100 sets of scale half pairings; Low error per facet = normally distributed errors with N (0, 0.5) for all items of each factor; Mediocre error per facet = normally distributed errors with N (0, 0.5) for one half of the items of each factor and normally distributed errors with N (0, 1.5) for the other half of the items of each factor; High error per facet = normally distributed errors with N (0, 1.5) for all items of each factor

Reanalysis of real data

Study setting and participants

For our real data example, we reanalyzed the data from Goldammer et al.’s ( 2020 ) experimental study on careless responding. That study sample consisted of 359 German-speaking, predominantly male ( n  = 357; 99.4%) military conscripts who were on average 20 years ( SD  = 1.14) old. The conscripts were nested in 12 platoons, of which each had on average 29.92 members ( SD  = 8.49). Each platoon was led by one platoon leader, and at the time the study took place, the conscripts had been led by their platoon leader for about 9 to 10 weeks (Goldammer et al., 2020 ). Every conscript received 10 Swiss francs for participation.

Substantive study measures

The substantive questionnaire in Goldammer et al. ( 2020 ) contained six broad scales: three leader behavior measures (i.e., transformational, passive-avoidant, and authentic leadership), two relational correlates of leadership (leader–member exchange, organizational commitment), and one follower effectiveness criterion (organizational citizenship behavior).

Transformational leadership (TFL) was assessed with the German adaptation (Felfe, 2006 ) of the Multifactor Leadership Questionnaire (MLQ; Bass & Avolio, 1995 ). The German TFL scale included six facets, and each of these facets contained four items that were rated on a five-point Likert scale ranging from 1 = never to 5 = frequently, almost always . Passive-avoidant leadership (PAL) was assessed with the German adaptation of the two MLQ facets management-by-exception passive and laissez-faire (Bass & Avolio, 1995 ). Each of these two facets was measured with four items on a five-point Likert scale that ranged from 1 = never to 5 = frequently, almost always . Authentic leadership (AL) was assessed with the publisher’s (i.e., Mindgarden) German translation of the Authentic Leadership Questionnaire (Avolio et al., 2007 ). Of the four AL facets, the transparency facet was assessed by five items, the moral and self-awareness facets were each assessed by four items, and the balanced processing facet by three items. All AL items were rated on a five-point Likert scale that ranged from 1 = never to 5 = frequently, almost always . Leader–member exchange (LMX) was assessed with the German version (Paul & Schyns, 2002 ) of Liden and Maslyn’s ( 1998 ) multidimensional measure. LMX included four facets, and each facet was measured by three items that were rated on a five-point Likert scale that ranged from 1 = does not apply at all to 5 = applies completely .

Further, organizational commitment (OC) was assessed with the German adaptation (COBB; Felfe & Franke, 2012 ) of Meyer and Allen’s ( 1990 ) commitment measure. Of the three facets of OC, the affective and normative facet were each assessed by five items and the continuance facet by four items. All OC items were rated on a five-point Likert scale that ranged from 1 = does not apply at all to 5 = applies completely . The conscript’s organizational citizenship behavior (OCB) was assessed with the German adaptation (Staufenbiel & Hartz, 2000 ) of the OCB scale proposed by Podsakoff et al. ( 1990 ). OCB included four facets, and each facet was measured by five items that ranged from 1 = does not apply at all to 5 = applies completely . In sum, the questionnaire in Goldammer et al. ( 2020 ) included 23 facets that were measured by 94 items.

Experimental conditions and survey arrangement

In Goldammer et al. ( 2020 ), participants were randomly assigned to one of three response conditions: careful responding ( n  = 121), random careless responding ( n  = 119), and opposite careless responding ( n  = 119). The participants in the careful responding condition were instructed to complete all items accurately and attentively. The participants in the random responding condition were instructed to select any response option they wanted on 50% of the items in each scale (i.e., TFL, PAL, AL, LMX, OC, OCB). The participants in the opposite responding condition were instructed to select the opposite of the response option that would have actually applied to them on 50% of the items in each scale.

The questionnaire in Goldammer et al. ( 2020 ) was arranged in six randomly ordered scale-specific blocks; each scale-specific block was further divided into two survey pages. On the first survey page of each block, a random selection of 50% of the items of each scale was displayed. For this item selection, all participants were instructed to respond carefully. The remaining 50% of the scale-specific items were displayed on the following second survey page of each block, for which the participants received the condition-specific response instructions.

As we did for the simulation study, we calculated the conventional PR measure (i.e., even–odd consistency) and three RPR versions that were based on a different number of independently drawn resamples (i.e., 25, 50, 100).

We used the AUC and the sensitivity at a false-positive rate of 5% as outcome measures for the real data analyses. We obtained these two outcome measures by running nonparametric ROC regression models (using the Stata command rocreg with tie correction and bootstrapping) in which the PR measures were entered as independent variables and the condition assignment (careful vs. careless) as a dependent variable. We calculated these two outcome measures for each of our four PR measures across three facet levels: 6, 16, and 23. For the first facet level (i.e., 6), only the six TFL facets and their 24 items were used for the computation of the four PR measures and their AUCs and sensitivities. For the second facet level (i.e., 16), the 16 facets of the TFL, PAL, AL, and LMX scales and their 60 items were used for the computation of the four PR measures and their AUCs and sensitivities. For the third facet level (i.e., 23), all 23 facets and their 94 items were used for the computation of the four PR measures and their AUCs and sensitivities. This procedure allowed us to examine the performance of the PR measures across facet levels that were comparable to those evaluated in our simulation study.
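The facet and item counts behind the three facet levels follow directly from the scale descriptions above and can be tallied as a quick consistency check. The dictionary layout is just one convenient way to write the counts down.

```python
# Number of items in each facet, per scale, as described in the measures section
ITEMS_PER_FACET = {
    "TFL": [4] * 6,
    "PAL": [4, 4],
    "AL": [5, 4, 4, 3],
    "LMX": [3] * 4,
    "OC": [5, 5, 4],
    "OCB": [5] * 4,
}

# Scales entering the PR computation at each facet level
FACET_LEVELS = {
    6: ("TFL",),
    16: ("TFL", "PAL", "AL", "LMX"),
    23: ("TFL", "PAL", "AL", "LMX", "OC", "OCB"),
}

def tally(scales):
    """Return (number of facets, number of items) for the given scales."""
    facets = sum(len(ITEMS_PER_FACET[s]) for s in scales)
    items = sum(sum(ITEMS_PER_FACET[s]) for s in scales)
    return facets, items

totals = {level: tally(scales) for level, scales in FACET_LEVELS.items()}
```

The tally reproduces the 24, 60, and 94 items reported for the 6-, 16-, and 23-facet levels.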

Table 9 shows the AUCs and sensitivities of the four PR measures for each of the three facet “conditions.” The omnibus test of equality indicated a significant inequality between the AUCs of the four PR measures in the 16-facet condition, χ²(3) = 12.44, p = .006, and the 23-facet condition, χ²(3) = 12.30, p = .006. In both of these facet conditions, Bonferroni-corrected pairwise post hoc comparisons further revealed significant differences between the conventional PR measure and each of the three RPR versions. In addition, the pairwise comparisons showed no significant differences in AUC between the three RPR versions in either of these facet conditions. Even though the Bonferroni-corrected critical χ² value was not reached when testing the AUCs for equality in the six-facet condition, we observed the same pattern of AUC results as in the 16- and 23-facet conditions (i.e., comparable AUCs for the three RPR versions, which all tended to be higher than that of the conventional PR measure). In contrast to the omnibus tests for AUC equality, none of the three omnibus tests for sensitivity equality reached the Bonferroni-corrected critical χ² value of 11.74. Nevertheless, we observed a pattern comparable to that of the AUC results (i.e., comparable sensitivities for the three RPR versions, which all tended to be higher than that of the conventional PR measure).

The real data analyses therefore allowed us to obtain two insights. First, the increase in detection accuracy when shifting from the conventional PR measure to the RPR25 (for instance) did not seem to substantially decrease when using more facets for the PR calculation (e.g., when using 16 instead of six facets). Instead, it seemed that using RPR went along with a constant gain of AUC across the three examined facet levels. This result was contrary to our expectation but partially in line with the simulation results in which the supposed interaction effect (i.e., using RPR is associated with a larger gain in detection accuracy when conditions are rough) also only occurred for specific sets of conditions. The second and more important result, however, was that the RPR measures were generally more accurate than the conventional PR measure. This finding illustrated that RPR may be of greater utility not only when detecting computer-generated careless response patterns but also when human-generated careless response patterns need to be detected.

The aim of our present research was to examine whether RPR really outperforms the conventional PR or even–odd consistency in detecting careless responding, and under what conditions the potential gain in detection accuracy is the most pronounced. We therefore conducted a simulation study in which we evaluated the relative performance of the conventional PR and three RPR versions in detecting simulated careless response protocols across 36 conditions. In a second study, we examined the performance of PR and the three RPR versions when detecting human-generated careless response protocols by reanalyzing the data from Goldammer et al.’s ( 2020 ) experimental study on careless responding.

Our analyses show that RPR is under many conditions a significantly better careless response indicator than the conventional PR, no matter whether computer- or human-generated careless response patterns need to be detected. This result was in line with our expectation and confirms the as-yet untested proposition (e.g., Curran, 2016 , pp. 9–10; Ward & Meade, 2023 , p. 587) that calculating PR with different sets of scale halves and averaging the individual PRs to an overall measure (i.e., the RPR) results in a careless response indicator that is more accurate than the conventional PR measure, which is only based on a single set of scale half pairs—typically a set of even–odd scale half pairs.

When using RPR instead of PR, we expected the gain in detection accuracy to be larger under presumably “rough” conditions (i.e., only few facets with high item-specific error can be used when detecting invariant partial careless responding) than under presumably favorable conditions (i.e., many facets with low item-specific error can be used when detecting uniform random full careless responding). The only support for this interaction hypothesis is indicated by the three-way interaction in the simulation study between the PR predictor, the number of facets, and the error per facet. This result suggests that using RPR instead of PR is associated with a relatively constant gain in detection accuracy that only decreases under conditions in which PR and RPR reach the maximal level of detection accuracy (i.e., an AUC value of 1).

Recommendations

Across all conditions in the simulation study, the AUC increased from .789 to .833–.834 and the sensitivity from .494 to .558–.561 when using an RPR version instead of the conventional PR measure. Moreover, in all the simulation conditions that we examined, the averaged condition-specific detection accuracy of RPR versions never fell below the accuracy of the conventional PR measure. Thus, using an RPR version instead of PR only brings advantages. In the worst case, researchers obtain a PR measure that is less arbitrary in its computation and only as accurate as the conventional PR (i.e., even–odd consistency). In the best and more likely case, researchers obtain a PR measure that is less arbitrary in its computation and more accurate than the conventional PR measure. We therefore generally recommend using RPR instead of the conventional PR measure when screening questionnaire data for careless responding.
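To make the AUC values reported above concrete, the following minimal sketch computes an AUC for an index on which lower values indicate careless responding (as with PR and RPR). The function name and the toy values are illustrative assumptions, not taken from the study's code.

```python
import numpy as np

def auc_lower_is_careless(index_careful, index_careless):
    """Probability that a randomly drawn careless protocol has a lower
    index value than a randomly drawn careful protocol (ties count 0.5)."""
    careful = np.asarray(index_careful, dtype=float)[:, None]
    careless = np.asarray(index_careless, dtype=float)[None, :]
    wins = (careless < careful).mean()
    ties = (careless == careful).mean()
    return float(wins + 0.5 * ties)

# Toy values (hypothetical): careful respondents tend to have higher PR/RPR.
careful_values = [0.85, 0.90, 0.70, 0.95]
careless_values = [0.10, 0.40, 0.75, -0.20]
auc = auc_lower_is_careless(careful_values, careless_values)
```

An AUC of 1 means the index separates the two groups perfectly, and .5 means it does no better than chance.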

Our results also show that the three RPR versions examined, which were based on a different number of independently drawn resamples (i.e., 25, 50, 100), resulted in almost identical AUC and sensitivity values. Thus, using 25 resamples for the RPR computation was sufficient to obtain the expected gain in detection accuracy over the conventional PR measure, and using more resamples (i.e., 50 or 100) was not associated with any substantial further improvement in AUC and sensitivity values. We therefore recommend using 25 resamples (i.e., 25 different sets of scale half pairings) when calculating RPR. Furthermore, we recommend using the arithmetic mean for summarizing the individual PR values (i.e., calculating a mean-based RPR). As our supplementary analyses showed, this type of RPR calculation is generally more accurate than other types of RPR calculation that use different ways of summarizing the individual PR values (i.e., using the median or the standard deviation).
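The recommended calculation can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes that each facet's items are split at random into two halves, that the two sets of half scores are correlated within a person across facets, and that a Spearman-Brown correction (clipped to the range −1 to 1) is applied. The function names, the clipping, and the toy data are assumptions.

```python
import numpy as np

def personal_reliability(responses, facet_items, rng=None):
    """PR for one respondent: correlation of half-scale scores across facets.

    responses   : 1-D array with one respondent's item responses
    facet_items : list of index arrays, one per facet
    rng         : None -> even-odd split; numpy Generator -> random split
    """
    responses = np.asarray(responses, dtype=float)
    half_a, half_b = [], []
    for items in facet_items:
        items = np.asarray(items)
        if rng is not None:
            items = rng.permutation(items)  # random scale halves
        half_a.append(responses[items[0::2]].mean())
        half_b.append(responses[items[1::2]].mean())
    if np.std(half_a) == 0 or np.std(half_b) == 0:
        return float("nan")  # correlation undefined (e.g., straightlining)
    r = np.corrcoef(half_a, half_b)[0, 1]
    if r <= -1.0:
        return -1.0
    # Spearman-Brown step-up, clipped to [-1, 1] (assumed convention)
    return float(min(1.0, max(-1.0, 2 * r / (1 + r))))

def resampled_personal_reliability(responses, facet_items, n_resamples=25, seed=0):
    """Mean-based RPR: arithmetic mean of PR over random scale-half pairings."""
    rng = np.random.default_rng(seed)
    prs = np.array([personal_reliability(responses, facet_items, rng)
                    for _ in range(n_resamples)])
    prs = prs[~np.isnan(prs)]
    return float(prs.mean()) if prs.size else float("nan")

# Toy example (hypothetical): 5 facets with 4 items each.
facet_items = [np.arange(f * 4, f * 4 + 4) for f in range(5)]
careful = np.repeat([1.0, 2.0, 3.0, 4.0, 5.0], 4)  # perfectly consistent
pr = personal_reliability(careful, facet_items)     # even-odd PR
rpr = resampled_personal_reliability(careful, facet_items, n_resamples=25)
```

Passing `rng=None` reproduces the conventional even–odd split, so the same function covers both PR and the individual resampled PRs that are averaged into the RPR.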

Finally, our results show how researchers can improve the detection accuracy of RPR. First, researchers should use validated questionnaires (assuming that such questionnaires include items that are more reliable than those of unvalidated questionnaires) and/or questionnaires with 15 or more facets. Second, our supplementary analyses showed that higher detection accuracy of RPR can be obtained when items are presented in a random rather than a fixed construct-wise order, especially when partial invariant careless responding needs to be detected (Footnote 6). However, using longer and validated questionnaires and presenting their items in a random order may not be possible in every research context. Thus, whenever shorter or less validated questionnaires are used or a fixed construct-wise item presentation mode is chosen, we recommend the following procedure when using RPR for careless responding detection: First, instead of blindly relying on heuristics (e.g., screening RPR below .3; DeSimone et al., 2015 ; Zickar & Keith, 2023 ), researchers should apply (more conservative) cutoff values that match the measurement conditions of their study when screening the response protocols for careless responding according to the RPR values. For instance, to ensure a false-positive rate of 5% in our simulation, a cutoff value of .56 had to be used in the context of 15 facets with low item-specific error, but a cutoff value of −0.09 to −0.10 in the context of 15 facets with mediocre item-specific error (see Table 7 ). Second, researchers should use additional careless response indices that are less affected by the factors “number of facets” and “item-specific error per facet,” such as response time per item or Mahalanobis distance.
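One way to derive a condition-specific cutoff is sketched below. It assumes that RPR values of known careful protocols are available (for example, from a simulation that mirrors the study's measurement conditions) and that protocols with values below the cutoff are flagged. The function names and the toy data are hypothetical.

```python
import numpy as np

def cutoff_for_fpr(index_careful, fpr=0.05):
    """Cutoff such that roughly `fpr` of careful protocols fall below it."""
    return float(np.quantile(np.asarray(index_careful, dtype=float), fpr))

def sensitivity_at_cutoff(index_careless, cutoff):
    """Share of careless protocols flagged (index value below the cutoff)."""
    return float(np.mean(np.asarray(index_careless, dtype=float) < cutoff))

# Toy data (hypothetical): 101 evenly spaced RPR values for careful protocols.
careful_rpr = np.linspace(0.0, 1.0, 101)
careless_rpr = np.array([-0.2, 0.0, 0.3])
cutoff = cutoff_for_fpr(careful_rpr, fpr=0.05)
sensitivity = sensitivity_at_cutoff(careless_rpr, cutoff)
```

Because the cutoff is taken from the careful distribution itself, it shifts automatically when the number of facets or the item-specific error changes, which is the point of preferring it over a fixed heuristic.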

Limitations and future research directions

Our study has several limitations. First, the conclusions drawn based on our simulation results are valid only for the conditions that we examined. We manipulated number of facets, extent of item-specific error per facet, type of careless responding, and severity of careless responding in the careless response protocols, factors that we thought would most likely affect the detection accuracy of PR and RPR. However, other factors that we did not examine may have a potential effect on the detection performance of PR and RPR, such as number of response options (e.g., 4, 5, 6), number of items per facet (e.g., 4, 6, 10), type of item keying (i.e., unidirectional vs. bidirectional), and type of item distribution (e.g., normal vs. skewed). Furthermore, we did not examine how challenges that are typically encountered in applied research settings, such as missing data, lack of unidimensionality of the scales and facets used, or differential item functioning, affect the performance of the PR and RPR (Footnote 7). Thus, future simulation studies should extend our work and examine whether factors that we held constant across our conditions have an effect on the detection accuracy of PR and RPR. In addition, it would be worthwhile to examine whether adaptations of RPR to the context of a single unidimensional scale also turn out to be useful in detecting careless responding (Footnote 8).

Second, we used latent factor models for the data generation because this model type allowed us to readily manipulate the factors that we examined in our study. However, our simulation results may not hold if other model types are used for data generation, such as models that are based on item response theory (IRT). Future simulation studies should examine whether our results can be replicated when IRT-based models, such as Rasch rating scale models (Andrich, 1978 ) or graded response models (Samejima, 1969 ), are used for data generation.

Third, we focused on comparing the detection performance of PR with that of the three RPR versions. With this focus, however, we left the question unanswered as to how RPR performs compared to other established careless response indices. Future studies could therefore look at how the RPR performs compared to average response time per item and Mahalanobis distance (e.g., Goldammer et al., 2020 ) and explore which of these indices best complements RPR.

Screening questionnaire data for careless responding is important to ensure the credibility of study findings. However, screening will only be effective if researchers apply the most accurate indices and know under what conditions the applied indices perform favorably and less favorably. In our two studies, we examined the detection accuracy of RPR, which turns out to perform even better than the well-performing conventional PR. Further, our studies indicate that RPR performs the best when 15 or more facets with low item-specific error per facet are used, and the worst (but nevertheless acceptably) when only five facets with high item-specific error per facet can be used for calculation. These valuable insights may help applied researchers in more effective design of their careless response screening.

Data availability

The code to run the simulation and the data of the two studies are available under the following link: https://polybox.ethz.ch/index.php/s/OUeUYS9ZHkyu7YB

The data-generating process (DGP) for this careful response protocol is identical to the DGP used in some conditions of the simulation study (i.e., latent domain intercorrelation: .3; unstandardized item loadings: 1; unstandardized random normal error variance: .50).
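A heavily simplified sketch of such a data-generating process is given below. It keeps the three stated parameters (latent intercorrelation .3, loadings of 1, error variance .50) but collapses the domain and facet structure into a single level of correlated latent variables, assumes unit latent variances, and leaves the responses continuous. The function name, the number of facets, and the number of items per facet are assumptions.

```python
import numpy as np

def simulate_careful_protocols(n_persons, n_facets=5, items_per_facet=4,
                               rho=0.3, loading=1.0, error_var=0.5, seed=0):
    """Continuous item responses from a simple correlated-factor model."""
    rng = np.random.default_rng(seed)
    # Latent covariance: unit variances (assumed), intercorrelation rho
    cov = np.full((n_facets, n_facets), rho)
    np.fill_diagonal(cov, 1.0)
    eta = rng.multivariate_normal(np.zeros(n_facets), cov, size=n_persons)
    # Each facet's latent score drives its own block of items
    eta_items = np.repeat(eta, items_per_facet, axis=1)
    noise = rng.normal(0.0, np.sqrt(error_var), size=eta_items.shape)
    return loading * eta_items + noise

data = simulate_careful_protocols(n_persons=200)
```

Items are ordered facet by facet, so columns 0 to 3 belong to the first facet, columns 4 to 7 to the second, and so on.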

A reviewer wondered why we had chosen the arithmetic mean for summarizing the different PRs and not another measure of central tendency. Because the literature offers little guidance in this regard, using the arithmetic mean was our best guess. However, we viewed the reviewer’s point as legitimate and therefore ran supplementary analyses in which we examined whether using a median-based RPR would alter the results. In these analyses, however, the mean-based RPR turned out to be a slightly better index for careless responding than the median-based RPR (see supplementary material, Tables S1 – S3 ). Across all conditions in the simulation study, the AUC increased from .824–.827 to .833–.834 and the sensitivity from .552–.557 to .558–.561 when using a mean-based RPR version instead of a median-based RPR version. Moreover, in all the simulation conditions that we examined, the averaged condition-specific detection accuracy of mean-based RPR versions never fell below the accuracy of their median-based RPR counterparts. Thus, we recommend using a mean-based rather than a median-based RPR version for detecting careless responding.

Based on the numerical example in Table 3 , a reviewer wondered whether the PR values of careful respondents would generally be in a tighter range than those of careless respondents and thus could be an informative careless response index as well, potentially an even more informative index than the (arithmetic) mean-based RPR. To examine this question, we ran supplementary analyses in which we examined whether using the standard deviation of PR values would alter the results. In these supplementary analyses, the standard deviation of PR values turned out to perform satisfactorily in most conditions (i.e., better than chance), which confirmed the reviewer’s hypothesis that the PR values of careful respondents are in a tighter range than those of careless respondents. However, the detection accuracies of the standard deviation of the PR values almost never reached those of the mean-based RPRs (see supplementary material, Tables S4 – S6 ). Across all conditions in the simulation study, the AUC increased from .802–.811 to .833–.834 and the sensitivity from .518–.534 to .558–.561 when using a mean-based RPR version instead of a version that is based on the standard deviation of the PR values. Moreover, in all but two simulation conditions that we examined, the averaged condition-specific detection accuracy of mean-based RPR versions never fell below the accuracy of their “standard deviation counterparts.” Thus, the mean-based RPR may be considered more informative than the standard deviation of PR values and should therefore be preferred for careless responding detection.

A reviewer wondered how the random selection and replacement of items with careless responses is in line with the typical phenomenon of back careless responding, where the careless responses are concentrated at the end of the questionnaire (Yu & Cheng, 2019 ). We were aware of the research that shows that partial careless responding most likely occurs at the end of the survey, and we also aimed to represent this case with our partial careless responding condition. However, we had a specific questionnaire setup in mind when programming the partial careless responding condition in our simulation: the case in which the order of item display is random. This item presentation mode seems to be a legitimate choice for non-cognitive items (Bandalos, 2018 , p. 222) and may also come with the advantage of a better fitting confirmatory measurement model, due to reduced residual correlations among the items (Bandalos, 2021 ). Thus, in our partial careless responding conditions, those items that were randomly selected and replaced by careless response patterns can be thought of as being placed at the end of the survey. Furthermore, the case of random item order means that each item of each construct has an equal probability of being placed in the last half of the questionnaire and thus an equal probability of being affected by back careless responding.

Nevertheless, there may be cases where researchers may wish to display the items in a fixed construct-wise order. In such a case, only constructs and their items that are placed in the last half of the questionnaire will be affected by careless responding. Eventually, we realized that such a fixed construct-wise item presentation mode may also have an effect on the performance of the PR and RPR. We therefore ran a supplementary simulation study with the identical settings as outlined in the paper but changed the item selection procedure for the partial careless responding conditions (i.e., the last 50% of items were replaced in each careless response protocol). For most conditions, the detection accuracies obtained from the supplementary simulation study were comparable to those that we obtained from the simulation study reported in the paper. However, in conditions where partial invariant careless responding had to be detected, the performance of the PR and RPRs differed substantially across the two simulations. Whereas the PR and RPRs had satisfactory to good detection accuracies for detecting partial invariant careless responding when items that were randomly selected were replaced by careless responses (i.e., representing back careless responding in a questionnaire in which items are randomly displayed), they performed completely unsatisfactorily (i.e., they were more indicative of careful responding) when the last 50% of items were replaced with careless responses (i.e., representing back careless responding in a questionnaire with a fixed construct-wise item order) (see supplementary material, Tables S7 – S15 ). Thus, whenever possible, researchers should choose a random item presentation mode over a fixed construct-wise item presentation mode, because higher detection accuracies can be achieved when using PR and RPRs.

As a robustness check, we reran the analyses for the comparisons between PR and RPR with cluster-robust standard error estimation (using replication id as cluster variable in linear regression models). In these additional analyses, the test statistics of nearly all pairwise comparisons among the RPR versions reached statistical significance, even though their AUC and sensitivity values only differed at the third decimal place. However, the effect sizes that were associated with each of the pairwise comparisons confirmed our conclusion that the differences between the RPR versions and the conventional PR (e.g., RPR25 vs. PR) may be considered practically meaningful and those between the different RPR versions (e.g., RPR100 vs. RPR25) negligible (Cohen’s d for AUC: RPRs vs. PR = 0.33–0.34, RPRs vs. RPRs = 0.007–0.010; Cohen’s d for sensitivity: RPRs vs. PR = 0.21–0.23, RPRs vs. RPRs = 0.009–0.011).

According to Goldammer et al. (2024), another approach for improving the detection accuracy of RPR would be to use scales that contain bidirectionally keyed items (i.e., positively and negatively worded items) instead of scales that are only composed of unidirectionally keyed items (i.e., only positively or only negatively worded items).

We expect PR and RPR to be more accurate in detecting careless responding if scales are used for which unidimensionality and the absence of differential item functioning (across subgroups) can be assumed. However, if researchers doubt the quality of their scales, they should explicitly test for these two preconditions.

One of the following three strategies might be used to adapt PR and RPR to the context of a single unidimensional scale. First, researchers could split the items of the scale into three or more pairs of item blocks and calculate the within-person correlation across these pairs of item blocks. This version may be repeated across a series of randomly rearranged sets of pairs of item blocks and is most in line with the original spirit of the resampled personal reliability. However, a sufficient number of items is needed (e.g., 12 items or more) to obtain block scores that are really different from the item scores. Second, researchers could calculate the within-person correlation across each item pair combination, which would result in a statistic that is very similar to the semantic synonym index (Curran, 2016 , p. 9). Third, researchers could calculate the within-person correlation across each item–rest combination. As this approach examines whether item responses are in line with the score of the remaining items, it may have certain similarities with the Guttman error index (see Niessen et al., 2016 , pp. 4, 10).

Andrich, D. (1978). A rating formulation for ordered response categories. Psychometrika, 43 (4), 561–573. https://doi.org/10.1007/bf02293814

Arias, V. B., Garrido, L. E., Jenaro, C., Martínez-Molina, A., & Arias, B. (2020). A little garbage in, lots of garbage out: Assessing the impact of careless responding in personality survey data. Behavior Research Methods, 52 , 2489–2505. https://doi.org/10.3758/s13428-020-01401-8

Avolio, B. J., Gardner, W. L., & Walumbwa, F. O. (2007). Authentic Leadership Questionnaire . Mind Garden.

Bandalos, D. L. (2018). Measurement theory and applications for the social sciences . Guilford Press.

Bandalos, D. L. (2021). Item meaning and order as causes of correlated residuals in confirmatory factor analysis. Structural Equation Modeling, 28 (6), 903–913. https://doi.org/10.1080/10705511.2021.1916395

Bass, B. M., & Avolio, B. J. (1995). MLQ Multifactor Leadership Questionnaire: Technical report . Mindgarden.

Bowling, N. A., Gibson, A. M., Houpt, J. W., & Brower, C. K. (2021). Will the questions ever end? Person-level increases in careless responding during questionnaire completion. Organizational Research Methods, 24 (4), 718–738. https://doi.org/10.1177/1094428120947794

Brown, W. (1910). Some experimental results in the correlation of mental abilities. British Journal of Psychology, 3 (3), 296–322. https://doi.org/10.1111/j.2044-8295.1910.tb00207.x

Brown, T. A. (2015). Confirmatory factor analysis for applied research (2nd ed.). Guilford Press.

Credé, M. (2010). Random responding as a threat to the validity of effect size estimates in correlational research. Educational and Psychological Measurement, 70 (4), 596–612. https://doi.org/10.1177/0013164410366686

Curran, P. G. (2016). Methods for the detection of carelessly invalid responses in survey data. Journal of Experimental Social Psychology, 66 , 4–19. https://doi.org/10.1016/j.jesp.2015.07.006

DeSimone, J. A., Harms, P. D., & DeSimone, A. J. (2015). Best practice recommendations for data screening. Journal of Organizational Behavior, 36 (2), 171–181. https://doi.org/10.1002/job.1962

Donnellan, M. B., Oswald, F. L., Baird, B. M., & Lucas, R. E. (2006). The mini-IPIP scales: Tiny-yet-effective measures of the Big Five Factors of Personality. Psychological Assessment, 18 (2), 192–203. https://doi.org/10.1037/1040-3590.18.2.192

Felfe, J. (2006). Validierung einer deutschen Version des “Multifactor Leadership Questionnaire” (MLQ Form 5 x Short) von Bass und Avolio (1995) [Validation of a German version of the “Multifactor Leadership Questionnaire” (MLQ Form 5 x Short) by Bass and Avolio (1995)]. Zeitschrift für Arbeits- und Organisationspsychologie, 50 (2), 61–78. https://doi.org/10.1026/0932-4089.50.2.61

Felfe, J., & Franke, F. (2012). Commit. Verfahren zur Erfassung von Commitment gegenüber der Organisation, dem Beruf und der Beschäftigungsform [Commit: Procedure for measuring commitment to the organization, profession and form of employment] . Verlag Hans Huber.

Goldammer, P., Annen, H., Stöckli, P. L., & Jonas, K. (2020). Careless responding in questionnaire measures: Detection, impact, and remedies. The Leadership Quarterly, 31 (4), 101384. https://doi.org/10.1016/j.leaqua.2020.101384

Goldammer, P., Stöckli, P. L., Escher, Y. A., Annen, H., Jonas, K., & Antonakis, J. (2024). Careless responding detection revisited: Accuracy of direct and indirect measures. Behavior Research Methods, 1–28. https://doi.org/10.3758/s13428-024-02484-3

Hong, M., Steedle, J. T., & Cheng, Y. (2020). Methods of detecting insufficient effort responding: Comparisons and practical recommendations. Educational and Psychological Measurement, 80 (2), 312–345. https://doi.org/10.1177/0013164419865316

Huang, J. L., Curran, P. G., Keeney, J., Poposki, E. M., & DeShon, R. P. (2012). Detecting and deterring insufficient effort responding to surveys. Journal of Business and Psychology, 27 (1), 99–114. https://doi.org/10.1007/s10869-011-9231-8

Jackson, D. N. (1976). The appraisal of personal reliability [Paper presentation]. Meetings of the Society of Multivariate Experimental Psychology, University Park

Johnson, J. A. (2014). Measuring thirty facets of the Five Factor Model with a 120-item public domain inventory: Development of the IPIP-NEO-120. Journal of Research in Personality, 51 , 78–89. https://doi.org/10.1016/j.jrp.2014.05.003

Jöreskog, K. G. (1969). A general approach to confirmatory maximum likelihood factor analysis. Psychometrika, 34 (2), 183–202. https://doi.org/10.1007/bf02289343

Kam, C. C. S., & Meyer, J. P. (2015). How careless responding and acquiescence response bias can influence construct dimensionality: The case of job satisfaction. Organizational Research Methods, 18 (3), 512–541. https://doi.org/10.1177/1094428115571894

Kline, R. B. (2016). Principles and practice of structural equation modeling (4th ed.). Guilford Press.

Lasko, T. A., Bhagwat, J. G., Zou, K. H., & Ohno-Machado, L. (2005). The use of receiver operating characteristic curves in biomedical informatics. Journal of Biomedical Informatics, 38 (5), 404–415. https://doi.org/10.1016/j.jbi.2005.02.008

Liden, R. C., & Maslyn, J. M. (1998). Multidimensionality of leader-member exchange: An empirical assessment through scale development. Journal of Management, 24 (1), 43–72. https://doi.org/10.1177/014920639802400105

Meade, A. W., & Craig, S. B. (2012). Identifying careless responses in survey data. Psychological Methods, 17 (3), 437–455. https://doi.org/10.1037/a0028085

Meyer, J. P., & Allen, N. J. (1990). The measurement and antecedents of affective, continuance and normative commitment to the organization. Journal of Occupational Psychology, 63 (1), 1–18. https://doi.org/10.1111/j.2044-8325.1990.tb00506.x

Niessen, A. S. M., Meijer, R. R., & Tendeiro, J. N. (2016). Detecting careless respondents in web-based questionnaires: Which method to use? Journal of Research in Personality, 63 , 1–11. https://doi.org/10.1016/j.jrp.2016.04.010

Paul, T., & Schyns, B. (2002). Deutsche Leader-Member Exchange Skala (LMX MDM). Zusammenstellung sozialwissenschaftlicher Items und Skalen (ZIS) . https://doi.org/10.6102/zis25

Pepe, M. S. (2000). Receiver operating characteristic methodology. Journal of the American Statistical Association, 95 (449), 308–311. https://doi.org/10.1080/01621459.2000.10473930

Podsakoff, P. M., MacKenzie, S. B., Moorman, R. H., & Fetter, R. (1990). Transformational leader behaviors and their effects on followers’ trust in leader, satisfaction, and organizational citizenship behaviors. Leadership Quarterly, 1 (2), 107–142. https://doi.org/10.1016/1048-9843(90)90009-7

Samejima, F. (1969). Estimation of latent ability using a response pattern of graded scores (Psychometric Monograph No. 17) . Psychometric Society.

Soto, C. J., & John, O. P. (2017). The next Big Five Inventory (BFI-2): Developing and assessing a hierarchical model with 15 facets to enhance bandwidth, fidelity, and predictive power. Journal of Personality and Social Psychology, 113 (1), 117–143. https://doi.org/10.1037/pspp0000096

Spearman, C. (1910). Correlation calculated from faulty data. British Journal of Psychology, 3 (3), 271–295. https://doi.org/10.1111/j.2044-8295.1910.tb00206.x

StataCorp. (2023). Stata Statistical Software: Release 18 . StataCorp LLC.

Staufenbiel, T., & Hartz, C. (2000). Organizational citizenship behavior: Entwicklung und erste Validierung eines Meßinstruments [Organizational citizenship behavior: Development and first validation of a measurement tool]. Diagnostica, 46 (2), 73–83. https://doi.org/10.1026/0012-1924.46.2.73

Streiner, D. L., & Cairney, J. (2007). What’s under the ROC? An introduction to receiver operating characteristics curves. Canadian Journal of Psychiatry, 52 (2), 121–128. https://doi.org/10.1177/070674370705200210

Ward, M. K., & Meade, A. W. (2023). Dealing with careless responding in survey data: Prevention, identification, and recommended best practices. Annual Review of Psychology, 74 (1), 577–596. https://doi.org/10.1146/annurev-psych-040422-045007

Wind, S., & Wang, Y. (2023). Using Mokken scaling techniques to explore carelessness in survey research. Behavior Research Methods, 55 (7), 3370–3415.

Woods, C. M. (2006). Careless responding to reverse-worded items: Implications for confirmatory factor analysis. Journal of Psychopathology and Behavioral Assessment, 28 (3), 186–191. https://doi.org/10.1007/s10862-005-9004-7

Yu, X., & Cheng, Y. (2019). A change-point analysis procedure based on weighted residuals to detect back random responding. Psychological Methods, 24 (5), 658–674. https://doi.org/10.1037/met0000212

Zickar, M. J., & Keith, M. G. (2023). Innovations in sampling: improving the appropriateness and quality of samples in organizational research. Annual Review of Organizational Psychology and Organizational Behavior, 10 (1), 315–337. https://doi.org/10.1146/annurev-orgpsych-120920-052946

No funding was received for conducting these studies.

Author information

Authors and affiliations.

Military Academy at ETH Zurich, Birmensdorf, Switzerland

Philippe Goldammer, Peter Lucas Stöckli & Hubert Annen

Department of Psychology, University of Zurich, Zurich, Switzerland

Annika Schmitz-Wilhelmy

Corresponding author

Correspondence to Philippe Goldammer .

Ethics declarations

Conflicts of interest.

The authors have no relevant financial or non-financial interests to disclose.

Ethics approval

Because no new data were collected, no ethical approval was required according to the ethical regulations of the Faculty of Philosophy of the University of Zurich.

Consent to participate

Not applicable.

Consent for publication

Transparency.

The studies’ designs, hypotheses, and analyses were not preregistered.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (DOCX 195 KB)

Rights and permissions.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

Goldammer, P., Stöckli, P.L., Annen, H. et al. A comparison of conventional and resampled personal reliability in detecting careless responding. Behav Res (2024). https://doi.org/10.3758/s13428-024-02506-0

Accepted : 25 August 2024

Published : 16 September 2024

DOI : https://doi.org/10.3758/s13428-024-02506-0

  • Careless responding detection
  • Even–odd consistency

Experimental Design: Types, Examples & Methods

Saul McLeod, PhD

Editor-in-Chief for Simply Psychology

BSc (Hons) Psychology, MRes, PhD, University of Manchester

Saul McLeod, PhD, is a qualified psychology teacher with over 18 years of experience in further and higher education. He has been published in peer-reviewed journals, including the Journal of Clinical Psychology.

Olivia Guy-Evans, MSc

Associate Editor for Simply Psychology

BSc (Hons) Psychology, MSc Psychology of Education

Olivia Guy-Evans is a writer and associate editor for Simply Psychology. She has previously worked in healthcare and educational sectors.

Experimental design refers to how participants are allocated to different groups in an experiment. Types of design include repeated measures, independent groups, and matched pairs designs.

Probably the most common way to design an experiment in psychology is to divide the participants into two groups, the experimental group and the control group, and then introduce a change to the experimental group, not the control group.

The researcher must decide how to allocate the sample to the different experimental conditions. For example, if there are 10 participants, will all 10 take part in both conditions (i.e., repeated measures), or will the participants be split in half and take part in only one condition each?

Three types of experimental designs are commonly used:

1. Independent Measures

Independent measures design, also known as between-groups design, is an experimental design where different participants are used in each condition of the independent variable. This means that each condition of the experiment includes a different group of participants.

Allocation should be random, so that each participant has an equal chance of being assigned to either group.

Independent measures involve using two separate groups of participants, one in each condition.

  • Con : More people are needed than with the repeated measures design (i.e., more time-consuming).
  • Pro : Avoids order effects (such as practice or fatigue) as people participate in one condition only.  If a person is involved in several conditions, they may become bored, tired, and fed up by the time they come to the second condition or become wise to the requirements of the experiment!
  • Con : Differences between participants in the groups may affect results, for example, variations in age, gender, or social background.  These differences are known as participant variables (i.e., a type of extraneous variable ).
  • Control : After the participants have been recruited, they should be randomly assigned to their groups. This should ensure the groups are similar, on average (reducing participant variables).
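Random allocation for an independent measures design can be sketched in a few lines. The function name, the group labels, and the participant IDs below are illustrative assumptions.

```python
import random

def randomly_allocate(participants, seed=None):
    """Shuffle the sample, then split it into two equally sized groups."""
    rng = random.Random(seed)  # a seed makes the allocation reproducible
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"experimental": shuffled[:half], "control": shuffled[half:]}

# Ten hypothetical participant IDs, five allocated to each condition
groups = randomly_allocate(range(1, 11), seed=1)
```

Shuffling before splitting gives every participant the same chance of ending up in either group, which is what keeps participant variables balanced on average.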

2. Repeated Measures Design

Repeated Measures design is an experimental design where the same participants participate in each independent variable condition.  This means that each experiment condition includes the same group of participants.

Repeated Measures design is also known as within-groups or within-subjects design.

  • Pro : As the same participants are used in each condition, participant variables (i.e., individual differences) are reduced.
  • Con : There may be order effects. Order effects refer to the order of the conditions affecting the participants’ behavior.  Performance in the second condition may be better because the participants know what to do (i.e., practice effect).  Or their performance might be worse in the second condition because they are tired (i.e., fatigue effect). This limitation can be controlled using counterbalancing.
  • Pro : Fewer people are needed as they participate in all conditions (i.e., saves time).
  • Control : To combat order effects, the researcher counterbalances the order of the conditions, alternating the order in which participants perform the different conditions of an experiment.

Counterbalancing

Suppose we used a repeated measures design in which all of the participants first learned words in “loud noise” and then learned them in “no noise.”

We expect the participants to learn better in “no noise” because of order effects, such as practice. However, a researcher can control for order effects using counterbalancing.

The sample would be split into two groups, and the two conditions labelled experimental (A) and control (B). Group 1 does ‘A’ then ‘B,’ and group 2 does ‘B’ then ‘A.’ This is to control for order effects.

Although order effects occur for each participant, they balance each other out in the results because they occur equally in both groups.
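
The counterbalancing procedure described above can be sketched as follows. This is a minimal illustration, assuming two conditions labelled A and B and an even number of participants; the function name `counterbalance` and the participant labels are hypothetical.

```python
import random

def counterbalance(participants, seed=None):
    """Give half the sample the order A then B, and the other half B then A.

    Hypothetical helper for illustration; assumes two conditions.
    """
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)  # decide at random who goes in which half
    half = len(shuffled) // 2
    orders = {}
    for person in shuffled[:half]:
        orders[person] = ("A", "B")  # group 1: A first
    for person in shuffled[half:]:
        orders[person] = ("B", "A")  # group 2: B first
    return orders

# Made-up participant labels.
orders = counterbalance([f"P{n}" for n in range(1, 11)], seed=1)
for person, order in sorted(orders.items()):
    print(person, "does", order[0], "then", order[1])
```

Each condition comes first for half of the sample and second for the other half, so any practice or fatigue effect contributes equally to both conditions when the results are compared.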

3. Matched Pairs Design

A matched pairs design is an experimental design where pairs of participants are matched in terms of key variables, such as age or socioeconomic status. One member of each pair is then randomly assigned to the experimental group and the other member to the control group.

  • Con : If one participant drops out, the data from both members of the pair are lost.
  • Pro : Reduces participant variables because the researcher has tried to pair up the participants so that each condition has people with similar abilities and characteristics.
  • Con : Very time-consuming trying to find closely matched pairs.
  • Pro : It avoids order effects, so counterbalancing is not necessary.
  • Con : Impossible to match people exactly unless they are identical twins!
  • Control : Members of each pair should be randomly assigned to conditions. However, this does not solve all these problems.
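
The matching and within-pair random assignment described above can be sketched as below. This is an illustrative sketch under stated assumptions: participants are matched on a single numeric score (for example, a pre-test result), neighbours in the ranked list form a pair, and the function name `matched_pairs` and all scores are made up for the example.

```python
import random

def matched_pairs(scores, seed=None):
    """Pair participants with adjacent scores, then split each pair at random.

    scores maps a participant label to a value on the matching variable.
    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    ranked = sorted(scores, key=scores.get)  # lowest to highest score
    experimental, control = [], []
    # Step through the ranking two at a time; with an odd-sized sample
    # the final participant is left unpaired and is not assigned.
    for i in range(0, len(ranked) - 1, 2):
        pair = [ranked[i], ranked[i + 1]]
        rng.shuffle(pair)  # chance decides which member gets which condition
        experimental.append(pair[0])
        control.append(pair[1])
    return experimental, control

# Made-up scores on the matching variable.
scores = {"P1": 12, "P2": 30, "P3": 14, "P4": 28, "P5": 21, "P6": 20}
experimental, control = matched_pairs(scores, seed=3)
for e, c in zip(experimental, control):
    print(f"pair: {e} (experimental, {scores[e]}) and {c} (control, {scores[c]})")
```

Matching on one score is a simplification. In practice researchers may match on several variables at once, which is part of why finding closely matched pairs is so time-consuming.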

Experimental design refers to how participants are allocated to an experiment’s different conditions (or IV levels). There are three types:

1. Independent measures / between-groups : Different participants are used in each condition of the independent variable.

2. Repeated measures / within-groups : The same participants take part in each condition of the independent variable.

3. Matched pairs : Each condition uses different participants, but they are matched in terms of important characteristics, e.g., gender, age, intelligence, etc.

Learning Check

Read about each of the experiments below. For each experiment, identify (1) which experimental design was used; and (2) why the researcher might have used that design.

1. To compare the effectiveness of two different types of therapy for depression, depressed patients were assigned to receive either cognitive therapy or behavior therapy for a 12-week period.

The researchers attempted to ensure that the patients in the two groups had similar severity of depressed symptoms by administering a standardized test of depression to each participant, then pairing them according to the severity of their symptoms.

2. To assess the difference in reading comprehension between 7- and 9-year-olds, a researcher recruited each group from a local primary school. They were given the same passage of text to read and then asked a series of questions to assess their understanding.

3. To assess the effectiveness of two different ways of teaching reading, a group of 5-year-olds was recruited from a primary school. Their level of reading ability was assessed, and then they were taught using scheme one for 20 weeks.

At the end of this period, their reading was reassessed, and a reading improvement score was calculated. They were then taught using scheme two for a further 20 weeks, and another reading improvement score for this period was calculated. The reading improvement scores for each child were then compared.

4. To assess the effect of organization on recall, a researcher randomly assigned student volunteers to two conditions.

Condition one attempted to recall a list of words that were organized into meaningful categories; condition two attempted to recall the same words, randomly grouped on the page.

Experiment Terminology

Ecological validity

The degree to which an investigation represents real-life experiences.

Experimenter effects

These are the ways that the experimenter can accidentally influence the participant through their appearance or behavior.

Demand characteristics

The clues in an experiment that lead the participants to think they know what the researcher is looking for (e.g., the experimenter’s body language).

Independent variable (IV)

The variable the experimenter manipulates (i.e., changes), which is assumed to have a direct effect on the dependent variable.

Dependent variable (DV)

The variable the experimenter measures. This is the outcome (i.e., the result) of a study.

Extraneous variables (EV)

All variables that are not the independent variable but could affect the results (DV) of the experiment. Extraneous variables should be controlled where possible.

Confounding variables

Variable(s) that have affected the results (DV), apart from the IV. A confounding variable could be an extraneous variable that has not been controlled.

Random Allocation

Randomly allocating participants to independent variable conditions means that all participants should have an equal chance of taking part in each condition.

The purpose of random allocation is to avoid bias in how the experiment is carried out and to limit the effects of participant variables.

Order effects

Changes in participants’ performance due to their repeating the same or similar test more than once. Examples of order effects include:

(i) practice effect: an improvement in performance on a task due to repetition, for example, because of familiarity with the task;

(ii) fatigue effect: a decrease in performance of a task due to repetition, for example, because of boredom or tiredness.
