Introduction to Statistics

Chapter 7: Introduction to Hypothesis Testing

7.1 Chapter Overview

In this chapter, we continue our discussion of statistical inference by introducing hypothesis testing. In hypothesis testing, we take a more active approach to our data by asking questions about population parameters and developing a framework to answer those questions. We will root this discussion in confidence intervals before learning about several other approaches to hypothesis testing.

Chapter Learning Outcomes/Objectives

  • Conduct a hypothesis test for a mean using confidence intervals.
  • Conduct a hypothesis test for a mean using the critical value approach.
  • Conduct a hypothesis test for a mean using the p-value approach.

R Objectives

  • Generate hypothesis tests for a mean.
  • Interpret R output for tests of a mean.

This chapter’s outcomes correspond to course outcomes (6) “apply statistical inference techniques of parameter estimation, such as point estimation and confidence interval estimation” and (7) “apply techniques of testing various statistical hypotheses concerning population parameters.”

7.2 Logic of Hypothesis Testing

This section is framed in terms of questions about a population mean \(\mu\) , but the same logic applies to \(p\) (and other population parameters).

One of our goals with statistical inference is to make decisions or judgements about the value of a parameter. A confidence interval is a good starting point, but we might also want to ask questions like

  • Do cans of soda actually contain 12 oz?
  • Is Medicine A better than Medicine B?

A hypothesis is a statement that something is true. A hypothesis test involves two (competing) hypotheses:

  • The null hypothesis, denoted \(H_0\), is the hypothesis to be tested. This is the “default” assumption.
  • The alternative hypothesis, denoted \(H_A\), is the alternative to the null.

Note that the subscript 0 is “nought” (pronounced “not”). A hypothesis test helps us decide whether the null hypothesis should be rejected in favor of the alternative.

Example : Cans of soda are labeled with “12 FL OZ”. Is this accurate? The default, or uninteresting, assumption is that cans of soda contain 12 oz. \(H_0\) : the mean volume of soda in a can is 12 oz. \(H_A\) : the mean volume of soda in a can is NOT 12 oz.

We can write these hypotheses in words (as above) or in statistical notation. The null specifies a single value of \(\mu\):

  • \(H_0\) : \(\mu=\mu_0\)

We call \(\mu_0\) the null value . When we run a hypothesis test, \(\mu_0\) will be replaced by some number. For the soda can example, the null value is 12. We would write \(H_0: \mu = 12\) .

The alternative specifies a range of possible values for \(\mu\) :

  • \(H_A\) : \(\mu\ne\mu_0\) . “The true mean is different from the null value.”

Take a random sample from the population. If the data are consistent with the null hypothesis, do not reject the null hypothesis. If the data are inconsistent with the null hypothesis and supportive of the alternative hypothesis, reject the null in favor of the alternative.

Example : One way to think about the logic of hypothesis testing is by comparing it to the U.S. court system. In a jury trial, jurors are told to assume the defendant is “innocent until proven guilty”. Innocence is the default assumption, so \(H_0\) : the defendant is innocent. \(H_A\) : the defendant is guilty. Like in hypothesis testing, it is not the jury’s job to decide if the defendant is innocent. That should be their default assumption. They are only there to decide if the defendant is guilty or if there is not enough evidence to override that default assumption. The burden of proof lies on the alternative hypothesis.

Notice the careful language in the logic of hypothesis testing: we either reject, or fail to reject, the null hypothesis. We never “accept” a null hypothesis.

7.2.1 Decision Errors

  • A Type I Error is rejecting the null when it is true. (Null is true, but we conclude null is false.)
  • A Type II Error is not rejecting the null when it is false. (Null is false, but we do not conclude it is false.)
| Decision              | \(H_0\) is True  | \(H_0\) is False |
|-----------------------|------------------|------------------|
| Do not reject \(H_0\) | Correct decision | Type II Error    |
| Reject \(H_0\)        | Type I Error     | Correct decision |
Example : In our jury trial, \(H_0\) : the defendant is innocent. \(H_A\) : the defendant is guilty. A Type I error is concluding guilt when the defendant is innocent. A Type II error is failing to convict when the person is guilty.

How likely are we to make errors? Well, \(P(\text{Type I Error})=\alpha\), the significance level. (Yes, this is the same \(\alpha\) we saw in confidence intervals!) For Type II error, \(P(\text{Type II Error})=\beta\). This is related to the sample size calculation from the previous chapter, but is otherwise something we don’t have time to cover.

We would like both \(\alpha\) and \(\beta\) to be small but, like many other things in statistics, there’s a trade-off! For a fixed sample size,

  • If we decrease \(\alpha\) , then \(\beta\) will increase.
  • If we increase \(\alpha\) , then \(\beta\) will decrease.

In practice, we set \(\alpha\) (as we did in confidence intervals). We can improve \(\beta\) by increasing sample size. Since resources are finite (we can’t get enormous sample sizes all the time), we will need to consider the consequences of each type of error.
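The claim that \(P(\text{Type I Error}) = \alpha\) can be demonstrated by simulation. The course works in R; the following is an illustrative Python sketch (with \(\sigma\) assumed known so the z critical value 1.96 applies exactly) that repeatedly samples from a population where \(H_0\) is true and counts how often the test rejects:

```python
import random
import statistics

random.seed(42)

z_crit = 1.96        # two-sided critical value for alpha = 0.05
n = 30               # sample size per simulated study
num_sims = 10_000

rejections = 0
for _ in range(num_sims):
    # Draw a sample from N(0, 1), so the null hypothesis mu = 0 is TRUE.
    sample = [random.gauss(0, 1) for _ in range(n)]
    z = statistics.mean(sample) / (1 / n ** 0.5)  # known sigma = 1
    if abs(z) > z_crit:
        rejections += 1  # rejecting a true null: a Type I error

type1_rate = rejections / num_sims
print(f"Observed Type I error rate: {type1_rate:.3f}")  # close to 0.05
```

Over many simulated studies, the observed rejection rate settles near the chosen \(\alpha = 0.05\), exactly as the theory says.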

Example: We could think about assessing consequences through the jury trial example. Consider two possible charges:

  • The defendant is accused of stealing a loaf of bread. If found guilty, they may face some jail time and will have a criminal record.
  • The defendant is accused of murder. If found guilty, they will have a felony conviction and may spend decades in prison.

Since these are moral questions, I will let you consider the consequences of each type of error. However, keep in mind that we do make scientific decisions that have lasting impacts on people’s lives.

When we state the conclusion of a hypothesis test, we use one of two statements:

  • At the \(\alpha\) level of significance, the data provide sufficient evidence to support the alternative hypothesis.
  • At the \(\alpha\) level of significance, the data do not provide sufficient evidence to support the alternative hypothesis.

Notice that these conclusions are framed in terms of the alternative hypothesis, which is either supported or not supported. We will never conclude the null hypothesis. Finally, when we write these types of conclusions, we will write them in the context of the problem.

7.3 Confidence Interval Approach to Hypothesis Testing

We can use a confidence interval to help us weigh the evidence against the null hypothesis. A confidence interval gives us a range of plausible values for \(\mu\) . If the null value is in the interval, then \(\mu_0\) is a plausible value for \(\mu\) . If the null value is not in the interval, then \(\mu_0\) is not a plausible value for \(\mu\) .

  • State null and alternative hypotheses.
  • Decide on significance level \(\alpha\) . Check assumptions (decide which confidence interval setting to use).
  • Find the critical value.
  • Compute confidence interval.
  • If the null value is not in the confidence interval, reject the null hypothesis. Otherwise, do not reject.
  • Interpret results in the context of the problem.
Example: Is the average mercury level in dolphin muscles different from \(2.5\mu g/g\)? Test at the 0.05 level of significance. A random sample of \(19\) dolphins resulted in a mean of \(4.4 \mu g/g\) and a standard deviation of \(2.3 \mu g/g\). \(H_0: \mu = 2.5\) and \(H_A: \mu \ne 2.5\). Significance level is \(\alpha=0.05\). The value of \(\sigma\) is unknown and \(n = 19 < 30\), so we are in setting 3. For setting 3, the critical value is \(t_{df, \alpha/2}\). Here, \(df=n-1=18\) and \(\alpha/2 = 0.025\), so \(t_{18, 0.025} = 2.101\).
The confidence interval is \[\begin{align} \bar{x} &\pm t_{df, \alpha/2}\frac{s}{\sqrt{n}} \\ 4.4 &\pm 2.101 \frac{2.3}{\sqrt{19}} \\ 4.4 &\pm 1.109 \end{align}\] or \((3.29, 5.51)\) . Since the null value, \(2.5\) , is not in the interval, it is not a plausible value for \(\mu\) (at the 95% level of confidence). Therefore we reject the null hypothesis. At the 0.05 level of significance, the data provide sufficient evidence to conclude that the true mean mercury level in dolphin muscles is greater than \(2.5\mu g/g\) . Note: The alternative hypothesis is “not equal to”, but we conclude “greater than” because all of the plausible values in the confidence interval are greater than the null value.
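The arithmetic above can be checked with a short script. The course's examples use R; this Python sketch is an illustrative translation (the critical value 2.101 is taken from the t table, as in the text):

```python
from math import sqrt

# Confidence interval approach for the dolphin example:
# H0: mu = 2.5 vs HA: mu != 2.5 at alpha = 0.05.
xbar, s, n = 4.4, 2.3, 19
mu0 = 2.5
t_crit = 2.101                      # t_{18, 0.025} from the t table

margin = t_crit * s / sqrt(n)
lower, upper = xbar - margin, xbar + margin
print(f"95% CI: ({lower:.2f}, {upper:.2f})")   # (3.29, 5.51)

# Reject H0 exactly when the null value falls outside the interval.
reject = not (lower <= mu0 <= upper)
print("Reject H0:", reject)                    # True
```

Since 2.5 lies below the lower bound 3.29, the null value is not plausible and we reject \(H_0\), matching the worked example.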

7.4 Critical Value Approach to Hypothesis Testing

We learned about critical values when we discussed confidence intervals. Now, we want to use these values directly in a hypothesis test. We will compare these values to a value based on the data, called a test statistic .

Idea: the null is our “default assumption”. If the null is true, how likely are we to observe a sample that looks like the one we have? If our sample is very inconsistent with the null hypothesis, we want to reject the null hypothesis.

7.4.1 Test statistics

Test statistics are similar to z- and t-scores: \[\text{test statistic} = \frac{\text{point estimate}-\text{null value}}{\text{standard error}}.\] In fact, they serve a similar function in converting a variable \(\bar{X}\) into a distribution we can work with easily.

  • Large Sample Setting : \(\mu\) is target parameter, \(n \ge 30\)

\[z = \frac{\bar{x}-\mu_0}{s/\sqrt{n}}\]

  • Small Sample Setting : \(\mu\) is target parameter, \(n < 30\)

\[t = \frac{\bar{x}-\mu_0}{s/\sqrt{n}}\]

The set of values for the test statistic that cause us to reject \(H_0\) is the rejection region . The remaining values are the nonrejection region . The value that separates these is the critical value!

The critical value approach follows these steps:

  • State the null and alternative hypotheses.
  • Determine the significance level \(\alpha\) . Check assumptions (decide which setting to use).
  • Compute the value of the test statistic.
  • Determine the critical values.
  • If the test statistic is in the rejection region, reject the null hypothesis. Otherwise, do not reject.
  • Interpret results.
Example: Is the average mercury level in dolphin muscles different from \(2.5\mu g/g\)? Test at the 0.05 level of significance. A random sample of \(19\) dolphins resulted in a mean of \(4.4 \mu g/g\) and a standard deviation of \(2.3 \mu g/g\). \(H_0: \mu = 2.5\) and \(H_A: \mu \ne 2.5\). Significance level is \(\alpha=0.05\). The value of \(\sigma\) is unknown and \(n = 19 < 30\), so we are in setting 3. The test statistic is \[\begin{align} t &= \frac{\bar{x}-\mu_0}{s/\sqrt{n}} \\ &= \frac{4.4-2.5}{2.3/\sqrt{19}} \\ &= 3.601 \end{align}\] The critical value is \(t_{df, \alpha/2}\). Here, \(df=n-1=18\) and \(\alpha/2 = 0.025\), so the critical value is \(t_{18, 0.025} = 2.101\).
The test statistic \(t = 3.601\) is greater than the critical value \(2.101\), so it falls in the rejection region and we reject the null hypothesis.


At the 0.05 level of significance, the data provide sufficient evidence to conclude that the true mean mercury level in dolphin muscles is greater than \(2.5\mu g/g\) .
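As a quick check of the critical value approach, here is an illustrative Python sketch (the course itself uses R):

```python
from math import sqrt

# Critical value approach for the dolphin example.
xbar, s, n = 4.4, 2.3, 19
mu0 = 2.5
t_crit = 2.101                       # t_{18, 0.025} from the t table

t_stat = (xbar - mu0) / (s / sqrt(n))
print(f"t = {t_stat:.3f}")           # 3.601

# Two-sided rejection region: |t| > t_crit.
reject = abs(t_stat) > t_crit
print("Reject H0:", reject)          # True
```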

Notice that this is the same conclusion we came to when we used the confidence interval approach. These approaches are exactly equivalent!

7.5 P-Value Approach to Hypothesis Testing

If the null hypothesis is true, what is the probability of getting a random sample that is as inconsistent with the null hypothesis as the random sample we got? This probability is called the p-value .

Example: Is the average mercury level in dolphin muscles different from \(2.5\mu g/g\)? Test at the 0.05 level of significance. A random sample of \(19\) dolphins resulted in a mean of \(4.4 \mu g/g\) and a standard deviation of \(2.3 \mu g/g\). The probability of a random sample as inconsistent with the null hypothesis as ours is \(P(t_{df} \text{ is as extreme as the test statistic})\). Consider \[P(t_{18} > 3.6) = 0.001\] but we want to think about the probability of being “as extreme” in either direction (either tail), so \[\text{p-value} = 2P(t_{18}>3.6) = 0.002\]
  • If \(\text{p-value} < \alpha\) , reject the null hypothesis. Otherwise, do not reject.

7.5.1 P-Values

Large Sample Setting : \(\mu\) is target parameter, \(n \ge 30\) , \[2P(Z > |z|)\] where \(z\) is the test statistic.

Small Sample Setting : \(\mu\) is target parameter, \(n < 30\) , \[2P(t_{df} > |t|)\] where \(t\) is the test statistic.

Note: \(|a|\) is the “absolute value” of \(a\) . The absolute value takes a number and throws away the sign, so \(|2|=2\) and \(|-3|=3\) .

The p-value approach follows the same steps as the critical value approach, except that we determine a p-value instead of a critical value:

  • State the null and alternative hypotheses.
  • Determine the significance level \(\alpha\). Check assumptions (decide which setting to use).
  • Compute the value of the test statistic.
  • Determine the p-value.
  • If \(\text{p-value} < \alpha\), reject the null hypothesis. Otherwise, do not reject.
  • Interpret results.

We often use p-values instead of the critical value approach because they are meaningful on their own (they have a direct interpretation).

Example: For the dolphins, \(H_0: \mu = 2.5\) and \(H_A: \mu \ne 2.5\). Significance level is \(\alpha=0.05\). The value of \(\sigma\) is unknown and \(n = 19 < 30\), so we are in setting 3. The test statistic is \[\begin{align} t &= \frac{\bar{x}-\mu_0}{s/\sqrt{n}} \\ &= \frac{4.4-2.5}{2.3/\sqrt{19}} \\ &= 3.601 \end{align}\] The p-value is \[2P(t_{df} > |t|) = 2P(t_{18} > 3.601) = 0.002\] Since \(\text{p-value}=0.002 < \alpha=0.05\), reject the null hypothesis. At the 0.05 level of significance, the data provide sufficient evidence to conclude that the true mean mercury level in dolphin muscles is greater than \(2.5\mu g/g\).
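In practice the p-value comes from software or a table, but it can also be computed directly from the t density. This illustrative Python sketch (the helper names are my own) integrates the density numerically so no statistics library is needed:

```python
from math import sqrt, gamma, pi

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_tail_prob(t, df, upper=60.0, steps=100_000):
    """Approximate P(T > t) with the trapezoid rule; the density beyond
    `upper` is negligible for moderate df."""
    h = (upper - t) / steps
    total = 0.5 * (t_pdf(t, df) + t_pdf(upper, df))
    for i in range(1, steps):
        total += t_pdf(t + i * h, df)
    return total * h

# Dolphin example: t = 3.601 with df = 18, two-sided alternative.
xbar, s, n, mu0 = 4.4, 2.3, 19, 2.5
t_stat = (xbar - mu0) / (s / sqrt(n))
p_value = 2 * t_tail_prob(abs(t_stat), df=n - 1)
print(f"p-value = {p_value:.3f}")   # about 0.002
```

Doubling the one-tail probability reflects the “not equal to” alternative: samples extreme in either direction count as evidence against \(H_0\).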

As before, this is the same conclusion we came to when we used the confidence interval and critical value approaches. All of these approaches are exactly equivalent.

R: Hypothesis Tests for a Mean

To conduct hypothesis tests for a mean in R, we will again use the t.test command. The arguments we will use for hypothesis testing are

  • x : the variable that contains the data we want to test.
  • mu : the null value, \(\mu_0\) .
  • conf.level : the desired confidence level ( \(1-\alpha\) ).

We will again use the Loblolly pine tree data.

Let’s test if the average height of Loblolly pines differs from \(40\) feet. We will test at a 0.01 level of significance ( \(\alpha = 0.01\) ). So \(H_0: \mu = 40\) and \(H_A: \mu \ne 40\), and the R command will look something like t.test(Loblolly$height, mu = 40, conf.level = 0.99), using R’s built-in Loblolly data frame.

Last time we used this command, we noted that R printed more information than we knew how to handle. That information was about hypothesis tests! The output from this test shows the following (top to bottom):

  • the data used in the hypothesis test.
  • the value of the test statistic ( \(t = -3.3851\) ), the degrees of freedom ( \(83\) ), and the p-value ( \(0.001\) ).
  • the alternative hypothesis.
  • the confidence interval.
  • the sample mean.

Based on this output, we have everything we need to conduct a hypothesis test using (A) the confidence interval approach, (B) the critical value approach, or (C) the p-value approach! In practice, we might include results from multiple approaches: At the 0.01 level of significance, there is sufficient evidence to reject the null hypothesis and conclude that the true mean height of Loblolly pines is less than 40 feet ( \(t = -3.385\) , p-value \(=0.001\) ).


Chapter 7: Introduction to Hypothesis Testing

Key terms in this chapter:

  • alternative hypothesis
  • critical value
  • effect size
  • null hypothesis
  • probability value
  • rejection region
  • significance level
  • statistical power
  • statistical significance
  • test statistic
  • Type I error
  • Type II error

This chapter lays out the basic logic and process of hypothesis testing. We will perform z  tests, which use the z  score formula from Chapter 6 and data from a sample mean to make an inference about a population.

Logic and Purpose of Hypothesis Testing

A hypothesis is a prediction that is tested in a research study. The statistician R. A. Fisher explained the concept of hypothesis testing with a story of a lady tasting tea. Here we will present an example based on James Bond who insisted that martinis should be shaken rather than stirred. Let’s consider a hypothetical experiment to determine whether Mr. Bond can tell the difference between a shaken martini and a stirred martini. Suppose we gave Mr. Bond a series of 16 taste tests. In each test, we flipped a fair coin to determine whether to stir or shake the martini. Then we presented the martini to Mr. Bond and asked him to decide whether it was shaken or stirred. Let’s say Mr. Bond was correct on 13 of the 16 taste tests. Does this prove that Mr. Bond has at least some ability to tell whether the martini was shaken or stirred?

This result does not prove that he does; it could be he was just lucky and guessed right 13 out of 16 times. But how plausible is the explanation that he was just lucky? To assess its plausibility, we determine the probability that someone who was just guessing would be correct 13/16 times or more. This probability can be computed to be .0106. This is a pretty low probability, and therefore someone would have to be very lucky to be correct 13 or more times out of 16 if they were just guessing. So either Mr. Bond was very lucky, or he can tell whether the drink was shaken or stirred. The hypothesis that he was guessing is not proven false, but considerable doubt is cast on it. Therefore, there is strong evidence that Mr. Bond can tell whether a drink was shaken or stirred.
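The .0106 figure is a binomial tail probability, and it can be verified directly. A small Python sketch (an illustrative aside, not part of the original text):

```python
from math import comb

# Probability of guessing 13 or more of 16 correct when each guess
# has a 0.5 chance of being right: P(X >= 13) for X ~ Binomial(16, 0.5).
n, k_min, p = 16, 13, 0.5
prob = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(k_min, n + 1))
print(f"P(13 or more correct by guessing) = {prob:.4f}")   # 0.0106
```

The same calculation with a threshold of 9 correct gives the bird-trainer probability of about .40 discussed later in the chapter.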

Let’s consider another example. The case study Physicians’ Reactions sought to determine whether physicians spend less time with obese patients. Physicians were sampled randomly and each was shown a chart of a patient complaining of a migraine headache. They were then asked to estimate how long they would spend with the patient. The charts were identical except that for half the charts, the patient was obese and for the other half, the patient was of average weight. The chart a particular physician viewed was determined randomly. Thirty-three physicians viewed charts of average-weight patients and 38 physicians viewed charts of obese patients.

The mean time physicians reported that they would spend with obese patients was 24.7 minutes as compared to a mean of 31.4 minutes for normal-weight patients. How might this difference between means have occurred? One possibility is that physicians were influenced by the weight of the patients. On the other hand, perhaps by chance, the physicians who viewed charts of the obese patients tend to see patients for less time than the other physicians. Random assignment of charts does not ensure that the groups will be equal in all respects other than the chart they viewed. In fact, it is certain the groups differed in many ways by chance. The two groups could not have exactly the same mean age (if measured precisely enough such as in days). Perhaps a physician’s age affects how long the physician sees patients. There are innumerable differences between the groups that could affect how long they view patients. With this in mind, is it plausible that these chance differences are responsible for the difference in times?

To assess the plausibility of the hypothesis that the difference in mean times is due to chance, we compute the probability of getting a difference as large or larger than the observed difference (31.4 − 24.7 = 6.7 minutes) if the difference were, in fact, due solely to chance. Using methods presented in later chapters, this probability can be computed to be .0057. Since this is such a low probability, we have confidence that the difference in times is due to the patient’s weight and is not due to chance.

The Probability Value

It is very important to understand precisely what the probability values mean. In the James Bond example, the computed probability of .0106 is the probability he would be correct on 13 or more taste tests (out of 16) if he were just guessing. It is easy to mistake this probability of .0106 as the probability he cannot tell the difference. This is not at all what it means.

The probability of .0106 is the probability of a certain outcome (13 or more out of 16) assuming a certain state of the world (James Bond was only guessing). It is not the probability that a state of the world is true. Although this might seem like a distinction without a difference, consider the following example. An animal trainer claims that a trained bird can determine whether or not numbers are evenly divisible by 7. In an experiment assessing this claim, the bird is given a series of 16 test trials. On each trial, a number is displayed on a screen and the bird pecks at one of two keys to indicate its choice. The numbers are chosen in such a way that the probability of any number being evenly divisible by 7 is .50. The bird is correct on 9/16 choices. We can compute that the probability of being correct nine or more times out of 16 if one is only guessing is .40. Since a bird who is only guessing would do this well 40% of the time, these data do not provide convincing evidence that the bird can tell the difference between the two types of numbers. As a scientist, you would be very skeptical that the bird had this ability. Would you conclude that there is a .40 probability that the bird can tell the difference? Certainly not! You would think the probability is much lower than .0001.

To reiterate, the probability value is the probability of an outcome (9/16 or better) and not the probability of a particular state of the world (the bird was only guessing). In statistics, it is conventional to refer to possible states of the world as hypotheses since they are hypothesized states of the world. Using this terminology, the probability value is the probability of an outcome given the hypothesis. It is not the probability of the hypothesis given the outcome.

This is not to say that we ignore the probability of the hypothesis. If the probability of the outcome given the hypothesis is sufficiently low, we have evidence that the hypothesis is false. However, we do not compute the probability that the hypothesis is false. In the James Bond example, the hypothesis is that he cannot tell the difference between shaken and stirred martinis. The probability value is low (.0106), thus providing evidence that he can tell the difference. However, we have not computed the probability that he can tell the difference.

The Null Hypothesis

The hypothesis that an apparent effect is due to chance is called the null hypothesis , written H 0 (“ H -naught”). In the Physicians’ Reactions example, the null hypothesis is that in the population of physicians, the mean time expected to be spent with obese patients is equal to the mean time expected to be spent with average-weight patients. This null hypothesis can be written as:

\[H_0: \mu_{\text{obese}} = \mu_{\text{average}}\]

The null hypothesis in a correlational study of the relationship between high school grades and college grades would typically be that the population correlation is 0. This can be written as

\[H_0: \rho = 0\]

Although the null hypothesis is usually that the value of a parameter is 0, there are occasions in which the null hypothesis is a value other than 0. For example, if we are working with mothers in the U.S. whose children are at risk of low birth weight, we can use 7.47 pounds, the average birth weight in the U.S., as our null value and test for differences against that.

For now, we will focus on testing a value of a single mean against what we expect from the population. Using birth weight as an example, our null hypothesis takes the form:

\[H_0: \mu = 7.47\]

Keep in mind that the null hypothesis is typically the opposite of the researcher’s hypothesis. In the Physicians’ Reactions study, the researchers hypothesized that physicians would expect to spend less time with obese patients. The null hypothesis that the two types of patients are treated identically is put forward with the hope that it can be discredited and therefore rejected. If the null hypothesis were true, a difference as large as or larger than the sample difference of 6.7 minutes would be very unlikely to occur. Therefore, the researchers rejected the null hypothesis of no difference and concluded that in the population, physicians intend to spend less time with obese patients.

In general, the null hypothesis is the idea that nothing is going on: there is no effect of our treatment, no relationship between our variables, and no difference in our sample mean from what we expected about the population mean. This is always our baseline starting assumption, and it is what we seek to reject. If we are trying to treat depression, we want to find a difference in average symptoms between our treatment and control groups. If we are trying to predict job performance, we want to find a relationship between conscientiousness and evaluation scores. However, until we have evidence against it, we must use the null hypothesis as our starting point.

The Alternative Hypothesis

If the null hypothesis is rejected, then we will need some other explanation, which we call the alternative hypothesis, H A or H 1 . The alternative hypothesis is simply the reverse of the null hypothesis, and there are three options, depending on where we expect the difference to lie. Thus, our alternative hypothesis is the mathematical way of stating our research question. If we expect our obtained sample mean to be above or below the null hypothesis value, which we call a directional hypothesis, then our alternative hypothesis takes the form

\[H_A: \mu > 7.47 \quad \text{or} \quad H_A: \mu < 7.47\]

based on the research question itself. We should only use a directional hypothesis if we have good reason, based on prior observations or research, to suspect a particular direction. When we do not know the direction, such as when we are entering a new area of research, we use a non-directional alternative:

\[H_A: \mu \ne 7.47\]

We will set different criteria for rejecting the null hypothesis based on the directionality (greater than, less than, or not equal to) of the alternative. To understand why, we need to see where our criteria come from and how they relate to z  scores and distributions.

Critical Values, p Values, and Significance Level

The significance level (also called alpha, α) is a threshold we set before collecting data in order to determine whether or not we should reject the null hypothesis. We set this value beforehand to avoid biasing ourselves by viewing our results and then determining what criteria we should use. If our data produce values that meet or exceed this threshold, then we have sufficient evidence to reject the null hypothesis; if not, we fail to reject the null (we never “accept” the null).

Figure 7.1. The rejection region for a one-tailed test. (“ Rejection Region for One-Tailed Test ” by Judy Schmitt is licensed under CC BY-NC-SA 4.0 .)


The rejection region is bounded by a specific z value, as is any area under the curve. In hypothesis testing, the value corresponding to a specific rejection region is called the critical value, z-crit, or z* (hence the other name “critical region”). Finding the critical value works exactly the same as finding the z score corresponding to any area under the curve as we did in Unit 1. If we go to the normal table, we will find that the z score corresponding to 5% of the area under the curve is equal to 1.645 (z = 1.64 corresponds to .0505 and z = 1.65 corresponds to .0495, so .05 is exactly in between them) if we go to the right and −1.645 if we go to the left. The direction must be determined by your alternative hypothesis, and drawing and shading the distribution is helpful for keeping directionality straight.

Suppose, however, that we want to do a non-directional test. We need to put the critical region in both tails, but we don’t want to increase the overall size of the rejection region (for reasons we will see later). To do this, we simply split it in half so that an equal proportion of the area under the curve falls in each tail’s rejection region. For α = .05, this means 2.5% of the area is in each tail, which, based on the z table, corresponds to critical values of z* = ±1.96. This is shown in Figure 7.2.

Figure 7.2. Two-tailed rejection region. (“ Rejection Region for Two-Tailed Test ” by Judy Schmitt is licensed under CC BY-NC-SA 4.0 .)
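The critical values quoted above can be recovered from the inverse CDF of the standard normal distribution. A short Python sketch (an illustrative aside; the text itself works from z tables):

```python
from statistics import NormalDist

# One-tailed test at alpha = .05: all 5% of the rejection region in one tail.
z_one_tailed = NormalDist().inv_cdf(0.95)
print(f"one-tailed critical value: {z_one_tailed:.3f}")   # 1.645

# Two-tailed test at alpha = .05: 2.5% in each tail.
z_two_tailed = NormalDist().inv_cdf(0.975)
print(f"two-tailed critical value: {z_two_tailed:.2f}")   # 1.96
```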


Thus, any z score falling outside ±1.96 (greater than 1.96 in absolute value) falls in the rejection region. When we use z scores in this way, the obtained value of z (sometimes called z-obtained and abbreviated z-obt) is something known as a test statistic, which is simply an inferential statistic used to test a null hypothesis. The formula for our z statistic has not changed:

\[z = \frac{M - \mu}{\sigma_M}, \qquad \sigma_M = \frac{\sigma}{\sqrt{n}}\]

Figure 7.3. Relationship between α, z-obt, and p. (“ Relationship between alpha, z-obt, and p ” by Judy Schmitt is licensed under CC BY-NC-SA 4.0 .)


When the null hypothesis is rejected, the effect is said to have statistical significance , or be statistically significant. For example, in the Physicians’ Reactions case study, the probability value is .0057. Therefore, the effect of obesity is statistically significant and the null hypothesis that obesity makes no difference is rejected. It is important to keep in mind that statistical significance means only that the null hypothesis of exactly no effect is rejected; it does not mean that the effect is important, which is what “significant” usually means. When an effect is significant, you can have confidence the effect is not exactly zero. Finding that an effect is significant does not tell you about how large or important the effect is.

Do not confuse statistical significance with practical significance. A small effect can be highly significant if the sample size is large enough.

Why does the word “significant” in the phrase “statistically significant” mean something so different from other uses of the word? Interestingly, this is because the meaning of “significant” in everyday language has changed. It turns out that when the procedures for hypothesis testing were developed, something was “significant” if it signified something. Thus, finding that an effect is statistically significant signifies that the effect is real and not due to chance. Over the years, the meaning of “significant” changed, leading to the potential misinterpretation.

The Hypothesis Testing Process

A four-step procedure.

The process of testing hypotheses follows a simple four-step procedure. This process will be what we use for the remainder of the textbook and course, and although the hypothesis and statistics we use will change, this process will not.

Step 1: State the Hypotheses

Your hypotheses are the first thing you need to lay out. Otherwise, there is nothing to test! You have to state the null hypothesis (which is what we test) and the alternative hypothesis (which is what we expect). These should be stated mathematically as they were presented above and in words, explaining in normal English what each one means in terms of the research question.

Step 2: Find the Critical Values

Step 3: Calculate the Test Statistic and Effect Size

Once we have our hypotheses and the standards we use to test them, we can collect data and calculate our test statistic—in this case z . This step is where the vast majority of differences in future chapters will arise: different tests used for different data are calculated in different ways, but the way we use and interpret them remains the same. As part of this step, we will also calculate effect size to better quantify the magnitude of the difference between our groups. Although effect size is not considered part of hypothesis testing, reporting it as part of the results is approved convention.

Step 4: Make the Decision

Finally, once we have our obtained test statistic, we can compare it to our critical value and decide whether we should reject or fail to reject the null hypothesis. When we do this, we must interpret the decision in relation to our research question, stating what we concluded, what we based our conclusion on, and the specific statistics we obtained.
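The four steps can be collected into one small helper for a one-sample z test (known population standard deviation). This is an illustrative sketch, not code from the text; the function name and signature are our own.

```python
from statistics import NormalDist

def z_test(sample_mean, mu0, sigma, n, alpha=0.05, tails=2):
    """Illustrative four-step z test; returns (z, z*, d, reject)."""
    # Step 1 (hypotheses) happens before any computation: H0: mu = mu0
    # versus a one- or two-tailed alternative, encoded here by `tails`.
    # Step 2: critical value for the chosen alpha and directionality.
    z_crit = NormalDist().inv_cdf(1 - alpha / tails)
    # Step 3: test statistic and effect size (Cohen's d).
    z = (sample_mean - mu0) / (sigma / n ** 0.5)
    d = (sample_mean - mu0) / sigma
    # Step 4: decision -- reject H0 when the statistic beats the cutoff.
    # (For a one-tailed test, the sign of z should also be checked
    # against the hypothesized direction.)
    reject = abs(z) > z_crit
    return z, z_crit, d, reject
```

For Example A below (M = 7.75, μ0 = 8, n = 25, with σ = 0.50, the value implied by the worked numbers), this returns z ≈ −2.50 against a ±1.96 cutoff, matching the hand calculation.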

Example A Movie Popcorn

Our manager is looking for a difference in the mean weight of popcorn bags compared to the population mean of 8 cups. We will need both a null and an alternative hypothesis written both mathematically and in words. We’ll always start with the null hypothesis:

H0: The mean weight of popcorn bags is 8 cups. Mathematically, H0: μ = 8.

In this case, we don’t know whether the bags will be too full or not full enough, so we use a two-tailed alternative hypothesis that there is a difference: HA: μ ≠ 8.

Our critical values are based on two things: the directionality of the test and the level of significance. We decided in Step 1 that a two-tailed test is the appropriate directionality. We were given no information about the level of significance, so we assume that α = .05 is what we will use. As stated earlier in the chapter, the critical values for a two-tailed z test at α = .05 are z* = ±1.96. These will be the criteria we use to test our hypothesis. We can now draw out our distribution, as shown in Figure 7.4, so we can visualize the rejection region and make sure it makes sense.

Figure 7.4. Rejection region for z * = ±1.96. (“ Rejection Region z+-1.96 ” by Judy Schmitt is licensed under CC BY-NC-SA 4.0 .)


Now we come to our formal calculations. Let’s say that the manager collects data and finds that the average weight of this employee’s popcorn bags is M = 7.75 cups. We can now plug this value, along with the values presented in the original problem, into our equation for z :

z = (M − μ)/(σ/√n) = (7.75 − 8.00)/(0.50/√25) = −0.25/0.10 = −2.50

So our test statistic is z = −2.50, which we can draw onto our rejection region distribution as shown in Figure 7.5 .

Figure 7.5. Test statistic location. (“ Test Statistic Location z-2.50 ” by Judy Schmitt is licensed under CC BY-NC-SA 4.0 .)


Effect Size

When we reject the null hypothesis, we are stating that the difference we found was statistically significant, but we have mentioned several times that this tells us nothing about practical significance. To get an idea of the actual size of what we found, we can compute a new statistic called an effect size. Effect size gives us an idea of how large, important, or meaningful a statistically significant effect is. For mean differences like we calculated here, our effect size is Cohen’s d :

d = (M − μ)/σ

This is very similar to our formula for z , but we no longer take into account the sample size (since overly large samples can make it too easy to reject the null). Cohen’s d is interpreted in units of standard deviations, just like z . For our example:

d = (7.75 − 8.00)/0.50 = −0.50, a magnitude of 0.50

Cohen’s d is interpreted as small, moderate, or large. Specifically, d = 0.20 is small, d = 0.50 is moderate, and d = 0.80 is large. Obviously, values can fall in between these guidelines, so we should use our best judgment and the context of the problem to make our final interpretation of size. Our effect size happens to be exactly equal to one of these, so we say that there is a moderate effect.

Effect sizes are incredibly useful and provide important information and clarification that overcomes some of the weaknesses of hypothesis testing. Any time you perform a hypothesis test, whether the result is statistically significant or not, you should always calculate and report effect size.
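These benchmarks can be encoded in a tiny helper. The cutoffs between labels, and the "negligible" label for magnitudes below 0.20, are our own illustrative choices; the text itself only names the three benchmark values.

```python
def cohens_d(sample_mean, mu0, sigma):
    """Cohen's d: the mean difference in units of standard deviations."""
    return (sample_mean - mu0) / sigma

def describe_d(d):
    """Rough size label using Cohen's benchmarks (0.20 / 0.50 / 0.80)."""
    size = abs(d)
    if size < 0.20:
        return "negligible"   # below the smallest benchmark (our label)
    if size < 0.50:
        return "small"
    if size < 0.80:
        return "moderate"
    return "large"
```

The popcorn example gives cohens_d(7.75, 8, 0.50) = −0.50, which describe_d labels "moderate", matching the interpretation above.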

Looking at Figure 7.5, we can see that our obtained z statistic falls in the rejection region. We can also directly compare it to our critical value: in absolute value, 2.50 > 1.96, so we reject the null hypothesis. We can now write our conclusion:

Reject H 0 . Based on the sample of 25 bags, we can conclude that the average popcorn bag from this employee is smaller ( M = 7.75 cups) than the average weight of popcorn bags at this movie theater, and the effect size was moderate, z = −2.50, p < .05, d = 0.50.

Example B Office Temperature

Let’s do another example to solidify our understanding. Let’s say that the office building you work in is supposed to be kept at 74 degrees Fahrenheit during the summer months but is allowed to vary by 1 degree in either direction. You suspect that, as a cost saving measure, the temperature was secretly set higher. You set up a formal way to test your hypothesis.

You start by laying out the null hypothesis:

H0: The temperature of the building is set correctly. Mathematically, H0: μ = 74.

Next you state the alternative hypothesis. You have reason to suspect a specific direction of change, so you make a one-tailed test:

HA: The temperature is set higher than it should be. Mathematically, HA: μ > 74.

You know that the most common level of significance is α = .05, so you keep that the same and know that the critical value for a one-tailed z test is z* = 1.645. To keep track of the directionality of the test and rejection region, you draw out your distribution as shown in Figure 7.6.

Figure 7.6. Rejection region. (“ Rejection Region z1.645 ” by Judy Schmitt is licensed under CC BY-NC-SA 4.0 .)


Now that you have everything set up, you spend one week collecting temperature data:

Day         Temp (°F)
Monday      77
Tuesday     76
Wednesday   74
Thursday    78
Friday      78

The sample mean is M = (77 + 76 + 74 + 78 + 78)/5 = 76.6 degrees. Treating the allowed 1-degree variation as the known population standard deviation (σ = 1), the test statistic is z = (76.6 − 74)/(1/√5) = 2.60/0.45 ≈ 5.77.

This value falls so far into the tail that it cannot even be plotted on the distribution ( Figure 7.7 )! Because the result is significant, you also calculate an effect size:

d = (76.6 − 74)/1 = 2.60

The effect size you calculate is definitely large, meaning someone has some explaining to do!

Figure 7.7. Obtained z statistic. (“ Obtained z5.77 ” by Judy Schmitt is licensed under CC BY-NC-SA 4.0 .)


You compare your obtained z  statistic, z = 5.77, to the critical value, z * = 1.645, and find that z > z *. Therefore you reject the null hypothesis, concluding:

Reject H 0 . Based on 5 observations, the average temperature ( M = 76.6 degrees) is statistically significantly higher than it is supposed to be, and the effect size was large, z = 5.77, p < .05, d = 2.60.
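As an arithmetic check on this example (a sketch, treating the allowed 1-degree variation as the known population standard deviation, σ = 1, as the worked numbers imply):

```python
temps = [77, 76, 74, 78, 78]      # Monday through Friday readings
n = len(temps)
M = sum(temps) / n                # sample mean: 76.6 degrees
sigma = 1.0                       # assumed known SD (the 1-degree allowance)
se = sigma / n ** 0.5             # standard error, about 0.447
z = (M - 74) / se                 # about 5.81 without intermediate rounding
d = (M - 74) / sigma              # Cohen's d = 2.60
```

Computed without rounding, z ≈ 5.81; the 5.77 reported in the text comes from rounding the standard error to 0.45 partway through. Either way the statistic is far beyond z* = 1.645, so the decision is unchanged.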

Example C Different Significance Level

Finally, let’s take a look at an example phrased in generic terms, rather than in the context of a specific research question, to see the individual pieces one more time. This time, however, we will use a stricter significance level, α = .01, to test the hypothesis.

We will use 60 as an arbitrary null hypothesis value:

H0: μ = 60

We will assume a two-tailed test:

HA: μ ≠ 60

We have seen the critical values for z tests at the α = .05 level of significance several times. To find the values for α = .01, we will go to the Standard Normal Distribution Table and find the z score cutting off .005 (.01 divided by 2 for a two-tailed test) of the area in the tail, which is z* = ±2.575. Notice that this cutoff is much farther out than it was for α = .05. This is because we need much less of the area in the tail, so we need to go very far out to find the cutoff. As a result, this will require a much larger effect or much larger sample size in order to reject the null hypothesis.
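The table lookup can be reproduced with the inverse of the standard normal CDF, available in Python’s standard library (shown here as an illustration; the function name is our own):

```python
from statistics import NormalDist

def z_critical(alpha, tails=2):
    """Critical z value leaving alpha/tails of the area in each tail."""
    return NormalDist().inv_cdf(1 - alpha / tails)
```

Here z_critical(0.05) ≈ 1.96 and z_critical(0.01) ≈ 2.576; the table’s 2.575 reflects the precision of a printed table, and either value leads to the same decisions in these examples.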

We can now calculate our test statistic. We will use σ = 10 as our known population standard deviation and the following data to calculate our sample mean:

(The sample of n = 10 scores is not reproduced here.)

The average of these scores is M = 60.40. From this we calculate our z  statistic as:

z = (60.40 − 60.00)/(10/√10) = 0.40/3.16 = 0.13

The Cohen’s d effect size calculation is:

d = (60.40 − 60.00)/10 = 0.04

Our obtained z  statistic, z = 0.13, is very small. It is much less than our critical value of 2.575. Thus, this time, we fail to reject the null hypothesis. Our conclusion would look something like:

Fail to reject H 0 . Based on the sample of 10 scores, we cannot conclude that there is an effect causing the mean ( M  = 60.40) to be statistically significantly different from 60.00, z = 0.13, p > .01, d = 0.04, and the effect size supports this interpretation.

Other Considerations in Hypothesis Testing

There are several other considerations we need to keep in mind when performing hypothesis testing.

Errors in Hypothesis Testing

In the Physicians’ Reactions case study, the probability value associated with the significance test is .0057. Therefore, the null hypothesis was rejected, and it was concluded that physicians intend to spend less time with obese patients. Despite the low probability value, it is possible that the null hypothesis of no true difference between obese and average-weight patients is true and that the large difference between sample means occurred by chance. If this is the case, then the conclusion that physicians intend to spend less time with obese patients is in error. This type of error is called a Type I error. More generally, a Type I error occurs when a significance test results in the rejection of a true null hypothesis.

The second type of error that can be made in significance testing is failing to reject a false null hypothesis. This kind of error is called a Type II error . Unlike a Type I error, a Type II error is not really an error. When a statistical test is not significant, it means that the data do not provide strong evidence that the null hypothesis is false. Lack of significance does not support the conclusion that the null hypothesis is true. Therefore, a researcher should not make the mistake of incorrectly concluding that the null hypothesis is true when a statistical test was not significant. Instead, the researcher should consider the test inconclusive. Contrast this with a Type I error in which the researcher erroneously concludes that the null hypothesis is false when, in fact, it is true.

A Type II error can only occur if the null hypothesis is false. If the null hypothesis is false, then the probability of a Type II error is called β (“beta”). The probability of correctly rejecting a false null hypothesis equals 1 − β and is called statistical power . Power is simply our ability to correctly detect an effect that exists. It is influenced by the size of the effect (larger effects are easier to detect), the significance level we set (making it easier to reject the null makes it easier to detect an effect, but increases the likelihood of a Type I error), and the sample size used (larger samples make it easier to reject the null).
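A short simulation makes α, β, and power concrete (an illustrative sketch; the sample size, effect size, and trial count are arbitrary choices, not from the text). When the null is true, the long-run rejection rate is the Type I error rate; when it is false, the rejection rate is the power.

```python
import random
from statistics import NormalDist

random.seed(1)
Z_CRIT = NormalDist().inv_cdf(0.975)   # two-tailed cutoff at alpha = .05

def reject_rate(true_mu, mu0=0.0, sigma=1.0, n=25, trials=10_000):
    """Fraction of simulated samples in which H0: mu = mu0 is rejected."""
    rejections = 0
    for _ in range(trials):
        m = sum(random.gauss(true_mu, sigma) for _ in range(n)) / n
        z = (m - mu0) / (sigma / n ** 0.5)
        if abs(z) > Z_CRIT:
            rejections += 1
    return rejections / trials

alpha_hat = reject_rate(true_mu=0.0)   # H0 true: rejections are Type I errors
power_hat = reject_rate(true_mu=0.5)   # H0 false (d = 0.5): correct rejections
```

With these settings, alpha_hat lands near .05 and power_hat near .70, illustrating that a moderate effect (d = 0.5) is missed roughly 30% of the time (β ≈ .30) when n = 25.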

Misconceptions in Hypothesis Testing

Misconceptions about significance testing are common. This section lists three important ones.

  • Misconception: The probability value ( p value) is the probability that the null hypothesis is false. Proper interpretation: The probability value ( p value) is the probability of a result as extreme or more extreme given that the null hypothesis is true. It is the probability of the data given the null hypothesis. It is not the probability that the null hypothesis is false.
  • Misconception: A low probability value indicates a large effect. Proper interpretation: A low probability value indicates that the sample outcome (or an outcome more extreme) would be very unlikely if the null hypothesis were true. A low probability value can occur with small effect sizes, particularly if the sample size is large.
  • Misconception: A non-significant outcome means that the null hypothesis is probably true. Proper interpretation: A non-significant outcome means that the data do not conclusively demonstrate that the null hypothesis is false.
Exercises

  • In your own words, explain what the null hypothesis is.
  • What are Type I and Type II errors?
  • Why do we phrase null and alternative hypotheses with population parameters and not sample means?
  • Why do we state our hypotheses and decision criteria before we collect our data?
  • Why do you calculate an effect size?
Determine whether you would reject or fail to reject the null hypothesis in each of the following situations:

  • z = 1.99, two-tailed test at α = .05
  • z = 0.34, z* = 1.645
  • p = .03, α = .05
  • p = .015, α = .01

Answers to Odd-Numbered Exercises

Your answer should include mention of the baseline assumption of no difference between the sample and the population.

Alpha is the significance level. It is the criterion we use when deciding to reject or fail to reject the null hypothesis, corresponding to a given proportion of the area under the normal distribution and a probability of finding extreme scores assuming the null hypothesis is true.

We always calculate an effect size to see if our research is practically meaningful or important. NHST (null hypothesis significance testing) is influenced by sample size but effect size is not; therefore, they provide complementary information.


(“Null Hypothesis” by Randall Munroe/xkcd.com is licensed under CC BY-NC 2.5.)


Introduction to Statistics in the Psychological Sciences Copyright © 2021 by Linda R. Cote Ph.D.; Rupa G. Gordon Ph.D.; Chrislyn E. Randell Ph.D.; Judy Schmitt; and Helena Marvin is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License , except where otherwise noted.

Information

  • Author Services

Initiatives

You are accessing a machine-readable page. In order to be human-readable, please install an RSS reader.

All articles published by MDPI are made immediately available worldwide under an open access license. No special permission is required to reuse all or part of the article published by MDPI, including figures and tables. For articles published under an open access Creative Common CC BY license, any part of the article may be reused without permission provided that the original article is clearly cited. For more information, please refer to https://www.mdpi.com/openaccess .

Feature papers represent the most advanced research with significant potential for high impact in the field. A Feature Paper should be a substantial original Article that involves several techniques or approaches, provides an outlook for future research directions and describes possible research applications.

Feature papers are submitted upon individual invitation or recommendation by the scientific editors and must receive positive feedback from the reviewers.

Editor’s Choice articles are based on recommendations by the scientific editors of MDPI journals from around the world. Editors select a small number of articles recently published in the journal that they believe will be particularly interesting to readers, or important in the respective research area. The aim is to provide a snapshot of some of the most exciting work published in the various research areas of the journal.

Original Submission Date Received: .

  • Active Journals
  • Find a Journal
  • Proceedings Series
  • For Authors
  • For Reviewers
  • For Editors
  • For Librarians
  • For Publishers
  • For Societies
  • For Conference Organizers
  • Open Access Policy
  • Institutional Open Access Program
  • Special Issues Guidelines
  • Editorial Process
  • Research and Publication Ethics
  • Article Processing Charges
  • Testimonials
  • Preprints.org
  • SciProfiles
  • Encyclopedia

sustainability-logo

Article Menu

section 7 1 introduction to hypothesis testing answers

  • Subscribe SciFeed
  • Recommended Articles
  • Google Scholar
  • on Google Scholar
  • Table of Contents

Find support for a specific problem in the support section of our website.

Please let us know what you think of our products and services.

Visit our dedicated information section to learn more about MDPI.

JSmol Viewer

Filling discrepancies between consumer perception and actual piped water quality to promote the potable use of the municipal water supply in indonesia.

section 7 1 introduction to hypothesis testing answers

1. Introduction

2. materials and methods, 2.1. study areas, 2.2. questionnaire survey, 2.3. water quality analyses, 2.4. cost analysis, 2.5. public drinking water tap, 2.6. data analysis, 3.1. perception of water quality, 3.2. water quality, 3.3. cost comparison of different water sources for drinking, 3.4. public drinking stations, 4. discussion, 5. conclusions, author contributions, institutional review board statement, informed consent statement, data availability statement, acknowledgments, conflicts of interest.

Click here to enlarge figure

  • Greene, J. Bottled Water in Mexico: The Rise of a New Access to Water Paradigm. Wiley Interdiscip. Rev. Water 2018 , 5 , e1286. [ Google Scholar ] [ CrossRef ]
  • Cohen, A.; Ray, I. The Global Risks of Increasing Reliance on Bottled Water. In Nature Sustainability ; Nature Publishing Group: Berlin, Germany, 2018; pp. 327–329. [ Google Scholar ] [ CrossRef ]
  • Bouhlel, Z.; Köpke, J.; Mina, M.; Smakhtin, V. Global Bottled Water Industry: A Review of Impacts and Trends ; University Institute for Water, Environment and Health: Hamilton, ON, Canada, 2023. [ Google Scholar ]
  • Foster, T.; Priadi, C.; Kotra, K.K.; Odagiri, M.; Rand, E.C.; Willetts, J. Self-Supplied Drinking Water in Low- and Middle-Income Countries in the Asia-Pacific. NPJ Clean Water 2021 , 4 , 37. [ Google Scholar ] [ CrossRef ]
  • Tosun, J.; Scherer, U.; Schaub, S.; Horn, H. Making Europe Go from Bottles to the Tap: Political and Societal Attempts to Induce Behavioral Change. Wiley Interdiscip. Rev. Water 2020 , 7 , e1435. [ Google Scholar ] [ CrossRef ]
  • Brei, V.A. How Is a Bottled Water Market Created? Wiley Interdiscip. Rev. Water 2018 , 5 , e1220. [ Google Scholar ] [ CrossRef ]
  • Geerts, R.; Vandermoere, F.; Van Winckel, T.; Halet, D.; Joos, P.; Van Den Steen, K.; Van Meenen, E.; Blust, R.; Borregán-Ochando, E.; Vlaeminck, S.E. Bottle or Tap? Toward an Integrated Approach to Water Type Consumption. Water Res. 2020 , 173 , 115578. [ Google Scholar ] [ CrossRef ]
  • Jaffee, D. Unequal Trust: Bottled Water Consumption, Distrust in Tap Water, and Economic and Racial Inequality in the United States. Wiley Interdiscip. Rev. Water 2024 , 11 , e1700. [ Google Scholar ] [ CrossRef ]
  • Alfonso, S.M.; Kazama, S.; Takizawa, S. Inequalities in Access to and Consumption of Safely Managed Water Due to Socio-Economic Factors: Evidence from Quezon City, Philippines. Curr. Res. Environ. Sustain. 2021 , 4 , 100117. [ Google Scholar ] [ CrossRef ]
  • Ochoo, B.; Valcour, J.; Sarkar, A. Association between Perceptions of Public Drinking Water Quality and Actual Drinking Water Quality: A Community-Based Exploratory Study in Newfoundland (Canada). Environ. Res. 2017 , 159 , 435–443. [ Google Scholar ] [ CrossRef ] [ PubMed ]
  • Dianty, F.R.; Martono, D.N.; Priadi, C.R. Public Perception of Drinking Water Quality in Bekasi City, Indonesia. J. Penelit. Pendidik. IPA 2022 , 8 , 2551–2555. [ Google Scholar ] [ CrossRef ]
  • de França Doria, M. Factors Influencing Public Perception of Drinking Water Quality. Water Policy 2009 , 12 , 1–19. [ Google Scholar ] [ CrossRef ]
  • Hopland, A.O.; Kvamsdal, S.F. Tap Water Quality: In the Eye of the Beholder. J. Water Health 2022 , 20 , 1436–1444. [ Google Scholar ] [ CrossRef ]
  • Delpla, I.; Legay, C.; Proulx, F.; Rodriguez, M.J. Perception of Tap Water Quality: Assessment of the Factors Modifying the Links between Satisfaction and Water Consumption Behavior. Sci. Total Environ. 2020 , 722 , 137786. [ Google Scholar ] [ CrossRef ] [ PubMed ]
  • Lestari, P.; Trihadiningrum, Y. The Impact of Improper Solid Waste Management to Plastic Pollution in Indonesian Coast and Marine Environment. Mar. Pollut. Bull. 2019 , 149 , 110505. [ Google Scholar ] [ CrossRef ] [ PubMed ]
  • Jaffee, D.; Newman, S. A Bottle Half Empty: Bottled Water, Commodification, and Contestation. Organ. Environ. 2013 , 26 , 318–335. [ Google Scholar ] [ CrossRef ]
  • Puspita, T.; Dharmayanti, I.; Tjandrarini, D.H.; Zahra, Z.; Anwar, A.; Irianto, J.; Rachmat, B.; Yunianto, A. Packaged Drinking Water in Indonesia: The Determinants of Household in the Selection and Management Process. J. Water Sanit. Hyg. Dev. 2023 , 13 , 508–519. [ Google Scholar ] [ CrossRef ]
  • Horowitz, N.; Frago, J.; Mu, D. Life Cycle Assessment of Bottled Water: A Case Study of Green2O Products. Waste Manag. 2018 , 76 , 734–743. [ Google Scholar ] [ CrossRef ] [ PubMed ]
  • Gleick, P.H.; Cooley, H.S. Energy Implications of Bottled Water. Environ. Res. Lett. 2009 , 4 , 014009. [ Google Scholar ] [ CrossRef ]
  • Garfí, M.; Cadena, E.; Sanchez-Ramos, D.; Ferrer, I. Life Cycle Assessment of Drinking Water: Comparing Conventional Water Treatment, Reverse Osmosis and Mineral Water in Glass and Plastic Bottles. J. Clean. Prod. 2016 , 137 , 997–1003. [ Google Scholar ] [ CrossRef ]
  • Garcia-Suarez, T.; Kulak, M.; King, H.; Chatterton, J.; Gupta, A.; Saksena, S. Life Cycle Assessment of Three Safe Drinking-Water Options in India: Boiledwater, Bottledwater, Andwater Purified with a Domestic Reverse-Osmosis Device. Sustainability 2019 , 11 , 6233. [ Google Scholar ] [ CrossRef ]
  • Circular, G.A. Accelerating the Circular Economy for Post-Consumer PET Bottles in Southeast Asia ; EcoKnights: Singapore, 2019. [ Google Scholar ]
  • Pinter, E.; Welle, F.; Mayrhofer, E.; Pechhacker, A.; Motloch, L.; Lahme, V.; Grant, A.; Tacker, M. Circularity Study on Pet Bottle-to-Bottle Recycling. Sustainability 2021 , 13 , 7370. [ Google Scholar ] [ CrossRef ]
  • Smith, R.L.; Takkellapati, S.; Riegerix, R.C. Recycling of Plastics in the United States: Plastic Material Flows and Polyethylene Terephthalate (PET) Recycling Processes. ACS Sustain. Chem. Eng. 2022 , 10 , 2084–2096. [ Google Scholar ] [ CrossRef ]
  • Andrady, A.L. The Plastic in Microplastics: A Review. Mar. Pollut. Bull. 2017 , 119 , 12–22. [ Google Scholar ] [ CrossRef ]
  • Umoafia, N.; Joseph, A.; Edet, U.; Nwaokorie, F.; Henshaw, O.; Edet, B.; Asanga, E.; Mbim, E.; Chikwado, C.; Obeten, H. Deterioration of the Quality of Packaged Potable Water (Bottled Water) Exposed to Sunlight for a Prolonged Period: An Implication for Public Health. Food Chem. Toxicol. 2023 , 175 , 113728. [ Google Scholar ] [ CrossRef ]
  • Akhbarizadeh, R.; Dobaradaran, S.; Schmidt, T.C.; Nabipour, I.; Spitz, J. Worldwide Bottled Water Occurrence of Emerging Contaminants: A Review of the Recent Scientific Literature. J. Hazard. Mater. 2020 , 392 , 122271. [ Google Scholar ] [ CrossRef ]
  • Schymanski, D.; Goldbeck, C.; Humpf, H.U.; Fürst, P. Analysis of Microplastics in Water by Micro-Raman Spectroscopy: Release of Plastic Particles from Different Packaging into Mineral Water. Water Res. 2018 , 129 , 154–162. [ Google Scholar ] [ CrossRef ]
  • Oßmann, B.E.; Sarau, G.; Holtmannspötter, H.; Pischetsrieder, M.; Christiansen, S.H.; Dicke, W. Small-Sized Microplastics and Pigmented Particles in Bottled Mineral Water. Water Res. 2018 , 141 , 307–316. [ Google Scholar ] [ CrossRef ] [ PubMed ]
  • Kooy, M.; Walter, C.T. Towards a Situated Urban Political Ecology Analysis of Packaged Drinking Water Supply. Water 2019 , 11 , 225. [ Google Scholar ] [ CrossRef ]
  • Ko, S.H.; Sakai, H. Perceptions of Water Quality, and Current and Future Water Consumption of Residents in the Central Business District of Yangon City Myanmar. Water Supply 2022 , 22 , 1094–1106. [ Google Scholar ] [ CrossRef ]
  • Ikhsan, A.N.; Thohira, M.C.; Daniel, D. Analysis of Packaged Drinking Water Use in Indonesia in the Last Decades: Trends, Socio-Economic Determinants, and Safety Aspect. Water Policy 2022 , 24 , 1287–1305. [ Google Scholar ] [ CrossRef ]
  • Prasetiawan, T.; Nastiti, A.; Muntalif, B.S. ‘Bad’ Piped Water and Other Perceptual Drivers of Bottled Water Consumption in Indonesia. Wiley Interdiscip. Rev. Water 2017 , 4 , e1219. [ Google Scholar ] [ CrossRef ]
  • Indonesia Statistics Agency. Percentage Distribution of Households by Province and Source of Drinking Water, 2021 ; Indonesia Statistics Agency: Jakarta, Republic of Indonesia, 2021. Available online: https://www.bps.go.id/id/statistics-table/3/YzBaMlduSlFVbTVrUnpWeU9YRTJka0pVTTFkU1FUMDkjMw==/distribusi-persentase-rumah-tangga-menurut-provinsi-dan-sumber-air-minum.html?year=2021 (accessed on 2 April 2024).
  • Ministry of Public Works and Housing (Indonesia). PDAM Capacity and Services. Ministry of Public Works and Housing, Indonesia. Available online: https://data.pu.go.id/dataset/kapasitas-dan-layanan-pdam (accessed on 29 April 2024).
  • Jambeck, J.R.; Geyer, R.; Wilcox, C.; Siegler, T.R.; Perryman, M.; Andrady, A.; Narayan, R.; Law, K.L. Plastic Waste Inputs from Land into the Ocean. Mar. Pollut. 2015 , 347 , 768–771. [ Google Scholar ] [ CrossRef ] [ PubMed ]
  • Bogor City Statistics Agency. Bogor City Rainfall (Curah Hujan Kota Bogor—In Bahasa Indonesia). Available online: https://bogorkota.bps.go.id/indicator/151/153/1/jumlah-curah-hujan.html (accessed on 28 June 2024).
  • Malang City Statistics Agency. Malang City Rainfall (Curah Hujan Kota Malang—In Bahasa Indonesia). Available online: https://malangkota.bps.go.id/indicator/151/508/1/jumlah-curah-hujan-di-kota-malang.html (accessed on 28 June 2024).
  • Semarang City Statistics Agency. Semarang City Rainfall (Curah Hujan Kota Semarang—In Bahasa Indonesia). Available online: https://semarangkota.bps.go.id/indicator/151/79/1/curah-hujan-kota-semarang.html (accessed on 28 June 2024).
  • Bogor City Statistics Agency. Bogor City Population Based on Age and Gender Groups 2014–2021 (Penduduk Kota Bogor Berdasarkan Kelompok Umur dan Jenis Kelamin, 2014–2021—In Bahasa Indonesia). Available online: https://bogorkota.bps.go.id/indicator/12/31/1/penduduk-kota-bogor-berdasarkan-kelompok-umur-dan-jenis-kelamin.html (accessed on 13 May 2024).
  • Malang City Statistics Agency. Malang City Population Density 2019–2021 (Kepadatan Penduduk Kota Malang—In Bahasa Indonesia). Available online: https://malangkota.bps.go.id/indicator/12/304/1/kepadatan-penduduk-menurut-kecamatan.html (accessed on 13 May 2024).
  • Semarang City Government. Semarang City Population (Jumlah Penduduk Kota Semarang—In Bahasa Indonesia). Available online: https://data.semarangkota.go.id/elemendata/cari?cari=penduduk&tahunAwal=2021&tahunAkhir=2021 (accessed on 13 May 2024).
  • Bogor City Government. Bogor City Water Utility Company Targets 97% Service Coverage in 2019 (PDAM Tirta Pakuan Targetkan Cakupan Layanan 97 Persen di 2019—In Bahasa Indonesia). Available online: https://kotabogor.go.id/index.php/show_post/detail/11460 (accessed on 13 May 2024).
  • Malang City Water Utility Company (PDAM Tugu Tirta Kota Malang. Water Utility Statistics (Statistik PDAM Kota Malang—In Bahasa Indonesia). Available online: https://perumdatugutirta.co.id/info/statistik (accessed on 13 May 2024).
  • Semarang City Water Utility Company (PDAM Tirta Moedal Kota Semarang). Production Infrastructure (Infrastruktur Produksi—In Bahasa Indonesia). Available online: https://pdamkotasmg.co.id/page/instalasi_pengolahan_air (accessed on 13 May 2024).
  • Nastiti, A.; Sudradjat, A.; Geerling, G.W.; Smits, A.J.M.; Roosmini, D.; Muntalif, B.S. The Effect of Physical Accessibility and Service Level of Water Supply on Economic Accessibility: A Case Study of Bandung City, Indonesia. Water Int. 2017 , 42 , 831–851. [ Google Scholar ] [ CrossRef ]
  • Khanal, S.; Kazama, S.; Benyapa, S.; Takizawa, S. Performance Assessment of Household Water Treatment and Safe Storage in Kathmandu Valley, Nepal. Water 2023 , 15 , 2305. [ Google Scholar ] [ CrossRef ]
  • Devesa, R.; Dietrich, A.M. Guidance for Optimizing Drinking Water Taste by Adjusting Mineralization as Measured by Total Dissolved Solids (TDS). Desalination 2018 , 439 , 147–154. [ Google Scholar ] [ CrossRef ]
  • Shuai, Y.; Zhang, K.; Zhu, H.; Lou, J.; Zhang, T. Toward the Upgrading Quality of Drinking Water from Flavor Evaluation: Taste, Feeling, and Retronasal Odor Issues. ACS ES T Eng. 2023 , 3 , 308–321. [ Google Scholar ] [ CrossRef ]
  • Fankhauser, S.; Tepic, S. Can Poor Consumers Pay for Energy and Water? An Affordability Analysis for Transition Countries. Energy Policy 2007 , 35 , 1038–1049. [ Google Scholar ] [ CrossRef ]
  • Prouty, C.; Zhang, Q. How Do People’s Perceptions of Water Quality Influence the Life Cycle Environmental Impacts of Drinking Water in Uganda? Resour. Conserv. Recycl. 2016 , 109 , 24–33. [ Google Scholar ] [ CrossRef ]
  • Etale, A.; Jobin, M.; Siegrist, M. Tap versus Bottled Water Consumption: The Influence of Social Norms, Affect and Image on Consumer Choice. Appetite 2018 , 121 , 138–146. [ Google Scholar ] [ CrossRef ]
  • Nastiti, A.; Muntalif, B.S.; Roosmini, D.; Sudradjat, A.; Meijerink, S.V.; Smits, A.J.M. Coping with Poor Water Supply in Peri-Urban Bandung, Indonesia: Towards a Framework for Understanding Risks and Aversion Behaviours. Environ. Urban 2017 , 29 , 69–88. [ Google Scholar ] [ CrossRef ]
  • Espinosa-García, A.C.; Díaz-Ávalos, C.; González-Villarreal, F.J.; Val-Segura, R.; Malvaez-Orozco, V.; Mazari-Hiriart, M. Drinking Water Quality in a Mexico City University Community: Perception and Preferences. Ecohealth 2015 , 12 , 88–97. [ Google Scholar ] [ CrossRef ]
  • Piriou, P.; Mackey, E.D.; Suffet, I.H.; Bruchet, A. Chlorinous Flavor Perception in Drinking Water. Water Sci. Technol. 2004 , 49 , 321–328. [ Google Scholar ] [ CrossRef ] [ PubMed ]
  • Mackey, E.D.; Baribeau, H.; Crozes, G.F.; Suffet, I.H.; Piriou, P. Public Thresholds for Chlorinous Flavors in U.S. Tap Water. Water Sci. Technol. 2004 , 49 , 335–340. [ Google Scholar ] [ CrossRef ] [ PubMed ]
  • Wening Purwandari, T.; Kazama, S.; Takizawa, S. Water Consumption Analysis of Small Islands Supplied with Desalinated Water in Indonesia. J. Jpn. Soc. Civ. Eng. Ser. G (Environ. Res.) 2021 , 77 , 129–140. [ Google Scholar ] [ CrossRef ] [ PubMed ]
  • Prayoga, R.; Nastiti, A.; Schindler, S.; Kusumah, S.W.D.; Sutadian, A.D.; Sundana, E.J.; Simatupang, E.; Wibowo, A.; Budiwantoro, B.; Sedighi, M. Perceptions of Drinking Water Service of the ‘off-Grid’ Community in Cimahi, Indonesia. Water 2021 , 13 , 1398. [ Google Scholar ] [ CrossRef ]
  • Javidi, A.; Pierce, G.U.S. Households’ Perception of Drinking Water as Unsafe and Its Consequences: Examining Alternative Choices to the Tap. Water Resour. Res. 2018 , 54 , 6100–6113. [ Google Scholar ] [ CrossRef ]