A PRIMER ON HYPOTHESIS TESTING FOR MEANS
Wanting to have as small a carbon footprint as possible, your best friend Sam Quint owns a Toyota
Prius. Quint claims that the average gas mileage of all the Priora in the world is µ = 37 miles per gallon
(MPG). (Note: according to The Boston Globe, the plural form of “Prius” is “Priora.”) You believe this figure is likely true since, along with being an expert on large sharks and fishing boats that could be bigger, Quint is reasonably knowledgeable about cars and car mileage. You currently own a Honda
Civic and are thinking of trading it in for a Prius. But your Civic gets pretty good mileage and you wouldn’t want to trade it in unless the Prius gets significantly more than 37 MPG on average.
Of course you will never know the true average gas mileage µ for Priora because you can’t examine all
Priora in the world. But you could take a random sample of Priora, calculate the average gas mileage
X̄ for the sample, and then see what X̄ might tell you about µ. So assume you can get a phone list of all of the Priora owners in the world. You take a random sample of n = 36 of them, call them up, and ask each what average gas mileage they get. You enter the data into Excel and run Descriptive
Statistics. The results are as follows:
© 2013 by the Kenan-Flagler Business School, University of North Carolina, Chapel Hill, NC 27599-3490. Not to be reproduced
without permission. All rights reserved.
This technical note was prepared by Alan W. Neebe and Adam J. Mersereau as a basis for a class discussion rather than to illustrate the effective or ineffective handling of an administrative situation.
You see that the sample mean X̄ = 37.6458, which is more than 37. So you have some evidence that µ is more than 37. On the other hand, you know that gas mileages of Priora in the real world vary, so the fact that your sample of Priora happens to average more than 37 is not compelling by itself. You might have just gotten lucky with your sample!
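To make "getting lucky" concrete, here is a minimal simulation sketch in Python. It supposes that µ really is 37 and counts how often a random sample of 36 cars would still average at least 37.6458. The note does not supply a population standard deviation at this point, so the value SIGMA = 4.5 below is purely hypothetical, as is the choice of a normal distribution for individual mileages. The function names are illustrative only.

```python
import random
import statistics

MU = 37.0                # Quint's claimed population mean (from the note)
N = 36                   # sample size (from the note)
OBSERVED_MEAN = 37.6458  # sample mean reported in the note
SIGMA = 4.5              # ASSUMPTION: hypothetical population std dev, not from the note


def simulate_sample_means(mu, sigma, n, trials, seed=0):
    """Draw `trials` samples of size n from a normal population; return each sample's mean."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(trials)
    ]


def share_at_least(means, threshold):
    """Fraction of simulated sample means that are at or above the threshold."""
    return sum(m >= threshold for m in means) / len(means)


means = simulate_sample_means(MU, SIGMA, N, trials=20_000)
lucky_share = share_at_least(means, OBSERVED_MEAN)
print(f"Share of samples averaging at least {OBSERVED_MEAN}: {lucky_share:.3f}")
```

Under these assumed values, roughly one simulated sample in five averages 37.6458 or more even though the true mean is exactly 37. A smaller assumed SIGMA would make such a sample rarer, and a larger one would make it more common, which is why the spread of the data matters as much as the gap between 37.6458 and 37.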
The question is whether you have convincing (or compelling) evidence. It is not whether 37.6458 is more than 37 (which it obviously is), but whether 37.6458 is significantly more than 37. Answering it requires that we delve into the world of hypothesis testing and the notion of a significant difference. At a high level, hypothesis testing is pretty intuitive. If the sample mean X̄ is much larger than 37, we’ll declare it significantly more. If X̄ is only a little bit larger than 37 (or if it’s smaller than 37), we will conclude that the evidence is insufficient. The mechanics of hypothesis testing quantify what we mean by “significance” and determine how much larger counts as “much larger.”
Hypothesis testing is a thought experiment. We suppose that we know the true mean. We then take a look at the sample and ask ourselves, how likely is it to see such a sample given this true mean? If it is likely, then we have no reason to doubt the mean we’re assuming. If, on the other hand, the sample is highly unlikely or rare, there are two possibilities. The first is that we’re assuming the correct mean and the sample was a fluke. The second is that we’re in fact assuming the wrong mean. We’ll conclude the latter when we find that