As before, it all starts with a picture…so let’s look at some of the options when graphing quantitative variables.
Some people (mistakenly) use the terms bar chart and histogram synonymously, but they are not the same thing. For a histogram, the horizontal axis is a quantitative variable, and the vertical axis is the frequency (count). The y-axis can also be labeled with Relative Frequency (percent), Cumulative Frequency (total count for this, and all previous groups), or Cumulative Relative Frequency (total percent for this, and all previous groups).
The variable’s values are divided into groups (bins, intervals, classes, buckets; there are many different names), and a bar is drawn (of the appropriate height) showing the amount (count, percent, etc.) of the data that fall in that group. The bars will touch, unless there were no data in a group (leaving an empty spot where a bar might have been).
Figure 1 - A Frequency Histogram
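If you are curious what the technology is doing behind the scenes, here is a minimal Python sketch of the grouping step. The data are the thirteen values shown in the stem and leaf example later in these notes; the function name `bin_counts`, the starting value of 70, and the group width of 10 are my own choices for illustration. I am also assuming that a datum sitting exactly on a boundary belongs to the group on its right.

```python
from collections import Counter

# Values read off the stem and leaf example in these notes
# (calories per gram of cement).
heat = [73, 74, 79, 84, 88, 93, 96, 103, 104, 109, 109, 113, 116]

def bin_counts(data, start, width):
    """Count the data in each group [start + k*width, start + (k+1)*width).

    Assumes start is no larger than the smallest datum.
    Returns a list of (lower boundary, upper boundary, count) tuples,
    including groups whose count is zero.
    """
    counts = Counter((x - start) // width for x in data)
    n_groups = max(counts) + 1
    return [
        (start + k * width, start + (k + 1) * width, counts.get(k, 0))
        for k in range(n_groups)
    ]

# A crude sideways histogram: one '#' per datum in the group.
for lo, hi, count in bin_counts(heat, start=70, width=10):
    print(f"[{lo}, {hi}): {'#' * count} {count}")
```

Each printed row plays the role of one bar, and a row with no `#` marks would be one of those empty spots where a bar might have been.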
We could get into a lot of detail about dividing the data into those groups, and where the boundaries of those groups are…but let’s not. Let’s use technology to create our histograms.
Let’s focus on getting the technology to produce a good histogram. For that, I have a fairly simple algorithm. This procedure works for the TI 83/84 series of calculators.
First (naturally), put the data into your calculator. Next, select a histogram from the Stat Plot menu. From the ZOOM menu, choose Zoom Stat.
Alas, this is almost certainly a bad histogram. Let’s take a minute to fix a couple of things to create a good histogram.
Press the WINDOW button. The values you see here were put there by the calculator, and two of them control how the graph looks (and need to be changed).
Scroll down to XSCL and choose a value that you could count in multiples of without much difficulty (5 and 10 are great choices here). Larger values here will create fewer bars in the graph; smaller values will create more bars in the graph.
HOLLOMAN’S AP STATISTICS
BVD CHAPTER 04, PAGE 1 OF 13
Next, scroll to XMIN and make this number a multiple of the value you have for XSCL. If you must change the value, be sure to make it smaller than the value that was there in the first place! The calculator automatically sets this value to the smallest datum, so if you put in something higher, you will lose part of the graph.
Finally, press GRAPH. You should now have a good histogram—or at least a better one. You might want to go back to WINDOW and change YMAX (higher) to be able to see the top of every bar, or change XMAX (higher) to see all of the rightmost bar.
Ideally, a good histogram has between 5 and 15 bars. Fewer than 5 bars makes it hard to say anything interesting about the variable; more than 15 bars creates a confusing jumble of bars that are hard to interpret.
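The XSCL and XMIN adjustments, along with the 5 to 15 bar guideline, can be sketched in Python. This is only an illustration of the arithmetic, not what the calculator runs internally: the function name `histogram_window` is hypothetical, the data are the values from the stem and leaf example in these notes, and I count empty groups between the smallest and largest datum as bars.

```python
import math

# Values from the stem and leaf example in these notes.
heat = [73, 74, 79, 84, 88, 93, 96, 103, 104, 109, 109, 113, 116]

def histogram_window(data, xscl):
    """Suggest window settings for a chosen bar width (xscl).

    xmin is rounded DOWN to a multiple of xscl, so it is never larger
    than the smallest datum. xmax is the right edge of the last bar.
    """
    xmin = math.floor(min(data) / xscl) * xscl
    n_bars = math.floor((max(data) - xmin) / xscl) + 1
    xmax = xmin + n_bars * xscl
    return xmin, xmax, n_bars

for xscl in (5, 10, 20):
    xmin, xmax, n_bars = histogram_window(heat, xscl)
    verdict = "ok" if 5 <= n_bars <= 15 else "outside the 5 to 15 guideline"
    print(f"XSCL={xscl}: XMIN={xmin}, XMAX={xmax}, bars={n_bars} ({verdict})")
```

Trying a few values of XSCL this way shows the trade-off directly: a larger XSCL gives fewer bars, a smaller one gives more.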
All of this creates a frequency histogram: a histogram where frequency (count) is the vertical axis. I said earlier that there were lots of possible options for the y-axis…and each of those options creates a slightly different kind of histogram. It’s probably more important that you can read these other types, rather than create them.
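To help with reading those other types, here is a short Python sketch that turns one set of counts into the other three kinds of bar heights. The counts are the number of leaves on each stem in the stem and leaf example in these notes (one group per stem); the function name `histogram_heights` is just a label I made up.

```python
from itertools import accumulate

# Frequencies for the groups 70-79, 80-89, 90-99, 100-109, 110-119,
# counted from the stem and leaf example in these notes.
counts = [3, 2, 2, 4, 2]

def histogram_heights(counts):
    """Return the four possible sets of bar heights for the same groups."""
    n = sum(counts)
    cumulative = list(accumulate(counts))  # running total of the counts
    return {
        "frequency": counts,
        "relative frequency": [c / n for c in counts],
        "cumulative frequency": cumulative,
        "cumulative relative frequency": [c / n for c in cumulative],
    }

for name, heights in histogram_heights(counts).items():
    print(name, [round(h, 3) for h in heights])
```

Notice that the cumulative versions can never go down from one bar to the next, and the last cumulative relative frequency is always 1 (100 percent).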
That’s a lot of detail about histograms—take that as a hint about how important they are in the grand scheme of statistics.
Also called a stem and leaf plot, this uses the actual data and place value to create a graph (which is not terribly different from a histogram!). Here’s an example:
Heat Emitted
 7 | 349
 8 | 48
 9 | 36
10 | 3499
11 | 36
*where 7|3 means 73 calories per gram of cement.
Turn your head to the right and you get something that looks a lot like a histogram!
The rightmost digit of the data becomes the leaf (on the right of the vertical bar), and the remainder of the digits of the data become the stems.
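The splitting rule just described can be sketched in a few lines of Python. This is a minimal version that assumes non-negative whole-number data; the function name `stem_and_leaf` is hypothetical, and the data are the values from the example above.

```python
from collections import defaultdict

# Values from the Heat Emitted example (calories per gram of cement).
heat = [73, 74, 79, 84, 88, 93, 96, 103, 104, 109, 109, 113, 116]

def stem_and_leaf(data):
    """Build the rows of a stem and leaf plot for non-negative integers."""
    leaves_by_stem = defaultdict(list)
    for x in sorted(data):
        stem, leaf = divmod(x, 10)  # rightmost digit is the leaf
        leaves_by_stem[stem].append(leaf)
    rows = []
    # Include every stem from smallest to largest, even one with no leaves,
    # so that gaps in the data stay visible.
    for stem in range(min(leaves_by_stem), max(leaves_by_stem) + 1):
        leaves = "".join(str(leaf) for leaf in leaves_by_stem.get(stem, []))
        rows.append(f"{stem:>2} | {leaves}")
    return rows

print("\n".join(stem_and_leaf(heat)))
```

Sorting the data first is what puts the leaves in increasing order along each row.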
The stems take the place of the groups from