How to Use Stem-and-Leaf Plots
Why are stem-and-leaf plots useful? How are they related to the kind of stems and leaves that fall from trees this time of year?
Jason Marshall, PhD
As we discussed last time, my backyard and, thanks to my young daughter, even small portions of my house are currently flooded with fall leaves. All of this lovely foliage I’m living in and amongst has gently raked me into a rather pensive mood … and has me contemplating some of the intersections between math and leaves.
That’s why we spent time last week talking about one of these math/leaf connections known as the stem-and-leaf plot. If you joined us for that, you’ll remember that we’ve thus far learned how to make stem-and-leaf plots, but we haven’t yet had a chance to dig in and figure out exactly why such plots might be useful.
So, what are stem-and-leaf plots good for? What can they tell us? And why might we want to make them? Let’s find out.
Why “Stems” and “Leaves”?
Let’s kick things off with a recap of the basics of making stem-and-leaf plots and a look at why they’re called “stem-and-leaf” plots in the first place. As we’ve learned, a stem-and-leaf plot is really just a two-column table: the first column contains the “stems” made up of the first digit (or perhaps the first several digits) of your numerical data, and the second column contains horizontal lists of “leaves” made from the last digits of your data points. So there are multiple “leaves” for each “stem” of data, just as there are multiple leaves on each branch of a tree … get it?
To illustrate what this all means in practice, we imagined measuring the widths of a bunch of leaves, all of which were between 7 and 23 cm wide. To make a stem-and-leaf plot of this data, you begin by filling the first column with the “stems” 0, 1, and 2. The “stem” labeled “0” is for all the leaves with single-digit widths between 0 and 9 cm, the “stem” labeled “1” is for all the leaves with widths between 10 and 19 cm, and the “stem” labeled “2” is for all the leaves with widths between 20 and 29 cm.
After you’ve made your “stems,” you just have to add the “leaves” to your plot. If one of the leaves you’ve measured has a width of 14 cm, you would write 4 in the second column of the row with 1 in the first column. For a 7 cm wide leaf, you would write 7 in the second column of the row with 0 in the first column. And so on for each leaf in your data set.
Once you’ve re-sorted your plot so that all the “leaves” on each “stem” (which, remember, represent the widths of actual leaves in our example) are sorted from smallest to largest, your stem-and-leaf plot is finished.
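If you'd like to see those steps spelled out, here is a minimal Python sketch of one way to do it. The leaf widths are made up for illustration (the only thing they share with our example is that they fall between 7 and 23 cm), and the function name stem_and_leaf is just a label chosen here. The sketch assumes whole-number widths with one- or two-digit values, so the last digit is the "leaf" and everything before it is the "stem."

```python
from collections import defaultdict

# Hypothetical leaf widths in cm, invented for this illustration.
widths = [7, 9, 12, 14, 14, 15, 17, 18, 21, 23, 14, 11]


def stem_and_leaf(values):
    """Return {stem: sorted leaves} for non-negative whole numbers."""
    plot = defaultdict(list)
    for value in values:
        # divmod(14, 10) gives (1, 4): stem 1, leaf 4.
        stem, leaf = divmod(value, 10)
        plot[stem].append(leaf)
    # Keep every stem from smallest to largest (even empty ones),
    # and sort the leaves on each stem from smallest to largest.
    return {stem: sorted(plot[stem])
            for stem in range(min(plot), max(plot) + 1)}


plot = stem_and_leaf(widths)
for stem, leaves in plot.items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
```

Each printed row is one "stem" followed by its "leaves," which is exactly the two-column table described above.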
What Can We Learn From Stem-and-Leaf Plots?
But the big question is: What can you learn from such a plot? The answer is: Quite a lot. For example, imagine that you want to find the most common leaf width, or the width that sits right in the middle of all the others. In other words, you want to find the statistical quantities known as the mode (the most common value) and the median (the middle value) of your data set. How can you do it?
One way would be to write out all the values in your data set in a big long list ordered from smallest to largest. Then, to find the median value, you just have to count out and find the value that’s right in the middle of the list (or, if the list has an even number of values, the average of the two middle values).
While it’s also possible to find the mode this way (remember, the mode is the value that occurs most frequently), it’s a bit painful to do. To find the mode using your ordered list, you would have to look through the entire thing and constantly keep track of which number has occurred the most. It’s doable, but it’s a lot to keep track of and it’s not exactly the most convenient thing in the world to do in your head.
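To make the "big long list" approach concrete, here is a small Python sketch. The widths are hypothetical values invented for illustration. The median comes from Python's standard statistics module, and the mode is found by tallying how often each value occurs, which is the bookkeeping that's so tedious to do in your head.

```python
from collections import Counter
from statistics import median

# Hypothetical leaf widths in cm, invented for this illustration.
widths = [7, 9, 12, 14, 14, 15, 17, 18, 21, 23, 14, 11]

# Step 1: the big long list, ordered from smallest to largest.
ordered = sorted(widths)

# Step 2: the median is the middle of the ordered list. With an even
# number of values, median() averages the two middle ones.
middle = median(ordered)

# Step 3: the mode needs a running tally of every value.
tally = Counter(ordered)
mode_value, mode_count = tally.most_common(1)[0]

print("ordered:", ordered)
print("median:", middle)
print("mode:", mode_value, "which occurs", mode_count, "times")
```

If two values tie for most frequent, this sketch reports only one of them, so treat it as a starting point rather than a complete treatment of ties.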
What other options do you have?
How to Use Stem-and-Leaf Plots
If you instead start by creating a stem-and-leaf plot, you’ll be able to find the median and especially the mode a lot more easily. The organization provided by the two columns of the stem-and-leaf plot makes it a little bit easier to figure out which number is in the middle of the data set (that’s the median), and a whole lot easier to find the mode.
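Here is a rough Python sketch of how reading the median off a finished plot might work. The plot below is a hypothetical one (a dictionary mapping each "stem" to its sorted "leaves"), and median_from_plot is simply a name chosen for this example. The idea is that reading the plot top to bottom and left to right already gives you the values in order, so all that's left is counting to the middle.

```python
# A hypothetical finished stem-and-leaf plot: stem -> sorted leaves.
plot = {0: [7, 9], 1: [1, 2, 4, 4, 4, 5, 7, 8], 2: [1, 3]}


def median_from_plot(plot):
    """Find the median by reading values off the plot in order."""
    # Top to bottom, left to right: stem 1 with leaf 4 is the value 14.
    values = [stem * 10 + leaf
              for stem in sorted(plot)
              for leaf in plot[stem]]
    count = len(values)
    mid = count // 2
    if count % 2 == 1:
        return values[mid]
    # Even number of values: average the two in the middle.
    return (values[mid - 1] + values[mid]) / 2


print("median:", median_from_plot(plot))
```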
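Why the big improvement in finding the mode? If you think about it, you’ll see that the organization provided by the plot makes it easy to see which “leaf” is repeated most often within a single “stem” row, since identical values sit side by side once the leaves are sorted. That most-repeated “leaf,” combined with its “stem,” is the mode of the data set. (The same leaf digit on two different stems represents two different values, so compare repeats one row at a time.) So stem-and-leaf plots are great for investigating the frequency with which different values occur.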
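For the curious, here is one way that "look for the most-repeated leaf" idea could be sketched in Python. The plot is a hypothetical one invented for illustration, and mode_from_plot is just a name picked for this example. It checks each row separately, because a 4 on the "1" stem (14 cm) and a 4 on the "2" stem (24 cm) are different widths.

```python
from collections import Counter

# A hypothetical finished stem-and-leaf plot: stem -> sorted leaves.
plot = {0: [7, 9], 1: [1, 2, 4, 4, 4, 5, 7, 8], 2: [1, 3]}


def mode_from_plot(plot):
    """Return (value, count) for the most-repeated leaf on any one stem."""
    best_value, best_count = None, 0
    for stem, leaves in plot.items():
        if not leaves:
            continue  # an empty stem has nothing to count
        # Most-repeated leaf within this row only.
        leaf, count = Counter(leaves).most_common(1)[0]
        if count > best_count:
            best_value, best_count = stem * 10 + leaf, count
    return best_value, best_count


value, count = mode_from_plot(plot)
print("mode:", value, "which occurs", count, "times")
```

When two values tie, this sketch keeps whichever one it meets first, so a data set with more than one mode would need a little extra care.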
What if you were even more curious about leaf widths and wanted to know if most of the widths are clustered around the most common value? Or if they’re spread out over a broad range of values? In other words, what if you also wanted to know about the shape of the distribution of leaf widths? A stem-and-leaf plot can help here too since it will immediately show you if most of the values are clustered together, or if they’re evenly spread out over the different “stems,” or anything else. So stem-and-leaf plots are also great for investigating the “shape” of your data set.
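One simple way to put a number on that "shape" is to count how many leaves hang off each stem. The Python sketch below does this for a hypothetical plot invented for illustration; a row with many leaves means the values are clustered there, while rows of similar length suggest the values are spread out evenly.

```python
# A hypothetical finished stem-and-leaf plot: stem -> sorted leaves.
plot = {0: [7, 9], 1: [1, 2, 4, 4, 4, 5, 7, 8], 2: [1, 3]}

# How many data points sit on each stem?
counts = {stem: len(leaves) for stem, leaves in plot.items()}
total = sum(counts.values())

for stem, n in counts.items():
    # One star per leaf, so longer rows stand out at a glance.
    print(f"{stem} | {'*' * n} ({n} of {total})")

# The stem holding the most leaves is where the data cluster.
busiest = max(counts, key=counts.get)
print("busiest stem:", busiest)
```

In this made-up data set, most of the widths pile up on the "1" stem, with only a couple of leaves on either side.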
Alternatives to the Stem-and-Leaf Plot
While stem-and-leaf plots are a useful tool, they’re not the only way you can investigate these types of questions about a set of data. The other big technique for studying this type of data—in particular for studying the distribution or shape of a data set—is to use something called a histogram.
What’s that? Well, unfortunately—as so often happens—we’re all out of time for today. So the answer to that question will have to wait until next time.
Wrap Up
Until next time, this is Jason Marshall with The Math Dude’s Quick and Dirty Tips to Make Math Easier. Thanks for reading, math fans!
Leaf image from Shutterstock.