Data and Graphs: Overview

Pick up a newspaper or a news magazine. Turn on a TV or radio newscast or sports program. You're surrounded by data, much of it shown in graphic form. With so much data available, people have developed charts, tables, and graphs to organize the information and make it easy for others to understand.

As students work with data, statistics, and probability, they should become familiar with many vocabulary words, most of which are also used in everyday life. Students also encounter many different kinds of graphs, so there is a lot to learn.

One efficient way to display data is with a stem-and-leaf plot. Imagine 10 people sitting around a table at a family reunion. Their ages are 32, 1, 45, 37, 8, 9, 55, 81, 34, and 51. You might make a stem-and-leaf plot using the tens digits from their ages as the “stem” and the ones digits as the “leaves.” The plot would look like this.

Ages of People at Family Reunion

Stem   Leaf
0      1, 8, 9
3      2, 4, 7
4      5
5      1, 5
8      1

Notice that in the stem-and-leaf plot both the tens and ones digits are in order from least to greatest. Where parts of the display have no data—no one was in their 20s or 60s for example—there is a gap. Where many data are close together—ages under ten and ages in the 30s—there are clusters. Stem-and-leaf plots help us order data and show a great deal of data in a small space.
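The grouping described above can also be sketched in a few lines of Python. This is only an illustration, not part of the original lesson: the function name `stem_and_leaf` is made up for this example, and it assumes whole-number ages below 100, so that the tens digit is the stem and the ones digit is the leaf.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Group each value's ones digit (leaf) under its tens digit (stem).

    Hypothetical helper for illustration; assumes non-negative whole numbers.
    """
    plot = defaultdict(list)
    for value in sorted(values):          # sorting keeps the leaves in order
        stem, leaf = divmod(value, 10)    # 37 -> stem 3, leaf 7
        plot[stem].append(leaf)
    return dict(plot)


# Ages of the ten people at the family reunion (from the text above)
ages = [32, 1, 45, 37, 8, 9, 55, 81, 34, 51]
plot = stem_and_leaf(ages)

print("Stem   Leaf")
for stem in sorted(plot):
    leaves = ", ".join(str(leaf) for leaf in plot[stem])
    print(f"{stem:<6} {leaves}")
```

Stems with no data, such as 2 and 6, never appear in the result, which corresponds to the gaps mentioned above.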

Two sets of data may be shown on a double bar graph. For example, at the family reunion, it was found that four surnames (Brown, Kildare, Roberts, and O'Hara) were represented. The numbers of males and females with each family name could be shown in a double bar graph such as this one.

More data about people at the family reunion could be shown in a frequency table and histogram, both of which use 10-year intervals for ages. In the frequency table, tallies show the numbers of people in various age groups.

Ages of People at Family Reunion

Interval (years)   Tally Marks              Frequency
0 – 9              ||||| |||                8
10 – 19            ||||| |||||              10
20 – 29            ||||| ||||| ||||| |||    18
30 – 39            ||||| ||||| |||||        15
40 – 49            ||||| |                  6
50 – 59            ||||| ||||               9
60 – 69            |||                      3
70 – 79            |||||                    5
80 – 89            |                        1

The same data is shown on a histogram, a display that is like a bar graph except that it shows interval data.
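As a rough sketch of how interval data turns into histogram bars, the Python below copies the frequencies from the table and prints one bar per 10-year interval. The names `interval_for` and `text_histogram` are invented for this example, and the bars are drawn with `#` characters rather than as a real chart.

```python
def interval_for(age, width=10):
    """Return the label of the equal-width interval that contains `age`.

    Hypothetical helper: 37 falls in the interval "30-39".
    """
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def text_histogram(frequencies):
    """Build one text bar per interval; the bar length is the frequency."""
    return [f"{label:>5} | {'#' * count}" for label, count in frequencies.items()]


# Frequencies copied from the table above (10-year age intervals)
frequencies = {
    "0-9": 8, "10-19": 10, "20-29": 18, "30-39": 15, "40-49": 6,
    "50-59": 9, "60-69": 3, "70-79": 5, "80-89": 1,
}

rows = text_histogram(frequencies)
total = sum(frequencies.values())  # number of people counted in the table

for row in rows:
    print(row)
print("Total people:", total)
```

Because every interval has the same width, the bar lengths can be compared directly, which is what separates a histogram from an ordinary bar graph of unrelated categories.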

With numerical data, the terms mean, median, and mode tell us about a typical value. The range tells us how far the data is “spread out”: it is the difference between the greatest and the least numbers in a set of data. Suppose that the following numbers show the number of miles driven by seven families who came to the family reunion:

5, 120, 36, 36, 97, 509, 247

Put the numbers in order. It's easy to see that the numbers of miles vary from 5 to 509. The range is the difference between the least and greatest numbers: 509 − 5 = 504.

The mode is the number that occurs most often. Sets of data may have more than one mode, or they may have no mode if all the numbers are different. The mode of this data set is 36.

The median is the middle value in a set of data in which the data are arranged in order. If there is an even number of data, the median is the mean, or average, of the middle two numbers. Following are the numbers in order:

5, 36, 36, 97, 120, 247, 509

Since there are 7 numbers, the fourth (97) is the middle number, or the median.

The mean is the average of the numbers. To find the mean, add all the numbers and divide by the number of addends. The mean of the numbers is 1,050 ÷ 7 = 150.
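The four measures worked out above can be checked with Python's standard `statistics` module. This is a sketch for verification only; the variable names are illustrative, and `multimode` assumes Python 3.8 or later.

```python
from statistics import mean, median, multimode

# Miles driven by the families (from the text above)
miles = [5, 120, 36, 36, 97, 509, 247]

ordered = sorted(miles)                 # 5, 36, 36, 97, 120, 247, 509
data_range = ordered[-1] - ordered[0]   # greatest minus least
modes = multimode(miles)                # every value that occurs most often
middle = median(miles)                  # middle value of the ordered data
average = mean(miles)                   # sum divided by the number of addends

print("Range: ", data_range)
print("Mode:  ", modes)
print("Median:", middle)
print("Mean:  ", average)
```

`multimode` returns a list because a data set may have more than one mode. Here the list holds the single value 36.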

Another form of graph, the line plot, makes it easy to see the mean, median, mode, and range of a set of data. The line plot uses a number line and Xs to represent each number. This line plot shows these scores from the game of horseshoes at the family reunion: 18, 14, 22, 33, 7, 7, 30, 34, 30, 10, 15.

The data has two modes, 7 and 30. The line plot lets you see these easily, because the numbers are represented by Xs. The range is 34 − 7, or 27. The median, the middle value, is the sixth X, or 18. To find the mean, add the scores and divide by the number of scores (220 ÷ 11 = 20).
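A text version of the line plot can be sketched in Python, along with a check of the values above. The function `line_plot_rows` is a made-up name for this example, and the plot is printed sideways, with one row per score from least to greatest, rather than along a horizontal number line.

```python
from collections import Counter
from statistics import mean, median, multimode

# Horseshoe scores from the family reunion (from the text above)
scores = [18, 14, 22, 33, 7, 7, 30, 34, 30, 10, 15]


def line_plot_rows(values):
    """Return one row per whole number from min to max, with an X per score.

    Hypothetical helper for illustration; assumes whole-number scores.
    """
    counts = Counter(values)  # a Counter gives 0 for scores that never occur
    return [
        f"{value:>3} | {'X' * counts[value]}"
        for value in range(min(values), max(values) + 1)
    ]


rows = line_plot_rows(scores)
for row in rows:
    print(row)

modes = sorted(multimode(scores))         # scores with the tallest stack of Xs
score_range = max(scores) - min(scores)   # distance from the first row to the last
middle = median(scores)                   # sixth of the eleven ordered scores
average = mean(scores)                    # 220 divided by 11

print("Modes:", modes, "Range:", score_range, "Median:", middle, "Mean:", average)
```

The rows for 7 and 30 each carry two Xs, which is how the two modes stand out at a glance.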

Line graphs show change over time. Broken lines show increases or decreases in data, and the slope of the lines tells whether the change is gradual or rapid. Horizontal line segments show periods of no change. This graph shows temperatures over an 8-hour period at the family reunion.

How can the graph be interpreted? The temperature rose slowly from 8:00 to 9:00 a.m., and then rose more rapidly from 9:00 until 10:00 a.m. and from 10:00 until 11:00 a.m. The change from 11:00 a.m. until noon was a more gradual increase. The temperature remained constant from noon until 1:00 p.m. and also from 3:00 to 4:00 p.m. It gradually fell from 1:00 p.m. until 3:00 p.m.

Double line graphs let people compare two sets of data over time. Often they use two colors and a code to show the sets of data. Here is a double line graph showing temperatures in indoor and outdoor areas at the family reunion.

Even a quick look at the graph shows more variation, or "ups and downs," in outdoor temperatures than in indoor temperatures. You can see that the indoor temperature, about 72°F, was higher than the outdoor temperature at 8:00 a.m. As the day progressed, the indoor temperature rose to 75°F, then stayed constant from noon until 3:00 p.m., when it fell 2 degrees, then rose again to 74°F at 4:00 p.m. You can visually find the greatest difference in temperatures at noon, a difference of about 12 degrees.

With so many types of graphs available, students must learn how to choose appropriate types. Bar graphs are often a good choice to show comparisons among data. Histograms show data that is organized in equal intervals. Line graphs and double line graphs are well suited to showing change over time. Pictographs are appropriate when the data are multiples of a number—and where pictures help to convey information. Circle graphs are well suited for showing parts of a whole.