13.11 Stem-and-Leaf Plots and Histograms
Stem-and-Leaf Plots and Histograms
Imagine asking a class of 20 algebra students how many brothers and sisters they have. You would probably get a range of answers from zero on up. Some students would have no siblings, but most would have at least one. The results might look like this:
1, 4, 2, 1, 0, 2, 1, 0, 1, 2, 1, 0, 0, 2, 2, 3, 1, 1, 3, 6
We could organize this information in many ways. The first way might just be to create an ordered list, relisting all the numbers in order, starting with the smallest:
0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 4, 6
Another way to list the results is in a table:
Number of siblings    Number of matching students
0                     4
1                     7
2                     5
3                     2
4                     1
5                     0
6                     1
We could also make a visual representation of the data by making categories for the number of siblings on the x-axis, and stacking a representation of each student above the category marker. We could use crosses, stick figures or even photographs of the students to show how many students are in each category.
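The tally in the table can also be produced mechanically. The following is a minimal Python sketch, not part of the original lesson; the names `siblings` and `frequency_table` are chosen here purely for illustration.

```python
from collections import Counter

# Survey answers from the 20 algebra students, in the order collected.
siblings = [1, 4, 2, 1, 0, 2, 1, 0, 1, 2, 1, 0, 0, 2, 2, 3, 1, 1, 3, 6]

counts = Counter(siblings)

# Walk every value from the smallest to the largest answer so that
# values nobody gave (such as 5) still appear with a frequency of 0.
frequency_table = {
    value: counts[value]
    for value in range(min(siblings), max(siblings) + 1)
}

print("Number of siblings  Number of matching students")
for value, frequency in frequency_table.items():
    print(f"{value:<20}{frequency}")
```

Listing the empty category explicitly keeps the gap at 5 siblings visible, which matters later when the same counts are drawn as columns.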
Make and Interpret Stem-and-Leaf Plots
Another useful way to display data is with a stem-and-leaf plot. Stem-and-leaf plots are especially useful because they give a visual representation of how the data is clustered, but preserve all of the numerical information. A stem-and-leaf plot consists of a vertical "stem" containing the first digit of each number, with the rest of each number written to the right of the stem like a "leaf." In the stem-and-leaf plot below, the first number represented is 21. It is the only number with a stem of 2, so it is the only number in the 20s. The next two numbers have a common stem of 3. They are 33 and 36. The next numbers are 40, 46 and 47.
Stem-and-leaf plots have a number of advantages over simply listing the data in a single line.
- They show how the data is distributed, and whether it is symmetric around the center.
- They can be built up while the data is being collected.
- They make it easy to determine the median and mode.
Stem-and-leaf plots are not ideal for all situations; in particular they are not practical when the data is too tightly clustered. For example, with the data above about students’ siblings, all the data points would occupy the same stem (zero). In that case, no additional information could be gained from a stem-and-leaf plot.
Creating a Stem-and-Leaf Plot
While traveling on a long train journey, Rowena collected the ages of all the passengers traveling in her carriage. The ages of the passengers are shown below. Arrange the data into a stem-and-leaf plot, and use the plot to find the median and mode ages.
35, 42, 38, 57, 2, 24, 27, 36, 45, 60, 38, 40, 40, 44, 1, 44, 48, 84, 38, 20, 4, 2, 48, 58, 3, 20, 6, 40, 22, 26, 17, 18, 40, 51, 62, 31, 27, 48, 35, 27, 37, 58, 21
The first step is to determine a sensible stem. Since all the values fall between 1 and 84, the stem should represent the tens column, and run from 0 to 8 so that the numbers represented can range from 00 (which we would represent by placing a leaf of 0 next to the 0 on the stem) to 89 (a leaf of 9 next to the 8 on the stem). We then go through the data and fill out our plot:
Stem | Leaf
0 | 2 1 4 2 3 6
1 | 7 8
2 | 4 7 0 0 2 6 7 7 1
3 | 5 8 6 8 8 1 5 7
4 | 2 5 0 0 4 4 8 8 0 0 8
5 | 7 8 1 8
6 | 0 2
7 |
8 | 4
You can see immediately that the interval with the largest number of passengers is the 40-49 group. In order to correctly determine the median and the mode, it is helpful to construct a second, ordered stem-and-leaf plot, placing the leaves on each branch in ascending order:
Stem | Leaf
0 | 1 2 2 3 4 6
1 | 7 8
2 | 0 0 1 2 4 6 7 7 7
3 | 1 5 5 6 7 8 8 8
4 | 0 0 0 0 2 4 4 5 8 8 8
5 | 1 7 8 8
6 | 0 2
7 |
8 | 4
The mode is now apparent: there are 4 zeros in a row on the 4-branch, so the mode is 40. The median is the middle value; since there are 43 data points, the median is the 22nd value. (Using our formula from earlier, (43 + 1)/2 = 22.) Counting along the ordered leaves, the 22nd value is 37, so the median is 37.
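The ordering and counting steps can be checked with a short program. This is a minimal Python sketch under stated assumptions: the helper name `ordered_stem_and_leaf` is hypothetical, the stem is taken to be the tens digit as in the example, and the median position formula used is the one for an odd number of values.

```python
from collections import Counter

ages = [35, 42, 38, 57, 2, 24, 27, 36, 45, 60, 38, 40, 40, 44, 1, 44,
        48, 84, 38, 20, 4, 2, 48, 58, 3, 20, 6, 40, 22, 26, 17, 18, 40,
        51, 62, 31, 27, 48, 35, 27, 37, 58, 21]

def ordered_stem_and_leaf(values):
    """Map each tens digit (stem) to its units digits (leaves) in ascending order."""
    # Create every stem in range up front so empty branches (like 7) still show.
    plot = {stem: [] for stem in range(min(values) // 10, max(values) // 10 + 1)}
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        plot[stem].append(leaf)
    return plot

plot = ordered_stem_and_leaf(ages)
for stem, leaves in plot.items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")

ordered = sorted(ages)
median_position = (len(ordered) + 1) // 2   # (43 + 1) / 2 = 22; valid for an odd count
median = ordered[median_position - 1]       # lists are 0-indexed
mode, mode_count = Counter(ages).most_common(1)[0]

print("median:", median)
print("mode:", mode, "appearing", mode_count, "times")
```

Sorting before splitting each value into stem and leaf means the leaves arrive on each branch already in ascending order, so no second pass is needed.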
Make and Interpret Histograms
Look again at the example of the algebra students and their siblings. The data was collected in the following list.
1, 4, 2, 1, 0, 2, 1, 0, 1, 2, 1, 0, 0, 2, 2, 3, 1, 1, 3, 6
We were able to organize the data into a table. Here is the table again, but this time we will use the word frequency as a header to indicate the number of times each value occurs in the list.
Number of siblings    Frequency
0                     4
1                     7
2                     5
3                     2
4                     1
5                     0
6                     1
Now we could use this table as an (x, y) coordinate list to plot a line diagram like this one:
While this diagram does indeed show the data, it is somewhat misleading. For example, the continuous line joining the number of students with one and two siblings makes it look like we know something about how many students have 1.5 siblings (which, of course, is impossible). In this case, where the data points are all integers, it's wrong to suggest that the function is continuous between the points!
When the data we are representing falls into well-defined categories (such as the integers 1, 2, 3, 4, 5 and 6) it is more appropriate to use a histogram to display that data. A histogram for this data is shown below.
Each number on the x-axis has an associated column, whose height shows how many students have that number of siblings. For example, the column at x = 2 is 5 units high, indicating that there are 5 students with 2 siblings.
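To make the idea of column heights concrete, here is a rough Python sketch that draws the sibling data as text columns. It is only an illustration and not the lesson's own figure; the variable name `heights` and the use of `#` marks are assumptions made here.

```python
from collections import Counter

siblings = [1, 4, 2, 1, 0, 2, 1, 0, 1, 2, 1, 0, 0, 2, 2, 3, 1, 1, 3, 6]
counts = Counter(siblings)

# One column per integer on the x-axis, including the empty column at 5.
heights = [counts[x] for x in range(0, max(siblings) + 1)]

# Print from the tallest level down, marking each column that reaches that level.
for level in range(max(heights), 0, -1):
    print(" ".join("#" if height >= level else " " for height in heights))

# Label the x-axis with the number of siblings under each column.
print(" ".join(str(x) for x in range(len(heights))))
```

Each column stands on its own, so nothing is implied about values between the integers, unlike the joined line diagram.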
The categories on the x-axis are called bins. Histograms differ from bar charts in that they don't necessarily have fixed widths for the bins. They are also useful for displaying continuous data (data that varies continuously rather than in integer amounts). To illustrate this, here are some examples.
Displaying Data in a Histogram
Monthly rainfall (in millimeters) for Beaver Creek, Oregon was collected over a five-year period, and the data is shown below. Display the data in a histogram.
41.1, 254.7, 91.6, 60.9, 75.6, 36.0, 16.5, 10.6, 62.2, 89.4, 124.9, 176.7, 121.6, 135.6, 141.6, 77.0, 82.8, 28.9, 6.7, 22.1, 29.9, 110.0, 179.3, 97.6, 176.8, 143.5, 129.8, 94.9, 77.0, 60.8, 60.0, 32.5, 61.7, 117.2, 194.5, 208.6, 176.8, 143.5, 129.8, 94.9, 77.0, 60.8, 20.0, 32.5, 61.7, 117.2, 194.5, 208.6, 133.1, 105.2, 92.0, 60.7, 52.8, 37.8, 14.8, 23.1, 41.3, 75.7, 134.6, 148.8
Notice the similarity between histograms and stem-and-leaf plots. A stem-and-leaf plot resembles a histogram on its side. We could start by making a stem-and-leaf plot of our data.
For the data above, the stem would be the tens, running from 0 to 25. Instead of rounding the decimals in the data, we truncate them, meaning we simply drop the decimal part. For example, 165.7 would have a stem of 16 and a leaf of 5, and we would just leave out the seven tenths.
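The difference between truncating and rounding is easy to see in a small Python sketch. The helper name `stem_and_leaf_parts` is hypothetical and is used here only to illustrate the rule described above.

```python
def stem_and_leaf_parts(value):
    """Split a positive reading into (stem, leaf), truncating the decimal part."""
    whole = int(value)          # int() drops the decimal part: 165.7 -> 165
    return divmod(whole, 10)    # tens and above form the stem, units form the leaf

print(stem_and_leaf_parts(165.7))           # truncated: stem 16, leaf 5
print(divmod(round(165.7), 10))             # rounded instead: stem 16, leaf 6
print(stem_and_leaf_parts(6.7))             # smallest rainfall reading: stem 0
print(stem_and_leaf_parts(254.7))           # largest rainfall reading: stem 25
```

Truncating keeps every value on the branch it actually belongs to; rounding could push a reading such as 169.6 onto the next stem.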
By outlining the numbers on the stem-and-leaf plot, we can see what a histogram with a bin width of 10 would look like. With so many bins, the histogram looks random, with no clear pattern visible. In a situation like this we need to reduce the number of bins. We will increase the bin width to 25 and collect the data in a table:
Rainfall (mm)     Frequency
0 ≤ x < 25        7
25 ≤ x < 50       8
50 ≤ x < 75       9
75 ≤ x < 100      12
100 ≤ x < 125     6
125 ≤ x < 150     9
150 ≤ x < 175     0
175 ≤ x < 200     6
200 ≤ x < 225     2
225 ≤ x < 250     0
250 ≤ x < 275     1
The histogram associated with this bin width is below.
The pattern in the distribution is far more apparent with fewer bins. So let's look at what the histogram would look like with even fewer bins. We will combine bins in pairs to give 6 bins with a bin width of 50. Our table and histogram now look like this.
Rainfall (mm)     Frequency
0 ≤ x < 50        15
50 ≤ x < 100      21
100 ≤ x < 150     15
150 ≤ x < 200     6
200 ≤ x < 250     2
250 ≤ x < 300     1
The pattern is much clearer now. The normal monthly rainfall is around 75 mm, but sometimes there will be a very wet month and the rainfall will be higher (even much higher).
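The two frequency tables can be reproduced by sorting each reading into a bin of the chosen width. The following is a minimal Python sketch, assuming bins that start at 0 and include their lower edge but not their upper edge; the function name `bin_counts` is chosen here for illustration.

```python
rainfall = [41.1, 254.7, 91.6, 60.9, 75.6, 36.0, 16.5, 10.6, 62.2, 89.4,
            124.9, 176.7, 121.6, 135.6, 141.6, 77.0, 82.8, 28.9, 6.7, 22.1,
            29.9, 110.0, 179.3, 97.6, 176.8, 143.5, 129.8, 94.9, 77.0, 60.8,
            60.0, 32.5, 61.7, 117.2, 194.5, 208.6, 176.8, 143.5, 129.8, 94.9,
            77.0, 60.8, 20.0, 32.5, 61.7, 117.2, 194.5, 208.6, 133.1, 105.2,
            92.0, 60.7, 52.8, 37.8, 14.8, 23.1, 41.3, 75.7, 134.6, 148.8]

def bin_counts(values, width):
    """Count how many values fall in each bin [k*width, (k+1)*width), starting at 0."""
    number_of_bins = int(max(values) // width) + 1
    counts = [0] * number_of_bins
    for value in values:
        counts[int(value // width)] += 1
    return counts

for width in (25, 50):
    print(f"bin width {width}:")
    for index, frequency in enumerate(bin_counts(rainfall, width)):
        low = index * width
        print(f"  {low} <= x < {low + width}: {frequency}")
```

Because each wide bin is the union of two narrow ones, the width-50 frequencies are just the pairwise sums of the width-25 frequencies.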
You can see that although it may be counter-intuitive, sometimes you can see more information by reducing the number of intervals (or bins) in a histogram. It's a bit like zooming out on a picture; you can't see as many of the details, but the overall shape of what you are looking at may become clearer.
Make Histograms Using a Graphing Calculator
Look again at the data from the first example. We've seen how to manipulate raw data to give a stem-and-leaf plot and a histogram. Now let's take some of the tedious sorting work out of the process by using a graphing calculator to automatically sort our data into bins.
The following unordered data represents the ages of passengers on a train carriage.
35, 42, 38, 57, 2, 24, 27, 36, 45, 60, 38, 40, 40, 44, 1, 44, 48, 84, 38, 20, 4, 2, 48, 58, 3, 20, 6, 40, 22, 26, 17, 18, 40, 51, 62, 31, 27, 48, 35, 27, 37, 58, 21.
Use a graphing calculator to display the data as a histogram with bin-widths of 10, 5 and 20.
Input the data in your calculator:
Press [STAT] and choose the [EDIT] option.
Input all 43 data points into the table in column L1.
Select plot type:
Bring up the [STATPLOT] option by pressing [2nd], [Y=].
Highlight 1:Plot1 and press [ENTER]. This will bring up the plot options screen. Highlight the histogram and press [ENTER]. Make sure the Xlist is the list that contains your data.
Select bin widths and plot:
Press [WINDOW] and ensure that Xmin and Xmax allow for all data points to be shown. The Xscl value determines the bin width.
Press [GRAPH] to display the histogram.
You can change bin widths and see how the histogram changes by varying Xscl. Below are histograms with bin widths of 10, 5 and 20. (In this example Xmin = 0 and Xmax = 100 will work whatever bin width we choose, but notice that to display the histogram correctly we need to use a different Ymax value for each.)
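For readers without a graphing calculator, the same binning can be imitated in software. The sketch below is a Python analogue, not a description of the calculator's internals: the parameter names `xmin`, `xmax` and `xscl` simply mirror the window settings, and the function `histogram_counts` is hypothetical. It assumes the window width is a whole multiple of the bin width.

```python
ages = [35, 42, 38, 57, 2, 24, 27, 36, 45, 60, 38, 40, 40, 44, 1, 44,
        48, 84, 38, 20, 4, 2, 48, 58, 3, 20, 6, 40, 22, 26, 17, 18, 40,
        51, 62, 31, 27, 48, 35, 27, 37, 58, 21]

def histogram_counts(values, xmin, xmax, xscl):
    """Count values in consecutive bins of width xscl covering [xmin, xmax)."""
    counts = [0] * ((xmax - xmin) // xscl)
    for value in values:
        if xmin <= value < xmax:            # values outside the window are not drawn
            counts[(value - xmin) // xscl] += 1
    return counts

for bin_width in (10, 5, 20):
    counts = histogram_counts(ages, xmin=0, xmax=100, xscl=bin_width)
    # The tallest column tells us how large Ymax must be for this bin width.
    print(f"bin width {bin_width}: tallest column = {max(counts)}")
    print("  ", counts)
```

The tallest column changes with the bin width, which is why a single Ymax setting does not suit all three histograms.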
Example
Example 1
Rowena made a survey of the ages of passengers in a train carriage, and collected the results in a frequency table. Display the results as a histogram.
Age range    Frequency
0 – 9        6
10 – 19      2
20 – 29      9
30 – 39      8
40 – 49      11
50 – 59      4
60 – 69      2
70 – 79      0
80 – 89      1
Solution
Since the data is already collected into intervals, we will use these as our bins for the histogram. Even though the top end of the first interval is 9, the bin on our histogram will extend to 10. This is because, as we move to continuous data, we have a range of numbers that goes right up to the lower end of the following bin, even if it doesn't include that number. The range of values for the first bin would therefore be 0 ≤ x < 10, and all the other bins would have similarly described ranges.
Review
1. Create a stem-and-leaf plot for the following data. Use the first digit (hundreds) as the stem, and the second (tens) as the leaf. Truncate any units and decimals. Order the plot to find the median and the mode.
   Data: 607.4, 886.0, 822.2, 755.7, 900.6, 770.9, 780.8, 760.1, 936.9, 962.9, 859.9, 848.3, 898.7, 670.9, 946.7, 817.8, 868.1, 887.1, 881.3, 744.6, 984.9, 941.5, 851.8, 905.4, 810.6, 765.3, 881.9, 851.6, 815.7, 989.7, 723.4, 869.3, 951.0, 794.7, 807.6, 841.3, 741.5, 822.2, 966.2, 950.1.
2. Make a frequency table for the data in Question 1. Use a bin width of 50.
3. Plot the data from Question 1 as a histogram with a bin width of
   - 50
   - 100
For 4-6, use the following stem-and-leaf plot, which shows data collected for the speed of 40 cars in a 35 mph limit zone in Culver City, California.
4. Find the mean, median and mode speed.
5. Create a frequency table, starting at 25 mph with a bin width of 5 mph.
6. Use the table to construct a histogram with the intervals from your frequency table.
For 7-11, use the histogram shown below. The data is the result of a survey of each subject's number of siblings. Find each of the following:
7. The median of the data.
8. The mean of the data.
9. The mode of the data.
10. The number of people who have an odd number of siblings.
11. The percentage of the people surveyed who have 4 or more siblings.