Mean, Median, and Mode
Introduction
An amusement park is designing a new section for children over 3 years old and under 8 years old. As part of its research, the park used a survey of the heights and weights of a thousand children in that age group. Which measure of central tendency should it use to accommodate the greatest number of children on a roller coaster?
Mean, Median, and Mode
With descriptive statistics, your goal is to describe the data you find in a sample or are given in a problem. Because it would not make sense to present your findings as long lists of numbers, you summarize important aspects of the data. One important aspect is the measure of central tendency, which is a measure of the "middle" value of a set of data. A measure of central tendency is helpful if we want to summarize a set of data or to compare different sets of data. For example, suppose a restaurant manager wants to know which dish on the menu is the most popular. Alternatively, suppose a coach wants to know how fast a sprinter can run a given distance. Moreover, a real estate agent may want to know the price of houses in a certain area. There are three ways to measure central tendency:
- Use the mean, which is the arithmetic average of the data.
- Use the median, which is the number exactly in the middle of the data. When the data have an odd number of values, the median is the middle number after the data have been ordered. When the data have an even number of values, the median is the arithmetic average of the two most central numbers.
- Use the mode, which is the most often occurring number in the data. If two or more numbers occur equally frequently, then the data are said to be bimodal or multimodal.
Calculating the mean, median, and mode is straightforward. What is challenging is determining when to use each measure, and knowing how to interpret the data using the relationships between the three measures.
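As a rough illustration of the three definitions, the sketch below uses Python's standard `statistics` module (`multimode` needs Python 3.8 or later). The quiz scores are made up for this example and do not come from the lesson.

```python
import statistics

# Hypothetical quiz scores, invented for illustration only.
odd_count = [7, 3, 9, 3, 5]         # five values
even_count = [7, 3, 9, 3, 5, 10]    # six values

# Mean: sum of the values divided by how many there are (27 / 5).
mean_odd = statistics.mean(odd_count)

# Median, odd count: ordered data are 3, 3, 5, 7, 9, so the middle value is 5.
median_odd = statistics.median(odd_count)

# Median, even count: ordered data are 3, 3, 5, 7, 9, 10,
# so the median is the average of the two central values, (5 + 7) / 2.
median_even = statistics.median(even_count)

# Mode: multimode returns every value tied for most frequent.
modes = statistics.multimode(odd_count)           # one mode
bimodal = statistics.multimode([1, 1, 2, 2, 3])   # two modes, so bimodal

print(mean_odd, median_odd, median_even, modes, bimodal)
```

Using `multimode` rather than `mode` is a choice made here so that bimodal and multimodal data show all of their modes.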
Examples
Example 1
Given different situations and datasets, how do we decide which measure to use?
Solution:
To decide which measure of central tendency to use, it is a good idea to calculate and interpret all three of the numbers.
For example, if someone asked you how many people can sit in the typical car, it would make more sense to use the mode than the mean. With the mode, you might find that a 5-person car is the most common car driven, and conclude that the answer to the question is 5. If you calculate the mean number of seats across all cars, you will end up with a decimal like 5.4, which makes less sense in this context.
On the other hand, if you wanted the typical height of professional basketball players, the mean might make a lot more sense than the mode, because few players share exactly the same height.
Example 2
Compute the mean, median, and mode for the following numbers:
3, 5, 1, 6, 8, 4, 5, 2, 7, 8, 4, 2, 1, 3, 4, 6, 7, 9, 4, 3, 2
Solution:
Mean: The sum of all these numbers is 94, and there are 21 numbers in total, so the mean is 94 / 21 ≈ 4.4762.
Median: When you order the numbers from least to greatest, you get
1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8, 9.
The 11th number has 10 numbers to its right and 10 numbers to its left, so it is the median. The median is 4.
Mode: The most frequently occurring number is 4, which appears four times.
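The same steps can be followed by hand in a few lines of Python. This is only a sketch that mirrors the reasoning above (sum and count, then sort and take the middle, then tally frequencies); the variable names are chosen for this illustration.

```python
from collections import Counter

data = [3, 5, 1, 6, 8, 4, 5, 2, 7, 8, 4, 2, 1, 3, 4, 6, 7, 9, 4, 3, 2]

# Mean: total divided by the number of values.
total = sum(data)
count = len(data)
mean = total / count

# Median: with an odd count, the middle value of the ordered data.
# Index count // 2 is index 10, which is the 11th value.
ordered = sorted(data)
median = ordered[count // 2]

# Mode: the value with the highest frequency.
frequencies = Counter(data)
mode, mode_count = frequencies.most_common(1)[0]

print(total, count, round(mean, 4), median, mode, mode_count)
```

Note that `ordered[count // 2]` is only the median when the count is odd; an even count would need the average of the two central values.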
Example 3
You write a computer program to produce a random number between 0 and 10 with equal probability. Unfortunately, you suspect your program doesn't work perfectly, because in your first few attempts at running it, it produces the following numbers:
1, 9, 1, 1, 9, 2, 9, 1, 9, 9, 9, 2, 2.
How would you argue using mean, median, or mode that this code is probably not producing a random number between 0 and 10 with equal probability?
Solution:
This question is very similar to questions you will see when you study statistical inference.
First, you would note that the mean of the data is 4.9231. If the data were truly random, then the mean would probably be right around 5, which it is. So the mean alone is not strong evidence that the random number generating code is broken.
Next, you would note that the median of the data is 2. This should make you suspect that something is wrong. You would expect the median of random numbers between 0 and 10 to be somewhere around 5.
Lastly, you would note that the mode of the data is 9. By itself, this is not strong evidence of anything, because every sample has at least one mode. What should make you suspicious, however, is the fact that only two other numbers were produced, and they were almost as frequent as the number 9. You would expect a greater variety of numbers to be produced.
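The argument above can be checked with a short Python sketch. It assumes the generator is meant to return whole numbers from 0 to 10 inclusive, which the problem statement does not say explicitly.

```python
from collections import Counter
import statistics

outputs = [1, 9, 1, 1, 9, 2, 9, 1, 9, 9, 9, 2, 2]

mean = statistics.mean(outputs)        # 64 / 13, close to the expected 5
median = statistics.median(outputs)    # far below the expected 5

# Tally how often each value was produced.
frequencies = Counter(outputs)
distinct_values = sorted(frequencies)

# Assumption: valid outputs are the integers 0 through 10.
never_produced = [n for n in range(0, 11) if n not in frequencies]

print(round(mean, 4), median)
print(dict(frequencies))
print(distinct_values, never_produced)
```

The frequency tally makes the lack of variety visible: only three distinct values appear, and eight of the eleven assumed outcomes never occur in this small sample.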
Example 4
Recall the problem from the Introduction: Which measure of central tendency should the park use to accommodate the greatest number of children on a roller coaster?
Solution:
To attract more customers, the amusement park should accommodate as many children as possible. For this reason, it should use the mode to determine the most common height and weight. However, designing for the most common height and weight may still not accommodate the greatest number of children, so the park should also consider the mean to determine the average height and weight.
Example 5
Ross and his friends want to play basketball. They decide to choose teams based on the number of cousins everyone has. One will be the team with fewer cousins, and the other will be the team with more cousins. Should they use the mean, median, or mode to compute the cutoff number that will separate the two teams?
Solution:
Ross and his friends should use the median number of cousins as the cutoff, because this allows each team to have the same number of players. If there is an odd number of people playing, the extra person can join either team or switch in later.
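A small sketch of the median split follows. Apart from Ross, the player names and all of the cousin counts are hypothetical, invented only to show the idea.

```python
import statistics

# Hypothetical players and cousin counts, invented for illustration.
cousins = {"Ross": 4, "Ana": 1, "Ben": 7, "Cho": 2, "Dee": 10, "Eli": 5}

# The median of the counts serves as the cutoff between the two teams.
cutoff = statistics.median(list(cousins.values()))

# Players below the cutoff form one team, players above it form the other.
fewer_cousins = sorted(name for name, n in cousins.items() if n < cutoff)
more_cousins = sorted(name for name, n in cousins.items() if n > cutoff)

print(cutoff, fewer_cousins, more_cousins)
```

With these made-up counts the cutoff is 4.5 and each team gets three players. If several players had exactly the median number of cousins, the group would need an extra rule to place them; this sketch does not handle that case.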
Example 6
Compute the mean, median, and mode for the following numbers:
1, 4, 5, 7, 6, 8, 0, 3, 2, 2, 3, 4, 6, 5, 7, 8, 9, 0, 6, 5, 3, 1, 2, 4, 5, 6, 7, 8, 8, 8, 4, 3, 2.
Solution:
The sum of the 33 numbers is 152, so the mean is 152 / 33 ≈ 4.6061. The median is 5. The mode is 8.
Example 7
The costs of fresh blueberries at different times of the year are
$2.50, $2.99, $3.20, $3.99, $4.99.
If you bought blueberries regularly, what would you typically pay?
Solution:
The word "typically" is used instead of "average" to let you choose whether the mean, median, or mode makes the most sense. In this case, the mean makes the most sense. The mean cost is about $3.53.
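For comparison, here is a short Python sketch that computes all three measures for the blueberry prices. It treats the five listed prices as equally weighted, which is an assumption, since the example does not say how often each price applies.

```python
import statistics

prices = [2.50, 2.99, 3.20, 3.99, 4.99]

# Mean price, rounded to the nearest cent.
mean_price = round(statistics.mean(prices), 2)

# Median price: the middle of the five ordered prices.
median_price = statistics.median(prices)

# Every price occurs exactly once, so all five values tie as modes.
modes = statistics.multimode(prices)

print(mean_price, median_price, len(modes))
```

Because no price repeats, the mode says nothing useful here, which is one reason to prefer the mean (or the median) for this question.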
Summary
- The mean is the arithmetic average of the data.
- The median is the number in the middle of a dataset. When the data have an odd number of values, the median is the middle number after the data have been ordered. When the data have an even number of values, the median is the average of the two most central numbers.
- The mode is the most often occurring number in the data. If two or more numbers occur equally frequently, then the data are said to be bimodal or multimodal.
Review
You surveyed the students in your English class to find out how many siblings each student has. Here are your results:
0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 10, 12.
1. Find the mean, median, and mode of these data.
2. Why does it make sense that the mean number of siblings is greater than the median number of siblings?
3. Which measure of central tendency do you think is best for describing the typical number of siblings?
4. So far in math this semester, you have taken 10 quizzes. The mean of the scores is 88.5. What is the sum of the scores?
5. Find x if 5, 9, 11, 12, 13, 14, 16, and x have a mean of 12.
6. Meera drove an average of 22 miles a day last week. How many miles did she drive last week?
7. Find x if 2, 6, 9, 8, 4, 5, 8, 1, 4, and x have a median of 5.
Calculate the mean, median, and mode for each set of numbers:
8. 11, 15, 19, 12, 21, 34, 15, 28, 24, 15, 27, 19, 20, 13, 15
9. 3, 5, 7, 5, 5, 17, 8, 9, 11, 5, 3, 7
10. -3, 0, 5, 8, 12, 4, 2, 1, 6
Calculate the mean and median for each set of numbers:
11. 12, 88, 89, 90
12. 16, 17, 19, 20, 20, 98
13. For which of the previous two questions was the median less than the mean? What in the set of numbers caused this?
14. For which of the previous questions (11 or 12) was the median greater than the mean? What in the set of numbers caused this?
15. In each of the sets of numbers for problems 11 and 12, one number could be considered an outlier. Which numbers do you think are the outliers and why? What would happen to the mean and median if you removed the outliers?
Review (Answers)
Please see the Appendix.