15.1 Mean, Median, and Mode
The three measures of central tendency are mean, median, and mode. When would it make sense to use one of these measures and not the others?
Mean, Median, and Mode
With descriptive statistics, your goal is to describe the data that you find in a sample or that is given in a problem. Because it would not make sense to present your findings as long lists of numbers, you summarize important aspects of the data. One important aspect is the measure of central tendency, which is a measure of the "middle" value of a set of data. There are three ways to measure central tendency:
- Use the mean, which is the arithmetic average of the data.
- Use the median, which is the number exactly in the middle of the data. When the data has an odd number of values, the median is the middle number after the data has been ordered. When the data has an even number of values, the median is the arithmetic average of the two most central numbers.
- Use the mode, which is the most frequently occurring number in the data. If two numbers tie for most frequent, the data is said to be bimodal; if more than two tie, it is multimodal.
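The three definitions above can be sketched directly in Python. This is a minimal illustration, not a standard implementation: the function names `mean`, `median`, and `modes` and the small test lists are assumptions chosen for this example.

```python
from collections import Counter

def mean(values):
    """Arithmetic average: the sum divided by the count."""
    return sum(values) / len(values)

def median(values):
    """Middle value of the ordered data; for an even count,
    the average of the two most central values."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

def modes(values):
    """Every value tied for the highest frequency.
    One value: a single mode; two: bimodal; more: multimodal."""
    counts = Counter(values)
    top = max(counts.values())
    return [v for v, c in counts.items() if c == top]

print(mean([2, 4, 9]))          # 5.0
print(median([7, 1, 3]))        # 3
print(median([7, 1, 3, 5]))     # 4.0
print(modes([1, 1, 2, 2, 3]))   # [1, 2]
```

Returning a list from `modes` is a design choice for this sketch: it lets the bimodal and multimodal cases show up as lists with more than one entry.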
Calculating the mean, median, and mode is straightforward. Take the following numbers:
3, 5, 1, 6, 8, 4, 5, 2, 7, 8, 4, 2, 1, 3, 4, 6, 7, 9, 4, 3, 2
To calculate the mean, first find the sum. The sum of all these numbers is 94 and there are 21 numbers in total, so the mean is 94/21 ≈ 4.4762. Note that it is common practice to round to 4 decimal places in AP Statistics.
To calculate the median, first order the numbers from least to greatest:
1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8, 9
The 11th number has ten numbers to its right and ten numbers to its left, so it is the median. The median is 4.
To find the mode, find the most frequently occurring number. The number 4 occurs four times, more than any other number, so the mode is 4.
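The hand calculation above can be checked with Python's standard `statistics` module. This is only a verification sketch; the variable names are chosen for illustration.

```python
import statistics

data = [3, 5, 1, 6, 8, 4, 5, 2, 7, 8, 4, 2, 1, 3, 4, 6, 7, 9, 4, 3, 2]

total = sum(data)                        # sum of all values
count = len(data)                        # how many values there are
mean = round(statistics.mean(data), 4)   # rounded to 4 decimal places
median = statistics.median(data)         # middle of the sorted data
mode = statistics.mode(data)             # most frequent value

print(total, count)  # 94 21
print(mean)          # 4.4762
print(median)        # 4
print(mode)          # 4
```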
::要找到模式,请找到最经常发生的数字。 最经常发生的数字是第4号。What is challenging is determining when to use each measure and knowing how to interpret the data using the relationships between the three measures. Take the following situation:
::挑战在于确定何时使用每一项措施,并了解如何使用这三项措施之间的关系解释数据。Ross is with his friends and they want to play basketball. They decide to choose teams based on the number of cousins everyone has. One team will be the team with fewer cousins and the other team will be the team with more cousins.
::罗斯和他的朋友在一起,他们想打篮球。他们决定根据每个人的表亲人数来选择球队。一个球队是表亲较少的球队,另一个球队是表亲较多的球队。Should they use the mean, median or mode to compute the cutoff number that will separate the two teams?
::他们应该使用平均值、中位数或模式来计算两个小组分开的截断号码吗?Ross and his friends should use the median number of cousins as the cutoff number because this will allow each team to have the same number of players. If there are an odd number of people playing, then the extra person will just join either team or switch in later.
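To see why the median gives even teams, here is a small sketch. The cousin counts and every player name other than Ross are hypothetical, invented purely for illustration; the text gives no actual numbers.

```python
import statistics

# Hypothetical cousin counts for eight players (not from the text).
cousins = {"Ross": 4, "Ana": 12, "Ben": 0, "Caro": 7,
           "Dev": 3, "Eli": 9, "Fay": 25, "Gus": 5}
counts = list(cousins.values())

# Median cutoff: half of the players fall on each side.
cutoff = statistics.median(counts)
fewer = [name for name, c in cousins.items() if c < cutoff]
more = [name for name, c in cousins.items() if c >= cutoff]

# Mean cutoff for comparison: one large value (25) pulls it upward.
mean_cutoff = statistics.mean(counts)
below_mean = [name for name, c in cousins.items() if c < mean_cutoff]

print(cutoff, len(fewer), len(more))   # 6.0 4 4
print(mean_cutoff, len(below_mean))    # 8.125 5
```

With these made-up numbers, the mean would split the group 5 to 3, while the median splits it 4 to 4. If several players had exactly the median number of cousins, the teams could still come out uneven and would need a tie-breaking rule.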
Examples
Example 1
Earlier, you were asked how to determine which measure of central tendency to use. To decide, it is a good idea to calculate and interpret all three numbers.
For example, if someone asked you how many people can sit in the typical car, it would make more sense to use the mode than the mean. With the mode, you could find that a five-person car is the most commonly driven car and determine that the answer to the question is 5. If you calculate the mean number of seats across all cars, you will end up with a decimal like 5.4, which makes less sense in this context.
On the other hand, if you were finding the central height of NBA players, the mean might make a lot more sense than the mode.
Example 2
Compute the mean, median, and mode for the following numbers.
1, 4, 5, 7, 6, 8, 0, 3, 2, 2, 3, 4, 6, 5, 7, 8, 9, 0, 6, 5, 3, 1, 2, 4, 5, 6, 7, 8, 8, 8, 4, 3, 2
The mean is 4.6061. The median is 5. The mode is 8.
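One way to confirm these three answers is to tally the data by hand in code rather than call a library function. The sketch below assumes nothing beyond the list of numbers given in the example.

```python
from collections import Counter

data = [1, 4, 5, 7, 6, 8, 0, 3, 2, 2, 3, 4, 6, 5, 7, 8, 9, 0, 6, 5, 3,
        1, 2, 4, 5, 6, 7, 8, 8, 8, 4, 3, 2]

n = len(data)
mean = round(sum(data) / n, 4)

# n is odd, so the median is the single middle value of the sorted list.
ordered = sorted(data)
median = ordered[n // 2]

# Tally how often each value occurs and take the most common one.
value, freq = Counter(data).most_common(1)[0]

print(n, sum(data))   # 33 152
print(mean)           # 4.6061
print(median)         # 5
print(value, freq)    # 8 5
```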
Example 3
The costs of fresh blueberries at different times of the year are:
$2.50, $2.99, $3.20, $3.99, $4.99
If you bought blueberries regularly, what would you typically pay?
The word "typically" is used instead of "average" to let you choose whether the mean, median, or mode makes the most sense. In this case, the mean makes the most sense. The average cost is $3.53.
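The arithmetic behind the $3.53 figure can be sketched as follows. The median and the repeat check are included only for comparison; they are additions for illustration, not part of the original answer.

```python
import statistics

prices = [2.50, 2.99, 3.20, 3.99, 4.99]

mean_price = statistics.mean(prices)      # (2.50 + 2.99 + 3.20 + 3.99 + 4.99) / 5
median_price = statistics.median(prices)  # middle of the five ordered prices
no_repeats = len(set(prices)) == len(prices)

print(f"{mean_price:.2f}")    # 3.53
print(f"{median_price:.2f}")  # 3.20
print(no_repeats)             # True
```

Because no price repeats, the mode tells you nothing here, which is one reason the choice comes down to the mean or the median.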
Example 4
Five people were called in a phone survey to respond to some political opinion questions. Two people were from the zip code 94061, one person was from the zip code 94305, and two people were from 94062.
Which measure of central tendency makes the most sense to use if you want to state where the average person was from?
None of the measures of central tendency makes sense for this situation. Zip codes are categorical data rather than quantitative data, even though they happen to be written as numbers. Other examples of categorical data are hair color and gender. You could argue that the mode is applicable in a broad sense, but in general remember that mean, median, and mode can only be applied to quantitative data.
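A short sketch makes the point concrete. The order of the respondents in the list is an assumption, and the zip codes are stored as strings to reflect that they are labels, not quantities.

```python
import statistics

# Zip codes as labels (strings); respondent order is assumed.
zip_codes = ["94061", "94061", "94305", "94062", "94062"]

# The mode "in a broad sense": the most common labels.
most_common = statistics.multimode(zip_codes)

# Treating the labels as numbers produces a value no respondent gave.
as_numbers = [int(z) for z in zip_codes]
numeric_mean = statistics.mean(as_numbers)

print(most_common)    # ['94061', '94062']
print(numeric_mean)   # 94110.2
```

Counting labels is meaningful, but averaging them is not: 94110.2 is not the zip code of anyone in the survey.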
Example 5
You write computer code to produce a random number between 0 and 10 with equal probability. Unfortunately, you suspect your code doesn't work perfectly because in your first few attempts at running it, it produces the following numbers:
1, 9, 1, 1, 9, 2, 9, 1, 9, 9, 9, 2, 2
How would you argue using mean, median, or mode that this code is probably not producing a random number between 0 and 10 with equal probability?
This question is very similar to questions you will see when you study statistical inference.
First, you would note that the mean of the data is 4.9231. If the data were truly random, the mean would probably be right around 5, which it is. So the mean gives no strong evidence that the random number generating code is broken.
Next, you would note that the median of the data is 2. This should make you suspect that something is wrong. You would expect the median of random numbers between 0 and 10 to be somewhere around 5.
Lastly, you would note that the mode of the data is 9. By itself, this is not strong evidence of anything, since every sample has at least one mode. What should make you suspicious, however, is that only two other numbers (1 and 2) were produced, and each was almost as frequent as 9. You would expect a greater variety of numbers to be produced.
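The three observations above can be reproduced with a few lines of Python. The sketch assumes the generator is meant to return whole numbers from 0 to 10 inclusive, which gives 11 possible values; the text does not state this explicitly.

```python
import statistics
from collections import Counter

outputs = [1, 9, 1, 1, 9, 2, 9, 1, 9, 9, 9, 2, 2]

mean = round(statistics.mean(outputs), 4)
median = statistics.median(outputs)
mode = statistics.mode(outputs)

# How many times each value appeared, and how many distinct values there were.
counts = Counter(outputs)
distinct = len(counts)

print(mean)                    # 4.9231
print(median)                  # 2
print(mode)                    # 9
print(sorted(counts.items()))  # [(1, 4), (2, 3), (9, 6)]
print(distinct, "of 11 possible values")  # 3 of 11 possible values
```

The mean looks fine on its own, but the median and the tally of distinct values both point to a generator that favors a few outputs.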
Summary
- The three measures of central tendency are mean, median, and mode.
- Mean is the arithmetic average of the data, calculated by finding the sum of all numbers and dividing by the total count.
- Median is the middle number in ordered data; if there's an even number of values, it's the average of the two most central numbers.
- Mode is the most frequently occurring number in the data; if multiple numbers occur equally frequently, the data is bimodal or multimodal.
- Choosing which measure to use depends on the context and goal.
Review
You surveyed the students in your English class to find out how many siblings each student had. Here are your results:
0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 10, 12
1. Find the mean, median, and mode of this data.
2. Why does it make sense that the mean number of siblings is greater than the median number of siblings?
3. Which measure of central tendency do you think is best for describing the typical number of siblings?
4. So far in math you have taken 10 quizzes this semester. The mean of the scores is 88.5. What is the sum of the scores?
5. Find x if 5, 9, 11, 12, 13, 14, 16, and x have a mean of 12.
6. Meera drove an average of 22 miles a day last week. How many miles did she drive last week?
7. Find x if 2, 6, 9, 8, 4, 5, 8, 1, 4, and x have a median of 5.
Calculate the mean, median, and mode for each set of numbers:
8. 11, 15, 19, 12, 21, 34, 15, 28, 24, 15, 27, 19, 20, 13, 15
9. 3, 5, 7, 5, 5, 17, 8, 9, 11, 5, 3, 7
10. -3, 0, 5, 8, 12, 4, 2, 1, 6
Calculate the mean and median for each set of numbers:
11. 12, 88, 89, 90
12. 16, 17, 19, 20, 20, 98
13. For which of the previous two questions was the median less than the mean? What in the set of numbers caused this?
14. For which of the previous two questions was the median greater than the mean? What in the set of numbers caused this?
15. In each of the sets of numbers for problems 11 and 12, there is one number that could be considered an outlier. Which numbers do you think are the outliers and why? What would happen to the mean and median if you removed the outliers?