The Central Limit Theorem

What is the Central Limit Theorem? How does the Central Limit Theorem relate other distributions to the normal distribution?

This lesson describes the relationship between the normal distribution and the Central Limit Theorem.

The Central Limit Theorem
The Central Limit Theorem (CLT) is a very powerful statement in statistics. It says that if you repeatedly draw samples from a random variable, the distribution of the means of the samples (if you completed the lesson titled “The Mean of Means”, you will recognize this as “the sampling distribution of the sample means”) approximates a normal distribution, and the approximation improves as the size of each sample grows. This holds regardless of the original distribution of the random variable, provided it has a finite mean and standard deviation; a common rule of thumb is that 30 or more data points per sample is enough. In fact, even a discrete random variable with a pretty odd distribution will produce approximately normal sample means, given large enough samples.

Formally, the CLT says:

If samples of size \(n\) are drawn at random from any population with a finite mean \(\mu\) and standard deviation \(\sigma\), then the sampling distribution of the sample means, \(\bar{x}\), approximates a normal distribution as \(n\) increases.

In “normal English”:

If you collect many samples from an ordinary random variable and calculate the mean of each sample, then the means will be distributed in an approximate bell curve, and the “mean of means” will approximate the mean of the population. The larger the size of the samples you collect, the more closely the distribution of their means will approximate a normal distribution.

Notes to remember:
- As long as your sample size is 30 or greater, you may assume the distribution of the sample means to be approximately normal. This means you can calculate the probability of obtaining a given mean from a single sample of size 30 or greater by using the z-score of the mean.
- The mean of the distribution created from many sample means approaches the mean of the population. Formally: \(\mu_{\bar{x}} = \mu\)
- The standard deviation of the distribution of the sample means is found by dividing the standard deviation of the population by the square root of the sample size. Formally: \(\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}\)
- Use the notation \(\bar{x}\) (x-bar) rather than the random variable \(x\) to indicate that the random variable you are describing is a sample mean.
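The notes above can be checked with a small simulation. The following is a minimal sketch using only Python's standard library; the exponential population (which has \(\mu = 1\) and \(\sigma = 1\) and is strongly skewed), the sample size of 36, the 5,000 repetitions, and the helper name `simulate_sample_means` are all illustrative assumptions rather than part of the lesson.

```python
import random
import statistics

def simulate_sample_means(n, num_samples, rate=1.0, seed=0):
    """Draw num_samples samples of size n from an exponential population
    and return the list of their sample means."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        statistics.fmean(rng.expovariate(rate) for _ in range(n))
        for _ in range(num_samples)
    ]

# Exponential population with rate 1: mu = 1 and sigma = 1.
n = 36
means = simulate_sample_means(n=n, num_samples=5000)

# The mean of the sample means should land close to mu = 1.
print("mean of sample means:     ", round(statistics.fmean(means), 3))
# Their standard deviation should land close to sigma / sqrt(n) = 1/6.
print("std. dev. of sample means:", round(statistics.stdev(means), 3))
print("sigma / sqrt(n):          ", round(1 / n ** 0.5, 3))
```

Plotting a histogram of `means` would be expected to show a roughly bell-shaped curve even though the underlying population is far from normal.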
You may use the z-score percentage reference table below as needed:
Z    0.00    0.01    0.02    0.03    0.04    0.05    0.06    0.07    0.08    0.09
0.0  0.5000  0.5040  0.5080  0.5120  0.5160  0.5199  0.5239  0.5279  0.5319  0.5359
0.1  0.5398  0.5438  0.5478  0.5517  0.5557  0.5596  0.5636  0.5675  0.5714  0.5753
0.2  0.5793  0.5832  0.5871  0.5910  0.5948  0.5987  0.6026  0.6064  0.6103  0.6141
0.3  0.6179  0.6217  0.6255  0.6293  0.6331  0.6368  0.6406  0.6443  0.6480  0.6517
0.4  0.6554  0.6591  0.6628  0.6664  0.6700  0.6736  0.6772  0.6808  0.6844  0.6879
0.5  0.6915  0.6950  0.6985  0.7019  0.7054  0.7088  0.7123  0.7157  0.7190  0.7224
0.6  0.7257  0.7291  0.7324  0.7357  0.7389  0.7422  0.7454  0.7486  0.7517  0.7549
0.7  0.7580  0.7611  0.7642  0.7673  0.7704  0.7734  0.7764  0.7794  0.7823  0.7852
0.8  0.7881  0.7910  0.7939  0.7967  0.7995  0.8023  0.8051  0.8078  0.8106  0.8133
0.9  0.8159  0.8186  0.8212  0.8238  0.8264  0.8289  0.8315  0.8340  0.8365  0.8389
1.0  0.8413  0.8438  0.8461  0.8485  0.8508  0.8531  0.8554  0.8577  0.8599  0.8621
1.1  0.8643  0.8665  0.8686  0.8708  0.8729  0.8749  0.8770  0.8790  0.8810  0.8830
1.2  0.8849  0.8869  0.8888  0.8907  0.8925  0.8944  0.8962  0.8980  0.8997  0.9015
1.3  0.9032  0.9049  0.9066  0.9082  0.9099  0.9115  0.9131  0.9147  0.9162  0.9177
1.4  0.9192  0.9207  0.9222  0.9236  0.9251  0.9265  0.9279  0.9292  0.9306  0.9319
1.5  0.9332  0.9345  0.9357  0.9370  0.9382  0.9394  0.9406  0.9418  0.9429  0.9441
1.6  0.9452  0.9463  0.9474  0.9484  0.9495  0.9505  0.9515  0.9525  0.9535  0.9545
1.7  0.9554  0.9564  0.9573  0.9582  0.9591  0.9599  0.9608  0.9616  0.9625  0.9633
1.8  0.9641  0.9649  0.9656  0.9664  0.9671  0.9678  0.9686  0.9693  0.9699  0.9706
1.9  0.9713  0.9719  0.9726  0.9732  0.9738  0.9744  0.9750  0.9756  0.9761  0.9767
2.0  0.9772  0.9778  0.9783  0.9788  0.9793  0.9798  0.9803  0.9808  0.9812  0.9817
2.1  0.9821  0.9826  0.9830  0.9834  0.9838  0.9842  0.9846  0.9850  0.9854  0.9857
2.2  0.9861  0.9864  0.9868  0.9871  0.9875  0.9878  0.9881  0.9884  0.9887  0.9890
2.3  0.9893  0.9896  0.9898  0.9901  0.9904  0.9906  0.9909  0.9911  0.9913  0.9916
2.4  0.9918  0.9920  0.9922  0.9925  0.9927  0.9929  0.9931  0.9932  0.9934  0.9936
2.5  0.9938  0.9940  0.9941  0.9943  0.9945  0.9946  0.9948  0.9949  0.9951  0.9952
2.6  0.9953  0.9955  0.9956  0.9957  0.9959  0.9960  0.9961  0.9962  0.9963  0.9964
2.7  0.9965  0.9966  0.9967  0.9968  0.9969  0.9970  0.9971  0.9972  0.9973  0.9974
2.8  0.9974  0.9975  0.9976  0.9977  0.9977  0.9978  0.9979  0.9979  0.9980  0.9981
2.9  0.9981  0.9982  0.9982  0.9983  0.9984  0.9984  0.9985  0.9985  0.9986  0.9986
3.0  0.9987  0.9987  0.9987  0.9988  0.9988  0.9989  0.9989  0.9989  0.9990  0.9990
3.1  0.9990  0.9991  0.9991  0.9991  0.9992  0.9992  0.9992  0.9992  0.9993  0.9993
3.2  0.9993  0.9993  0.9994  0.9994  0.9994  0.9994  0.9994  0.9995  0.9995  0.9995
3.3  0.9995  0.9995  0.9995  0.9996  0.9996  0.9996  0.9996  0.9996  0.9996  0.9997
3.4  0.9997  0.9997  0.9997  0.9997  0.9997  0.9997  0.9997  0.9997  0.9997  0.9998
3.5  0.9998  0.9998  0.9998  0.9998  0.9998  0.9998  0.9998  0.9998  0.9998  0.9998
3.6  0.9998  0.9998  0.9999  0.9999  0.9999  0.9999  0.9999  0.9999  0.9999  0.9999
3.7  0.9999  0.9999  0.9999  0.9999  0.9999  0.9999  0.9999  0.9999  0.9999  0.9999
3.8  0.9999  0.9999  0.9999  0.9999  0.9999  0.9999  0.9999  0.9999  0.9999  0.9999
3.9  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000

Real-World Application: Buying Lunch
Mack asked 42 fellow high-school students how much they spend on lunch, on average. According to his research online, the amount spent on lunch by high school students nationwide has \(\mu = \$15\), with \(\sigma = \$9\). What is the probability that the mean of Mack’s random sample will fall within $0.01 of the national average?

There are a few important facts to note here:

- Mack’s sample is 42 students. Since \(42 \geq 30\), he can safely assume that the distribution of his sample mean is approximately normal, according to the Central Limit Theorem.
- The range we are considering is $14.99 to $15.01, since that represents $0.01 above and below the mean.
- The mean of the sample means should approximate the mean of the population, in other words \(\mu_{\bar{x}} = \mu = \$15.00\).
- The standard deviation of Mack’s sample mean, \(\sigma_{\bar{x}}\), can be calculated as \(\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}\), where \(n = 42\).

Let’s start by finding the standard deviation of the sample mean, \(\sigma_{\bar{x}}\):

\[\sigma_{\bar{x}} = \frac{9}{\sqrt{42}} = \frac{9}{6.48} \approx 1.389\]

Since the mean of Mack’s sample of 42 students can be assumed to be normally distributed, and since we now know its standard deviation, 1.39, we can calculate the z-scores of the range using \(Z = \dfrac{\bar{x} - \mu}{\sigma_{\bar{x}}}\):

\[Z_1 = \frac{15.01 - 15.00}{1.389} \approx 0.01 \qquad Z_2 = \frac{14.99 - 15.00}{1.389} \approx -0.01\]

Finally, we look up \(Z_1\) and \(Z_2\) on the z-score probability table to get a range of 50.4% - 49.6% = 0.80%.

The probability that Mack’s sample will have a mean within $0.01 of the population mean of $15.00 is a little less than 1%.
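As a cross-check on the table lookup, the same probability can be computed directly from the normal distribution of the sample mean. This is a minimal sketch using `statistics.NormalDist` from Python's standard library; the helper name `prob_mean_within` is a hypothetical one chosen for illustration. Because the printed table forces z to be rounded to 0.01, the table answer of 0.80% is somewhat coarse, and the unrounded calculation gives a smaller value of roughly 0.6%, which is still "a little less than 1%".

```python
from math import sqrt
from statistics import NormalDist

def prob_mean_within(mu, sigma, n, margin):
    """P(mu - margin < sample mean < mu + margin) under the CLT
    normal approximation for the sample mean."""
    sampling_dist = NormalDist(mu, sigma / sqrt(n))  # N(mu, sigma / sqrt(n))
    return sampling_dist.cdf(mu + margin) - sampling_dist.cdf(mu - margin)

# Mack's lunch survey: mu = $15, sigma = $9, n = 42, margin = $0.01.
p_lunch = prob_mean_within(mu=15.00, sigma=9.00, n=42, margin=0.01)

print(f"standard deviation of the sample mean: {9.00 / sqrt(42):.3f}")
print(f"P(14.99 < mean < 15.01) = {p_lunch:.4f}")
```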
Real-World Application: Time to Complete a Test

The time it takes a student to complete the mid-term for Algebra II follows a bimodal distribution with \(\mu = 1\) hour and \(\sigma = 1\) hour. During the month of June, Professor Spence administers the test 64 times. What is the probability that the average mid-term completion time for students during the month of June exceeds 48 minutes?

Important facts:

- There are more than 30 tests in the sample, so the Central Limit Theorem applies.
- The mean of the sample means should approximate the mean of the population, in other words \(\mu_{\bar{x}} = \mu = 1\) hour.
- The standard deviation of Professor Spence’s sample mean, \(\sigma_{\bar{x}}\), can be calculated as \(\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}\), where \(n = 64\) (the number of tests).
- 48 minutes is the same as \(\dfrac{48}{60} = 0.8\) hours, so the range we are interested in is \(\bar{x} > 0.8\) hours.

First calculate the standard deviation of the sample mean, using \(\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}\):

\[\sigma_{\bar{x}} = \frac{1}{\sqrt{64}} = \frac{1}{8} = 0.125\]

Since the sample mean is approximately normally distributed, according to the CLT, we can use its standard deviation to calculate the z-score of the minimum value in the relevant range, 0.80 hours:

\[Z = \frac{0.80 - 1}{0.125} = \frac{-0.20}{0.125} = -1.60\]

Finally, we use the z-score probability reference above to convert the z-score of -1.60 into the probability of a value greater than that. By symmetry, the area to the right of -1.60 equals the area to the left of 1.60:

\[P(Z > -1.60) = P(Z < 1.60) = 0.9452 \text{ or } 94.52\%\]

Real-World Application: Online Auctions
Evan price-checked 123 online auction sellers to record their average asking price for his favorite game. According to a major national price-checking site, the national average online auction price for the game is $35.00 with a standard deviation of $3.00. Evan found the prices to be less than $34.86 on average. How likely is this result?

Since there are more than 30 sellers in the sample (\(123 > 30\)), we can apply the CLT and treat the sample mean as normally distributed.

The standard deviation of the sample mean is:

\[\sigma_{\bar{x}} = \frac{3}{\sqrt{123}} = \frac{3}{11.09} \approx 0.27\]

The z-score for Evan’s price point of $34.86 is:

\[Z = \frac{34.86 - 35.00}{0.27} = \frac{-0.14}{0.27} \approx -0.518\]

Consulting the z-score probability table, we learn that the area under the normal curve to the left of \(Z = -0.52\) is \(1 - 0.6985 = 0.3015\), or 30.15%.

The likelihood of 123 sellers having a mean asking price below $34.86 is approximately 30.15%.
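The one-sided ("less than") calculation above can be sketched in a few lines of Python. The helper name `prob_mean_below` is a hypothetical one used only for illustration, and the code assumes the CLT normal approximation applies. Since it does not round the standard deviation or the z-score, it is expected to give about 30.2%, slightly different from the table-based 30.15%.

```python
from math import sqrt
from statistics import NormalDist

def prob_mean_below(mu, sigma, n, threshold):
    """P(sample mean < threshold) under the CLT normal approximation."""
    sampling_dist = NormalDist(mu, sigma / sqrt(n))
    return sampling_dist.cdf(threshold)

# Evan's auction survey: mu = $35, sigma = $3, n = 123, threshold = $34.86.
z_auction = (34.86 - 35.00) / (3.00 / sqrt(123))
p_auction = prob_mean_below(mu=35.00, sigma=3.00, n=123, threshold=34.86)

print(f"z = {z_auction:.3f}")
print(f"P(mean < 34.86) = {p_auction:.4f}")
```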
Earlier Problem Revisited

What is the Central Limit Theorem? How does the Central Limit Theorem relate other distributions to the normal distribution?

The Central Limit Theorem says that the larger the sample size, the more closely the distribution of the means of multiple samples will approximate a normal distribution. Since that is true regardless of the original distribution, the CLT can be used as a bridge between other types of distributions and the normal distribution.

Examples

Example 1
The time it takes to drive from Cheyenne, WY to Denver, CO has a \(\mu\) of 1 hour and a \(\sigma\) of 15 minutes. Over the course of a month, a highway patrolman makes the trip 55 times. What is the probability that his average travel time exceeds 60 minutes?

The mean of the sample means, \(\mu_{\bar{x}}\), is the same as the population mean: \(\mu = 1\) hour \(= 60\) minutes.

The standard deviation of the sample mean is \(\sigma_{\bar{x}} = \dfrac{15}{\sqrt{55}} = \dfrac{15}{7.42} = 2.02\) minutes.

The 55 trips made by the patrolman exceed the minimum sample size of 30 required to apply the CLT, so we may assume the sample means to be normally distributed.

The z-score of the patrolman’s average time is: \(Z = \dfrac{60 - 60}{2.02} = \dfrac{0}{2.02} = 0\)

According to the z-score percentage reference, a z-score of 0 corresponds to 0.50, or 50%.

There is a 50% probability that the patrolman’s mean travel time is greater than 60 minutes.
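For "greater than" questions like this one, the probability is the complement of the cumulative area. The sketch below illustrates that under the CLT approximation; `prob_mean_above` is a hypothetical helper name. As a second check, it is also applied to the mid-term example from earlier in the lesson, taking that problem's values to be \(\mu = 1\) hour, \(\sigma = 1\) hour, and \(n = 64\).

```python
from math import sqrt
from statistics import NormalDist

def prob_mean_above(mu, sigma, n, threshold):
    """P(sample mean > threshold) under the CLT normal approximation."""
    sampling_dist = NormalDist(mu, sigma / sqrt(n))
    return 1.0 - sampling_dist.cdf(threshold)  # complement of the left-hand area

# Patrolman: mu = 60 min, sigma = 15 min, n = 55 trips, threshold = 60 min.
p_patrol = prob_mean_above(mu=60, sigma=15, n=55, threshold=60)

# Mid-term example: mu = 1 h, sigma = 1 h, n = 64 tests, threshold = 0.8 h.
p_midterm = prob_mean_above(mu=1.0, sigma=1.0, n=64, threshold=0.8)

print(f"P(mean trip time > 60 min) = {p_patrol:.4f}")
print(f"P(mean test time > 0.8 h)  = {p_midterm:.4f}")
```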
Example 2

Abbi polls 95 high school students for their GPA. According to the school, the GPA of high school students has a mean of 3.0 and a standard deviation of 0.5. What is the probability that Abbi’s random sample will have a mean within 0.01 of the population mean?

The mean of the sample means for the 95 polled GPAs is the same as the population mean: 3.0.

The standard deviation of the sample mean is \(\sigma_{\bar{x}} = \dfrac{0.5}{\sqrt{95}} = \dfrac{0.5}{9.75} \approx 0.05\)

The 95 sampled GPAs exceed the minimum sample size of 30, so we may apply the CLT.

The z-scores of the minimum and maximum values in the range of interest, 2.99 to 3.01, are:

\[Z_1 = \frac{2.99 - 3.0}{0.05} = \frac{-0.01}{0.05} = -0.2 \qquad Z_2 = \frac{3.01 - 3.0}{0.05} = \frac{0.01}{0.05} = 0.2\]

Referring to the z-score reference table, the z-scores -0.2 and 0.2 bound an area of \(0.5793 - 0.4207 = 0.1586\), or approximately 15.86%.
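The table lookups used in this example can be reproduced programmatically. The sketch below mimics the printed reference table by rounding z to two decimals and the area to four; `table_lookup` is a hypothetical helper name. Unlike the printed table, which lists only non-negative z, the function evaluates negative z directly, which is equivalent to using the symmetry \(P(Z < -z) = 1 - P(Z < z)\).

```python
from statistics import NormalDist

def table_lookup(z):
    """Mimic the printed z-table: round z to two decimals and return
    P(Z < z) for a standard normal variable, rounded to four decimals."""
    return round(NormalDist().cdf(round(z, 2)), 4)

lower = table_lookup(-0.2)  # area to the left of Z1 = -0.2
upper = table_lookup(0.2)   # area to the left of Z2 = 0.2

print("P(Z < -0.2) =", lower)
print("P(Z <  0.2) =", upper)
print("area between =", round(upper - lower, 4))
```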
Example 3

A recipe website has calculated that the time it takes to cook Sunday dinner has a \(\mu\) of 1 hour with a \(\sigma\) of 25 minutes. Over the course of a month, 172 users report the time they spent cooking Sunday dinner. What is the probability that the average reported time is less than 45 minutes?

The mean of the sample means, \(\mu_{\bar{x}}\), is the same as the population mean: \(\mu = 1\) hour \(= 60\) minutes.

The standard deviation of the sample mean is \(\sigma_{\bar{x}} = \dfrac{25}{\sqrt{172}} = \dfrac{25}{13.11} = 1.91\) minutes.

The 172 users reporting cooking times exceed the minimum sample size of 30 required to apply the CLT, so we may assume the sample means to be normally distributed.

The z-score of the average reported cooking time is: \(Z = \dfrac{45 - 60}{1.91} = \dfrac{-15}{1.91} \approx -7.85\)

According to the z-score percentage reference, a z-score of -7.85 is far beyond the end of the table and corresponds to approximately 0%.

There is essentially zero probability that 172 users would average less than 45 minutes.
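A z-score this extreme falls off the end of the printed table, so it can be useful to see how small "essentially zero" is. The sketch below uses the complementary error function `math.erfc`, which stays accurate for very small tail areas; the helper name `lower_tail` is a hypothetical one, and the identity \(P(Z < z) = \tfrac{1}{2}\,\mathrm{erfc}(-z/\sqrt{2})\) is the standard relation between the normal distribution and the error function. The z-score is computed without rounding the standard deviation, so it differs slightly from the -7.85 above.

```python
from math import erfc, sqrt

def lower_tail(z):
    """P(Z < z) for a standard normal variable, computed with erfc so that
    very small tail probabilities are not rounded away to zero."""
    return 0.5 * erfc(-z / sqrt(2))

# Sunday dinner: mu = 60 min, sigma = 25 min, n = 172 users, threshold = 45 min.
z_dinner = (45 - 60) / (25 / sqrt(172))
p_dinner = lower_tail(z_dinner)

print(f"z = {z_dinner:.2f}")
print(f"P(mean < 45 min) = {p_dinner:.2e}")
```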
Review

- 128 randomly sampled students reported how much they spent on a movie at the theater. If the national average amount spent at the movies has a mean of $15 and a standard deviation of $8, what is the probability that the random sample will give a result within $0.01 of the true value?
- The time an American family spends doing dishes in the evening has \(\mu = 60\) minutes and \(\sigma = 60\) minutes. 58 Americans were polled to find the time they spend doing dishes. What is the probability that their average time exceeds 60 minutes?
- Rachel asked 65 second-year college students how many credits they have taken. According to the colleges, the average number of credits taken by 2nd-year students is 15, with a standard deviation of 7. How likely is it that Rachel got less than 17.17 on average?
- What do you need in order to apply the Central Limit Theorem to sample means?
- 117 business women were asked how much they spend for lunch, on average. If the national average has a mean of $30 and a standard deviation of $9, what is the probability that the random sample will return a result within $0.01 of the true value?
- According to the phone company, the daily average number of calls made by Americans is 30, with a standard deviation of 10. What is the probability that 117 Americans reported fewer than 30.92 calls per day, on average?
- The time spent by the average technician repairing a laptop is governed by an exponential distribution where \(\mu\) and \(\sigma\) are each 60 minutes. In the month of June, a technician repairs 76 laptops. How likely is it that the average repair time is greater than 77 minutes?
- 46 teenagers were asked how many .mp3s they purchase each month. According to .mp3 sales data, the average has a mean of 15, with a standard deviation of 2. How likely is it that the 46 polled teens averaged within 0.02 of the national average?
- 44 classrooms were investigated to see how many students they contained. According to school data, the average number of students per classroom is 35, with a standard deviation of 10. How likely is it that the 44 classrooms averaged fewer than 33.49 students?
- 100 bags of candy were counted to see how many pieces they contained. According to the company that fills the bags, the average number of candies per bag has a mean of 50 and a standard deviation of 10. What is the probability that the 100 bags will have an average number within 0.02 of the production average?