3.3 Stratified Sampling
Suppose you wanted to find out if age influences the choice of classes for students at a particular university. You might divide the students up by age ranges such as: Under 18, 18 – 21, 21 – 25, 25 – 35, and 35 and over. How could you make sure a random sample of college students would have members of each age range?
Look to the end of the lesson for the answer.
Stratified Random Sampling
Stratified random sampling is an excellent method of choosing members of a sample when there are clearly defined subgroups in the population you are studying. Each subgroup, called a stratum (strata if plural), should have a clearly defined characteristic that separates the members from the rest of the population.
To implement stratified sampling, first find the total number of members in the population, and then the number of members of each stratum. For each stratum, divide the number of members by the total number in the entire population to get the percentage of the population represented by that stratum. Finally, multiply that percentage by the number of units you want in your final sample group to see how many you need from each stratum. Round any decimals to the nearest whole unit, since you cannot take a fraction of a sample member.
As a formula, this process looks like:
number needed from stratum = (members in stratum ÷ total population) × desired sample size
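The same calculation can be sketched in Python. This is a minimal illustration rather than part of the original lesson: the function name `stratum_sample_size` and the example figures (a stratum of 30 in a population of 120, with a desired sample of 20) are made up for demonstration. It rounds to the nearest whole unit, which is how the worked examples in this lesson round (for instance, 74.2 becomes 74).

```python
def stratum_sample_size(stratum_count, population_total, sample_size):
    """Return how many units to draw from one stratum (hypothetical helper)."""
    if population_total <= 0:
        raise ValueError("population_total must be positive")
    share = stratum_count / population_total  # fraction of the population in this stratum
    exact = share * sample_size               # ideal count, possibly fractional
    return int(exact + 0.5)                   # round half up to a whole unit


# Made-up figures: 30 of 120 members are in the stratum, and we want a sample of 20.
print(stratum_sample_size(30, 120, 20))  # 30/120 = 0.25 and 0.25 * 20 = 5
```

The sketch uses `int(exact + 0.5)` instead of Python's built-in `round`, because `round` sends exact halves to the nearest even number, which may not be what a hand calculation would do.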
Conducting a Stratified Sample
How many Blue Heelers would you need for a stratified sampling of 50 dogs from a population consisting of:
- 247 Collies
- 138 Pit Bulls
- 96 English Mastiffs
- 172 Blue Heelers
- 222 Welsh Corgis
First identify the total number of dogs in the population:
247 + 138 + 96 + 172 + 222 = 875 dogs
Then divide the number of Blue Heelers by the population count:
172 / 875 = 0.197 or 19.7%
Finally, multiply this number by the desired sample size:
0.197 × 50 = 9.85, which rounds to 10 Blue Heelers
Determining Number of Participants Needed
How many members would you need from each age stratum to obtain a stratified sample of 350 from the following population?
| Age | Count |
| --- | --- |
| 15yrs to 18yrs | 297 |
| 18yrs to 21yrs | 349 |
| 21yrs to 27yrs | 323 |
| 27yrs to 35yrs | 240 |
| 35yrs to 42yrs | 191 |
First find the total population count:
297 + 349 + 323 + 240 + 191 = 1400
Then divide the count of each stratum by the total to get the percentage:
| Age | Count | % |
| --- | --- | --- |
| 15yrs to 18yrs | 297 | 21.2% |
| 18yrs to 21yrs | 349 | 24.9% |
| 21yrs to 27yrs | 323 | 23.1% |
| 27yrs to 35yrs | 240 | 17.1% |
| 35yrs to 42yrs | 191 | 13.6% |
Finally, multiply the percentage of each stratum by the desired sample size:
- 15 – 18yrs: 21.2% of 350 = 74.2, round to 74
- 18 – 21yrs: 24.9% of 350 = 87.15, round to 87
- 21 – 27yrs: 23.1% of 350 = 80.85, round to 81
- 27 – 35yrs: 17.1% of 350 = 59.85, round to 60
- 35 – 42yrs: 13.6% of 350 = 47.6, round to 48
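The whole allocation for the age strata can be checked with a short Python sketch. The helper name `allocate` and the dictionary keys are assumptions made for illustration. The code works from the exact proportions rather than percentages rounded to one decimal place, so a result can differ by one from a hand calculation that rounds the percentage first.

```python
def allocate(strata, sample_size):
    """Split sample_size across strata in proportion to their counts (hypothetical helper)."""
    total = sum(strata.values())
    # int(x + 0.5) rounds each ideal count to the nearest whole unit (halves round up)
    return {name: int(count / total * sample_size + 0.5)
            for name, count in strata.items()}


# Counts taken from the age table in this example
ages = {"15-18": 297, "18-21": 349, "21-27": 323, "27-35": 240, "35-42": 191}

allocation = allocate(ages, 350)
for name, needed in allocation.items():
    print(f"{name} yrs: {needed}")

# Rounding each stratum separately does not guarantee the parts add up
# to the target, so it is worth checking the total.
print("total:", sum(allocation.values()))
```

If the printed total ever misses the desired sample size by one or two, a common remedy is to adjust the stratum whose ideal count was furthest from its rounded value.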
Determining Appropriate Number of Samples Needed
Would it be appropriate to use 42 samples of green and 78 samples of blue marbles for a stratified sample of 120 marbles from a population of 960 green and 1500 blue marbles?
Just compare the ratios of each color:
- Green sample ratio: 42/120 = 0.35
- Green population ratio: 960/2460 = 0.39
- Blue sample ratio: 78/120 = 0.65
- Blue population ratio: 1500/2460 = 0.61
Looking at the ratios, we can see that the sample does not quite match the actual population. There should be 47 green and 73 blue marbles in the sample. This may not seem like enough of a difference to pose a problem, but notice that the 5 missing green marbles are more than 10% of the 47 green marbles the sample should contain, and the 5 extra blue marbles are nearly 7% of the 73 blue marbles it should contain. That is enough to possibly skew the results.
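A comparison like this is easy to automate. The following Python sketch is illustrative only: the helper name `expected_counts` is an assumption, and the marble counts are the ones from this example. It computes how many of each color a proportional sample should contain and reports how far the proposed sample is from that.

```python
def expected_counts(population, sample_total):
    """Return the proportional count per stratum for a sample of sample_total (hypothetical helper)."""
    pop_total = sum(population.values())
    # int(x + 0.5) rounds each ideal count to the nearest whole unit
    return {name: int(count / pop_total * sample_total + 0.5)
            for name, count in population.items()}


population = {"green": 960, "blue": 1500}   # marbles in the population
proposed = {"green": 42, "blue": 78}        # marbles in the proposed sample

expected = expected_counts(population, sum(proposed.values()))
for color in population:
    difference = proposed[color] - expected[color]
    print(f"{color}: proposed {proposed[color]}, expected {expected[color]}, "
          f"difference {difference:+d}")
```

A negative difference means the stratum is under-represented in the proposed sample, and a positive one means it is over-represented.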
Earlier Problem Revisited
Suppose you wanted to find out if age influences the choice of classes for students at a particular university. You might divide the students up by age ranges such as: Under 18, 18 – 21, 21 – 25, 25 – 35, and 35 and over. How could you make sure a random sample of college students would have members of each age range?
By now, I’m sure you can see that a stratified sample would be perfect for this situation. Treat each age range as a stratum and randomly select students from each one in proportion to its share of the student body, so that every age range is represented.
Examples
Example 1
Ivana wants to create a sample of the students in her school to see if it would be a good idea to put up posters of country music bands in each grade’s locker hall. Is this a good situation to use a stratified sample?
Yes, absolutely. Ivana will want to get a sample of the students in the school that is stratified by grade level to make sure each grade will appreciate the posters, since she plans to put them up in each hall.
Example 2
If Laurana wants to create a stratified sample of the distance an arrow can be shot from each of several different types of bows in the population of bows from her tribe, will she need to get a complete count of every single bow owned by every tribe member?
Inconveniently, yes. If she does not get a full count, she will not be able to come up with an accurate ratio to 'aim for' in her bow sample. Since she wants to use her sample to make predictions about the entire population, she needs to be sure she has a true random sample. She needs to be certain that each bow has an equal chance of ending up in the sample.
Example 3
If Tanis wants to investigate the waterproofing of Kitiara’s 200 pairs of boots, should he first try to separate them into different groups by style or maker?
It would be a good idea, yes. Boots from the same maker or of the same style are likely to be more similar to each other than to the population of boots as a whole.
Review
For questions 1-5, assume you intend to create a stratified sample of 250 from a population of 920 trucks, 1540 subcompact cars, 1320 sedans, 450 motorcycles, 110 R.V.’s, 550 luxury cars, and 780 sports cars.
1. What percentage of the population is represented by sedans?
2. How many motorcycles should you have in your sample?
3. How many subcompacts should you have in your sample?
4. Is 10 R.V.’s a good number for your sample?
5. Should you have more than 15% of your sample represented by trucks?
For questions 6-10, assume your stratified sample consists of 29 cats, 62 small dogs, 48 large dogs, 19 birds, 37 pot-bellied pigs, and 55 horses. Assume the total population of pets is 6474.
6. How many horses are there in the entire population?
7. What percentage of the population is represented by dogs?
8. Are there more than 1000 pot-bellied pigs in the population?
9. What would the total population be if there were no horses?
10. What percent of the sample is made up of cats?
For questions 11-15, decide whether a stratified sample is warranted and why.
11. The estimated mileage of U.S. automobiles compared to vehicle weight.
12. The average height of college basketball players.
13. The G.P.A. of students in various sports.
14. The number of students in your school with access to the internet.
15. Homework grades of sports participants.