2.1 Introduction to Sampling
Suppose you were chosen to help pick out a theme for your school prom. Out of all of the initial suggestions offered by your team, you have narrowed the options down to 3: Famous Couples through the Ages, Romance Under the Sea, and Stairway to Heaven.
Since this is the Senior Prom, you feel that the Senior Class should make the final call. Unfortunately, there are over three hundred seniors in your school, and your deadline for a decision is in one hour! How could you get a good idea of the preference of the class as a whole in such a limited time?
By the end of this lesson, you should have no problem suggesting a good solution!
Introduction to Sampling
There are many situations in life where we need to gather data on a very large or difficult-to-study population. Certainly it is ideal in most cases to be able to individually poll each and every member, but sometimes that just isn’t feasible.
In such cases, the solution is to use a sample, or subset, that is carefully picked to accurately represent the full population. An experiment conducted on a well-chosen sample should provide an accurate representation of the results you would get by performing the same experiment on the population from which the sample was drawn.
There are many different ways to choose a sample, and all have applications for which they are more or less appropriate.
A few examples of sampling methods:
- Random Sampling (choosing representatives by rolling a die, for instance)
- Stratified Sampling (choosing a proportional number of representatives from each of a number of subgroups of the initial population). These divisions are chosen based on the belief that the subgroups differ significantly with respect to the variable that you are measuring. For example, you might stratify by age or by income.
- Cluster Sampling (choosing representatives which are close to other representatives based on a particular factor such as location, age, color, size, etc.)
- Multi-Stage Sampling (narrowing down a field of representatives by successively applying multiple different sampling methods). For example, you might stratify and then take a simple random sample from each stratum.
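The four methods listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population of 300 students, the `grade` and `homeroom` fields, and all function names are hypothetical and do not come from the lesson. The multi-stage function shows just one possible combination (random clusters, then a random sample inside each).

```python
import random
from collections import defaultdict

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 300 students with a grade level and a homeroom.
population = [
    {"id": i, "grade": rng.choice([9, 10, 11, 12]), "homeroom": rng.randrange(12)}
    for i in range(300)
]

def group_by(items, key):
    """Split the population into subgroups that share the same value of `key`."""
    groups = defaultdict(list)
    for item in items:
        groups[item[key]].append(item)
    return groups

def simple_random_sample(items, n):
    """Random sampling: every member has the same chance of being chosen."""
    return rng.sample(items, n)

def stratified_sample(items, key, fraction):
    """Stratified sampling: take the same fraction from every subgroup."""
    sample = []
    for members in group_by(items, key).values():
        sample.extend(rng.sample(members, round(len(members) * fraction)))
    return sample

def cluster_sample(items, key, n_clusters):
    """Cluster sampling: pick whole groups at random and keep everyone in them."""
    groups = group_by(items, key)
    chosen = rng.sample(sorted(groups), n_clusters)
    return [item for c in chosen for item in groups[c]]

def multi_stage_sample(items, key, n_clusters, n_per_cluster):
    """Multi-stage sampling: random clusters first, then a random sample in each."""
    groups = group_by(items, key)
    chosen = rng.sample(sorted(groups), n_clusters)
    sample = []
    for c in chosen:
        members = groups[c]
        sample.extend(rng.sample(members, min(n_per_cluster, len(members))))
    return sample

print(len(simple_random_sample(population, 30)))
print(len(stratified_sample(population, "grade", 0.1)))
print(len(cluster_sample(population, "homeroom", 3)))
print(len(multi_stage_sample(population, "homeroom", 3, 5)))
```

Because each stratum is rounded separately, the stratified sample may come out a member or two above or below the intended total.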
Understanding When to Use Sample Groups
Would it be necessary to use a sample group to evaluate the effects of too much sugar on a group of 15 elementary-school children? What about a playground full of 300 children?
A group of 15 children certainly seems small enough to study directly, so choosing a sample to represent the whole group is probably not necessary from that standpoint. However, this is the type of study where a control group would be an important consideration. If you just gave an extra handful of candy to every child, you would not know how much of the later energy actually came from the sugar, and how much was just a result of age. By pulling aside a control group of perhaps 6 students who would not get the extra sugar, you could better evaluate the difference in energy actually due to diet rather than age.
With 300 children all running around a playground, collecting them all together and attempting to organize a study might prove a daunting task. If you just chose a sample of perhaps 30 of them, some a little older, some younger, some boys, some girls, you could get an estimate of what would happen if you applied the study to the entire group.
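Both situations can be sketched with Python's standard `random` module. This is a minimal sketch, assuming the children are simply numbered; the labels `child_1`, `child_2`, and so on are placeholders, and choosing the control group at random is an assumption (the lesson only says to pull a control group aside).

```python
import random

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Small group: 15 hypothetical children, 6 of them set aside as a control group.
children = [f"child_{i}" for i in range(1, 16)]
control = set(rng.sample(children, 6))                  # no extra sugar
treatment = [c for c in children if c not in control]   # extra sugar

# Large group: study a sample of 30 out of 300 children on the playground.
playground = [f"child_{i}" for i in range(1, 301)]
playground_sample = rng.sample(playground, 30)

print(len(control), len(treatment), len(playground_sample))
```

Picking the control group at random, rather than by hand, helps keep age and other traits from differing systematically between the two groups.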
Choosing the Appropriate Sampling Method
Suppose you wanted to study the effect of rubbing marbles with candle wax before playing a classic game of marbles. After setting aside a control group, you are ready to choose a sample set of marbles to rub with the wax. Would a stratified sampling of the remaining marbles be a good choice in this situation?
Probably not. Marbles are generally created to be as alike as possible in every way other than appearance, and since appearance is unlikely to have an effect on the result of the wax experiment, it would not make sense to carefully attempt to represent each color or type of decoration. A random sample would be simpler and would very likely yield the same results.
Recognizing Sampling Errors
The student council at your school has been given an assignment to find a good use for a grant that the school received to make school more enjoyable for the students. After a week or two of deliberation, the council announces that the studies they have conducted suggest that providing the cheerleading squad with new pom-poms is the #1 priority of 90% of the students in the school. Of course, the chess club members disagree and conduct their own study. If the chess team chooses a sample the same way the student council did, and their results suggest that 90% of respondents think that the money should go toward new chess clocks, what error do you think both groups committed in the choice of sample groups for study?
It would certainly appear that both groups were guilty of a process called ‘cherry-picking’, which means that they deliberately chose to question people who shared the same interests in order to get favorable results from their polls. Obviously neither group’s results are likely to be representative of the entire student body, but rather only represent the views of the chess team and the cheerleading squad!
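The effect of cherry-picking can be illustrated with a small simulation. Every number here is made up for the sketch: a school of 400 students, 20 cheerleaders and 20 chess players, and the preference counts are assumptions chosen so that each club's own poll shows 90%.

```python
import random

rng = random.Random(2)  # fixed seed so the sketch is repeatable

# Hypothetical school of 400 students and what each would spend the grant on.
students = (
    [{"group": "cheer", "wants": "pom-poms"}] * 18
    + [{"group": "cheer", "wants": "other"}] * 2
    + [{"group": "chess", "wants": "chess clocks"}] * 18
    + [{"group": "chess", "wants": "other"}] * 2
    + [{"group": "neither", "wants": "other"}] * 360
)

def share(sample, choice):
    """Fraction of the sample that prefers `choice`."""
    return sum(s["wants"] == choice for s in sample) / len(sample)

# Cherry-picked samples: each club polls only its own members.
cheer_poll = [s for s in students if s["group"] == "cheer"]
chess_poll = [s for s in students if s["group"] == "chess"]

# A random sample of 40 students drawn from the whole school.
random_poll = rng.sample(students, 40)

print("cheer poll, pom-poms:", share(cheer_poll, "pom-poms"))
print("chess poll, chess clocks:", share(chess_poll, "chess clocks"))
print("whole school, pom-poms:", share(students, "pom-poms"))
print("random poll, pom-poms:", share(random_poll, "pom-poms"))
```

In this made-up school both clubs can truthfully report 90% support from their respondents, even though fewer than 5% of all students share either preference.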
Earlier Problem Revisited
Suppose you were chosen to help pick out a theme for your school prom. Out of all of the initial suggestions offered by your team, you have narrowed the options down to 3: Famous Couples through the Ages, Romance Under the Sea, and Stairway to Heaven.
Since this is the Senior Prom, you feel that the Senior Class should make the final call. Unfortunately there are over three hundred seniors in your school, and your deadline for a decision is in one hour! How could you get a good idea of the preference of the class as a whole in such a limited time?
This is an excellent case of the need for a representative sample of a population. Without having the time to poll all of the members of the senior class, you could get an idea of what the most popular theme would be by choosing a smaller number of seniors to represent the entire class. Just be careful to minimize the chance that your chosen representatives have any sort of bias that might keep them from properly representing the class as a whole.
Examples
Example 1
What kind of sampling would you expect was used if the sample group was composed of 5 yellow, 3 green, 4 red, and 6 blue members, and the population included 48 blue, 32 red, 24 green, and 40 yellow members?
Since the sample group contains the same proportion of each color as the entire population (exactly one-eighth of the members of each color), it is reasonable to suspect that a stratified sampling was used.
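The proportions in Example 1 can be checked directly. The sketch below uses the counts from the example; the helper name `allocate` is an assumption introduced here, and it simply rounds each stratum's proportional share.

```python
from fractions import Fraction

# Counts taken from Example 1.
population = {"blue": 48, "red": 32, "green": 24, "yellow": 40}
sample = {"blue": 6, "red": 4, "green": 3, "yellow": 5}

# Sampling fraction for each color: sample count divided by population count.
fraction_by_color = {c: Fraction(sample[c], population[c]) for c in population}
is_proportional = len(set(fraction_by_color.values())) == 1

def allocate(counts, n):
    """Proportional allocation: each stratum gets its share of n, rounded."""
    total = sum(counts.values())
    return {c: round(Fraction(count * n, total)) for c, count in counts.items()}

print(fraction_by_color)
print(is_proportional)
print(allocate(population, 18))
```

Every color has a sampling fraction of 1/8, which is the signature of a proportional stratified sample. In general the rounded shares may not add up to exactly `n`, so a real allocation might need a small adjustment.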
Example 2
What type(s) of sampling method(s) might be most appropriate for approximating the number of cutthroat trout in a 25-mile section of river?
A 25-mile-long section of river is likely to include a number of different types of ecosystems, each of which would harbor a different density of fish. In order to get a good sample, a multi-stage sampling method consisting of a stratified sample of the different ecosystems, followed by a random sampling of fish in each ecosystem, would probably be a good choice.
Example 3
Would you reasonably expect bias to have affected a sample composed of 75% Toyota vehicles in a study of the most common cars in large U.S. cities?
Although Toyota is a very popular vehicle manufacturer, 75% is an extremely high percentage of vehicles in a large city (reasonable estimates put Toyota somewhere between 25 and 30 percent). Such a huge number would definitely suggest sample bias.
Example 4
Would a random sampling of students be the most appropriate method of sampling for a study of the most enjoyable after-school club in a large public school?
Probably not, since a random sampling would likely include a large number of students who either have no opinion or have no experience with any after-school clubs. More accurate results would be obtained by a multi-stage sample that first identified club members, and then randomly selected representatives from them.
Example 5
What might you conjecture about a study that claims 100% of respondents preferred “Super Sweet and Crunchy” cereal over “Super Duper Sweet” cereal?
There are a number of reasonable specific conjectures we might make, most related to inaccurate sampling methodology. Perhaps the sample was chosen from employees of the “Super Sweet and Crunchy” cereal company. Perhaps respondents were offered a reward for choosing one option over the other. Perhaps there was only a single member of the sample group. Or perhaps the “study” didn’t include milk for the other cereal, or didn’t offer samples of “Super Duper Sweet” to respondents at all.
Review
1. Margo collected 12 carrots in a bag. She drew 5 carrots out of the bag. Is this a random sample of the carrots in the bag?
2. Chris put some assorted colored kerchiefs into a box. He looks into the box and pulls out the blue kerchiefs. Is this a random sample of the kerchiefs in the box?
3. Sue had red and white beans in a jar. She reached in and pulled out 10 beans, without looking in the jar. Is this a random sample of beans from the jar?
For questions 4-6, identify the population and the sample from each:
For example: In a class of 20 students, where each student is asked if they have gone to the movies in the past month, you would identify the population as 20 students, and the sample as 20 students.
4. People aboard a plane who have aisle seats are asked if they travel more than 5000 miles per year.
a. Population:
b. Sample:
5. A team of marketing specialists survey every sixth child entering a park to find out how many rides they plan to go on while playing in the park.
a. Population:
b. Sample:
6. Every 15th adult at the exit door of the grocery store is questioned to find out if the store should increase its hours of operation.
a. Population:
b. Sample:
7. Luke wants to find out where most high school students buy their food for lunch. He surveys every fourth student he sees in the high school parking lot and asks them where they get food for lunch. Which of the following would have been an improvement to Luke’s experiment?
a. Survey all of the students in the school.
b. Survey all people in the parking lot.
c. Survey students in the lunch hall.
8. Sue is trying to determine the best location to sell snow cones. There are 4 locations in the city (on a side street, downtown, near a park, and at a school). Sue observed that many people visit the downtown area and the park. Sue decided to sell snow cones in the downtown area where she saw the most people gather. What changes to Sue’s sample would have given her a better understanding of where to sell snow cones?
9. Kerry collected shells from a visit to the ocean in a shoebox. She takes out a handful of shells from the box. Is this a random sample of the shells in the box?
10. There are four dentists in a city. Their offices are located in four different parts of the city. Jake wants to attempt to figure out which dentist has the most patients. He observed that the Downtown and West Street areas have larger populations. He concluded that the dentists in those areas must have more patients. After comparing those two areas, he decided that the West Street dentist had the most patients because the area had more traffic. What changes to Jake’s technique would have given him a better understanding of which dentist had the most patients?
11. Caroline wants to predict which restaurant will have less business during the Christmas season. There are three restaurants in the city. Two are on the outskirts of the city and one is in the city. She knows that two hotels situated on the outskirts are fully booked because one has a Christmas show and one has a huge indoor pool. From this information she inferred that the restaurant in the city will have less business during the Christmas season. What could Caroline do to improve her experiment?
a. Ask people at the hotels if they like fast food.
b. Survey all people to see which December holiday they celebrate.
c. Look at the past holiday performance of the restaurants.
The table gives information about the number of girls in each of four schools.
School            A     B     C     D     Total
Number of Girls   126   82    201   52    461
12. Jenny did a survey of these girls. She used a stratified sample of exactly 80 girls according to school. Calculate the number of girls from each school that were in her sample of 80. Complete the table.
School            A     B     C     D     Total
Number of Girls   ___   ___   ___   ___   80