9.2 Sampling Distributions
Lesson Objectives
Use the mean and standard deviation of a data set to fit it to a normal distribution and to estimate population percentages.
Introduction: Traveling the World
Trevor is conducting a study to determine the proportion of Americans living in Pennsylvania who have never left the country. He takes a convenience sample of 20 people and finds that 60% have never left the country. A dot plot is constructed for Trevor's data below. The dot represents the proportion of the 20-person sample that has not left the country.
Chloe is conducting a study on the same question, but she repeats the same sampling process 30 times. A dot plot is constructed for Chloe's data below. Each dot represents the proportion of one 20-person sample that has not left the country.
Discussion Question: What information does Chloe have that Trevor doesn't?
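One way to see the difference between Trevor's and Chloe's data is to simulate both studies. The sketch below is only an illustration: it assumes a hypothetical true proportion of 0.6 (the lesson does not give the real population value) and uses simple random sampling in place of a convenience sample. The names `TRUE_P` and `sample_proportion` are made up for this example.

```python
import random
import statistics

TRUE_P = 0.6      # assumed population proportion (hypothetical, for illustration)
SAMPLE_SIZE = 20  # people per sample, as in the lesson


def sample_proportion(rng, n=SAMPLE_SIZE, p=TRUE_P):
    """Proportion of n simulated people who have never left the country."""
    never_left = sum(rng.random() < p for _ in range(n))
    return never_left / n


rng = random.Random(1)  # fixed seed so the run is repeatable

trevor = sample_proportion(rng)                      # one sample: one dot
chloe = [sample_proportion(rng) for _ in range(30)]  # thirty samples: thirty dots

print("Trevor's single estimate:", trevor)
print("Mean of Chloe's estimates:", round(statistics.mean(chloe), 3))
print("Standard deviation of Chloe's estimates:", round(statistics.stdev(chloe), 3))
```

Trevor's run produces a single number, so nothing in it shows how far that number might be from the truth. Chloe's thirty numbers have a center and a spread, which is the extra information a dot plot of many samples displays.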
Activity 1: Estimators
In the previous section, we used sample statistics to make estimates about populations. A statistic that is used to estimate a parameter is known as an estimator. In the introduction, Chloe's sample proportions are displayed in a sampling distribution. A sampling distribution is the probability distribution of a statistic computed from random samples. Sampling distributions help us understand how accurately sample data estimate population values. Each sample within the distribution is referred to as a trial. Sampling distributions are commonly used to display the means, medians, or proportions of multiple samples. Use the interactive below to practice estimating parameters from sampling distributions.
Discussion Question: Based on the questions in the inline question set above, explain the effect that increasing the sample size would have on the mean and standard deviation of the sampling distribution.
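A simulation is one way to check an answer to this question. The sketch below assumes a hypothetical population proportion of 0.6 and builds a sampling distribution of 2000 trials for each of three sample sizes; the function name `sampling_distribution` and the chosen sizes are illustrative, not part of the lesson.

```python
import random
import statistics

TRUE_P = 0.6  # assumed population proportion (hypothetical)


def sampling_distribution(rng, sample_size, trials=2000, p=TRUE_P):
    """Simulate `trials` sample proportions, each based on `sample_size` people."""
    return [
        sum(rng.random() < p for _ in range(sample_size)) / sample_size
        for _ in range(trials)
    ]


rng = random.Random(2)  # fixed seed so the run is repeatable
results = {}
for n in (20, 80, 320):
    props = sampling_distribution(rng, n)
    # Store (mean, standard deviation) of the sampling distribution for this n.
    results[n] = (statistics.mean(props), statistics.stdev(props))
    print(f"n={n:4d}  mean={results[n][0]:.3f}  sd={results[n][1]:.3f}")
```

In runs like this, the mean of the sampling distribution stays close to the assumed proportion at every sample size, while the standard deviation shrinks as the sample size grows. For a proportion, theory gives a standard deviation of sqrt(p(1 - p)/n), so quadrupling the sample size roughly halves the spread.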
Activity 2: Understanding Sampling Distributions
In the interactive below, we will use a probability distribution to generate sample data, which we will compare to the original distribution. Use this interactive to examine how the distribution of the population data affects the sampling distribution.
Discussion Question: The distribution of sample means remains approximately normal for any population distribution as long as the sample size is sufficiently large. Why do you think the distribution of sample means is normal even when the population distribution isn't?
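The behavior described in this activity can also be explored without the interactive. The sketch below assumes a strongly right-skewed population (an exponential distribution with mean 1, chosen only for illustration) and measures how lopsided the distribution of sample means is for several sample sizes. The helper names `skewness` and `sample_means` are hypothetical.

```python
import random
import statistics


def skewness(values):
    """Skewness: near 0 for a symmetric shape, positive for a long right tail."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)


def sample_means(rng, sample_size, trials=3000):
    """Means of `trials` samples drawn from a right-skewed exponential population."""
    return [
        sum(rng.expovariate(1.0) for _ in range(sample_size)) / sample_size
        for _ in range(trials)
    ]


rng = random.Random(3)  # fixed seed so the run is repeatable
skew_by_size = {}
for n in (1, 5, 40):
    means = sample_means(rng, n)
    skew_by_size[n] = skewness(means)
    print(f"sample size {n:2d}: mean of sample means = {statistics.fmean(means):.3f}, "
          f"skewness = {skew_by_size[n]:.2f}")
```

With a sample size of 1 the "sample means" are just the skewed population values. As the sample size grows, the skewness of the sample means moves toward 0, which is what a bell-shaped, symmetric distribution looks like.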
Activity 3: Estimator Bias
Earlier in this lesson, we saw that the distribution of sample means is approximately normal, even if the population distribution is not. This idea is the basic principle of the Central Limit Theorem.
Central Limit Theorem
Given an appropriately large sample size, the distribution of sample means will approximate a normal distribution.
This theorem allows us to use what we know about the normal distribution to examine the probability of obtaining a particular sample mean, regardless of the shape of the population distribution. Furthermore, we can use it to interpret the accuracy of a population estimate.
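As a rough illustration of using the normal distribution to find the probability of a sample mean, the sketch below works through one made-up case. The population mean (50), population standard deviation (12), sample size (36), and cutoff (54) are all hypothetical. The code also relies on the standard result, not stated in this lesson, that the standard deviation of the sample mean is the population standard deviation divided by the square root of the sample size.

```python
from math import sqrt
from statistics import NormalDist

POP_MEAN = 50.0  # hypothetical population mean
POP_SD = 12.0    # hypothetical population standard deviation
N = 36           # hypothetical sample size
CUTOFF = 54.0    # sample mean of interest

# Spread of the sample mean (often called the standard error).
standard_error = POP_SD / sqrt(N)

# Normal model for the distribution of sample means.
sampling_dist = NormalDist(mu=POP_MEAN, sigma=standard_error)

# Probability that a random sample of size N has a mean above the cutoff.
p_above = 1 - sampling_dist.cdf(CUTOFF)

print(f"Standard error: {standard_error:.2f}")
print(f"P(sample mean > {CUTOFF}): {p_above:.4f}")
```

Here the cutoff sits two standard errors above the population mean, so the probability is small. The population itself does not need to be normal for this estimate to be reasonable, provided the sample size is large enough.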
When determining an appropriately large sample size, consider the population distribution. The less normal the population distribution, the greater the sample size required to produce an approximately normal distribution of sample means. In general, many statisticians use 30 as the minimum sample size required to produce an approximately normal distribution of sample means. Keep in mind that this is not an official number, just a general rule of thumb. You can explore this using the interactive above.
Earlier in this lesson, we saw that the mean of a sampling distribution of sample means is centered on the population mean, which makes the sample mean an unbiased estimator. An unbiased estimator is a statistic whose values, averaged over many samples, equal the parameter it estimates. The mean of a sampling distribution of sample proportions is also an unbiased estimator, because the number of successes in a sample follows a binomial probability distribution, so the sample proportions average out to the population proportion. A biased estimator is a statistic whose values, averaged over many samples, systematically overestimate or underestimate the parameter. Use the interactive below to determine which estimators are biased and which are unbiased.
Discussion Question: What is the benefit of taking multiple samples and not just one big sample?
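Bias can be checked by simulation: average an estimator over many samples and compare the result with the known parameter. The sketch below is not the lesson's interactive. It uses a classic textbook comparison instead, with a hypothetical normal population (mean 0, standard deviation 2, so variance 4) and samples of size 5. It compares the sample mean and two versions of the sample variance.

```python
import random
import statistics

POP_MEAN = 0.0   # hypothetical population mean
POP_SD = 2.0     # hypothetical population standard deviation (true variance is 4)
SAMPLE_SIZE = 5
TRIALS = 5000

rng = random.Random(4)  # fixed seed so the run is repeatable
means, var_n, var_n_minus_1 = [], [], []
for _ in range(TRIALS):
    sample = [rng.gauss(POP_MEAN, POP_SD) for _ in range(SAMPLE_SIZE)]
    means.append(statistics.fmean(sample))
    var_n.append(statistics.pvariance(sample))         # divides by n
    var_n_minus_1.append(statistics.variance(sample))  # divides by n - 1

# Average each estimator over all trials and compare with the true values.
avg_mean = statistics.fmean(means)
avg_var_n = statistics.fmean(var_n)
avg_var_n_minus_1 = statistics.fmean(var_n_minus_1)

print(f"average sample mean:           {avg_mean:.3f}  (true mean 0)")
print(f"average variance, divide by n:   {avg_var_n:.3f}  (true variance 4)")
print(f"average variance, divide by n-1: {avg_var_n_minus_1:.3f}  (true variance 4)")
```

The sample mean and the variance that divides by n - 1 average out close to the true values, so they behave as unbiased estimators. The variance that divides by n comes out too small on average (near 4 × 4/5 = 3.2 for samples of size 5), which is what bias looks like.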
Summary
- A sampling distribution is the probability distribution of a statistic computed from random samples.
- The Central Limit Theorem states that given an appropriately large sample size, the distribution of sample means will approximate a normal distribution.
- An unbiased estimator is a statistic whose values, averaged over many samples, equal the parameter it estimates.
- A biased estimator is a statistic whose values, averaged over many samples, systematically overestimate or underestimate the parameter.
Wrap-Up: Review Questions