Section outline

  • In previous lessons, we calculated confidence intervals for the mean of a population based on data from a large sample ($n \geq 30$). Does that mean that sample sizes of less than 30 are useless? If not, how do you calculate a confidence interval based on data from a smaller sample?

    The T-Test

    When you are attempting to estimate the mean of a population, it is generally best to collect as many data points as possible from the population. Unfortunately, in the ‘real world’, samples are not always easy to collect. Sometimes you simply do not have access to the 30 or more data points required to use a $z$-score confidence interval. If you happen to have good data giving you the standard deviation of the population ($\sigma$), then it is generally permissible to use the $z$-score to calculate a confidence interval regardless. However, when you do not know $\sigma$, and you do not have enough data to estimate it reliably from your sample ($n < 30$), a $z$-score confidence interval is not reliable.

    Fortunately, there is a solution. A confidence interval can be calculated from a small sample when we do not know the population standard deviation, if we do two things differently:

    • Since we do not know $\sigma$, we instead use the best approximation for it that we have: $s$ (the sample standard deviation).
    • We modify the confidence interval formula to use a Student’s $t$-distribution reference, rather than the $z$-score percentage reference.

    Confidence Interval for $n < 30$: $\bar{x} \pm t_{\alpha/2}\left(\frac{s}{\sqrt{n}}\right)$
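
    The formula above can be sketched as a small Python function. This is only an illustration, not part of the original lesson: the name `t_confidence_interval` and the sample numbers in the usage lines are assumptions, and the $t$ multiplier still has to be read from a $t$-table by hand.

```python
import math

def t_confidence_interval(x_bar, s, n, t_mult):
    """Return (lower, upper) for x_bar ± t_mult * s / sqrt(n).

    x_bar  : sample mean
    s      : sample standard deviation
    n      : number of data points in the sample
    t_mult : t multiplier looked up in a t-table for df = n - 1
    """
    if n < 2:
        raise ValueError("need at least two data points")
    margin = t_mult * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin

# Hypothetical numbers: n = 9 gives df = 8, and t_{0.025} for df = 8 is 2.306.
lower, upper = t_confidence_interval(x_bar=50, s=6, n=9, t_mult=2.306)
print(f"{lower:.3f} to {upper:.3f}")
```

    Keeping the multiplier as an argument mirrors the lesson's workflow: the table lookup and the arithmetic stay separate steps.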

    The Student’s $t$-distribution is similar to the normal distribution, except that it is more spread out and has thicker tails. The differences are more exaggerated when there are fewer data points, and therefore fewer degrees of freedom. Degrees of freedom are essentially the number of data values that remain free to vary once the sample mean is fixed. A full description of degrees of freedom is beyond the scope of this lesson, but you can find many online lessons describing them if you are interested. For our purposes, all you really need to know about degrees of freedom is that there is always one less degree of freedom than the number of data points:

    $df = n - 1$
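
    To make the "thicker tails" point concrete, the sketch below compares a few two-tailed 95% $t$ multipliers with the corresponding $z$ multiplier. The $t$ values are copied from a standard $t$-table (they are not computed here), and the names `T_025`, `z_025`, and `df_for` are illustrative assumptions.

```python
from statistics import NormalDist

# t_{0.025} values (95% CL, two tails) copied from a standard t-table.
T_025 = {5: 2.571, 10: 2.228, 15: 2.131, 18: 2.101, 30: 2.042}

# The matching z multiplier, about 1.960.
z_025 = NormalDist().inv_cdf(0.975)

def df_for(n):
    """Degrees of freedom for a sample of n data points."""
    return n - 1

# The gap between t and z shrinks as the degrees of freedom grow.
for df, t in sorted(T_025.items()):
    print(f"df={df:2d}  t={t:.3f}  excess over z={t - z_025:.3f}")
```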

    The reason you need to know how to find the number of degrees of freedom is quite simple: the t-distribution has a different value for each number of degrees of freedom, as you can see in the reference below:

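    [Reference table: Student’s t-distribution values, with degrees of freedom down the left and tail areas across the top]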

    To use the reference, find the point of intersection of the degrees of freedom on the left with $\alpha/2$ (for two-tailed tests) or $\alpha$ (for one-tailed tests) across the top. If you need a refresher on the calculation of $\alpha$, you can reference the lesson Critical Values (essentially, $\alpha$ is the area in the tail(s) of the distribution, that is, $1.0 - \text{confidence level}$).
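
    The choice of column can be written as a short helper. This is a minimal sketch under the lesson's definitions; the function name `tail_area` and its `tails` parameter are assumptions introduced for illustration.

```python
def tail_area(confidence_level, tails=2):
    """Tail area to look up across the top of a t-table.

    alpha = 1.0 - confidence level; a two-tailed test splits alpha
    evenly between the tails, a one-tailed test keeps it in one tail.
    """
    if not 0 < confidence_level < 1:
        raise ValueError("confidence_level must be between 0 and 1")
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    alpha = 1.0 - confidence_level
    return alpha / tails

print(round(tail_area(0.98), 4))           # two-tailed, 98% confidence
print(round(tail_area(0.95, tails=1), 4))  # one-tailed, 95% confidence
```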

    Finding the Degrees of Freedom

    How many degrees of freedom are there in a sample of size $n = 11$?

    Recall that $df = n - 1$:

    $df = 11 - 1 = 10$

    Finding the T-Score Multiplier 

    What is the $t$-score multiplier for a two-tailed test with a 98% confidence level, given $n = 16$?

    Begin by finding $\alpha$:

    $\alpha = 1 - 0.98 = 0.02$

    Then, identify the number of degrees of freedom, $n - 1$:

    $df = 16 - 1 = 15$

    Now we reference the $t$-score table.

    Since this is a two-tailed test, we need to look up $t_{\alpha/2} = t_{0.01}$, cross-referenced with $df = 15$:

    The $t$-score multiplier is 2.602.
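
    The lookup steps in this example can be sketched in code. The dictionary below holds only a few rows copied from a standard $t$-table, so it is an excerpt rather than a full reference, and the names `T_TABLE` and `t_multiplier` are hypothetical.

```python
# Excerpt of a t-table: {df: {tail area: t multiplier}}.
T_TABLE = {
    10: {0.05: 1.812, 0.025: 2.228, 0.01: 2.764},
    15: {0.05: 1.753, 0.025: 2.131, 0.01: 2.602},
    18: {0.05: 1.734, 0.025: 2.101, 0.01: 2.552},
}

def t_multiplier(n, confidence_level, tails=2):
    """Look up the t multiplier for a sample of size n."""
    df = n - 1
    # Round so that floating-point noise in 1 - CL does not break the lookup.
    tail = round((1.0 - confidence_level) / tails, 4)
    try:
        return T_TABLE[df][tail]
    except KeyError:
        raise KeyError(f"no table entry for df={df}, tail area={tail}") from None

print(t_multiplier(16, 0.98))  # two-tailed, 98% CL, n = 16
```

    Any combination missing from the excerpt raises a `KeyError` instead of guessing a value.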

    Finding the Right Confidence Interval

    You are given a sample of $n = 19$, where $s = 4.3$ and $\bar{x} = 26$. What confidence interval would you use to bracket $\mu$ with a confidence level of 95%?

    The first thing to note is that we do not have a large enough sample to use a $z$-test, since $n < 30$, so we will instead use a $t$-test.

    Recall the $t$-test formula for calculating a confidence interval:

    $\bar{x} \pm t_{\alpha/2}\left(\frac{s}{\sqrt{n}}\right)$

    We are given the values of $\bar{x}$, $s$, and $n$ in the question text. In order to use the $t$-test, we will need to know the $t$-score multiplier $t_{\alpha/2}$, so we will need the values of $\alpha$ and the degrees of freedom ($df$).

    The $\alpha$ value is a direct result of the confidence level, which is 95% in this case. With a 95% confidence level, $\alpha = 0.05$, or 5% of the total area under the curve.

    $\frac{\alpha}{2} = \frac{0.05}{2} = 0.025$

    Remember that the degrees of freedom ($df$) are 1 less than the number of data points.

    $df = 19 - 1 = 18$

    Now we can use the $t$-score table to reference $df = 18$ and $t_{0.025}$:

    $t_{\alpha/2} = 2.101$

    Putting everything together:

    $\bar{x} \pm t_{\alpha/2}\left(\frac{s}{\sqrt{n}}\right)$
    $26 \pm 2.101\left(\frac{4.3}{\sqrt{19}}\right)$
    $26 \pm 2.101\left(\frac{4.3}{4.359}\right)$
    $26 \pm 2.101(0.986)$
    $26 \pm 2.072$

    The confidence interval is 23.928 to 28.072, with a 95% confidence level.
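
    As a quick check of the arithmetic in this example, the same calculation can be run without rounding the intermediate steps. The variable names are illustrative; the inputs are the values given in the problem, with the multiplier 2.101 taken from the $t$-table lookup for $df = 18$.

```python
import math

x_bar, s, n = 26, 4.3, 19   # values given in the problem
t_mult = 2.101              # t_{0.025} for df = 18, from the t-table

standard_error = s / math.sqrt(n)
margin = t_mult * standard_error
lower, upper = x_bar - margin, x_bar + margin

print(f"standard error = {standard_error:.4f}")
print(f"margin of error = {margin:.4f}")
print(f"interval = {lower:.3f} to {upper:.3f}")
```

    Carrying full precision gives a margin of about 2.0726, so the endpoints can differ from the hand calculation by 0.001. The hand calculation rounds the standard error to 0.986 before multiplying.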

    Earlier Problem Revisited

    In previous lessons, we calculated confidence intervals for the mean of a population based on data from a large sample ($n \geq 30$). Does that mean that sample sizes of less than 30 are useless? If not, how do you calculate a confidence interval based on data from a smaller sample?

    Sample sizes less than 30 are certainly not useless. A confidence interval from a sample of size $n < 30$ can certainly be calculated, even if $\sigma$ is unknown. Rather than using a $z$-test, however, you use a $t$-test, referencing values from a Student’s $t$-distribution, and estimate the value of $\sigma$ with the sample standard deviation, denoted $\bar{\sigma}$ or $s$.
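
    Putting the whole procedure together, here is a minimal sketch that starts from raw data. The sample values are made up purely for illustration, and the multiplier is the standard $t$-table value of $t_{0.025}$ for $df = 10$ rather than something computed by the code.

```python
import math
import statistics

# Hypothetical sample of n = 11 measurements (made up for illustration).
sample = [12.1, 9.8, 11.4, 10.6, 13.0, 10.2, 11.9, 9.5, 12.4, 10.9, 11.1]

n = len(sample)
df = n - 1
x_bar = statistics.mean(sample)
s = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)

# t_{0.025} for df = 10 from a standard t-table (95% CL, two tails).
t_mult = 2.228

margin = t_mult * s / math.sqrt(n)
lower, upper = x_bar - margin, x_bar + margin

print(f"n={n}, df={df}, mean={x_bar:.3f}, s={s:.3f}")
print(f"95% CI: {lower:.3f} to {upper:.3f}")
```

    Note that `statistics.stdev` is the sample standard deviation $s$, while `statistics.pstdev` would treat the data as a whole population.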

    Examples

    Example 1

    If you are conducting a hypothesis test with $n = 27$ and $\sigma = 3$, is it permissible to conduct a $z$-test?

    Since you know the value of $\sigma$, a $z$-test is permissible.

    Example 2

    What is $\alpha$ for a two-tail $t$-test with a 90% confidence level?

    Any test with a 90% confidence level would have $\alpha = 0.1$. Since it is a two-tail test, we would look up $t_{0.05}$ on the reference chart, because $\frac{1}{2}$ of the alpha would be on each end of the curve.

    Example 3

    What is the 95% confidence interval for a one-tailed test with $n = 17$, $\bar{x} = 142.23$, and $\bar{\sigma} = 13$?

    The first thing to decide is what type of test to use. Since $n < 30$, and we do not know $\sigma$, it would not be appropriate to use a $z$-test, so we will use a $t$-test instead. To calculate the confidence interval for a $t$-test, use the formula: $\bar{x} \pm t_{\alpha/2}\left(\frac{s}{\sqrt{n}}\right)$.

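    • $\bar{x}$ is given: 142.23
    • $t_{\alpha/2} = t_{(1 - 0.9)/2} = t_{0.1/2} = t_{0.05}$ (a one-tailed test at 95% confidence puts the full $\alpha = 0.05$ in one tail, which is the same column as a two-tailed test at 90% confidence)
    • $s$ is given: 13
    • $n$ is given: 17, meaning that $df = 16$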

    If we reference $t_{0.05}$ with $df = 16$ on the table from the lesson above, we find: 1.746

    Now we can put it all together:

    Confidence Interval $= \bar{x} \pm t_{\alpha/2}\left(\frac{s}{\sqrt{n}}\right)$
    $142.23 \pm 1.746\left(\frac{13}{\sqrt{17}}\right)$
    $142.23 \pm 1.746\left(\frac{13}{4.123}\right)$
    $142.23 \pm 1.746(3.153)$
    Confidence Interval $= 142.23 \pm 5.505$

    Review 

    1. What is a $t$-test?

    2. What conditions indicate the use of a $t$-test, rather than a $z$-test?

    3. How does the shape of a Student’s $t$-distribution differ from a normal distribution?

    For questions 4-8, identify the $t$-score multiplier for the given confidence level:

    4. 99% CL, one-tail, $n = 17$

    5. 99% CL, two-tail, $df = 9$

    6. 95% CL, one-tail, $n = 22$

    7. 95% CL, two-tail, $df = 17$

    8. $\alpha = 0.1$, $n = 13$, one-tail

    Questions 9-15 refer to the following:

    The school director at Desiderata School wants to determine if the mean GPA for the entire student body for the current year is above 3.0, with a 95% confidence level. He collects the following sample GPAs, using an SRS: 2.97, 3.21, 3.10, 2.81, 3.35, 4.0, 2.51, 2.38, 3.85, 3.24, 3.81, 3.01, 2.85, 3.4, 2.94.

    9. What kind of test should he use?

    10. What are the null and alternative hypotheses?

    11. What is $s$?

    12. What is $\bar{x}$?

    13. How many degrees of freedom are there?

    14. What is the confidence interval?

    15. Should he reject, or fail to reject, the null hypothesis?
