11.6 Chi-Square Test of Independence
It is a common belief that age influences food preference. Suppose you wanted to test that hypothesis. How could you test observed data to see if the two variables (age and food preference) influence each other?
Look to the end of the lesson to see the answer.
Chi-Squared Test of Independence
A chi-square ($\chi^2$) test can be used to determine whether observed data indicate that two variables are dependent, in much the same way that the test can be used to determine goodness of fit.
Just as with a goodness of fit test, we will calculate expected values, calculate a chi-square statistic $\chi^2$, and compare it to the appropriate chi-square critical value from a reference to see if we should reject $H_0$, which is that the variables are not related.
In fact, the only major difference in process between a goodness of fit test and a test of independence is how we calculate the expected values, as you will see in the first example.
Just for reference:
The formula to calculate chi-square is:
$$\chi^2 = \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}}$$
Some good resources for a chi-square critical value calculator are Daniel Soper's website and Easy Calculation's website. You can also search for "free critical chi-square value calculator".
The formula for calculating expected values in a test of independence is:
$$\text{expected cell value} = \frac{C \times R}{n}$$
where $C$ is the observed column total for the cell, $R$ is the observed row total for the cell, and $n$ is the total number of samples. (See the Finding Expected Values example for clarification of the use of the formula.)
The degrees of freedom in a test of independence are calculated as:
$$df = (\text{rows} - 1)(\text{columns} - 1)$$
Finding Expected Values
Given the following contingency table, what is the expected value for each of the four cells in the body of the table?

         A     B     TOTAL
X        23    37    60
Y        19    41    60
TOTAL    42    78    120

To calculate the expected values, use the formula $\text{expected cell value} = \frac{C \times R}{n}$ for each cell:
Cell XA: $\frac{42 \times 60}{120} = 21$
Cell YA: $\frac{42 \times 60}{120} = 21$
Cell XB: $\frac{78 \times 60}{120} = 39$
Cell YB: $\frac{78 \times 60}{120} = 39$
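The four cell calculations above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original lesson; the function name `expected_counts` is a hypothetical helper chosen for this example.

```python
def expected_counts(observed):
    """Return expected counts under independence for a table of observed counts.

    `observed` is a list of rows; the name and layout are assumptions made
    for this sketch.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    # expected cell value = (column total x row total) / n
    return [[c * r / n for c in col_totals] for r in row_totals]


# Rows X and Y, columns A and B, from the contingency table above.
observed = [[23, 37],
            [19, 41]]
expected = expected_counts(observed)
print(expected)  # [[21.0, 39.0], [21.0, 39.0]]
```

Because the function works from row and column totals only, the same sketch applies to tables with more than two rows or columns.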
Calculating Chi-Square Statistics
Using the contingency table and data from the previous example, calculate $\chi^2$.
Start by adding the expected values you calculated in the previous example to the table. Use parentheses to set off the expected values:

         A          B          TOTAL
X        23 (21)    37 (39)    60
Y        19 (21)    41 (39)    60
TOTAL    42         78         120

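Now use the chi-square formula to calculate the statistic:
$$\chi^2 = \frac{(23 - 21)^2}{21} + \frac{(37 - 39)^2}{39} + \frac{(19 - 21)^2}{21} + \frac{(41 - 39)^2}{39}$$
$$\chi^2 = \frac{4}{21} + \frac{4}{39} + \frac{4}{21} + \frac{4}{39} \approx 0.586$$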
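As a rough check on the arithmetic, the statistic for this table (observed counts 23, 37, 19, 41 with expected values 21 and 39) can be computed in Python. The function name `chi_square_statistic` is an assumption made for this sketch, not something defined in the lesson.

```python
def chi_square_statistic(observed, expected):
    """Sum (observed - expected)^2 / expected over every cell of the table."""
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )


observed = [[23, 37], [19, 41]]   # rows X and Y, columns A and B
expected = [[21, 39], [21, 39]]   # expected values from the previous example
chi2 = chi_square_statistic(observed, expected)
print(round(chi2, 3))  # 0.586
```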
Real-World Application: Photography
Rachel claims that girls take more black-and-white and color photographs than boys, but Jack (who is a photographer) is skeptical. If Jack collects the following data, would it be correct to say that he should reject Rachel's claim that gender affects the tendency to take photographs?

          Black/White    Color    TOTAL
Female    72             489      561
Male      48             530      578
TOTAL     120            1019     1139

The question here is whether gender affects tendency to take more photographs, or, in other words, are gender and photograph-taking tendency dependent?
To run a chi-squared test, we need to know the expected value for each of the four cells containing observations. In a test for independence, this is calculated with the formula $\text{expected cell value} = \frac{C \times R}{n}$.
1. The upper-left cell, female × black/white:
$$\text{expected cell value} = \frac{(\text{column total}) \times (\text{row total})}{\text{total number of observations}} = \frac{120 \times 561}{1139} = 59.1$$
2. The cell below that, male × black/white:
$$\text{expected cell value} = \frac{120 \times 578}{1139} = 60.9$$
3. Top-right cell, female × color:
$$\text{expected cell value} = \frac{1019 \times 561}{1139} = 501.9$$
4. Bottom-right cell, male × color:
$$\text{expected cell value} = \frac{1019 \times 578}{1139} = 517.1$$
Now we can add the expected values to our initial table, placing the expected value for each cell in parentheses:

          Black/White    Color          TOTAL
Female    72 (59.1)      489 (501.9)    561
Male      48 (60.9)      530 (517.1)    578
TOTAL     120            1019           1139

Now we can calculate our $\chi^2$ statistic as before, using $\chi^2 = \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}}$ and each of the four values in the body of the table:
$$\chi^2 = \frac{(72 - 59.1)^2}{59.1} + \frac{(48 - 60.9)^2}{60.9} + \frac{(489 - 501.9)^2}{501.9} + \frac{(530 - 517.1)^2}{517.1}$$
$$\chi^2 = \frac{166.41}{59.1} + \frac{166.41}{60.9} + \frac{166.41}{501.9} + \frac{166.41}{517.1}$$
$$\chi^2 = 2.82 + 2.73 + 0.33 + 0.32 = 6.20$$
To see if our $\chi^2$ statistic is greater or less than the critical value at the default significance level of 0.05, we need the number of degrees of freedom:
$$df = (\text{rows} - 1)(\text{columns} - 1) = (2 - 1)(2 - 1) = 1$$
Using our chi-squared critical value reference, we find that the critical value for 0.05 with $df = 1$ is 3.8414.
Finally, we compare our calculated chi-squared value of 6.2 to the critical value of 3.8414 and determine that since $6.2 > 3.8414$, we can reject $H_0$; in other words, we reject the independence of the variables. The observed data indicate that picture-taking tendency depends on gender.
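The whole photography test can be retraced in one short Python sketch using only the standard library. The critical value 3.8414 is taken from the lesson. The p-value line is an addition that relies on a closed form valid only for one degree of freedom, so treat it as an illustration rather than a general method.

```python
import math

observed = [[72, 489],   # Female: black/white, color
            [48, 530]]   # Male:   black/white, color

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Accumulate (observed - expected)^2 / expected over all four cells.
chi2 = 0.0
for i, r in enumerate(row_totals):
    for j, c in enumerate(col_totals):
        e = c * r / n
        chi2 += (observed[i][j] - e) ** 2 / e

df = (len(row_totals) - 1) * (len(col_totals) - 1)
critical_value = 3.8414  # from the lesson: 0.05 significance, df = 1

# For df = 1 only, P(chi-square > x) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2))
reject_h0 = chi2 > critical_value

print(round(chi2, 2), df, reject_h0)
```

Working from unrounded expected values gives a statistic that agrees with the hand calculation of 6.20 to two decimal places.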
Earlier Problem Revisited
It is a common belief that age influences food preference. Suppose you wanted to test that hypothesis. How could you test observed data to see if the two variables (age and food preference) influence each other?
A $\chi^2$ test of independence could be used in this situation. Create a contingency table to organize observed data on food preference and age, calculate the $\chi^2$ value of the data, and compare it to the $\chi^2$ critical value with the appropriate number of degrees of freedom. If the calculated value is greater than the critical value, reject $H_0$: the data indicate that the variables are related.
Examples
Examples 1-5 refer to the following:
Kato claims that single people prefer different pizzas than married people do. Kato's brother doesn't think that is true, so he conducts some research of his own and collects the data below.

           Pepperoni    Sausage    Cheese    TOTAL
Single     29           12         61        102
Married    8            47         56        111
TOTAL      37           59         117       213

Example 1
Fill in the expected values of the 6 cells in the body of the table, using parentheses.
Completed table, using $\text{expected cell value} = \frac{C \times R}{n}$:

           Pepperoni     Sausage       Cheese        TOTAL
Single     29 (17.71)    12 (28.25)    61 (56.02)    102
Married    8 (19.28)     47 (30.74)    56 (60.97)    111
TOTAL      37            59            117           213

Example 2
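What is the value of $\chi^2$?
Using $\chi^2 = \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}}$:
$$\chi^2 = \frac{(29 - 17.71)^2}{17.71} + \frac{(12 - 28.25)^2}{28.25} + \frac{(61 - 56.02)^2}{56.02} + \frac{(8 - 19.28)^2}{19.28} + \frac{(47 - 30.74)^2}{30.74} + \frac{(56 - 60.97)^2}{60.97}$$
$$\chi^2 = 7.20 + 9.35 + 0.44 + 6.60 + 8.60 + 0.41 \approx 32.6$$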
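To see which cells drive the statistic for the pizza data, a short Python sketch can list each cell's contribution, (observed − expected)² / expected, before summing. The dictionary layout and variable names are assumptions made for this illustration.

```python
observed = {
    "Single":  {"Pepperoni": 29, "Sausage": 12, "Cheese": 61},
    "Married": {"Pepperoni": 8,  "Sausage": 47, "Cheese": 56},
}

row_totals = {status: sum(cells.values()) for status, cells in observed.items()}
col_totals = {}
for cells in observed.values():
    for pizza, count in cells.items():
        col_totals[pizza] = col_totals.get(pizza, 0) + count
n = sum(row_totals.values())

# Contribution of each cell: (observed - expected)^2 / expected.
contributions = {}
for status, cells in observed.items():
    for pizza, count in cells.items():
        e = col_totals[pizza] * row_totals[status] / n
        contributions[(status, pizza)] = (count - e) ** 2 / e

chi2 = sum(contributions.values())
for cell, value in contributions.items():
    print(cell, round(value, 2))
print("chi-square:", round(chi2, 1))
```

The sausage and pepperoni cells account for almost all of the total, while the cheese cells contribute less than 1 between them.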
Example 3
How many degrees of freedom are there?
$$df = (\text{rows} - 1)(\text{columns} - 1) = (2 - 1)(3 - 1) = (1)(2) = 2$$
Example 4
If we plan to test the claim, what are $H_0$ and $H_1$?
$H_0$: Pizza preference and marital status are independent (not related). $H_1$: Pizza preference and marital status are dependent (related), as Kato claims.
Example 5
Assuming a significance level of 0.05, do the observed data support Kato's claim?
The critical value for $\chi^2$ with $df = 2$ at 0.05 significance is 5.991. Since our calculated value is $\chi^2 = 32.6$ and $32.6 > 5.991$, we can reject the null hypothesis that pizza preference and marital status are independent. The observed data support Kato's claim.
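The critical value 5.991 can be reproduced without a lookup table in this particular case. With two degrees of freedom, the upper-tail probability of the chi-square distribution is exp(−x/2), so the critical value at significance level α is −2 ln(α). The sketch below uses that shortcut, which holds only for df = 2, together with the calculated statistic of 32.6.

```python
import math

alpha = 0.05
chi2 = 32.6  # calculated chi-square value for the pizza data

# df = 2 only: P(chi-square > x) = exp(-x / 2), so the critical value is -2 ln(alpha).
critical_value = -2 * math.log(alpha)
p_value = math.exp(-chi2 / 2)

print(round(critical_value, 3))  # 5.991
print(chi2 > critical_value)     # True
```

For other degrees of freedom, a critical value calculator or a reference table is still needed.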
Review
1. What use of the $\chi^2$ statistic is covered in this lesson?
2. How is an expected value calculated, in the context of this lesson?
3. How are degrees of freedom calculated in a chi-square test of independence?
4. What type of table is commonly used to organize information for a chi-square test of independence?
5. What is the default level of significance for a chi-squared test of independence?
Questions 6-10 refer to the following breakdown of favorite flavor by gender:

          Cherry    Lemon    Strawberry    Other    TOTAL
Male      13        11       7             13       44
Female    15        18       11            5        49
TOTAL     28        29       18            18       93

6. Fill in the expected values of the 8 cells in the body of the table, using parentheses.
7. What is the value of $\chi^2$?
8. How many degrees of freedom are there?
9. If we plan to test the claim that gender affects favorite flavor, what are $H_0$ and $H_1$?
10. Assuming a significance level of 0.05, does the observed data indicate that we reject or fail to reject $H_0$?
Questions 11-15 refer to the following:
Are hamburger cooking preferences dependent on gender? 1087 people were asked their preference among three ways of cooking burgers: grilling, frying, and broiling. The men stated their preferences as 137 grilling, 193 frying, and 212 broiling. The women were distributed as 110 grilling, 215 frying, and 220 broiling.
11. Create a contingency table to organize the information.
12. What is the value of $\chi^2$?
13. How many degrees of freedom are there?
14. If we plan to test the claim that gender affects cooking preference, what are $H_0$ and $H_1$?
15. Assuming a significance level of 0.05, does the observed data indicate that we reject or fail to reject $H_0$?