11.4 Contingency Tables
Suppose you wanted to evaluate how gender affects the type of movie chosen by movie-goers. How might you organize data on Male and Female watchers, and on Action, Romance, Comedy, and Horror movie types, so that it would be easy to compare different combinations?
See the end of the lesson, where this question is reviewed.
Contingency Tables
Contingency tables are used to evaluate the interaction of statistics from two different categorical variables. They are often used to organize data from different random variables in preparation for a contingency test (which we will be discussing further in the next lesson).
Contingency tables are sometimes called two-way tables because they are organized with the outputs of one variable across the top, and another down the side. Consider the table below:
| | Male | Female |
| --- | --- | --- |
| Chocolate Candy | 42 | 77 |
| Fruit Candy | 58 | 23 |
This is a contingency table comparing the variable ‘Gender’ with the variable ‘Candy Preference’. Across the top of the table are the two gender options for this particular study: ‘male students’ and ‘female students’. Down the left side are the two candy preference options: ‘chocolate’ and ‘fruit’. The data in the center of the table indicates the reported candy preferences of the 200 students polled during the study (100 male and 100 female).
Commonly, there will be one additional row and column for totals, like this:
| | Male | Female | TOTAL |
| --- | --- | --- | --- |
| Chocolate Candy | 42 | 77 | 119 |
| Fruit Candy | 58 | 23 | 81 |
| TOTAL | 100 | 100 | 200 |
Notice that you can run a quick check on the calculation of totals, since the “total of totals” should be the same from either direction: $119 + 81 = 200 = 100 + 100$.
The benefits of a contingency table will become more apparent the more you use it. As you begin to evaluate different bits of information, each combination of variable outputs is easily noted.
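The totals and the quick check described above can be sketched in a few lines of Python. This is only an illustration: the nested-dictionary layout and the helper name `totals_agree` are assumptions made for this example, not part of the lesson.

```python
# Cell counts from the candy-preference table: rows are candy types,
# columns are genders.
counts = {
    "Chocolate Candy": {"Male": 42, "Female": 77},
    "Fruit Candy": {"Male": 58, "Female": 23},
}

# Row totals: add across each row.
row_totals = {row: sum(cells.values()) for row, cells in counts.items()}

# Column totals: add down each column.
column_totals = {}
for cells in counts.values():
    for column, value in cells.items():
        column_totals[column] = column_totals.get(column, 0) + value


def totals_agree(row_totals, column_totals):
    """Return True when the grand total is the same from either direction."""
    return sum(row_totals.values()) == sum(column_totals.values())


grand_total = sum(row_totals.values())
print(row_totals)
print(column_totals)
print(grand_total, totals_agree(row_totals, column_totals))
```

The check is most useful when totals have been added by hand: a slip in any single row or column total makes the two grand totals disagree.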
Constructing a Contingency Table
Construct a contingency table to display the following data: “250 mall shoppers were asked if they intended to eat at the in-mall food court or go elsewhere for lunch. Of the 117 male shoppers, 68 intended to stay, compared to only 62 of the 133 female shoppers.”
First, let’s identify our variables and set up the table with the appropriate row and column headers.
The variables are gender and lunch location choice:
| | Male | Female | TOTAL |
| --- | --- | --- | --- |
| Food Court | | | |
| Out of Mall | | | |
| TOTAL | | | |
Now we can fill in the values we have directly from the text:
| | Male | Female | TOTAL |
| --- | --- | --- | --- |
| Food Court | 68 | 62 | |
| Out of Mall | | | |
| TOTAL | 117 | 133 | 250 |
Now we can fill in the missing data with simple addition/subtraction:
| | Male | Female | TOTAL |
| --- | --- | --- | --- |
| Food Court | 68 | 62 | 130 |
| Out of Mall | 49 | 71 | 120 |
| TOTAL | 117 | 133 | 250 |
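The fill-in step can be written out explicitly. The sketch below starts from the five numbers stated in the problem and derives the remaining cells by addition and subtraction; the variable names are chosen for this illustration only.

```python
# Values stated directly in the problem text.
total_shoppers = 250
male_total, female_total = 117, 133
male_food_court, female_food_court = 68, 62

# Each missing cell follows from a known total by subtraction or addition.
male_out = male_total - male_food_court          # 117 - 68
female_out = female_total - female_food_court    # 133 - 62
food_court_total = male_food_court + female_food_court
out_total = male_out + female_out

table = {
    "Food Court": {"Male": male_food_court, "Female": female_food_court,
                   "TOTAL": food_court_total},
    "Out of Mall": {"Male": male_out, "Female": female_out,
                    "TOTAL": out_total},
    "TOTAL": {"Male": male_total, "Female": female_total,
              "TOTAL": total_shoppers},
}

# Quick check: the row totals must add up to the number of shoppers polled.
assert food_court_total + out_total == total_shoppers

for label, row in table.items():
    print(label, row)
```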
Answering Questions
Referencing data from the previous example, answer the following:
a. What percentage of food-court eaters are female?
If we read across the row “Food Court”, we see that there were a total of 130 shoppers eating “in”, and that 62 of them were female. To calculate the percentage, we simply divide: $\frac{62}{130} \approx 0.477$, or 47.7%.
b. What is the distribution of male lunch-eaters?
The male shoppers were distributed as 68 food court and 49 out of mall.
c. What is the marginal distribution of the variable "lunch location preference"?
The marginal distribution is the distribution of data “in the margin”, or in the TOTAL column. In this case, we are interested in the data on lunch location preference, which is found in the far right column: 130 food court and 120 out of mall.
d. What is the marginal distribution of the variable "Gender"?
The marginal distribution of gender can be found in the bottom row: 117 males and 133 females.
e. What percentage of females prefer to eat out?
Here we are interested in data from the females, so we will be dealing with the ‘female’ column. From the data in the column, we see that 71 of the 133 females preferred to eat out. This is a percentage of $\frac{71}{133} \approx 0.534$, or 53.4%.
Identifying Marginal Distributions and Making Observations
“Out of 213 polled amateur drag racers, 37 drove cars with turbo-chargers, 59 had superchargers, and the rest were normally aspirated. The racers themselves were split between 102 rookies and 111 veterans. The rookies evidently preferred turbos, since 29 of them had turbo-charged vehicles, and avoided superchargers, since there were only 12 of them.”
a. Construct a contingency table:
Set up the table with the appropriate headers, and fill in the data we know. Note that this time we will need a $3 \times 2$ table instead of a $2 \times 2$ (it is still a two-way table though, as there are only two variables: engine aspiration and driver experience):
| | Turbocharger | Supercharger | Normal Aspiration | TOTAL |
| --- | --- | --- | --- | --- |
| Rookie | 29 | 12 | | 102 |
| Veteran | | | | 111 |
| TOTAL | 37 | 59 | 117 | 213 |
Now we can update the table with the missing data, calculated using addition or subtraction:
| | Turbocharger | Supercharger | Normal Aspiration | TOTAL |
| --- | --- | --- | --- | --- |
| Rookie | 29 | 12 | 61 | 102 |
| Veteran | 8 | 47 | 56 | 111 |
| TOTAL | 37 | 59 | 117 | 213 |
b. Identify the marginal distributions
The marginal data refers to the overall data for each of the two variables:
- Aspiration type is distributed as follows: 37 Turbos, 59 Superchargers, and 117 normally aspirated.
- Driver experience distribution: 102 Rookies and 111 Veterans.
c. Identify 3 different percentage-based observations
Three percentage-based observations:
- $\frac{61}{102} \approx 0.598$, or 59.8% of Rookies drive normally aspirated cars.
- $\frac{47}{59} \approx 0.7966$, or 79.66% of the Superchargers were in cars driven by Veterans.
- $\frac{47}{111} \approx 0.4234$, or 42.34% of Veterans use Superchargers.
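The second and third observations use the same cell (47) but different denominators, which is an easy place to slip. As a rough sketch, the Python below computes every cell both as a percentage of its row and as a percentage of its column; the function names `row_percentages` and `column_percentages` are assumptions made for this example.

```python
# Cell counts from the completed drag-racer table (totals left out).
counts = {
    "Rookie": {"Turbocharger": 29, "Supercharger": 12, "Normal Aspiration": 61},
    "Veteran": {"Turbocharger": 8, "Supercharger": 47, "Normal Aspiration": 56},
}


def row_percentages(table):
    """Each cell as a percentage of its row total (e.g. 'of Veterans, ...')."""
    result = {}
    for row, cells in table.items():
        total = sum(cells.values())
        result[row] = {col: round(100 * value / total, 2)
                       for col, value in cells.items()}
    return result


def column_percentages(table):
    """Each cell as a percentage of its column total (e.g. 'of Superchargers, ...')."""
    column_totals = {}
    for cells in table.values():
        for col, value in cells.items():
            column_totals[col] = column_totals.get(col, 0) + value
    return {row: {col: round(100 * value / column_totals[col], 2)
                  for col, value in cells.items()}
            for row, cells in table.items()}


by_row = row_percentages(counts)
by_column = column_percentages(counts)

print(by_row["Rookie"]["Normal Aspiration"])   # Rookies with normally aspirated cars
print(by_column["Veteran"]["Supercharger"])    # Superchargers driven by Veterans
print(by_row["Veteran"]["Supercharger"])       # Veterans who use Superchargers
```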
Earlier Problem Revisited
Suppose you wanted to evaluate how gender affects the type of movie chosen by movie-goers. How might you organize data on Male and Female watchers, and on Action, Romance, Comedy, and Horror movie types, so that it would be easy to compare different combinations?
A contingency table would be excellent for this purpose. By listing gender categories in one direction and movie type in the other, it would be a simple matter to evaluate different combinations of variables.
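As a minimal sketch of how such a table might be tallied from raw survey answers, the Python below counts (gender, movie type) pairs with `collections.Counter`. The eight responses are invented purely for illustration; they are not data from the lesson.

```python
from collections import Counter

# Hypothetical survey responses as (gender, movie type) pairs.
# These values are made up for illustration only.
responses = [
    ("Male", "Action"), ("Female", "Romance"), ("Female", "Comedy"),
    ("Male", "Horror"), ("Male", "Action"), ("Female", "Action"),
    ("Female", "Romance"), ("Male", "Comedy"),
]

genders = ["Male", "Female"]
movie_types = ["Action", "Romance", "Comedy", "Horror"]

# Count how many times each (gender, movie type) combination occurs.
pair_counts = Counter(responses)

# Arrange the counts as a two-way table: movie types down the side,
# genders across the top. A Counter returns 0 for combinations never seen.
table = {movie: {gender: pair_counts[(gender, movie)] for gender in genders}
         for movie in movie_types}

for movie in movie_types:
    print(movie, table[movie])
```

Once the counts are in this layout, any combination, such as female Romance watchers, can be read from a single cell.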
Examples
Example 1
Complete the data in the contingency table:
| | A | B | TOTAL |
| --- | --- | --- | --- |
| X | 47 | | |
| Y | | 32 | 100 |
| TOTAL | 100 | | 200 |
Example 2
What is the marginal distribution of the variable consisting of categories A and B?
The variable consisting of categories A and B is distributed as A: 100 and B: 100.
Example 3
What percentage of B’s are Y’s?
There are 32 B’s that are also Y’s, out of the total 100 B’s: $\frac{32}{100} = 32\%$.
Example 4
What portion of A’s are X’s? Express your answer as a decimal.
47 of the 100 A’s are X’s, so $\frac{47}{100} = 0.47$.
Review
Questions 1-9 refer to the following table:
| | Sports Cars | Pickup Trucks | Luxury Cars | TOTAL |
| --- | --- | --- | --- | --- |
| Male Drivers | 72 | 67 | 36 | 175 |
| Female Drivers | 36 | 71 | 68 | 175 |
| TOTAL | 108 | 138 | 104 | 350 |
1. What is the marginal distribution of vehicle types?
2. What is the marginal distribution of driver gender?
3. What decimal portion of male drivers have luxury cars?
4. What percentage of female drivers have pickups?
5. How many drivers were polled?
6. What is the overall most popular vehicle type, by percentage?
7. Which vehicle type has the single largest cell value, and what percentage does it represent of that gender category?
8. What percentage of pickup trucks are driven by females?
9. What percentage of females drive pickup trucks?
Questions 10-18 refer to the following data:
“One hundred eighty dogs were studied to determine if breed affected food preference. Of the 70 Huskies, 30 preferred beef flavor and 40 preferred chicken. Of the 50 Poodles, 27 preferred beef, the rest chicken. The rest of the dogs, English Mastiffs, were obviously beef-lovers, as only 19 preferred chicken over beef.”
10. Create a contingency table to display the data.
11. What is the marginal distribution of dog breeds?
12. What is the marginal distribution of food types?
13. What percentage of Mastiffs preferred beef?
14. What percentage of beef-lovers were Mastiffs?
15. What flavor/dog combination indicated the strongest preference? What percentage of the breed did it represent?
16. What is the distribution of chicken preference?
17. What is the distribution of beef preference?
18. Which breed shows the least defined preference, as a percentage?