8.4 Determining if Two Distributions are Different - Interactive
Chemical Reactions
Scientists often need to test whether datasets are different in order to determine whether changing a certain factor can influence the outcome of an event. One example is studying the factors that affect the rate of a chemical reaction. A scientist wants to determine which factors affect the reaction time of hydrochloric acid and magnesium.
Use the interactive below to determine this.
Trustworthy Data
In this lesson, you will consider an important dilemma faced by many statisticians and data analysts: determining whether two data sets are different. Consider an example from another lesson in this book, where Mrs. Panthaki and Mr. Shields compared the performances of their classes on an exam. You saw that Mrs. Panthaki's class performed better on the exam by nearly every metric.
What if you added one student to Mr. Shields' class? Can the performance of one student change your perception of which class performed better?
In a different lesson, you may have looked at exam results for Mr. Shields' and Mrs. Panthaki's classes. Compared to Mr. Shields' class, Mrs. Panthaki's class performed better on every metric. Here is a table of the results:

          Mr. Shields' class    Mrs. Panthaki's class
Mean      79                    80
Median    80                    81
MAD       10.6                  6.35

What happens if there is just one change?
In the interactive above, move the red dot to set one of the scores on Mr. Shields' exam to 100.
What do you notice about the new results?
Select all that apply.
a. Mr. Shields' class median is now higher than Mrs. Panthaki's.
b. Mr. Shields' class mean is still lower than Mrs. Panthaki's.
c. Mr. Shields' class median is now the same as Mrs. Panthaki's.
d. Mr. Shields' class mean is now higher than Mrs. Panthaki's.
Discussion Questions
- Set the grade of the new student in Mr. Shields' class to 100. How does this change your perception of which class performed better?
- How does this affect your trust in the data?
- Is it possible that people could use data results like this to mislead someone purposely?
- What are some things you can do to ensure that you are not tricked by untrustworthy data?
How to Determine if Two Distributions are Different
Before you compare the results in two datasets similar to the ones above, you must assess their similarity. If the values in one data set are close to the values in another, you say that the data sets are similar. If the values in one data set are not close to the values in another, you say that the data sets are different. To determine whether two datasets are different when you cannot immediately tell, you can use the formula:
$$\frac{\text{mean of dataset 1} - \text{mean of dataset 2}}{\text{greater MAD}}$$
If the value produced is greater than or equal to 2, the two groups are considered different. If the value produced is less than 2, the two groups are considered similar.
Using the datasets below, explore this idea further.
Dataset A: 17, 18, 20, 21
Dataset B: 17, 20, 16, 19
Dataset C: 15, 10, 13, 10
Example
Are datasets A and B similar or different?
Dataset A: 17, 18, 20, 21
- Mean = 19
- MAD = 1.5
Dataset B: 17, 20, 16, 19
- Mean = 18
- MAD = 1.5
$$\frac{\text{mean of dataset 1} - \text{mean of dataset 2}}{\text{greater MAD}} = \frac{19 - 18}{1.5} = \frac{1}{1.5} = 0.\overline{6}$$
The difference between the means is 19 - 18 = 1. The MADs are equal, so you can use either one. When you divide the difference in the means by the MAD, you get $\frac{1}{1.5} = 0.\overline{6}$. Since $0.\overline{6}$ is less than 2, the datasets are similar.
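The means and MADs in this example are listed without their arithmetic. Assuming MAD stands for mean absolute deviation (the average distance of each value from the mean), an assumption that reproduces the listed values, the numbers for Dataset A and Dataset B can be worked out as follows:

```latex
\begin{align*}
\text{Mean}_A &= \frac{17 + 18 + 20 + 21}{4} = \frac{76}{4} = 19 \\
\text{MAD}_A &= \frac{|17 - 19| + |18 - 19| + |20 - 19| + |21 - 19|}{4} = \frac{2 + 1 + 1 + 2}{4} = 1.5 \\
\text{Mean}_B &= \frac{17 + 20 + 16 + 19}{4} = \frac{72}{4} = 18 \\
\text{MAD}_B &= \frac{|17 - 18| + |20 - 18| + |16 - 18| + |19 - 18|}{4} = \frac{1 + 2 + 2 + 1}{4} = 1.5
\end{align*}
```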
Example
Are datasets A and C similar or different?
Dataset A: 17, 18, 20, 21
- Mean = 19
- MAD = 1.5
Dataset C: 15, 10, 13, 10
- Mean = 12
- MAD = 2
$$\frac{\text{mean of dataset 1} - \text{mean of dataset 2}}{\text{greater MAD}} = \frac{19 - 12}{2} = \frac{7}{2} = 3.5$$
The difference between the means is 19 - 12 = 7. The greater MAD is 2. When you divide the difference in the means by the greater MAD, you get $\frac{7}{2} = 3.5$. Since 3.5 is greater than 2, the datasets are different.
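The two worked examples can also be checked with a short script. This is only a sketch: the function names (`mean_absolute_deviation`, `separation_ratio`, `compare`) are hypothetical, MAD is assumed to mean the mean absolute deviation, and the absolute value of the difference in means is an added assumption so that the order of the two datasets does not matter.

```python
from statistics import mean


def mean_absolute_deviation(values):
    """Average distance of each value from the mean (assumed meaning of MAD)."""
    center = mean(values)
    return mean(abs(v - center) for v in values)


def separation_ratio(first, second):
    """Difference in means divided by the greater of the two MADs.

    The absolute value is an assumption added here so the result does not
    depend on which dataset is listed first. If both MADs are 0, the
    division fails; that case is not covered by the lesson.
    """
    greater_mad = max(mean_absolute_deviation(first),
                      mean_absolute_deviation(second))
    return abs(mean(first) - mean(second)) / greater_mad


def compare(first, second, threshold=2):
    """Label two datasets using the lesson's cutoff of 2."""
    ratio = separation_ratio(first, second)
    return "different" if ratio >= threshold else "similar"


# Datasets from the lesson
dataset_a = [17, 18, 20, 21]
dataset_b = [17, 20, 16, 19]
dataset_c = [15, 10, 13, 10]

print("A vs B:", compare(dataset_a, dataset_b))
print("A vs C:", compare(dataset_a, dataset_c))
```

Keeping the cutoff as the `threshold` parameter makes it easy to see how the label would change under a stricter or looser rule than the value of 2 used in this lesson.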
Use the interactive below to determine if the datasets are similar or different.
Summary
- If two data sets are similar, the values in one data set are close to the values in another.
- If two data sets are different, the values in one data set are not close to the values in another.
- Use the formula: $\frac{\text{mean of dataset 1} - \text{mean of dataset 2}}{\text{greater MAD}}$
- If the value produced is greater than or equal to 2, the two groups are considered different.
- If the value produced is less than 2, the two groups are considered similar.