Chapter 22 Solutions 22.1. (a) Diagram below. (b) The null hypothesis is “all groups have the same mean rest period,”
and the alternative is “at least one group has a different mean rest period.” The P-value shows significant evidence against H0, and the graph leads us to conclude that caffeine has the effect of reducing the length of the rest period.
Note: Students might be tempted to think that the alternative hypothesis is “mean rest period decreases with increasing caffeine dosage” or something similar, but ANOVA alternatives are nondirectional.
22.2. (a) The null hypothesis is “all age groups have the same (population) mean road-rage
measurement,” and the alternative is “at least one group has a different mean.” (b) The F test is quite significant, giving strong evidence that the means are different. The sample means suggest that the degree of road rage decreases with age. (We assume that higher numbers indicate more road rage.) 22.3. (a) The stemplots (right) appear to suggest that
logging reduces the number of trees per plot and
that recovery is slow (the 1-year-after and 8-years-
after stemplots are similar). (b) The means lead one
to the same conclusion as in (a). (c) In testing H0: µ1 = µ2 = µ3 vs. Ha: not all means are the same, we find that F = 11.43 with df 2 and 30, which
has P = 0.000205, so we conclude that these differences are significant: The number of trees per
22.4. (a) The stemplots (right) show no extreme
outliers or skewness. (b) The means suggest that a
dog reduces heart rate, but being with a friend
appears to raise it. (c) F = 14.08 and P = 0.000
(meaning P < 0.0005), which means we reject H0: µP = µF = µC in favor of Ha: at least one mean is
different. Based on the confidence intervals, it
appears that the mean heart rate is lowest when a
pet is present (although this interval overlaps the
control interval) and is highest when a friend is
present (although again, this interval overlaps the control interval).
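When the numerator has 2 degrees of freedom, a P-value such as the one quoted in 22.3 can be checked without a table, because the upper tail of an F(2, d) distribution has the closed form P(F > f) = (1 + 2f/d)^(−d/2). The sketch below is only an illustration of that identity; the function name `f_tail_df2` is ours and does not come from the text or from Minitab.

```python
def f_tail_df2(f, d):
    """Upper-tail probability P(F > f) for an F distribution with 2 and d df.

    Valid only when the numerator degrees of freedom equal 2.
    """
    return (1 + 2 * f / d) ** (-d / 2)


# Exercise 22.3: F = 11.43 with df 2 and 30
p_223 = f_tail_df2(11.43, 30)
print(f"P = {p_223:.6f}")
```

The result is about 0.0002, which agrees with the reported P = 0.000205 up to the rounding of F.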
22.5. (a) I, the number of populations, is 3; the sample sizes from each population are n1 = n2 = 12
and n3 = 9; the total sample size is N = 33. (b) Numerator (“Group”): I – 1 = 2, denominator (“Error”): N – I = 30. (c) Because F > 9.22, the largest critical value for an F(2, 25) distribution in Table D, we conclude that P < 0.001.
22.6. (a) I, the number of populations, is 3; the sample sizes from each population are n1 = n2 = n3 = 15; the total sample size is N = 45. (b) Numerator (“Group”): I – 1 = 2, denominator (“Error”): N – I = 42. (c) Because F > 9.22, the largest critical value for an F(2, 25) distribution in Table D, we conclude that P < 0.001.
22.7. (a) We have I = 4 populations; individual sample sizes n1 = 146, n2 = 125, n3 = 104, and n4 = 84; and total sample size N = 459. The degrees of freedom are therefore 3 and 455. (b) Because 3.18 < F < 3.88, 0.010 < P < 0.025. (c) Because F < 2.11, P > 0.100. (d) Because 5.63 < F, P < 0.001.
22.8. (a) Yes: largest s (b) Yes: largest s
22.9. The standard deviations (0.1201, 0.1472, 0.1134)
do not violate our rule of thumb. However, the
distributions appear to be skewed and have outliers,
Chapter 22 One-Way Analysis of Variance: Comparing Several Means
22.10. (a) The biggest difference is that single men earn considerably less than men who have been
or are married. Widowed and married men earn the most; divorced men earn about $1300 less (on the average), and single men are $4000 below that. (b) Yes: the largest-to-smallest standard deviation ratio is 1.42 (the largest is 8119). (c) The degrees of
freedom are 3 and 8231. (d) The sample sizes are so large that even small differences would be found to be significant; these differences are fairly large. (e) No: Single men are likely to be younger than men in the other categories. This means that typically they have less experience, and have been with their companies less time than the others, and so have not received as many raises, etc. (Age is the lurking variable.) 22.11. (a) The sample sizes are quite large, and the F test is robust against non-Normality with large
samples. (b) Yes (barely): The ratio is 3.11/1.60 = 1.94. (c) We have I = 3 and N = 1342. The details of
the computations are given here; the Minitab output below confirms the computed values.
SSG = 244(2.22 − 1.31)² + 734(1.33 − 1.31)² + 364(0.66 − 1.31)²
SSE = 243 × 3.11² + 733 × 2.21² + 363 × 1.60²
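The sums of squares in 22.11 need only the group sizes, means, and standard deviations, so the whole ANOVA table can be rebuilt from summary statistics. The sketch below assumes the usual definitions (SSG = Σ nᵢ(x̄ᵢ − x̄)², SSE = Σ (nᵢ − 1)sᵢ²); the function name `anova_from_summary` and the dictionary keys are ours, chosen for illustration.

```python
def anova_from_summary(ns, means, sds):
    """One-way ANOVA quantities from per-group n, mean, and standard deviation."""
    big_n = sum(ns)            # total sample size N
    groups = len(ns)           # number of populations I
    grand = sum(n * m for n, m in zip(ns, means)) / big_n
    ssg = sum(n * (m - grand) ** 2 for n, m in zip(ns, means))
    sse = sum((n - 1) * s ** 2 for n, s in zip(ns, sds))
    msg = ssg / (groups - 1)
    mse = sse / (big_n - groups)
    return {
        "grand_mean": grand,
        "df": (groups - 1, big_n - groups),
        "SSG": ssg, "SSE": sse, "MSG": msg, "MSE": mse,
        "F": msg / mse,
    }


# Exercise 22.11: sample sizes, means, and standard deviations from the solution
result = anova_from_summary([244, 734, 364], [2.22, 1.33, 0.66], [3.11, 2.21, 1.60])
for key, value in result.items():
    print(key, value)
```

The function uses the unrounded grand mean, so its sums of squares can differ slightly from hand computations that round the mean to 1.31.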
(d) With df 2 and 1339, we find that P < 0.001; this is strong evidence that the means differ.
Minitab output
22.12. (a) Yes: The rule-of-thumb ratio is 5.2/4.2 = 1.24. (b) x̄ = 9.92, and MSG = 10.0 (details
below). (c) MSE = 21.9 (details below). (d) F = 0.46 (details below). With df 2 and 112 (use 2 and 100 in the table), we find P > 0.100, so we have no reason to doubt the null hypothesis; that is, there is not enough evidence to conclude that mean weight loss differs between these exercise programs.
In the details of the computations, we have I = 3 and N = 115:
SSG = 37(10.2 − 9.92)² + 36(9.3 − 9.92)² + 42(10.2 − 9.92)²
22.13. (a) Multiply by √n to find the standard deviations: s_cold = 8.08√16 = 32.32, s_neutral = 5.61√38 = 34.58, and s_hot = 4.10√75 = 35.51. This easily satisfies our rule of thumb: 35.51/32.32 = 1.10. (b) We have I = 3 and N = 129, so
SSG = 16(28.89 − 32.045)² + 38(32.93 − 32.045)² + 75(32.27 − 32.045)²
With df 2 and 126, we find P > 0.100, so we have no reason to doubt the null hypothesis; that is, there is no evidence that nest temperature affects mean weight.
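Part (a) of 22.13 relies on the relation SEM = s/√n, so each standard deviation is the reported standard error times √n. A minimal sketch of that conversion and of the largest-to-smallest rule-of-thumb check follows; the helper name `sd_from_sem` and the dictionary layout are ours.

```python
import math


def sd_from_sem(sem, n):
    """Recover a sample standard deviation from a standard error of the mean."""
    return sem * math.sqrt(n)


# Exercise 22.13: (sample size, standard error) for each nest temperature
groups = {"cold": (16, 8.08), "neutral": (38, 5.61), "hot": (75, 4.10)}
sds = {name: sd_from_sem(sem, n) for name, (n, sem) in groups.items()}

# Rule of thumb: the largest standard deviation should be at most twice the smallest
ratio = max(sds.values()) / min(sds.values())
print({name: round(s, 2) for name, s in sds.items()}, round(ratio, 2))
```

Because the sample sizes differ, the ratio has to be taken after the conversion; comparing the standard errors directly would be misleading here.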
22.14. (a) Multiply by √n to find the standard deviations: s_cold = 5.67√16 = 22.68, s_neutral = 4.24√38 = 26.14, and s_hot = 2.70√75 = 23.38. This easily satisfies our rule of thumb: 26.14/22.68 = 1.15. (b) We have I = 3 and N = 129, so
SSG = 16(6.40 − 5.008)² + 38(5.82 − 5.008)² + 75(4.30 − 5.008)²
With df 2 and 126, we find P > 0.100, so we have no reason to doubt the null hypothesis; that is, there is no evidence that nest temperature affects propensity to strike.
22.15. Populations: nonsmokers, moderate smokers, and heavy smokers; response variable: hours of
sleep per night. I = 3, n1 =n2 = n3 = 200, and N = 600, so there are I – 1 = 2 and N – I = 597 degrees of freedom.
22.16. Populations: consumers (responding to different package designs); response variable:
attractiveness rating. I = 6, n1 = n2 = n3 = n4 = n5 = n6 = 120, and N = 720, so there are I – 1 = 5 and N – I = 714 degrees of freedom.
22.17. Populations: tomato varieties; response variable: yield. I = 4, n1 = n2 = n3 = n4 = 10, and N =
40, so there are I – 1 = 3 and N – I = 36 degrees of freedom.
22.18. Populations: different concrete mixtures; response variable: strength. I = 5, n1 = · · · =n5 = 6,
and N = 30, so there are I – 1 = 4 and N – I = 25 degrees of freedom.
22.19. Populations: students taught by different methods; response variable: test scores. I = 4, n1 = n2 = n3 = 10, n4 = 12, and N = 42, so there are I – 1 = 3 and N – I = 38 degrees of freedom.
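Exercises 22.15 through 22.19 all apply the same rule: with I groups and total sample size N, the F statistic has I – 1 and N – I degrees of freedom. The short sketch below restates that rule; the function name `anova_df` is ours.

```python
def anova_df(sample_sizes):
    """Return (numerator df, denominator df) for one-way ANOVA."""
    groups = len(sample_sizes)   # I, the number of populations
    total = sum(sample_sizes)    # N, the total sample size
    return groups - 1, total - groups


print(anova_df([200, 200, 200]))      # 22.15
print(anova_df([120] * 6))            # 22.16
print(anova_df([10, 10, 10, 12]))     # 22.19
```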
22.20. (a) We test H0: µ1 = µ2 = µ3 = µ4 = µ5 vs. Ha: not all means are the same. (b) N = 168, I = 5, n1 = 39, n2 = 35, n3 = 29, n4 = 30, and n5 = 35. (c) We have 4 and 163 degrees of freedom.
22.21. (a) The graph does suggest that emissions rise
when a plant is attacked, because the mean control emission rate is half the smallest of the other rates. (b) The null hypothesis is “all groups have the same mean emission rate.” The alternative is “at least one group has a different mean emission rate.” (c) The most important piece of additional information would be whether the data are sufficiently close to Normally distributed. (From the description, it seems reasonably safe to assume that these are more-or-less
random samples.) (d) The SEM equals s/√8, so we can find the standard deviations by
multiplying by √8; however, this factor of √8 would cancel out in the process of finding the ratio of the largest and smallest standard deviations, so we can simply find this ratio directly from the SEMs: 8.75
22.22. (a) The means are given in the table (below) and are shown in the plot (as the symbol “+”).
(The plot also shows the original data as solid circles.) The means vary over time but do not consistently decrease. (b) The standard deviations are also given in the table; the largest-to-smallest ratio is 3.49 (the largest standard deviation is 16.0873). Because this is much more than 2, ANOVA should not be used. (c) The two-sample t test does not require that the standard deviations be equal, but the ANOVA assumes that they are and is not reliable if there is evidence that they are different. These data suggest that variability increases over time.
22.23. Only Design A would allow use of one-way ANOVA because it produces four independent
sets of numbers. The data resulting from Design B would be dependent (a subject’s responses to the first list would be related to that same subject’s responses to the other lists), so that ANOVA would not be appropriate for comparison.
22.24. The ANOVA test statistic is F = 4.92 (df 3 and 92), which has P = 0.003, so there is strong
evidence that the means are not all the same. In particular, list 1 seems to be the easiest, and lists 3 and 4 are the most difficult.
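For numerator degrees of freedom other than 2 there is no equally simple closed form, but the P-value in 22.24 can still be approximated by integrating the F density numerically. The sketch below uses Simpson's rule with the standard library only; it is an illustration under our own assumptions (function names, step count, numerator df of at least 2) and not the method used by Minitab.

```python
import math


def f_pdf(x, d1, d2):
    """Density of the F distribution with d1 and d2 df (d1 >= 2 assumed)."""
    if x == 0:
        return 1.0 if d1 == 2 else 0.0
    log_beta = math.lgamma(d1 / 2) + math.lgamma(d2 / 2) - math.lgamma((d1 + d2) / 2)
    log_pdf = ((d1 / 2) * math.log(d1 / d2)
               + (d1 / 2 - 1) * math.log(x)
               - ((d1 + d2) / 2) * math.log1p(d1 * x / d2)
               - log_beta)
    return math.exp(log_pdf)


def f_upper_tail(f, d1, d2, steps=20000):
    """Approximate P(F > f) as 1 minus a Simpson's-rule integral from 0 to f."""
    if d1 < 2:
        raise ValueError("this sketch assumes numerator df >= 2")
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of intervals
    h = f / steps
    total = f_pdf(0, d1, d2) + f_pdf(f, d1, d2)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * f_pdf(k * h, d1, d2)
    return 1 - total * h / 3


# Exercise 22.24: F = 4.92 with df 3 and 92
p_2224 = f_upper_tail(4.92, 3, 92)
print(f"P = {p_2224:.4f}")
```

The result should round to the reported P = 0.003; a statistics package remains the better choice when one is available.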
22.25. (a) Stemplots (below) suggest that yields first increase with plant density, then decrease. The
standard deviation ratio is 1.95. With such small samples, outliers and skewness cannot be assessed. (b) We test H0: µ1 = µ2 = µ3 = µ4 = µ5 (all plant densities give the same mean yield per acre) vs. Ha: not all means are the same. Minitab output is below; with F = 0.50 and P = 0.736, we conclude that the differences are not significant. (c) The sample sizes were small, which means there is a lot of potential variation in the outcome. (That is why the confidence intervals shown are very wide.) Minitab output
Individual 95% CIs For Mean Based on Pooled StDev
4 131.03 18.09 (-----------*-----------)
3 143.07 11.44 (------------*-------------)
22.27 (---------------*----------------)
22.26. (a) The table is given in the Minitab output below; because the largest-to-smallest standard deviation ratio is 1.28 (the largest is 4.500), ANOVA should be
safe. The means appear to suggest that logging reduces the number of species per plot and that recovery takes more than 8 years. (b) ANOVA gives F = 6.02 with df 2 and 30, so P < 0.010 (software gives 0.006), and we conclude that these differences are significant; the number of species per plot really is lower in logged areas. Minitab output
Individual 95% CIs For Mean Based on Pooled StDev
22.27. (a) Table below (part of Minitab output); stemplots at
right. The data suggest that the presence of too many
nematodes reduces growth. (b) H0: µ1 = · · · = µ4 (all mean
heights are the same) vs. Ha: not all means are the same.
This ANOVA tests whether nematodes affect mean plant
growth. (c) Minitab output is shown below: F = 12.08 with df 3
and 12, P = 0.001, so the differences are significant. The
first two levels (0 and 1000 nematodes) do not appear to be
significantly different, nor do the last two. However, it does
appear that somewhere between 1000 and 5000 nematodes
the plants lose the ability to withstand the effects of the worms and are hurt by their presence.
Minitab output Analysis of Variance on Growth
Individual 95% CIs For Mean Based on Pooled StDev
22.28. We have I = 3 and N = 259. The details of the computations are given here; the Minitab
output that follows confirms the computed values.
SSG = 57(26,470 − 29,484)² + 108(30,610 − 29,484)² + …
SSE = 56 × 9507² + 107 × 4504² + 93 × 7158²
With df 2 and 256 (use 2 and 200 in the table), we find P < 0.001, so we have fairly strong
Minitab output
22.29. We have I = 4 and N = 32. Also, all SEMs must be squared and multiplied by 8 to find the
variances. The details of the computations are given here; the Minitab output below confirms the computed values.
SSG = 8(9.22 − 21.585)² + 8(31.03 − 21.585)² + …
With df 3 and 28, we find P > 0.100, so we have no reason to doubt the null hypothesis; that is,
there is not enough evidence to conclude that mean emission rates are different.
Minitab output
22.30. (a) The standard error is SE = √(0.38²/13 + 0.58²/17) = 0.1758, so the t statistic is t = −0.34, which is clearly not significant. The conservative method gives df = 12,
and software reports P = 0.74. (b) The details of the computation are given here; Minitab output is shown below.
With df 1 and 28, software reports that P = 0.749. (c) The two P-values are almost identical. Minitab output
– – – T test – – –
95% C.I. for mu Stock – mu Mutual: (-0.42, 0.30)
T-Test mu Stock = mu Mutual (vs not =): T = -0.34 P = 0.74 DF = 27
– – – ANOVA – – –
Analysis of Variance
22.31. (a) A chi-square test is appropriate, because we are comparing three proportions (attrition
rates in each of three groups). (b) ANOVA is appropriate, because we are comparing three means (weight loss in each of three groups). (c) ANOVA is appropriate, because we are comparing three means (duration of exercise in each of three groups).
22.32. (a) By moving the middle mean to the same level as the other two, it is possible to reduce F to 0.0236, which has a P-value very close to the left end of the scale (near 1). (b) By moving any mean about 1 centimeter up or down (or any two means about 0.5 cm in opposite directions), the value of F increases (and P decreases) until it appears at the right end of the scale.
22.33. (a) F can be made as small as 0.3174, while P > 0.5. (b) F can be made quite large (and P

8.2 It has been hypothesized that silicone breast implants cause illness. In one study it was found that women with implants were more likely to smoke, to be heavy drinkers, to use hair dye, and to have had an abortion than were women in a comparison group who did not have implants. Use the language of statistics to explain why this study casts doubt on the claim that implants cause illness. Sol

MANY YEARS ago when I was a young man, I happened to spend a summer with my friends, the Wints, in Oxford. Guy Wint was on the staff of The Observer and was away in London most of the day. His wife, Freda, had converted to Buddhism and was also out most of the time meeting fellow Buddhists. Their son, Ben, was at a boarding school. For company, I had the Wints' three-year-old daughter, Allegra. In