AN INTRODUCTION TO CATEGORICAL DATA ANALYSIS, 2nd ed.
SOLUTIONS TO SELECTED PROBLEMS for STA 4504/5503
These solutions are solely for the use of students in STA 4504/5503 and are not to be distributed elsewhere. Please report any errors in the solutions to Alan Agresti, e-mail *********fl.edu.
copyright 2009, Alan Agresti.
Chapter 1
1. Response variables are (a) Attitude toward gun control, (b) Heart disease, (c) Vote for President, (d)
Quality of life.
2.a. nominal, b. ordinal, c. ordinal, d. nominal, e. nominal, f. ordinal
3.a. Binomial, n = 100, π = 0.25.
b. The mean is nπ = 25 and the standard deviation is √[nπ(1 − π)] = 4.33. 50 correct responses would
be surprising, since 50 is z = (50 − 25)/4.33 = 5.8 standard deviations above the mean of a distribution
that is approximately normal.
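The arithmetic in 3.b can be checked with a short Python sketch that uses only the standard library. The variable names are illustrative and are not part of the original solution.

```python
import math

n, pi = 100, 0.25                    # number of trials and success probability from 3.a
mean = n * pi                        # binomial mean
sd = math.sqrt(n * pi * (1 - pi))    # binomial standard deviation
z = (50 - mean) / sd                 # standard deviations between 50 and the mean

print(round(mean, 2), round(sd, 2), round(z, 1))   # 25.0 4.33 5.8
```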
4.a. Y is binomial for n = 2 and π = 0.50. Thus, Y = 0 with probability 0.25, Y = 1 with probability 0.50, and Y = 2 with probability 0.25. The mean is 2(0.50) = 1.0 and the standard deviation is √[2(0.50)(0.50)] = 0.71.
b.(i) P(Y = 0) = 0.16, P(Y = 1) = 0.48, P(Y = 2) = 0.36;
(ii) P(Y = 0) = 0.36, P(Y = 1) = 0.48, P(Y = 2) = 0.16.
c. ℓ(π) = 2π(1 − π).
d. From the plot or using calculus by taking the derivative and setting it equal to 0, the function
ℓ(π) = 2π(1 − π) takes its maximum value at π = 0.50.
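As an alternative to the plot or the calculus argument in 4.d, a coarse grid search locates the same maximum. This is only a sketch; the grid spacing of 0.01 and the names likelihood, grid, and pi_hat are assumptions made for illustration.

```python
def likelihood(pi):
    """Likelihood function from 4.c: l(pi) = 2*pi*(1 - pi)."""
    return 2 * pi * (1 - pi)

grid = [i / 100 for i in range(101)]    # pi = 0.00, 0.01, ..., 1.00
pi_hat = max(grid, key=likelihood)      # grid point with the largest likelihood

print(pi_hat, likelihood(pi_hat))       # 0.5 0.5
```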
8.a. 0.294. b. z = −14.9, P-value < 0.0001. Conclude that a minority of the population would say ‘yes.’
c. p ± 2.58√[p(1 − p)/n] is 0.294 ± 2.58(0.0133), or (0.26, 0.33).
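The interval in 8.c can be reproduced from the three numbers in its numerical expression: the sample proportion 0.294, the standard error 0.0133, and the multiplier 2.58. The sketch below takes the standard error as given rather than recomputing it, since n is not stated in these solutions; the variable names are illustrative.

```python
p = 0.294        # sample proportion from 8.a
se = 0.0133      # standard error as reported in 8.c
z_crit = 2.58    # multiplier used in the reported interval

lower = p - z_crit * se
upper = p + z_crit * se

print(round(lower, 2), round(upper, 2))   # 0.26 0.33
```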
12.a. SE = 0, and the z statistic equals −∞.
b. CI is (0, 0); no, in the population we expect some vegetarians, even if the proportion is small.
c. z = (0 − 0.50)/√[0.50(0.50)/25] = −5.0, P-value < 0.0001.
d. Note z = (0 − 0.133)/√[0.133(0.867)/25] = −1.96, so 0.133 is the null value that has a P-value of 0.05.
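One way to see where 0.133 comes from is to solve for the null value directly. The closed form below is a derived step that does not appear in the original solution: setting (0 − π0)/√[π0(1 − π0)/n] = −1.96 and solving gives π0 = 1.96²/(n + 1.96²). The names n, z_crit, and pi0 are illustrative.

```python
import math

n, z_crit = 25, 1.96    # sample size from 12.c and the two-sided 0.05 critical value

# Closed-form null value with p = 0 (derived step, not in the original solution)
pi0 = z_crit**2 / (n + z_crit**2)

# Plug it back into the z statistic with the null standard error
z = (0 - pi0) / math.sqrt(pi0 * (1 - pi0) / n)

print(round(pi0, 3), round(z, 2))   # 0.133 -1.96
```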
15.a. σ(p) equals the binomial standard deviation √[nπ(1 − π)] divided by the sample size n.
b. σ(p) takes its maximum value at π = 0.50 and its minimum at π = 0 and 1. If π = 1, for instance,
every observation must be a success, and the sample proportion p equals π with probability 1.
Chapter 2
2.a. Sensitivity = P(Y = 1|X = 1) = π1, specificity = P(Y = 2|X = 2) = 1 − P(Y = 1|X = 2) = 1 − π2.
b. P(X = 1|Y = 1) = P(Y = 1|X = 1)P(X = 1) / [P(Y = 1|X = 1)P(X = 1) + P(Y = 1|X = 2)P(X = 2)].
c. 0.86(0.01)/[0.86(0.01) + 0.12(0.99)] = 0.0675.
d.
                 Test diagnosis
True             +        −        Total
disease          0.0086   0.0014   0.01
no disease       0.1188   0.8712   0.99
Nearly all (99%) subjects do not have breast cancer. The 12% errors for them swamp (in frequency)
the 86% correct cases for the relatively few subjects who truly have it. In the column corresponding to
a positive test result, we see that a much higher proportion are in the ‘no disease’ category than the
‘disease’ category.
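The joint probabilities in the table for 2.d and the conditional probability in 2.c follow from three inputs: 0.86, 0.12, and 0.01. A minimal Python sketch, with illustrative names (sens, false_pos, prev, joint, ppv) that are not part of the original solution:

```python
sens = 0.86         # P(positive test | disease), from 2.c
false_pos = 0.12    # P(positive test | no disease), from 2.c
prev = 0.01         # P(disease), from 2.c

# Joint probabilities for the 2x2 table in 2.d
joint = {
    ("disease", "+"): sens * prev,
    ("disease", "-"): (1 - sens) * prev,
    ("no disease", "+"): false_pos * (1 - prev),
    ("no disease", "-"): (1 - false_pos) * (1 - prev),
}

# Bayes' theorem: P(disease | positive test)
p_positive = joint[("disease", "+")] + joint[("no disease", "+")]
ppv = joint[("disease", "+")] / p_positive

for cell, prob in joint.items():
    print(cell, round(prob, 4))
print(round(ppv, 4))
```

Among positive tests, the ‘no disease’ cell (0.1188) is far larger than the ‘disease’ cell (0.0086), which is why the result is only about 0.0675.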
3.a. (i) 0.0000624 − 0.0000013 = 0.000061, (ii) 62.4/1.3 = 48, so the estimated probability of a gun-related death in the U.S. was 48 times that in Britain.
b. Relative risk, as difference of proportions makes it misleadingly seem as if there is no effect.
5.a. Relative risk.
b. (i) π1 = 0.55π2, so π1/π2 = 0.55. (ii) 1/0.55 = 1.82.
6.a. 0.0012, 10.78; relative risk, since difference of proportions makes it appear there is no association.
b. (0.001304/0.998696)/(0.000121/0.999879) = 10.79; this happens when the proportion in the first
category is close to zero for each group.
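The point in 6.b, that the odds ratio and relative risk nearly coincide when both proportions are close to zero, can be checked numerically. The sketch below uses the two proportions that appear in 6.b; the variable names are illustrative.

```python
p1, p2 = 0.001304, 0.000121    # the two proportions used in 6.b

difference = p1 - p2
relative_risk = p1 / p2
odds_ratio = (p1 / (1 - p1)) / (p2 / (1 - p2))

print(round(difference, 4), round(relative_risk, 2), round(odds_ratio, 2))
# 0.0012 10.78 10.79
```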
7.a. The quoted interpretation is that of the relative risk. Should substitute odds for probability. It
would be approximately correct if the probability of survival were close to 0 for females and for males.
b. For females, proportion = 2.9/(1 + 2.9) = 0.744. Odds for males = 2.9/11.4 = 0.254, so proportion
= 0.254/(1 + 0.254) = 0.203.
c. R = 0.744/0.203 = 3.7.
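The conversions in 7.b and 7.c use the identity proportion = odds/(1 + odds). A small sketch, assuming that 2.9 is the odds for females and that 11.4 is the female-to-male odds ratio, as the division in 7.b suggests; the helper odds_to_prob is hypothetical.

```python
def odds_to_prob(odds):
    """Convert odds to a probability: p = odds / (1 + odds)."""
    return odds / (1 + odds)

odds_female = 2.9                 # odds for females, from 7.b
odds_male = odds_female / 11.4    # assumes 11.4 is the female-to-male odds ratio

p_female = odds_to_prob(odds_female)
p_male = odds_to_prob(odds_male)
ratio = p_female / p_male         # relative risk R in 7.c

print(round(p_female, 3), round(p_male, 3), round(ratio, 1))   # 0.744 0.203 3.7
```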
8.a. (0.847/0.153)/(0.906/0.094) = 0.574.
b. This is interpretation for relative risk, not the odds ratio. The actual relative risk = 0.847/0.906 =
0.935; i.e., 60% should have been 93.5%.
12.a.
           Heart attack
Group      Yes    No       Total
Placebo    193    19,749   19,942
Aspirin    198    19,736   19,934
b. 0.974. The sample odds of a heart attack were actually a bit less for the placebo group.
c. CI for log odds ratio is −0.0262 ± 1.96(0.1017), or (−0.225, 0.173). CI for odds ratio is (0.80, 1.19).
It is plausible that there is no effect. If there is an effect, it is relatively weak.
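The numbers in 12.b and 12.c can be reproduced from the four cell counts in 12.a. The sketch below uses the usual large-sample standard error of the log odds ratio, the square root of the sum of reciprocal cell counts, which matches the 0.1017 quoted in 12.c. Variable names are illustrative.

```python
import math

# Cell counts from the table in 12.a: (heart attack yes, heart attack no)
placebo = (193, 19749)
aspirin = (198, 19736)

odds_ratio = (placebo[0] * aspirin[1]) / (placebo[1] * aspirin[0])
log_or = math.log(odds_ratio)

# Standard error of the log odds ratio: sqrt of the sum of 1/count over the four cells
se = math.sqrt(sum(1 / count for count in placebo + aspirin))

lo = log_or - 1.96 * se
hi = log_or + 1.96 * se

print(f"odds ratio = {odds_ratio:.3f}, log odds ratio = {log_or:.4f}, SE = {se:.4f}")
print(f"CI for log odds ratio: ({lo:.3f}, {hi:.3f})")
print(f"CI for odds ratio: ({math.exp(lo):.2f}, {math.exp(hi):.2f})")
```

Because the interval for the odds ratio contains 1.0, the data are consistent with no effect.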
17.a. X² = 25.0, df = 1, P < 0.0001. b. G² = 25.4, df = 1; for each statistic, very strong evidence that incidence of heart attacks depends on aspirin intake.
18.a. 35.8 = (290)(168)/n, where n = 1362.
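The expected frequency in 18.a is the product of two marginal totals divided by the sample size. In the sketch below, which of 290 and 168 is the row total and which is the column total is an assumption; the result does not depend on that choice.

```python
row_total, col_total, n = 290, 168, 1362    # marginal totals and sample size from 18.a
expected = row_total * col_total / n        # estimated expected frequency under independence

print(round(expected, 1))   # 35.8
```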
