Chapter 1: Measurement
1.1 Value, variable, observation.
The value of LUNGCA for the 7th observation is 18.
The value of COUNTRY for the 11th observation is Iceland.
1.3 Value, variable, observation (cont.).
VAR3 records the gender of subjects.
1.5 Measurement scale.
VAR1 − Categorical
VAR2 − Categorical
VAR3 − Categorical
VAR4 − Quantitative
VAR5 − Categorical
1.7 Duration of hospitalization.
(a) The following variables are categorical: SEX, AB, CULT, and SERV. The following variables are quantitative: DUR, AGE, TEMP, and WBC. There are no ordinal variables in this data set.
(b) The value of DUR for observation 4 is 11.
(c) The value of AGE for observation 24 is 43.
1.9 Dietary histories.
Prospective dietary logs are more accurate than retrospective recall. Memory tends to be unreliable.
(a) Quantitative
(b) Quantitative
(c) Categorical
(d) Quantitative
(e) Quantitative
(f) Ordinal
(g) Categorical
(h) Ordinal
(i) Categorical
(j) Quantitative
(k) Categorical
(l) Ordinal
(m) Ordinal
(n) Categorical
1.13 Age recorded on different measurement scales.
Age can be recorded quantitatively in months, years, and so on. It can also be grouped into categories, for example, 1 = youngest age group and 2 = next age category.
1.15 Binge drinking.
AGEGRP is ordinal. BINGE2003 and BINGE2008 are quantitative.
Chapter 2: Types of Studies
2.1 Sample and population.
(a) The sample consists of the 125 individuals in the study. The source population can be viewed as either (1) all patients who attended this hospital during the study period or (2) all individuals in the hospital’s geographic catchment area, that is, the population who might have been hospitalized had there been a need. It would be useful to know additional information about these source populations, for example, the time frame of the study, the demographic makeup of the catchment area, and so on.
(b) The sample consists of the 18 diabetics in the study. The population consists of 35- to 44-year-old male diabetics. It would be useful to know other person, place, and time factors that define the source population.
2.3 California counties.
There are 58 counties to choose from. Starting on line 33 of Table A, the first four random numbers that apply are 18, 56, 16, and 38. These numbers identify the following counties: Lassen, Ventura, Kings, and San Francisco.
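The same kind of selection can be sketched in software. This is only an illustration: it uses Python's pseudo-random generator rather than Table A, so it will not reproduce the four counties listed above, and the seed value is an arbitrary assumption.

```python
import random

N_COUNTIES = 58      # counties are numbered 1 through 58
SAMPLE_SIZE = 4

rng = random.Random(2024)  # arbitrary seed, used only so the run is repeatable
# sample() draws without replacement, so no county can be chosen twice
selected = sorted(rng.sample(range(1, N_COUNTIES + 1), SAMPLE_SIZE))
print(selected)
```

Each county number is equally likely to be drawn, which is the defining property of a simple random sample.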
2.5 Experimental or nonexperimental?
(a) Nonexperimental
(b) Nonexperimental
(c) Experimental
2.7 Five-City Project. Study design schematic:
2.9 Five-City Project (cont.).
The first five digits in line 17 of Table A are 50728. Therefore, cities 5 and 2 will be designated as treatment cities.
2.11 Campus survey.
(a) The high nonresponse rate hinders our ability to make generalizations about the population; the distribution of behaviors in responders may not randomly reflect behaviors in the campus populations.
(b) The information on the questionnaires is of uncertain quality. What people say about their behavior often does not match how they behave.
2.13 Telephone directory sampling frame.
This sampling frame omits households that lack landline telephones. It also excludes those with unlisted numbers and double counts those with more than one phone line.
2.15 Four-naughts.
Four zeros in a row would occur in 1 of every 10,000 four-tuples.
2.17 Employee counseling.
(a) Depending on how one structures the research question, the population can be defined as either the 1000 employees who used the service or all employees who might potentially use the service.
(b) The sample consists of the 25 employees who completed and returned their questionnaire.
(c) Since only one in four potential respondents returned a completed questionnaire, there is the potential for nonresponse bias (a form of selection bias in which nonresponders differ systematically from responders).
(d) This would still not be an SRS. This would be a systematic sample.
Chapter 3: Frequency Distributions
3.1 Poverty in eastern states, 2000.
Percent of people living below the poverty line by state.
• Shape: The distribution has a mild positive skew. There are no apparent outliers.
• Location: The median (underlined) has a depth of (26 + 1) / 2 = 13.5 and a value of 10.2. Michigan and Massachusetts are the “median states.”
• Spread: Values vary from 7.3% to 15.8%.
3.3 Leaves on a common stem.
(a) Comparison A. Groups have the same central locations; group 1 has greater variability (spread).
(b) Comparison B. Group 1 has larger values on average; groups have the same variability.
(c) Comparison C. Group 1 has a lower average and greater spread.
3.5 Hospital stay duration.
(a) Frequency table
(b) Five of 25 (20.0%) were less than 5 days.
(c) Ninety-two percent were less than 15 days (see Cumulative Frequency column).
(d) Two of 25 (8%) were at least 15 days in length.
3.7 Body weight expressed as a percentage of ideal.
• Shape: The distribution has a negative skew and high outlier.
• Location: The median is 114 (underlined).
• Spread: Data range from 88 to 152.
3.9 Seizures following bacterial meningitis.
Data have a pronounced positive skew and a high outlier (shape). The median is 24 (location). Induction times are highly variable, ranging from 0.1 to 96 months (spread).
3.11 U.S. Hispanic population.
Here is the stemplot with single stem values:
Here is the stemplot with split stem values:
The stemplot with split stem values does a better job showing the distribution’s shape. Notice the positive skew with possible outlier values: 42 (New Mexico), 32 (California), 32 (Texas), and 25 (Arizona). The median (about 4) is underscored.
3.13 Air samples.
Here’s a stemplot with a regular stem:
Here’s a stemplot with split stem values:
• The sites have similar central locations.
• Site 1 exhibits greater variability.
• A high outlier value is apparent at site 1.
3.15 Practicing docs.
(a) 2004 data. The distribution has a positive skew and several potential outliers (labeled on plot). The median has a depth of (51 + 1)/2 = 26 and an approximate value of 22. (Median is approximate because it is based on counting the truncated leaves on the stemplot.) Data range from about 15 to 64.
Here is how SPSS plots the data:
Notice that SPSS includes frequency counts to the left of the stem. The stem exhibits a quintuple split but does not use our convention of using “*, T, F, S, and .” to keep track of stem values. Seven observations are identified as EXTREMES (values of 31 or greater). STEM WIDTH 10.0 refers to the stem multiplier (×10).
(b) 1975 data. The distribution has a positive skew and several potential outliers. The median has a depth of (51 + 1)/2 = 26 and an approximate value of 11. Values range from 7.7 to 34.6.
(c) 1975 and 2004 data in back-to-back stemplot form. The back-to-back placement of the plots facilitates comparisons. Note that there is little overlap of the distributions.
3.17 Cancer treatment.
The distribution has a strong positive skew with at least two potential outliers (510 and 700). The median has a depth of (11 + 1)/2 = 6 and a value of approximately 30 (underlined); the actual median is 34 (precision lost when data points were truncated to draw the plot). Data range from 0 to 700.
Chapter 4: Summary Statistics
4.1 Gravitational center.
Locations of means are shown with a ˆ.
(a)
(b)
(c)
(d)
4.3 More visualization.
The calculated means are (a) 9.9, (b) 159, and (c) 2.8. How good were your visual estimates?
4.5 Outside?
Five-point summary: 88, 101, 114, 120, 152
IQR = 120 − 101 = 19
FenceUpper = 120 + (1.5)(19) = 148.5
The value 152 is outside the upper fence.
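The fence rule used above can be written as a small helper. This is a minimal sketch; the function name `fences` and its parameter names are illustrative and do not come from the text.

```python
def fences(q1, q3, k=1.5):
    """Return (lower, upper) fences for flagging outside values."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

lower, upper = fences(q1=101, q3=120)
maximum = 152
print(lower, upper)      # 72.5 148.5
print(maximum > upper)   # True: 152 lies above the upper fence
```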
4.7 Spread.
No arithmetic is required to see that batch A has the greatest variability.
4.9 Standard deviation (and variance) via technology.
For site 1, s = 14.56 and s² = 211.93.
For site 2, s = 2.88 and s² = 8.286.
4.11 Units of measure changes numeric values of the standard deviation.
The distributions are identical and are merely expressed in different units. The standard deviations are s1 = 1 year, s2 = 12 months, and s3 = 365 days, respectively.
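A quick numerical check of this scaling property is sketched below. The three ages are hypothetical values chosen so that s is 1 year; they are not the exercise data.

```python
import statistics

ages_years = [1.0, 2.0, 3.0]                 # hypothetical ages with s = 1 year
ages_months = [12 * a for a in ages_years]   # same ages expressed in months
ages_days = [365 * a for a in ages_years]    # same ages expressed in days

s_years = statistics.stdev(ages_years)
s_months = statistics.stdev(ages_months)
s_days = statistics.stdev(ages_days)
print(s_years, s_months, s_days)
```

Multiplying every observation by a constant multiplies the standard deviation by the same constant, so the spread is unchanged once the units are taken into account.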
4.13 Which statistics?
Data set (a) is fairly symmetrical. Therefore, use of the mean and standard deviation is recommended.
Data set (b) is bimodal and has an outlier. The median and IQR should be used to describe this distribution.
Data set (c) has a mild negative skew. It would be prudent to calculate both the mean and median to see whether they differed. If they differed in a meaningful way, you should report the median and IQR.
4.15 Leaves on stems.
(a) Comparison A
Group 1: x̄ = 50 and s = 31.6.
Group 2: x̄ = 50 and s = 15.8.
How statistics relate to what we see: Same central locations; greater spread in group 1.
(b) Comparison B
Group 1: x̄ = 70 and s = 15.8.
Group 2: x̄ = 50 and s = 15.8.
How statistics relate to what we see: Higher central location in group 1; equal group spreads.
(c) Comparison C
Group 1: x̄ = 50 and s = 31.6.
Group 2: x̄ = 70 and s = 15.8.
How statistics relate to what we see: Smaller central location and greater spread in group 1.
4.17 Health insurance by state.
(a) Mean = 14.28, median = 13.8. This suggests the distribution has a positive skew or high outlier.
(b) Five-point summary: 8.5, 11.2, 13.8, 16.75, 25.1
(c) IQR = 16.75 − 11.2 = 5.55
FenceUpper = 16.75 + (1.5)(5.55) = 25.075. There is an outside value on top. This value is 25.1.
FenceLower = 11.2 − (1.5)(5.55) = 2.875. There are no outside values on the bottom.
4.19 What would you report?
The stemplot is mound shaped and symmetrical. (Symmetry is confirmed by the fact that both x̄ and the median are equal to 5.5.) Therefore, the mean and standard deviation should do a good job summarizing this distribution.
4.21 Practicing docs (side-by-side boxplots).
Notes for 1975 data: 5-point summary: 7.7, 10.05, 11.4, 13.85, and 34.6. IQR = 13.85 − 10.05 = 3.8. FenceL = 10.05 − (1.5) · (3.8) = 4.35; there are no lower outside values. FenceU = 13.85 + (1.5) · (3.8) = 19.55; there are two upper outside values: 20.2 and 34.6. The upper inside value is 18.3.
Notes for 2004 data: 5-point summary: 15.6, 20.05, 22.2, 24.05, and 64.2. IQR = 24.05 − 20.05 = 4. FenceL = 20.05 − (1.5) · (4) = 14.05; therefore, there are no lower outside values. FenceU = 24.05 + (1.5) · (4) = 30.05; therefore, there are seven upper outside values: 30.7, 31.3, 31.9, 33.0, 33.8, 37.4, and 64.2. The upper inside value is 28.1.
• Shapes: Both distributions have upper outside values indicating positive skews. The tail of the 2004 distribution appears to be more prominent than the tail of the 1975 distribution.
• Locations: In 2004, the states demonstrated a much higher number of practicing medical doctors per capita, so much so that there is no overlap in the boxes and little overlap in the whiskers.
• Spreads: Their IQRs appear to be similar. However, their range of values is greater in 2004.
4.23 Melanoma treatment.
The response variable is cell doubling time in days (DOUBLING) and the explanatory variable is the manner in which the cells were cultured (indicated by the variable COHORT). These side-by-side boxplots indicate much longer doubling times in cohort 1 (extended ex vivo culturing of cells) and less variability in doubling times in cohort 2 (short ex vivo culturing).
Note: The cell doubling times are multiplied by a factor of 10 for the plot.
Chapter 5: Probability Concepts
5.1 Explaining probability.
We cannot say for certain whether this particular patient will survive, but we can say that in a large number of patients with identical characteristics, 60% will survive at least five years and 40% will not.
5.3 February birthdays.
(a) February 28 occurs four times every 4 years. Three of the years have 365 days, and every fourth year (leap year) has 366 days. Therefore, Pr(February 28) = 4 / (3 · 365 + 366) = 4/1461 = 0.00274.
(b) February 29 occurs once every 4 years. Therefore, Pr(February 29) = 1/1461 = 0.00068.
(c) February 28 or 29 occurs five times every 4 years. Therefore, Pr(February 28 or 29) = 5/1461 = 0.00342.
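The counting argument can be checked with exact fractions. This sketch assumes births are spread evenly over the days of a 4-year cycle and ignores the century exceptions to the leap-year rule.

```python
from fractions import Fraction

days_in_cycle = 3 * 365 + 366          # days in a 4-year cycle
p_feb28 = Fraction(4, days_in_cycle)   # February 28 occurs in all four years
p_feb29 = Fraction(1, days_in_cycle)   # February 29 occurs only in the leap year
p_either = p_feb28 + p_feb29           # disjoint events, so probabilities add

for label, p in [("Feb 28", p_feb28), ("Feb 29", p_feb29), ("Feb 28 or 29", p_either)]:
    print(f"{label}: {p} = {float(p):.5f}")
```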
5.5. N = 26.
(a) 1 in 26 = 0.0385
(b) 1 in 26 = 0.0385
(c) 0
5.7 Expressions of probability.
(a) This event seldom happens. It has a very low chance of occurring. | 5%
(b) This event is infrequent. It is unlikely. | 20%
(c) This happens as often as not; chances are even. | 50%
(d) This is very frequent. It has high probability. | 80%
(e) This event almost always occurs. It has a very high probability. | 95%
5.9 Lottery.
(a) μ = Σ xi · Pr(X = xi) = (0 · 0.999999982) + (10,000,000 · 0.000000018) = 0 + 0.18 = 0.18
(b) 19 cents
5.11 Uniform (0, 1) pdf.
(a) Pr(X ≤ 0.8) = 0.8
(b) Pr(X ≤ 0.2) = 0.2
(c) Pr(0.2 ≤ X ≤ 0.8) = 0.8 − 0.2 = 0.6
5.13 The sum of two uniform (0,1) random variables.
(a) Pr(X ≤ 1) = half the area under the curve = 0.5
Also note that the area is that of a right triangle with height = 1 and base = 1. Area = ½ × h × b = ½ × 1 × 1 = 0.5.
(b) Pr(X ≤ 0.5) = ½ × h × b = ½ × 0.5 × 0.5 = 0.125
(c) Pr(0.5 ≤ X ≤ 1.5) = 1 − Pr(X ≤ 0.5) − Pr(X ≥ 1.5) = 1 − 0.125 − 0.125 = 0.75.
Explanation:
• Pr(X ≤ 0.5) = 0.125 as explained in part (b).
• By symmetry, Pr(X ≥ 1.5) is also equal to 0.125.
• The area under the curve sums to exactly 1 (property 2 of probabilities). Subtract Pr(X ≤ 0.5) and Pr(X ≥ 1.5) from 1 to get Pr(0.5 ≤ X ≤ 1.5).
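The triangle areas used in parts (a) through (c) can be collected into one cumulative function. This is a sketch of the geometry described above; the function name is illustrative.

```python
def cdf_sum_of_two_uniforms(x):
    """Pr(X <= x) where X is the sum of two independent uniform(0, 1) variables."""
    if x <= 0:
        return 0.0
    if x <= 1:
        return 0.5 * x * x               # left triangle with base x and height x
    if x < 2:
        return 1 - 0.5 * (2 - x) ** 2    # 1 minus the right-tail triangle
    return 1.0

p_a = cdf_sum_of_two_uniforms(1)
p_b = cdf_sum_of_two_uniforms(0.5)
p_c = cdf_sum_of_two_uniforms(1.5) - cdf_sum_of_two_uniforms(0.5)
print(p_a, p_b, p_c)
```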
5.15 Uniform distribution of highway accidents.
(a) Pr(in the first mile) = shaded area = base × height = 1 × (1/5) = 1/5, or 0.20.
(b) Pr(not in the first mile) = 1 − Pr(in the first mile) = 1 − 0.20 = 0.80
(c) Pr(between miles 2.5 and 4) = 1.5 × (1/5) = 0.30.
(d) Pr(in first mile OR between 2.5 and 4 miles) = Pr(in the first mile) + Pr(between 2.5 and 4 miles) = 0.20 + 0.30 = 0.50.
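These areas can be reproduced with a short function. The 5-mile length of the highway segment is an assumption inferred from the 0.20 probability for the first mile; the function name is illustrative.

```python
def pr_between(a, b, length=5.0):
    """Pr(a <= X <= b) for X uniform on [0, length]; length = 5 miles is assumed."""
    a, b = max(a, 0.0), min(b, length)
    return max(b - a, 0.0) / length

p_first = pr_between(0, 1)
p_middle = pr_between(2.5, 4)
# The two stretches do not overlap, so their probabilities add
print(p_first, 1 - p_first, p_middle, p_first + p_middle)
```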
5.17 Bound for Glory (variance).
σ² = Σ (xi − μ)² · Pr(X = xi) = [(0 − 0.1667)² · 59/60] + [(10 − 0.1667)² · 1/60] = 0.0273 + 1.6116 = 1.6389 (units are dollars²).
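The mean and variance of a discrete random variable can be computed directly from its probability mass function. The sketch below takes the payoff distribution from the calculation above ($0 with probability 59/60 and $10 with probability 1/60).

```python
from fractions import Fraction

# Payoff distribution read off the worked solution
pmf = {0: Fraction(59, 60), 10: Fraction(1, 60)}

mu = sum(x * p for x, p in pmf.items())                     # expected value
variance = sum((x - mu) ** 2 * p for x, p in pmf.items())   # expected squared deviation
print(float(mu), float(variance))
```

Exact fractions avoid the small rounding error introduced by using 0.1667 for μ.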
5.19 The sum of two uniform (0,1) random variables (areas under the curve).
(a) Pr(X < 1) = ½ hb = ½ · 1 · 1 = 0.50:
(b) Note that Pr(X > 1.5) = ½ hb = ½ · ½ · ½ = ⅛ (right tail in this figure):
By the law of complements, Pr(X ≤ 1.5) = 1 − Pr(X > 1.5) = 1 − ⅛ = ⅞, shown as the area under the curve to the left of 1.5 in the above figure.
(d) Pr(1 < X < 1.5) = Pr(X < 1.5) − Pr(X < 1) = ⅞ − ½ = ⅜.
Chapter 6: Binomial Probability Distributions
6.1 Tay-Sachs.
(a) Yes, this is a binomial random variable because it is based on n = 3 independent Bernoulli trials, each with probability of success p = 0.25.
(b) This is not a binomial because the number of trials n is not fixed.
6.3 Tay-Sachs inheritance.
Based on Mendelian genetics, there is a one-in-four chance that both carriers will contribute the Tay-Sachs allele during conception. Let X represent the number of Tay-Sachs affected children out of 3. X ~ b(3, 0.25). Therefore:
Pr(X = 0) = ₙCₓ · pˣ · (1 − p)ⁿ⁻ˣ = ₃C₀ · 0.25⁰ · (1 − 0.25)³ = 1 · 1 · 0.4219 = 0.4219
Pr(X = 1) = ₃C₁ · 0.25¹ · (1 − 0.25)² = 3 · 0.25 · 0.5625 = 0.4219
Pr(X = 2) = ₃C₂ · 0.25² · (1 − 0.25)¹ = 3 · 0.0625 · 0.75 = 0.1406
Pr(X = 3) = ₃C₃ · 0.25³ · (1 − 0.25)⁰ = 1 · 0.0156 · 1 = 0.0156
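The binomial formula translates directly into code. This is a minimal sketch using only the standard library; `binom_pmf` is an illustrative name.

```python
from math import comb

def binom_pmf(x, n, p):
    """Pr(X = x) for X ~ b(n, p): number of ways times probability of each way."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

pmf = [binom_pmf(x, 3, 0.25) for x in range(4)]
for x, prob in enumerate(pmf):
    print(x, round(prob, 4))
```

The four probabilities sum to 1, as they must for a complete pmf.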
6.5 Telephone survey.
Given: X ~ b(8, 0.15)
Pr(X = 2) = ₈C₂ · 0.15² · 0.85⁶ = 28 · 0.0225 · 0.3771 = 0.2376.
6.7 Tay-Sachs.
μ = np = (3)(0.25) = 0.75
σ² = npq = (3)(0.25)(0.75) = 0.5625
(a) μ = (5)(0.768) = 3.84
(b) μ = (10)(0.768) = 7.68
(c) Pr(X ≥ 9) = Pr(X = 9) + Pr(X = 10) = 0.2156 + 0.0714 = 0.2870
6.11 Prevalence 10%.
(a) X ~ b(15, 0.10)
(b) Pr(X = 0) = 0.2059
(c) Pr(X = 1) = 0.3432
(d) Pr(X ≤ 1) = Pr(X = 0) + Pr(X = 1) = 0.2059 + 0.3432 = 0.5491
(e) Pr(X ≥ 2) = 1 − Pr(X ≤ 1) = 1 − 0.5491 = 0.4509
6.13 Linda’s omelets.
Let X represent the number of contaminated eggs in the three-egg omelet. Given: X ~ b(n = 3, p = 0.16667). Calculate: Pr(X = 0) = ₃C₀ · 0.16667⁰ · (1 − 0.16667)³ = 1 · 1 · 0.5787 = 0.5787. Therefore, the probability of “at least one” is Pr(X ≥ 1) = 1 − Pr(X = 0) = 1 − 0.5787 = 0.4213.
6.15 Decayed teeth.
. Therefore, there are 190 possible combinations.
6.17 Human papillomavirus.
(a) The pmf for X ~ b(4, 0.20):
(b) Pr(X ≥ 1) = 1 − Pr(X = 0) = 1 − 0.4096 = 0.5904
Chapter 7: Normal Probability Distributions
7.1 Heights of 10-year-olds.
Let X represent heights of 10-year-old males in centimeters. Given: X ~ N(138, 7). Visually, this probability density function is represented as a bell-shaped curve centered on a μ value of 138 with inflection points at μ ± σ = 138 ± 7, as depicted in the following drawing. According to the “68 part” of the 68–95–99.7 rule, 68% of the values from this distribution will fall in this range. The remaining 32% fall outside this range, with equal numbers at either extreme: 16% fall below 131 and 16% fall above 145.
7.3 Visualizing the distribution of gestational length.
X ~ N(39, 2)
The curve appears as Figure 7.13 on page 160.
7.5 Heights of 10-year-old boys.
X ~ N(138, 7)
(a) What proportion of the population is less than 150 cm tall?
The four-step solution is:
1. State: We want to determine Pr(X < 150)
2. Standardize: z = (150 − 138)/7 = 1.71
3. Sketch:
4. Use Table B: Pr(z < 1.71) = 0.9564
(b) What proportion of the population is less than 140 cm tall?
1. State: Pr(X < 140)
2. Standardize: z = (140 − 138)/7 = 0.29
3. Sketch:
4. Use Appendix Table B: Pr(z < 0.29) = 0.6141
(c) What proportion is between 140 and 150 cm?
Make use of the fact demonstrated in Figure 7.12 (page 159): Pr(140 ≤ X ≤ 150) = Pr(X ≤ 150) − Pr(X ≤ 140) = 0.9564 − 0.6141 = 0.3423
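The three probabilities can also be obtained without Table B. This sketch uses `statistics.NormalDist` from the Python standard library; the results differ slightly from the table values because the z-scores are not rounded to two decimals.

```python
from statistics import NormalDist

heights = NormalDist(mu=138, sigma=7)   # X ~ N(138, 7), in centimeters

p_below_150 = heights.cdf(150)          # part (a)
p_below_140 = heights.cdf(140)          # part (b)
p_between = p_below_150 - p_below_140   # part (c)
print(round(p_below_150, 4), round(p_below_140, 4), round(p_between, 4))
```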
7.7 45th percentile on a Standard Normal curve.
z0.45 = −0.13
1. State: X ~ N(100, 15). We want to find the range for the middle 50% of values, that is, from the 25th percentile to the 75th percentile.
2. Table B: The z-scores for these percentiles are z0.25 = −0.67 and z0.75 = 0.67
3. Sketch:
4. Unstandardize using the formula x = μ + (zp)(σ)
The 25th percentile is x = 100 + (−0.67)(15) = 100 − 10.05 = 89.95 ≈ 90.
The 75th percentile is x = 100 + (0.67)(15) = 100 + 10.05 = 110.05 ≈ 110. The range 90 to 110 captures the middle 50% of values.
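Finding a percentile is the inverse of finding a cumulative probability. The sketch below uses `NormalDist.inv_cdf` from the standard library in place of the Table B lookup and the unstandardizing step; the variable names are illustrative.

```python
from statistics import NormalDist

scores = NormalDist(mu=100, sigma=15)   # X ~ N(100, 15)

q1 = scores.inv_cdf(0.25)   # 25th percentile
q3 = scores.inv_cdf(0.75)   # 75th percentile
print(round(q1, 2), round(q3, 2))
```

The unrounded z-value is about 0.674 rather than 0.67, so the endpoints come out a fraction closer to 100 than the hand calculation, but they still round to 90 and 110.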
7.11 Death row inmate.
Again, the four-step procedure is used:
1. State: X ~ N(100, 15). We want to determine Pr(X < 51)
2. Standardize: z = (51 − 100)/15 = −3.27
3. Sketch: not shown
4. Table B: Pr(z ≤ −3.27) = 0.0005 (about 0.05%)
7.13 Alzheimer brains.
1. State: X ~ N(1077, 106). We want to determine Pr(X ≥ 1250)
2. Standardize: z = (1250 − 1077)/106 = 1.63
3. Sketch: not shown
4. Use Table B: Pr(z ≥ 1.63) = 1 − 0.9484 = 0.0516
(a) z0.10 = −1.28 (The 10th percentile on a Standard Normal curve is −1.28.)
(b) z0.35 = −0.39
(c) z0.74 = 0.64
(d) z0.85 = 1.04
(e) z0.999 = 3.09
7.17 Gestation less than 32 weeks.
• State the problem. Let X represent normal gestational length from conception to birth in weeks: X ~ N(39, 2). This question asks “What percentage of gestational lengths is less than 32 weeks?” In notation, Pr(X < 32) = ?
• Standardize the value. The z-score corresponding to 32 weeks is z = (32 − 39)/2 = −3.50.
• Sketch the curve and shade the probability area. The drawing is not shown in this key. However, one can imagine the standardized value of −3.50 located in the far left tail of the Standard Normal curve. The AUC to the left of this point is very tiny.
• Use Appendix Table B to look up the probability. Table B does not include the cumulative probability for −3.50. However, it does include the fact that Pr(z ≤ −3.49) = 0.0002. Our standardized value is very close to this point, so we can assume Pr(z ≤ −3.50) ≈ 0.0002. Using the StaTable probability application, Pr(z ≤ −3.50) = 0.000233. Therefore, 0.02% of gestations are less than 32 weeks in length.
7.19 A six-foot seven-inch tall man.
Let X represent male height in inches. We are given X ~ N(70, 3). The probability of seeing a man who is 70″ tall or taller is Pr(X ≥ 70) = Pr(z ≥ (70 − 70)/3) = Pr(z ≥ 0) = 0.5. Therefore, half the men in the population will be taller than 5′10″. In contrast, the probability of seeing a man who is 79″ tall or taller is Pr(X ≥ 79) = Pr(z ≥ (79 − 70)/3) = Pr(z ≥ 3) = 1 − Pr(z < 3) = 1 − 0.9987 = 0.0013. Therefore, only 0.13% of men are taller than 6′7″. It follows that the 6′7″ man is a rarity, while the 5′10″ man is common. That is why 6′7″ seems so much taller than 5′10″.
7.21 |z| ≥ 2.56.
Pr(z ≥ 2.56) = 0.0052 and Pr(z ≤ −2.56) = 0.0052. Since these two events are disjoint (in separate tails of the pdf), we can add their probabilities to get the probability of their union: Pr(z ≥ 2.56 or z ≤ −2.56) = Pr(z ≥ 2.56) + Pr(z ≤ −2.56) = 0.0052 + 0.0052 = 0.0104.
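The two tail areas can be checked numerically. This sketch uses the Standard Normal distribution from Python's standard library instead of Table B.

```python
from statistics import NormalDist

z = NormalDist()                 # Standard Normal: mean 0, standard deviation 1

upper_tail = 1 - z.cdf(2.56)     # Pr(z >= 2.56)
lower_tail = z.cdf(-2.56)        # Pr(z <= -2.56)
two_tailed = upper_tail + lower_tail   # disjoint tails, so the areas add
print(round(upper_tail, 4), round(lower_tail, 4), round(two_tailed, 4))
```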
Let X represent scores on the biological section of the MCATs. We are given the fact that X ~ N(9.2, 2.2). The probability of a score of 10.8 or greater is Pr(X ≥ 10.8) = Pr(z ≥ (10.8 − 9.2)/2.2) = Pr(z ≥ 0.73) = 0.2327. Therefore, approximately 23% of those taking the exam will get a score of 10.8 or better.
Chapter 8: Introduction to Statistical Inference
8.1 Breast cancer survival.
Whether a value is a parameter or a statistic often depends on how the research question is stated. The current research question seems to address survival of breast cancer cases in general. Therefore, these 1225 cases must be considered a sample of the larger population of breast cancer cases and all the highlighted calculations represent sample statistics.
8.3 Parameter or statistic?
(a) Statistic
(b) The number 12% is a parameter. The number 8% is a statistic.
(c) All are statistics. (These statistics will be used to infer costs at online and community pharmacies, respectively.)
8.5 Survey of health problems.
(a) True. The standard deviation of .
(b) False. It is not reasonable to assume that the number of health problems per person is Normal. Most people will have 0 or 1 health problem and a small number will have 2, 3, 4, or more problems; the distribution is likely to have a positive skew.
(c) True. Because n is large, we can count on the central limit theorem to make the sampling distribution of x̄ fairly Normal.
8.7 Repeated lab measurements.
(a) The standard deviation of the mean of four measurements .
(b) Assuming the measurements are unbiased, the average is more likely to reflect the true value of the measurement than is a single measurement.
(c) . Solving for n,
. To achieve a standard error of 0.2, use
.
8.9 Pediatric asthma survey, n = 50.
npq = (50)(0.05)(0.95) = 2.375. Therefore, the sample is too small to apply the Normal approximation.
The binomial function X ~ b(50, 0.05) can be applied to random variable X.
8.11 Pediatric asthma survey, n = 250.
npq = (250)(0.05)(0.95) = 11.875. Therefore, the Normal approximation to the binomial can be applied. Note that μ = 250 · 0.05 = 12.5 and σ = √(npq) = √11.875 = 3.446, that is, X ~ N(12.5, 3.446).
1. State: We want to determine Pr(X ≥ 25)
2. Standardize: z = (25 − 12.5)/3.446 = 3.63
3. Sketch:
4. Use Table B: Pr(z ≥ 3.63) ≈ 0.000
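The Normal approximation can be compared with the exact binomial tail. The sketch below follows the steps above and adds an exact sum as a cross-check; the exact sum is not part of the original solution.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 250, 0.05
mu = n * p                          # 12.5
sigma = sqrt(n * p * (1 - p))       # square root of npq

z_stat = (25 - mu) / sigma
approx_tail = 1 - NormalDist().cdf(z_stat)   # Normal approximation to Pr(X >= 25)

# Exact binomial tail: add Pr(X = k) for k = 25, ..., 250
exact_tail = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(25, n + 1))
print(round(sigma, 3), round(z_stat, 2), approx_tail, exact_tail)
```

Both tail areas are very small, which agrees with the Table B result of approximately 0.000.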
8.13 Fill in the blanks.
(a) binomial
(b) p
(c) np
(d)
(e)
8.15 Patient preference.
We start by assuming X ~ b(10, 0.5).
Pr(X ≥ 7)
= Pr(X = 7) + Pr(X = 8) + Pr(X = 9) + Pr(X = 10)
= 0.117188 + 0.043945 + 0.009766 + 0.000977
= 0.1719
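Because p = 0.5, each of the 2¹⁰ possible sequences of preferences is equally likely, so the tail probability reduces to a count. A minimal sketch:

```python
from math import comb

n = 10
# Count the sequences with 7, 8, 9, or 10 successes and divide by all 2**n sequences
favorable = sum(comb(n, k) for k in range(7, n + 1))
tail = favorable / 2**n
print(favorable, tail)   # 176 0.171875
```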
Chapter 9: Basics of Hypothesis Testing
9.1 Misconceived hypotheses.
(a) The null and alternative hypotheses must be set up so that either H0 or Ha is true. Here, it is possible for neither to be true.
(b) Hypotheses must address the parameter (for example, μ), not the statistic (for example, x̄).
(c) Same problem as we identified in (b). These hypothesis statements address sample statistic p̂. They should address population parameter p.
9.3 Patient satisfaction.
(a)
(b) The sketch of the sampling distribution of x̄ is not shown in this key. Note that the curve should be centered on μ = 50 with points of inflection at 48.75 and 51.25. The standard deviation landmarks starting 2 standard errors below the mean are at 47.5, 48.75, 50.0, 51.25, and 52.5.
(c) Notice that (48.8 − 50)/1.25 = −0.96. Therefore, 48.8 is a little less than 1 standard deviation below μ0. This would not be unusual and would not provide strong evidence against H0.
9.5 P from z.
One-sided P = Pr(z ≤ −2.45) = 0.0071 (from Table B)
Two-sided P-value = 2 × 0.0071 = 0.0142
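The conversion from a z-statistic to one-sided and two-sided P-values can be sketched as follows, using the standard library in place of Table B.

```python
from statistics import NormalDist

z_stat = -2.45
lower_area = NormalDist().cdf(z_stat)            # area to the left of z_stat

one_sided = lower_area                           # Pr(z <= -2.45)
two_sided = 2 * min(lower_area, 1 - lower_area)  # double the smaller tail
print(round(one_sided, 4), round(two_sided, 4))
```

Doubling the unrounded tail gives 0.0143 rather than 0.0142; the small difference comes from rounding the one-sided value before doubling.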
9.7 Patient satisfaction (sample mean of 48.8).
(a) Ha: μ < 50
(b) zstat = (48.8 − 50)/1.25 = −0.96
(c) P = Pr(z < −0.96) = 0.1685. Interpretation: If H0 were correct, results this extreme or more extreme would occur about 17% of the time (that is, would not be that unusual). Thus, the sample mean of 48.8 is not significantly different from a population mean of 50.
9.9 LDL and fiber.
The P-value tells you that the observed difference (or one more extreme) would occur about 1 time in 100 if there were no reduction in the population. Because this is unlikely, the results are considered statistically significant.
9.11 Gestational length, African American women, hypothesis test.
A. Hypotheses. H0: μ = 39 versus Ha: μ ≠ 39
B. Test statistic. zstat = −1.17
C. P-value. One-sided P = Pr(z ≤ −1.17) = 0.1210 (from Table B) and two-sided P = 2 × 0.1210 = 0.2420. The evidence against H0 is nonsignificant by usual conventions.
D. Significance level. The results are not significant at α = 0.10 (retain H0).
E. Conclusion. The mean gestational length in this sample of African-American women (38.5 weeks) is not significantly different from the expected population mean of 39 weeks (P = 0.24).
9.13 Gestational length, African American women, sample size.
. Round this up to 168.
9.15 Female administrators.
(a) Conditions for the test: Data represent an SRS of female executives. The distribution of x̄ is approximately Normal.
(b) First note that . Under the null hypothesis
. The sketch of
is shown below.
(c) .
One-sided P = Pr(z ≤ −1.88) = 0.0301
Two-sided P = 2 × 0.0301 = 0.0602
(d) Explanations for the observed difference:
1) Chance (P = 0.0602)
2) Selection bias: The sample is not an SRS of female executives.
3) Confounding: An extraneous variable is lurking in the background. For example, these executives may be younger and/or less experienced on average than the population.
4) Gender bias.
9.17 University men.
The four-step procedure is used to solve the problem:
A. Hypotheses. H0: μ = 69 versus Ha: μ ≠ 69
B. Test statistic. zstat = 1.52
C. P-value. One-sided P = Pr(z ≥ 1.52) = 0.0643. Therefore, the two-sided P = 2 × 0.0643 = 0.1286. This is considered to be nonsignificant by usual conventions.
D. Significance level. Results are not significant at α = 0.10 (retain H0).
E. Conclusion. This group is not significantly taller than average (P = 0.13).
9.19 The criminal justice analogy.
DECISION OF JURY | TRUTH: Did not do crime | TRUTH: Did crime
Not guilty | (a) | (c)
Guilty | (b) | (d)
Declaring an innocent person guilty (b) is analogous to a type I error. Declaring a criminal not guilty (c) is analogous to a type II error. In both hypothesis testing and in the criminal justice system, it is important to avoid a type I error.
9.21 Lab reagent, power analysis.
Conditions: α = 0.05 (two-sided), n = 6, μ0 = 5 weeks, μa = 4.75 weeks, σ = 0.2 weeks. Based on these conditions, .
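The power under these conditions can be approximated numerically. The sketch below assumes the usual large-sample formula for a two-sided z-test, 1 − β ≈ Φ(−z₁₋α/₂ + |μ0 − μa|√n / σ), which ignores the negligible area in the far tail; whether this is the exact expression that belongs in this key is an assumption.

```python
from math import sqrt
from statistics import NormalDist

alpha, n = 0.05, 6
mu0, mua, sigma = 5.0, 4.75, 0.2     # conditions listed in the solution

z_crit = NormalDist().inv_cdf(1 - alpha / 2)    # about 1.96 for a two-sided test
shift = abs(mu0 - mua) * sqrt(n) / sigma        # true difference in standard-error units
power = NormalDist().cdf(-z_crit + shift)       # assumed approximation to 1 - beta
print(round(z_crit, 2), round(shift, 2), round(power, 2))
```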
Chapter 10: Basics of Confidence Intervals
10.1 Misinterpreting a confidence interval.
The pharmacist is incorrect. The confidence interval applies to the population mean μ; it does not apply to the distribution of individual observations.
10.3 Newborn weight.
(a)
(b)
(c)
10.5 SIDS.
The 95% confidence interval for .
Interpret your results. Based on this sample, we have 95% confidence the population mean μ is between 2774 and 3222.
10.7 Hemoglobin.
(a)
(b)
10.9 P-value and confidence interval.
The 95% confidence interval for μ will exclude 0 because the mean is significantly different from 0 at α = 0.05. However, the 99% confidence interval will capture 0 because the mean is not significantly different from 0 at α = 0.01.
10.11 Antigen titer.
95% confidence interval for μ = 7.4033 ± (1.96)(0.0404) = 7.4033 ± 0.0792 = (7.3241 to 7.4825)
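The interval arithmetic can be wrapped in a small helper. This is a sketch; `z_interval` is an illustrative name, and the mean and standard error are the values quoted above.

```python
def z_interval(mean, se, z=1.96):
    """Return (lower, upper) limits of the interval mean ± z · se."""
    margin = z * se
    return mean - margin, mean + margin

low, high = z_interval(mean=7.4033, se=0.0404)
print(round(low, 4), round(high, 4))   # 7.3241 7.4825
```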
10.13 Reverse engineering the confidence interval.
(a) Because the sample mean is the center of the confidence interval, x̄ = (6.5 + 5.7) / 2 = 6.1.
(b) The margin of error is half the confidence interval length: m = ½ · (6.5 − 5.7) = 0.4.
(c) For 95% confidence, m = 1.96 × SE. Therefore, SE = m/1.96 = 0.4/1.96 = 0.204.
(d) 99% confidence interval for μ = 6.1 ± (2.576)(0.204) = 6.1 ± 0.53 = (5.57 to 6.63) pounds.
(e) Yes, the sample mean is significantly different from 7.2 pounds at α = 0.01 because the 99% confidence interval excludes 7.2.
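Parts (a) through (e) chain together naturally in code. The sketch below recovers the standard error from the 95% interval and rebuilds the 99% interval; critical values come from the standard library rather than a table, so the last digits may differ slightly from hand calculation.

```python
from statistics import NormalDist

lower95, upper95 = 5.7, 6.5

xbar = (lower95 + upper95) / 2                 # (a) center of the interval
margin95 = (upper95 - lower95) / 2             # (b) half the interval length
se = margin95 / NormalDist().inv_cdf(0.975)    # (c) m = 1.96 × SE, solved for SE

z99 = NormalDist().inv_cdf(0.995)              # about 2.576
ci99 = (xbar - z99 * se, xbar + z99 * se)      # (d) 99% confidence interval
excludes_7_2 = not (ci99[0] <= 7.2 <= ci99[1]) # (e) significant at alpha = 0.01?
print(round(xbar, 1), round(se, 3), [round(v, 2) for v in ci99], excludes_7_2)
```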
10.15 True or false?
(a) False; 5 is the margin of error.
(b) False; 13 is the point estimate.
(c) True.
10.17 Lab reagent, 90% confidence interval for true concentration.
90% confidence interval for . We conclude with 90% confidence that the true concentration of the solution is between 4.854 and 5.123 (mg/dL).
Chapter 11: Inference About a Mean
11.1 Blood pressure.
(a)
(b) Therefore,
. Round up to the next integer to ensure the stated level of precision. Therefore, resolve to use 107 observations.
11.3 Sketch a curve.
The middle 95% of the curve is defined by t22,0.975 = 2.074 and t22,0.025 = −2.074.
11.5 Probabilities not in Table C.
Pr(T8 > 2.65) is between 0.01 and 0.025 (via Table C). Using a computer program, Pr(T8 > 2.65) = 0.015.
11.7 Software utility programs.
(a) Pr(T8 ≥ 2.65) = 0.0150
(b) Pr(T8 ≥ 2.98) = 0.0088
(c) Pr(T19 ≤ 2.98) = 0.9962
11.9 Critical values for a t-statistic.
First note that df = 21 − 1 = 20. For a one-sided test we get a P-value less than 0.05 when the tstat is either less than −1.725 or more than 1.725, that is, |tstat| ≥ 1.725. For a two-tailed test, we get a P-value less than 0.05 when the tstat is either less than −2.086 or more than 2.086, that is, |tstat| ≥ 2.086.
A. Hypotheses. H0: μ = 29.5 days versus Ha: μ ≠ 29.5 days
B. Test statistic. tstat = (27.78 − 29.5) / 0.9687 = −1.78 with df = 8
C. P-value. Use Table C to determine that t8,0.90 = 1.40 (right tail = 0.10) and t8,0.95 = 1.86 (right tail = 0.05). Therefore, the one-sided P-value is between 0.05 and 0.10 and the two-tailed P-value is 0.10 < P < 0.20. Using a utility program, P = 0.11.
D. Significance level. The evidence against H0 is not significant at α = 0.10 (retain H0).
E. Conclusion. The sample mean (27.78 days) is not significantly different from the hypothesized value of 29.5 days (P = 0.11).
11.13 Menstrual cycle length.
(a) With SE = 0.9687 and t8,0.975 = 2.306, the 95% CI for μ = 27.78 ± (2.306)(0.9687) = 27.78 ± 2.23 = (25.55 to 30.01) days.
(b) The confidence interval includes 28.5. It also includes 30. Therefore, the sample mean is not significantly different from either 28.5 or 30 at α = 0.05.
11.15 Water fluoridation.
(a) DELTA values:
AFTER | BEFORE | DELTA
49.2 | 18.2 | 31.0
30.0 | 21.9 | 8.1
16.0 | 5.2 | 10.8
47.8 | 20.4 | 27.4
3.4 | 2.8 | 0.6
16.8 | 21.0 | −4.2
11.3 | | −0.6
5.7 | 6.1 | −0.4
23.0 | 25.0 | −2.0
17.0 | 13.0 | 4.0
79.0 | 76.0 | 3.0
66.0 | 59.0 | 7.0
46.8 | 25.6 | 21.2
84.9 | 50.4 | 34.5
65.2 | 41.2 | 24.0
52.0 | 21.0 | 31.0
Stemplot of DELTA values:
Interpretation:
• Data spread from −4 to 34.
• The median is about 7.5 (taken from the stemplot).
• The shape of the distribution is difficult to assess because of the small sample size. There are no prominent outliers.
(b) Twelve of the 16 cities (75%) showed improvement; the other 4 (25%) did not.
(c) n = 16
df = 16 − 1 = 15
t15,0.975 = 2.131
95% CI for μd = 12.21 ± (2.131)(3.405) = 12.21 ± 7.26 = (4.95 to 19.47) additional cavity-free children per 100.
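The interval can be reproduced from the DELTA column. In this sketch the critical value 2.131 is copied from the solution rather than computed, because the Python standard library has no t-distribution; the standard error comes out at about 3.404, which differs from the quoted 3.405 only by rounding.

```python
from math import sqrt
from statistics import mean, stdev

# DELTA values (AFTER minus BEFORE) for the 16 cities
delta = [31.0, 8.1, 10.8, 27.4, 0.6, -4.2, -0.6, -0.4,
         -2.0, 4.0, 3.0, 7.0, 21.2, 34.5, 24.0, 31.0]
T_CRIT = 2.131                      # t15,0.975 as quoted in the solution

n = len(delta)
dbar = mean(delta)                  # mean difference
se = stdev(delta) / sqrt(n)         # standard error of the mean difference
ci = (dbar - T_CRIT * se, dbar + T_CRIT * se)
print(n, round(dbar, 2), round(se, 3), [round(v, 2) for v in ci])
```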
11.17 Large t-statistic.
When the sample contains more than just a few observations, the associated t-statistic will have more than a few df and will look very much like a Standard Normal z-distribution. Based on the 68–95–99.7 rule, we would almost never get a test statistic that is 6.6 standard deviations away from 0. Therefore, we can say that the t-test statistic is in the far right-hand tail of the sampling distribution and the P-value will be very, very small (e.g., less than 0.01).
11.19 Vector control in an African village.
(a)
t100−1,1−(0.05/2) = t99,0.975 = 1.984 (via computer applet). If a computer program is not available, you can make use of the fact that t99,0.975 ≈ t100,0.975 = 1.984.
95% CI for μ = 249 ± (1.984)(3.982) = 249 ± 7.9 = (241.1 to 256.9) square feet.
(b) It would not be correct to make this statement. The confidence interval applies to population mean μ, not to individual observations.
11.21 Boy height.
df = 26 − 1 = 25
t25,0.975 = 2.060
95% CI for μ = 63.8 ± (2.060)(0.608) = 63.8 ± 1.3 = (62.5 to 65.1) inches.
11.23 Faux pas.
(a) Data
VISIT1 | VISIT2 | DELTA
5 | 4 | −1
13 | 11 | −2
17 | 12 | −5
3 | 3 | 0
20 | 14 | −6
18 | 14 | −4
8 | 10 | 2
15 | 9 | −6
(b) .
(c) The stemplot is:
• Six of the eight students (75%) showed a decline in the number of faux pas.
• Data range from −6 to 2 (spread).
• The median is about −3, midway between the two middle DELTA values of −4 and −2 (location).
• The distribution is mound shaped with no apparent outliers, and shows no dramatic departures from Normality.
(d) Yes, a t-procedure can be used because the data are mound-shaped and there are no major departures from Normality.
(e) Hypothesis test.
A. Hypotheses. H0: μd = 0 versus Ha: μd ≠ 0
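B. Test statistic. The mean of DELTA is −2.75 and the standard deviation is 2.964, so tstat = (−2.75 − 0) / (2.964 / √8) = −2.62 with df = 8 − 1 = 7.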
C. P-value. P = 0.034 (via applet). Using Table C, 0.025 < P < 0.05. This P-value provides good evidence against H0.
D. Significance level. The difference is significant at α = 0.05 but not at α = 0.01.
E. Conclusion. There was a significant reduction in the number of faux pas after the intervention (P = 0.034).
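The paired t-statistic for this exercise can be recomputed from the VISIT1 and VISIT2 values listed in part (a). The following is a minimal sketch using only the Python standard library; the function name is hypothetical, and the P-value is left to a t table or applet because the standard library has no t distribution.

```python
import math
import statistics

# VISIT1 and VISIT2 values from the data table in part (a)
visit1 = [5, 13, 17, 3, 20, 18, 8, 15]
visit2 = [4, 11, 12, 3, 14, 14, 10, 9]

# DELTA = VISIT2 - VISIT1 (negative values indicate fewer faux pas)
delta = [after - before for before, after in zip(visit1, visit2)]


def paired_t_statistic(differences, mu0=0.0):
    """Return (t statistic, df) for a one-sample t-test on paired differences."""
    n = len(differences)
    mean_d = statistics.mean(differences)
    sd_d = statistics.stdev(differences)  # sample SD (n - 1 denominator)
    se = sd_d / math.sqrt(n)
    return (mean_d - mu0) / se, n - 1


t_stat, df = paired_t_statistic(delta)
print(f"tstat = {t_stat:.2f} with df = {df}")
```

The result can then be looked up in Table C or passed to an applet to obtain the two-sided P-value.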
11.25 Beware α = 0.05.
(a) P = 0.0464. Yes, the test is statistically significant at α = 0.05.
(b) P = 0.0514. No, the test is not statistically significant at α = 0.05 (P > α).
(c) It is not reasonable to derive different conclusions because the observed mean changes are identical and the P-values are nearly identical, both providing fairly strong evidence against the null hypothesis.
11.27 Benign prostatic hyperplasia, maximum flow.
A. Hypotheses. H0: μd = 0 versus Ha: μd ≠ 0
B. Test statistic. n = 10, sample mean difference = 3, sd = 4.6188, tstat = 2.054 with df = 10 − 1 = 9
C. P-value. P = 0.072. The evidence against H0 is marginally significant.
D. Significance level. The evidence against the null hypothesis is significant at α = 0.10 but not at α = 0.05.
E. Conclusion. Maximum urine flow increased by an average of 3 units. By standard conventions, this result is deemed marginally significant (P = 0.072).
t14,0.975 = 2.145
95% confidence interval for μ = 4.67 ± (2.145)(0.4493) = 4.67 ± 0.96 = (3.71 to 5.63).
This result is compatible with random “5 out of 10” guessing.
Chapter 12: Comparing Independent Means
12.1 Sampling designs.
(a) Independent samples
(b) Paired samples
(c) Single sample
12.3 Facetious data.
(a) The mean in group 1 is 98. The mean in group 2 is 108. The mean difference is 98 − 108 = −10. This mean difference is based on independent samples.
(b) The mean change in group 1 is equal to 4. This mean difference is based on paired samples.
(c) The mean change in group 2 is equal to 1. This mean difference is based on paired samples.
(d) Group 1 had a greater mean change by 4 − 1 = 3 units. This is an independent comparison.
(a) Stemplots
Discussion:
• Site 1 has a high outlier.
• The distributions have similar central locations.
• Site 1 has greater variability.
(b) Means and standard deviations
(c) The summary statistics confirm that the distributions have similar central locations and that site 1 has much greater variability.
12.7 Air samples.
(a)
(b) dfconserv = 7; t7,0.975 = 2.365
The 95% CI for μ1 − μ2 = (36.25 − 36.00) ± (2.365)(5.247) = 0.25 ± 12.41 = (−12.16 to 12.66) μg/m3
(c) tstat = (36.25 − 36.00) / 5.247 = 0.048 with df = 7. The one-sided P-value (by Appendix Table C) is greater than 0.25. Therefore, the two-tailed P > 0.50. Using a software utility (www.cytel.com/Products/StaTable/), P = 0.96. There is no significant difference in means.
12.9 Sample size calculation.
n = (2)(0.67²)(1.28 + 1.96)² / 0.25² = 150.80. Resolve to study 151 individuals in each group.
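This sample-size arithmetic can be sketched in Python. The function and argument names are hypothetical. Reading 0.67 as the standard deviation and 0.25 as the difference worth detecting is an assumption based on the shape of the formula. The z values (1.96 and 1.28) are the rounded values used in the hand calculation.

```python
import math


def n_per_group_two_means(sigma, delta, z_alpha, z_power):
    """n per group = 2 * sigma^2 * (z_power + z_alpha)^2 / delta^2."""
    return 2 * sigma**2 * (z_power + z_alpha) ** 2 / delta**2


# Assumed roles: sigma = 0.67, delta = 0.25, z for two-sided alpha = 1.96, z for power = 1.28
n_exact = n_per_group_two_means(sigma=0.67, delta=0.25, z_alpha=1.96, z_power=1.28)
n_required = math.ceil(n_exact)  # always round up to a whole subject
print(f"n = {n_exact:.2f}, so study {n_required} per group")
```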
(a) One sample
(b) Paired samples
12.13 Risk taking behavior in boys and girls.
Here are the boxplots:
The girls have lower scores on average and less variability. There is one outlier in the girls group.
12.15 Scrapie treatment, delay of death.
A. Hypotheses. H0: μ1 = μ2 versus Ha: μ1 ≠ μ2
B. Test statistic. The standard error of the mean difference is 5.91, and tstat = (116 − 88.5)/5.91 = 4.65. Since n1 = 10 and n2 = 10, dfconserv = 10 − 1 = 9.
C. P-value. The two-sided P-value is between 0.001 and 0.002. Using a computer applet the two-sided P-value = 0.0012.
D. Significance level. The evidence against the null hypothesis is significant at α = 0.01.
E. Conclusion. The treated hamsters survived significantly longer than the control hamsters (mean survival 116 vs. 88.5 days, P = 0.0012).
12.17 Bone density in newborns.
(a) The infants of smoking mothers had slightly higher bone density on average compared to those of the nonsmoking mothers (0.098 compared to 0.095 g/cm3).
The standard error of the mean difference is 0.003558. dfconserv = the lesser of (n1 − 1) or (n2 − 1) = 77 − 1 = 76. Since this df is not in Appendix Table C, use the next smallest df (60) to derive t60,1−(0.05/2) = t60,0.975 = 2.000. The 95% confidence interval for μ1 − μ2 = (0.098 − 0.095) ± (2.000)(0.003558) = 0.003 ± 0.007 = (−0.004 to 0.010) g/cm3.
(b) In testing H0: μ1 − μ2 = 0, the value of the mean difference under the null hypothesis is 0. Since the 95% confidence interval for μ1 − μ2 includes a value of 0, you would retain H0 at α = 0.05 and conclude no significant difference in the mean bone densities of newborns from smoking and nonsmoking mothers.
12.19 Efficacy of echinacea, severity of symptoms.
A. Hypotheses. H0: μ1 − μ2 = 0 versus Ha: μ1 − μ2 ≠ 0.
B. Test statistic. ;
dfconserv = 337 − 1 = 336.
C. P-value. Use the row for 100 df in Table C to derive a conservative estimate of the P-value. Thus, P > 0.50. (A statistical applet derived P = 0.5691.)
D. Significance level. The evidence against the null hypothesis is not statistically significant at any reasonable level for α.
E. Conclusion. There was no significant difference in the mean severity of symptoms in the echinacea treatment group and the control group (P = 0.57).
12.21 Calcium supplementation and blood pressure, exploration.
Here are the side-by-side boxplots:
The plot shows that the calcium treated group had a greater average decline in blood pressure. They also exhibited greater variability. There is a high outside value in the placebo group.
12.23 Delay in discharge.
A. Hypotheses. H0: μ1 = μ2 versus Ha: μ1 ≠ μ2
B. Statistics calculated with SPSS > Analyze > Compare means > Independent Samples T
tstat = 2.121 with dfWelch = 21.7
C. P = 0.046
D. Significance level. The observed difference is significant at α = 0.05 but not at α = 0.04.
E. Conclusion. The mean delay at facility A was significantly greater than that at facility B (14.1 vs. 11.0 days, P = 0.046).
12.25 Time spent sitting or walking.
A. Hypotheses. This exercise seeks to answer whether lean and obese people differ in the average time they spend sitting. Under the null hypothesis, the population means are equal, so we test H0: μ1 = μ2 versus Ha: μ1 ≠ μ2.
B. Test statistics. Statistics calculated with SPSS > Analyze > Compare means > Independent Samples T. Output shown.
tstat = −4.201 with dfWelch = 15.2 (unequal variance t-procedure).
C. P = 0.001.
D. Significance level. The observed difference is significant at α = 0.001.
E. Conclusion. The lean individuals spent significantly less time sitting per day than the obese individuals (407 vs. 571 min, P = 0.001).
Chapter 13: Comparing Several Means (One-Way Analysis of Variance)
13.1 Birth weight, exploration.
(a) This study is nonexperimental (“observational”) because the investigator did not assign the explanatory factor (smoking) to study participants.
(b) Outline of study design:
(c) Interpretation of Figure 13.5.
• Location: Nonsmokers and ex-smokers have higher average birth weights than smokers.
• Spread: It is difficult to evaluate variability in such small samples. However, hinge-lengths (IQRs) seem comparable.
• Shape: The samples are too small (six to eight in each group) to make definitive statements about their shapes; there is an outside value in group 2.
13.3 Smoking and birth weight, ANOVA.
A. Hypotheses. H0: μ1 = μ2 = μ3 = μ4 versus Ha: at least two of the μis differ
B. Test statistic. ANOVA table (below). Fstat = 4.096/1.255 = 3.26 with 3 and 21 df.
C. P-value. P < 0.05 by Appendix Table D (use the 3 and 20 df column). P = 0.042 by a software utility.
D. Significance level. The results are significant at α = 0.05 but are not significant at α = 0.04.
E. Conclusion. Mean birth weights differed significantly according to the smoking status of the mothers (P = 0.042). Infants of nonsmoking mothers demonstrated the highest average birth weight (7.9 pounds), while the mothers who smoked at least half a pack per day demonstrated the lowest (6.3 pounds).
13.5 Smoking and birth weight, post hoc comparisons.
(a) Least significant difference (LSD) tests
Calculated by SPSS (Rel. 11.0.1. 2001. Chicago: SPSS Inc.)
Summary in concise narrative form: Differences between groups 1 and 4 and between groups 2 and 4 are statistically significant. The difference between groups 1 and 3 is marginally significant.
(b) Bonferroni’s method tests
Calculated with SPSS (Rel. 11.0.1. 2001. Chicago: SPSS Inc.)
Summary in concise narrative form: The difference between groups 1 and 4 is marginally significant. All other differences are not statistically significant.
(c) Post hoc confidence intervals incorporating Bonferroni’s correction
Calculated with SPSS (Rel. 11.0.1. 2001. Chicago: SPSS Inc.)
13.7 Smoking and birth weight. Kruskal–Wallis test.
A. Hypotheses. H0: birth weight distributions in the four populations are the same versus Ha: the populations differ
B. Test statistic. Chi-square = 7.305 with 3 df (Calculated with SPSS, Rel. 11.0.1. 2001. Chicago: SPSS Inc.)
C. P-value. P = 0.063 (“marginal significance”)
D. Significance level. The results are significant at α = 0.10 (reject H0) but are not significant at α = 0.05 (retain H0).
E. Conclusion. The birth weight distributions differ, with evidence rising to marginal significance (P = 0.063).
13.9 Antipyretic trial.
(a) Here is the plot:
This plot shows that aspirin and ibuprofen are associated with greater average fever reduction than acetaminophen. Results with ibuprofen are more variable (larger IQR) than with aspirin.
(b) Means and standard deviations (calculated with SPSS, Rel. 11.0.1. 2001. Chicago: SPSS Inc.)
(c) Hypothesis test
A. Hypotheses. H0: μ1 = μ2 = μ3 versus Ha: at least two of the μis differ
B. Test statistic. Calculated with SPSS, Rel. 11.0.1. 2001. Chicago: SPSS Inc.
C. P-value. P = 0.030, showing the differences to be statistically significant.
D. Significance level. The results are significant at α = 0.05 (reject H0) but are not significant at α = 0.01 (retain H0).
E. Conclusion. There is a significant difference in the mean reduction in body temperature according to analgesic type (P = 0.030). Aspirin reduced fever by an average of 1.26°F, ibuprofen reduced fever by an average of 1.20°F, and acetaminophen reduced fever by an average of 0.25°F.
(d) Here are the post hoc comparisons via the LSD method:
*The mean difference is significant at the 0.05 level.
The aspirin group and acetaminophen group differ significantly (P = 0.023), as do the ibuprofen group and acetaminophen group (also P = 0.023). The aspirin group and ibuprofen group means do not differ significantly (P = 0.888).
Chapter 14: Correlation and Regression
14.1 Bicycle helmet use.
(a) Here is the scatterplot:
A negative linear relationship is evident. There is a possible outlier in the upper right-hand quadrant (observation 13, Los Arboles).
(b) r = −0.581 (n = 13)
(c) Discuss what this (the potential outlier) means in plain terms… This observation had high helmet use and low socioeconomic status.
(d) r = −0.849 (n = 12)
The correlation went from moderate strength to strong after removing the outlier, indicating a much better fit of data points to the negative trend line.
(e) Hypothesis test
A. Hypotheses. H0: ρ = 0 versus Ha: ρ ≠ 0
B. Test statistic. tstat = −5.08 with 10 df
C. P-value. P = 0.00048 (strong evidence against the null hypothesis)
D. Significance level. The evidence against H0 is significant at α = 0.01 (reject H0).
E. Conclusion. The observed negative association between the receipt of free or reduced-fee lunches at school and the prevalence of bicycle helmet use is statistically significant (r = −0.85, n = 12, P = 0.00048).
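The test statistic in step B follows from r and n alone. Here is a minimal Python sketch of that conversion; the function name is hypothetical, and the P-value is not computed because the standard library has no t distribution.

```python
import math


def correlation_t_statistic(r, n):
    """Return (t statistic, df) for testing H0: rho = 0, where t = r * sqrt((n - 2) / (1 - r^2))."""
    df = n - 2
    return r * math.sqrt(df / (1 - r**2)), df


# r = -0.849 with n = 12 (outlier removed), as in part (d)
t_stat, df = correlation_t_statistic(r=-0.849, n=12)
print(f"tstat = {t_stat:.2f} with {df} df")
```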
14.3 Bicycle helmet use, n = 12.
(a) Regression coefficients
a = 47.490
b = −0.539
Interpretation of b: For each additional “percent children receiving reduced-fee or free meals,” the model predicts a 0.5% decline in bicycle helmet use.
Interpretation of a: This is where the regression line would cross the Y axis, that is, where x = 0.
(b) The 95% CI for β = (−0.775 to −0.303).
(c) The slope is significant at α = 0.05 because the 95% confidence interval does not include 0.
(d) Here is the stemplot of residuals:
There are no major departures from Normality.
14.5 Anscombe’s quartet.
I would use correlation or linear regression to analyze data set I because there appears to be a positive linear trend. I would not use correlation or regression to analyze the other data sets because these relations cannot be accurately described with a single straight line.
14.7 Domestic water and dental cavities, range restriction.
(a) The range below 1 ppm fluoride demonstrates a fairly straight relationship. The least squares regression line for this range has these coefficients:
a = 780.34
b = −528.07
Therefore, the regression line for this range is ŷ = 780.34 + (−528.07)X.
The slope in this range predicts a decline of 528 caries per ppm of fluoride (equivalently, 52.8 fewer caries per 0.1 ppm fluoride).
(b) r² = 0.856. The fit of this model is not as good as the ln–ln model calculated in Exercise 14.6, in which r² = 0.947.
(c) Opinions on this matter will differ. I prefer this model. Although its fit is not as good as the ln–ln model (Exercise 14.6), this model is (a) easier to interpret and (b) addresses a useful biological range. The major declines in caries occur in the 0 to 0.8 range. Since higher levels have only modest benefits, and other sources suggest toxicity with high fluoride, it seems reasonable to restrict the analysis to this biologically relevant range.
14.9 True or false.
(a), (c), (e), and (g) are false. The others are true.
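14.11 (x̄, ȳ) is always on the least squares regression line.
The intercept of the least squares line is a = ȳ − b·x̄. When x = x̄, the predicted value is ŷ = a + b·x̄ = (ȳ − b·x̄) + b·x̄ = ȳ. Therefore, the point (x̄, ȳ) is always on the least squares regression line.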
14.13 Nonexercise activity thermogenesis (NEAT).
(a) The scatterplot reveals a linear negative association between NEAT and FATGAIN. There are no apparent outliers. The relationship appears to be moderate in strength. (See prior comments about the difficulty judging correlational strength by eye alone.)
The least squares regression line is . This linear relationship is statistically significant (P = 0.00061). Each calorie unit of NEAT predicts 0.0034 fewer kilograms of fat gained. Equivalently, each 100 calories of NEAT predicts 0.34 fewer kilograms of fat gained.
(b) .
(c) The regression line is not shown in this key.
(d) Observation 1 (x1, y1) = (−100, 4.2). The predicted value for this observation and
.
Observation 2 (x2, y2) = (−60, 3.0). The predicted value for this observation is and
.
Observation 3 (x3, y3) = (−20, 3.8). The predicted value for this observation is and
.
The graph showing the residuals is not shown in this key.
14.15 Gorilla ebola.
(a) DISTANCE is the independent variable. ONSET is the dependent variable.
(b) The scatterplot reveals a positive linear association with no apparent outliers.
(c) r = 0.962, demonstrating that the correlation is extremely strong.
(d) r² = 0.962² = 0.93
(e) The least squares regression model is ONSET = −8.09 + 11.26 · DISTANCE. Results calculated with SPSS v 20.0.0 are shown in the following table.
The slope predicts that it takes, on average, 11.3 days for the outbreak to move from one band of gorillas to the next.
Chapter 15: Multiple Linear Regression
15.1 The relation between FEV and SEX in the illustrative data set.
The simple regression model is FEV = 2.451 + (0.361)(SEX). Because SEX is coded 0 = female and 1 = male, the intercept (2.451) represents mean FEV for female subjects and the slope represents the mean difference between females and males. Therefore, the mean FEV for males is 0.361 (L/sec) higher than that for females.
Adding AGE to the model results in FEV = 0.281 + (0.323)(SEX) + (0.220) (AGE). To address whether AGE confounded the observed relationship between SEX and FEV, we consider the effect of adding AGE to the regression model, which reduced the slope of SEX slightly, from 0.361 to 0.323 (11% decrease in relative terms). Thus, the potential for AGE to confound the relationship between SEX and FEV is minimal.
Chapter 16: Inference About a Proportion
16.1 AIDS-related risk factor.
(a) The population to which inference will be made is U.S. adult heterosexuals at the time of the survey. The parameter of interest is the proportion of individuals with multiple sexual partners. The sample proportion = 170/2673 = 0.063598 ≈ 0.064 or 6.4%.
(b) Examples of selection biases that may be pertinent: (1) Specific high-risk groups (for example, homeless, intravenous drug users) may not have permanent telephone lines and may be underrepresented in the sample. (2) Nonresponse should be scrutinized.
(c) Without a specific validation study to address this issue, it is difficult to determine the accuracy of responses. However, we may hypothesize that the data could have underestimated the true prevalence if respondents underreported sexual behavior out of fear of embarrassment or reprisal.
16.3 AIDS-related risk factor.
(a) Sampling distributions: Let X represent the number of individuals who are positive for the attribute. Random variable X has a binomial distribution with n = 2673 and parameter p. The value of p is unknown. Using the notation established in this chapter, X ~ b(n = 2673, p = unknown).
(b) A Normal approximation can be used if p is not too small. Suppose, for example, p = 0.01. Then npq = (2673)(0.01)(0.99) = 26.5. Since this exceeds 5, the sampling distribution of X is approximately Normal: X ~ N(μ = 2673 · p, σ = √(2673 · p · q)). In contrast, if p is very small (say, 0.0001), then a Normal approximation cannot be used because npq = 2673 · 0.0001 · 0.9999 = 0.267, which is less than 5.
16.5 AIDS-related risk factor.
A. Hypotheses. H0: p = 0.075 versus Ha: p ≠ 0.075
B. Test statistic. We start by checking the npq rule: np0q0 = 2673(0.075) (0.925) = 185. Therefore, the sample is large enough to use the z-test.
C. P-value. P = 0.025 via Appendix Table F (good evidence against H0)
D. Significance level. The evidence against H0 is significant at α = 0.05 (reject H0) but is not significant at α = 0.01 (retain H0).
E. Conclusion. The current prevalence of 6.4% is significantly less than the historical level of 7.5% (P = 0.025).
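The z-test in this exercise can be sketched with the Python standard library. The function name is hypothetical. The counts (170 of 2673) come from Exercise 16.1, and the standard error is computed under H0, as is usual for this test.

```python
import math
from statistics import NormalDist


def one_sample_proportion_z_test(x, n, p0):
    """Return (p_hat, z, two-sided P) for H0: p = p0 using the Normal approximation."""
    p_hat = x / n
    se0 = math.sqrt(p0 * (1 - p0) / n)  # standard error assuming H0 is true
    z = (p_hat - p0) / se0
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_hat, z, p_two_sided


p_hat, z_stat, p_value = one_sample_proportion_z_test(x=170, n=2673, p0=0.075)
print(f"p-hat = {p_hat:.4f}, zstat = {z_stat:.2f}, P = {p_value:.3f}")
```

The same function could be applied to the other one-sample z-tests in this chapter once their counts are known.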
16.7 Patient preference, Fisher’s method.
A. Hypotheses. H0: p = 0.50 versus Ha: p > 0.50
B. Test statistic. An exact binomial test is used because of the small sample size. Under H0, X ~ b(8, 0.5). The observed number of success in the sample is x = 7.
C. P-value. P (one-sided) = Pr(X = 7) + Pr(X = 8) = 0.0313 + 0.0039 = 0.0352. This provides good evidence against the null hypothesis.
D. Significance level. The evidence against H0 is significant at α = 0.05 (reject H0) but is not significant at α = 0.01 (retain H0).
E. Conclusion. The data provide reliable evidence that more than half the patient population favors procedure A (P = 0.0352).
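The exact one-sided P-value is a binomial upper-tail sum, which can be sketched as follows. The function name is hypothetical.

```python
import math


def binomial_upper_tail(x, n, p):
    """Return Pr(X >= x) for X ~ b(n, p) by summing exact binomial probabilities."""
    return sum(
        math.comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(x, n + 1)
    )


# Under H0, X ~ b(8, 0.5); observed x = 7
p_value = binomial_upper_tail(x=7, n=8, p=0.5)
print(f"One-sided P = {p_value:.4f}")
```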
16.9 AIDS-related risk factor.
The 95% confidence interval for population prevalence p = 0.0643 ± (1.96) (0.004740) = 0.0643 ± 0.0093 = (0.055 to 0.074) or (5.5% to 7.4%).
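The point estimate 0.0643 and standard error 0.004740 are consistent with a plus-four interval applied to the counts from Exercise 16.1 (170 of 2673). That reading is an assumption, not something the key states. A minimal sketch under that assumption, with a hypothetical function name:

```python
import math


def plus_four_ci(x, n, z=1.96):
    """Plus-four CI for a proportion: add 2 successes and 2 failures before computing."""
    n_tilde = n + 4
    p_tilde = (x + 2) / n_tilde
    se = math.sqrt(p_tilde * (1 - p_tilde) / n_tilde)
    return p_tilde - z * se, p_tilde + z * se


# Counts assumed from Exercise 16.1: 170 positives among 2673 respondents
lower, upper = plus_four_ci(x=170, n=2673)
print(f"95% CI for p: {lower:.3f} to {upper:.3f}")
```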
16.11 Patient preference.
The 95% confidence interval for p by Fisher’s method is 0.473 to 0.997. The 95% confidence interval for p by the Mid-P method is 0.520 to 0.994. Confidence intervals calculated with WinPepi describe.exe version 1.5.1.
16.13 Cerebral tumors and cell phone use.
A. Hypotheses. H0: p = 1/2 against Ha: p ≠ 1/2
B. Test statistic.
C. P-value. P = 0.085 via Table F. Results suggest that evidence against H0 is marginally significant (by usual conventions).
D. Significance level. The results are significant at α = 0.10 (reject H0) but not significant at α = 0.05 (retain H0).
E. Conclusion. Data provide some evidence that the tumors occurred more frequently on the same side of the head as cellular phone use (P = 0.085).
Additional notes
• With continuity correction, zstat,c = 1.56 and P = 0.12
• The Fisher’s test derives P = 0.117
• The published article (Muscat et al., 2000) reported P = 0.06 based on a one-sided goodness of fit test with continuity correction (Muscat 2006, personal communication). This one-sided P-value is half of the P-value from our two-sided continuity-corrected z-test (0.12 / 2 = 0.06), so the two results agree.
16.15 Insulation workers.
A. Hypotheses. H0: p = 0.0259 versus Ha: p ≠ 0.0259
B. Test statistic. First check whether the Normal approximation can be used: np0q0 = (556)(0.0259)(1 − 0.0259) = 14.0. Therefore, the Normal (z-statistic) method is OK.
C. P-value. P = 0.0020 (“highly significant” by usual conventions).
D. Significance level. Data provide significant evidence against H0 at α = 0.01.
E. Conclusion. The incidence of cancer deaths in these insulation workers (4.68%) is significantly greater than the expected incidence of 2.59% (P = 0.0020).
16.17 Kidney cancer survival.
A. Hypotheses. H0: p = 0.2 versus Ha: p ≠ 0.2
B. Test statistic. The z-test can be used because np0q0 = (40)(0.2)(0.8) = 6.4. The observed proportion
C. P-value. P = 0.00158 by Table F, providing highly significant evidence against H0.
D. Significance level. The evidence against H0 is significant at α = 0.01 (reject H0).
E. Conclusion. There has been a significant improvement in survival (P = 0.0016).
Note: Using the exact Mid-P procedure (calculated with WinPepi describe.exe 1.5.1) the two-sided P = 0.0039.
16.19 Sample-size requirement.
Conditions: 95% confidence; p* = 0.50 (since no educated guess for p is available); desired margin of error m = 0.06.
Calculation: n = (1.96²)(0.50)(0.50) / 0.06² = 266.8. Therefore, resolve to study 267 individuals.
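The sample-size requirement can be sketched with the usual formula n = z² · p*(1 − p*) / m², rounding up. The function name is hypothetical, and the formula is the standard one for estimating a single proportion rather than a quotation from the key.

```python
import math


def n_for_proportion(margin, p_star=0.5, z=1.96):
    """Sample size needed to estimate a proportion within +/- margin at the given z."""
    return z**2 * p_star * (1 - p_star) / margin**2


# Conditions stated above: 95% confidence, p* = 0.50, m = 0.06
n_exact = n_for_proportion(margin=0.06, p_star=0.50, z=1.96)
n_required = math.ceil(n_exact)
print(f"n = {n_exact:.1f}, so study {n_required} individuals")
```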
16.21 Alternative medicine.
95% CI for population prevalence
. We conclude with 95% confidence that between 41.5% and 46.5% of the population would use alternative medicine if traditional medical care failed to produce the desired results.
Note: The plus-four confidence interval method is unnecessary with this large sample size—a straight Normal approximation would have sufficed. However, there is no harm in using the plus-four method.
16.23 Perinatal growth failure.
A. Hypotheses. H0: p = 0.025 versus Ha: p > 0.025
B. Test statistic. x = 8
C. P-value. P = Pr(X ≥ 8 | X ~ b(33, 0.025)) = 6.5 × 10⁻⁷ (“highly significant”).
D. Significance level. Data provide significant evidence against H0 at extremely low α levels.
E. Conclusion. Infants with perinatal growth failure syndrome have a higher incidence of very-low intelligence scores at age 8 compared to the general population (P < 0.0001).
16.25 Incidence of improvement.
Incidence
Confidence interval for p by the plus-four method:
95% confidence interval for p = 0.2785 ± (1.96)(0.0504) = 0.2785 ± 0.0988 = 0.1797 to 0.3773 or about 18% to 38%.
We can conclude with 95% confidence that between 18% and 38% of this population shows spontaneous improvement within a month.
16.27 Familial history of breast cancer, sample size requirements.
(a)
. Therefore, resolve to study 941 individuals.
(b) . Therefore, resolve to study 453 individuals.
(c) The larger expected differences diminished the sample size requirement of the study.
(d)
. Therefore, resolve to study 1291 individuals.
(e) The lower α level increased the sample size requirement of the study.
16.29 Freshman binge drinking.
(a) This is a large sample, so we could go directly to the large sample formula. However, there is no harm in using the plus-four method. Thus, , and ñ = (5266 + 4) = 5270, and the 95% CI for
. We can now state with 95% confidence that the population prevalence is between 33.0% and 35.5%.
(b) The 99% CI for . We can now state with 99% confidence that the population prevalence is between 32.6% and 36.0%.
(c) Yes. Both the 95% confidence interval and 99% confidence interval for p exclude a population proportion of 20%. This is equivalent to saying that the evidence against H0: p = 0.20 is reliable at both α = 0.05 and 0.01 levels of statistical significance.
Chapter 17: Comparing Two Proportions
17.1 Prevalence of cigarette use among two ethnic groups.
(a) The sampling distribution of p̂1 − p̂2 will be approximately Normal with mean μ = p1 − p2 = 0.40 − 0.12 = 0.28 and standard deviation 0.01859. In symbols, p̂1 − p̂2 ~ N(0.28, 0.01859)
(b) (from Table B)
17.3 Cytomegalovirus and coronary restenosis.
(a) Risk in the CMV+ group
Risk in the CMV− group
Risk difference 1−
2 = 0.4286 − 0.0769 = 0.3517
(b) 95% confidence by the plus-four method:
95% confidence interval for p1 − p2 = (0.4314 − 0.1071) ± (1.96) (0.0907) = 0.3243 ± 0.1778 = (0.1465, 0.5021).
We are 95% confident that CMV increases the risk of restenosis by between 14.7% and 50.2%.
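The plus-four interval in part (b) can be reproduced with a short sketch. The counts used here (21 of 49 in the CMV+ group and 2 of 26 in the CMV− group) are an assumption: they are inferred from the reported risks and from the group sizes given in Exercise 18.3. The function name is hypothetical.

```python
import math


def plus_four_diff_ci(x1, n1, x2, n2, z=1.96):
    """Plus-four CI for p1 - p2: add 1 success and 1 failure to each group."""
    p1 = (x1 + 1) / (n1 + 2)
    p2 = (x2 + 1) / (n2 + 2)
    se = math.sqrt(p1 * (1 - p1) / (n1 + 2) + p2 * (1 - p2) / (n2 + 2))
    diff = p1 - p2
    return diff - z * se, diff + z * se


# Assumed counts: 21/49 restenosis in CMV+, 2/26 in CMV-
lower, upper = plus_four_diff_ci(x1=21, n1=49, x2=2, n2=26)
print(f"95% CI for p1 - p2: {lower:.4f} to {upper:.4f}")
```

Small differences in the fourth decimal place relative to the hand calculation come from rounding intermediate values.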
(c) Here are results calculated by WinPepi Compare2.exe (version 1.38).
DIFFERENCE (A minus B) = 0.324 SE = 0.091
Large-sample method (Fleiss), continuity-corrected:
90% CI = 0.147 to 0.501
95% CI = 0.119 to 0.530
99% CI = 0.063 to 0.586
Wilson’s score method:
Not continuity-corrected (Newcombe’s method 10):
90% CI = 0.153 to 0.455
95% CI = 0.117 to 0.477
99% CI = 0.044 to 0.518
Continuity-corrected (Newcombe’s method 11):
90% CI = 0.131 to 0.469
95% CI = 0.094 to 0.490
99% CI = 0.022 to 0.529
The 95% confidence interval by the Wilson score method, not continuity-corrected (bold face) most closely corresponds with our plus-four method.
17.5 Joseph Lister and antiseptic surgery.
p̂1 = 0.457143
p̂2 = 0.1500
Hypothesis test
A. Hypotheses. H0: p1 = p2 versus Ha: p1 ≠ p2
B. Test statistic. zstat = 2.91
C. P-value. P = 0.0036, providing highly significant evidence against the null hypothesis.
D. Significance level. The evidence against H0 is significant at α = 0.005 (reject H0).
E. Conclusion. Adoption of antiseptic surgical techniques decreased postoperative mortality from 45.7% to 15.0% (P = 0.0036).
17.7 Induction of labor and meconium staining.
(a) Estimates
(b) The following table of expected values shows that all expected values exceed 5. Therefore, an exact procedure is unnecessary.
(c) Hypothesis test using z-procedure
A. Hypothesis statements. H0: p1 = p2 versus Ha: p1 ≠ p2
B. Test statistic.
C. P = 0.00133 (via Appendix Table F).
D. Significance level. The evidence against the null hypothesis is significant at the α = 0.002 level but not at the α = 0.001 level.
E. Conclusion. Induction of labor significantly lowered incidence of meconium staining (P = 0.0013).
(d) The P-value by Fisher’s test is 0.0014, which is not materially different from the P-value derived by the z-test.
17.9 Framingham Heart Study.
The incidence proportion in the high cholesterol group .
The incidence proportion in the low cholesterol group .
.
Note that ln(RR) = 1.2276 and the standard error of ln(RR) = 0.27847. The 95% CI for lnRR = 1.2276 ± (1.960)(0.27847) = 1.2276 ± 0.54580 = 0.6818 to 1.77340. Therefore, the 95% confidence interval for the RR = e^(0.6818, 1.77340) = (1.98 to 5.89). Interpretation: We can be 95% confident that the RR in the source population is between 1.98 and 5.89 (i.e., two to six times the risk of coronary heart disease in the high cholesterol group compared to the low cholesterol group).
17.11 Sample-size plan.
Assumptions: α = 0.05, p1 = 0.20, p2 = 0.30, average risk p̄ = (0.20 + 0.30)/2 = 0.25, and n1 = n2 = n.
n = [1.96·√(2 × 0.25 × 0.75) + 0.84·√(0.20 × 0.80 + 0.30 × 0.70)]² / (0.20 − 0.30)² = 292.8. Therefore, resolve to study 293 individuals in each group.
WinPEPI estimated a sample size requirement of 294 per group. The discrepancy between the hand calculated estimate of 293 and WinPEPI’s estimate of 294 is inconsequential and is due to rounding error (e.g., using z0.80 = 0.84 instead of z0.80 = 0.842).
The earlier calculation assumes no continuity correction. Incorporation of a continuity correction factor results in
per group.
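The calculation without continuity correction can be sketched in Python. The formula below is the standard one for comparing two proportions with equal group sizes; treating it as the formula behind the hand calculation is an assumption, although it reproduces both the 293 and the 294 figures. The function name is hypothetical.

```python
import math


def n_per_group_two_proportions(p1, p2, z_alpha, z_power):
    """Per-group sample size for comparing two proportions (no continuity correction)."""
    p_bar = (p1 + p2) / 2
    term_null = z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
    term_alt = z_power * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    return (term_null + term_alt) ** 2 / (p1 - p2) ** 2


# Rounded z for 80% power (0.84), as in the hand calculation
n_hand = math.ceil(n_per_group_two_proportions(0.20, 0.30, z_alpha=1.96, z_power=0.84))
# More precise z for 80% power (0.842)
n_precise = math.ceil(n_per_group_two_proportions(0.20, 0.30, z_alpha=1.96, z_power=0.842))
print(n_hand, n_precise)
```

Running the function with both z values shows how a third decimal place in z0.80 is enough to move the rounded-up answer by one subject per group.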
17.13 Smoking cessation trial.
Confidence interval by the plus-four method:
95% CI for p1 − p2 = (0.3563 − 0.1667) ± (1.96)(0.03864) = 0.1896 ± 0.0757 = (0.1139, 0.2653).
The confidence interval is more useful than the hypothesis test because it quantifies the effect of the intervention. The hypothesis test merely addressed whether there was any effect.
17.15 Telephone survey completion rates.
(a) Descriptive statistics. Proportion that completed the survey in the group that received advanced warning p̂1 = 134/291 = 0.4605. Proportion in the group that did not receive advanced warning p̂2 = 33/100 = 0.3300.
(b) Hypothesis test.
A. H0: p1 = p2 versus Ha: p1 ≠ p2.
B. Note:
C. P = 0.023.
D. The observed difference is significant at α = 0.025 but not at α = 0.02.
E. The advanced warning letter improved interview completion rates from 33.0% to 46.0% (P = 0.023).
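The test in part (b) can be reproduced from the counts in part (a). This is a minimal sketch of the pooled two-proportion z-test using the Python standard library; the function name is hypothetical.

```python
import math
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2):
    """Return (z, two-sided P) for H0: p1 = p2 using the pooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    p_pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_sided


# 134 of 291 completed with advanced warning; 33 of 100 completed without
z_stat, p_value = two_proportion_z_test(x1=134, n1=291, x2=33, n2=100)
print(f"zstat = {z_stat:.2f}, P = {p_value:.3f}")
```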
(c) Estimation of effect size. The point estimate of the difference in proportions is p̂1 − p̂2 = 0.4605 − 0.3300 = 0.1305. Thus, advanced warning improved the response rate by 13.0% (in absolute terms).
95% CI for p1 − p2 = (0.017 to 0.232). We can be 95% confident that the warning letter improves the response rate by between 1.7% and 23.2% (in absolute terms). A larger study is needed to derive a more precise estimate of the effect.
17.17 4S coronary mortality.
A. Hypotheses. H0: p1 = p2 versus Ha: p1 ≠ p2
B. Test statistic. zstat = 4.66
C. P-value. P (two-tailed) = 0.0000032
D. Significance level. The evidence against the null hypothesis is significant at α = 0.01.
E. Conclusion. The simvastatin treatment group demonstrated a significantly lower risk of fatal heart attack than the placebo group (5.0% vs. 8.5%, P = 3.2 × 10⁻⁶).
In relative terms, how much did simvastatin lower heart attack risk? The easiest way to address this question is to first calculate the relative risk: RR = 5.0% / 8.5% = 0.59.
The relative reduction in risk = 1 − 0.59 = 0.41, or 41%.
A. Hypotheses. H0: p1 = p2 versus Ha: p1 ≠ p2
B. Test statistic. zstat = 2.054
C. P-value. P = 0.040
D. Significance level. The evidence against the null hypothesis is significant at α = 0.05.
E. Conclusion. Clearance of the effusions from ear infections after 14 days was significantly better with cefaclor than with amoxicillin (55.7% vs. 41.2%, P = 0.040).
Notes
• With continuity correction, zstat, c = 1.913 and P = 0.056.
• Take care in interpreting results, and do not extrapolate beyond the conditions of the test. The current test applies only to the 14-day follow-up point. In the published article (Mandel et al., 1982), the improvement rates equalized by 42 days (68.9% in the cefaclor group and 67.5% in the amoxicillin group). This and other studies suggest no difference in long-term failure rates with cefaclor and amoxicillin. For clinical recommendations, see AHRQ (2001). Number 15. Management of Acute Otitis Media. Retrieved August 30, 2006, from http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=hstat1.chapter.21026.
Chapter 18: Cross-Tabulated Counts
18.1 YRBS prevalence proportions.
18.3 Cytomegalovirus infection and coronary restenosis.
(a) Prevalence of CMV in both groups combined = 49/75 = 0.653 = 65.3%
(b) Incidence of restenosis overall = 23/75 = 0.307 = 30.7%
(c) The proportion of the CMV+ group experiencing restenosis = 21/49 = 0.429 = 42.9%.
The proportion of the CMV− group experiencing restenosis = 2/26 = 0.077 = 7.7%.
These are row percents.
(e)
This OR is larger than the RR because the outcome is common.
18.5 Response to leprosy treatment.
Here are the relevant row percentages:
*Example of calculation: 7/52 × 100% = 13.5%.
These distributions show that patients with high skin damage were more likely to show improvement than those with low skin infiltration.
18.7 Chi-square approximation.
The area under the curve is between the chi-square landmarks of 4.64 (right-tail 0.20) and 5.32 (right-tail 0.15). Therefore, 0.15 < P < 0.20. The precise area computed with a software utility is 0.1564.
18.9 Cytomegalovirus infection and coronary restenosis.
(a)
With continuity-correction,
(b) zstat = 3.14, P = 0.0017. Note that
With continuity-correction,
18.11 Anger and heart disease.
A. Hypotheses. H0: “no trend between anger-trait and hard coronary events in the source population” versus Ha: “trend in population”
B. Test statistic. zstat, trend = 3.16
C. P-value. P = 0.0016
D. Significance level. The evidence against H0 is significant at α = 0.005 but not at α = 0.001.
E. Conclusion. The positive trend between the anger trait and incidence of coronary heart disease is statistically significant (P = 0.0016).
18.13 Cell phones and brain tumors, study 1.
Results fail to support an association between cell phone use and brain tumors. The odds ratios for glioma and meningioma show small negative associations. The odds ratio for acoustic neuroma shows a small positive association. The point estimate for all tumor types combined is 1.0, indicating no association between recent cell phone use and intracranial tumors in general. All confidence intervals are consistent with population odds ratios of 1 (no association).
18.15 Doll and Hill, 1950.
This suggests that the smokers had 14 times the risk of nonsmokers.
Determining the confidence interval for the OR:
ln(14.04) = 2.6419
SE = sqrt(1/647 + 1/622 + 1/2 + 1/27) = 0.7350
The 95% CI for OR = e^(2.6419 ± 1.96 × 0.7350) = e^(1.2013 to 4.0825) = 3.3 to 59.3
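The interval can be reproduced step by step. This is a minimal sketch using the four cell counts that appear in the SE formula; the assignment of counts to the letters a, b, c, and d is an assumption, chosen so that the cross-product ratio ad/bc reproduces the reported odds ratio of 14.04.

```python
import math

# Cell counts from the SE formula. The a/b/c/d labeling is an assumption
# chosen so that (a * d) / (b * c) matches the reported OR of 14.04.
a, b, c, d = 647, 622, 2, 27

odds_ratio = (a * d) / (b * c)
ln_or = math.log(odds_ratio)

# Standard error of ln(OR): square root of the summed reciprocal counts
se_ln_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)

z = 1.96  # standard normal multiplier for 95% confidence
lower = math.exp(ln_or - z * se_ln_or)
upper = math.exp(ln_or + z * se_ln_or)

print(f"OR = {odds_ratio:.2f}, ln(OR) = {ln_or:.4f}, SE = {se_ln_or:.4f}")
print(f"95% CI: {lower:.1f} to {upper:.1f}")
```

The width of the interval is driven almost entirely by the cell with a count of 2, whose reciprocal (0.5) dominates the sum under the square root.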
18.17 Baldness and myocardial infarction, self-assessed baldness.
(a)
This reveals a positive trend in odds ratios after baldness level 2.
(b) . The association is highly significant.
(c) zstat,trend = 3.39, P = 0.00070. The trend is highly significant.
18.19 Diet and adenomatous polyps.
A. Hypotheses. H0: no association between fruit and vegetable consumption and colon polyps in the population versus Ha: H0 false.
B. Test statistic. zMcN = sqrt[(45 − 24)² / (45 + 24)] = 2.528
C. P-value. P = 0.011
D. Significance level. The evidence against H0 is significant at α = 0.05 and is almost significant at α = 0.01.
E. Conclusion. The association between low fruit and vegetable consumption and the recurrence of colon polyps is statistically significant (P = 0.011).
Additional notes: With continuity-correction, zMcN,c = sqrt[(|45 − 24| − 1)² / (45 + 24)] = 2.408 and P = 0.016.
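Both versions of the McNemar statistic can be checked with a few lines of code. This is a minimal sketch, assuming 45 and 24 are the two discordant-pair counts and that the P-values are two-sided; the function names are hypothetical.

```python
import math

def mcnemar_z(n1, n2, continuity=False):
    """McNemar z statistic from the two discordant-pair counts."""
    diff = abs(n1 - n2)
    if continuity:
        diff -= 1  # continuity correction subtracts 1 from |n1 - n2|
    # sqrt(diff^2 / (n1 + n2)) simplifies to diff / sqrt(n1 + n2)
    return diff / math.sqrt(n1 + n2)

def two_sided_p(z):
    """Two-sided P-value from a standard normal z statistic."""
    return math.erfc(abs(z) / math.sqrt(2))

z_plain = mcnemar_z(45, 24)
z_corrected = mcnemar_z(45, 24, continuity=True)
p_plain = two_sided_p(z_plain)
p_corrected = two_sided_p(z_corrected)

print(f"Uncorrected: z = {z_plain:.3f}, P = {p_plain:.3f}")
print(f"Corrected:   z = {z_corrected:.3f}, P = {p_corrected:.3f}")
```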
18.21 Thrombotic stroke in young women.
(a) Matched-pair odds ratio = 8.8. Interpretation: Oral contraceptive use was associated with an almost nine-fold increase in the risk of thrombotic stroke.
(b) Here are the data with the match broken:
The odds ratio with the match broken is (46)(99) / (7)(60) = 10.8, which overestimates the more appropriate matched-pair odds ratio of 8.8.
18.23 Don’t sweat the small stuff.
Without continuity-correction: χ²stat = 4.107, df = 1, P = 0.043
With continuity-correction: χ²stat,c = 3.598, df = 1, P = 0.058
It would not be reasonable to derive different conclusions because the actual data has not changed. Both Pearson’s test (P = 0.043) and Yates’ test (P = 0.058) provide reasonably reliable evidence against the null hypothesis. Therefore, the treatment group experienced the outcome significantly more often than the control group (12.5% vs. 7.7%).
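The two P-values follow from the reported chi-square statistics. This is a minimal sketch using only the Python standard library; it relies on the fact that a chi-square variable with 1 df is the square of a standard normal variable, and the function name is hypothetical.

```python
import math

def chi2_p_df1(x):
    """Right-tail P-value for a chi-square statistic with 1 df."""
    # With 1 df, the right tail beyond x equals the two-sided normal
    # tail beyond sqrt(x), which is erfc(sqrt(x) / sqrt(2)).
    return math.erfc(math.sqrt(x / 2))

p_pearson = chi2_p_df1(4.107)  # without continuity correction
p_yates = chi2_p_df1(3.598)    # with continuity correction

print(f"Pearson: P = {p_pearson:.3f}")
print(f"Yates:   P = {p_yates:.3f}")
```

The two results fall on either side of 0.05, yet they describe the same data and differ only slightly.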
18.25 Yates, 1934 (three-by-two).
Here are the expected values:
Feeding method | Normal teeth | Malocclusion
Breast fed | 1.739 | 18.261
Bottle fed | 1.913 | 20.087
Breast & bottle feed | 4.348 | 45.652
You should not use a chi-square test in this situation because three table cells have expected values that are less than 5. WinPepi Compare2.exe (version 1.38) calculates a Fisher’s P of 0.1503. Therefore, the evidence against H0 is not significant. The conclusion is that the prevalence of malocclusion did not differ significantly according to whether the infant was breast fed or bottle fed (P = 0.15).
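The count of small expected values can be checked as follows. This is a minimal sketch that starts from the expected values listed above, recovers the marginal totals by summing them, and then recomputes each expected value as (row total × column total) / n. The observed counts are not used, and the variable names are hypothetical.

```python
# Expected counts from the table above: (normal teeth, malocclusion)
expected = {
    "Breast fed": (1.739, 18.261),
    "Bottle fed": (1.913, 20.087),
    "Breast & bottle feed": (4.348, 45.652),
}

# Marginal totals recovered by summing the expected counts
row_totals = {name: sum(row) for name, row in expected.items()}
col_totals = [sum(row[j] for row in expected.values()) for j in range(2)]
n = sum(row_totals.values())

# Recompute each expected count as row total * column total / n
recomputed = {
    name: tuple(row_totals[name] * col / n for col in col_totals)
    for name in expected
}

# Count the cells with an expected value below 5
small_cells = sum(1 for row in expected.values() for e in row if e < 5)
total_cells = sum(len(row) for row in expected.values())

for name, row in recomputed.items():
    print(name, [round(e, 3) for e in row])
print(f"{small_cells} of {total_cells} cells have expected values below 5")
```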
18.27 Esophageal cancer and tobacco use.
95% confidence interval for the OR = (1.37, 2.81). This confidence interval was calculated with WinPEPI > Compare2 > A Proportions or odds > Cornfield’s confidence interval for the odds ratio.
Interpretation: The point estimate suggests a doubling in the risk of esophageal cancer with tobacco use at the reported level. The 95% confidence interval suggests the data are consistent with population odds ratios between 1.37 and 2.81.
18.29 Baldness and myocardial infarction, interviewer assessments.
(a) The interviewer assessments are likely to be more consistent and objective than the self-assessments.
(b) Baldness levels were classified as 1 = none, 2 = frontal, 3 = mild vertex, 4 = moderate vertex, and 5 = severe vertex according to interviewer assessments using the Hamilton baldness scale. Odds ratio estimates are as follows:
.
(c) The table below compares the results of the two analyses:
Baldness level | Self-assessed baldness | Interviewer-assessed baldness
1 (no baldness) | 1.0 (reference) | 1.0 (reference)
2 | 1.0 | 1.1
3 (moderate baldness) | 1.4 | 1.6
4 | 1.9 | 1.8
5 (severe baldness) | 2.6 (small sample) | 3.1 (small sample)
Similar positive trends are observed between baldness level and myocardial infarction risk.
18.31 Practice with chi-square.
Observed
Expected
Use of the chi-square test is justified because only 20% of the table cells have expected frequencies less than 5.
(O − E)² / E
Baldness | Cases | Controls
1 (none) | 1.191 | 1.023
2 | 0.998 | 0.857
3 | 2.151 | 1.847
4 | 3.227 | 2.771
5 (extreme) | 0.272 | 0.234
χ²stat = 1.191 + 1.023 + 0.998 + 0.857 + 2.151 + 1.847 + 3.227 + 2.771 + 0.272 + 0.234 = 14.571
df = (5 − 1)(2 − 1) = 4
P = 0.0057 (highly significant evidence against H0)
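The sum and the P-value can be verified with the standard library alone. This is a minimal sketch; it uses the closed-form right-tail area for a chi-square distribution with 4 df, exp(−x/2)(1 + x/2), which holds for df = 4 only. The function name is hypothetical.

```python
import math

# (O - E)^2 / E contributions, cases then controls, for baldness levels 1 to 5
contributions = [
    1.191, 1.023,
    0.998, 0.857,
    2.151, 1.847,
    3.227, 2.771,
    0.272, 0.234,
]

chi2_stat = sum(contributions)
df = (5 - 1) * (2 - 1)

def chi2_p_df4(x):
    """Right-tail P-value for a chi-square statistic with exactly 4 df."""
    # Closed form valid only for df = 4: exp(-x/2) * (1 + x/2)
    return math.exp(-x / 2) * (1 + x / 2)

p_value = chi2_p_df4(chi2_stat)
print(f"chi-square = {chi2_stat:.3f}, df = {df}, P = {p_value:.4f}")
```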
Chapter 19: Stratified Two-by-Two Tables
19.1 Is participating in a follow-up survey associated with having medical aid?
(a) Among the 416 children who were followed-up, 46 (11.1%) had medical aid. In contrast, 195 of 1174 (16.6%) who were not followed-up had medical aid. Therefore, there is a negative association between follow-up and having medical aid.
(b) Among the white participants, 10 of 12 (83%) who were followed-up had medical aid. In total, 104 of the 126 white subjects who were not followed-up (83%) had medical aid. Therefore, there is no association between follow-up and having medical aid in this stratum.
(c) Nine percent of both groups had medical aid. Therefore, there is no association between follow-up and having medical aid in this stratum.
(d) Race is associated with follow-up and medical aid coverage. Therefore, race confounded the crude analysis in part (a).
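The confounding described in parts (a) through (d) can be seen by comparing crude and stratum-specific prevalence ratios. This is a minimal sketch using the counts quoted in parts (a) and (b); the counts for the remaining stratum are derived by subtraction, which assumes the data contain exactly two race strata. The function and variable names are hypothetical.

```python
def prevalence_ratio(followed, not_followed):
    """Prevalence of medical aid in followed-up children relative to the rest."""
    a, n1 = followed
    b, n0 = not_followed
    return (a / n1) / (b / n0)

# (number with medical aid, group size)
crude = {"followed": (46, 416), "not_followed": (195, 1174)}
white = {"followed": (10, 12), "not_followed": (104, 126)}

# Remaining stratum derived by subtraction (assumes two strata in total)
other = {
    group: (crude[group][0] - white[group][0], crude[group][1] - white[group][1])
    for group in crude
}

pr_crude = prevalence_ratio(crude["followed"], crude["not_followed"])
pr_white = prevalence_ratio(white["followed"], white["not_followed"])
pr_other = prevalence_ratio(other["followed"], other["not_followed"])

print(f"Crude prevalence ratio:   {pr_crude:.2f}")
print(f"White stratum:            {pr_white:.2f}")
print(f"Remaining stratum:        {pr_other:.2f}")
```

The crude ratio is well below 1, while both stratum-specific ratios are close to 1, which is the pattern expected when the stratifying variable confounds the crude association.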
19.3 Is participating in a follow-up survey associated with having medical aid?
(a) Race-specific prevalence ratios:
These strata-specific relative risks are homogeneous, so interaction is absent.
(b) Hypothesis test
A. Hypotheses. H0: RR1 = RR2 (no interaction) versus Ha: RR1 ≠ RR2 (interaction)
B. Recall that (calculated with WinPEPI > Compare2 > A. Proportions > “Stratified tables”). Calculation of our ad hoc interaction statistic is as follows:
C. P = 0.93.
D. Significance level. The evidence against the null hypothesis is not at all significant.
E. Conclusion. There is no significant interaction in the relative risks (P = 0.93).
Comment: The test for interaction derived by WinPEPI produces chi-square = 0.005, 1 df, P = 0.94 (i.e., nearly identical results). However, instead of using the Mantel–Haenszel relative risk estimate in its equation, WinPEPI uses an “inverse variance” pooled estimate of relative risk as its baseline summary measure.
19.5 Sex bias in graduate school admissions?
(a) Crude analysis
Overall, a higher percentage of male applications were accepted.
(b) Applicants for major 1
In major 1, a higher percentage of female applications were accepted.
(c) Applicants for major 2
Major 2 accepted approximately the same percentage of male and female applicants.
(d) There does not seem to be gender bias in favor of males in this graduate school’s admissions practices. A higher percentage of female applicants were accepted to major 1 (82% vs. 62%). In major 2, the acceptance rates were about the same for females and males (7% and 6%, respectively).
(e) The initial analysis was confounded because males tended to apply to major 1. Major 1 had a high acceptance rate (601 of 933 = 64%), while major 2 had a low acceptance rate (46 of 714 = 6%).
(f) Relative incidence of acceptance for males by major
Test for interaction
A. Hypotheses. H0: RR1 = RR2 (no interaction) versus Ha: RR1 ≠ RR2 (interaction)
B. Test statistic. χ2stat,int = 0.136 with df = 1. Calculated with WinPEPI > Compare 2 > A. Proportions.
C. P-value. P = 0.713
D. Significance level. The evidence against the null hypothesis is not significant at any reasonable level of α.
E. Conclusion. The “RR” for males in major 1 was 0.75. In major 2, the RR was 0.84. These RRs do not differ significantly (P = 0.71). Therefore, interaction in the ratio measures of effect is absent.
Comment: It is possible to have no interaction in the incidence ratios while still having an interaction in the incidence differences: interactions are specific to the measure of effect. For example, there is no significant interaction in incidence ratios for the current data (P = 0.71). However, a test for interaction in the incidence difference conducted with WinPEPI proved to be significant (P = 1.5 × 10⁻⁵). Thus, the statistical model based on the ratio measure of effect adequately predicted the joint effects of gender and major, while the model based on the difference measure of effect did not.
(g) The summary RR adjusted for major is 0.77. After adjusting for major, male applicants were 23% less likely to be accepted than female applicants. The 95% confidence interval for the RR is (0.68, 0.86).
19.7 Infant survival.
(a) Incidence in the group with less care .
Incidence in the group with more care .
. “Less care” was associated with almost three times the risk of not surviving.
(b) In clinic 1, the mortality rates were 1.7% (less care) and 1.3% (more care), respectively, for a relative risk of 1.24. In clinic 2, the mortality rates were 7.9% and 8.0%, respectively, for a relative risk of 0.99. Thus, there is almost no association between survival and amount of care received within the clinics. It is also worth noting that clinic 2 has much higher mortality in absolute terms, possibly because it treats a much sicker population.
(c) The crude association was confounded by “clinic,” with “clinic” perhaps representing a surrogate measure for severity of the underlying condition.
(d) Test for interaction in the RRs.
A. H0: RR1 = RR2 versus Ha: RR1 ≠ RR2.
B. Chi-square interaction statistic = 0.047, df = 1 (derived using WinPEPI > Compare2 > precision-based heterogeneity statistic for ratio measures).
C. P = 0.83.
D. The evidence against H0 is not significant.
E. The risk ratios in the two clinics (1.24 and 0.99, respectively) do not differ significantly (P = 0.83). Therefore, evidence of interaction in risk ratios is absent.
(e) Mantel–Haenszel summary risk ratio = 1.11 (95% CI: 0.40 to 3.07) calculated with WinPEPI > Compare2 > A proportions. This indicates that there is no significant association between mortality and amount of care received after adjusting for the effect of “clinic.”
19.9 Herniated lumbar discs.
In testing H0: OR1 = OR2, the heterogeneity chi-square statistic has df = 1 and P = 0.055 (via WinPEPI’s heterogeneity of odds ratio procedure). Therefore, the odds ratios in the two strata (1.05 and 2.38, respectively) appear to be heterogeneous (P = 0.055); there is evidence of significant interaction in odds ratios.
Because interaction is present, we report separate odds ratios for the strata. There appears to be little effect of “no sports participation” among smokers (odds ratio = 1.05). In contrast, “no sports participation” seems to increase the risk among nonsmokers (odds ratio = 2.38).