Correlation coefficient for a direct relationship	Correlation coefficient for an indirect relationship	Relationship strength of the variables
0.0	0.0	None/trivial
0.1	−0.1	Weak/small
0.3	−0.3	Moderate/medium
0.5	−0.5	Strong/large
1.0	−1.0	Perfect

7.4 Computing the Spearman Rank-Order Correlation Coefficient

The Spearman rank-order correlation is a statistical procedure that is designed to measure the relationship between two variables on an ordinal scale of measurement if the sample size is n ≥ 4. Use Formula 7.1 to determine a Spearman rank-order correlation coefficient r_s if none of the ranked values are ties. Sometimes, the symbol r_s is represented by the Greek symbol rho, or ρ:

(7.1) $r_s = 1 - \frac{6\sum D_i^2}{n(n^2 - 1)}$

where n is the number of rank pairs and D_i is the difference between a ranked pair.

If ties are present in the values, use Formula 7.2, Formula 7.3, and Formula 7.4 to determine r_s:

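(7.2) $r_s = \frac{(n^3 - n) - 6\sum D_i^2 - (T_x + T_y)/2}{\sqrt{(n^3 - n)^2 - (T_x + T_y)(n^3 - n) + T_x T_y}}$

where

(7.3) $T_x = \sum_{i=1}^{g} (t_i^3 - t_i)$

and

(7.4) $T_y = \sum_{i=1}^{g} (t_i^3 - t_i)$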

g is the number of tied groups in that variable and t_i is the number of tied values in a tie group.

If there are no ties in a variable, then T = 0.

Use Formula 7.5 to determine the degrees of freedom for the correlation:

(7.5) $df = n - 2$

where n is the number of paired values.

After r_s is determined, it must be examined for significance. Small samples allow one to reference a table of critical values, such as Table B.7 found in Appendix B. However, if the sample size n exceeds those available from the table, then a large sample approximation may be performed. For large samples, compute a z-score and use a table with the normal distribution (see Table B.1 in Appendix B) to obtain a critical region of z-scores. Formula 7.6 may be used to find the z-score of a correlation coefficient for large samples:

(7.6) $z^* = r\sqrt{n - 1}$

where n is the number of paired values and r is the correlation coefficient.

Note that the method for determining a z-score given a correlation coefficient and examining it for significance is the same for each type of correlation. We will illustrate a large sample approximation with a sample problem when we address the point-biserial correlation.

Although we will use Formula 7.6 to determine the significance of the correlation coefficient, some statisticians recommend using the formula based on the Student's t-distribution, as shown in Formula 7.7:

(7.7) $t = r\sqrt{\frac{n - 2}{1 - r^2}}$

According to Siegel and Castellan (1988), the advantage of using the Student's t-distribution over the z-score is small with larger sample sizes n.
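
The two significance statistics can be compared side by side with a short script. This is only a sketch: it assumes Formula 7.6 has the form z* = r√(n − 1) and Formula 7.7 has the form t = r√((n − 2)/(1 − r²)), and the values of `r` and `n` are hypothetical, chosen purely for illustration.

```python
import math

def z_from_r(r: float, n: int) -> float:
    """Large-sample z-score for a correlation coefficient (assumed form of Formula 7.6)."""
    return r * math.sqrt(n - 1)

def t_from_r(r: float, n: int) -> float:
    """Student's t statistic with n - 2 degrees of freedom (assumed form of Formula 7.7)."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

# Hypothetical correlation coefficient and sample size, for illustration only.
r, n = 0.5, 30
z = z_from_r(r, n)
t = t_from_r(r, n)
print(f"z = {z:.3f}, t = {t:.3f}, df = {n - 2}")
```

The z-score is compared against the normal distribution, while the t statistic is compared against the Student's t-distribution with n − 2 degrees of freedom.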

7.4.1 Sample Spearman Rank-Order Correlation (Small Data Samples without Ties)

Eight men were involved in a study to examine how resting heart rate relates to the frequency of visits to the gym. The assumption is that a person who visits the gym more frequently for a workout will have a slower heart rate. Table 7.2 shows the number of visits each participant made to the gym during the month the study was conducted. It also provides the mean heart rate measured at the end of the week during the final 3 weeks of the month.

TABLE 7.2

Participant	Number of visits	Mean heart rate
1	5	100
2	12	89
3	7	78
4	14	66
5	2	77
6	8	103
7	15	67
8	17	63

The values in this study do not possess characteristics of a strong interval scale. For instance, the number of visits to the gym does not necessarily communicate duration and intensity of physical activity. In addition, heart rate has several factors that can result in differences from one person to another. Ordinal measures offer a clearer relationship to compare these values from one individual to the next. Therefore, we will convert these values to ranks and use a Spearman rank-order correlation.

7.4.1.1 State the Null and Research Hypothesis

The null hypothesis states that there is no correlation between number of visits to the gym in a month and mean resting heart rate. The research hypothesis states that there is a correlation between the number of visits to the gym and the mean resting heart rate.

The null hypothesis is

H_O: ρ_s = 0

The research hypothesis is

H_A: ρ_s ≠ 0

7.4.1.2 Set the Level of Risk (or the Level of Significance) Associated with the Null Hypothesis

The level of risk, also called an alpha (α), is frequently set at 0.05. We will use α = 0.05 in our example. In other words, if the null hypothesis is true, there is only a 5% chance of rejecting it because of sampling variation alone.

7.4.1.3 Choose the Appropriate Test Statistic

As stated earlier, we decided to analyze the variables using an ordinal, or rank, procedure. Therefore, we will convert the values in each variable to ordinal data. In addition, we will be comparing the two variables, the number of visits to the gym in a month and the mean resting heart rate. Since we are comparing two variables in which one or both are measured on an ordinal scale, we will use the Spearman rank-order correlation.

7.4.1.4 Compute the Test Statistic

First, rank the scores for each variable separately as shown in Table 7.3. Rank them from the lowest score to the highest score to form an ordinal distribution for each variable.

To calculate the Spearman rank-order correlation coefficient, we need to calculate the differences between rank pairs and their subsequent squares where D = rank (mean heart rate) − rank (number of visits). It is helpful to organize the data to manage the summation in the formula (see Table 7.4).

Next, compute the Spearman rank-order correlation coefficient:

$c7-math-5001$
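
As a check on the hand calculation, the ranking and summation steps can be sketched in a few lines of Python. The sketch uses the values from Table 7.2 and the no-ties form of the coefficient, r_s = 1 − 6ΣD²/(n(n² − 1)). The helper `rank` is a hypothetical name introduced here, and it assumes no tied values.

```python
def rank(values):
    """Rank from the lowest score (1) to the highest (n); assumes no tied values."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

# Values from Table 7.2.
visits = [5, 12, 7, 14, 2, 8, 15, 17]
heart_rate = [100, 89, 78, 66, 77, 103, 67, 63]

rank_visits = rank(visits)
rank_hr = rank(heart_rate)

# D = rank(mean heart rate) - rank(number of visits), squared for each pair.
d_squared = [(h - v) ** 2 for h, v in zip(rank_hr, rank_visits)]

n = len(visits)
sum_d2 = sum(d_squared)
r_s = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))
print(sum_d2, round(r_s, 3))  # 136 -0.619
```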

7.4.1.5 Determine the Value Needed for Rejection of the Null Hypothesis Using the Appropriate Table of Critical Values for the Particular Statistic

Table B.7 in Appendix B lists critical values for the Spearman rank-order correlation coefficient. In this study, the critical value is found for n = 8 and df = 6. Since we are conducting a two-tailed test and α = 0.05, the critical value is 0.738. If the absolute value of the obtained value is greater than or equal to the critical value, 0.738, we will reject the null hypothesis. If the critical value exceeds the absolute value of the obtained value, we will not reject the null hypothesis.

7.4.1.6 Compare the Obtained Value with the Critical Value

The critical value for rejecting the null hypothesis is 0.738 and the obtained value is |r_s| = 0.619. If the critical value is less than or equal to the obtained value, we must reject the null hypothesis. If instead, the critical value is greater than the obtained value, we must not reject the null hypothesis. Since the critical value exceeds the absolute value of the obtained value, we do not reject the null hypothesis.

7.4.1.7 Interpret the Results

We did not reject the null hypothesis, suggesting that there is no significant correlation between the number of visits the males made to the gym in a month and their mean resting heart rates.

7.4.1.8 Reporting the Results

The reporting of results for the Spearman rank-order correlation should include such information as the number of participants (n), two variables that are being correlated, correlation coefficient (r_s), degrees of freedom (df), and p-value's relation to α.

For this example, eight men (n = 8) were observed for 1 month. Their number of visits to the gym was documented (variable 1) and their mean resting heart rate was recorded during the last 3 weeks of the month (variable 2). These data were put in ordinal form for purposes of the analysis. The Spearman rank-order correlation coefficient was not significant (r_s₍₆₎ = −0.619, p > 0.05). Based on these data, we can state that there is no clear relationship between adult male resting heart rate and the frequency of visits to the gym.

7.4.2 Sample Spearman Rank-Order Correlation (Small Data Samples with Ties)

The researcher repeated the experiment in the previous example using females. Table 7.5 shows the number of visits each participant made to the gym during the month of the study and their subsequent mean heart rates.

TABLE 7.5

Participant	Number of visits	Mean heart rate
1	5	96
2	12	63
3	7	78
4	14	66
5	3	79
6	8	95
7	15	67
8	12	64
9	2	99
10	16	62
11	12	65
12	7	76
13	17	61

As with the previous example, the values in this study do not possess characteristics of a strong interval scale, so we will use ordinal measures. We will convert these values to ranks and use a Spearman rank-order correlation.

Steps 1–3 are the same as the previous example. Therefore, we will begin with step 4.

7.4.2.1 Compute the Test Statistic

First, rank the scores for each variable as shown in Table 7.6. Rank the scores from the lowest score to the highest score to form an ordinal distribution for each variable.

Next, compute the Spearman rank-order correlation coefficient. Since there are ties present in the ranks, we will use formulas that account for the ties. First, use Formula 7.3 and Formula 7.4. For the number of visits, there are two groups of ties. The first group has two tied values (rank = 4.5 and t = 2) and the second group has three tied values (rank = 8 and t = 3):

$c7-math-5002$

For the mean resting heart rate, there are no ties. Therefore, T_y = 0. Now, calculate the Spearman rank-order correlation coefficient using Formula 7.2:

$c7-math-5003$
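
The tie-corrected calculation can also be sketched in Python using the values from Table 7.5. This sketch assumes that Formulas 7.3 and 7.4 sum t³ − t over each tie group and that Formula 7.2 has the form shown in the code comment. The helper names `average_ranks` and `tie_correction` are hypothetical.

```python
import math
from collections import Counter

def average_ranks(values):
    """Rank from lowest to highest; tied values share the mean of their positions."""
    positions = {}
    for pos, v in enumerate(sorted(values), start=1):
        positions.setdefault(v, []).append(pos)
    return [sum(positions[v]) / len(positions[v]) for v in values]

def tie_correction(values):
    """T = sum of (t^3 - t) over every group of t tied values."""
    return sum(t ** 3 - t for t in Counter(values).values() if t > 1)

# Values from Table 7.5.
visits = [5, 12, 7, 14, 3, 8, 15, 12, 2, 16, 12, 7, 17]
heart_rate = [96, 63, 78, 66, 79, 95, 67, 64, 99, 62, 65, 76, 61]

n = len(visits)
rank_x, rank_y = average_ranks(visits), average_ranks(heart_rate)
sum_d2 = sum((y - x) ** 2 for x, y in zip(rank_x, rank_y))
t_x, t_y = tie_correction(visits), tie_correction(heart_rate)

# Assumed form of Formula 7.2:
# r_s = [(n^3 - n) - 6*sum(D^2) - (T_x + T_y)/2]
#       / sqrt((n^3 - n)^2 - (T_x + T_y)(n^3 - n) + T_x*T_y)
m = n ** 3 - n
r_s = (m - 6 * sum_d2 - (t_x + t_y) / 2) / math.sqrt(m ** 2 - (t_x + t_y) * m + t_x * t_y)
print(f"T_x = {t_x}, T_y = {t_y}, sum of D^2 = {sum_d2}, r_s = {r_s:.3f}")
```

With these data the script gives T_x = 30 and T_y = 0, matching the two tie groups described for the number of visits.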

7.4.2.2 Determine the Value Needed for Rejection of the Null Hypothesis Using the Appropriate Table of Critical Values for the Particular Statistic

Table B.7 in Appendix B lists critical values for the Spearman rank-order correlation coefficient. To be significant, the absolute value of the obtained value, |r_s|, must be greater than or equal to the critical value on the table. In this study, the critical value is found for n = 13 and df = 11. Since we are conducting a two-tailed test and α = 0.05, the critical value is 0.560.

7.4.2.3 Compare the Obtained Value with the Critical Value

The critical value for rejecting the null hypothesis is 0.560 and the obtained value is |r_s| = 0.860. If the critical value is less than or equal to the obtained value, we must reject the null hypothesis. If instead, the critical value is greater than the obtained value, we must not reject the null hypothesis. Since the critical value is less than the absolute value of the obtained value, we reject the null hypothesis.

7.4.2.4 Interpret the Results

We rejected the null hypothesis, suggesting that there is a significant correlation between the number of visits the females made to the gym in a month and their mean resting heart rates.

7.4.2.5 Reporting the Results

For this example, 13 women (n = 13) were observed for 1 month. Their number of visits to the gym was documented (variable 1) and their mean resting heart rate was recorded during the last 3 weeks of the month (variable 2). These data were put in ordinal form for purposes of the analysis. The Spearman rank-order correlation coefficient was significant (r_s₍₁₁₎ = −0.860, p < 0.05). Based on these data, we can state that there is a very strong inverse relationship between adult female resting heart rate and the frequency of visits to the gym.

7.4.3 Performing the Spearman Rank-Order Correlation Using SPSS

We will analyze the data from the previous example using SPSS.

7.4.3.1 Define Your Variables

First, click the “Variable View” tab at the bottom of your screen. Then, type the names of your variables in the “Name” column. As shown in Figure 7.1, the first variable is called “Number_of_Visits” and the second variable is called “Mean_Heart_Rate.”

7.4.3.2 Type in Your Values

Click the “Data View” tab at the bottom of your screen as shown in Figure 7.2. Type the values in the respective columns.

7.4.3.3 Analyze Your Data

As shown in Figure 7.3, use the pull-down menus to choose “Analyze,” “Correlate,” and “Bivariate… .”

Use the arrow button to place both variables with your data values in the box labeled “Variables:” as shown in Figure 7.4. Then, in the “Correlation Coefficients” box, uncheck “Pearson” and check “Spearman.” Finally, click “OK” to perform the analysis.

7.4.3.4 Interpret the Results from the SPSS Output Window

The output table (see SPSS Output 7.1) provides the Spearman rank-order correlation coefficient (r_s = −0.860) labeled Spearman's rho. It also returns the number of pairs (n = 13) and the two-tailed significance (p ≈ 0.000). In this example, the significance is not actually zero. The reported value does not return enough digits to show the significance's actual precision.

Based on the results from SPSS, the Spearman rank-order correlation coefficient was significant (r_s₍₁₁₎ = −0.860, p < 0.05). Based on these data, we can state that there is a very strong inverse relationship between adult female resting heart rate and the frequency of visits to the gym.

7.5 Computing the Point-Biserial and Biserial Correlation Coefficients

The point-biserial and biserial correlations are statistical procedures for use with dichotomous variables. A dichotomous variable is simply a measure with two conditions, and it is either discrete or continuous. A discrete dichotomous variable has no particular order and might include such examples as gender (male vs. female) or a coin toss (heads vs. tails). A continuous dichotomous variable has some type of order to the two conditions and might include measurements such as pass/fail or young/old. Finally, since the point-biserial and biserial correlations each involve an interval scale analysis, they are special cases of the Pearson product-moment correlation.

7.5.1 Correlation of a Dichotomous Variable and an Interval Scale Variable

The point-biserial correlation is a statistical procedure to measure the relationship between a discrete dichotomous variable and an interval scale variable. Use Formula 7.8 to determine the point-biserial correlation coefficient r_pb:

(7.8) $r_{pb} = \frac{\bar{x}_p - \bar{x}_q}{s}\sqrt{P_p P_q}$

where $\bar{x}_p$ is the mean of the interval variable's values associated with the dichotomous variable's first category, $\bar{x}_q$ is the mean of the interval variable's values associated with the dichotomous variable's second category, s is the standard deviation of the variable on the interval scale, P_p is the proportion of the interval variable values associated with the dichotomous variable's first category, and P_q is the proportion of the interval variable values associated with the dichotomous variable's second category.

Recall the formulas for mean (Formula 7.9) and standard deviation (Formula 7.10):

(7.9) $\bar{x} = \frac{\sum x_i}{n}$

and

(7.10) $s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}}$

where $\sum x_i$ is the sum of the values in the sample and n is the number of values in the sample.

The biserial correlation is a statistical procedure to measure the relationship between a continuous dichotomous variable and an interval scale variable. Use Formula 7.11 to determine the biserial correlation coefficient r_b:

(7.11) $r_b = \frac{\bar{x}_p - \bar{x}_q}{s_x} \cdot \frac{P_p P_q}{y}$

where $\bar{x}_p$ is the mean of the interval variable's values associated with the dichotomous variable's first category, $\bar{x}_q$ is the mean of the interval variable's values associated with the dichotomous variable's second category, s_x is the standard deviation of the variable on the interval scale, P_p is the proportion of the interval variable values associated with the dichotomous variable's first category, P_q is the proportion of the interval variable values associated with the dichotomous variable's second category, and y is the height of the unit normal curve ordinate at the point dividing P_p and P_q (see Fig. 7.5).

You may use Table B.1 in Appendix B or Formula 7.12 to find the height of the unit normal curve ordinate, y:

(7.12) $y = \frac{1}{\sqrt{2\pi}} e^{-z^2/2}$

where e is the natural log base and approximately equal to 2.718282 and z is the z-score at the point dividing P_p and P_q.

Formula 7.13 is the relationship between the point-biserial and the biserial correlation coefficients. This formula is necessary to find the biserial correlation coefficient because SPSS only determines the point-biserial correlation coefficient:

(7.13) $r_b = r_{pb} \cdot \frac{\sqrt{P_p P_q}}{y}$
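
The ordinate and the conversion between the two coefficients are easy to script. The sketch below assumes that Formula 7.12 is the standard normal density, y = e^(−z²/2)/√(2π), and that Formula 7.13 has the form r_b = r_pb · √(P_p P_q)/y. The function names and the sample inputs are hypothetical.

```python
import math

def normal_ordinate(z: float) -> float:
    """Height of the unit normal curve at z (assumed form of Formula 7.12)."""
    return math.exp(-z ** 2 / 2) / math.sqrt(2 * math.pi)

def biserial_from_point_biserial(r_pb: float, p_p: float, y: float) -> float:
    """Convert r_pb to r_b (assumed form of Formula 7.13)."""
    p_q = 1 - p_p
    return r_pb * math.sqrt(p_p * p_q) / y

# With an even split (P_p = P_q = 0.5), the dividing point is z = 0.
y0 = normal_ordinate(0.0)

# Hypothetical point-biserial coefficient, for illustration only.
r_b = biserial_from_point_biserial(0.40, 0.5, y0)
print(f"y = {y0:.4f}, r_b = {r_b:.3f}")
```

Because √(P_p P_q)/y is greater than 1 in this case, the biserial coefficient comes out larger in magnitude than the point-biserial coefficient it was derived from.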

After the correlation coefficient is determined, it must be examined for significance. Small samples allow one to reference a table of critical values, such as Table B.8 found in Appendix B. However, if the sample size n exceeds those available from the table, then a large sample approximation may be performed. For large samples, compute a z-score and use a table with the normal distribution (see Table B.1 in Appendix B) to obtain a critical region of z-scores. As described earlier in this chapter, Formula 7.6 may be used to find the z-score of a correlation coefficient for large samples.

7.5.2 Correlation of a Dichotomous Variable and a Rank-Order Variable

As explained earlier, the point-biserial and biserial correlation procedures involve a dichotomous variable and an interval scale variable. If the correlation instead involves a dichotomous variable and a rank-order variable, a slightly different approach is needed.

To find the point-biserial correlation coefficient for a discrete dichotomous variable and a rank-order variable, simply use the Spearman rank-order correlation described earlier and assign arbitrary values to the dichotomous variable such as 0 and 1. To find the biserial correlation coefficient for a continuous dichotomous variable and a rank-order variable, use the same procedure and then apply Formula 7.13 given earlier.

7.5.3 Sample Point-Biserial Correlation (Small Data Samples)

A researcher in a psychological lab investigated gender differences. She wished to compare male and female ability to recognize and remember visual details. She used 17 participants (8 males and 9 females) who were initially unaware of the actual experiment. First, she placed each one of them alone in a room with various objects and asked them to wait. After 10 min, she asked each of the participants to complete a 30-question posttest relating to several details in the room. Table 7.8 shows the participants' genders and posttest scores.

TABLE 7.8

Participant	Gender	Posttest score
1	M	7
2	M	19
3	M	8
4	M	10
5	M	7
6	M	15
7	M	6
8	M	13
9	F	14
10	F	11
11	F	18
12	F	23
13	F	17
14	F	20
15	F	14
16	F	24
17	F	22

The researcher wishes to determine if a relationship exists between the two variables and the relative strength of the relationship. Gender is a discrete dichotomous variable and visual detail recognition is an interval scale variable. Therefore, we will use a point-biserial correlation.

7.5.3.1 State the Null and Research Hypothesis

The null hypothesis states that there is no correlation between gender and visual detail recognition. The research hypothesis states that there is a correlation between gender and visual detail recognition.

The null hypothesis is

H_O: ρ_pb = 0

The research hypothesis is

H_A: ρ_pb ≠ 0

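7.5.3.2 Set the Level of Risk (or the Level of Significance) Associated with the Null Hypothesis

The level of risk, also called an alpha (α), is frequently set at 0.05. We will use α = 0.05 in our example.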

7.5.3.3 Choose the Appropriate Test Statistic

As stated earlier, we decided to analyze the relationship between the two variables. A correlation will provide the relative strength of the relationship between the two variables. Gender is a discrete dichotomous variable and visual detail recognition is an interval scale variable. Therefore, we will use a point-biserial correlation.

7.5.3.4 Compute the Test Statistic

First, compute the standard deviation of all values from the interval scale data. It is helpful to organize the data as shown in Table 7.9.

Using the summations from Table 7.9, calculate the mean and the standard deviation for the interval data:

$c7-math-5009$

$c7-math-5010$

$c7-math-5011$

$c7-math-5012$

$c7-math-5013$

Next, compute the means and proportions of the values associated with each item from the dichotomous variable. The mean males' posttest score was

$c7-math-5014$

The mean females' posttest score was

$c7-math-5015$

The males' proportion was

$c7-math-5016$

The females' proportion was

$c7-math-5017$

Now, compute the point-biserial correlation coefficient using the values computed earlier:

$c7-math-5018$

The sign on the correlation coefficient is dependent on the order we managed our dichotomous variable. Since that was arbitrary, the sign is irrelevant. Therefore, we use the absolute value of the point-biserial correlation coefficient:

$c7-math-5019$
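
The same steps can be sketched in Python with the scores from Table 7.8. The sketch assumes that the standard deviation in Formula 7.10 divides by n − 1 and that Formula 7.8 has the form r_pb = ((x̄_p − x̄_q)/s)·√(P_p P_q). The variable names are hypothetical.

```python
import math

# Posttest scores from Table 7.8.
males = [7, 19, 8, 10, 7, 15, 6, 13]
females = [14, 11, 18, 23, 17, 20, 14, 24, 22]
scores = males + females
n = len(scores)

# Mean and standard deviation of all interval-scale values (n - 1 in the denominator).
mean_all = sum(scores) / n
sum_sq_dev = sum((x - mean_all) ** 2 for x in scores)
s = math.sqrt(sum_sq_dev / (n - 1))

# Group means and proportions for the dichotomous variable.
mean_m = sum(males) / len(males)
mean_f = sum(females) / len(females)
p_m, p_f = len(males) / n, len(females) / n

r_pb = (mean_m - mean_f) / s * math.sqrt(p_m * p_f)

# Same calculation with n in the denominator of the standard deviation.
r_with_n = (mean_m - mean_f) / math.sqrt(sum_sq_dev / n) * math.sqrt(p_m * p_f)

print(f"|r_pb| = {abs(r_pb):.3f}, with n in the denominator: {abs(r_with_n):.3f}")
```

Under these assumptions the first value rounds to 0.637. Dividing by n instead of n − 1 gives about 0.657, which is the value a Pearson product-moment correlation on the 0/1-coded data would return.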

7.5.3.5 Determine the Value Needed for Rejection of the Null Hypothesis Using the Appropriate Table of Critical Values for the Particular Statistic

Table B.8 in Appendix B lists critical values for the Pearson product-moment correlation coefficient. Using the table of critical values requires that the degrees of freedom be known. Since df = n − 2 and n = 17, then df = 17 − 2. Therefore, df = 15. Since we are conducting a two-tailed test and α = 0.05, the critical value is 0.482.

7.5.3.6 Compare the Obtained Value with the Critical Value

The critical value for rejecting the null hypothesis is 0.482 and the obtained value is |r_pb| = 0.637. If the critical value is less than or equal to the obtained value, we must reject the null hypothesis. If instead, the critical value is greater than the obtained value, we must not reject the null hypothesis. Since the critical value is less than the absolute value of the obtained value, we reject the null hypothesis.

7.5.3.7 Interpret the Results

We rejected the null hypothesis, suggesting that there is a significant and moderately strong correlation between gender and visual detail recognition.

7.5.3.8 Reporting the Results

The reporting of results for the point-biserial correlation should include such information as the number of participants (n), two variables that are being correlated, correlation coefficient (r_pb), degrees of freedom (df), p-value's relation to α, and the mean values of each dichotomous variable.

For this example, a researcher compared male and female ability to recognize and remember visual details. Eight males (n_M = 8) and nine females (n_F = 9) participated in the experiment. The researcher measured participants' visual detail recognition with a 30-question test requiring participants to recall details in a room they had occupied. A point-biserial correlation produced significant results (r_pb₍₁₅₎ = 0.637, p < 0.05). These data suggest that there is a strong relationship between gender and visual detail recognition. Moreover, the mean scores on the detail recognition test indicate that males ( $c7-math-5020$ ) recalled fewer details, while females ( $c7-math-5021$ ) recalled more details.

7.5.4 Performing the Point-Biserial Correlation Using SPSS

We will analyze the data from the previous example using SPSS.

7.5.4.1 Define Your Variables

First, click the “Variable View” tab at the bottom of your screen. Then, type the names of your variables in the “Name” column. As shown in Figure 7.6, the first variable is called “Gender” and the second variable is called “Posttest_Score.”

7.5.4.2 Type in Your Values

Click the “Data View” tab at the bottom of your screen as shown in Figure 7.7. Type in the values in the respective columns. Gender is a discrete dichotomous variable and SPSS needs a code to reference the values. We code male values with 0 and female values with 1. Any two values can be chosen for coding the data.

7.5.4.3 Analyze Your Data

As shown in Figure 7.8, use the pull-down menus to choose “Analyze,” “Correlate,” and “Bivariate… .”

Use the arrow button near the middle of the window to place both variables with your data values in the box labeled “Variables:” as shown in Figure 7.9. In the “Correlation Coefficients” box, “Pearson” should remain checked since the Pearson product-moment correlation will perform an approximate point-biserial correlation. Finally, click “OK” to perform the analysis.

7.5.4.4 Interpret the Results from the SPSS Output Window

The output table (see SPSS Output 7.2) provides the Pearson product-moment correlation coefficient (r = 0.657). This correlation coefficient is approximately equal to the point-biserial correlation coefficient. It also returns the number of pairs (n = 17) and the two-tailed significance (p = 0.004).

Based on the results from SPSS, the point-biserial correlation coefficient was significant (r_pb₍₁₅₎ = 0.657, p < 0.05). Based on these data, we can state that there is a strong relationship between gender and visual detail recognition (as measured by the posttest).

7.5.5 Sample Point-Biserial Correlation (Large Data Samples)

A colleague of the researcher from the previous example wished to replicate the study investigating gender differences. As before, he compared male and female ability to recognize and remember visual details. He used 26 participants (14 males and 12 females) who were initially unaware of the actual experiment. Table 7.10 shows the participants' genders and posttest scores.

TABLE 7.10

Participant	Gender	Posttest score
1	M	6
2	M	15
3	M	8
4	M	10
5	M	6
6	M	12
7	M	7
8	M	13
9	M	13
10	M	10
11	M	18
12	M	23
13	M	17
14	M	20
15	F	14
16	F	26
17	F	14
18	F	11
19	F	29
20	F	20
21	F	15
22	F	18
23	F	9
24	F	14
25	F	21
26	F	22

We will once again use a point-biserial correlation. However, we will use a large sample approximation to examine the results for significance since the sample size is large.

7.5.5.1 State the Null and Research Hypothesis

The null hypothesis is

H_O: ρ_pb = 0

The research hypothesis is

H_A: ρ_pb ≠ 0

7.5.5.2 Set the Level of Risk (or the Level of Significance) Associated with the Null Hypothesis

As in the previous example, we will use α = 0.05.

7.5.5.3 Choose the Appropriate Test Statistic

7.5.5.4 Compute the Test Statistic

First, compute the standard deviation of all values from the interval scale data. Organize the data to manage the summations (see Table 7.11):

$c7-math-5022$

$c7-math-5023$

$c7-math-5024$

$c7-math-5025$

$c7-math-5026$

Next, compute the means and proportions of the values associated with each item from the dichotomous variable. The mean males' posttest score was

$c7-math-5027$

The mean females' posttest score was

$c7-math-5028$

The males' proportion was

$c7-math-5029$

The females' proportion was

$c7-math-5030$

Now, compute the point-biserial correlation coefficient using the values computed earlier:

$c7-math-5031$

$c7-math-5032$

Since our number of values is large, we will use a large sample approximation to examine the obtained value for significance. We will find a z-score for our data using an approximation to the normal distribution:

$c7-math-5033$

$c7-math-5034$

$c7-math-5035$
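
The large sample approximation can be sketched in Python with the scores from Table 7.10. The sketch assumes the same form of the point-biserial formula as before (standard deviation with n − 1 in the denominator) and assumes the large-sample z-score is z* = r√(n − 1). Rounding |r_pb| to three places before computing z* is also an assumption, made so that the result lines up with the hand calculation.

```python
import math

# Posttest scores from Table 7.10.
males = [6, 15, 8, 10, 6, 12, 7, 13, 13, 10, 18, 23, 17, 20]
females = [14, 26, 14, 11, 29, 20, 15, 18, 9, 14, 21, 22]
scores = males + females
n = len(scores)

mean_all = sum(scores) / n
s = math.sqrt(sum((x - mean_all) ** 2 for x in scores) / (n - 1))

mean_m = sum(males) / len(males)
mean_f = sum(females) / len(females)
p_m, p_f = len(males) / n, len(females) / n

r_pb = (mean_m - mean_f) / s * math.sqrt(p_m * p_f)

# Round |r_pb| to three places, then form the large-sample z-score.
r_abs = round(abs(r_pb), 3)
z_star = r_abs * math.sqrt(n - 1)

# Two-tailed test at alpha = 0.05 uses the critical region |z*| > 1.96.
reject = abs(z_star) > 1.96
print(f"|r_pb| = {r_abs:.3f}, z* = {z_star:.3f}, reject H_O: {reject}")
```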

7.5.5.5 Determine the Value Needed for Rejection of the Null Hypothesis Using the Appropriate Table of Critical Values for the Particular Statistic

Table B.1 in Appendix B is used to establish the critical region of z-scores. For a two-tailed test with α = 0.05, we must not reject the null hypothesis if −1.96 ≤ z* ≤ 1.96.

7.5.5.6 Compare the Obtained Value with the Critical Value

Notice that z* is in the positive tail of the distribution (2.055 > 1.96). Therefore, we reject the null hypothesis. This suggests that the correlation between gender and visual detail recognition is real.

7.5.5.7 Interpret the Results

We rejected the null hypothesis, suggesting that there is a significant and moderate correlation between gender and visual detail recognition.

7.5.5.8 Reporting the Results

For this example, a researcher replicated a study that compared male and female ability to recognize and remember visual details. Fourteen males (n_M = 14) and 12 females (n_F = 12) participated in the experiment. The researcher measured participants' visual detail recognition with a 30-question test requiring participants to recall details in a room they had occupied. A point-biserial correlation produced significant results (r_pb₍₂₄₎ = 0.411, p < 0.05). These data suggest that there is a moderate relationship between gender and visual detail recognition. Moreover, the mean scores on the detail recognition test indicate that males ( $c7-math-5036$ ) recalled fewer details, while females ( $c7-math-5037$ ) recalled more details.

7.5.6 Sample Biserial Correlation (Small Data Samples)

A graduate anthropology department at a university wished to determine if its students' grade point averages (GPAs) can be used to predict performance on the department's comprehensive exam required for graduation. The comprehensive exam is graded on a pass/fail basis. Sixteen students participated in the comprehensive exam last year. Five of the students failed the exam. The GPAs and the exam performance of the students are displayed in Table 7.12.

TABLE 7.12

Participant	Exam performance	GPA
1	F	3.5
2	F	3.4
3	F	3.3
4	F	3.2
5	F	3.6
6	P	4.0
7	P	3.6
8	P	4.0
9	P	4.0
10	P	3.8
11	P	3.9
12	P	3.9
13	P	4.0
14	P	3.8
15	P	3.5
16	P	3.6

Exam performance is a continuous dichotomous variable and GPA is an interval scale variable. Therefore, we will use a biserial correlation.

7.5.6.1 State the Null and Research Hypothesis

The null hypothesis states that there is no correlation between student GPA and comprehensive exam performance. The research hypothesis states that there is a correlation between student GPA and comprehensive exam performance.

The null hypothesis is

H_O: ρ_b = 0

The research hypothesis is

H_A: ρ_b ≠ 0

7.5.6.2 Set the Level of Risk (or the Level of Significance) Associated with the Null Hypothesis

As in the previous examples, we will use α = 0.05.

7.5.6.3 Choose the Appropriate Test Statistic

As stated earlier, we decided to analyze the relationship between the two variables. A correlation will provide the relative strength of the relationship between the two variables. Exam performance is a continuous dichotomous variable and GPA is an interval scale variable. Therefore, we will use a biserial correlation.

7.5.6.4 Compute the Test Statistic

First, compute the standard deviation of all values from the interval scale data. Organize the data to manage the summations (see Table 7.13):

$c7-math-5038$

$c7-math-5039$

$c7-math-5040$

$c7-math-5041$

$c7-math-5042$

Next, compute the means and proportions of the values associated with each item from the dichotomous variable. The mean GPA of the exam failures was

$c7-math-5043$

The mean GPA of the ones who passed the exam was

$c7-math-5044$

The proportion of exam failures was

$c7-math-5045$

The proportion of the ones who passed the exam was

$c7-math-5046$

Now, determine the height of the unit normal curve ordinate, y, at the point dividing P_p and P_q. We could read y directly from a table of values for the normal distribution, such as Table B.1 in Appendix B. However, we will compute the value with Formula 7.12. Table B.1 still provides the z-score at the point dividing P_p and P_q, z = 0.49:

$c7-math-5047$

$c7-math-5048$

$c7-math-5049$

Now, compute the biserial correlation coefficient using the values computed earlier:

$c7-math-5050$

The sign on the correlation coefficient is dependent on the order we managed our dichotomous variable. A quick inspection of the variable means indicates that the GPA of the failures was smaller than the GPA of the ones who passed. Therefore, we should convert the biserial correlation coefficient to a positive value:

$c7-math-5051$
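
The biserial calculation can be sketched in Python with the GPAs from Table 7.12. The sketch takes z = 0.49 from the text, assumes the ordinate is the standard normal density at z, and assumes Formula 7.11 has the form r_b = ((x̄_p − x̄_q)/s_x)·(P_p P_q/y) with n − 1 in the denominator of the standard deviation. The variable names are hypothetical.

```python
import math

# GPAs from Table 7.12, grouped by exam performance.
failed = [3.5, 3.4, 3.3, 3.2, 3.6]
passed = [4.0, 3.6, 4.0, 4.0, 3.8, 3.9, 3.9, 4.0, 3.8, 3.5, 3.6]
gpas = failed + passed
n = len(gpas)

mean_all = sum(gpas) / n
s_x = math.sqrt(sum((x - mean_all) ** 2 for x in gpas) / (n - 1))

mean_failed = sum(failed) / len(failed)
mean_passed = sum(passed) / len(passed)
p_p, p_q = len(failed) / n, len(passed) / n

# z-score at the point dividing P_p and P_q, as given in the text.
z = 0.49
y = math.exp(-z ** 2 / 2) / math.sqrt(2 * math.pi)

r_b = (mean_failed - mean_passed) / s_x * (p_p * p_q / y)
print(f"y = {y:.4f}, |r_b| = {abs(r_b):.3f}")
```

The raw coefficient is negative only because the failing group was entered as the first category. Taking the absolute value matches the conversion to a positive value described above.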

7.5.6.5 Determine the Value Needed for Rejection of the Null Hypothesis Using the Appropriate Table of Critical Values for the Particular Statistic

Table B.8 in Appendix B lists critical values for the Pearson product-moment correlation coefficient. The table requires the degrees of freedom and df = n − 2. In this study, n = 16 and df = 16 − 2. Therefore, df = 14. Since we are conducting a two-tailed test and α = 0.05, the critical value is 0.497.

7.5.6.6 Compare the Obtained Value with the Critical Value

The critical value for rejecting the null hypothesis is 0.497 and the obtained value is |r_b| = 0.972. If the critical value is less than or equal to the obtained value, we must reject the null hypothesis. If instead, the critical value is greater than the obtained value, we must not reject the null hypothesis. Since the critical value is less than the absolute value of the obtained value, we reject the null hypothesis.

7.5.6.7 Interpret the Results

We rejected the null hypothesis, suggesting that there is a significant and very strong correlation between student GPA and comprehensive exam performance.

7.5.6.8 Reporting the Results

The reporting of results for the biserial correlation should include such information as the number of participants (n), the two variables that are being correlated, the correlation coefficient (r_b), the degrees of freedom (df), the p-value's relation to α, and the mean of the continuous variable for each category of the dichotomous variable.

For this example, a researcher compared the GPAs of graduate anthropology students who passed their comprehensive exam with students who failed the exam. Five students failed the exam (n_F = 5) and 11 students passed it (n_P = 11). The researcher compared student GPA and comprehensive exam performance. A biserial correlation produced significant results (r_b₍₁₄₎ = 0.972, p < 0.05). The data suggest that there is an especially strong relationship between student GPA and comprehensive exam performance. Moreover, the mean GPA of the failing students ( $c7-math-5052$ ) and passing students ( $c7-math-5053$ ) indicates that the relationship is a direct correlation.

7.5.7 Performing the Biserial Correlation Using SPSS

SPSS does not compute the biserial correlation coefficient. As a workaround, Field (2005) has suggested using SPSS to perform a Pearson product-moment correlation (as described earlier) and then applying Formula 7.13. However, this procedure will only produce an approximation of the biserial correlation coefficient, so we recommend that you use a spreadsheet with the procedure we described for the sample biserial correlation.

Relative class rank	Fifth-year salary ($)
1	83,450
2	67,900
3	89,000
4	80,500
5	91,000
6	55,440
7	101,300
8	50,560
9	76,050

Participant	Number of visits	Mean heart rate
1	5	96
2	12	63
3	7	78
4	14	66
5	3	79
6	8	95
7	15	67
8	12	64
9	2	99
10	16	62
11	12	65
12	7	76
13	17	61

Participant	Gender	Posttest score
1	M	7
2	M	19
3	M	8
4	M	10
5	M	7
6	M	15
7	M	6
8	M	13
9	F	14
10	F	11
11	F	18
12	F	23
13	F	17
14	F	20
15	F	14
16	F	24
17	F	22

Participant	Gender	Posttest score
1	M	6
2	M	15
3	M	8
4	M	10
5	M	6
6	M	12
7	M	7
8	M	13
9	M	13
10	M	10
11	M	18
12	M	23
13	M	17
14	M	20
15	F	14
16	F	26
17	F	14
18	F	11
19	F	29
20	F	20
21	F	15
22	F	18
23	F	9
24	F	14
25	F	21
26	F	22

Participant	Exam performance	GPA
1	F	3.5
2	F	3.4
3	F	3.3
4	F	3.2
5	F	3.6
6	P	4.0
7	P	3.6
8	P	4.0
9	P	4.0
10	P	3.8
11	P	3.9
12	P	3.9
13	P	4.0
14	P	3.8
15	P	3.5
16	P	3.6

Average survey score	Years of service
4.0	18
4.0	15
2.4	2
4.2	13
3.4	4
4.0	10
5.0	24
1.8	4
3.2	9
2.5	5
2.5	3
3.0	8
3.6	16
4.6	14
4.8	12

Participant	Gender	Posttest score
1	M	44
2	M	30
3	M	50
4	M	33
5	M	37
6	M	35
7	M	36
8	F	29
9	F	39
10	F	33
11	F	50
12	F	45
13	F	37
14	F	30
15	F	34
16	F	50

Participant	Poverty level	Survey score
1	Above	15
2	Above	19
3	Above	15
4	Above	20
5	Above	7
6	Above	12
7	Above	3
8	Above	15
9	Below	9
10	Below	5
11	Below	13
12	Below	13
13	Below	11
14	Below	10
15	Below	8
16	Below	9
17	Below	10
18	Below	17
