9 Time to event studies

DAVID MACHIN, MARTIN J GARDNER

It is common in follow-up studies to be concerned with the survival time between the time of entry to the study and a subsequent event.¹ The event may be death in a study of cancer, the disappearance of pain in a study comparing different steroids in arthritis, or the return of ovulation after stopping a long-acting method of contraception. These studies often generate some so-called “censored” observations of survival time. Such an observation would occur, for example, on any patient who is still alive at the time of analysis in a randomised trial where death is the end point. In this case the time from allocation to treatment to the latest follow-up visit would be the patient’s censored survival time.

The Kaplan–Meier product limit technique is the recognised approach for calculating survival curves in such studies.²,³ An outline of this method is given here. Details are given of how to calculate a confidence interval for the population value of the survival proportion at any time during follow-up and for the median survival. Confidence interval calculations are also described for the difference in survival between two groups, as expressed by the difference in survival proportions, and for the hazard ratio between groups which summarises, for example, the relative death or relapse rate.

In some circumstances, the comparison between groups is adjusted for prognostic variables by means of Cox regression.³ In this case the confidence interval describing the difference between the groups is adjusted for the relevant prognostic variable.

In the context of survival comparisons, confidence intervals convey only the effects of sampling variation on the precision of the estimated statistics and cannot control for any non-sampling errors such as bias in the selection of patients or in losses to follow up.

Survival proportions

Single sample

Suppose that the survival times after entry to the study (ordered by increasing duration) of a group of n subjects are t₁, t₂, t₃, …, t_n. The proportion of subjects surviving beyond any follow-up time t, often referred to as S(t) but here denoted p for brevity, is estimated by the Kaplan–Meier technique as

$$p = \prod \frac{r_i - d_i}{r_i}$$

where r_i is the number of subjects alive just before time t_i (the ith ordered survival time), d_i denotes the number who died at time t_i and ∏ indicates multiplication over each time a death occurs up to and including time t.

The standard error (SE) of p is given by

$$\mathrm{SE}(p) = \sqrt{\frac{p(1 - p)}{n_{\text{effective}}}}$$

where n_effective is the “effective” sample size at time t. When there are no censored survival times, n_effective will be equal to n, the total number of subjects in the study group. When censored observations are present, the effective sample size is calculated each time a death occurs.⁴

The 100(1 – α)% confidence interval for the population value of the survival proportion p at time t is then calculated as

$$p - z_{1-\alpha/2}\,\mathrm{SE}(p) \quad\text{to}\quad p + z_{1-\alpha/2}\,\mathrm{SE}(p)$$

where z_1–α/2 is the appropriate value from the standard Normal distribution for the 100(1 – α/2) percentile. Thus for a 95% confidence interval α = 0·05 and Table 18.1 gives z_1–α/2 = 1·96.

There are other and more complex alternatives for the calculation of the SE given here including that of Greenwood⁵ but, except in situations with very small numbers, these will lead to similar confidence intervals.³

The times at which to estimate survival proportions and their confidence intervals should be determined in advance of the results. They can be chosen according to practical convention— for example, the five-year survival proportions which are often quoted in cancer studies—or according to previous similar studies.
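The product-limit calculation described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `kaplan_meier` and the data are hypothetical (they are not taken from Table 9.1), and a subject censored at the same time as a death is assumed to be still at risk at that time.

```python
from collections import Counter

def kaplan_meier(times, died):
    """Return [(t, p)]: the product-limit estimate p just after each death time t.

    times: follow-up time of each subject
    died:  True if the subject died at that time, False if censored
    """
    deaths = Counter(t for t, d in zip(times, died) if d)
    curve = []
    p = 1.0
    for t in sorted(deaths):
        r = sum(1 for u in times if u >= t)  # r_i: subjects alive just before t_i
        d = deaths[t]                        # d_i: deaths at t_i
        p *= (r - d) / r                     # multiply in the factor (r_i - d_i) / r_i
        curve.append((t, p))
    return curve

# Hypothetical follow-up times (months) and death indicators.
times = [2, 3, 3, 5, 8, 8, 10, 12]
died = [True, True, False, True, False, True, True, False]

for t, p in kaplan_meier(times, died):
    print(f"t = {t:>2}  p = {p:.4f}")
```

The curve is only updated at times when a death occurs; censored observations affect it solely through the number at risk, r_i.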

Worked example

Consider the survival experience of the 25 patients randomly assigned to receive γ-linolenic acid for the treatment of colorectal cancer of Dukes’s stage C.⁶ The ordered survival times (t), the calculated survival proportions (p), and the effective sample sizes (n_effective) are shown in Table 9.1.

The data come from a comparative trial, but it may be of interest to quote the two-year survival proportion and its confidence interval for the group receiving γ-linolenic acid. The survival proportion to any follow-up time is taken from the entries in the table for that time if available or otherwise for the time immediately preceding. Thus for two years, t = 24 months, the survival proportion is p = S(24) = 0·5498. The corresponding effective sample size is n_effective = 16.

Table 9.1 Survival data by month for 49 patients with Dukes’s C colorectal cancer randomly assigned to receive either γ-linolenic acid or control treatment⁶

The standard error of this survival proportion is

$$\mathrm{SE}(p) = \sqrt{\frac{0·5498 \times (1 - 0·5498)}{16}} = 0·1244.$$

The 95% confidence interval for the population value of the survival proportion is then given by

$$0·5498 - (1·96 \times 0·1244) \quad\text{to}\quad 0·5498 + (1·96 \times 0·1244)$$

that is, from 0·31 to 0·79.

The estimated percentage of survivors to two years is thus 55% with a 95% confidence interval of 31% to 79%.
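The arithmetic of this worked example can be checked with a short Python sketch. It assumes the standard error takes the form √[p(1 – p)/n_effective], which reproduces the figures quoted above; the function name `survival_ci` is illustrative.

```python
import math

def survival_ci(p, n_effective, z=1.96):
    """SE and confidence limits for a survival proportion p.

    Assumes SE(p) = sqrt(p * (1 - p) / n_effective).
    """
    se = math.sqrt(p * (1 - p) / n_effective)
    return se, p - z * se, p + z * se

# Two-year survival in the gamma-linolenic acid group (values from the text).
se, lower, upper = survival_ci(0.5498, 16)
print(f"SE = {se:.4f}, 95% CI {lower:.2f} to {upper:.2f}")
```

With p = 0·5498 and n_effective = 16 this gives a standard error of 0·1244 and limits of 0·31 and 0·79.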

Median survival time

Single sample

If there are no censored observations, for example, if all the patients in a clinical trial have died, then the median survival time, M, is estimated by the middle observation (see also chapter 5) of the ordered survival times t₁, t₂, …, t_n if the number of observations n is odd, and by the average of t_{n/2} and t_{n/2 + 1} if n is even. Thus

$$M = t_{(n+1)/2} \ \text{if } n \text{ is odd}; \qquad M = \tfrac{1}{2}\left(t_{n/2} + t_{n/2+1}\right) \ \text{if } n \text{ is even}.$$

Worked example

If we ignore the fact that there are censored observations in Table 9.1 and therefore consider all the patients to have died, then the median survival time of the 25 patients receiving γ-linolenic acid is the 13th ordered observation or M = 12 months. Making the same assumption for the 24 patients of the control group the median is the average of the 12th and 13th ordered survival times, that is M = (16 + 18)/2 = 17 months.
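For uncensored data the rule just described amounts to the following Python sketch. The function name `median_uncensored` and the survival times are hypothetical.

```python
def median_uncensored(times):
    """Median of fully observed (uncensored) survival times."""
    t = sorted(times)
    n = len(t)
    if n % 2 == 1:
        return t[(n + 1) // 2 - 1]          # the (n+1)/2-th ordered value
    return (t[n // 2 - 1] + t[n // 2]) / 2  # mean of the n/2-th and (n/2 + 1)-th

print(median_uncensored([6, 1, 12, 20, 9]))      # odd n: middle observation
print(median_uncensored([6, 1, 12, 20, 9, 30]))  # even n: average of the middle two
```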

In the presence of censored survival times the median survival is estimated by first calculating the Kaplan–Meier survival curve, then finding the value of t that satisfies the equation

$$S(M) = 0·5.$$

This can be done by extending a horizontal line from p = 0·5 (or 50%) on the vertical axis of the Kaplan–Meier survival curve, until the actual curve is met, then moving vertically down from that point to cut the horizontal time axis at t = M, which is the estimated median survival time.
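The graphical procedure corresponds to finding the first time at which the stepped curve falls to 0·5 or below. A minimal Python sketch follows, in which the curve is represented as a list of (time, proportion) pairs; both the function name `median_from_curve` and the curve values are hypothetical.

```python
def median_from_curve(curve, level=0.5):
    """Return the first time at which the step curve falls to `level` or below.

    curve: [(t, p)] pairs ordered by increasing time t.
    """
    for t, p in curve:
        if p <= level:
            return t
    return None  # the curve never reaches the level, so the median is not estimable

# Hypothetical Kaplan-Meier curve: (time in months, survival proportion).
curve = [(2, 0.875), (3, 0.75), (5, 0.6), (8, 0.45), (10, 0.225)]
print(median_from_curve(curve))
```

When more than half of the subjects are still alive at the end of follow-up the curve does not reach 0·5, and the sketch returns `None` rather than a median.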

The calculations required for the confidence interval of a median are quite complicated, as is the explanation of how they are derived.⁷ The expression for the standard error of the median includes SE(p) described above but evaluated at p = S(M) = 0·5. When p = 0·5,

$$\mathrm{SE}(p) = \sqrt{\frac{0·5 \times 0·5}{n_{\text{effective}}}} = \frac{0·5}{\sqrt{n_{\text{effective}}}}.$$

The standard error of the median is given by

$$\mathrm{SE}(M) = \mathrm{SE}(p) \times \frac{t_{\text{small}} - t_{\text{large}}}{p_{\text{large}} - p_{\text{small}}}$$

where t_small is the smallest observed survival time from the Kaplan–Meier curve for which p is less than or equal to 0·45, while t_large is the largest observed survival time from the Kaplan–Meier curve for which p is greater than 0·55. The ratio [(t_small – t_large)/(p_large – p_small)] in the above expression estimates the reciprocal of the height of the distribution of survival times at the median. Just as the blood pressure values of chapter 4 have a distribution, in that case taking the Normal distribution form, survival times will also have an underlying distribution of some form. The values of 0·45 and 0·55 are chosen at each side of the median of 0·5 to define “small” and “large” and are arbitrary. Should p_large = p_small then the two values will need to be chosen wider apart. They may be chosen closer to 0·5 for large study sizes.

The 100(1 – α)% confidence interval for the population value of the median survival M is then calculated as

$$M - z_{1-\alpha/2}\,\mathrm{SE}(M) \quad\text{to}\quad M + z_{1-\alpha/2}\,\mathrm{SE}(M)$$

where z_{1– α/2} is obtained from Table 18.1.

However, we must caution against the uncritical use of this method for small data sets as the value of SE(M) is unreliable in such circumstances, and also the values of t_small and t_large will be poorly determined.

Worked example

The Kaplan–Meier survival curve for the control patients of Table 9.1 is shown in Figure 9.1 and the hatched line indicates how the median is estimated. This gives M = 30 months. (We note that this is quite different from the incorrect value given in the illustrative example above.)

Figure 9.1 Kaplan–Meier estimate of the survival curve of 24 patients with Dukes’s C colorectal cancer.⁶

The effective sample size at 30 months is n_effective = 14 so that

$$\mathrm{SE}(p) = \frac{0·5}{\sqrt{14}} = 0·1336.$$

Reading from Table 9.1 at p_small = 0·3852 < 0·45 gives t_small = 30 months also, and for p_large = 0·5870 > 0·55 gives t_large = 20 months.

Thus

$$\mathrm{SE}(M) = 0·1336 \times \frac{30 - 20}{0·5870 - 0·3852} = 6·62.$$

The 95% confidence interval is therefore

$$30 - (1·96 \times 6·62) \quad\text{to}\quad 30 + (1·96 \times 6·62)$$

that is, from 17·0 to 43·0 months.

The estimated median survival is 30 months with a 95% confidence interval of 17 to 43 months. For the γ-linolenic acid group M = 32 months and SE(M) = 14·18.
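The calculation for the control group can be retraced with the Python sketch below. It assumes that SE(M) is the product of SE(p) at p = 0·5 and the ratio (t_small – t_large)/(p_large – p_small), a form that reproduces the interval of 17·0 to 43·0 months quoted above; the function name `median_ci` is illustrative.

```python
import math

def median_ci(m, n_effective, t_small, t_large, p_small, p_large, z=1.96):
    """SE and confidence limits for a median survival time m.

    Assumes SE(M) = SE(p at 0.5) * (t_small - t_large) / (p_large - p_small).
    """
    se_p = 0.5 / math.sqrt(n_effective)  # SE(p) evaluated at p = 0.5
    se_m = se_p * (t_small - t_large) / (p_large - p_small)
    return se_m, m - z * se_m, m + z * se_m

# Control group values quoted in the text.
se_m, lower, upper = median_ci(30, 14, t_small=30, t_large=20,
                               p_small=0.3852, p_large=0.5870)
print(f"SE(M) = {se_m:.2f}, 95% CI {lower:.1f} to {upper:.1f} months")
```

Because the result depends on only two points of the curve on either side of the median, small changes in t_small or t_large can alter SE(M) appreciably, which is the reason for the caution given above about small data sets.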

Two samples

The difference between survival proportions at any time t in two study groups of sample sizes n₁ and n₂ is measured by p₁ – p₂, where p₁ = S₁(t) and p₂ = S₂(t) are the survival proportions at time t in groups 1 and 2 respectively.

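The standard error of p₁ – p₂ is

$$\mathrm{SE}(p_1 - p_2) = \sqrt{\frac{p_1(1 - p_1)}{n_{\text{effective},1}} + \frac{p_2(1 - p_2)}{n_{\text{effective},2}}}$$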

where n_{effective, 1} and n_{effective, 2} are the effective sample sizes at time t in each group.

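The 100(1 – α)% confidence interval for the population value of p₁ – p₂ is

$$(p_1 - p_2) - z_{1-\alpha/2}\,\mathrm{SE}(p_1 - p_2) \quad\text{to}\quad (p_1 - p_2) + z_{1-\alpha/2}\,\mathrm{SE}(p_1 - p_2)$$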

where z_1–α/2 is obtained from Table 18.1.

Worked example

The survival experience of the patients receiving γ-linolenic acid and the controls can be compared from the results given in Table 9.1. At two years, for example, p₁ = 0·5498 and p₂ = 0·5136 with n_{effective, 1} = 16 and n_{effective, 2} = 17. The estimated difference in two-year survival proportions is thus 0·5498 – 0·5136 = 0·0363.

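The standard error of this difference in survival proportions is

$$\mathrm{SE}(p_1 - p_2) = \sqrt{\frac{0·5498 \times 0·4502}{16} + \frac{0·5136 \times 0·4864}{17}} = 0·1737.$$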

The 95% confidence interval for the population value of the difference in two-year survival proportions is then given by

$$0·0363 - (1·96 \times 0·1737) \quad\text{to}\quad 0·0363 + (1·96 \times 0·1737)$$

that is, from –0·30 to 0·38.

Thus the study estimate of the increased survival proportion at two years for the patients given γ-linolenic acid compared with the control group is only about 4%. Moreover, the imprecision in the estimate from this small study is indicated by the 95% confidence interval ranging from –30% to +38%.
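The two-group comparison can be checked in the same way. The sketch assumes the standard error of the difference is the square root of the sum of the two squared single-sample standard errors, each of the form p(1 – p)/n_effective; the function name `difference_ci` is illustrative.

```python
import math

def difference_ci(p1, n1, p2, n2, z=1.96):
    """SE and confidence limits for p1 - p2 from effective sample sizes n1, n2."""
    diff = p1 - p2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return diff, se, diff - z * se, diff + z * se

# Two-year survival proportions and effective sample sizes from the text.
diff, se, lower, upper = difference_ci(0.5498, 16, 0.5136, 17)
print(f"difference = {diff:.4f}, SE = {se:.4f}")
print(f"95% CI {lower:.2f} to {upper:.2f}")
```

Using the rounded proportions as printed, the difference evaluates to 0·0362 rather than 0·0363; the confidence limits agree with those in the text to two decimal places.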

Difference between median survival times

The difference between the median survival times in two study groups of sample sizes n₁ and n₂ is measured by M₁ – M₂, where M₁ and M₂ are the medians in groups 1 and 2 respectively. The standard error of M₁ – M₂ is

$$\mathrm{SE}(M_1 - M_2) = \sqrt{[\mathrm{SE}(M_1)]^2 + [\mathrm{SE}(M_2)]^2}$$

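The 100(1 – α)% confidence interval for the population value of M₁ – M₂ is

$$(M_1 - M_2) - z_{1-\alpha/2}\,\mathrm{SE}(M_1 - M_2) \quad\text{to}\quad (M_1 - M_2) + z_{1-\alpha/2}\,\mathrm{SE}(M_1 - M_2)$$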

where z_1–α/2 is obtained from Table 18.1.

Worked example

The median survival experiences of the patients receiving γ-linolenic acid and the controls can be compared from the results given in Table 9.1. Thus M₁ = 32 and M₂ = 30 months, a difference of M₁ – M₂ = 2 months. The standard error of this difference is estimated by

$$\mathrm{SE}(M_1 - M_2) = \sqrt{14·18^2 + 6·62^2} = 15·65.$$

The 95% confidence interval for the population value of the difference in medians is then given by

$$2 - (1·96 \times 15·65) \quad\text{to}\quad 2 + (1·96 \times 15·65)$$

that is, from –28·7 to 32·7 months.

Thus the study estimate of the increased median survival for the patients given γ-linolenic acid compared with the control group is only 2 months. Moreover, the imprecision in the estimate from this small study is indicated by the 95% confidence interval ranging from –29 to +33 months.
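A short Python sketch of this comparison follows. It assumes the two median estimates are independent, so that their squared standard errors add; the function name `median_difference_ci` is illustrative.

```python
import math

def median_difference_ci(m1, se1, m2, se2, z=1.96):
    """SE and confidence limits for M1 - M2, assuming independent groups."""
    diff = m1 - m2
    se = math.sqrt(se1 ** 2 + se2 ** 2)
    return diff, se, diff - z * se, diff + z * se

# Medians and standard errors quoted in the text (months).
diff, se, lower, upper = median_difference_ci(32, 14.18, 30, 6.62)
print(f"difference = {diff}, SE = {se:.2f}")
print(f"95% CI {lower:.1f} to {upper:.1f} months")
```

The larger of the two standard errors (14·18 for the γ-linolenic acid group) dominates the width of the interval.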

The hazard ratio

In a follow-up study of two groups the ratio of failure rates (for example, death or relapse rates) is termed the “hazard ratio”. It is a common measure of the relative effect of treatments or exposures. If O₁ and O₂ are the total numbers of deaths observed in the two groups then the corresponding expected numbers of deaths (E₁ and E₂), assuming an equal risk of dying at each time in both groups, may be calculated as

$$E_1 = \sum \frac{r_{1i}\, d_i}{r_i} \quad\text{and}\quad E_2 = \sum \frac{r_{2i}\, d_i}{r_i}.$$

Here r_1i and r_2i are the numbers of subjects alive and not censored in groups 1 and 2 just before time t_i with r_i = r_1i + r_2i; d_i = d_1i + d_2i is the number who died at time t_i in the two groups combined; and ∑ indicates addition over each time of death.

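One estimator of the hazard ratio (HR) is (O₁/E₁)/(O₂/E₂) although, for technical reasons, the more complex estimator

$$HR = \exp\left[\frac{O_1 - E_1}{V}\right]$$

where

$$V = \sum \frac{r_{1i}\, r_{2i}\, d_i\, (r_i - d_i)}{r_i^2\, (r_i - 1)}$$

is more appropriate.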

To obtain a 100(1 – α)% confidence interval for the population value of the hazard ratio one first calculates the two quantities

$$X = \frac{O_1 - E_1}{V} \quad\text{and}\quad Y = \frac{z_{1-\alpha/2}}{\sqrt{V}}$$

where z_1–α/2 is the appropriate value from the standard Normal distribution for the 100(1 – α/2) percentile (see Table 18.1). Thus for a 95% confidence interval α = 0·05 and z_1–α/2 = 1·96.

The hazard ratio can then be estimated by HR = e^X and the confidence interval for the hazard ratio by⁸

$$e^{X - Y} \quad\text{to}\quad e^{X + Y}.$$

The hazard ratio calculated from (O₁/E₁)/(O₂/E₂) will be close to e^X except in unusual data sets.
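The observed and expected numbers of deaths can be accumulated over the death times as in the Python sketch below. Several things here are assumptions rather than part of the text: the function name `logrank_components`, the data (which are hypothetical and not from Table 9.1), and the form of V, taken to be the usual log-rank variance ∑ r_1i r_2i d_i (r_i – d_i)/[r_i²(r_i – 1)].

```python
def logrank_components(times1, died1, times2, died2):
    """Return (O1, E1, O2, E2, V) accumulated over all distinct death times."""
    all_times = list(times1) + list(times2)
    all_died = list(died1) + list(died2)
    death_times = sorted({t for t, d in zip(all_times, all_died) if d})

    o1 = o2 = 0
    e1 = e2 = v = 0.0
    for t in death_times:
        r1 = sum(1 for u in times1 if u >= t)  # at risk in group 1 just before t
        r2 = sum(1 for u in times2 if u >= t)  # at risk in group 2 just before t
        d1 = sum(1 for u, d in zip(times1, died1) if d and u == t)
        d2 = sum(1 for u, d in zip(times2, died2) if d and u == t)
        r, d = r1 + r2, d1 + d2
        o1 += d1
        o2 += d2
        e1 += r1 * d / r
        e2 += r2 * d / r
        if r > 1:  # the variance term is undefined when one subject remains
            v += r1 * r2 * d * (r - d) / (r ** 2 * (r - 1))
    return o1, e1, o2, e2, v

# Hypothetical follow-up times (months) and death indicators for two groups.
times1, died1 = [3, 5, 7, 9], [True, False, True, True]
times2, died2 = [2, 4, 6, 8], [True, True, False, True]

o1, e1, o2, e2, v = logrank_components(times1, died1, times2, died2)
print(f"O1 = {o1}, E1 = {e1:.3f}, O2 = {o2}, E2 = {e2:.3f}, V = {v:.3f}")
```

A useful check on any such calculation is that O₁ + O₂ equals E₁ + E₂, since the expected deaths at each time are simply the observed deaths shared out in proportion to the numbers at risk.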

Worked example

For the data at the end of the trial, shown in Table 9.1, O₁ = 10, E₁ = 11·37, O₂ = 12, E₂ = 10·63, and V = 4·99.

The values of X and Y, with α = 0·05, are

$$X = \frac{10 - 11·37}{4·99} = -0·2745 \quad\text{and}\quad Y = \frac{1·96}{\sqrt{4·99}} = 0·8774.$$

The hazard ratio is thus estimated as

$$HR = e^{-0·2745} = 0·76.$$

The 95% confidence interval for the population value of the hazard ratio is then given by

$$e^{-0·2745 - 0·8774} \quad\text{to}\quad e^{-0·2745 + 0·8774}$$

that is, from 0·32 to 1·83.

The results indicate that treatment with γ-linolenic acid has been associated with an estimated reduction in mortality to 76% of that for the control treatment, while the alternative hazard ratio calculation gives a similar figure of 78%. The reduction, however, is imprecisely estimated as shown by the wide confidence interval of 32% to 183%.
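The figures in this worked example can be reproduced with the Python sketch below. It assumes X = (O₁ – E₁)/V and Y = z/√V, with the hazard ratio estimated by e^X and its limits by e^(X–Y) and e^(X+Y); the function name `hazard_ratio_ci` is illustrative.

```python
import math

def hazard_ratio_ci(o1, e1, v, z=1.96):
    """Hazard ratio e^X with limits e^(X - Y) and e^(X + Y).

    Assumes X = (O1 - E1) / V and Y = z / sqrt(V).
    """
    x = (o1 - e1) / v
    y = z / math.sqrt(v)
    return math.exp(x), math.exp(x - y), math.exp(x + y)

# Values quoted in the text for the data of Table 9.1.
hr, lower, upper = hazard_ratio_ci(o1=10, e1=11.37, v=4.99)
hr_simple = (10 / 11.37) / (12 / 10.63)  # the simpler estimator (O1/E1)/(O2/E2)

print(f"HR = {hr:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
print(f"simple estimator = {hr_simple:.2f}")
```

The interval is symmetric about X on the logarithmic scale, which is why the limits 0·32 and 1·83 are not equidistant from 0·76.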

When the survival times in the two groups can be assumed to follow exponential distributions, the ratio of the inverses of the two medians provides an estimate of the hazard ratio, that is, HR_median = M₂/M₁. In this case, the approximate confidence interval is given as⁹

where

As noted earlier, O₁ and O₂ are the number of deaths in the respective groups.

Worked example

For the data at the end of the study, shown in Table 9.1, M₁ = 32, M₂ = 30, while O₁ = 10 and O₂ = 12. This gives an estimate of the hazard ratio as 30/32 = 0·9375. This is equivalent to a reduction in mortality of 6%. The corresponding standard error is

The 95% confidence interval for the population value of the hazard ratio is then given by

that is, from 0·9375 – 0·4320 to 0·9375 + 0·4320, or 0·51 to 1·37.

Cox regression

Just as in the situations described in chapter 8 in which the linear regression equation is used for predicting one variable from another, it is often important to relate the outcome measure (here survival time) to other variables. In contrast to the y variable of chapter 8, the comparable variable is time t but with the added complication that this will usually have censored values in some cases. As a consequence, and for quite technical reasons, special methods have been developed for survival time regression.¹⁰ These Cox regression models are then utilised in much the same way as the regression models of chapter 8. In the special case of a comparison between two groups of subjects, the Cox model provides essentially the same estimate of HR and the associated confidence interval as described earlier. The basic assumption is that the risk of failure (death) in one group is the same constant multiple of the other group at any point in the follow-up time.³

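The Cox regression model for the comparison of two groups assumes that the risk of death in the two groups can be respectively described by

$$\lambda_1(t) = \lambda_0(t)\, e^{\beta x_1} \quad\text{and}\quad \lambda_2(t) = \lambda_0(t)\, e^{\beta x_2}.$$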

Here if β = 0 then both groups have the same underlying death rate (hazard), λ₀(t), at each time t, but this rate may change over time. For comparing two groups, it is usual to write x₁ = 1 and x₂ = 0, in which case

$$HR = \frac{\lambda_1(t)}{\lambda_2(t)} = \frac{\lambda_0(t)\, e^{\beta}}{\lambda_0(t)} = e^{\beta}.$$

Since t does not appear in the above expression (e^β) the hazard ratio does not change with time.

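The 100(1 – α)% confidence interval for the population hazard ratio is

$$e^{\beta - z_{1-\alpha/2}\,\mathrm{SE}(\beta)} \quad\text{to}\quad e^{\beta + z_{1-\alpha/2}\,\mathrm{SE}(\beta)}$$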

where SE(β) is obtained from a computer program.

Worked example

For the data of Table 9.1 use of a standard statistical package gives β = –0·2528, with SE(β) = 0·4302. Thus HR_Cox = e^–0·2528 = 0·78. The corresponding 95% confidence interval for the hazard ratio is

$$e^{-0·2528 - (1·96 \times 0·4302)} \quad\text{to}\quad e^{-0·2528 + (1·96 \times 0·4302)}$$

or 0·33 to 1·80.

It is useful to note that the estimate of HR_Cox and the corresponding 95% confidence interval are similar to those given in earlier calculations. They differ somewhat from those corresponding to HR_median for which the assumption of a constant hazard (one that does not change with t) was made within each treatment group.
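Once a package has supplied β and SE(β), the hazard ratio and its interval follow by exponentiation, as in this Python sketch. It does not fit the Cox model itself; it assumes the limits are e^(β – z SE(β)) and e^(β + z SE(β)), and the function name `cox_hr_ci` is illustrative.

```python
import math

def cox_hr_ci(beta, se_beta, z=1.96):
    """Hazard ratio e^beta with limits from beta -/+ z * SE(beta) on the log scale."""
    return (math.exp(beta),
            math.exp(beta - z * se_beta),
            math.exp(beta + z * se_beta))

# Coefficient and standard error quoted in the text.
hr, lower, upper = cox_hr_ci(-0.2528, 0.4302)
print(f"HR_Cox = {hr:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

With β = –0·2528 and SE(β) = 0·4302 this returns 0·78 with limits 0·33 and 1·80, matching the values above.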

In certain circumstances there may be prognostic features of individual patients which may influence their survival and thus may modify the observed difference between groups. In such cases, we wish to compare the groups taking account of (or adjusted for) these variables. This leads to extending the single variable Cox model just described (with one explanatory variable indicating the group) to include also one or more prognostic variables as one may do in other multiple regression situations (see chapter 8). In the context of randomised controlled trials, described in chapter 11, we wish to check whether or not the treatment effect observed, as expressed by the hazard ratio, will be modified after taking account of these prognostic variables.³

1 Bland JM, Altman DG. Time to event (survival) data. BMJ 1997;317:468–9.

2 Altman DG, Bland JM. Survival probabilities (the Kaplan–Meier method). BMJ 1997;317:1572.

3 Parmar MKB, Machin D. Survival analysis: a practical approach. Chichester: John Wiley, 1995:26–40;115–42.

4 Peto J. The calculation and interpretation of survival curves. In: Buyse ME, Staquet MJ, Sylvester RJ (eds). Cancer clinical trials: methods and practice. Oxford: Oxford University Press, 1984:361–80.

5 Greenwood M. The natural duration of cancer. Reports of Public Health and Medical Subjects, 33. London: HMSO, 1926.

6 McIllmurray MB, Turkie W. Controlled trial of γ-linolenic acid in Dukes’s C colorectal cancer. BMJ 1987;294:1260 and 295:475.

7 Collett D. Modelling survival data in medical research. London: Chapman & Hall, 1994: section 2·4.

8 Daly L. Confidence intervals. BMJ 1988;297:66.

9 Altman DG. Practical statistics for medical research. London: Chapman & Hall, 1991:384–5.

10 Cox DR. Regression models and life tables (with discussion). J R Statist Soc Ser B 1972;34:187–220.