The number needed to treat: a clinically useful nomogram in its proper context.

Gilles CHATELLIER, Eric ZAPLETAL, David Lemaître, Joël MENARD and Patrice DEGOULET

This article originally appeared in the British Medical Journal (BMJ 1996; 312:426-9)

Go directly to the Nomogram (Java) - A screen dump of the Nomogram (GIF/9Ko)

Contents
Summary
1. The nomogram for summarising results of therapeutic trials
2. Using the nomogram in the individual patient
2.1. Using the results of an individual trial
2.2. Combining the results of several trials
3. Taking into consideration some limitations of the method
3.1. Extrapolating the number needed to treat to baseline risks not considered in available clinical trials
3.2. Extrapolating the number needed to treat to timepoints not considered in available clinical trials
3.3. Subjective probabilities and the value of numbers
4. Conclusion
References

Summary
The "number needed to treat" is a very meaningful way of expressing the benefit of an active treatment over a control. It can be used either for summarising the results of a therapeutic trial or for individualised medical decision making. In this last setting, it requires time-consuming calculations that impede bedside use. We therefore devised a nomogram that greatly simplifies calculations. Since calculations are now very easy, we propose using the "number needed to treat" to assess the value of several interventions and also warn against several limitations of this number. We particularly warn against using the "number needed to treat" when it is not known whether the relative risk reduction associated with intervention is constant for all levels of risk, or for periods of time longer than those of the original trials.

In most medical fields, the gold standard for the evaluation of the benefit of an active treatment is the randomised controlled trial. However, there are many obstacles to the correct utilisation of clinical trial results. The insufficient diffusion of results may explain between-physician differences, for instance in the awareness of key advances in myocardial infarction (1). Another obstacle is the influence of the way in which the results of therapeutic trials are presented on clinicians' views of the effectiveness of the therapies involved (2,3,4). An informative way of presenting results is the number needed to treat described by Laupacis et al (5). As recently underlined by Cook and Sackett (6), this very simple index is attractive since the meaning of a sentence such as "20 patients needed to be treated to avoid 1 death over a five-year period" is easily understood by both physicians and patients. However, the authors underline that the calculations involved, i.e. multiplying two numbers and then taking the inverse of the result, are cumbersome and can lead to errors. We have therefore devised a nomogram for calculating the number needed to treat. This nomogram could be used alongside the one proposed by Sackett et al (7) in their book for applying Bayes' formula when assessing the informative value of diagnostic tests, either for interpreting the results of a therapeutic trial, or for making therapeutic decisions in the individual patient.

1. The nomogram (Figure): a useful tool for summarising results of therapeutic trials
The number needed to treat of a therapeutic trial is simply the inverse of the absolute benefit of intervention, i.e. the difference between the proportion of events in the control group (Pc) and the proportion of events in the intervention group (Pi):

Number needed to treat = 1 / (Pc - Pi)
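As a minimal sketch of this definition (in Python; the function name `number_needed_to_treat` is an illustrative choice and not part of the original article), the calculation can be written as follows, using the rounded proportions of deaths reported for the SAVE trial in Table 1:

```python
def number_needed_to_treat(p_control: float, p_intervention: float) -> float:
    """Return 1 / (Pc - Pi), the inverse of the absolute benefit."""
    absolute_benefit = p_control - p_intervention
    if absolute_benefit <= 0:
        # Without an absolute benefit the number needed to treat is undefined.
        raise ValueError("Pc must be greater than Pi")
    return 1.0 / absolute_benefit


# Rounded proportions of deaths in the SAVE trial (Table 1): 24.7% vs 20.4%.
nnt_save = number_needed_to_treat(0.247, 0.204)
print(f"{nnt_save:.1f}")  # about 23 patients treated to avoid one death
```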

The absolute benefit of intervention has the most relevance for addressing public health concerns regarding disease frequency reduction attributable to treatment. However, this additive model of treatment effect implies that the risk of disease in the intervention group is equal to the risk in the control group minus a quantity Q which corresponds to the effect of treatment. To use this model in another population with a different value of Pc, it is necessary to assume that Pc and Q are independent, which is most frequently unproven. For example, if the proportions of deaths are 10% and 20% in the treatment and control groups of a therapeutic trial, respectively, Q is equal to 20%-10% = 10%. Clearly, if the proportion of deaths is only 2% in the control group, Q cannot be equal to 10%. Therefore, although it has been largely debated, a multiplicative model based on the use of the relative risk or of the odds ratio appears to be more appropriate when trying to measure the strength of an effect (8). These ratios summarise the treatment effect and are frequently assumed to be independent of the value of Pi and Pc. In fact, this assumption of a constant relative treatment effect should be carefully verified in every trial by stratifying patients according to their baseline risk. Meta-analysis can also be used for this purpose. Examples are given later in this paper. The most widely used measure of effect in therapeutic trials is the relative risk:

Relative risk = Pi / Pc

The odds ratio is another measure frequently used in multiplicative models. It can be obtained through the following formula:

Odds ratio = (Pi / (1 - Pi)) / (Pc / (1 - Pc))

The odds ratio approximates the relative risk only when the probability of end-points is lower than 10%. Above this threshold, the odds ratio will lie further from 1 than the relative risk and will therefore overstate the treatment effect. It is easy to verify the "lower than 10%" rule, or to derive the relative risk from the odds ratio by using the following formula:

Relative risk = Odds ratio / (1 + Pc * (Odds ratio - 1))
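The three ratios can be checked against each other with a short sketch such as the one below. It is written in Python under the assumption that Pc and Pi are given as proportions between 0 and 1; the function names are illustrative, and the input values are the rounded proportions of deaths of the SAVE trial in Table 1.

```python
def relative_risk(p_control: float, p_intervention: float) -> float:
    """Relative risk = Pi / Pc."""
    return p_intervention / p_control


def odds_ratio(p_control: float, p_intervention: float) -> float:
    """Odds ratio = (Pi / (1 - Pi)) / (Pc / (1 - Pc))."""
    return (p_intervention / (1 - p_intervention)) / (p_control / (1 - p_control))


def relative_risk_from_odds_ratio(odds: float, p_control: float) -> float:
    """Relative risk = Odds ratio / (1 + Pc * (Odds ratio - 1))."""
    return odds / (1 + p_control * (odds - 1))


# Rounded proportions of deaths in the SAVE trial (Table 1).
p_c, p_i = 0.247, 0.204
rr = relative_risk(p_c, p_i)
oratio = odds_ratio(p_c, p_i)
rr_back = relative_risk_from_odds_ratio(oratio, p_c)
print(f"RR = {rr:.3f}, OR = {oratio:.3f}, RR recovered from OR = {rr_back:.3f}")
```

Because Pc is well above 10% in this trial, the odds ratio (about 0.78) is noticeably further from 1 than the relative risk (about 0.83), and the conversion formula recovers the relative risk from it.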

Finally, the relative risk reduction (frequently expressed as a percentage) is an appealing way of expressing the benefit of intervention, which can be more easily remembered than the odds ratio or the relative risk. It is either already provided in papers or easily calculated by subtracting the relative risk from 1. For example, if the relative risk is 0.80, then the relative risk reduction is 1-0.80 = 0.20 = 20%. The number needed to treat can finally be derived from Pc and the relative risk reduction by the simple formula:

Number needed to treat = 1 / (Pc * Relative risk reduction)

For example, if the relative risk reduction is 20% and the spontaneous risk of events is 10%, then the number needed to treat = 1/(0.2 x 0.1) = 50. Using the nomogram, the number needed to treat can be directly obtained without any calculation by drawing a straight line from the point corresponding to the proportion of events in the control group on the left-hand scale to the point corresponding to the relative risk reduction measured in the trial on the central scale. The point of intercept of this line with the right-hand scale gives the number needed to treat. By taking the upper and lower limits of the confidence interval of the relative risk reduction, it is then possible to obtain the upper and lower limits of the number needed to treat. This allows the precision of the result to be assessed, together with the magnitude of the intervention effectiveness under the most optimistic and the most pessimistic hypotheses.
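For readers without the nomogram at hand, the same reading can be sketched in a few lines of Python. The function names are illustrative, and the 10% to 30% confidence interval used below is a hypothetical value chosen only to show how the limits are obtained; it does not come from any trial cited here.

```python
def nnt_from_rrr(p_control: float, rrr: float) -> float:
    """Number needed to treat = 1 / (Pc * relative risk reduction)."""
    if not 0 < p_control <= 1:
        raise ValueError("Pc must be a proportion in (0, 1]")
    if rrr <= 0:
        raise ValueError("a positive relative risk reduction is required")
    return 1.0 / (p_control * rrr)


def nnt_with_interval(p_control, rrr, rrr_low, rrr_high):
    """Return (NNT, lower NNT limit, upper NNT limit).

    The upper (most optimistic) limit of the relative risk reduction gives
    the lower limit of the number needed to treat, and vice versa.
    """
    return (
        nnt_from_rrr(p_control, rrr),
        nnt_from_rrr(p_control, rrr_high),
        nnt_from_rrr(p_control, rrr_low),
    )


# Example from the text: relative risk reduction 20%, baseline risk 10%.
print(round(nnt_from_rrr(0.10, 0.20)))  # 50

# Hypothetical 95% confidence interval of 10% to 30% for the same reduction.
nnt, nnt_low, nnt_high = nnt_with_interval(0.10, 0.20, 0.10, 0.30)
print(round(nnt), round(nnt_low), round(nnt_high))  # 50 33 100
```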
The potential advantages of using the nomogram are illustrated in Table 1. The prescription of captopril in patients with left ventricular dysfunction 3 to 15 days after a myocardial infarction is associated with a considerable 42-month benefit (only 23 patients needed to be treated to avoid one death) (9). Even the least favourable number needed to treat, obtained from the lower limit of the 95% confidence interval of the relative risk reduction, corresponds to a sizeable benefit: 122 patients to be treated to avoid one death. In comparison, 167 patients with mild to moderate hypertension (diastolic blood pressure ≤ 110 mmHg) need to be treated for 5 years to avoid one stroke (11). The benefit is lower when captopril is prescribed to all patients with suspected myocardial infarction within 24 h of the onset of symptoms (10). Two hundred patients needed to be treated to avoid one death, a result relevant only for the 5 weeks following myocardial infarction. The upper limit of the 95% confidence interval gives 1451 patients to be treated to avoid one death, a number suggesting almost no clinical benefit.

2. Using the nomogram in the individual patient

2.1 Using the results of an individual trial
The number needed to treat summarises, by means of a single number, the results of a therapeutic trial in the same way that the arithmetic mean of a variable summarises all the measurements performed on each individual of the sample. However, the arithmetic mean is only the most probable value of a population, provided that the variable is normally distributed, and is not the value observed in every individual. Accordingly, the number needed to treat measured in a trial does not provide an estimate of the benefit for each patient treated. Cook and Sackett (6) therefore propose that decisions be made in the individual patient by using the number needed to treat calculated from the relative risk reduction measured in the trial and the baseline risk in the absence of treatment estimated for this individual patient. This gives another reason to use in practice the relative reduction, instead of the absolute reduction.
As an example, we will use this approach for the decision to perform coronary artery bypass graft surgery in patients with stable coronary heart disease. In their overview of the effects of this type of surgery on survival, Yusuf et al also developed an 8-variable risk score predicting mortality (12). The 5-year mortality was 6.3%, 13.9% and 25.2% in the lowest, middle and highest tertiles of risk respectively. The nomogram makes it very easy to calculate that, if the same 39% reduction of the 5-year risk of mortality does exist in each subgroup (see discussion of this point below), 40, 18 and 10 patients needed to be operated on to avoid one death in the lowest, middle and highest tertiles of risk respectively.
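The calculation behind this example can be sketched as follows, assuming (as in the text) the same 39% relative risk reduction in every tertile. The variable names are illustrative. The exact values, roughly 40.7, 18.4 and 10.2, are close to the 40, 18 and 10 read from the nomogram.

```python
# 5-year mortality without surgery in each tertile of risk (12).
baseline_risk = {"lowest": 0.063, "middle": 0.139, "highest": 0.252}
rrr = 0.39  # relative risk reduction, assumed identical in every tertile

# Number needed to treat = 1 / (Pc * relative risk reduction) for each tertile.
nnt_by_tertile = {
    tertile: 1.0 / (risk * rrr) for tertile, risk in baseline_risk.items()
}
for tertile, nnt in nnt_by_tertile.items():
    print(f"{tertile} tertile: {nnt:.1f} patients operated on to avoid one death")
```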

2.2 Combining the results of several trials
Two or more interventions can be proposed to the same patient, particularly in the cardiovascular field. The best way of obtaining data on the joint effect of 2 or more drugs is the factorial design. In such trials it is possible to test whether the effects of two drugs are independent. When this hypothesis holds true, the relative risk observed with the drug combination is the product of the relative risks observed with each drug. Let us consider two drugs inducing relative risk reductions of 40% (relative risk 0.60) and 15% (relative risk 0.85), respectively. If the patient's risk is 5%, the nomogram gives a number needed to treat of 50 for the first drug and 133 for the second. By combining the two drugs, the relative risk is 0.60*0.85 = 0.51, the relative risk reduction 49% and the number needed to treat 41. The small additional benefit conferred by the combination over the prescription of the most effective drug only should therefore be weighed against the risks and side effects of the combination of the 2 drugs. A practical example was derived from the International Study of Infarct Survival trial data (10) (Table 2). In the group receiving only captopril or only mononitrate, the relative risks of death are 0.885 and 0.974, respectively. In the group receiving the combination of captopril and mononitrate, the observed relative risk is 0.859, a value close to that obtained by multiplying the two individual relative risks. The numbers needed to treat estimated by the two methods are therefore almost identical: 91 and 93 patients to be treated to avoid one death.
In the absence of a trial using a factorial design, strong evidence will be lacking concerning between-drug interaction and the type of this interaction. When indirect evidence against interaction is also lacking, using the method described above can either under- or over-estimate the true benefit of a drug combination.
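The multiplication of relative risks described in this section can be sketched in Python as below. The sketch is valid only under the stated hypothesis of independence between the effects of the drugs, and the function names are illustrative.

```python
from math import prod


def combined_relative_risk(relative_risks):
    """Product of the individual relative risks (independence assumed)."""
    return prod(relative_risks)


def nnt_for_combination(p_control, relative_risks):
    """NNT for the combination: 1 / (Pc * (1 - combined relative risk))."""
    rrr = 1.0 - combined_relative_risk(relative_risks)
    return 1.0 / (p_control * rrr)


# Hypothetical drugs from the text: relative risks 0.60 and 0.85, Pc = 5%.
print(round(nnt_for_combination(0.05, [0.60, 0.85])))  # 41

# Captopril and mononitrate (Table 2): relative risks 0.885 and 0.974, Pc = 7.8%.
print(round(combined_relative_risk([0.885, 0.974]), 3))  # 0.862
print(round(nnt_for_combination(0.078, [0.885, 0.974])))  # 93
```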

3. Taking into consideration some limitations of the method
The number needed to treat is an appealing measure which is always valid as a measure of treatment effect in a clinical trial. In bedside medical decision making, a valid use of the number needed to treat requires making two further assumptions: 1) the relative risk reduction is independent of the baseline risk; and 2) it is possible to extrapolate results to timepoints not considered in available clinical trials. Other problems are the difficulty of estimating subjective probabilities and the value attached to numbers.

3.1 Extrapolating the number needed to treat to baseline risks not considered in available clinical trials
When using the arithmetic mean to calculate the mean of a population, one makes the assumption that the arithmetic mean is a valid estimate of the central location of a distribution, which is not always true. Adjusting the number needed to treat for the baseline risk of the patient implies, as underlined by Cook and Sackett (6), that the relative risk reduction is constant for all levels of disease severity. This assumption is true for hypertension treatment, where the overview of Collins et al (11) clearly shows a typical 40% relative reduction in risk of stroke for all degrees of hypertension severity. This is not always the case: for example, in the International Study of Infarct Survival 4 Trial (10), the relative reduction was 17% in patients with a history of previous myocardial infarction versus only 3% for patients without such a history. In the overview of the effects of coronary artery bypass graft surgery in patients with stable coronary heart disease (12), the relative risk reductions attributable to surgery, instead of being equal to 39% in each tertile of risk, were around 45% in both the middle and highest tertiles of risk, whereas there was a 17% relative risk increase in the lowest tertile.
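To illustrate how much the assumption matters, the sketch below recomputes the coronary artery bypass example with the subgroup effects quoted above instead of a uniform 39% reduction. It is an illustration only: the function name is hypothetical, and the "around 45%" figures are entered as exactly 0.45.

```python
def patients_per_death(p_control: float, rrr: float):
    """Return (effect, patients per death) for a baseline risk and a relative effect.

    A negative rrr is a relative risk increase: the count is then the number
    of patients operated on for one additional death.
    """
    absolute_change = p_control * rrr
    if absolute_change == 0:
        return "no effect", float("inf")
    effect = "avoided" if absolute_change > 0 else "caused"
    return effect, 1.0 / abs(absolute_change)


# 5-year mortality per tertile (12) and approximate observed relative effects.
tertiles = {
    "lowest": (0.063, -0.17),  # 17% relative risk increase
    "middle": (0.139, 0.45),   # around 45% relative risk reduction
    "highest": (0.252, 0.45),  # around 45% relative risk reduction
}
results = {name: patients_per_death(pc, rrr) for name, (pc, rrr) in tertiles.items()}
for name, (effect, n) in results.items():
    print(f"{name} tertile: one death {effect} per {n:.0f} patients operated on")
```

With these inputs, roughly 16 and 9 patients need to be operated on to avoid one death in the middle and highest tertiles, whereas in the lowest tertile surgery corresponds to about one additional death per 93 patients, instead of the 40 patients per death avoided obtained under the constant 39% hypothesis.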

3.2 Extrapolating the number needed to treat to timepoints not considered in available clinical trials
A second issue concerns the impact of time, which can affect the number needed to treat in two different ways. Firstly, in most clinical situations, the longer the follow-up the greater the number of events. Since, for the same relative risk reduction of 50%, the number needed to treat is lower if the proportion of events in the control group is 20% than if it is 2%, the number needed to treat will generally be lower (meaning a greater individual benefit) in trials with long follow-up than in trials with short follow-up. This is exemplified above in the trials of angiotensin converting enzyme inhibition after myocardial infarction (Table 1). Secondly, the relative risk reduction may or may not vary with time. The three possibilities are shown in Figure 2 for hypothetical trials lasting 30 months. In panel A, treatment produces a constant relative risk reduction over time, continuing after the 30-month period of follow-up. In panel B, the relative risk reduction decreases after 30 months and is almost abolished at 60 months. In panel C, treatment produces a constant relative risk reduction during the first months of treatment, and no further benefit afterwards. This example shows that, when data are available only at 30 months, extrapolation for the following 30 months may be invalid. Further discussion of this point can be found in Laupacis et al (5). Extended follow-up of randomised trials has provided examples of these various models. In the secondary prevention of myocardial infarction, the 3-year beneficial effect of beta-blockers after infarction was maintained for at least 6 years with timolol (13), whereas, among the one-year survivors of the Beta-Blocker in Heart Attack Trial, a continuing treatment benefit appeared to be restricted to patients at highest risk (14).
In the Hypertension Detection and Follow-up Program, the benefit observed on completion of the 5-year trial extended up to 8.3 years, despite discontinuation of the formal stepped care program in the intervention group, according to the pattern shown in Figure 2A (15). Six years of follow-up after the Lipid Research Clinics Coronary Primary Prevention Trial have not provided conclusive evidence of a benefit with cholestyramine treatment beyond that which was evident at the cessation of the 7-year trial, thus following the Figure 2B hypothesis (16). Finally, in the overview of the effect of coronary artery bypass graft surgery on survival (12), the relative risk reduction reached a maximum of 39% at 5 years and decreased thereafter (32% at 7 years, and 17% at 10 years), as in Figure 2C.
In trials where a large proportion of patients presents at least one of the trial end-points during a relatively short period of follow-up (e.g. in the SAVE trial (9), 1 patient out of 4 died during the 42-month follow-up), there is no need to adjust the number needed to treat for a longer period of time. Conversely, in other trials, such as the International Study of Infarct Survival 4 trial, the great majority (around 92.5%) of patients survived at 5 weeks. In cases such as this, although the number needed to treat is useful for estimating the short-term benefit, it does not provide an answer as to whether this benefit is maintained for a clinically meaningful period of time, e.g. one year. In the International Study of Infarct Survival 4, data provided in the article show that the 0.49% absolute risk reduction at 5 weeks was maintained after one year, when a 0.54% absolute risk reduction was observed. As to treatment of hypertension, it may not be valid to extrapolate from the available 5-year trials the benefit expected after 20 years, the clinically relevant time frame for patients with mild hypertension. Use of the 5-year number needed to treat may therefore unpredictably under- or over-estimate the benefit of a 20-year treatment and may be misleading for decision-making.

3.3 Subjective probabilities and the value of numbers
Biases in the estimation of probabilities were described 20 years ago by Tversky and Kahneman (17). We have shown that among 5 hypertension specialists, there were large inter- and intra-physician variations in the estimation of absolute cardiovascular risk (18). A study among primary care physicians in Canada has also shown that assessment of coronary risk was difficult for many doctors (19). Since, in both cases, physicians tended to overestimate the absolute risk, the use of an estimated absolute risk in decision-making will result in a decrease in the number needed to treat and therefore an overestimation of the benefit of intervention. The employment of computerised tools based on published equations of risk, such as the Framingham equation (20), will certainly be helpful for predicting risk more reliably, at least in the cardiovascular field.
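The effect of an overestimated baseline risk on the number needed to treat can be illustrated with a short Python sketch. All three input values below are hypothetical and chosen only for the illustration; none comes from the studies cited.

```python
def nnt_from_rrr(p_control: float, rrr: float) -> float:
    """Number needed to treat = 1 / (Pc * relative risk reduction)."""
    return 1.0 / (p_control * rrr)


rrr = 0.40             # hypothetical relative risk reduction
true_risk = 0.05       # hypothetical true absolute risk of the patient
estimated_risk = 0.10  # hypothetical overestimate made by the physician

nnt_true = nnt_from_rrr(true_risk, rrr)
nnt_estimated = nnt_from_rrr(estimated_risk, rrr)
print(round(nnt_true), round(nnt_estimated))  # 50 25
```

Because the number needed to treat is inversely proportional to the baseline risk, doubling the estimated risk halves the number needed to treat and makes the intervention look twice as beneficial.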
The impact of this quantification on decision making is the last issue. What is the clinical meaning of a number needed to treat of 100 for 5 years to avoid one clinical event for the average physician? It is likely that some physicians will consider that this number represents an important health benefit, whereas others will consider the benefit as only moderate or even slight. This between-physician variability will only reflect the different opinions of doctors on risk or the value they ascribe to a given health state. Presently, most physicians probably overlook the fact that this number needed to treat of 100 is the summary measure of the preventive effect of antihypertensive treatment on coronary heart disease and stroke among patients with mild to moderate hypertension (diastolic blood pressure < 110 mmHg) (11). The extent to which this number needed to treat value could influence medical decision making in mild to moderate hypertension is unknown, and therefore requires testing in clinical practice taking into account both the physician and the patient points of view.

4. Conclusion
The use of the nomogram proposed in this paper makes it possible to calculate the number needed to treat at the patient's bedside for medical decision making. This decision making tool should be used only after an educational course in clinical epidemiology, especially in the fields of elementary probabilities, prognostic studies and randomised clinical trials. Caution towards decisions based on "magic" numbers should stay part of good clinical sense.


Table 1: Number needed to treat to avoid one death and its 95% confidence interval (*) in two trials of the converting enzyme inhibitor captopril after myocardial infarction (9,10)

Trial | Control group: No. of deaths / Total No. of patients (%) | Intervention group: No. of deaths / Total No. of patients (%) | Follow-up | Relative risk reduction (95% CI) | No. needed to be treated (95% CI)
SAVE trial (9) | 275 / 1115 (24.7) | 228 / 1116 (20.4) | 42 months | 17% (3-29) | 23 (13-122)
International Study of Infarct Survival 4 Trial (10) | 2231 / 29022 (7.69) | 2088 / 29028 (7.19) | 5 weeks | 6% (1-12) | 200 (111-1451)

(*) Relative risk reductions and their 95% confidence intervals have been calculated from the tables. They differ slightly from those of the original papers, which were derived from survival analysis.
CI: Confidence Interval


Table 2: Individual and joint effects of the converting enzyme inhibitor captopril and oral mononitrate after myocardial infarction in the International Study of Infarct Survival 4 Trial (10)

Trial drug | Deaths in control group (%) | Deaths in intervention group (%) | Relative risk | Relative risk reduction | No. needed to be treated
Captopril | 7.80% | 6.90% | 0.885 | 1 - 0.885 = 11.5% | 111
Mononitrate | 7.80% | 7.60% | 0.974 | 1 - 0.974 = 2.6% | 500
Captopril plus mononitrate: observed | 7.80% | 6.70% | 0.859 | 1 - 0.859 = 14.1% | 91
Captopril plus mononitrate: calculated | 7.80% | | 0.862 (*) | 1 - 0.862 = 13.8% | 93
(*) The calculated relative risk, under the hypothesis of independence between the effects of the 2 treatments, is obtained by multiplying the relative risk on captopril by the relative risk on mononitrate: calculated relative risk = 0.885 x 0.974 = 0.862


References
1. Ayanian JZ, Hauptman PJ, Guadagnoli E, Antman EM, Pashos CL, McNeil BJ. Knowledge and practices of generalist and specialist physicians regarding drug therapy for acute myocardial infarction. N Engl J Med 1994; 331: 1136-42.
2. Naylor CD, Chen E, Strauss B. Measured enthusiasm: does the method of reporting trial results alter perceptions of therapeutic effectiveness? Ann Intern Med 1992; 117: 916-21.
3. Forrow L, Taylor WC, Arnold RM. Absolutely relative: how research results are summarized can affect treatment decisions. Am J Med 1992; 92: 121-4.
4. Bucher HC, Weinbacher M, Gyr K. Influence of method of reporting study results on decision of physicians to prescribe drugs to lower cholesterol concentration. BMJ 1994; 309: 761-4.
5. Laupacis A, Sackett DL, Roberts RS. An assessment of clinically useful measures of the consequences of treatment. N Engl J Med 1988; 318: 1728-33.
6. Cook RJ, Sackett DL. The number needed to treat: a clinically useful measure of treatment effect. BMJ 1995; 310: 452-4.
7. Sackett DL, Haynes RB, Guyatt GH, Tugwell P. Clinical epidemiology: a basic science for clinical medicine. 2nd ed. Boston: Little Brown, 1991.
8. Kleinbaum DG, Kupper LL, Morgenstern H. Epidemiologic research:
9. Pfeffer MA, Braunwald E, Moyé LA et al. Effect of captopril on mortality and morbidity in patients with left ventricular dysfunction after myocardial infarction. Results of the Survival and Ventricular Enlargement Trial. N Engl J Med 1992; 327: 669-77.
10. ISIS4 Collaborative Group. International Study of Infarct Survival 4 (ISIS4): a randomized factorial trial assessing early oral captopril, oral mononitrate and intravenous magnesium sulfate in 58050 patients with suspected acute myocardial infarction. Lancet 1995; 345: 669-85.
11. Collins R, Peto R, MacMahon S, Herbert P, Fiebach NH, Eberlein KA, et al. Blood pressure, stroke and coronary heart disease. II. Short-term reductions in blood pressure: overview of randomised drug trials in their epidemiological context. Lancet 1990; 335: 827-38.
12. Yusuf S, Zucker D, Peduzzi P, Fisher LD, Takaro T, Kennedy JW, et al. Effect of coronary artery bypass graft surgery on survival: overview of 10-year results from randomised trials by the Coronary Artery Bypass Graft Surgery Trialists Collaboration. Lancet 1994; 344: 563-70.
13. Pedersen TR, for the Norwegian Multicenter Study Group. Six-year follow-up of the Norwegian Multicenter Study on timolol after acute myocardial infarction. N Engl J Med 1985; 313: 1055-8.
14. Viscoli CM, Horwitz RI, Singer BH. Beta-blockers after myocardial infarction: influence of first-year clinical course on long-term effectiveness. Ann Intern Med 1993; 118: 99-105.
15. Hypertension Detection and Follow-up Program Cooperative Group. Persistence of reduction in blood pressure and mortality of participants in the Hypertension Detection and Follow-up Program. JAMA 1988; 259: 2113-22.
16. The Lipid Research Clinics Investigators. The Lipid Research Clinics Coronary Primary Prevention Program. Results of 6 years of post-trial follow-up. Arch Intern Med 1992; 152: 1399-410.
17. Tversky A, Kahneman D. Judgement under uncertainty: heuristics and biases. Science 1974; 185: 1124-31.
18. Chatellier G, Blinowska A, Ménard J, Degoulet P. Do physicians estimate reliably the cardiovascular risk of hypertensive patients? In: Greenes RA, Peterson HE, Protti DJ, Eds. MEDINFO 95 Proceedings. Edmonton, Canada: IMIA, 1995: 876-9.
19. Grover A, Lowensteyn I, Esrey KL, Steinert Y, Joseph L, Abrahamowicz M. Do doctors accurately assess coronary risk in their patients? Preliminary results of the coronary health assessment study. BMJ 1995; 310: 975-8.
20. Anderson KM, Odell PM, Wilson PWF, Kannel WB. Cardiovascular disease risk profiles. Am Heart J 1991; 121: 293-8.
