Abstract
The NHS screening program was launched 20 years ago and, until recently, has been accepted in an uncritical way. However, emerging data have suggested that the reduction in breast cancer mortality owing to screening is much less than it has been credited for. Furthermore, the harms from false-positive results and the overdiagnosis of indolent disease, which includes the detection of a cancer that is not destined to present clinically in that patient's lifetime, are now perceived as much greater than ever anticipated. This article suggests that it is complacent to continue with the program unchecked, a program that has so far denied women an informed choice. It is also suggested that a more efficient use of scarce resources that may reduce all-cause mortality might be to shift from a ‘one size fits all’ approach, to a risk assessment/risk management scheme.
The majority of laypeople could be forgiven for believing that one of the mainstays in the fight against cancer is ‘early detection’. This belief has generated a European-wide consensus that screening for cancer before it becomes symptomatic will save lives. It has also become the main plank in the UK government's campaign to improve cancer survival in the UK to match the highest levels achieved in the EU. In the vanguard of this campaign, the NHS breast cancer screening program (NHS BSP) by mammography, launched after the Forrest report in 1987, has been lauded as a triumph and has laid claim to the responsibility for the dramatic decline in breast cancer mortality since its initiation more than 20 years ago. If nothing else, the introduction of this program has improved the service for the diagnosis and treatment of all women with breast cancer of any age and any stage. For that alone we should be thankful. However, we cannot remain complacent and uncritically continue with a service based on a limited number of trials that are more than 20 years out of date. Our understanding of breast cancer has moved on since then and as a result our attitude to screening is worthy of a fresh look [1].
Biases of screening that can disguise the true magnitude of the benefits of screening & the diagnosis of ‘pseudocancers’
Let us start by considering two separate but related issues: first, biases of screening that give a false impression of benefit and, second, the overdetection of cancer ‘look-alikes’ that, if left undetected, might never threaten a patient's life. Survival from cancer is measured from the time of detection until recurrence or death. If screening shifts the point of detection to an earlier stage in the chronology of the disease, then survival is automatically extended even if the ultimate outcome is the same; this is called lead-time bias. Of course, if the cancer that is detected would never have threatened a woman's life in the first instance, then that lead-time might be as long as 30 years. In addition, bearing in mind that the interval between screens is anything from 1 to 3 years, it is inevitable that the fast-growing tumors with a bad prognosis will appear during the intervals, whilst the slow-growing tumors with a good prognosis will remain dormant until they are found by mammography; this is called length bias. There is also another subtle bias that can be described as the ‘self-selection’ bias, in that women who accept invitations for screening might be demographically different to those who ignore the invitation. For a variety of reasons, women who accept have better outcomes in the treatment of cancer, regardless of whether their cancers were screen-detected or not. The only way to account for these biases is to consider all of the clinical trials of screening versus no screening and to look at the pooled results described in terms of mortality (i.e., the number of women dying in the screened group compared with those dying in the control group, rather than case survival). There is, in fact, a modest advantage to screening once these biases are taken into account, as described in the recent publication by Gøtzsche and colleagues from the Nordic Cochrane Centre [2].
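The arithmetic behind lead-time and length bias can be made concrete with a minimal sketch. It is an illustration only, not a model taken from the trials: the function names, the ages, the tumor ‘sojourn’ times (the period during which a tumor is detectable by mammography but not yet symptomatic) and the assumption of a perfectly sensitive screen are all hypothetical.

```python
def survival_from_detection(age_at_detection, age_at_death):
    """Case survival in years, counted from the moment of detection."""
    return age_at_death - age_at_detection


def prob_screen_detected(sojourn_years, screen_interval_years):
    """Chance that a screen falls inside a tumor's detectable, symptom-free window.

    Assumes the window starts at a random point in the screening cycle and
    that mammography never misses a detectable tumor (both simplifications).
    """
    return min(1.0, sojourn_years / screen_interval_years)


def slow_share_among_screen_detected(frac_slow, sojourn_slow, sojourn_fast, interval):
    """Fraction of screen-detected tumors that are slow-growing."""
    slow = frac_slow * prob_screen_detected(sojourn_slow, interval)
    fast = (1.0 - frac_slow) * prob_screen_detected(sojourn_fast, interval)
    return slow / (slow + fast)


if __name__ == "__main__":
    # Lead-time bias: same age at death, earlier detection, "longer" survival.
    print(survival_from_detection(60, 70))  # symptomatic presentation
    print(survival_from_detection(57, 70))  # detected 3 years earlier by screening

    # Length bias: half of all tumors are slow-growing (hypothetical mix),
    # yet they dominate the screen-detected group.
    print(slow_share_among_screen_detected(0.5, 4.0, 0.5, 2.0))
```

In the first pair of calls the woman dies at the same age in both scenarios, yet measured survival is 3 years longer in the screened scenario. In the last call, slow-growing tumors make up half of all tumors but a clear majority of those found at screening, so screen-detected cases appear to do better whether or not screening changes any outcome.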
In their article, Gøtzsche and colleagues present a synthesis of all the papers that describe both the benefits and harms of screening, using absolute rather than relative numbers, which makes it easier for laypeople to comprehend. Put in absolute terms, one can conclude that if 2000 women are screened regularly for 10 years, one will benefit from the screening, as she will avoid dying from breast cancer. The independent US Preventive Services Task Force derived a similar number in 2004 [3].
The NHS BSP prefers the figure of one in 1000 benefiting from screening, derived from a somewhat selective reading of the literature; whatever the agreed figure, the principles of this discussion remain the same. However, even the figures of one in 1000 or one in 2000 might be overestimates. Remember, these data were derived from trials that were mostly started in the 1970s and reported in the late 1980s. Since then, improvements in treatment, such as the adoption of tamoxifen and adjuvant chemotherapy, have narrowed the window of opportunity for screening, and we have witnessed a drop in mortality of 30–40%, both in the age group invited for screening (>50 years of age) and in younger women. Therefore, perhaps the correct number might be nearer to one in 3000 (see calculations in Table 1). Whatever the number, that one woman who benefits from a decade of screening has a life of infinite worth, and if screening were as nontoxic as wearing a seatbelt, there would be no case to answer. However, there is a downside to screening, and that is the problem of overdiagnosis of ‘pseudocancers’.
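The step from one in 2000 to roughly one in 3000 can be checked with a short calculation. This is a sketch under one stated assumption, namely that the absolute benefit of screening shrinks in proportion to the fall in mortality achieved by better treatment; the function name is illustrative and the calculation is not a reproduction of Table 1.

```python
def adjusted_number_needed_to_screen(nns_trial_era, mortality_drop):
    """Rescale a trial-era number needed to screen (NNS) for a later fall in mortality.

    If treatment has cut breast cancer mortality by `mortality_drop`
    (a fraction such as 0.4), only (1 - mortality_drop) of the trial-era
    absolute benefit is assumed to remain, so the NNS grows accordingly.
    """
    if not 0.0 <= mortality_drop < 1.0:
        raise ValueError("mortality_drop must be a fraction in [0, 1)")
    return nns_trial_era / (1.0 - mortality_drop)


if __name__ == "__main__":
    for nns in (1000, 2000):          # NHS BSP figure and Cochrane figure
        for drop in (0.30, 0.40):     # range of mortality decline quoted in the text
            adjusted = adjusted_number_needed_to_screen(nns, drop)
            print(f"trial-era 1 in {nns}, drop {drop:.0%}: about 1 in {adjusted:.0f}")
```

Starting from one in 2000, a 30–40% fall in mortality gives a range that brackets one in 3000.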
Table 1. Absolute values for breast cancer incidence and death following the screening of 10,000 women for 10 years, assuming two estimates of relative risk reduction and assuming that unscreened symptomatic women receive the best of modern therapy.
HR: Hazard ratio; RRR: Relative risk reduction.
By this, I do not just mean the harms from false-positive results, but the overdiagnosis of indolent disease, which includes the detection of a cancer not destined to present clinically in that patient's lifetime. This results from both the biology of the slow-growing tumors and the aging patient dying as a result of comorbidity.
It is deduced by the Cochrane report that for every life saved, ten healthy women will, as a consequence of screening, become cancer patients and will be treated unnecessarily. Again, the NHS BSP disputes this number, but cannot deny the basic issue. These women will have either a part of their breast or the whole breast removed, and they will often receive radiotherapy and sometimes chemotherapy.
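To see what these ratios mean for a screened population, the two figures quoted so far (one death avoided per 2000 women screened for 10 years, and ten women overdiagnosed per death avoided) can be combined in a simple tally. This is a sketch; the cohort size of 10,000 is chosen only for illustration and the function name is hypothetical.

```python
def screening_tally(women_screened, number_needed_to_screen, overdiagnosed_per_life_saved):
    """Return (deaths avoided, women overdiagnosed) for a cohort screened for 10 years."""
    deaths_avoided = women_screened / number_needed_to_screen
    overdiagnosed = deaths_avoided * overdiagnosed_per_life_saved
    return deaths_avoided, overdiagnosed


if __name__ == "__main__":
    avoided, overdiagnosed = screening_tally(10_000, 2000, 10)
    print(f"deaths avoided: {avoided:.0f}, women treated unnecessarily: {overdiagnosed:.0f}")
```

Changing either input (for example, the NHS BSP's preferred one in 1000) rescales the tally but does not remove the overdiagnosis term.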
The nature of overdiagnosed cancer & an explanation for them
Screening for breast cancer is now adopted as an unequivocal good by most of the members of the EU. Invitations for screening promote this activity by being economical with the truth [4]. One of the uncomfortable truths concerns the overdiagnosis of both in situ and invasive breast cancers in screening populations [5]. Overdiagnosis of breast cancer does not mean false-positive rates, but rather the detection and treatment of cancers that, left undetected, would never threaten a patient's life and with which she would live, in blissful unawareness, until she died naturally of old age. We had always assumed that there was an overdiagnosis of ductal carcinoma in situ (DCIS), some of which had the potential to progress to an invasive and life-threatening phenotype. However, there is now clear evidence that anything between 10 and 50% of invasive cancers detected and treated radically as a result of screening would never threaten the patient's life [6–9]. As a result, the overall mastectomy rate rises after any country implements screening.
How can this possibly be? Do we not know that if cancer is neglected it will progress to a life-threatening condition? By way of illumination, let me propose that the pathological diagnosis of cancer at screening is based on a syllogism (i.e., a logical argument in three propositions, two premises and a conclusion, the conclusion being specious). A simple example might be that people who die from meningitis harbor meningococci in their nose. This does not mean that harboring meningococci in the nose is a lethal condition; in fact, approximately 10% of the population harbors these bacteria.
Cancer was defined by its microscopic appearance approximately 200 years ago. The 19th Century saw the birth of scientific oncology with the discovery and use of the modern microscope. Rudolf Virchow, often called the founder of cellular pathology, provided the scientific basis for the modern pathologic study of cancer. Just as earlier generations, 100 years before him, had correlated the autopsy findings observed with the unaided eye with the clinical course of cancer, so Virchow correlated the microscopic pathology with the course of the disease. However, the material he was studying came from the autopsy of patients who had died from cancer.
In the mid 19th Century, pathological correlations were performed on living subjects presenting with locally advanced or metastatic disease that were almost always destined to die in the absence of effective therapy. Since then, without pause for thought, the microscopic identification of cancer according to these classic criteria has been associated with the assumed prognosis of a fatal disease if left untreated. Therefore, the syllogism at the heart of the diagnosis of cancer runs like this: people frequently die from malignant disease; under the microscope this malignant disease has many histological features we will call ‘cancer’; ergo anything that looks like ‘cancer’ under the microscope will kill the patient. I would therefore like to argue that some of these earliest stages of ‘cancer’, if left unperturbed, would not progress to a disease with lethal potential. These ‘cancers’ might have microscopic similarity to true cancers, but these appearances are only a necessary rather than sufficient condition for a fatal disease. I would also like to suggest that many of the risk factors for the development of cancer are in fact the promotional agents of a latent condition that Welch et al. have described as pseudocancers [1].
Biological models that support the idea of latency in tumor progression
If we stand back and take a broader look at nature, the idea of latency in tumor progression should not be surprising. Conventional mathematical models of cancer growth are linear or logarithmic; in other words, completely predictable at the outset. These mathematical formulae may be appropriate for designing some things, but cannot begin to explain the exquisite organization of cell proliferation and the complex inter-relationships of cells of different progeny.
Most natural biological mechanisms are nonlinear or are better described by chaos theory. The rate of growth and the development of the lung, along with the fingers and toes in the fetus, cannot be described in linear terms. Wound healing starts with the knife and ends when it is appropriate to, although in some cases wound healing carries on too long, leaving an ugly keloid scar; Virchow himself once described cancer as the wound that never heals. Prolonged latency followed by catastrophe should not be all that surprising. We accept the case for prostate cancer; we know that most elderly men will die with prostate cancer in situ and not of prostate cancer that has invaded. In fact, the UK national prostate-specific antigen screening trial is predicated on that fact, with two a priori outcome measures defined as deaths from prostate cancer versus the number of cancers treated unnecessarily [10,11]. However, the breast cancer screening services throughout the EU have been slow to recognize the problem or else remain in denial; we now have cases of women with screen-detected DCIS whose daughters have had problems raising a mortgage when the insurers have discovered this family history of breast cancer [12,13]. Even the concession by the UK Department of Health on this point acknowledges only one cancer overdiagnosed for every life saved. How they derive that number remains obscure, whereas the Cochrane report is transparent on this matter and the numbers are there for all to see. Its authors simply record the number of breast cancers that have appeared in the screened group (observed) and subtract the expected number as seen in the unscreened control populations in the trials that have now been followed up for the majority of these women's lifetimes (observed − expected = overdiagnosis).
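The observed-minus-expected calculation is simple enough to write out. The sketch below assumes screened and control arms of equal size and equal follow-up; the counts in the example are invented purely to show the arithmetic and are not trial data.

```python
def overdiagnosed_cases(observed_in_screened, expected_from_controls):
    """Excess cancers in the screened arm after long follow-up (observed - expected)."""
    return observed_in_screened - expected_from_controls


def overdiagnosed_per_death_avoided(observed, expected, deaths_control, deaths_screened):
    """Ratio of overdiagnosed cases to breast cancer deaths avoided by screening."""
    deaths_avoided = deaths_control - deaths_screened
    if deaths_avoided <= 0:
        raise ValueError("no deaths avoided; the ratio is undefined")
    return overdiagnosed_cases(observed, expected) / deaths_avoided


if __name__ == "__main__":
    # Hypothetical counts, equal-sized arms: NOT taken from any trial.
    excess = overdiagnosed_cases(1150, 1000)
    ratio = overdiagnosed_per_death_avoided(1150, 1000, deaths_control=100, deaths_screened=85)
    print(f"overdiagnosed: {excess}, per death avoided: {ratio:.1f}")
```

The point of the sketch is that both inputs are countable events in a randomized trial, which is why a published estimate of overdiagnosis ought to be reproducible from the data.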
Is there a reasonable way of modernizing the NHS BSP that enhances the benefit & reduces the harm?
The following is a summary of where I stand concerning breast cancer screening:
The current NHS screening program is based on the results of randomized, controlled trials that were published before 1987 and started in the late 1960s and early 1970s;
In retrospect, some of these trials were of poor quality;
With mature follow-up and careful attention to biases, a relative risk reduction (RRR) in breast cancer-specific mortality has been estimated as approximately 15% rather than the 25% promoted by the NHS BSP [2,3];
In absolute terms, therefore, the number needed to screen over 10 years in order to prevent one breast cancer death is between 2000 and 3000 women (Table 1). Anything less than this depends upon ignoring the obvious biases in the trials, as well as on mathematical manipulation of the data that is based on false assumptions or on those self-selected women who accept the invitation to screen; the latter is termed ‘self-selection bias’;
Along the way, the estimates of harm have increased. At the outset, the hazards of overdiagnosis were ignored; then, as the rate of screen-detected DCIS shot up, it was still judged to be worth the cost. Now we recognize that the overdiagnosis of invasive cancers that are not destined to threaten a woman's life is a problem. The extent of overdiagnosis is debatable, but it is my opinion that if one includes DCIS and invasive duct cancer, overdiagnosis amounts to approximately ten cases treated unnecessarily for every life saved;
Furthermore, in spite of the wonderful advances that have been made in imaging technology and treatment over the last 20 years, there has been only one new trial reported for screening, and that was the trial for those under 50 years of age, which supported the estimate of approximately 15% RRR for breast cancer mortality [16];
Over the last 20 years, treatment of symptomatic disease has improved greatly in both pre- and post-menopausal women and accounts for the majority of the fall in mortality observed in all age groups since 1989 [15];
It is my opinion that we are using state of the art imaging and modern therapy to service a program based on data that is 20 years old. It is also worth reiterating at this juncture that improvements in the treatment of symptomatic patients since the mid 1980s leave a much narrower window of opportunity for screening, so that even our best estimates based on these old trials might have to be multiplied by a factor of approximately 0.6 [16];
We should remember that only one in 25 women is likely to die of breast cancer; as such, we are in danger of losing sight of more prevalent killers, such as cardiovascular disease, in our obsessive focus on one disease [17].
Where do we go from here? To close the program is politically unacceptable. Therefore, I want to make two practical propositions for research and development. One concerns individual preferences and the other concerns the more efficient use of scarce resources that I will refer to as risk assessment/risk management (RARM).
Individual preference
Since 1997, when I resigned from the NHS BSP committee, I have publicly expressed my concerns regarding the issue of informed choice for women invited for screening. I take no particular pleasure in the fact that the NHS has at last accepted the point and agreed to rewrite the letters of invitation.
My concern is that the mistakes of the past will be repeated. It is not for me to prejudge what level of benefit and what level of harm might influence the average woman to accept the invitation. For this reason I think there are two related areas of research. First, the development of an information pack that includes decision aids. This could be used in a study on individual preferences in which healthy women might be offered sliding scales of benefits and harms to find the point at which screening is judged acceptable. These data might then inform the second and perhaps more important area of research on more efficient ways of using scarce resources in the NHS.
Risk assessment/risk management
The benefit of RARM is that it provides a platform for the management of all women in an attempt to reduce all-cause mortality as well as mortality from breast cancer, with mammographic screening as one component of an integrated program. The first step is to set up a nationwide facility for risk assessment using modern computer programs. Women would then be offered this service, but not compelled to accept it. Initially, a practice nurse could administer the risk-assessment questionnaire, but it would be quite easy to transfer this to a web-based program for the computer-literate members of the community. From the obtained results, an initial triage could be agreed. Those at the most extreme end of the risk spectrum, with a relative risk (RR) of, for example, greater than 8.0, could be invited to a clinical genetics consultation. At the other extreme, those with a RR of, for example, less than 2.0 might be reassured and given lifestyle advice regarding diet, alcohol, tobacco and exercise that might not only impact on the risk of breast cancer but also on the greater risks of cardiovascular disease. Please note that these risk ratios are for illustration only; the actual figures used could be derived from the studies on individual preferences, and the cut-off for genetic counseling is already broadly accepted. Those in-between could then be invited to a special clinic for the second step. At this clinic, women of, for example, 45 years of age or older, could have a mammogram to determine breast density that might be kept as a baseline but could also provide additional evidence about risk; the greater the mammographic density, the higher the risk. Those with radiological abnormality at this stage would be investigated in the accepted way. If the mammographic density is low and the repeat estimate falls below a RR of 2.0, then they would be reassured and given lifestyle advice. Those who remain with a RR between 2.0 and 8.0 would be offered screening.
In addition, those who were premenopausal might be offered prevention with tamoxifen, and those who were postmenopausal could be offered entry into the IBIS II trial (a study comparing tamoxifen with Arimidex for the chemoprophylaxis of breast cancer). A recent paper in the Journal of the National Cancer Institute supports the validity of this approach [18]. Finally, before adopting such a radical change of policy we should consider soliciting a large, adequately powered, randomized, controlled trial in order to compare the current practice with a RARM regimen, with all-cause mortality and breast cancer-specific mortality as the primary outcome measures. While this is in the recruitment phase, the studies on individual preferences and the informed consent documents could be completed, since the RARM trial would itself enhance the consent procedures for those who continue to be offered unselected screening.
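The two-step triage outlined in this section can be sketched as a pair of decision functions. The thresholds of 2.0 and 8.0 are the illustrative cut-offs from the text, not validated values; the pathway names are hypothetical labels; and the handling of a woman whose revised RR rises above 8.0 at the clinic is an assumption, since the text does not cover that case. How mammographic density is used to revise the RR is left outside the sketch.

```python
from enum import Enum


class Pathway(Enum):
    GENETICS_REFERRAL = "invite to clinical genetics consultation"
    LIFESTYLE_ADVICE = "reassure and give lifestyle advice"
    SPECIAL_CLINIC = "invite to special clinic for second-step assessment"
    OFFER_SCREENING = "offer mammographic screening as an informed choice"


def initial_triage(relative_risk, low=2.0, high=8.0):
    """Step 1: triage on the questionnaire-based relative risk (RR)."""
    if relative_risk > high:
        return Pathway.GENETICS_REFERRAL
    if relative_risk < low:
        return Pathway.LIFESTYLE_ADVICE
    return Pathway.SPECIAL_CLINIC


def clinic_triage(revised_relative_risk, low=2.0, high=8.0):
    """Step 2: triage on the RR re-estimated after baseline mammographic density."""
    if revised_relative_risk < low:
        return Pathway.LIFESTYLE_ADVICE
    if revised_relative_risk > high:
        # Assumption: not specified in the text; treated like step 1.
        return Pathway.GENETICS_REFERRAL
    return Pathway.OFFER_SCREENING


if __name__ == "__main__":
    for rr in (1.5, 3.0, 9.0):
        print(rr, "->", initial_triage(rr).value)
    print("revised 1.8 ->", clinic_triage(1.8).value)
    print("revised 3.0 ->", clinic_triage(3.0).value)
```

Keeping the cut-offs as parameters reflects the suggestion that the actual figures should come from the studies on individual preferences rather than being fixed in advance.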
Conclusion
To carry on regardless is no longer acceptable; neither is political spin the answer. Women are becoming more informed and the demand for change comes from them as well. However, the changes I suggest are not nihilistic but constructive. The NHS BSP has indirectly led to the provision of the best specialist services for the diagnosis and treatment of symptomatic breast cancer in the world, riding on the back of the screening units. Centralization of care has led to rapid recruitment into randomized, controlled trials on treatment of cancer, which is the major contributor to the dramatic fall in breast cancer mortality in the UK over the last two decades. If we can now add to this the prevention of cardiovascular disease and a risk-adjusted screening program for breast cancer, the benefits would be more widespread.
Executive summary
The current NHS breast cancer screening program (NHS BSP) is based upon the results of randomized, controlled trials that were published before 1987 and started in the late 1960s and early 1970s.
In retrospect, some of these trials were of poor quality.
With mature follow-up and careful attention to biases, a relative risk reduction (RRR) in breast cancer-specific mortality has been estimated as approximately 15% rather than the 25% promoted by the NHS BSP.
In absolute terms, the number needed to screen over 10 years to prevent one breast cancer death is between 2000 and 3000. Anything less than this depends on ignoring the biases in the trials, as well as on mathematical manipulation of the data that is based on false assumptions or on ‘self-selection bias’, a term referring to those self-selected women who accept the invitation to screen.
Along the way, the estimates of the harm resulting from screening have increased. At the outset, the hazards of overdiagnosis were ignored; then, as the rate of screen-detected duct carcinoma in situ shot up it was still judged to be worth the cost. Now we recognize that the overdiagnosis of invasive cancers that are not destined to threaten a woman's life is a problem. The extent of overdiagnosis amounts to approximately ten cases treated unnecessarily for every life saved.
Furthermore, in spite of the wonderful advances we have made in imaging technology and treatment in the last 20 years, there has been only one new trial reported for screening, and that was the trial for those under 50 years of age, which supported the estimate of approximately 15% RRR for breast cancer mortality.
We are using state of the art imaging and modern therapy to service a program based on data that is 20 years old.
We should remember that only one in 25 women is destined to die of breast cancer, so we are in danger of losing sight of the more prevalent killers, such as cardiovascular disease, in our obsessive focus on one disease.
To close the program is politically unacceptable, but a more efficient use of scarce resources might be risk assessment/risk management with a triage, whereby the highest risk group is referred to the genetic services and the lowest risk group is offered lifestyle advice that may not only reduce the risk of breast cancer but cardiovascular disease as well, leaving only those at intermediate risk being offered screening as an informed choice.
Future perspective
In years to come, mammographic screening will probably be looked upon as an ‘intermediate technology’ that was based on incomplete knowledge of the growth kinetics of breast cancer. Improved treatment of clinically apparent disease, better knowledge of the genetic predisposition to the disease and chemoprevention will potentially make screening obsolete in 10–20 years.
Footnotes
The author has no relevant affiliations or financial involvement with any organization or entity with a financial interest in or financial conflict with the subject matter or materials discussed in the manuscript. This includes employment, consultancies, honoraria, stock ownership or options, expert testimony, grants or patents received or pending, or royalties.
No writing assistance was utilized in the production of this manuscript.
