Abstract
Evidence-based medicine (EBM) has been studied as a rich and diverse set of epistemic and infrastructural practices that relate imperfect medical knowledges to complex clinical practices. We examine instances of medical decision-making where medical professionals relate recommendations from clinical practice guidelines to individual patient characteristics when deciding to prescribe implantable cardioverter defibrillators to treat heart failure. When connecting evidence-based recommendations to decisions about individual patients, we find that clinical deliberations invoke different times, such as linear, chronological time, and biological aging, as well as different futures, such as individual future risks and the future sustainability of health systems. Such different “temporalities of evidence” are integral parts of clinicians’ considerations concerning economy, suffering, risks, and hopes in making decisions on individual patients. In line with calls to anchor science and technology studies scholarship studying EBM in the (clinical) concerns that matter most to patients and health care providers, we argue that the social study of EBM needs to attend to such temporalities to understand how evidence is weighed, which outcomes are considered important, and hence which clinical action is proposed. However, since different ways of “doing time” lead to different appreciations of the clinical outcomes, the notion of clinical “anchoring” becomes rather elusive.
Introduction: Simplifications and Complexities in Evidence-based Medicine (EBM)
It’s 8 a.m. on a Tuesday morning. In the meeting room of the cardiology department of a large university hospital in northern Europe, a team of physicians and nurses have come together to discuss whether patients with heart failure are suitable for receiving an implantable cardioverter defibrillator (ICD): a device that can, in case of cardiac arrest, automatically deliver an electric shock to resuscitate them and restore a regular heartbeat. As this is an expensive form of treatment with numerous possible side effects, not all patients are deemed fit for this preventive measure. To determine suitability, physicians follow nationally established evidence-based guidelines. These include recommendations on the minimum criteria for limitations in physical activity (functional classification, or New York Heart Association [NYHA] classes) and the degree of heart failure (ejection fraction [EF]). Doris is presenting the case of Jenny, a patient whom she feels may well benefit from receiving an ICD. But her assessment is not readily adopted by her colleagues: “NYHA II and EF 35%: just at the limit,” says Doris, a cardiologist. “But [the ejection fraction] was approximately 35%,” says Helen, the head of clinic. “Well, it was a visual estimation,” Doris explains, “but what we are talking about here is a typical patient!” “But where should we draw the line?,” Helen asks. “Below 35 or at 35? Should we accept 40 too?! We shouldn’t take matters to the extreme,” she admits, “but a visual EF?…” “But she is so ischemic and has so many arrhythmias,” Doris objects, pointing to the benefits Jenny could derive from getting an ICD in terms of a more regular heartbeat and better blood circulation. “Those symptoms were in the past,” Helen responds. 
Doris insists: “If anyone should have [an ICD] it is those who are still walking around, having…” She is interrupted by Helen, who emphasizes her doubts about using a visually established EF that is suspiciously precise in meeting the cut-off point in the guideline (Field notes #150). 1
Criticism accompanied such efforts to bring evidence to bear on clinical decisions. Since EBM’s launch in the early 1990s, discussions have reiterated long-standing polarizations between the need for more standardization, on the one hand, and for more person-centered care, on the other. 2 Critics, especially from the fields of medical sociology, political science, philosophy of science, and medicine, opposed both EBM methodologies and the concrete effects of EBM in particular disciplines. Medical sociologists argued that EBM problematically assumes a universal “hierarchy of evidence that consistently places the evidence derived from randomised controlled clinical trials on top” (Goldenberg 2006, 2623). 3 Political scientists suggested that clinicians reject privileging epidemiological evidence since they reason differently: clinicians are “[m]edical realists [who] believe they can know, and conduct themselves so as to know, what actually occurs when someone gets sick or gets well,” whereas epidemiologists are “[e]mpiricists [who] concern themselves only with what they can observe, measure and manipulate statistically; they aspire only to the demonstration of relationships, not to the understanding of cause and effect” (Tanenbaum 1994, 30). Philosophers and historians of science argued that preferring a bad RCT over a good observational study points to a confused understanding of a “gold standard” of knowledge that is likely to let the believers end up with fool’s gold (Grossman and Mackenzie 2005; Rosner 2002; Howick 2011). Finally, clinical researchers commented that such fool’s gold risks producing not evidence-based, but evidence-biased medicine, which “is to use evidence in the manner of the fabled drunkard who searched under the streetlamp for his doorkey [sic] because that is where the light was, even though he had dropped the key somewhere else” (Evans 1995, 461).
All this criticism centers on challenging the fact that epidemiology, with its specific style of frequency- and population-based scientific reasoning (Wieringa et al. 2018), has come to dominate the practice of medicine with its particular problem of treating more complex individual—not average!—patients. Critics argue that epidemiological results rarely anticipate the variations and complexities of many patients that actually show up in clinics. But this critique of the simplification of EBM is also somewhat limited. EBM partly grew out of the scholarship by epidemiologists like John Wennberg and Alan Gittelsohn (1973) who argued that the variation they found in treatment delivered in different locations within the same US health system could often not be clinically explained or justified. The entire premise of EBM was thereby to standardize, simplify, and reduce variation of treatment in order to make quality of care less dependent on which professional happened to be treating patients. So to criticize EBM for its reductionism without acknowledging what such reductionism was trying to remedy, hardly does justice to the normative purpose (Timmermans and Haas 2008) of this health movement. As Law and Mol (2002) argue, such “critique of simplification is just too simple. The critique of simplification is so well established that it has become a morally comfortable place to be” (pp. 5-6).
To challenge the polarization of simplified EBM and complex clinical practice, scholars at the intersections of medical sociology and science and technology studies (STS) have studied EBM as an empirical field where the interlocking of aggregating and particularizing that is central to the development of EBM standards and to health care practices can be explored. Such detailed empirical STS studies of EBM, of the kind called for in the sociology of standardization (Timmermans and Epstein 2010), have pointed to the intertwinement of universals and particulars within evidence standards (Timmermans and Berg 1997) and to how standards can produce rather than inhibit reflexivity in clinical practice (van Loon and Zuiderent-Jerak 2012). They showed how developers achieve guidelines, not merely by reference to RCT studies or systematic reviews but by drawing on many different considerations and knowledges that are appraised using multiple “repertoires of evaluation” that attend to different intended audiences (Moreira 2005). The scientific robustness of aggregate epidemiological knowledge is an important part of guideline development but is juxtaposed to other repertoires, such as clinical usability and political acceptability. Universalities and particulars are therefore already integral to EBM standards; something Timmermans and Berg (1997) call “local universality” (p. 275). Resisting the critique of simplification, such work produced a more dynamic sense of the interplay between simplifications and complexities by showing how guidelines include local, diverse, and extrapolated types of knowledge to manage the absence of evidence (Knaapen 2013).
In addition to producing more sophisticated analysis of the interplay of universals and particulars within EBM, such nuanced accounts could potentially do more justice to EBM’s normative purpose by better attending to those concerns that “matter[s] most to patients and health care providers” (Timmermans and Haas 2008, 659), that is, the clinical outcomes that are achieved based on the interlocking of universal recommendations from guidelines and particular patient characteristics. This could possibly help to ameliorate the problem that studies that criticize the simplification of EBM have faced; that of “remain[ing] clinically unanchored” (Timmermans and Haas 2008, 659).
In this paper, we explore that potential by investigating the central role clinical outcomes play in professional judgments on how to apply recommendations from clinical practice guidelines regarding the prescription of ICDs. We highlight that although the potential of anchoring the study of EBM in clinical outcomes seems to provide the major advantage of moving beyond the “critique of simplification,” we also found that when taking evidence-based decisions for individual patients, the very physiological and biological manifestations of heart failure that were to serve as an “anchor” in our analysis, themselves became highly contested. Moreover, we found that the way clinical outcomes were related to guideline recommendations was largely dependent on one of the least studied aspects of the assessment of medical evidence in clinical practice: how “temporalities of evidence,” that is, differences in time, time frames, duration, pasts, futures, and timing that are invoked are integral to different ways of assessing treatment decisions and effects.
Although some STS scholarship attends to the role time frames play in enacting good care, this has largely been by opposing the practices of care with the practices of evidence-production and use. For example, when Mol (2006) analyses what counts as “good diabetes care,” she points to the different “time frames” patients and doctors have to navigate: Is good care the tight regulation of blood sugar, which is good for preventing long-term complications such as blindness, atherosclerosis, and neuropathy? Or, is it a more generous regulation, since tight regulation also increases the risk of low blood sugar levels, which can lead to acute discomfort for patients resulting in anger and frustration and potentially in a hypoglycemic coma, causing brain damage? Mol proposes that sensitivity to different time frames is important for understanding the tinkering and doctoring work needed to do good care in practice and to repair the problematic, linear time that is imposed by clinical research. Such tinkering and doctoring “helps to draw ‘treatment’ out of the linear time line into which it is inserted in the clinical trial: the line between intervention now and long-term health outcomes later” (Mol 2006, 412). Attention to multiple time frames as important for clinical work is thus contrasted with the singular, linear production of clinical evidence. In this paper, we extend the sensitivity to time frames in the practices of good care to the work of applying evidence-based guidelines, which equally comes with different temporalities—of evidence this time.
Our reference to temporalities of evidence thus amounts to more than highlighting that clinicians do consider issues of time (e.g., how long may a patient benefit from treatment). We rather wish to highlight that in the decisions we analyze, time itself is done differently: clinicians, for example, order their evaluations differently depending on whether they invoke chronological or biological time. Their ways of “doing time” turn out to have dramatic consequences for the meaning of clinical outcomes and the way professional deliberations and negotiations unfold. By analyzing how time is done in the cases presented, we aim to show how attempts to clinically anchor STS work on EBM may well lead to a more dynamic, temporally situated understanding of those very clinical outcomes that were to serve as an “anchor.”
We investigate an empirical case of evidence-based decision-making on a treatment for heart failure and arrhythmia: the ICD. We analyze how clinicians assess characteristics of individual patients who may benefit from such treatment in relation to recommendations that draw upon a generalized evidence base. The clinicians, we argue, differ in their assessment of individual patients’ suitability for an ICD procedure depending on the different temporalities of treatment and evidence they invoke. We then move on to analyze one crucial manifestation of temporalities of evidence in this case, namely, how clinicians handle the assessment category of “old age” to explore how the temporalities of patient lives that define much of the evidence base misalign with the regulatory setting that prohibits chronological time as a treatment criterion (out of human rights concerns for age discrimination). Given the crucial role that temporalities play in these discussions about treatment decisions, we conclude by exploring the implications of our study—the need for more temporal understandings of how universal EBM guidelines are translated to individual patients—and what this means for a more diversified understanding of clinical outcomes.
How ICDs Became Standardized Treatment
The clinical team meeting described in the opening section of this paper is, in some respects, a typical case of EBM in clinical practice, at least when considering the pertinence of guidelines and evidence in the conversations. In a weekly meeting lasting an hour, physicians involved in the ICD clinic collectively discuss and decide on treatments for heart failure and arrhythmia patients. Usually seven to twelve physicians of varying seniority and several nurses attend the meetings. The key actors during the meetings at the time of the observations were the head of the clinic, Helen, and the senior physician, Doris, who attended more or less every meeting. Not one meeting was held without at least one of these two specialists. Some of the physicians were clearly there temporarily, mainly residents who worked at the ICD clinics as a part of their specialist training. The nurses rarely participated orally. Their main responsibility was to organize the projection of patient files on a screen that covered much of one of the walls or to arrange conference calls with referring doctors or with specialist physicians from other hospitals. In quotes, we have provided female pseudonyms for all participants and patients in order to preserve anonymity in such a small and distinct circle of specialists, where even specifying gender would give away too many details. 4
As the opening vignette shows, evidence and guidelines feature heavily in the physicians’ interactions and also in the history leading up to these conferences. Funding for treatments had been wanting for several years and had recently been approved when one of us (M.S.) observed these meetings. The funding decision by the county council administration was an explicit result of guideline implementation, according to existing evidence. Our analysis of observations of thirty-eight team meetings during 1.5 years focuses on one of these treatments, the ICD that can be prescribed to some patients at risk of a future cardiac arrest. In addition, interviews were held with the main clinicians and administrators at the hospital and at the governmental institution regulating this clinic, the respective county council.
Automatic defibrillators for implantation were invented in order to prevent a cardiac arrest in at-risk patients. Two research groups worked on similar concepts during the 1970s; one at Sinai Hospital in Baltimore and one at the University of Missouri. In 1980, the first successful implantation was performed as an experimental treatment (Mirowski, Mower, and Reid 1981). During the 1990s, large studies were carried out to prove the effectiveness of the ICD. The findings of those studies were subsequently evaluated and then compared in meta-analyses (Connolly et al. 2000). Patients who were recruited for these successful studies were selected because they were at risk of dying from cardiac arrest, while being healthy enough to endure treatment and benefit from it. Another important criterion was that they faced only a small risk of dying within a short time frame for other reasons, for example, natural death from old age. Patients included in the studies were therefore characterized by three parameters. First, age. The oldest participants were seventy-six years old. 5 Second, the degree of heart failure, measured in EF, as discussed by Doris and Helen. This is the percentage of the total amount of blood that is pushed out of the left heart chamber with each heartbeat. Recruited patients had a maximum EF of 45 percent in the largest and most important study included in Connolly et al.’s meta-analysis from 2000 (AVID 1997), compared to a healthy heart pumping around 50 percent to 70 percent of blood out of the heart chamber each beat. Subgroup analyses showed that patients need to be sufficiently ill in terms of their EF for an ICD treatment because without a sufficiently low EF rate, it is hard to be sure that there is a clear case of heart failure and thus an increased risk of sudden cardiac death. However, in order for an ICD not to endanger patients’ general health and quality of life, the patients should not be too ill either. 
Therefore, the third parameter was limitations in physical activity. The standard for measuring this is a functional classification system known as NYHA classes, which range from I to IV based on how much the heart condition impedes a patient during physical activity. For inclusion in the studies, the person needs to be somewhat physically active. “Physically active” in this case corresponds to an NYHA score of I–III. Class I means “no limitation” of physical activity. Class II denotes experiences of “mild symptoms” during “ordinary activity.” And Class III indicates “marked limitation in activity” during “less-than-ordinary activity,” for example, walking short distances (twenty to one hundred meters), while being free from symptoms at rest. An NYHA IV score, which implies an inability to carry out any physical activity without discomfort while having symptoms of heart failure even while resting, excluded patients from several studies since they were deemed too weak to benefit from an ICD (Moss et al. 1996, 2002; Bardy et al. 2005; Bänsch et al. 2002; Kadish et al. 2004; Strickberger et al. 2003).
Based on the inclusion criteria for the clinical characterization of recruited study participants and the outcomes of the studies and subgroup analyses, indication criteria for target patients were established during the 2000s through coordinated efforts by professional organizations in Europe and the United States (Zipes et al. 2006). 6 Guidelines were formed that stated that the criteria of EF ≤ 35 percent and NYHA classes II–III could be applied by clinicians as standardized criteria for identifying patients most likely to benefit from the treatment. Age was also mentioned but not as a standardized criterion (Zipes et al. 2006, 803). In Sweden, the National Board of Health and Welfare (NBHW) developed two national guidelines for cardiology. These national guidelines were drafted in 2004 and 2008 in close coordination with regional health care providers. The recommendations were increasingly implemented toward the end of the decade. By 2010, most major hospitals in Sweden had accepted ICD treatments as standard treatments for several risk indications, and the indication criteria of EF and NYHA were directly used in the clinic.
In practice, this means that patients in the clinic are deemed suitable for treatment through the application of indication criteria that require a minimum of clinical judgment, that is, merely the competence to measure the EF and assess the physical NYHA class. Judgment is also required to rule out alternative causes for heart failure and reduced physical fitness, but in many cases, the ICD is recommended based on a characterization of the patient in terms of their EF value and NYHA class, which are positioned in the guidelines as the key forms of evidence. In meetings where clinicians discuss and decide on prescribing ICDs, decisions on these patients appear as brief reports that are quickly collectively confirmed before the conference moves on to the next patient. And, as a testimony to the power of simplification, many patients fit these categories and receive a straightforward ICD recommendation.
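How little judgment the standardized criteria formally demand can be made explicit in a few lines of code. The following is a minimal sketch of our own, assuming only the two thresholds stated in the guidelines (EF ≤ 35 percent and NYHA classes II–III); the record type, field names, and function name are hypothetical illustrations and are not taken from any guideline document or clinical software.

```python
from dataclasses import dataclass

# Thresholds as stated in the text: EF <= 35 percent and NYHA classes II-III.
EF_MAX_PERCENT = 35.0
ELIGIBLE_NYHA_CLASSES = frozenset({2, 3})


@dataclass(frozen=True)
class PatientSummary:
    """Hypothetical record; the field names are illustrative assumptions."""
    ef_percent: float  # ejection fraction as reported, however it was measured
    nyha_class: int    # 1-4, standing for NYHA classes I-IV


def meets_indication_criteria(patient: PatientSummary) -> bool:
    """Return True when both standardized criteria are satisfied."""
    if patient.nyha_class not in (1, 2, 3, 4):
        raise ValueError("NYHA class must be 1, 2, 3, or 4")
    return (patient.ef_percent <= EF_MAX_PERCENT
            and patient.nyha_class in ELIGIBLE_NYHA_CLASSES)


# Jenny's reported values from the opening vignette: NYHA II and EF 35 percent.
jenny = PatientSummary(ef_percent=35.0, nyha_class=2)
print(meets_indication_criteria(jenny))
```

Applied to Jenny's reported values, such a rule returns a positive indication. What the sketch cannot represent is how the EF value was obtained, whether the symptoms are past or present, or how old the patient is: the very matters on which the deliberations in the meetings turn.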
Yet, obviously, such simplifications only work in ideal cases provided—as we saw in the opening vignette—that their status of “typical ICD patient” is not challenged. Often patients do not quite fit the simplification the guideline implies, for example, in the cases of patients who have high EF values or who are older than the persons included in clinical trials. In such cases, the fact that patients do not match indication criteria in guidelines or in studies does not automatically mean that they should not be offered an ICD: their exclusion from the evidence base can be challenged for reasons that turn out to come with a temporal dimension.
A Struggle between Evidence-biased and Patient-centered Care?
Let us return to the case at the beginning of this article: the case of Jenny, whose EF first is at 35 percent, but then approximately at 35 percent, and even, visually established approximately at 35 percent. At first glance, Helen seems to want to firmly apply the recommendation, whereas Doris would like to bring in more situated factors, such as the degree of ischemia, the number of arrhythmias, and the fact that the patient is still walking around. Helen uses guidelines and the clinical complexities of EF measurements—“a visual EF?”—to challenge Doris’s intuitive judgment of “a typical patient!” and her reference to other clinically salient, but more complex, factors. This may look like a typical instance of two clinicians disagreeing on their interpretation of the evidence in relation to this particular patient, where one resonates strongly with the individual needs of the patient while the other wishes to follow the population-based guidance. But there is more going on than that.
Helen’s judgment is in fact barely supported by the guideline, since that document sets the limit at 35 percent and below: according to the guideline, Jenny should be offered an ICD, even if she was at the limit. So she interprets the guideline more strictly than required. Her justification for this is that EF measurement may be done through various techniques. One such technique is to establish EF visually by means of an echocardiogram. A cardiologist looking for signs of dysfunction in the movements of the heart estimates the size of the ventricles and the thickness of the ventricular walls. More precise procedures to establish EF, such as magnetic resonance imaging (MRI), computerized tomography (CT) scan, or cardiac ventriculography, involve injecting contrast media into the heart’s ventricles to get a precise measure of the volume of blood pumped out. But a visual EF is most commonly used since it is clearly the least invasive and cheapest method. The obvious lack of precision that comes with this method also makes it somewhat suspect, at least in the clinic studied here. Sometimes there is a collective amusement over EF values that perfectly hit the mark, as in the case of Jenny, since everybody knows that EF values, though expressed with numeric precision, are the result of judgment and can vary quite a lot depending on the method used.
We found it quite puzzling that the head of clinic used the very uncertainty regarding EF evidence to exclude Jenny, although the opposite conclusion could have been drawn quite easily. No arguments were brought forward for why the (supposedly) “real” EF in this case would be higher rather than lower than the measured value. So why did Helen, who knows the complexities of ICD evidence best, not accept that this patient met the standard criteria required? Why does she use EBM guidance as decisively prescriptive for action and use the insufficient “hardness” of criteria to dismiss opening up the criteria? As we learned, we could only understand how the case of Jenny was handled by attending to how different temporalities of evidence matter for how ambiguous evidence is weighed. This included the history of implementing ICD treatment at this hospital and the particular region of Sweden where it is located, and the futures that thereby were made. It turned out that the history of ICDs and its unsustainable futures immediately impacted upon the difference between Helen’s and Doris’s assessment of Jenny’s suitability for treatment in relation to the evidence-based guidelines.
Temporalities of evidence 1: Cardiology and affordable population health care
When interviewed about the history of the guideline, cardiologists and health policy makers explained that the ICD had quite a troubled recent history. The use of ICDs to prevent cardiac arrest increased in Sweden toward the end of the 1990s. At that time, the treatment was extremely expensive and even carrying out a few implantations weighed heavily on the budgets of heart clinics. No extra financial support was provided for the new treatment. Cardiologists pressured hospital management for more resources, invoking the outcomes from large clinical studies that were completed in the 1990s and summarized in meta-analyses in 2000 (AVID 1997; Kuck et al. 2000; Connolly et al. 2000). In 2003, the hospital we study here implanted 110 ICDs at a cost of a quarter of a million Swedish Kronor (≈USD30,000) per device. Extra funds were offered by the hospital management to cover a minor part of a growing deficit that, according to the clinic, was mostly due to the new and expensive heart devices. Furthermore, the head of clinic asked the regional administration to put fresh capital into the clinic where all of the implantations were done at that time. Through professional meetings and coordination organized by a regional cardiac council, representatives of other heart clinics in the region joined in these demands to the regional administration for specific funding for ICD treatments. The disagreements between the regional administration and the clinicians were not resolved until the NBHW guidelines, which strongly recommended more ICD treatment, were drafted and finalized in collaboration with regional representatives of local clinics, hospitals, and county councils. Until this time, clinicians had struggled to make ICD demands meet budget restrictions.
When asked about this decade of controversy around ICDs, cardiologists express their frustration about how long it took before the regional administration finally took the step to assign the required funds, in spite of the extensive evidence that had existed for many years about the effectiveness of treatment. One clinician expresses the frustration this way:
Why have we not adjusted to this [evidence]? You haven’t been able to deal with the investment cost. And then, unfortunately, we in [this county council] are, we are in Sweden, which is already at the bottom [among similar countries regarding implementation of ICD treatments]. Denmark is higher up. Then we in [this county council] are at the bottom in that [Swedish] context. And then I think that somewhere someone should step up and say that more patients die unnecessarily in [this county council] than in other places. (Interview with cardiologist)
While you are drafting these guidelines, a majority of the profession has found an open space to run into [introducing a soccer metaphor]; they do not await the decision process, right, but they…start to act anyway. Perhaps we pose slightly higher demands on the evidence [from cardiological interventions] and do not let ourselves [as administrators] get overexcited by positive reports. (Interview with regional government administrator)
This history of the push for ICDs by cardiologists drawing on a limited evidence base and posing future risks for a health system needing to serve more patient groups than heart patients directly shaped Helen’s role in the clinical decision making in three substantive ways. First, she was a part of the NBHW expert panel that summarized the evidence. This may explain why she frequently referred to the guidelines and held them in high esteem during the meetings observed. 8 Her thorough knowledge of the guideline process also added to our surprise when, in the case of Jenny, she applied this guidance more strictly than it was formulated rather than allowing for more flexible or at least consistent application. Second, as a head of clinic, she was painfully aware of the historical budget deficits, as she had many discussions with the hospital management and with heads of other departments about the funds that were being drawn toward ICD treatment, as well as of the risks cardiology interventions were posing for the sustainability of the health system serving more than heart patients. Third, she was aware of the future risks of treatment with an ICD and reminded colleagues and the observing ethnographer that EBM has sprung from awareness that health care is a risky business and that “no treatment” sometimes is the best treatment (Cochrane 1972). In the case of ICD treatments, the treatment is preventive and most devices are never used because patients who receive the device as a preventive treatment never experience a cardiac arrest. For every thirty patients who receive a device, only one will have a cardiac arrest and may thus be saved by it. Twenty-nine of thirty patients thus, in a sense, do not benefit from treatment but do face potential harm. This uncertain distribution of future benefit and risk induces particular caution for Helen when deciding on recommending an ICD or not.
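The arithmetic behind this caution can be spelled out. As a rough sketch, assuming that the one-in-thirty figure above is read as a number needed to treat (NNT) of thirty (the symbols are ours, introduced only for illustration):

```latex
\[
\mathrm{NNT} = 30
\quad\Longrightarrow\quad
\mathrm{ARR} = \frac{1}{\mathrm{NNT}} = \frac{1}{30} \approx 3.3\%,
\qquad
1 - \mathrm{ARR} = \frac{29}{30} \approx 96.7\%.
\]
```

Here ARR denotes the absolute risk reduction, that is, the share of implanted patients for whom the device may avert death from cardiac arrest. The remaining share, roughly 97 percent on this reading, carries the device and its possible side effects without ever using it.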
With the historical push for ICDs, the future sustainability of the health system, and the future unevenly distributed risks for patients in mind, it becomes clear that Helen is not at all privileging EBM guidelines over the patient-related specificities Doris emphasizes. Rather, Helen temporalizes the decision within the complex history of evidence on interventions in cardiology, sustainable futures of the health system, and the future risk of side effects combined with the high number (thirty) needed to treat (one). This temporality of evidence leads Helen to a simple (overly strict) application of the EF criteria: no ICD should be prescribed for someone at a visually established EF of thirty-five. The evaluations of Doris and Helen invoke different temporalities. Doris, drawing on the timeline of Jenny’s recent complaints and potential future benefits, concludes that those justify implanting an ICD. Helen, considering a temporality of a history of overtreatment in cardiology and the future of sustainable health care at a population level, including the risk of harm for Jenny and the population at large, rules that Jenny should not receive this treatment.
This instance also shows that the classic opposition between clinicians being focused on the medical realism that allows them to know what happens to their patients and epidemiologists who focus on the statistical measurement and manipulation of relationships (Tanenbaum 1994, 30) needs to be troubled. Both Doris and Helen refer to aggregated knowledge from the collective of patients of RCTs. And both of them see the individual patient in front of them. But they weigh the possible benefits and harms along different temporalities, leading to radically different conclusions. Another temporal dimension in the case of ICD recommendations concerns the age of individual patients, to which we now turn.
Temporalities of evidence 2: Knowing chronological and biological heart times
Old age is a problem for guideline development at large given the general lack of evidence for the “very elderly” (Zipes et al. 2006, 803) or what respondents in a study by Will (2009) called “the ‘old’ old” (p. 613). Since patients above seventy-five years have generally been excluded from clinical studies on ICDs, this difficulty also manifests itself in this case. The cardiologists in the meeting go through the data of Sophie, a patient who seems eligible for an ICD. “But what is the actual age limit for an ICD implantation?” Sally, a resident, asks. Helen replies: “The gains don’t seem to exist after 75 years.” Doris qualifies that and states: “No age limit has been set. The oldest [patients] in studies are 60 to 65 years. We don’t know what happens with the older ones.” Sally: “We set no limit, but we do anyway?” Helen: “The time you will benefit from the ICD is limited when you are older. Heart failure has its own journey and leads patients towards death. You are supposed to have an expected survival of one year. […] Sometimes you can see that the person is dying.” (Field notes #158)
Although the evidence base does not cover such older people since they are excluded from studies, the guidelines do provide guidance for the treatment of older patients. Supported by Swedish law, the guidelines developed and published by the NBHW explicitly prohibit using age as an exclusion criterion for treatment to prevent age discrimination. To attend to the higher risks of harm associated with age while avoiding age discrimination, the NBHW distinguishes between chronological age and biological age (Socialstyrelsen 2008a, 17). Equal access to health care services is a central concern within the Swedish welfare state and constitutes one of the six key notions formulated to guarantee good care (Socialstyrelsen 2006). Care providers and guideline developers are required to adhere to these criteria because they need to abide by Swedish law (Svensk Författningssamling 1982). Therefore, the guidelines on ICD surgery do not distinguish between patients based on chronological age and even urge doctors not to do so. While such guidance may be laudable for a welfare state fighting ageism, it does produce a conflict between the temporality involved in establishing internal validity in scientific studies (chronological age) and the temporality ethically prescribed in clinical decision-making (biological age).
Clinicians routinely deal with this tension. Doctors usually ask for more factors in order to make a decision about the biological age of the patient. Assessing biological age is more than attending to facts about the physical status of a patient. It is the assessment of the time left until the end of the person’s life span. Facts about a patient’s physical status are thus relevant for a decision about biological age only to the extent that they can be connected to the patient’s life span. With age, the complexity of these decisions increases, and this increase in complexity is prompted by the NBHW guidelines prohibiting age discrimination: because chronological age is excluded as a clear treatment criterion, age must be discussed in relation to other criteria from the evidence base (see also Will 2009). Consequently, in the thirty-eight conferences, at which 159 patients were discussed, age came up for 17 patients, over 10 percent. In 6 of these cases, “no age limit” or a version of it was mentioned, as was the principle of nondiscrimination. This does not mean, however, that chronological age is no longer considered, as the ban on age discrimination prescribes. As Helen indicates in the excerpt above, chronological age is one of the factors that help her “see” whether a patient is dying.
But given that using chronological age to assess biological age was precisely what the NBHW was trying to prevent, could Helen’s reference to chronological age not be considered unethical, even illegal? We were somewhat puzzled by Helen’s imposing an age limit where there so clearly wasn’t supposed to be one. But we would soon understand that the NBHW concern with preventing age discrimination through a focus on quality of life needed to be balanced with Helen’s equal concern about quality of death. A patient, Pauline, is presented because she had a heart attack and cardiac arrests. She now has an ejection fraction of 25–30%, which is well within the inclusion criteria set within the guidelines for ICD treatments. Her physical condition is also within the limits set: NYHA IIIa, which means that she is quite affected by her condition but can move about. The cardiologists discuss whether she should have an ICD to prevent another episode of cardiac arrest. “But do we put ICDs in 81-year-olds?” Loisa, another physician, asks. “There is no age limit for ICDs,” Helen answers. Doris comments that she denied a 91-year-old patient the other day. “What is quality of life?,” she justifies her decision. “There is the risk that the patient is resuscitated at home in solitude again and again, until the battery runs out. There are so many other factors that have to be considered.” (Field notes, #189)
The NBHW assigns the highest priority, priority 1 (10 being the lowest priority), to interrupting the ICD treatment for terminally ill patients, notwithstanding an explicitly acknowledged lack of randomized and systematic studies of the phenomenon. In the guidelines, the recommendation to turn ICDs off in terminally ill patients sits right next to the recommendation to implant them in patients with heart failure and an EF of 35 percent or less. There is no formal difference in the guidelines between the strength of the two recommendations, although the latter is well supported by archetypal clinical evidence and the former is not. The recommendation on turning off the ICD for terminally ill patients also follows the same format: a table that specifies the intervention, degree of severity, evidence of effect, cost per life year gained, and finally the strength of the recommendation (Figure 1).

Figure 1. The recommendation from the National Board of Health and Welfare guidelines, including translation (Socialstyrelsen 2008b, 165), and the way this assessment tool was used for individual patients in this clinic: D40 is the number of this individual recommendation; the recommendation 1 means an interruption of the ICD was recommended in the strongest possible terms.
The risk of leaving the ICD on for too long is frequently raised spontaneously by doctors, nurses, and administrators during interviews and conversations about ICD treatment. The stories converge and the images are forceful: someone dying but being resuscitated repeatedly; the pain involved in being electrically shocked to life-before-death again and again. There is a clear recommendation (“turn it off!”) as well as an indication criterion (“terminally ill”), but the measures are fluid (“how do you determine this?”), so the recommendation becomes difficult for the clinicians to manage. It sets a definite limit at the far end of a patient’s life, but it does not define where that end is. When is someone terminally ill? This becomes even more difficult to determine since the whole point of the ICD is to make patients autonomous and not dependent on hospital care. Usually patients (if not hospitalized) are not monitored on a daily basis but come in for routine visits three to four times per year. When clinicians approach the question of deciding whether someone is terminally ill and needs to have their ICD turned off, the decision-making is often complex due to individual variation in decision-making preferences, knowledge and view of the ICD technology, uncertainty about the illness trajectory, and different configurations of the patient–practitioner relationship (Lewis, Stacey, and Matlock 2014). The discomfort doctors face in discussing this decision so explicitly can be quite pressing, as a qualitative study with the apt title “It’s Like Crossing a Bridge” shows (Goldstein et al. 2008).
For Helen, Sally, Loisa, and Doris, decision-making about ICDs in such cases centers on how they entwine chronologically informed biological age with an evidence base that keeps chronological age and biological age apart. By doing so, they mobilize the temporality of a life cycle in relation to an evidence base that excludes the “old” old as well as their risk of suffering due to an ICD that does not allow that end of life to occur. Using chronological time in the assessment of biological aging, though at the edge of what is legally permissible in Sweden, is grounded in moral reasoning that changes the temporality of evidence to help prevent harm to dying patients.
Having paused at these practices of decision-making on ICDs in relation to “old” old age and when encountering EF values that hit the mark surprisingly well, and having explored the temporalities of evidence invoked in those decisions, we now return to the implications such analysis of the temporal organization of practicing EBM has for coming to a more diversified understanding of clinical outcomes.
Conclusion: Knowing Times in EBM
EBM is currently pursued on many levels and in a growing number of health care systems worldwide. STS scholarship has shown that this development provides ample opportunities for empirical analyses that move beyond considering EBM as an exercise in reductionist simplification of the complexity of clinical practice. But whereas empirically detailed studies of EBM are said to help anchor such scholarship in concerns regarding clinical outcomes, our study shows that such empirical studies equally contribute to a more diversified understanding of what counts as a “clinical outcome.” Decisions about ICDs brought forward discussions about such outcomes in relation to histories of evidence in cardiology, sustainable futures of health systems, and patient futures that may come with severe iatrogenic harm. The historical abundance of evidence on cardiology interventions, combined with the recent history of budget constraints due to the rise of ICDs, led to concerns about how such evidence may lead to unsustainable futures for welfare state health systems that need to provide care for patients needing a wider range of therapies that cannot be studied so easily within an RCT. These pasts and futures of evidence in cardiology mattered hugely when deciding on ICDs in some cases. The high number needed to treat raised concern about weighing the benefit for one patient against the risk of side effects for all others, whose futures would not bring a cardiac arrest but another end to the life cycle, possibly while being shocked to life-before-death. In all these cases, clinical outcomes were central to the considerations, but how these were weighed and which decision was taken depended strongly upon the temporality of evidence invoked.
This clearly indicates that following the suggestion by Timmermans and Haas (2008) to anchor STS of EBM in the clinical outcomes that “matters most to patients and health care providers,” though crucial for moving beyond the critique of EBM as reductionist, mostly leads to a more diverse understanding of those very clinical outcomes that were to serve as an anchor. Including clinical markers in the analysis results in pointing to their diversity and to conflicting, or at least not simply matching, health outcomes that come into view depending on the temporal ordering that is mobilized.
To get closer to the outcomes that matter in relation to ICD decisions, it proved crucial to attend to the different temporalities of evidence that are mobilized to weigh treatment, effects, and the consequences for individual patients and health systems, now and in the future. The consideration of the guidelines in relation to a patient who perfectly matched the criteria and was facing recent arrhythmias, current suffering, and a possibly nearing cardiac arrest changed radically when the evidence from that same guideline was placed in the temporality of the history of excessive evidence in cardiology and concerns about the future of a sustainable health system for the Swedish population extending well beyond cardiology patients. In the case of age considerations, clinicians had to temporalize the evidence base when making decisions to reduce the risk of a horrifying quality of death. Temporalization in this case implied making chronologically informed assessments of biological age. These analyses of temporal dimensions in relation to clinical outcomes show that practices of EBM, such as guideline production and use, are able to challenge the naturalizing tendencies of time that come with clinical studies that use chronological age as an exclusion criterion, especially for the “old” old (see also Moreira 2017). EBM in practice instead turns out to deploy different temporal understandings that result in different understandings of clinical outcomes. EBM in practice thus turns out to be about time.
STS scholarship has been crucial in moving beyond studying the politics of EBM as a politics of simplification and in including the study of clinical outcomes in the analysis. Doing so, however, leads not to clinically “anchoring” such scholarship but to an appreciation of how temporalities of evidence matter for how such outcomes may or may not count. In other words, by extending the study of EBM in practice in relation to clinical outcomes, we may well find that EBM politics is a politics of knowing times.
Acknowledgments
Without the generosity of cardiology professionals like Helen, Doris, and their colleagues this study and its findings would not have been possible. Their ways of being professionals are at the core of this research. Thank you for giving us access to that expertise. We would like to thank two anonymous reviewers for helpful comments on earlier versions. Katie Vann did much more than can reasonably be expected from a managing editor to help us clarify the overall argument of the paper. We also wish to thank Ingemar Bohlin, whose larger project on knowledge production and application in evidence-based medicine enabled this study.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research and/or authorship of this article: This study received funding from Vetenskapsrådet (Grant Id 2005-2373).
