The Ratio of Observed to Expected: How Much of It Is Unexpected?

Abstract

“Numbers don’t lie.” This is probably one of the few universally accepted truisms we are all brought up with from early childhood. Increasingly, the practice of medicine is subjected to this “truth in numbers” concept for analytical and ratings purposes. But do numbers really always tell the truth?

A recent example comes to mind to point out the limitations of this concept. A medical director of another medical system, with whom I am friendly, called with an acute concern. His spine service, which provides tertiary/quaternary spine care, had experienced a sudden, very concerning rise in its mortality (death) rate. This well-established facility prides itself on its excellent, publicly reported (and even advertised) health, quality, and safety metrics, and he was very concerned about what seemed to be an extraordinary jump in the mortality dimension. This prompted him to request an “outsider’s expert opinion.” Without divulging too many details, I thought my ensuing observations might be interesting to other spine surgeons, as this story demonstrates ongoing concerns about data use and interpretation and our requirement to transparently prove the value, quality, and preeminent safety of our work, not only from one patient to the next, but also in the greater statistical purview of nonspine surgeons.

In brief, the spine service in question had, over several years, enjoyed an observed-to-expected (O/E) mortality ratio of below or well below 1, meaning that fewer deaths were recorded than the risk model expected and that the calculated death rate was of no concern to any reviewers (as it was below the magic number “1”). This was reassuring to administrators and spine care providers alike, and they were not shy in publicizing this number. Usually there would be 1 to 2 deaths over the course of a year assigned to the spine service, and these were, in well-established fashion, carefully analyzed and processed to look for improvement possibilities.

Without any discernible changes in personnel, policies, or procedures, suddenly there were 4 recorded deaths in a quarter, and alarm bells were set off as the O/E ratio suddenly jumped into the “3s.” After receiving appropriate clearances, I reviewed the cases in question and found the following details:

Patient A: An obese, diabetic, male patient in his sixties underwent an entirely uneventful lumbar decompression surgery for spinal stenosis and experienced a catastrophic myocardial infarction within a day of the procedure, despite having been cleared preoperatively by a cardiologist and receiving the appropriate perioperative precautionary measures. A case of the dreaded “silent” diabetic coronary angiopathy had likely claimed the life of this patient.

Patient B: A male patient in his late fifties presented with a bilateral unreduced fracture dislocation C4-5 with a C4 level ASIA A, MS 0 spinal cord injury more than 24 hours out from a motorcycle crash. A staged anterior decompression and reduction with interbody grafting and locking plate fixation was followed a day later by a multilevel posterior decompression and fusion. Both surgeries were performed swiftly without complications and resulted in an effective decompression of the cord and good hardware placement. A week later the patient had not improved neurologically at all and remained ventilator dependent, and his follow-up MRI showed a well-decompressed cord but devastating intramedullary cord signal changes. After several visits with the palliative care team and family conferences with surgeons and rehabilitation medicine doctors, and with the support of his family, he decided to receive comfort care only. The patient expired peacefully a day later.

Patient C: A male patient in his later eighties with a typical AO type B2 lower cervical spine hyperextension fracture in ankylosing spondylitis without neurologic deficits was treated with early posttraumatic posterior multilevel fixation with low blood loss and great-looking reduction and fixation. This emaciated male had ongoing postoperative dysphagia without any signs of esophageal injury or entrapment and, after repeated failed efforts at nutritional support, was finally scheduled for a percutaneous endoscopic gastrostomy (PEG) tube. A day after this procedure, the patient expired from the complications of an occult perforated gastric artery.

Patient D: A female patient in her late seventies with multiple comorbidities, who had failed prolonged nonoperative care for a midlumbar burst fracture with high-grade canal stenosis and inability to mobilize, received an uneventful limited posterior decompression, fixation, and fusion after multiple well-documented medical and shared decision-making interactions of her spine providers with an agonized patient and an agonized family. After successful mobilization she was transferred to a skilled nursing facility a week later. There she developed clear wound drainage 2 weeks after her discharge from the hospital. She was, however, refused return to her primary hospital by practitioners at her care facility for almost a full week with the statement that “she wouldn’t tolerate any more spine surgery anyways.” When she finally was brought back to the hospital in question, she presented in septic shock and, despite heroic surgery, finally succumbed to multiorgan failure.

These 4 cases and the patient fates that they represent raise a number of interesting questions and concerns about how we see and use statistics to look at our practices.

Aging and sicker patients: Undoubtedly, in many countries the population that we as spine surgeons are asked to treat is getting older and sicker. The statistics are fairly clear that this trend is continuing. Three of the 4 patients in this mortality statistic clearly were representative of this well-documented global trend.¹

The attribution of “responsibility” in 3 of these 4 cases (patients B, C, and D) raises questions about whose decision it is to attribute these cases. The patient with the complication of the PEG tube placement should arguably never have been assigned to the spine service, as the patient experienced the complication as a result of another procedure. However, the spine service “ordered” this other procedure, so one could argue that it remains the responsible party. Should the initial surgeon and their service bear the responsibility for this mortality, or should this be placed into the domain of the subsequent consulting service? Should the outside convalescent facility asked to care for patient D, with its clear delay in care, be held more or less responsible than the treating surgeon, who did the best they could under most difficult circumstances and had no knowledge of a neglected wound healing complication (despite clear orders asking to be notified in case of wound drainage)?

In the case of the patient with severe spinal cord injury who decided to have life support measures ended because of a grim preexisting outlook, despite best medical efforts, is that the responsibility of the treating surgeon? Does the attribution of this case to surgeons really help us as a society, or could we intimidate future surgeons into thinking that by doing nothing they could stay clear of a complications listing, thus depriving patients with difficult conditions of any interventional care? In the end, it turns out that these assignment decisions are made by hospital coders who do their best by following guidelines. In their logic and according to their guidelines, they thought they were doing “the right thing” but also did not ask for input from the surgeons in order to stay “independent.” As was the case in this hospital system, coders are usually not part of the discussions held behind “closed doors” in morbidity/mortality and quality review proceedings. Thus, they don’t know what discussions went on among peers and are left to their own adjudication.

Calculating “E”: The “E,” the “expected” part of the O/E quotient, is the statistical tool used to express the comorbidities of patients and thereby provide a reference for the “observed” part of an occurrence. In principle, a “quotient” is used to express mathematically the “presence or degree of a characteristic in someone or something.”² In an equation where the desired outcome is a quotient below “1,” the role of the divisor is obviously critically important, as it provides the very foundation of this equation. In short, the larger the expected number, the better the chance of bringing the result to the magical “1” or below, and the smaller the likelihood that it will be above “1,” a dreaded finding, especially if it is a change from the treasured past values. The critical question, therefore, is how the expected number is calculated. Basically, the cohort in question is compared to a “reference” population derived from a larger database (for instance, in the United States, a Medicare database with fee-for-service reimbursements) using patient characteristics such as age range, gender, and some key diagnostic codes expressed in ICD-9 or ICD-10 terminology.

In the case of the ankylosing spondylitis fracture patient in his eighties, the statistical modeling apparently predicted a mortality rate of 3%. This was quite surprising, as the published literature of the last 20 years reports the 1-year mortality rate of patients with ankylosing spondylitis fractures, regardless of type of treatment, to be 20% or even higher, especially in a patient close to 90 years of age.³ On closer review, it appeared that the key parameter, the fracture in an ankylosing spine, had been dropped to a lower representation in the coding hierarchy, thus missing the impact of this injury entirely. The coder probably felt that the dysphagia was attributable to the surgery and the surgeon, but did not know that this was an expected problem given the advanced age and the nature of injury in this type of underlying spine condition. This adjudication would probably be heavily called into question by most, if not all, spine surgeons. Interestingly, a change in coding personnel may result in different interpretations as well, as was likely the case here.
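
To make the role of the divisor concrete, the arithmetic can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it assumes that “E” is the sum of each patient’s model-predicted probability of death, and the cohort of 99 routine patients at 1% predicted risk is invented. Only the 3% and 20% figures and the 4 observed deaths come from the case described here. The 20% figure is a 1-year mortality rate from the literature and may not match the endpoint the risk model actually uses; it is plugged in purely to show how sensitive the ratio is to one coding decision. The function names are hypothetical.

```python
import math


def expected_deaths(predicted_risks):
    """Assumed definition of E: the sum of per-patient predicted death probabilities."""
    return math.fsum(predicted_risks)


def oe_ratio(observed_deaths, predicted_risks):
    """Observed deaths divided by expected deaths for one reporting period."""
    expected = expected_deaths(predicted_risks)
    if expected <= 0:
        raise ValueError("expected deaths must be positive")
    return observed_deaths / expected


# Invented cohort: 99 routine patients, each with a 1% predicted risk.
routine_patients = [0.01] * 99

# The same quarter with the high-risk patient coded two different ways.
risks_as_coded = routine_patients + [0.03]    # model prediction quoted in the text
risks_if_recoded = routine_patients + [0.20]  # literature figure, illustration only

observed = 4  # deaths recorded in the quarter

ratio_coded = oe_ratio(observed, risks_as_coded)      # 4 / 1.02
ratio_recoded = oe_ratio(observed, risks_if_recoded)  # 4 / 1.19

print(f"O/E as coded:   {ratio_coded:.2f}")
print(f"O/E if recoded: {ratio_recoded:.2f}")
```

With these invented inputs, recoding a single patient moves the ratio from about 3.9 to about 3.4. Because the divisor is so small, one coding decision shifts the published figure noticeably without any change in the care delivered.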

Small numbers effect: When relatively small numbers are assessed statistically, any shift in numbers can have a dramatic effect. In this example, a move from 1 to 4 patient deaths may look like a dramatic rise from one quarter to the next, but seen over the course of a year or longer it should balance out.
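
One way to see how little a single quarter says is to put a rough confidence interval around the ratio. The sketch below uses Byar’s approximation to the Poisson interval, a technique chosen here for illustration and not taken from the hospital’s reporting method. It assumes the observed deaths behave like a Poisson count and uses a hypothetical expected value of 1.2 deaths, picked only so that 4 observed deaths give a ratio in the “3s.” The function names are likewise hypothetical.

```python
import math

Z_95 = 1.96  # standard normal quantile for a two-sided 95% interval


def byar_count_interval(observed, z=Z_95):
    """Approximate Poisson confidence limits for a count (Byar's approximation)."""
    if observed < 0:
        raise ValueError("observed count cannot be negative")
    if observed == 0:
        lower = 0.0
    else:
        term = 1 - 1 / (9 * observed) - z / (3 * math.sqrt(observed))
        lower = observed * max(0.0, term) ** 3
    shifted = observed + 1  # the upper limit uses the count plus one
    upper = shifted * (1 - 1 / (9 * shifted) + z / (3 * math.sqrt(shifted))) ** 3
    return lower, upper


def oe_interval(observed, expected, z=Z_95):
    """O/E ratio with approximate confidence limits, treating E as fixed."""
    if expected <= 0:
        raise ValueError("expected deaths must be positive")
    lower, upper = byar_count_interval(observed, z)
    return observed / expected, lower / expected, upper / expected


# Hypothetical quarter: 4 observed deaths against an assumed 1.2 expected.
ratio, low, high = oe_interval(observed=4, expected=1.2)
print(f"O/E = {ratio:.2f}, approximate 95% interval {low:.2f} to {high:.2f}")
```

Under these assumptions the interval runs from roughly 0.9 to roughly 8.5. It still includes 1, so a quarter with 4 deaths cannot by itself distinguish a real deterioration from chance variation.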

Cost of care: Patients A and B seemed to have excellent care, not only during their hospitalization but also in their preoperative workup. In the attribution of cause, the question was raised why patient A did not have a more detailed coronary artery workup after having had an unremarkable preoperative ECG and a stress test, and reporting no clinical signs of cardiac insufficiency. In patient B’s case, the thought was that the patient could have been kept in a surgical unit for a longer period of time to monitor healing better. Again, this is an example of a discrepancy between what most, if not all, spine surgeons would consider an acceptable and widely practiced standard of care and an abstract image, usually derived post hoc, of an idealized form of care, which is frequently not only unrealistic, but quite possibly more dangerous to patients.

In the end I assured the medical director that no intervention was needed. Reassuringly, other typical quality parameters, like infection, unplanned readmissions, and returns to the OR, were completely unchanged. My advice was to enhance communication among coders, quality managers, and clinicians to record the best possible “cleaned” data so that real insights to benefit patient care can be gleaned without data manipulation and “gaming the system.” In this case an amicable resolution between the coders and the clinicians was possible, but it was not hard to see how differently this could have played out had there been differences between administrators and clinicians. In a situation of disjointed clinicians and administrators it is very possible, and indeed has happened, that “hard data” can easily be weaponized against clinicians. In the end, this example shows how difficult it can be to apply the hard binary truth of simple numbers to complex clinical scenarios with many human factors attached. We have to realize that there are judgment calls made, not only by clinicians in the hospitals, but also by coders as to the classifications and ratings of care and complications. In the act of translating words into numbers, judgment calls in grey zone situations affect the numeric expression and can lead to false impressions. So in the end, numbers don’t lie, of course, but we humans can err in our interpretations of numbers.

For me, the main message is that clinicians in high-complexity specialties like spine surgery must not only document their decision-making and clinical actions diligently and clearly, but ideally must also be involved from the ground up in the data gathering and analysis process and cannot abdicate this responsibility to some detached coding office. Otherwise, the expected quality data can lead to most unexpected findings, with extraneous, and potentially harmful, conclusions being drawn.

Jens R. Chapman, MD, Swedish Medical Center, Seattle, WA, USA; Karsten Wiechert, MD, Paracelsus Medical University, Salzburg, Austria; Jeffrey C. Wang, MD, USC Spine Center, Los Angeles, CA, USA

References

1. Murray, Vos, Lozano, et al. Disability-adjusted life years (DALYs) for 291 diseases and injuries in 21 regions, 1990–2010: a systematic analysis for the Global Burden of Disease Study 2010. The Lancet. 2012;380(9859):2197–2223.

2. Collins Dictionary. https://www.collinsdictionary.com/dictionary/english/quotient. Accessed September 11, 2018.

3. Rustagi, Drazin, Oner, et al. Fractures in ankylosing disorders: a narrative review of disease and injury types, treatment techniques, and outcomes. J Orthop Trauma. 2017;31:S57–S74.