Abstract
Healthcare organizations and workers are under pressure to produce increasingly complete and accurate data for multiple data-intensive endeavors. However, little research has examined the emerging occupations arising to carry out the data work necessary to produce “improved” data sets, or the specific work activities of these emerging data occupations. We describe the work of Clinical Documentation Integrity Specialists (CDIS), an emerging occupation that focuses on improving clinical documentation to produce more detailed and accurate administrative datasets crucial for evolving data-intensive forms of healthcare accountability, management, and research. Using ethnographic methods, we describe the core of CDIS’ work as a translation practice in which the language, interests, and concerns of clinicians and clinical documentation are translated via real-time “nudging” and ongoing education of clinicians into the language, interests, and concerns of medical coders, structured administrative datasets, and the various stakeholders of these datasets. Further, we show how the institutional context of CDIS’ work shapes the occupational virtues that guide CDIS’ translation practice, including financial reimbursement, quality measures, clinical accuracy, and protecting clinicians’ time. Despite the existence of these multiple virtues, financial reimbursement is the most prominent virtue guiding CDIS’ limited attention. Thus, overall clinical documentation is “improved” in specific, partial ways. This research provides one of the first studies of the emergent data work occupations arising in the wake of digitization and big data opportunities, and shows how local data settings shape large scale data in specific ways and thus may influence outcomes of analyses based on such data.
Keywords
Introduction
The ability to collect, store, and analyze massive amounts of data is expected to revolutionize everything from scientific discovery (Hey et al., 2009) to business (McAfee et al., 2012). Data-intensive science has been widely touted in the health domain in particular. Health information technologies such as Electronic Health Records (EHRs) were widely marketed based on the promise that implementation would lead to a treasure trove of data that could be marshaled to improve the efficiency and quality of healthcare services on a broad scale. While computational advances provide the preconditions for data intensive analyses in healthcare, the institutional drive for increased accountability has simultaneously led to intensified oversight of healthcare providers (Tuohy, 2003). Healthcare is not operating in a vacuum; the move to quantified accountability is part of a powerful and global trend towards data-driven accountability (Shore and Wright, 2015). In universities (Sauder and Espeland, 2009) and other sectors of education (Anagnostopoulos et al., 2013), algorithmic measurements are applied to data representing the practices of both individual and organizational providers. Institutional demand and local adoption of data-driven accountability increase the need for more accurate data produced in specific formats.
Thus, interwoven forces of technological advances in data tools, digitization of healthcare work systems, and demand for data-driven accountability are contributing to what has been called “data intensive resourcing” in healthcare (Hogle, 2016), defined as “ … attempts at getting more data, of better quality, on more people. Sourcing is a dynamic process of creating, collecting, curating and storing data while simultaneously making them available for multiple purposes, including research, governance and economic growth” (Hoeyer, 2016: 74). Data intensive resourcing places increasing demands on healthcare workers. For example, since medical records are the major source of data extraction, clinicians are facing new demands for documentation in patient records to be accurate and comprehensive (Kuhn et al., 2015), so that other data workers such as medical coders can extract high-quality data for usages such as data-driven accountability. Subsequently, a new line of research has begun to examine the on-the-ground work required to produce, manage, analyze, and deploy data in healthcare carried out by healthcare workers, including both clinicians and non-clinicians (Bjørnstad and Ellingsen, 2019; Bonde et al., 2019; Grisot et al., 2019; Islind et al., 2019; Pine, 2019).
While this burgeoning research stream sheds light on how data intensive resourcing is re-shaping practices of existing workers, a related phenomenon remains unexamined: new and emerging occupations that have sprung up to manage new forms of healthcare data work. As outlined by Anteby et al. (2016: 187), occupations are “socially constructed entities that include: (i) a category of work; (ii) the actors understood—either by themselves or others—as members and practitioners of this work; (iii) the actions enacting the role of occupational members; and (iv) the structural and cultural systems upholding the occupation.” New occupations come into existence through multiple processes (Anteby et al., 2016), including going from unpaid to paid work (e.g., household work), “hiving off”—allocating of an established profession’s routine tasks to others (Nelsen and Barley, 1997) (e.g., sonographers taking over ultrasonic imaging from physicians), and arising when new tasks have to be done (Crosby, 2002). Variously referred to as “new” or “emerging,” in the U.S., these types of occupations are often defined by researchers in relation to the U.S. Bureau of Labor Statistics’ Standard Occupational Classification system as occupations with work activities, skills, and knowledge that are not listed on the most recent version, but exist and are growing in number (Crosby, 2002). Henceforth, we refer to “emerging occupations,” because the definition encompasses new occupations as well.
As data intensive resourcing has developed in healthcare, a number of emerging occupations have materialized—yet, studies of emerging occupations centered on data work are exceedingly rare. One such emerging occupation is Clinical Documentation Integrity Specialist, abbreviated “CDIS.” CDIS are an emerging occupation, because they are not listed in the most recent version of the Standard Occupational Classification System. The classification “Medical Records Specialists” (occupation 29-2072) refers to medical coders, a related but markedly different occupation; CDIS make higher salaries ($76,096 on average according to ziprecruiter.com vs. an average $40,350 for coders), and unlike coders typically have a clinical background and extensive clinical experience (ACDIS, 2017). CDIS are growing in number (at the time of writing over 10,000 CDIS job openings were listed on Glassdoor) (Barnhouse and Rudman, 2013), and formed a professional association (ACDIS) in 2007.
CDIS monitor clinicians’ charting in real time and query clinicians to improve documentation. They also lead educational efforts to teach clinicians how to better document, so medical coders can produce more specific and detailed code sets. In a classic study of medical records, Garfinkel noted there are “good” organizational reasons for “bad” records. From the perspective of clinicians, records are made for the purposes of documenting and coordinating care and treatment, not for managerial or statistical aims. Documenting for the latter purposes demands extra work for which clinicians have little time and interest; indeed, such time may be better spent on patient care (Garfinkel, 1974). CDIS nudge clinicians to improve “bad” medical records; records that were deemed “good enough” for decades. The value of CDIS is ultimately tied to data intensive resourcing and attendant demands for more—and more fine-grained—data (Kuhn et al., 2015). Yet, careful consideration of the work of CDIS raises important questions about the institutional demands placed on healthcare datasets, and in turn, data workers. Medical records are a key input into data intensive resourcing, and this source material is being shaped by CDIS into better data—but, what are the virtues that shape CDIS’ notion of “good” medical records? As we will show, the improvements made are directed by certain overarching occupational concerns, or what we (inspired by Pickersgill (2019)) call virtues.
As CDIS grow in number, their practice is reshaping medical records, and in turn, shifting the healthcare datasets increasingly relied upon for research, accountability, and other ends. Yet, despite the rapid growth of CDIS, no scholarly studies of their work exist outside of a small body of research aimed at practitioners. Scholarly literature contains no research about the content of CDIS’ data work, or about how the context of data work influences the data elements and datasets produced. To fill this gap, and extend the literature on practices of data production and emerging data work occupations broadly, we answer the following research questions:
What practices does the emerging occupation of CDIS employ to improve the integrity of medical documentation, thus producing data?
How does the current institutional context of U.S. healthcare shape the data work of CDIS?
Using in-depth qualitative research at multiple locations in the U.S., we show how CDIS employ a novel and complex set of data practices focused on improving source material for coded datasets—namely, medical documentation. These data practices center on translating between the written form, concerns, and interests of medical documentation recorded by clinicians, and structured code sets (which collectively form “administrative datasets” used for a variety of purposes) produced by medical coders. We show how, through their translation, CDIS improve medical documentation in a specific and partial way, and how the practices of promoting data integrity are deeply shaped by the broader institutional context of healthcare in the U.S.
Theoretical framework
Documentation improvement as translation
Our theoretical framing is based on the concept of translation as it is used within the field of Science, Technology, and Society (STS). Originating with the French philosopher Michel Serres, translation is not confined to language, but should be understood more broadly as “ … an act of invention brought about through combining and mixing varied elements” (Brown, 2002: 6). Translation involves displacement or “shifting out” of elements, functions, or relations. Translation is a process of making connections between various domains or entities, thus creating innovations that combine elements into new entities. Drawing on Serres, Latour shows how, for example, one might translate the aim of having closed doors in a variety of ways, from having a sign saying “please close the door,” to hiring a porter, to installing an automatic door-closer. Thus, different translations entail moves between different materials, each of which changes the relational network in which “door” is embedded (Latour, 1988). Similarly drawing on Serres, Callon shows how the interests of fishermen, scientists, and scallops are translated through a project on the introduction of Japanese scallops into the waters of Brittany (Callon, 1986). Here, translation entailed redirecting actors’ actions and interests to align with the project, thus establishing relations across separate domains. For Serres, translation is an ontological given, while Latour and Callon apply the principle to technologies and projects (on translation in STS, see Brown, 2002). Within healthcare, the move from paper-based to digital medical documentation is a kind of translation in which the medium of documentation is changed; distribution of information to multiple readers, digital accumulation, and calculation of data are enabled; and new networks of actors are established.
CDIS query and nudge clinicians to change their medical documentation language towards wording that is code-able by the requirements of medical coding, thus translating in the literal sense. In a broader sense, CDIS translate as they embed physicians’ practices into concerns of value-based reimbursement, hospital management and reputation, healthcare authorities, and accountability. Thus, clinicians are enrolled into a new practice of data intensive resourcing.
In their translation work, CDIS must handle the interests of different actors, including clinicians, administrators, medical coders, and quality improvement personnel. For example, doctors dislike extra work beyond what is necessary for medical purposes, administrators aim for financial payoff in the form of increased reimbursements and decreased denials from healthcare payers, and quality improvement personnel wish to have the data necessary for accountability measures. We draw upon the concept of “biomedical virtues,” defined as “ … the (profession-defined) praxis of goodness … ” within the clinic (Pickersgill, 2019: 17). Such virtues can be actual or aspirational, and can be strategically employed, such as when the canonical biomedical virtue of preventing death is invoked for treatments that may potentially do this in the future. While virtues are possessed by groups, such as an occupation, groups possess them only through the situated practice of the people who animate the institution (Fricker, 2010). Multiple virtues may be employed, as in the case of digital psychiatry, where virtues include access to treatment, enhancement of clinical decision-making, and new insights into disease (Pickersgill, 2019). Though CDIS work within healthcare and medicine, their virtues are not biomedical as such. Their “praxis of goodness” is not directed at patients’ health, but at achieving comprehensive and accurate data, that is, at making doctors document “correctly” according to multiple virtues, as we will unfold below. We find Pickersgill’s concept useful for showing how CDIS’ work is directed by multiple virtues, and how these virtues shape their translation practices.
Emerging occupations and data work
As the importance of data for understanding, explaining, managing, regulating, and predicting aspects of the world has grown (Kitchin and Lauriault, 2014), research has critically examined practices of sharing and reusing data (e.g., Bietz and Lee, 2009; Borgman, 2015); maintaining and repairing data collection infrastructure (Cohn, 2016); designing algorithmic systems (Holten Møller et al., 2020); and “prospecting” data for data science (Slota et al., forthcoming). These studies are evidence of the fact that data work is often more effortful, skillful, and resource-intensive than suggested by much of the scholarly literature or popular media accounts that depict data as “digital exhaust.” In healthcare, demands for data work have resulted in occupational groups needing to learn new data practices as the content of their work is changed by new or increased demands for data. For example, birth certificate clerks in the U.S. have come under increasing pressure to improve their data entry, because birth certificate data have recently become an important source for research and data-driven accountability of obstetric practices (Pine et al., 2016). Clinicians are also tasked with performing new kinds of data work, for example, to maintain integrations between data in various information systems (Bjørnstad and Ellingsen, 2019) and to interpret patient-generated data (Grisot et al., 2019; Langstrup, 2019).
Less research has been conducted on the emerging occupations that have arisen to meet the demands of data intensive resourcing. Emerging occupations centered on data work in healthcare include “Health Data Scientist,” data scientists focused on health analyses (Davenport and Patil, 2012), and “Medical Scribe,” who record clinician notes into EHR systems (Bossen et al., 2019a). However, despite growing evidence of emerging occupations, there is still little scholarly research on the local, on-the-ground data workers, and data work practices that fuel the data intensive resourcing ecosystem (Bossen et al., 2019b).
Focusing on translation as a facet of data work, prior research has examined how layperson users apply their interpretive frameworks and translate between genetic analyses and different ways of deriving meaning, value, and action from that data (Ruckenstein, 2017). New forms of data necessitate translation work by workers who render data into a useable form; patient-generated data, for example, must be translated into medical record data by clinicians (Islind et al., 2019). Anteby et al. (2016) describe three (interrelated and often overlapping) lenses applied by researchers to study occupations in organizations. The lenses are “becoming” (how a new member is socialized into an occupation); “doing” (the specific activities an occupation does); and “relating” (the relations that an occupational group builds with other groups). The translations that CDIS engage in bridge different material forms, interests, and values, and also involve direct interaction with different occupational groups (e.g., clinicians and medical coders) to nudge clinicians toward producing records that can be better coded by medical coders. Hence, CDIS’ work activities of improving data are fundamentally bound up in their relations to other occupational groups.
The institutional context of data intensive resourcing in healthcare
A growing body of research examines how institutional contexts shape data work in organizations. For example, recent work shows that nonprofit organizations will prioritize data production over other concerns to provide data to funders to maintain legitimacy (Bopp et al., 2017). Several interwoven institutional, political, and economic forces are contributing to demands for data intensive resourcing in healthcare in which data sources, data analytics, health domains, and data-intensive applications are connected (Hogle, 2016). Healthcare organizations experience pressure to improve data infrastructures and to reconfigure healthcare organizations and healthcare work around data creation and use (Aula, 2019). As Hogle describes, “ … capturing big data will enable the transformation of healthcare, so it is necessary to transform healthcare to capture big data” (Hogle, 2016: 380–381).
Since EHR data has proven not to be easily extractable or standardized across sites (Verheij et al., 2018), structured data produced as part of administrative processes has taken on increasing importance. ICD-10 (International Classification of Diseases, 10th revision) is an influential classification system for diagnoses and procedures that has been in use for decades (Bowker and Star, 2000); records coded using ICD-10 form the basis of administrative data in healthcare. Because of its size and complexity (ICD-10 contains roughly 139,000 codes), the ICD offers potential for accuracy in structured data sets. However, it takes an immense amount of work to fully apply ICD codes:
… the transaction costs involved in collecting information multiply with precision. In the case of the ICD, clinicians saw the work of collecting data as trading off against patient resources, while statisticians wanted as much accurate information as possible. (ibid: 144)
Further, in the U.S., value-based reimbursement systems (which tie part of a provider’s reimbursement to their performance on selected quality measurements) have been mandated by policy in recent years (for an overview see Hogle, 2019). The passage of sweeping national healthcare policy starting in 2010 has advanced such schemes for providers with patients who have public health insurance. Calculating the quality measures key to value-based reimbursement involves specific data elements, and since the quality measurements now impact reimbursement, the financial ramifications of under-collecting data can be devastating. On the other hand, healthcare providers risk audit and penalties for “upcoding”—coding for a higher level of severity than warranted in the documentation or for procedures that did not occur.
The rise of the Diagnosis-Related Group (DRG) classifications and the subsequent introduction of value-based reimbursement have spurred data-intensive resourcing in healthcare. This development ties in to the observation that the rise of modern administration and the sciences displays a parallel process of quantification that creates new entities (e.g., DRGs) that can travel across different domains, sites, and times (Porter, 1994). Further, the rapid growth of CDIS as an occupation reflects the gravity of this new world of intensified quantification. For years, healthcare organizations relied on medical coders or administrative assistants to code medical records produced by clinicians. Now, the data work entailed in producing administrative datasets for quantified entities (whether DRGs, quality measurements, or balance sheets) requires expert staff dedicated to accurate and comprehensive coding in accordance with the most recent guidelines. As we show next, the virtues embodied by this emerging occupation are centered on a variety of concerns in addition to accuracy and completeness of data. Thus, through the translation practices of CDIS, administrative datasets are “improved” in specific and partial ways. Nevertheless, these same datasets are key to a variety of big data analyses that may be blind to the virtues that shaped production of the datasets. For scholars and other stakeholders of big data broadly, the case of CDIS points to the importance of examining the virtues that guide translations carried out by data work occupations, and thus shape the content of big datasets themselves.
Case and methods
CDIS—An emergent occupation
CDIS programs have existed since the 1990s (Richard, 1992), but did not take off on a larger scale until the implementation of new coding requirements by Medicare and Medicaid in 2007, which required more granular DRG coding (AHIMA, 2010). This spurred the formation of the ACDIS (Association of CDIS, see: ACDIS.org) in 2007. ACDIS introduced certified credentials in 2009, and the American Health Information Management Association (AHIMA) launched its own regulations, guidelines, and accreditation for CDIS in 2010 (AHIMA, 2010). The transition from ICD-9 to ICD-10 in 2015 greatly increased the number of ICD codes, further spurring interest in CDIS programs.
It is clear from CDIS literature that coding and reimbursement are closely linked. As one paper summarized: “Clinical documentation improvement is an effective, but challenging method for improving revenue” (Asakura and Ordal, 2012: 100). A recent position paper from ACDIS states that CDIS work initially focused on financial reimbursement and quality outcome scores, and today has expanded to include data to calculate quality indicators such as mortality, readmissions, and complications. In the future, ACDIS envisions a focus on the integrity of data and reimbursement accuracy, and changed the “I” from “improvement” to “integrity” in 2019 (ACDIS, 2019). In practice, CDIS display multiple virtues as part of improving documentation, as we will show in our case.
Methods
Our findings build on ethnographic studies conducted at two different field sites involving observation and interviews. One field site is a healthcare network in the Southwest U.S. (referred to as “Southwest Hospital System”), where the CDIS unit is part of a Health Information Department. In 2014, staff increased from four to nine CDIS as coding moved from ICD-9 to ICD-10, and the unit has since grown rapidly to 26 CDIS at present. We interviewed the CDIS Manager, who had worked in this capacity for 11 years, and subsequently observed nine CDIS at work at five different locations for a total of 23 hours. We also interviewed all nine of them, for a total of 3 hours and 37 minutes (average 24 minutes per interview). All CDIS had worked as Registered Nurses (RNs) for more than 10 years, except one who had only two years of nursing experience. The other field site is a hospital on the West Coast of the U.S. (referred to as “West Coast Hospital”), where the CDIS unit is part of the Medical Records Department. We conducted one interview with the CDIS Director (60 minutes) and one with a CDIS (50 minutes), and observed two CDIS working in a single location for a total of 16 hours. The CDIS group had developed recently, having gone from zero to eight CDIS in only a few years. All CDIS in West Coast Hospital were Registered Nurses with several years of experience in the nursing field.
All interviews were recorded and transcribed with the exception of the CDIS Manager and one CDIS at the first site, where notes were taken during the interviews and extended to full text immediately after. The latter procedure was applied for observations as well.
Data analyses largely followed grounded theory, in that we did not seek to gather data to test existing theory. Rather, theorizing arose directly from data (Corbin and Strauss, 2014). To analyze data, all transcribed interviews and extended notes were coded by the two authors in a qualitative analysis software program, beginning with a round of open coding to create a set of agreed-upon codes that was then applied in a round of focused coding. As analysis advanced, we refined our theorizing through the concepts of translation and virtues, as described in the theoretical framework. For this paper, the two authors discussed the coded sections and identified the most pertinent findings to be presented and discussed, which we do next.
Findings
The process of CDIS work
CDIS’ work follows a general pattern of working between the EHR and a “computer assisted coding” (CAC) system, a software program that both CDIS and medical coders use to create code sets based on clinical documentation in the EHR. When a CDIS starts a new case, she (CDIS in our field sites are predominantly female, hence we use “she/her” pronouns) checks a “work queue” in the EHR, selects a patient case, and opens it in the EHR along with the CAC. Then the CDIS reads through the patient record. As the CDIS reads, she builds a summary of the patient in the CAC’s field for review notes. As the summary develops, the CDIS looks for details that will impact coding data and subsequently billing: the patient's condition, the level of severity of the patient’s condition, complications of the condition, primary and secondary diagnoses, procedures and tests, and so forth. CDIS look for complications (medical conditions that have arisen due in some part to other diagnoses) and comorbidities (presence of one or more chronic conditions along with the primary condition). Complications and comorbidities (CC) were introduced as codes with Medicare Severity DRG (MS-DRG) in 2007, and can be standard CC or “Major” CC (MCC).
The following observation of a CDIS outlines this typical process: “Andrea picks a chart from the work queue, and it opens in [the EHR]. She clicks ‘CAC’ and the chart opens. She skims the physicians’ notes, reads the case managers’ notes. The primary diagnosis is ‘A41.9 sepsis, unspecified organism’.” Andrea notices there is some conflicting information in the patient record: One of the auto generated codes, ‘Body mass index’ in the CAC piques her interest, and she starts investigating whether the patient has malnutrition. The patient’s body mass index is low, but the dietician’s consult note says ‘support non severe malnutrition’. However, since a physician note said ‘severe malnutrition,’ the record has conflicting information. Andrea starts writing the summary of the case for the physician: ‘The following clinical indications are in the record: … ’ She edits the possible options the physician can choose, inserting relevant choices and, as per CDIS guidelines, deleting irrelevant ones and leaving the options ‘Other’ and ‘Unable to determine’ on the list.
Once the query has been sent, Andrea places a “sticky note” in the EHR that will pop up each time the patient’s record is opened, alerting the physician that there is a documentation query. Physicians respond to queries by altering their documentation to add the necessary diagnosis, add specificity, clarify ambiguous charting, and so forth; physicians often also write a note back letting the CDIS know that they have altered their documentation—or decided not to do so.
CDIS return to cases every day or every other day, depending on how much activity and new information they expect, and continue to review additional documentation as long as the patient is admitted. They also check up on the status of their queries. When selecting cases to work on, CDIS start with newly discharged patients with unresolved queries. These are urgent because once a doctor has entered a discharge summary, it becomes harder for CDIS to successfully query a doctor. The next priority is cases with queries that have not been completed. Sometimes, even when doctors respond to a query, they forget to change their documentation, so CDIS monitor whether the documentation has been changed and send reminders if it has not.
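The case-selection order described above can be sketched as a simple sort. This is an illustrative sketch only, not software used at either field site: the `Case` record, its field names, and the third tier for all remaining cases are assumptions introduced here, and “newly discharged” is simplified to a discharged flag.

```python
from dataclasses import dataclass


@dataclass
class Case:
    """Hypothetical work-queue entry; field names are illustrative."""
    patient_id: str
    discharged: bool   # patient has been discharged
    open_queries: int  # queries the clinician has not yet resolved


def priority(case: Case) -> int:
    """Lower number = reviewed earlier, following the order in the text."""
    if case.discharged and case.open_queries > 0:
        return 0  # discharged with unresolved queries: most urgent
    if case.open_queries > 0:
        return 1  # still admitted, queries not yet completed
    return 2      # assumption: all remaining cases come last


def order_queue(cases: list[Case]) -> list[Case]:
    # sorted() is stable, so cases within a tier keep their queue order
    return sorted(cases, key=priority)
```

In practice, CDIS also weigh how much new activity they expect on a case, which a fixed three-tier ordering like this does not capture.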
At the Southwest facility, CDIS have a workload of 8 to 10 new cases every day, and 18–20 additional cases that they revisit until discharge. For each case, the CDIS builds a summary of the patient via a code set and notes in the CAC, which both evolve throughout a patient’s stay as she re-reviews the patient chart, submitting queries as a need is detected. CDIS make notes for themselves in a column called “tickler comments” on the CDIS work queue to help them remember pertinent issues on a case so they can more easily pick it back up. For example, one CDIS made the note “max/procedure 4/3” to indicate what she had determined to be the maximal “SOI” (severity of illness) and “ROM” (risk of mortality) for this case given a successful query.
Translation practices: Improving documentation to improve data
At the core of CDIS practice is the goal of improving clinical documentation, which will be processed into standardized data terminologies (ICD-10 and DRG codes). CDIS improve physician documentation by querying doctors about missing, ambiguous, and unclear documentation pertaining to the severity of a diagnosis, the presence of comorbid diagnoses, or complications. However, for documentation to be improved in a way that will result in “better” data, the clinician has to document in specific ways, reflected in the use of specific words in the medical record. In the following quotation, the CDIS Director of West Coast Hospital describes the need for clinicians to document more information and to do “better” documentation in specific ways:
A lot of the older physicians think the chart is a means of communication between the care team. And it is–‘I am writing the orders. I am writing the note to tell the next doctor what I think is going on with the patient.’ In today’s day and age, so much data is going out and being monitored and looked at. That’s the hard thing for them to get ahold of. It’s what they are writing with that pen, or what they are typing into [the EHR] is the only means of us to get anything [billing] from these patients. And they aren’t used to … their words being so powerful … it is actually the specific terminology that they are using that has this impact. Not so much the general diagnosis (we know they are in heart failure), but they can say that 12 different ways. And five of those ways are going to shoot them down. … if they say diastolic heart function, they think they're saying this is congestive heart failure, but it doesn't necessarily translate out into congestive heart failure. Another example is flash pulmonary edema. They might think that they're saying pulmonary hypertension, but flash pulmonary edema is not a code-able diagnosis.
Or they come in sick with urinary tract infection and sepsis and the word [that the doctor wrote in the chart] was urosepsis, but that has no code.
Another part of “creating the bridge” to cross the gap between clinicians’ documenting practices and coders’ coding practices is providing ongoing education to clinicians to improve their understanding of the linkages between medical documentation and the fiscal logics that surround administrative data. CDIS’ education efforts also go beyond fiscal consequences. The code sets produced by coders are also used to calculate quality measurements that are highly consequential for physicians and hospitals. CDIS Educators develop curriculum and “tip sheets” for clinicians and present these to the doctors. For example, the CDIS Director at West Coast Hospital explained the education she does to get doctors to connect the dots between their documentation and financial evaluation of utilization of services:
What I try to show them at so many different levels is that pneumonia patient you are talking about … say they send them home on the fourth day. Had they not documented properly on paper that patient should have only been here for 2.8 days. So they kept them longer than they should have. … when they are looking to get contracts with Blue Cross and they are looking at utilization, they will look at a physician’s actual length of stay, and their allowed length of stay, their geometric length of stay. And if you are way over it, or under it, if you are keeping your patients too long, then you are a bad utilizer. Why is someone going to contract with you? … You are costing us so much money, because you are keeping these patients days and days over. When in fact, if you had documented right, that patient could have had a five-day allowed length of stay. Now he is a good doctor, because he actually got him out a day early.
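The arithmetic behind the Director’s example can be made explicit. The sketch below is illustrative: the function name is hypothetical, and the figures (a four-day stay, a 2.8-day and a five-day allowed length of stay) are taken from the quotation above. How payers actually score utilization is not described here, so the simple difference is an assumption.

```python
def los_variance(actual_days: float, allowed_days: float) -> float:
    """Days over (+) or under (-) the allowed length of stay."""
    return round(actual_days - allowed_days, 1)


actual = 4.0  # patient sent home on the fourth day

# Sparse documentation: the allowed stay is only 2.8 days
print(los_variance(actual, 2.8))

# Fuller documentation: the allowed stay rises to five days
print(los_variance(actual, 5.0))
```

The same four-day stay reads as 1.2 days over the allowance in the first case and one day under it in the second; only the documentation, and hence the coded allowance, has changed.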
CDIS are also concerned with getting clinicians to document in the specific places in the EHR that coders draw from, because there are rules for what parts of the patient record medical coders can code. Thus, diagnoses and procedures must be contained in the discharge summary for medical coders to be able to apply codes. For example, the CDIS “Kerri” described that the challenge is to get physicians to “ … enter the right diagnoses and keep them there. A diagnosis might be mentioned every day in the notes, but for some reason be missing from the discharge summary, and in that case the coder will not code it.” Kerri explained that she can only sometimes convince a medical coder to do a “post discharge query” to a physician (a query done after a patient leaves the hospital, often weeks or months later). Medical coders are hesitant to send queries to clinicians, so it is imperative that CDIS get clinicians to not only document in code-able ways, but document in the sections of the medical record that the medical coder can draw from.
CDIS education efforts extend beyond clinicians, and CDIS also translate between the concerns of coders and quality personnel responsible for calculating and communicating quality indicators. Both CDIS programs we observed have mechanisms for working directly with quality personnel in the form of work groups or regular supervisor meetings. Through these mechanisms, CDIS convey to quality personnel the limitations of clinical documentation and the rules that underlie medical coding, which shape the administrative data that quality personnel use to calculate and make sense of quality indicators.
In sum, beyond the literal translation across the language gap, CDIS also translate in the sense of bridging the gap created by clinicians’ belief that documenting is primarily about, and interrelated with, concerns of patient treatment and team collaboration. Clinicians’ practices are also part of and interrelated with CDIS, medical coders, ICD-10 and DRG codes, reimbursement, healthcare providers, insurance, and accountability. CDIS “bridge the gap” by changing clinicians’ practices through monitoring, querying, and education efforts. Thus, CDIS nudge clinicians and bring them, and their documentation, into the larger, interrelated network of institutional data interests and concerns beyond patient treatment and into the flow of data-intensive resourcing.
Better for what reason? Balancing virtues of data improvement
During observations and interviews, we found that CDIS account for their practices in multiple ways, which reflect different virtues for data improvement. Specifically, CDIS invoke accounts of documentation improvement for reimbursement; quality assessment; letting clinicians do their core job; and accuracy and completeness. For example, while CDIS code and make queries to clinicians to improve reimbursement, this is not pursued just for the love of money: CDIS see improved reimbursement as benefitting both patients and the hospital. In one case, a CDIS got the SOI/ROM for a chart up to 4/4, and commented: “This is good for the patient and for the [healthcare] system. The patient has already received treatment for $140,000 according to the EHR, but the diagnosis codes amount only to $18,000.” Thus, reimbursement is not only about profit, but also about making hospitals financially feasible. Below, we will unfold how CDIS account for their practices and how this reflects multiple virtues.
First, coding SOI (Severity of Illness) and ROM (Risk of Mortality) as in the above example is not only about billing, but also about accountability of the hospital in the arena of public perception. If a hospital has a high mortality rate, but a population whose SOI is low on average, the hospital’s profile is poor, as reflected by data presented to consumers via public-facing websites for organizations concerned with evaluating healthcare quality. Hence, SOI and ROM are also recorded, because otherwise patients will appear to have no complications or additional risks, and length of stay and mortality rate will appear unreasonably high. This in turn will reflect badly on physicians and the hospital. As a CDIS Director stated, “ … if you were looking at these numbers on the outside from a public perspective to pick a hospital, and you looked at someone who was killing more patients than they were supposed to, you wouldn’t want to go there.” The CDIS “Kerri” explained: … if a 76 year old woman is admitted from a care home and she goes into sepsis and is treated with antibiotics for two weeks and then needs respiratory support for two weeks and then dies, this will look bad and incomprehensible if the doctor merely writes UTI (Urinary Tract Infection), whereas it should be Urinal Sepsis which would document that the woman was really sick when admitted. … this isn’t just about reimbursement, like I said, it’s about the physicians’ profiling, the hospitals’ profiling, and so much more. But again, money does talk. However much we can afford until we can get all those FTEs [full time employees] and that’s how we are marching it out so to speak.
When improving documentation does not have a clear impact in terms of reimbursement or risk adjustment, CDIS often decide not to pursue a case much further, even when doing so would result in more comprehensive or detailed data. For example, during an observation, a CDIS explained that she would not review the case frequently, because the SOI/ROM was already at its maximum, and further reviews were not a good use of her time, even though they might turn up opportunities to improve the clinician documentation overall. In another observation, a CDIS described her logic for not querying a physician even though doing so might result in improved documentation: See here he [the patient] had some tachycardia. It could have been v-tac but they [clinicians] are saying it is an atrial fibrillation. Atrial fib is not a CC [comorbidity], but V-tac is. But they are not real clear, so I’m not going to query. If I had nothing, no CCs or anything, then I might pursue it more. But in this case, it’s not going to change anything, so I’m going to put it in my [CAC notes] but I’m not going to query.
However, overwhelmingly CDIS query based on whether or not doing so will increase the value of a chart. For example, when asked to describe her logic of carrying out a query, one CDIS described: … let's say [the doctor] said the patient is asthmatic. [if] asthma was my primary diagnosis, then I will ask [for specificity]. But if they're here because they have a toe infection, and then they gave them an inhaler for asthma, I'm not gonna ask, because it's not gonna move my chart. It's not gonna give me a higher SOI, or ROM. It's not gonna bring me any money. And I'd just be annoying the doctor. Like [the doctors] they’ve got more important stuff. He's about to amputate his toe, and maybe he's having a heart attack. Who cares about his asthma?
Thus, the various virtues that CDIS’ different accounts reflect are circumscribed by the concern for increased reimbursement, and increasing financial revenues is their primary raison d’être. Several features of CDIS’ work stress the financial aspects: the CAC calculates the number of queries each CDIS writes, how many queries were accepted by physicians, whether accepted queries changed codes for CC (comorbidity), MCC (major comorbidities), SOI (severity of illness), or ROM (risk of mortality), as well as the financial impact of the altered documentation due to queries. Such measurements serve to prove that CDIS are worth their salary in their healthcare network. Notably, however, CDIS at our two sites are not remunerated individually based on their financial impact.
Given the larger network of data-intensive resourcing and financial setup of U.S. healthcare in which CDIS work, the primacy of reimbursement is not surprising. But as we have shown, CDIS also invoke other virtues such as quality assessment, letting clinicians do their core job, and completeness and detail of documentation.
Discussion
While clinical documentation improvement has been known since the 1990s, the rise of data intensive resourcing in healthcare has led to the emerging occupation of CDIS. The majority of U.S. hospitals have CDIS programs (Barnhouse and Rudman, 2013), which leads us to believe that CDIS are an occupational group with some staying power. CDIS review patient records while patients are still in the hospital, code them, and nudge clinicians to alter their documentation by sending queries that prompt changes to clinicians’ documentation so that it is code-able, disambiguated, complete, and accurate. Then, the patient records and coding can be forwarded to and validated by medical coders, and these coded datasets travel on to financial departments, insurance companies, data-driven accountability calculations, research endeavors, etc.
Thus, the central practice of CDIS is translation (Brown, 2002). CDIS literally translate between two dialects of medical language: the clinicians’ colloquial language on the unit floor and the language of ICD-10 and DRG coding. CDIS also act as intermediaries between the concerns and interests of clinicians producing medical documentation on the one hand, and the concerns and interests of coders and users of administrative healthcare data on the other hand. Drawing on Anteby et al.’s (2016) analysis of occupations in organizations, the translation practices of CDIS can be understood through the filter of “relating as brokering” where the occupational group of CDIS has arisen specifically to serve as a connecting, buffering, and mediating bridge between multiple occupational boundaries. As data intensive resourcing takes hold in healthcare, medical records are subject to new requirements as source material for administrative datasets. Clinicians are ill-equipped to adopt strict new practices for documentation on their own, when striving for excellence in documentation offers little benefit to the patient in front of them. Medical coders have too much work and too little medical insight to successfully shift the documentation of clinicians. Thus, the CDIS occupation bridges the gaps in the complex relations between established occupations, shaping documentation and datasets to get the “best” data possible (according to specific virtues) without intruding too deeply in clinical practice. In “servicing the bridge” between clinical documentation and structured datasets, CDIS are data workers who balance multiple and often conflicting virtues, as described above. Clinicians’ documentation is translated to meet the concerns and interests of stakeholders of structured data languages, and in so doing is inserted into the concerns of hospital management, insurance companies, and healthcare authorities.
Whereas coding was previously retrospective and for billing, it has now also become concurrent and for multiple purposes.
CDIS’ translation practice is not one-way: they translate in both directions. Garfinkel (1974) found that clinicians’ writing in medical records serviced patient care and coordination with the medical team, but was “bad” for accountability reviews. In a sense, such “bad” documentation can serve as a positive indicator for patient care, since it can indicate more attention to patients. With the rise of data intensive resourcing, clinicians are under increasing pressure to provide complete and detailed documentation, which competes with time to care for patients. CDIS strive to take much of the labor for producing good documentation from clinicians, while improving completeness and accuracy of data and allowing clinicians to get paid. In their translation, CDIS are also shifting institutional virtues for accountability, reimbursement, and complete and accurate data to clinicians through monitoring and nudging clinicians’ documentation. Indeed, CDIS are effecting an instance of the feedback loops that Hogle (2016) addresses in which “capturing big data will enable the transformation of healthcare, so it is necessary to transform healthcare to capture big data” (ibid: 380–381). Our findings support this concept of the feedback loop by showing how the data work of CDIS is focused on transforming healthcare to capture big data. Our findings also extend Hogle’s concept of the feedback loop by showing that CDIS are doing translation work that not only helps to produce datasets, but also serves to shift values and interests of healthcare institutions (namely reimbursement, accountability, and completeness and accuracy) out to clinicians.
Porter argued some decades ago that quantification in the sciences and administration was the outcome of similar concerns, and a matter of “the creation of new entities, made impersonal and (in this sense) objective when widely scattered people are induced to count, measure and calculate in the same way” (Porter, 1994: 390). Structured code sets such as ICD-10 and DRGs, including their many sub-indicators, are such new entities. As Porter also argues, these new entities are embedded in administration and, increasingly, healthcare data science. As billing has grown complex, it has become harder for hospitals and individual providers to get paid for what they do. Payers demand more evidence that the services rendered were appropriate and acceptable. As experts who balance and meld multiple virtues as part of their translation practices, CDIS shield clinicians from excessive queries and from the need for clinicians to become fluent in the language of medical coders and administrative data. Thus, they have emerged as a new occupation not through a process of hiving off routine tasks from one group to another, or unpaid work becoming paid, but through the creation of a new technology adopted widely enough to require support (Anteby et al., 2016): the creation and widespread adoption of accountability systems based on ICD-10 and DRG coding requiring specialized knowledge. Since physicians were already experiencing burnout due to digitization, they had no inclination to take on those tasks.
While a healthcare system designed to maximize production of big data might always value a data virtue of completeness and accuracy, CDIS often sublimate this virtue and save queries for when they will improve reimbursement or impactful accountability measurements. As translators of language, interests, value, and culture, CDIS balance the virtues at play such that they modulate the impact on healthcare workers, so that the drive for administrative data does not overwhelm the virtue of preserving clinicians’ work. However, since CDIS almost always deem it justifiable to impose on clinicians’ work when it will impact reimbursement, data “integrity” still largely reflects financial virtues over virtues of completeness and accuracy.
Unlike at the time of Garfinkel’s (1974) study, which noted that medical records that were “bad” for accountability purposes had good organizational reasons, hospitals are presently expending enormous resources to pay CDIS for producing “better” medical records. However, the data impacted by CDIS’ translation is improved in particular and partial ways despite the multiple virtues of the emerging CDIS occupation. Specifically, clinical documentation is improved to the highest level of proper reimbursement; this is done for admitted patients, but not for outpatient visits (though this is a new focus for CDIS); and primarily for expensive treatments. The effect will most likely be that data on, for example, outpatient healthcare services where documentation is not improved, will be less amenable to consistent and accurate reporting, which in turn will skew (big) data analyses of healthcare. Big data science in healthcare will thus, even though “objective” in the above sense of Porter, be biased and partial due to the way that healthcare administration is shaped by its way of quantifying its subject matter. Seen from within its own financial, quality, and accountability concerns, U.S. healthcare has good reasons for asking for more complete, detailed, and accurate medical records, but this does not mean that the science of healthcare analytics moves above concerns of context and the logics through which data is produced.
Conclusion
While scholars have focused much attention on data scientists and the practices of data science, much less attention has been directed to the practices of data production. CDIS is an emergent occupation growing in the wake of data intensive resourcing in healthcare. CDIS review medical documentation in near real time, querying clinicians to improve clinical documentation directly so that documentation becomes complete and can be coded into structured code sets. These are ultimately approved by medical coders and provide an essential source of numeric, combinable data used for medical billing, management, large-scale medical research, and accountability of healthcare providers. Thus, CDIS are increasingly important data workers who translate between multiple occupational groups and their concerns and interests. Our ethnographic research on CDIS work practices shows that despite the fact that CDIS draw on different virtues to guide their documentation improvement efforts, the primary concern is to maximize reimbursement. This raises important questions about the logics that are reflected in massive stores of administrative healthcare data being used for a variety of data intensive analyses, and calls for attention to the practices of producing large data sets and the virtues that drive the data work occupations doing this work.
Acknowledgements
We thank the CDIS and their managers for their patience and time. It is highly appreciated and we could not have conducted this study without your collaboration and contributions. We also thank Mary Pauline Lowry and Christine Wolf for their important assistance.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported in part by National Science Foundation award #1319897.
