Abstract
This article documents the “context cultures” underpinning efforts to develop regulations for collecting and reporting data in a United States public database known as Open Payments. Open Payments is a dataset published annually by the US Centers for Medicare & Medicaid Services that documents the transfers of value from pharmaceutical and medical device manufacturers to physicians, prescribing non-physicians, and teaching hospitals. In the article, I show that context became a manifold concern as differentially-situated actors engaged in modes of public advocacy and social action around not only what data meant, but also what it meant to make data meaningful. I show how “context” took on multiple meanings as it was brought into relationship with certain concepts (such as “light,” “transparency,” and “interpretation”) and as stakeholders developed arguments for where they believed meaning should originate. In presenting this case, I call for further ethnographic attention to the ways in which meaning-making is enacted in relation to datasets—particularly those datasets intended to hold institutions accountable. I conclude the article by meditating on the political significance of attending to various “context cultures” when putting data signification in context, along with the implications for how critical data studies scholars historicize big data epistemologies and rhetoric.
In 2022, US drug and medical device companies disbursed more than 14 million payments to healthcare providers, valued at more than 12 billion US dollars. At least one physician whom I saw in the past year accepted more than a thousand US dollars in payments from one pharmaceutical company in 2022, and at least one physician in the US accepted more than 700 payments for food and beverages in the same year. All of these facts were generated by analyzing a US open government dataset known as Open Payments. Open Payments is a dataset published annually by the US Centers for Medicare & Medicaid Services (CMS) that documents the transfers of value from pharmaceutical and medical device manufacturers to physicians, prescribing non-physicians, and teaching hospitals. Reviewing these facts has prompted me to reflect a bit more on what my physicians prescribe for me and my family and what motivates them to do so. Yet, while Open Payments provides data consumers ample opportunities to generate data statements about financial relationships in US healthcare, questions remain regarding how consumers should interpret the meaning and significance of these relationships. This article examines how differentially positioned actors have worked to shape how meaning gets assigned to data reported in this public interest database.
I started researching the socio-political provenance of the Open Payments dataset in Fall 2022; at the time, I had been focused on documenting the diverse moments and modes of data advocacy that implicated the dataset's narrative form—that is, how observations in the dataset were delineated, scoped, and categorized, how measurements were recorded and, more broadly, how meaning was encoded in the data. I reviewed legislation and regulations that documented the scope of the data program, analyzed archives of public legislative spaces where the dataset's structure had been debated, and interviewed stakeholders who had been involved in the dataset's design. At first glance, those stakeholders’ narratives followed a familiar and perhaps unsurprising pattern: industry actors advocated for streamlining reporting and reducing the regulatory burden of paperwork, while many politicians along with patient and consumer advocates pushed for more robust reporting and stronger mechanisms of accountability. So, I was surprised when, delving further into archives documenting the regulatory process, I came across a number of medical manufacturers requesting to submit
Foundational literature in critical data studies has characterized how dominant data rhetoric assumes an indexical relationship between numbers and what they are intended to represent, treating numbers as context-free (Kitchin, 2014), as “given” (Drucker, 2011), or as naturally revealing reality. Countering such rhetoric, critical analysts of data have demonstrated the ways in which all data are “cooked” (Gitelman, 2013)—how interpretations of data are emergent from “contingent and contested social practices” (Dalton et al., 2016) and how data are meaningless when taken out of context (boyd and Crawford, 2012). In the empirical space I present here, actors from some of the same institutions that were championing big data hype countered the treatment of data as given, suggesting that meaning did not emanate naturally from data but instead was actively produced (and at times erroneously produced) through socio-cultural forces operating in the spaces between metrics and what they were meant to signify. This article examines how stakeholders attempted to both discursively and infrastructurally mold those intermediary meaning-making spaces through their calls to bring certain forms of supporting explanation, infrastructure, and perspectives to bear on the data—through their calls to shape data “context.”
In presenting this case, I call for further ethnographic attention to the ways in which meaning-making is enacted in relation to datasets—particularly those datasets intended to hold institutions accountable. Over the past decade, work in critical data studies has begun to examine diverse ways in which meaning gets assigned to and emerges from data—through the designation of “yardsticks” that can demarcate when data values represent an actionable state of affairs (Ottinger and Zurer, 2011) and through the mobilization of data “narratives” (Dourish and Gómez Cruz, 2018) and “stories” (Gabrys et al., 2016). Scholarship has argued that data are mutable; as data “journey” across sites, they are adapted for different stakeholders toward varying ends (Bates et al., 2016). Accordingly, one way to study the contextualization, decontextualization, and recontextualization of data (i.e., to examine how data are rendered differentially meaningful) is to ethnographically follow data as they move through different sites, studying how they are reconstituted both materially and semiotically throughout. This article takes a different approach; rather than tracing how data are differentially contextualized across sites, this article examines how context itself is differentially rendered an object of concern. Nick Seaver (2015) argues that understandings of what context is and how it gets produced vary across communities—even communities that agree on its importance. Drawing inspiration from his call to document diverse “context cultures,” this article shows how “context” emerged as a shared but multi-valenced issue in relation to the Open Payments dataset. Context became a manifold concern as differentially situated actors engaged in modes of public advocacy and social action around not only what data meant, but also what it meant to make data meaningful. 
Context took on multiple meanings as it was brought into relationship with certain concepts (such as “transparency” and “interpretation”) and as stakeholders developed arguments for where meaning should originate. I conclude the article by meditating on the political significance of attending to various “context cultures” when putting data signification in context, along with the implications for how critical data studies scholars historicize big data epistemologies and rhetoric.
Enacting the Physician Payments Sunshine Act
On April 9, 2008, US Senator Chuck Grassley, an Iowa Republican, spoke in front of the Senate about a bipartisan bill he had introduced the previous year with Senator Herb Kohl, a Wisconsin Democrat (C-SPAN, 2008: 1:31:15):
This bill is not aimed at stopping money flowing to the doctors. … But it ought to throw a little sunshine on this issue. And that sunshine shown on this issue, I think, will go a long way toward curbing bad behavior. (C-SPAN, 2008: 1:42:52)
Light, Haridimos Tsoukas (1997) argues, has been engaged as a metaphor for knowledge since the Enlightenment, marking an increased ability to see. Transparency, or the ability of light to pass through material, thus became a tenet of modern information societies—societies marked by a belief that having access to more knowledge will grant individuals greater control over their own futures and increase their capacity for self-regulation. Tsoukas (1997: 839) argues that individuals in information societies are “tempted” to view knowledge as objective (i.e., independent from human activity) and thus as offering a more rational depiction of social problems. In referencing sunlight as a self-regulating mechanism, Grassley alluded to a proverb often cited in transparency advocacy circles: “sunlight is said to be the best of disinfectants.” This quote can be traced back to a Harper's Weekly article written by Supreme Court Justice Louis Brandeis (1913)—an article that called for disclosure of bank commissions and profits roughly two decades before the passing of the Securities Act of 1933 and the Securities Exchange Act of 1934. The adage inspired language for the Government in the Sunshine Act of 1976, which opened US federal government meetings to the public, and, since the early 2000s, open government advocates have often cited it in their calls for increasing the government data available online. More generally, its continued reference reflects a transition toward disclosure as a primary means of oversight and regulation in the US over the past century (Graham, 2002). It marks an evolution in the metaphorical relationship between light and knowledge; the sun's light was not only associated with objective knowing, but also with holding institutions to account.
Ultimately, the Physician Payments Sunshine Act (PPSA) was enacted into US law as a provision in the Affordable Care Act of 2010. The law required “applicable manufacturers” (pharmaceutical and medical device manufacturers) to disclose each “transfer of value” to “covered recipients” (physicians and teaching hospitals) totaling more than $10 on an annual basis (United States Congress, 2010).
It also provided covered recipients an opportunity to review and dispute the reports. The legislation indicated that the Centers for Medicare & Medicaid Services (CMS) would be responsible for overseeing the program and set a deadline of October 1, 2011, for implementation. Notably, the provision specifically required that the CMS Secretary: “consult with the Inspector General, affected industry, consumers, consumer advocates and other interested parties to ensure that the information made available to the public … is presented in the appropriate
In the years immediately following the passing of the Affordable Care Act, CMS held dozens of in-person meetings with stakeholders, gathering feedback on how reporting regulations should be implemented. In my research, I reviewed the agendas for these meetings, along with stakeholder write-ups summarizing them. After drafting proposed regulations, CMS collected public comments via Regulations.gov—a website that facilitates public participation in the US rulemaking process. More than 350 comments were submitted—predominantly written by representatives at medical manufacturing companies or physician advocacy organizations and containing point-by-point critiques of the regulation's wording and alignment with the law. I downloaded and read through each. CMS also participated in Senate sessions convening advocates to debate the program's implementation. On September 12, 2012, the US Senate Health, Education, Labor, and Pensions Subcommittee on Aging held a roundtable titled, “Let the Sunshine In: Implementing the Physician Payments Sunshine Act” (C-SPAN, 2012). The roundtable convened politicians, CMS representatives, and consumer and industry organizations to deliver testimony around the challenges of putting the law into practice—a transition that, at the time, was nearly a year late. Videos of these Senate sessions are archived on C-SPAN; I reviewed them as part of this research.
In each of these spaces, context was prioritized in stakeholder discourse. For example, an agenda item on a CMS Special Open Door Forum asked:
What types of background information on industry-physician relationships should CMS consider including? What are best practice approaches for presenting the data in a way that is most understandable by consumers? How can CMS maximize the use of data reported on the website? (US Centers for Medicare & Medicaid Services, 2011: 2)
Similarly, the September 2012 Senate roundtable opened with comments from the senators who had spearheaded the PPSA. Introducing the roundtable's aims, Senator Herb Kohl emphasized: “Most importantly, the information must be made available to the public, must be easily understood and provide enough context for patients to understand why their doctors’ names appear on the website” (C-SPAN, 2012: 17:42).
Within these spaces, there was not a singular, shared understanding of the meaning and significance of a financial relationship between a manufacturer and a physician. Did a payment necessarily mean a financial conflict of interest? Did it necessarily indicate bias in prescribing? This lack of shared understanding invited actors to stake claims over what the data should and shouldn’t signify, and in the process, actors wound up debating not only what kinds of context should be tied to the data, but also what it meant to put the data in context.
Scholarship in media studies has shown how context can be enacted in diverse ways as communities differentially problematize and prioritize it. Seaver (2015) juxtaposes how certain “big data critics” (specifically, he cites boyd and Crawford, 2012) lament the lack of attention to context in big data communities with how designers of algorithmic recommender systems emphasize the need for “context-awareness.” He shows their understandings of context to be incompatible. Computer scientists engaged in “context-aware” computing tend to treat context as a delineable and stable thing that, with enough data, can be modeled to paint a more holistic picture of who a user is and what that user likes in a particular moment. This maps to what Dourish (2004) refers to as a “representationalist” view of context—an attitude toward context rooted in a positivist epistemology suggesting that context can be data-ified and operationalized for computational modeling. Importantly, in this conception, context is something that can be cleanly demarcated from human activity; actions happen
While categorizing these views highlights the variability in the understanding of context, the point of delineating context cultures is not so much to bucket distinct ideologies. In his article, Seaver (2015) cites Marilyn Strathern's (1995) argument that the concept of “culture” was gaining global popularity as a means of bounding and distinguishing different communities right around the time that anthropologists were debating and even considering discarding the concept for its tendency to essentialize difference. The concept of “context” similarly entered into digital practitioner vocabularies as ethnographers began questioning its stabilizing tendencies (Seaver, 2015). In describing context cultures, there is a risk of treating both context and culture as unified stable backdrops organizing thought styles and actions—as environments for action rather than results of action. In other words, there is a risk of treating both context and culture through a representationalist lens—as something observed and demarcated rather than something pursued and enacted.
Rather than trying to bring definitional coherence to different conceptions of context (and thus running the risk of homogenizing the communities that wield it), in this article, I aim to show how context is differentially brought into being as a topic of concern—through advocacy, discourse, and action. In other words, I aim to demonstrate how communities
Rendering de-contextualized data a topic of concern
In the years immediately following the passing of the Affordable Care Act, representatives from industries and medical organizations participated in a number of public meetings and submitted public testimony, emphasizing the significance of putting data in context. Calls for context were punctuated by concerns that the public would draw improper conclusions when reviewing “just the numbers.” In testimony at the September 2012 roundtable, Diane Biagianti, Vice President and Chief Responsibility Officer for Edwards Lifesciences, critiqued the presentation of raw values without accompanying context:
If our whole purpose is to provide clarity to the public and to patients around what those collaborations look like, it is not going to be helpful if we’re just providing numbers. We have to explain the context around those collaborations so that there's an understanding of what that means. (C-SPAN, 2012: 1:26:36)
In presenting these concerns, corporate stakeholders within policy and compliance offices suggested that numbers alone could not convey meaning. They acknowledged data to be mutable—that conclusions could be derived from data that contradicted the realities they were meant to represent. Notably, these critiques of decontextualized data were emerging at a historical moment when the hype around big data was rapidly building and faith in numbers was increasingly driving data-based decision-making (e.g., Anderson, 2008). This hype was particularly prominent in the life sciences; rhetoric around the promise of data analytics to streamline clinical trials, identify new life-saving drugs, and provide business insights was driven largely by leadership at the same institutions criticizing decontextualized payment data (e.g., Kayyali et al., 2013; Kreutter, 2011). Critical data studies scholars have described this time as marked by a “new empiricism”—a dominant epistemology that held that sheer volumes of data could circumvent the need for expert interpretation and that numbers offered a more impartial representation of social phenomena than human judgment (Kitchin, 2014). Under this epistemological paradigm, meaning was understood to “transcend context or domain-specific knowledge and thus can be interpreted by anyone who can decode a statistic or data visualization” (Kitchin, 2014: 4). In other words, according to dominant rhetoric, as long as an analyst could interpret a statistical result, context wouldn’t matter. Rob Kitchin (2014: 5) has described articulations of this empiricism as a “discursive rhetorical device”—aimed at selling the value of big data analytics, and Rieder and Simon (2016) have historicized the uptake of this rhetorical data-ism in public policy domains.
As representatives within industries and medical organizations denounced the absence of context, they adopted a stance that rejected the epistemological paradigm shifts emerging in other parts of their organizations and beyond—a stance that challenged the hype around data. Opposing rhetoric selling big data, they consistently called into question the epistemic power of numbers and insisted that further explanatory information was necessary for responsible data interpretation. As representatives raised these concerns in public spaces, they motivated their arguments not only by pointing to the potential for misinterpretations of data but also by arguing that such misunderstandings could harm the advancement of medicine. Many manufacturers and medical organizations suggested that, wrapped in the language of “sunshine,” consumers would default to assigning a negative connotation to every data point documenting a payment and that this would slow medical innovation and the delivery of patient care. For example, Doug Peddicord, Executive Director of the Association of Clinical Research Organizations, suggested that failing to exclude research payments from reporting would discourage physicians from participating in clinical trials and have “deleterious effects on the research enterprise in the United States” (C-SPAN, 2012: 42:24). Similar concerns were raised in public comments submitted by the Association of Clinical Researchers and Educators (ACRE)—a nonprofit organization founded by Thomas Stossel, an oncologist and professor at the Harvard Medical School. At the time of their comment submission, the ACRE website highlighted a condemnation of the “anti-industry movement” in the medical profession and listed a long-term goal of: “revers[ing] policies instituted curtailing or minimizing interaction between industry and physicians, educators and researchers” (ACRE, 2010). 
Their comments stated:
Finally, the most destructive aspect of Sunshine is its flawed premise that every payment presents a conflict of interest. The fact is that payment information can serve either good or bad purposes. The ‘good’ outcome ‘sunshine’ advocates claim is that ‘transparency’ will rein in medical care costs based on the absolutely unproven and arguably false premise that industry payments to physicians drive overprescribing of unnecessary and overly expensive brands. (ACRE Steering Committee, 2012: 3)
How these data are to be taken in the context of analyzing scientific research is unknown and merely will serve the interests of fascinated individuals (certainly not patients or researchers). Without context, these data cannot be interpreted; intentionality, influence, alleged corruption and causality cannot be defined by the data alone. (ACRE Steering Committee, 2012: 4)
As misconstrued data were cast as harmful to the advancement of science, the presentation of numbers divorced from context was cast as an epistemic problem. Embedded in these statements was an assumption that the conclusions that can be drawn from data are not only variable, but also socio-politically mediated—that interpreting data is not a neutral activity, but instead one that is shaped by pre-existing societal connotations regarding sunshine and transparency. In their discourse, de-contextualized data bolstered opportunities for consumers to base their interpretations of data's meaning on “unproven and arguably false premise[s]” (ACRE Steering Committee, 2012: 3). They implied that a lack of context established the conditions of possibility for interpretation to be an ideological activity rooted in unproven beliefs rather than a scientific enterprise rooted in proven facts. In opposing dominant discourse emerging elsewhere in the life science industry—discourse suggesting that number-crunching would naturally reveal new patterns and predict new outcomes—industrial actors cemented the impression that “objective” data work could be distinguished from biased data work. Decontextualized data were put forth as an epistemic problem in cases that they classified as the latter; context entered into discourse about data when there was concern that anti-industry ideologies might motivate interpretation or when data posed a risk to corporate healthcare reputations.
Given statutory requirements to consult with relevant stakeholders on how to contextualize the data, CMS was required to heed these concerns when devising regulations for PPSA reporting. On February 8, 2013, CMS published the final rule for the program's implementation to the Federal Register (US Department of Health and Human Services, 2013), summarizing reporting processes and outlining the penalties for failures to report. While the rule itself was set forth in the final seven pages of the document, the final rule was over 70 pages long—with over 60 pages serving as a preamble. The bulk of the preamble's text was devoted to responding to public comments.
Ultimately, the word “context” appeared 32 times in the final document, and, at least on the surface, CMS appeared in agreement with industry advocates that the absence of context was an epistemic concern for responsible data interpretation. In several sections, the text emphasized that CMS believed that providing context on the nature of payments was important to “help the public better understand the relationships between the industry and covered recipients” (US Department of Health and Human Services, 2013: 9474). Context could, for instance, help explain situations where a physician may not be receiving the full amount of the payment—such as cases where the university a physician works for is the actual end recipient or cases where the full cost of a clinical trial, including donated drugs, is reported against one principal investigator physician. Context, in the form of explanatory text, could also indicate when a payment does not so cleanly fall into standardized categories and could disclaim that not all payments represented a conflict of interest. While there was general agreement on the need to attend to context, there were also notable discrepancies over what it meant to put the data in context. In what follows, I detail how context was differentially enacted as actors sought to link it to the promises of transparency.
Tethering context to the promises of transparency
As context became the most consistent talking point in debates regarding reporting regulations, it was often cast as a prerequisite to transparency. In her opening testimony at the September 2012 roundtable, Elizabeth O’Farrell stated: “At [Eli] Lilly, we believe that physician payment transparency, when done accurately and with relevant context, is good for all stakeholders” (C-SPAN, 2012: 40:03). A letter submitted to Regulations.gov from Medtronic, Inc., in response to the proposed rule indicated:
We believe having all manufacturers share information in a standard manner will better educate patients and the public on the role of industry-physician collaboration, improve the public trust, discourage inappropriate payments and transfers of value, and ultimately ensure that appropriate collaborations can continue to benefit patients. These shared goals require that such relationships are properly explained in their full context. (Schumacher, 2012)
Both of these companies emphasized their commitments to transparency by referring to payment disclosure programs they had voluntarily initiated in the late 2000s—a time after the PPSA was first introduced but before the finalization of reporting regulations. As I encountered companies citing their voluntary transparency programs as evidence of their commitments to transparency, Internet Archive captures of earlier versions of their websites became part of the ethnographic archive I engaged in this research. Reviewing captures from the early 2010s, I learned that this voluntary reporting could be found under sections of their websites celebrating commitments to “corporate responsibility” (Medtronic, Inc., 2013). On pages linking to the registries, manufacturers characterized industry-physician relationships as being essential to the development of new treatments (Eli Lilly and Company, 2011) and cited examples of medical innovations that emerged as a result of industry–physician relationships (Medtronic, Inc., 2013). In their comments to CMS, both Medtronic and Eli Lilly characterized the words displayed on these websites as providing meaningful “context” around the data that helped educate the public on the meaning of a transfer of value.
It is notable that within a year of both companies initiating their voluntary programs, Department of Justice (DOJ) investigations into cases of healthcare fraud resulted in legal settlements that required the companies to publicly report payments to doctors. In fact, between 2007 and 2012, more than a dozen pharmaceutical and medical device companies (including several of the largest medical manufacturers in the US) entered into DOJ settlements that required them to begin reporting payments to physicians on their websites (Association for Medical Ethics, 2012) before the statutory deadline for reporting. Captures of many of these webpages are archived via the Internet Archive and similarly celebrate corporate responsibility, commitments to transparency, and how the companies adhere to the “highest standards of conduct” when it comes to industry-physician relationships (AstraZeneca, 2010). On the webpages, publishing payment data became an opportunity to characterize payments to physicians as indicative of their ethical business practices; many webpages noted how physicians were paid fair-market values for their time and expertise in the development of medical advancements. This “contextualizing” language on their websites recasts the negative connotations data consumers might assign to payment data in a positive and even promotional light. As companies touted this language, they took advantage of the mutability of transparency as a concept—working to dissociate transparency from the idea of disinfecting corruption and to re-associate it with the idea of promoting public confidence in the medical industry.
Monika Zalnieriute (2021) describes “transparency-washing” as a practice through which companies prioritize the ideal of transparency to strengthen their brand while diverting attention from substantive ethical issues toward smaller-scale procedural issues. Indeed, several activists, political staff members, and agency representatives I interviewed in this research characterized industry commitments to transparency as “lip service” and identified their own ambivalence about the effectiveness of transparency programs. Niall Brennan, former Chief Data Officer at CMS and the individual at CMS primarily responsible for developing regulations for the PPSA, noted how views on transparency could be broken down along political lines, with some viewing transparency as a means of empowering consumers, while others raised concerns over the way these initiatives turned the responsibility for reforming healthcare inequities onto the people with the least agency to do so. His characterization echoed concerns raised in legal and policy circles that transparency initiatives embody liberal governmentality (Adams, 2018; Ben-Shahar and Schneider, 2011; Birchall, 2015); corporations can evade government regulation as the responsibility for oversight—the responsibility to “disinfect” bad behavior—gets pushed onto a citizenry that is newly expected to review public information and make consumer-based decisions in alignment with their values and priorities.
Labeling the promotional language on their websites as “context” became a powerful transparency-washing tactic. In much of their advocacy around data context, industrial actors had criticized the possibility that the interpretation of data may be driven by false or misleading premises, positioning context as the remedy. Referring to the language on their websites as context helped industry actors frame that language as objective and disinterested, helping consumers draw “true” conclusions rather than ideologically motivated ones. It allowed them to uphold their arguments that data interpretation should be a scientific enterprise, while still exerting influence over the underlying premises about transparency that may frame how the public assigns meaning to the data. In enacting context in this way, companies rendered it into a resource for promoting public trust in the medical industry.
Leading up to the publication of the PPSA final rule, companies pushed CMS to include this kind of promotional material on the federal government website where the data would be published. In certain ways, CMS conceded in the final rule:
We plan to ensure that the public Web site accurately and completely describes the nature of relationships between physicians and teaching hospitals, and the industry, including an explanation of beneficial interactions … As proposed, the Web site will clearly state that disclosure of a payment or other transfer of value on the Web site does not indicate that the payment was legitimate nor does it necessarily indicate a conflict of interest or any wrongdoing. (US Department of Health and Human Services, 2013: 9503–9504)
I think that was definitely something that we could kind of talk to, but not have to figure out the details of in a rule. You can say you will put appropriate context on the website. And what that actually looks like: what do we actually say? I don’t know. … It was one of those issues where we can agree with it. It's an easy thing to agree with. Of course, we want context. Of course, we are not saying that any of these payments are wrong or incorrect. It was all about transparency. … At the time, it was called the Sunshine Act; it was all about transparency. So that's important, right?
CMS agreed that context was a prerequisite to transparency. They also agreed that transparency was not necessarily about discerning corrupt from legitimate business practices; it was not necessarily about disinfecting bad behavior. Yet, for CMS, transparency was also not necessarily about promoting trust in the medical industry. CMS wanted to avoid coming across as passing value judgments on the data's meaning and significance. As they attempted to craft a neutral narrative around the promises of transparency, they deployed context as a means to nullify the preconceived appraisals they may be perceived as assigning to the data.
Enacting context in this way involved making some strategic decisions about how to present the program to the public. Once the regulations were finalized, responsibility for the PPSA landed with the Center for Program Integrity (CPI) at CMS. At the time, Shantanu Agrawal, a former emergency room doctor who later shifted careers toward health policy, was serving as the Deputy Administrator for CPI and became responsible for overseeing PPSA implementation. During Agrawal's time as administrator, CMS had elected to rebrand the program as “Open Payments” to shift away from framing data as a “disinfectant.” Agrawal, speaking as former deputy administrator for CMS, summarized in our 2023 interview: We wanted it to be about transparency. And we wanted it to be clear about what it was. We didn't want to continue with this underlying notion that it was intrinsically corrupt. And so we emphasize openness and transparency.
When it came to explanations of the data's meaning, representatives at CMS placed more focus on emphasizing what the data did not necessarily mean than detailing what the data meant. On September 30, 2014, CMS published the first round of data to a public website. In the weeks leading up to the data release, numerous physician organizations and manufacturers signed on to letters chiding CMS for failing to consult with them about the display of context around the data, along with failing to provide them an opportunity to review the content of the website (American Academy of Allergy, Asthma, and Immunology et al., 2014). Charles Ornstein (2014b), a ProPublica journalist who had been engaged in extensive reporting around the PPSA, summarized, “…the context that they’re looking for is statements from the government that relationships between drug companies and doctors are beneficial to the public.” This is not what appeared on the published website. Amidst pages describing how the reporting program works, guidelines for data use, glossaries, frequently asked questions, and an interactive data report generator, one page on the website was titled “Open Payments Data in Context” (US Centers for Medicare & Medicaid Services, 2014). In outlining “context,” the creators of the CMS website carefully avoided expressing stakes in the data's meaning. The interface stated that “Open Payments does not identify which financial relationships are beneficial or which may cause conflicts of interest.” It noted that “Open Payments means different things to different people and audiences” and encouraged patients to talk with their healthcare providers about financial relationships. The interface went on to state that “CMS has an impartial role in Open Payments,” noting that they were responsible only for collecting and publishing information. Context, in the form of information explaining what the data did
Tsoukas (1997) argues that one of the “tyrannies of light” in modern information societies is that a surplus of information, coupled with a temptation to treat information as objective, may come to obscure what it is meant to represent. As information—always already serving as a “surrogate” for the tangible world—comes to stand in for behaviors and actions, the opportunities to hide or misrepresent what is actually happening multiply. Deploying context as a means to promote industry-physician collaborations and secure public trust in the medical industry, manufacturers attempted to regulate interstitial spaces of meaning-making, furthering their own capacity to eclipse or misrepresent certain behaviors and actions while casting their renderings of the data as closer to truth than other renderings. On the other hand, deploying context as a means to evade perceptions of judgment and open interpretive possibilities, CMS attempted to deregulate these meaning-making spaces, in the process propagating possible meanings that could be assigned to data. Notably, in both enactments of context, the visibility of information can establish the conditions of possibility for concealment (Strathern, 2000). When deployed to explain away what is made visible through transparency initiatives or in refusal of assigning judgment to data, context can multiply opportunities for misrepresentation. This makes it critical to consider whose perspectives get to assign context and decide what that means—an issue that I will turn to next.
Fixing centers of meaning-making
As organizations advocated for prioritizing data context in developing PPSA regulations, many suggested that the power to frame a payment's meaning should be left to industries and physicians. For instance, the Coalition for Healthcare Communication, an organization that prioritizes “self-regulation” (Kamp, 2011) as they “act to prevent or reverse actions interfering with the free flow of healthcare information,” (Coalition for Healthcare Communication, 2011) stated in their comments to the proposed rule: First, the final rules must enable industry reporters and medical professionals to fully explain the context of these payments to the public, the press and public policy professionals, so that these entities will understand clearly the value of these relationships to the public health. Without this context, CHC fears the reports will do more harm than good. It would be a tragedy if these reports lead the public to misunderstand the complex interaction between scientific research and communication in both the creation and adoption of medical innovations, as well as in the delivery of excellent patient care. (Kamp, 2012: 2)
The CHC's letter went on to condemn the interpretations other stakeholders had brought to bear on early data reports. Both ProPublica and the Association for Medical Ethics (a community of doctors seeking to uncover physician financial conflicts of interest in healthcare) had been aggregating data certain manufacturers had been publishing on their websites as a result of DOJ settlements, along with data reported in state-level registries (Association for Medical Ethics, 2012; Ornstein et al., 2010). Engaging these data in new contexts, they generated preliminary analyses of the financial ties between medical manufacturers, physicians, and teaching hospitals. CHC's letter argued that their reports “suggest[ed] improper behavior by companies and doctors where none existed” (Kamp, 2012: 2). The letter went on to indicate that these misrepresentations could have been avoided by presenting data “in a context that gives them meaning and demonstrates the value of these relationships to medical professionals and patients.” Context, according to CHC, was the jurisdiction of data producers, not data interpreters, and given the nature of disclosure, this meant the power to contextualize should be left to the medical industry.
While all stakeholders generally agreed on the significance of context, patient and consumer advocacy organizations (which were considerably underrepresented in these spaces in comparison to manufacturers and medical organizations) also expressed skepticism over the way “context” might be appropriated and corrupted when only representing the perspectives of those reporting the data. Consider comments submitted to Regulations.gov from Marcia Hams, the Director of Prescription Access and Quality at Community Catalyst—a national nonprofit organization focused on affordable healthcare access. Hams advocated for enabling manufacturers to report transaction-level contextual information with the data, but she went on to warn: While this information could be helpful to patients, providers and the public, CMS should monitor entries and consider withdrawing this field if it is being used inappropriately by applicable manufacturers to editorialize on matters such as the perceived value of industry-provider relationships, particularly given that there would be no comparable opportunity for comments from other points of view. (Hams, 2012: 10)
While these calls explicitly advocated for certain perspectives to sit at the forefront of assigning meaning to data, stakeholders also engaged in more subversive advocacy around contextual perspective as they relayed calls to contextualize data through the design of certain information infrastructures. Before the drafting of the final rule, many commenters called for putting the data in context by standardizing definitions and clearly demarcating categories for payments. In testimony delivered at the September 12 roundtable, Eliza O’Farrell stated: I think that this is also a key for CMS, then, to make sure that you are providing a lot of clear guidance on what a bucket is, what's in research, what's in meals, what's in speaker programs, because if you want it to be able to be aggregated across the different manufacturers, it's really important that we’ve got good definitions bucket-by-bucket, not dictionary definitions but real-life definitions, so we all know that we’re submitting things the right way to you so you can aggregate. (C-SPAN, 2012)
For industry stakeholders, the stated goal of clarifying definitions was to promote consistency in reporting, ensuring that all manufacturers were categorizing payments in the same way. Many community groups submitting comments agreed, arguing that data would be more meaningful to the public when standardized. In suggesting that clear definitions could contextualize data, these actors, at least on the surface, appeared to be advocating for fixing the center of meaning-making through certain information infrastructures. Since standards, categories, and definitions are often culturally perceived as neutral, a benefit of operationalizing context in this way was that it could exonerate particular parties (those that may be accused of exploiting context to advance a particular agenda) from blame. In deploying context in this way, stakeholders effaced the human judgment always already animating the design of information infrastructures (Bowker and Star, 1999). While perhaps perceived as being neutral infrastructures for meaning-making, the data's definitions were in fact interlaced with diverse stakes and interests. Both industry actors and activists had been advocating, at times vehemently, over the scope of the data's definitions as the final rule was being promulgated. Simultaneously centering and erasing situated contextual perspectives by displacing them onto information infrastructures, stakeholders eclipsed these politics.
Despite these calls, in interviews with representatives at CMS, I learned that their team had a number of reservations about rigidly nailing down definitions in the statute. For one, they wanted to make sure that they did not encode definitions into the statute that senators would later claim were not written in the spirit of the original legislation. Further, they recognized that they didn’t have everything about program implementation figured out; they didn’t want to create a situation where they would need to revise the final rule, working through the entire comment and review process again, because they had encoded a definition too narrowly in the statute. Finally, they raised concerns that industry actors might take advantage of the mutability of data definitions to find loopholes around data reporting. One individual noted that “if you put a hard line, people are going to find a way around it.”
In response to these concerns, many of the program's architectural decisions were made with an eye toward describing the nature of the payments while enabling consumers to draw their own interpretations from the data. When I asked Agrawal about how he was thinking about “context” as he was building out the Open Payments program at CMS, he noted a number of decisions made in relation to the program's website: displaying character-delimited transaction-level contextual information reported by manufacturers; segmenting the data such that research payments would be displayed in a separate database from general payments; building out analytic capabilities so that users could make year-to-year and geographic comparisons; and including language warning users against jumping to conclusions on what the data mean and don’t mean. In this enactment, context (understood to be implicitly designed into the website's information architecture) would not determine interpretation but could facilitate it. Indeed, in a June 2023 interview, a government official noted that determinations of which are “good” user categories and which are problematic depend on the perspective of the user of the data. The government official continued, “…They're not good or bad. They're just user categories. It depends upon what you're looking for. And then you can ascribe the values to those. And really, that's the challenge.”
In the years following the first data release, data journalists and other writers presented reports from Open Payments in new contexts, bringing meaning to the data through multiply situated perspectives. For example, ProPublica articles reported that almost no women physicians were among the top-paid speakers or consultants in the medical industry (Ornstein, 2014a) and that over 14,000 doctors received payments on more than 100 days in 2014 (Jones, 2015). Some characterized these media reports as “witch hunt[s]” (Gur-Arie, 2014) overly focused on taking individuals down. In an article titled “Without context, Sunshine Act's health care ‘transparency’ is useless,” Peter J. Pitts (2014), a former Food and Drug Administration associate commissioner and then president of the Center for Medicine in the Public Interest, argued that, for some, the data's publication was intended to embarrass the healthcare industry and healthcare providers. He corrected: “The truth behind the numbers is that industry-physician collaboration is one of the main drivers of medical innovation today.” Policy & Medicine, a publication that claims to be “one of the leading resources for information on the Physician Payments Sunshine Act and transparency issues” (Sullivan, 2023), published a number of articles condemning the data's lack of context, describing the program as “essentially just a spreadsheet of payment transactions” (Sullivan, 2014). They went on to argue that, while some news outlets leveraged the data as an opportunity to persecute industry-physician collaborations, others “dug deeper,” interviewing doctors with the top payment totals and giving them an opportunity to provide context on their own payments. By suggesting that a lack of context invited data consumers to misassign meaning to the data, and by coupling that suggestion with calls to resituate the “truth” of the data's meaning, commentators attempted to rein in meaning-making, recentering it with the healthcare industry and healthcare professionals.
As concerns over context were deployed in attempts to fix centers of meaning-making, context at times became a means of controlling (and at other times relinquishing control over) the frames of reference from which data would be assigned meaning. Stakeholders agreed that numbers could not and should not speak for themselves, but they differed in their articulations and justifications around who or what should speak for numbers—around where meaning should originate. Critical theorists have called attention to both the contradictions and consequences of attempts to fix centers of meaning-making, arguing that centered structures both enable and delimit free play in signification (Derrida, 1970). Context emerged as a multi-faceted concern as stakeholders positioned themselves along this paradox, tightening and loosening the centering structures governing meaning-making. Context was at times positioned as a means of determining interpretation and at other times a means of facilitating interpretation. As actors staked claims over not only what data meant, but also what it meant to make data meaningful, they rendered context into a manifold concern.
Conclusion
I was an undergraduate student when most of the debates around the implementation of Open Payments were taking place. I was not present for any of the PPSA public forums nor did I participate in the public comment process. I initiated this research project about 10 years after the final rule was put into effect, meaning that I had to rely almost entirely on recordings and documents of sites where advocacy around the program took place. Notably, almost all of the materials that I engaged with were publicly available to me as a result of the US Sunshine Laws that opened federal government documents to the public. These materials, of course, constituted a fragmented archive. I did not have access to private conversations amongst CMS and other data stakeholders; I did not witness day-to-day work of program implementation, and I could only observe public-facing comments submitted to CMS. Pursuing context around the way “context” was mobilized meant continuously considering further opportunities to supplement testimonies presented in the archive—through interviews, creative web sleuthing, and exploratory analysis of the Open Payments dataset. “Context” became meaningful as an ethnographic keyword, not because it was pre-coded in the archive's metadata, but through persistent exposure to and engagement with documents where it figured prominently. It became meaningful as I juxtaposed these settings to other experiences where I had heard stakeholders raise concerns about data being taken out of context. I considered the process of contextualizing my ethnographic data as active and emergent, but never fully saturated; this depicts aspects of my own “context culture.”
While it is clear that numbers and metrics cannot be interpreted without accompanying contextualization, “putting data in context” can take many shapes and forms, making it critical to attend to the “context cultures” motivating how data get situated. While discourse suggesting data to be self-evident can certainly be traced through life science industry headlines and marketing campaigns in the early twenty-first century, focusing exclusively on these examples as representative of dominant data epistemologies eclipses contrasting but equally dominating portrayals of data where context is deemed paramount. This article critiques earlier characterizations of dominant data cultures, showcasing how institutions championing the promises of big data also engaged in discourse condemning the treatment of data as self-evident. Recently, I’ve seen similar discourse around data “context” weaponized to impede data sharing in a number of communities that have hyped data-driven approaches elsewhere. Raising concerns over the inclusion of “mining” in the categories US industrial emitters are required to report to the public, companies have expressed concern that the public will take the data out of context (Cray, 2000). Providing testimony on their views on data management, the US International Association of Chiefs of Police cited the risks of sharing data with the public noting that, taken out of context, the data could be misleading (Task Force on 21st Century Policing, 2015). Many business groups oppose pay transparency initiatives in California, citing concerns that the data will be taken out of context (Gedye, 2022). In the case of Open Payments, not only was data context deemed a priority by some of the most powerful data stakeholders; it was also strategically operationalized by certain stakeholders to re-situate explanatory power over the data's significance and control data narratives. These cases demonstrate that it is not only important to consider
In bringing attention to distinct ways context gets animated, this article highlights what ethnographers might look for when examining how actors attempt to assign meaning to data. Inquiries into distinct context cultures should consider not only whether, but also how actors epistemically characterize the problem of de-contextualized data—paying attention to how actors grapple with explanations of data's mutability, their convictions around how human judgment factors into data interpretation, and their assumptions about what it means to present data in a grounded but impartial way. Such inquiries should attend to how actors advocate for where meaning should originate and the hold centering structures should have on interpretive possibilities. Such analysis can advance understanding of the manifold ways “context” can be differentially operationalized—foregrounding whose interests context is designed to serve and how this frames what sorts of information become “transparent” in efforts to hold institutions accountable.
Acknowledgments
Thanks are due to the Smith College students in the Spring 2023 cohort of the Data Ethnography and Advocacy Lab (DEAL), including Anika Arifin, Juniper Huang, Naomi Liftman, and Ziqi Zhen, for data collection and archiving. Quinn White supported aggregation and analysis of Regulations.gov comments. In addition, Michael Fortun, Kim Fortun, Brandon Costelloe-Kuehn, and three anonymous reviewers offered generative feedback on drafts of this article.
Declaration of conflicting interests
The author declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author received no financial support for the research, authorship, and/or publication of this article.
