Abstract
This article reports on a thematic content analysis of 486 newsroom posts published between 2016 and 2021 by five prominent digital platforms (Facebook, Tinder, YouTube, TikTok, and Twitter). We aimed to understand how these platforms frame and define the issues of harm and safety, and to identify the interventions they publicly report introducing to address these issues. We found that platforms respond to and draw upon external controversies and media panics to selectively construct matters of concern related to safety and harm. They then reactively propose solutions that serve as justification for further investment in and scaling up of automated, data-intensive surveillance and verification technologies. We examine four key themes in the data: locating harm with bad actors and discrete content objects (Theme 1), framing surveillance and policing as solutions to harm (Theme 2), policing “borderline” content through suppression strategies (Theme 3), and performing diversity and inclusion (Theme 4).
Introduction
Social media platform companies (“platforms” hereafter, for brevity) are under increasing public pressure to address harm and safety issues ranging from mis- and disinformation and electoral interference to hate speech, abuse, and non-consensual intimate images (Online Safety Bill, 2021; The Digital Services Act, 2020). Following “public shocks” (Ananny & Gillespie, 2017), media scrutiny (Dias et al., 2020), and intensifying regulatory pressure (Department for Digital, Culture, Media & Sport and Home Office, 2019), platforms have become more responsive in addressing the harms perpetrated through the social networks they support. Platforms have established trust and safety councils (Cartes, 2016) and committed to working with other technology companies (Facebook, 2019) and nation states (Christchurch Call, 2019) to foster safer digital environments. At the heart of these commitments is the development and scaling up of automated tools to moderate and curate the enormous volume of online content that the larger platforms host (Gorwa et al., 2020).
In their responses, platforms have introduced a range of automated, quasi-automated, and manual content moderation tools to prevent, suppress, or police harm, and have offered users a range of tools to help preserve or report threats to their safety. Meanwhile, however, users have flagged their own concerns about platform governance, from demonetization and discriminatory advertising to suppression and unjustified content takedowns (Are, 2021; Stardust et al., 2022). Throughout this period, platform companies have regularly made public statements about what they are doing to address harm and foster safe and inclusive environments (see, for example, Tinder Newsroom, 2021). Such statements offer critical insights into how platforms frame harm and safety, how they justify their interventions and perform accountability, and within what constraints and limitations they do so. This article reports on the findings of a study that took such public statements as its object, using them to investigate platform companies’ framings of and responses to the issues of “harm” and “safety” across platforms and over time.
Approach and Methods
This article is based on the analysis of 486 newsroom posts published between 2016 and 2021 by Facebook, Twitter, YouTube, TikTok, and Tinder. These five platforms were selected because they have large userbases and host significant amounts of user-generated content and interactions. Each is dominant in a particular form of digital media: social networking and public conversation (Facebook and Twitter), video (YouTube and TikTok), and dating apps (Tinder). Each has been at the center of controversies around online harm and safety and has publicly spoken about its approach to dealing with these issues.
By “newsroom,” we mean those sections of the platforms’ websites specifically devoted to media releases and public announcements, which are referred to by some of the platforms as “newsrooms” and by others as company “blogs” (reflecting the earlier Web 2.0 traditions where they originated). These digital “newsrooms” are, of course, predated by and serve similar functions to corporate public relations (PR) departments. In focusing on the newsroom posts, we aimed to identify and describe how platforms frame these problems and solutions, on the understanding that their publicity materials would provide insights into framings that “[position] some approaches as on the table for discussion and sweeping others out of view” (Gillespie et al., 2020). Newsroom posts are an entry point into an entangled arrangement of platforms’ interventions, technologies, partnerships, focus areas, assumptions, and normative choices (Barrett & Kreiss, 2019). They speak to a web of platform investments and infrastructures, from financial constraints and operating systems to app store policies and public and regulatory pressures. Public-facing newsroom posts are, of course, carefully curated, written, and published by company actors, who frame their actions in favorable ways to maintain their reputation and perform corporate social responsibility for stakeholders, media, and the public. While the posts are far from a complete or accurate record of platforms’ governance practices, they illuminate the strategic choices platforms make about exactly which problems and interventions to highlight as part of their public positioning on such matters, and the discourses that order these choices. Accordingly, this article provides neither a comprehensive list of platforms’ practices and technologies nor an interrogation of their effectiveness in practice; rather, it considers what values are represented through platform newsroom posts, and whose interests they serve.
Our original research questions were as follows:
How do the platforms frame the nature of online harm and safety in their newsroom posts?
How are the platforms’ interventions justified through discourses of harm and safety?
What are the consequences of the data logics and models used?
Data Collection
We collected all the newsroom posts that we identified as relating to harm or safety, published between 1 November 2016 and 1 November 2021, from each platform’s newsroom or company blog. The outlets included the Twitter Blog (https://blog.twitter.com), TikTok Newsroom (https://newsroom.tiktok.com), the YouTube Official Blog (https://blog.youtube), Tinder Newsroom (https://www.tinderpressroom.com), and the Facebook (now Meta) Newsroom (https://about.fb.com/news/). While several of the platforms published newsroom posts in a range of languages, our sample includes only posts published in English, because the team’s language abilities are limited to English. A spreadsheet containing links to the posts in our sample is available as supplementary material at our institutional data repository (http://researchdatafinder.qut.edu.au), with posts tagged by platform and numbered 1–486. In this article, we use this numbering system to refer to specific posts.
In collecting the initial sample, we took a broad and inclusive approach to the definition of harm and safety. We approached harm as a concept that extends beyond interpersonal violence or illegality to include structural and systemic harms, including harms perpetrated by platforms themselves (Bartolo & Matamoros-Fernández, in press). Similarly, building on critical cultural studies and feminist literature, we understood safety not only as the absence of harm, danger, risk, or injury but also as the cultivation of social and cultural inclusion, access, and justice, together with the presence of transparency and processes of accountability (Stardust et al., 2022). We manually extracted posts from the platforms’ various newsrooms and blogs that discussed harm and safety issues, interventions, and the progress platforms had made toward meeting safety goals. We then collated the posts and extracted their metadata, including publication dates and URLs. After the initial collection, two members of the research team read the posts, identified those that were out of scope or duplicated (n = 115), and removed them from the sample. Of the 2,510 newsroom posts published by the five platforms between 1 November 2016 and 1 November 2021, 486 (19.4%) met the harm or safety criteria for the study and were included in our final sample: 270 posts by Facebook, 83 by Twitter, 65 by YouTube, 51 by TikTok, and 17 by Tinder. According to our definitions, then, roughly one-fifth of all newsroom posts published by the major platforms concern issues of harm and safety.
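For transparency, the sample arithmetic reported above can be restated as a short script. This is an illustrative recapitulation of the reported counts only, not our actual collection pipeline:

```python
# Illustrative restatement of the sample arithmetic reported above;
# not the collection pipeline itself.
TOTAL_POSTS = 2510  # all newsroom posts, 1 Nov 2016 to 1 Nov 2021

in_scope = {"Facebook": 270, "Twitter": 83, "YouTube": 65,
            "TikTok": 51, "Tinder": 17}

sample_size = sum(in_scope.values())
assert sample_size == 486

# Roughly one-fifth of all posts met the harm or safety criteria.
print(f"In-scope share: {sample_size / TOTAL_POSTS:.1%}")  # 19.4%
```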
Data Analysis and Preliminary Results
We conducted a thematic content analysis (Neuendorf, 2018; Wertley & Baker, 2022) of the newsroom posts to identify (a) salient harm and safety issues and (b) platform interventions. We used an inductive coding process grounded in our existing knowledge and literature review of platform governance issues (Saldaña, 2015), which involved an iterative process alternating independent coding with coder agreement meetings and the progressive elaboration of code definitions. As these codes were thematic rather than categorical, they were not mutually exclusive. Informed by Foucauldian discourse theory as applied to corporate PR communication (Edwards, 2018) and relevant critical literature on platform governance (cited throughout the analysis below), we then developed four overarching themes from this lower-level coding exercise (discussed in detail below): (1) locating harm with bad actors and discrete content objects, (2) surveillance and policing as solutions, (3) “borderline” content and suppression strategies, and (4) performing diversity and inclusion.
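Because the codes were thematic rather than mutually exclusive, a single post could contribute to several issue tallies at once. A minimal sketch of how such overlapping counts behave, with hypothetical post identifiers and codes:

```python
from collections import Counter

# Hypothetical coded posts: each post carries one or more thematic
# codes, so tallies count code occurrences, not a partition of posts.
coded_posts = {
    "post_101": {"mis/disinformation", "inauthenticity"},
    "post_102": {"abuse"},
    "post_103": {"mis/disinformation", "health"},
}

tallies = Counter(code for codes in coded_posts.values() for code in codes)
for code, n in tallies.most_common():
    print(code, n)
# mis/disinformation tallies 2; the others tally 1 each. The tallies
# sum to 5 although there are only 3 posts: counts overlap by design.
```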
The most prevalent harm or safety concern we identified across all platforms was mis/disinformation (n = 137), followed closely by inauthenticity (n = 129), abuse (n = 91), privacy and security (n = 80), health (n = 78), hate speech (n = 77), and illegal and illicit activity (n = 69) (Table 1). (In our coding scheme, “inauthenticity” tended to be associated with a lack of trustworthiness in user identity and interactions, whereas “misinformation” was more likely to be associated with content-related topics, such as conspiracy theories and fake news.) It is important to acknowledge that this emphasis is skewed by the influence of Facebook, which had the highest proportion of in-scope posts (29.3%). The focus on mis/disinformation was likely influenced by the sample period, which included the onset of the global COVID-19 pandemic. The prevalence of inauthenticity throughout the sample was, in part, driven by political concerns about Russian interference in the 2016 and 2020 United States federal elections, and platforms’ subsequent attempts to verify users and restrict the influence of state-affiliated accounts. Privacy and security were strongly represented in Facebook’s posts, which makes sense given that the sample period included Facebook’s response to the 2018 Cambridge Analytica scandal, in which unlawfully harvested data were allegedly used to influence electoral outcomes. The least prevalent harm/safety topics were global issues (n = 16), digital literacy (n = 10), human and civil rights (n = 7), and state violence (n = 3).
Table 1. Harm and Safety Themes in Newsroom Posts of Facebook, YouTube, TikTok, Twitter, and Tinder.
Each of the five platforms had a different focus that reflected its social media type, user base, and political investments. Table 2 indicates the harm and safety issues most frequently raised by each platform. As the largest platform, reporting 2.9 billion users, Facebook predominantly posted about mis/disinformation (which appeared in 29% of its posts), inauthenticity (24%), and privacy and security (23%). For YouTube, the second largest platform with a reported 2 billion users, the issues most frequently raised were illegal and illicit activity (34%), inauthenticity (29%), and borderline content (28%). With its disproportionately young userbase, TikTok, which first began publishing posts in 2018, most frequently posted about abuse (31%), health (31%), and children’s well-being (24%). Twitter, which has been scrutinized for facilitating coordinated inauthentic behavior and for hosting harmful content posted by former US President Donald Trump, most frequently raised mis/disinformation (39%), inauthenticity (39%), and abuse (39%). Tinder, the largest online dating app and one that facilitates adult in-person encounters, posted about the potential for interpersonal violence, with abuse (41%), social inclusion (29%), and inauthenticity (24%) the most frequently cited issues.
Table 2. Harm and Safety Themes by Platform.
Over the course of the sample period, there were notable increases in the frequency of posts on particular issues (Table 3). For example, in 2016, only one post was coded for mis/disinformation, increasing to 41 posts in 2021 (with a peak of 42 in 2020). In 2016, only two posts were coded for inauthenticity, increasing to 25 posts by 2021, with a peak of 37 in 2020. Over the 5 years, the number of posts coded per year rose from 0 to 22 for privacy and security, from 1 to 14 for social inclusion, from 1 to 27 for abuse, from 1 to 26 for hate speech, from 1 to 31 for illegal and illicit activity, and from 1 to 32 for health. In addition, the types of tools and interventions initiated by platforms changed over time: we observed increasing numbers of posts about content moderation, content curation, verification, user controls, law enforcement, policy updates, education, and consultation as strategies to respond to harm and safety issues. For example, Facebook’s posts about content moderation increased from 1 to 99 over the sample period, its posts on education increased from 2 to 105, and its posts on consultation increased from 0 to 132.
Table 3. Harm and Safety Themes Over Time.
Theme 1: Locating Harm With Bad Actors and Discrete Content Objects
The newsroom posts framed media objects (individual pieces of content) posted by “bad actors” (individual users) as the locus of harm and the primary focus of the platforms’ harm prevention efforts. The posts often appeared to be responses to recent controversies in which harm was attributed to social media use, and which in turn fed “media panic” discourses (Drotner, 1999) about social media platforms in the mainstream media and public debate (for detailed discussion of this phenomenon with respect to children’s online safety, see Milosevic, 2018). In managing and seeking to control the fallout from these controversies, the newsroom posts tended to focus on the platforms’ efforts to combat bad “media objects”—that is, content that was highly visible, easily detectable, and likely to be understood by most observers as egregious, such as instances of hate speech (15.8%), abuse (18.7%), and illegal or illicit content (14.2%). All platforms except Tinder framed hate speech as a harm and safety problem. Hate speech was most often defined by reference to anti-Semitism (e.g., holocaust denial material), and less frequently to misogyny and racism (Facebook row 113, TikTok row 301). Posts about abuse focused on interpersonal violence: bullying and harassment, non-consensual intimate imagery, sexual violence, incivility, doxing, cyberstalking, and physical violence. Tinder focused most often on abuse (41%), describing “harmful language” in “opening lines” and other instant messages sent between users (Tinder row 326). Twitter’s posts about abuse (39%) described “incivility” and “unhealthy conversations,” such as how “distracting, irrelevant, and offensive replies can derail the discussions that people want to have” (Twitter row 386). A further target for comment was illegal and illicit content: in total, 69 newsroom posts focused on content and behaviors that were unlawful, regulated, or unauthorized. The majority of these posts were about easy targets (issues on which there is relatively stable consensus across the political spectrum), such as terrorism, child sexual abuse material (“CSAM”), and the sale, promotion, use, and possession of weapons, tobacco, alcohol, and other drugs.
The platforms emphasized and reinforced the culpability of individual users by framing those who harm others through their networks as “bad actors” (Tinder row 327, YouTube row 463). In doing so, they invoked the debunked criminal justice theory of “bad apples,” which views anti-social or criminal behavior as occurring among a few deviant individuals rather than as the result of structural or environmental factors (Tator et al., 2006). Drug use, for example, was repeatedly framed using language around “bad actors” and “bad content.” These posts focused on illicit and criminalized drugs and reproduced the “war on drugs” approach. Facebook’s Vice President of Global Policy Management reported in a 2018 post (Facebook row 240): “I joined Facebook after a career as a prosecutor and saw firsthand the damage these drugs can inflict on communities and families. So let me start with the obvious: there is no place for this on our services. It’s bad for society, bad for people, and against our values.”
While platforms responded to global public health issues, such as the COVID-19 pandemic, they were otherwise concerned with very selective, individual health issues that received substantial media and government attention: substance abuse (largely about recovery and detox); negative body image (overwhelmingly focused on eating disorders and body dissatisfaction); self-harm (including cutting, suicidal ideation, and suicide); and digital addiction (the negative impacts of internet consumption). In doing so, platforms focused on health through a lens of individual pathology, reinforcing stigma rather than addressing the structural determinants of health.
The more that platforms framed harm as inhering in discrete pieces of content, the more they could justify investing in scaling up their content moderation systems. Of the 486 newsroom posts, 40.1% (n = 195) discussed human and automated content moderation to detect, suppress, and remove content or users, and 21.2% (n = 103) described the steps the platforms were taking to curate content, which the companies claimed included content suppression or recommendation. Platforms framed their technological investment in “machines” as an efficient and reliable method to “take as much of the burden as possible away from the individual encountering abuse” (Twitter row 352), “flag content for review at scale,” and “remove millions of violative videos before they are ever viewed” (YouTube row 468). Facebook emphasized how they “will continue investing in technology to keep illicit drug sales off our platforms,” claiming to “block and filter hundreds of terms associated with drug sales,” making “it easy for people to flag bad content” and “proactively investigate profiles, Pages, Groups, hashtags and accounts associated with bad content we’ve already removed” (Facebook row 240).
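Facebook does not disclose how its term blocking and filtering works. Purely as a hypothetical illustration of the kind of keyword matching these claims imply (the term list and matching rule below are invented), a naive version might look like this, and it also shows why such filters overcapture:

```python
import re

# Hypothetical blocklist: Facebook claims to "block and filter
# hundreds of terms associated with drug sales"; these examples
# and the naive substring matching are invented for illustration.
BLOCKED_TERMS = {"pills for sale", "buy opioids", "dm for plug"}

def flag_post(text: str) -> bool:
    """Flag a post if it contains any blocked term."""
    normalized = re.sub(r"\s+", " ", text.lower())
    return any(term in normalized for term in BLOCKED_TERMS)

print(flag_post("Pills for sale, message me"))                    # True
print(flag_post("Harm reduction resources for opioid users"))     # False
# Overcapture: a warning post matches the same term as a sale post.
print(flag_post("Watch out, counterfeit pills for sale nearby"))  # True
```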
Alongside content moderation, interventions included tools users could employ to customize their online experience and take responsibility for their own safety. Almost one-third (29.6%, n = 144) of the posts described user controls that enable users to curate aspects of their experience, limit contact with other users, and report rule violations. Notably, 59% of Tinder’s posts described user tools, including “blocking” or limiting contact with users (Tinder row 324) and even the ability for LGBTIQA+ users to suppress their profiles in countries where LGBTIQA+ activities, identities, and relationships are criminalized (Tinder row 335). The four other platforms described affordances that would enable users to customize what they could “see” in comments (YouTube row 463), feeds, and direct messages (Twitter row 398). For example, Twitter announced the platform’s “Safety Mode,” which allows users to block accounts and harmful language (Twitter row 341).
The harms platforms focused on tended to be those most amenable to scalable technical solutions (see Rieder & Skop, 2021). TikTok reported that their automated tools are “reserved for content categories where [their] technology has the highest degree of accuracy,” which included violations of the “policies on minor safety, adult nudity and sexual activities, violent and graphic content, and illegal activities and regulated goods.” However, platforms maintain very broad prohibition policies that require their algorithms to differentiate between educative, humorous, and unsolicited content, and even to adjudicate the apparent gender of human nipples (Tiidenberg & van der Nagel, 2020). The perils of this work are obvious in YouTube’s posts about their Restricted Mode, which attempts to differentiate between sex education (permitted) and detailed conversations about sex (prohibited). As YouTube reflected, “This is one of the more difficult topics to train our systems on, and context is key” (YouTube row 482). Despite contention over their policies and the documented risk of overcapture, platforms continue with proactive automated takedowns of such “bad content.” The scalability of automated content moderation systems is a strong motivating factor for technological investment, and “the productivity of the process is central to rendering this the preferred solution” (Siapera & Viejo-Otero, 2021, p. 126).
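TikTok’s statement implies a confidence-gated pipeline: automation acts on its own only in categories where its accuracy is deemed high enough, with other content presumably routed to human review. A schematic sketch of that logic (category names paraphrased from the quoted policy areas; thresholds invented):

```python
# Schematic confidence gate: automated removal "reserved" for the
# categories TikTok names; thresholds are invented for illustration.
AUTO_REMOVE_THRESHOLDS = {
    "minor_safety": 0.98,
    "adult_nudity_and_sexual_activities": 0.95,
    "violent_and_graphic_content": 0.95,
    "illegal_activities_and_regulated_goods": 0.95,
}

def route(category: str, classifier_confidence: float) -> str:
    threshold = AUTO_REMOVE_THRESHOLDS.get(category)
    if threshold is not None and classifier_confidence >= threshold:
        return "auto_remove"   # no human in the loop
    return "human_review"      # low confidence, or category not automated

print(route("minor_safety", 0.99))   # auto_remove
print(route("minor_safety", 0.90))   # human_review
print(route("hate_speech", 0.99))    # human_review: not an automated category
```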
The extent to which the newsroom posts focus on “bad actors” and “bad content” demonstrates the carceral discourses that order platform governance, and that the newsroom posts help reinforce (Hasinoff et al., 2020). By this, we mean they are guided by criminal legal assumptions that individuals—rather than structural conditions—produce crime, and that punishment deters offending (Schoenebeck et al., 2020). These logics lead platforms to focus on symptoms—like pieces of content—rather than the systemic drivers of online harm. In the face of public scrutiny for facilitating serious interpersonal violence (Dias et al., 2020), Tinder announced a partnership with the non-profit background check platform “Garbo,” which gives users “public records and reports of violence or abuse, including arrests, convictions, restraining orders, harassment, and other violent crimes” of those with whom they may want to connect. Tinder framed the partnership as a way to “combat bad actors and empower users with tools to help keep them safer” (Tinder row 327). However, the partnership implies that while bad actors are to blame for online harms, their behaviors can best be prevented by vigilance on the part of individuals (Bivens & Hasinoff, 2017). As we discuss elsewhere (Stardust et al., 2022), this kind of carceral surveillance system on dating apps could have serious unintended consequences for vulnerable and marginalized populations.
Theme 2: Surveillance and Policing as Solutions
The posts further framed users as “bad” actors where they were perceived to be behaving in deceptive, disingenuous, or “inauthentic” ways. More than a quarter of the posts described inauthenticity as a harm and safety concern (26.5%, n = 129). These posts went beyond individuals and discrete content, describing a range of issues including impersonation, spam, synthetic or manipulated content, and coordinated inauthentic behavior, at times overlapping with mis/disinformation and abuse. Notably, 39% of Twitter’s in-scope posts (n = 32) were about inauthenticity, often highlighting malicious uses of automation and foreign and government-linked actors who sought to disrupt the integrity of elections, hinder public health discussions, amplify spam, impersonate others, and spread synthetic or manipulated content. YouTube’s primary inauthenticity concern related to children, as the platform reported on “content on YouTube that attempts to pass as family-friendly, but is clearly not” (YouTube row 474). As a dating app presumably concerned with catfishing and romance fraud, Tinder focused on how users represented their physical appearances and encouraged users to be their “authentic selves” (Tinder row 337).
Because these abusive and exploitative practices were framed as a problem of “inauthenticity,” the solutions were framed around the need for more authentication (or verification)—and hence, more data-intensive user surveillance. Twitter, which described verification-based interventions in 19% of its newsroom posts, focused on authenticating political actors and acting against “networks of spammy or automated accounts automatically” (Twitter row 408). While verification practices were not as commonly described by TikTok, Facebook, and YouTube, each of these platforms illustrated the steps they took to authenticate users’ identities (particularly users who garner large audiences [Facebook rows 31, 52] or seek to run political advertisements [Facebook rows 108, 213]) and to ensure age-appropriate access for children (TikTok row 282, YouTube row 446). Verification was a dominant safety intervention strategy for Tinder, with 18% of its newsroom posts outlining tools to ensure “every match is who they say they are” (Tinder row 333). With the help of automation, Tinder introduced various image and identification verification practices, such as its photo verification feature, which compares “a posed photo taken in real-time to profile photos” to “help verify a match’s authenticity and increase trust in member profiles” (Tinder row 333).
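Tinder’s posts do not document how photo verification works beyond comparing the posed selfie to profile photos. A common pattern for such systems is to compare learned face embeddings; the sketch below assumes that pattern, and the embed() stand-in and similarity threshold are entirely hypothetical:

```python
import numpy as np

def embed(image_path: str) -> np.ndarray:
    # Stand-in for a trained face-embedding network (a real system
    # would also run liveness checks on the posed selfie). This
    # deterministic pseudo-embedding just lets the sketch run.
    rng = np.random.default_rng(abs(hash(image_path)) % 2**32)
    vec = rng.normal(size=128)
    return vec / np.linalg.norm(vec)

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b)  # embeddings are unit-normalized above

def verify(selfie: str, profile_photos: list[str], threshold: float = 0.8) -> bool:
    """Badge the profile if the posed selfie matches any profile photo."""
    selfie_vec = embed(selfie)
    return any(cosine_similarity(selfie_vec, embed(p)) >= threshold
               for p in profile_photos)

print(verify("posed_selfie.jpg", ["profile_1.jpg", "profile_2.jpg"]))
```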
These verification interventions assume that digital spaces are safer when users create accounts tethered to their “real” identities. However, while verification interventions may help some users, they can seriously disadvantage and even harm others. Facebook acknowledged the potential harm resulting from verification practices even while enforcing the platform’s heavily contested “real name” policy (boyd, 2012; van der Nagel & Frith, 2015). In one 2021 post, Facebook even explained why enabling pseudonymity can be important: “perhaps they’re a young member of the LGBTQIA+ community and they worry about having their identity attached to a pseudonymous account” (Facebook row 102). The incongruity between Facebook’s policy and its stated efforts to create a safe digital environment calls into question whose “safety” the platform as a whole is most likely to ensure. While verification interventions may benefit certain users, victim-survivors of domestic and family violence who use pseudonyms for protection risk being exposed to malicious actors under Facebook’s real-name policy. Verification requirements therefore fail to tackle the conditions that enable harm, and can instead actively perpetuate it; as Matias (2017) argues, “our concerns about anonymity are overly-simplistic; system design can’t solve social problems without actual social change.”
Platforms further emphasized the value of monitoring user behavior and content in the context of combatting abuse. In 2021, Tinder announced its newest suite of harassment interventions, “Are You Sure?” and “Does This Bother You?,” which automatically scan messages sent between users on the app’s instant messaging service and intercept those the company understands as harmful with prompts to the sender or recipient, thereby instilling an omnipresent mode of surveillance. Tinder (row 326) claims to collect and store reported user conversations as a means “to evolve and improve” their “AI” to detect “inappropriate language in messages.” However, the effectiveness of these automated surveillance tools depends on Tinder’s capacity to understand what their users experience as harmful and how this understanding informs the app’s features and functions. The practice raises important questions about what these tools include or exclude, what understandings and heuristics are used to justify their interventions, and how platforms ascribe meaning to data.
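The posts do not reveal how Tinder’s detection works. Described schematically, both features amount to a classifier gate on the messaging path, prompting the sender before delivery and the recipient afterward. A sketch with an invented keyword “classifier” standing in for Tinder’s undisclosed “AI”:

```python
# Schematic of the message-intercept flow described above. The toy
# keyword check stands in for Tinder's undisclosed model, which the
# company says is trained on reported conversations.
HARM_CUES = {"ugly", "worthless", "kill yourself"}

def looks_harmful(message: str) -> bool:
    text = message.lower()
    return any(cue in text for cue in HARM_CUES)

def send_message(message: str) -> None:
    if looks_harmful(message):
        # Sender-side prompt ("Are You Sure?") before delivery.
        print("Prompt to sender: are you sure you want to send this?")
    deliver(message)

def deliver(message: str) -> None:
    if looks_harmful(message):
        # Recipient-side prompt ("Does This Bother You?") on receipt.
        print("Prompt to recipient: does this bother you?")
    print(f"Delivered: {message}")

send_message("hey, loved your profile!")  # passes through with no prompts
```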
Given the reliance of their business models on user data (Sadowski, 2019), it is more than convenient for technology companies that the ever-more intensive extraction and accumulation of user data can be justified in service of harm and safety interventions. Building on van Dijck’s (2014, p. 198) definition of datafication as “a means to access [. . .] and monitor people’s behavior,” we observe in our corpus an ever-intensifying “datafication of safety” on the part of platforms, despite their having been heavily scrutinized for data breaches in the recent past (Wong, 2019). Facebook’s announcement of new privacy standards followed a US$5 billion penalty paid to the Federal Trade Commission for violating a consent order following an investigation into the Cambridge Analytica scandal (Facebook rows 183, 211). In responding to ongoing user concerns about how the company was collecting, storing, and sharing user log-in details and personal information with third parties (Facebook rows 252, 254), some of the newsroom posts deflected attention to external actors, warning of cyber threats and the distribution of hacked material.
This accumulation of user data is notable given that platforms are increasingly establishing relationships with law enforcement. Each of the platforms except Twitter described their involvement with law enforcement agencies, citing illegal or illicit content and behaviors or high-profile media events, such as the 2021 United States Capitol attack (Facebook row 134) and the Christchurch terrorist attack (Facebook row 225), as motivators for such engagement. In addition to its partnership with Garbo, Tinder partnered with the emergency response app Noonlight, which invites users to log details about their dates and alert law enforcement if they feel unsafe (Tinder row 333). A particularly clear example of the carceral logics we observed was Twitter, Facebook, and YouTube’s “strikes” policies, under which users who break the rules are punished according to the number of rule violations scored against their accounts—mirroring the “three strikes” mandatory sentencing policies known to disproportionately impact marginalized communities and result in net-widening in the criminal justice system.
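The strikes systems the platforms describe are, in effect, escalating penalty ladders keyed to a per-account violation count. A generic sketch of that mechanism (the penalty schedule below is invented; each platform publishes its own variant):

```python
# Generic "strikes" enforcement ladder; the schedule is invented for
# illustration and mirrors no single platform's published policy.
PENALTY_LADDER = {1: "warning", 2: "temporary_suspension", 3: "permanent_ban"}

strike_counts: dict[str, int] = {}

def record_violation(account_id: str) -> str:
    strike_counts[account_id] = strike_counts.get(account_id, 0) + 1
    rung = min(strike_counts[account_id], max(PENALTY_LADDER))
    return PENALTY_LADDER[rung]

print(record_violation("account_1"))  # warning
print(record_violation("account_1"))  # temporary_suspension
print(record_violation("account_1"))  # permanent_ban
```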
Theme 3: “Borderline” Content and Suppression Strategies
Platform newsroom posts inevitably had to address a lack of trust and confidence on the part of users, media, researchers, governments, and the public. Posts coded for “trust and confidence” addressed a range of issues, including algorithmic overcapture, lack of transparency, discriminatory advertising, political campaigning, restricted content, recommender systems, strike systems, personalization, barriers to earning capacity, and data privacy. A significant trust issue was the overcapture and proactive platform suppression of what platforms describe as “borderline” and “sensitive” content. What constitutes such content is contentious because the terms refer to media that do not technically breach platform terms of service or community guidelines, but which platforms consider, through covert and subjective decision-making, to be otherwise undesirable. Borderline content is not necessarily removed, but its visibility and searchability are reduced (Facebook row 250). Across the newsroom posts, a variety of content was referred to as borderline, including misinformation, fake news, clickbait, sensationalism, explicit language (profanity), eating disorder content, alcohol and drug use, content that trivializes suicide and self-harm, and sexually suggestive material.
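Reducing “visibility and searchability” without removal implies a demotion step in ranking rather than a takedown. Schematically, with an invented demotion factor:

```python
# Schematic demotion of "borderline" content: the item stays up, but
# its ranking score is scaled down so it surfaces less in feeds and
# search. The demotion factor is invented for illustration.
BORDERLINE_DEMOTION = 0.1

def ranked_score(base_score: float, is_borderline: bool) -> float:
    return base_score * (BORDERLINE_DEMOTION if is_borderline else 1.0)

posts = [("news_explainer", 0.80, False), ("miracle_cure_clip", 0.90, True)]
ranking = sorted(posts, key=lambda p: ranked_score(p[1], p[2]), reverse=True)
print([title for title, *_ in ranking])
# ['news_explainer', 'miracle_cure_clip']: the borderline post is
# never removed, only pushed down the ranking.
```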
The moderation of nudity and sexual expression provided an illustrative example of contentious overcapture. Platforms posted with pride about their proactive and automated removal of such content: in their first ever enforcement report, Facebook stated, “We took down 21 million pieces of adult nudity and sexual activity in Q1 2018—96% of which was found and flagged by our technology before it was reported” (Facebook row 250). While this information is presumably offered to demonstrate the efficacy of their technology, it also reflects the breadth of their problematic nudity/sexuality policy, which has been widely critiqued for suppressing lawful, consensual content including sex education, health promotion, and harm reduction material (Are, 2021). However, proactive detection and suppression was not a universal approach: Facebook, YouTube, and TikTok were the only platforms within the scope of our study that posted about their removal of this content (Tinder does not allow photo exchange, and Twitter already permits nudity). By contrast, Twitter took a more nuanced approach, posting about its prohibitions on non-consensual nudity and CSAM rather than issuing a blanket ban (Twitter row 344).
Biased automated tools can leave platforms open to liability: in 2022, Tumblr settled discrimination allegations with New York City’s Commission on Human Rights after implementing an automated sexual content takedown system that disproportionately impacted LGBTIQA+ users (Robertson, 2022). In the posts we analyzed, YouTube apologized for the way its “restricted mode” feature was blocking LGBTQ content, describing how it was “originally designed as an optional feature for public institutions like libraries and schools to prevent the viewing of mature content on YouTube.” However, they reflected that “in looking more closely at the feature, we found that there was LGBTQ (and other) content that should have been included in Restricted Mode but was not, like kissing at weddings, personal accounts of difficult events, and speaking out against discrimination” (YouTube row 480). While responding to complaints about content suppression, platforms used their newsrooms to apologize to users who had been targeted by anti-queer ads and to justify advertising and fundraising policies that enabled the earning potential of some users while restricting that of others. YouTube responded to critiques that it was preventing advertisers from running campaigns targeting social movements, such as Black Lives Matter (“BLM”) (YouTube row 434). In 2020, Mashable reported that Tinder, too, was removing profiles that mentioned Black Lives Matter in their bios (Iovine, 2020). Although newsroom posts positioned these policies as reasonable and neutral, they tend to impact marginalized users disproportionately, especially those organizing mutual aid. Platforms have been accused of “black box gaslighting” because of the way they appear to deny the substance of criticism concerning their algorithms (Cotter, 2021). In 2019, eight queer YouTube creators filed a class action lawsuit against Google to “stop discriminatory and unlawful content-based regulation, restraint, monetization, false advertising, and anti-competitive distribution of LGBTQ+ speech and video content” (Cheves, 2019).
Newsroom posts responded to such concerns via several strategies: announcing updates to policies and community standards, conducting algorithmic and workplace audits, releasing transparency reports, amending appeals processes, announcing partnerships, hosting events, conducting stakeholder consultation, and funding projects. Facebook, which only began publishing enforcement numbers in 2018, published various reports and internal guidelines seeking to offer greater visibility into its customization, content ranking, and distribution practices, clarifying how it makes money and revising definitions of the “borderline,” “sensational health content,” “unoriginal video content,” “problematic” content, and “low-quality content” it suppresses (Facebook row 2). Twitter reported conducting studies to assess gender and racial bias in its algorithms and research into responsible machine learning (Twitter row 351). However, the posts reflect a practice we summarize as “act now, apologize later.” Updates were generally aimed at addressing gaps in existing policies and were overwhelmingly reactive: Facebook’s announcement that the company would prohibit discriminatory advertising in housing, employment, and credit lending was prompted by litigation (and a subsequent settlement) filed against it by the National Fair Housing Alliance (NFHA), the American Civil Liberties Union (ACLU), the Communication Workers of America (CWA), and others (Facebook row 226).
Theme 4: Performing Diversity and Inclusion
In what might be viewed as reputational damage mitigation, platforms emphasized their policies on inclusion and diversity. Under the code “social inclusion,” platforms discussed LGBTQIA+, gender, racial, disability, and body inclusivity; however, they took a surface approach to engagement and focused overwhelmingly on the politics of visibility over structural change. Posts on LGBTIQA+ inclusivity, for example, were predominantly about pride, celebration, and visibility. Facebook offered pride-themed product features, such as logos, avatars, filters, backgrounds, stickers, rainbow hashtags, chat themes, and stories (Facebook row 109). Their focus was on personalization, self-expression, and LGBTIQA+ individuals as a source of inspiration: “Mention your #rolemodel in your stories and share how they inspire you” (Facebook row 248). On YouTube, posts about LGBTQIA+ inclusivity were about boosting and amplifying voices and encouraging more creators via the hashtag #ProudtoCreate (YouTube row 644). Similarly, racial inclusion posts tended to focus on amplification, celebration, and awareness raising. Using the hashtag #RepresentLove, Tinder advocated for “Interracial Couple Emojis” through a petition (Tinder row 338). YouTube announced their celebration of Black History Month with YouTube Originals (YouTube row 438) and their amplification of Black voices through their #YouTubeBlackVoices Fund (YouTube row 440).
Such initiatives ought to be viewed critically, as “inclusivity” can reinforce social inequalities (Hoffmann, 2021) and fail to address political, social, cultural, and economic barriers. All of Facebook’s posts coded for LGBTQIA+ inclusion were limited to Pride Day or Coming Out Day (Facebook rows 109, 150, 248). Tinder posted about trans users only when it had to address complaints: “our trans members have been very vocal about: the banning of our transgender members, especially transgender women” (Tinder row 334). The two posts coded for racial inclusivity at Facebook related either to designated days (Asian Pacific American Heritage Month) (Facebook row 113) or to particular political events (the 2020 Black Lives Matter protests) (Facebook row 171). Celebrating inclusion only on specific days, without sustained initiatives throughout the year, works to slot these user bases into platforms’ existing business model and casts their commitment to those communities into doubt (Hoffmann, 2021). While a common method for platforms to address social issues was to make sizable one-off donations to charities and non-profits (Facebook, for instance, announced a donation of US$15 million to organizations focused on racial justice and equity), it is unclear what platforms are doing to support those organizations’ everyday work in the long term. Presumably, these donations bring tax deductions and offsets to the companies, rather than being mutual aid donations offered in solidarity.
Amplification efforts have obvious benefits for the platforms themselves, as they encourage the creation of further content that attracts more viewers and advertising revenue. For example, Facebook committed to reaching 1 million members of both the Black and Latinx communities through its Elevate program, which provides “free training in the digital skills they need to succeed, from setting up an online presence to creating marketing materials and more” (Facebook row 171). YouTube similarly focused on the success of individuals on its platform: “Black creators and artists have and continue to play an important role in shaping the culture on YouTube, propelling our platform forward. We are invested in these creators, artists, and their stories” (YouTube row 434). It is important to note that “‘Black faces in high places’ is not an aberration but a key feature of a society structured by White supremacy” (Benjamin, 2019, p. 46). It is unsurprising that, in another post, YouTube lists one of its top priorities as “growing the creator economy” (YouTube row 438).
Although platforms are eager to appear inclusive, their “support” is selective. For example, the posts about LGBTQIA+ inclusion featured a notable underrepresentation of pressing justice, access, and equity issues relating to criminalization, medicalization, poverty, violence, or hostile legal frameworks, especially those facing trans and intersex communities. When YouTube responded to critiques about their restricted mode, they emphasized their consultation with LGBTIQA+ employees, broadening their community guidelines, inviting users to submit instances where they think the platform “got it wrong,” and hosting creator roundtables and advisory sessions (YouTube row 480). No such activities were reported to address content produced by sex workers, despite the well-documented content blocks and takedowns experienced by those communities (Blunt et al., 2020). While sex work can be considered queer—it is anti-normative, anti-nuclear, and sexually deviant; it involves sex for money rather than love or procreation; and the work itself blurs binaries of homosexuality and heterosexuality (McKay, 1999)—presumably, supporting sex workers is not considered “brand safe,” and sex workers are not deemed palatable entrepreneurs whose content ought to be amplified. Indeed, one YouTube post on “Protecting our community” announced how the platform strikes a balance between protecting users from inappropriate content and still attracting revenue: “We want to give creators confidence that their revenue won’t be harmed by bad actors while giving advertisers assurances that their ads are running alongside content that reflects their brand’s values” (YouTube row 473).
The urgency of political reforms can also be eclipsed by strategies that emphasize individual resilience over systemic change. YouTube and Facebook, for example, both highlighted the “It Gets Better” campaign (which features LGBTQIA+ influencers sharing stories about mental health) (YouTube row 480, Facebook row 150). While the campaign can be inspiring, it has been critiqued for its promise of a better future that may not actually eventuate, and for its expectation of perseverance at a time of documented anti-trans and anti-queer backlash (Butler, 2021). Newsroom posts on disability inclusion took a very narrow approach to the issues of access, community, culture, and design. The posts on Facebook, Twitter, and YouTube were generally about introducing automated captions, speech recognition, text-to-speech software, and photo descriptions (Facebook row 156; Twitter row 349; YouTube row 485). TikTok went further by spotlighting Deaf and Hard of Hearing creators during Deaf Awareness Month, supporting #DeafTikTok (TikTok row 275), and introducing a photosensitivity warning for users with photosensitive epilepsy (TikTok row 297). The one post we coded for body inclusivity, entitled “supporting body inclusivity on TikTok,” introduced new treatment and support resources for users struggling with eating disorders when they searched for #edrecovery, #proana, or similar phrases (TikTok row 292). The lack of posts about supporting fat activists, plus-sized models, or body-positive users is striking given that platforms have been critiqued for deleting images and content of body-positive women (Witt et al., 2019) and that TikTok itself has admitted suppressing content by disabled, queer, and fat creators (Botella, 2019).
Some platforms introduced initiatives to improve their workplaces, such as audits to identify racism or diversity training for workers. Twitter, for example, announced training its staff in the historical context of hate speech and proactively hiring more diverse staff. Facebook claimed, “We’ve already committed to have 50% of our workforce be from underrepresented communities by the end of 2023, and we’re working to double our number of Black and Latinx employees in the same timeframe. And over the next five years, we’re committing to have 30% more people of color, including 30% more Black people, in leadership positions” (Facebook row 171). While a more diverse workforce may make different decisions on content and policy, the politics of “add diversity and stir” does not necessarily change the organizational structures or profit incentives of corporations. Ahmed and Swan (2006) have theorized the problems of inclusion for people of color who are co-opted into white institutions and then required to do tiresome and uncompensated “diversity work” to exist in those spaces.
Conclusion
While no substitute for insider access, public newsroom posts offer important insights into how platforms frame, justify, and publicly perform their interventions into matters of societal concern. Our study investigated which problems platforms chose to emphasize in their public statements, whose safety they appeared to prioritize, and the logics that underpinned the interventions they reported making. In detailing our findings, we covered four main areas: first, we identified a persistent focus on individual “bad actors” as the cause of harms; second, we identified a focus on data-led surveillance as the solution to these harms; third, we highlighted the rise of “borderline content” and content suppression strategies as an emerging practice; and fourth, we noted the limited and contradictory ways platforms’ public statements perform diversity and inclusion. While acknowledging the limitations of this study of public statements (as opposed to internal company policies and practices, and debates about these among various cohorts of company employees), we now draw out two further implications from our analysis.
First, framing online harm as the work of a select few “bad actors” (from whom vigilant users might protect themselves) absolves platforms of responsibility and reproduces the myth that platforms are neutral conduits (Bivens & Hasinoff, 2017), rather than powerful media institutions whose architectures, cultures, and governance practices not only reflect but also help to shape society and its various forms of inequality (Benjamin, 2019). By neglecting the symbiotic relationship between technology and people, technology companies reproduce misconceptions about how platforms shape behavior and about how to address online harms and make digital spaces safer for users. This diverts attention from platforms’ roles in facilitating, enabling, amplifying, cultivating, and reinforcing harmful content and behaviors. By focusing on the harms that are amenable to technological tools, platforms distract from the range of online harms that cannot be addressed by these methods alone. To focus on the removal of pro-ana websites while blocking fat-positive content creators, to focus on mental health support while insisting trans users use deadnames on their profiles, and to focus on drug users without putting forward policy statements condemning drug criminalization, acts discursively to position responsibility with individuals rather than with the legal, policy, social, cultural, and technological environments that continue to harm them.
Second, the emphasis on verification and user surveillance as the keys to improved safety hints at the carceral logics that underpin dominant models of platform governance, not only on behalf of the platforms themselves but also on behalf of society and governments more broadly. Under those logics, safety is understood as synonymous with security—that is, safety “from” (harms and threats) rather than safety “to” (act in one’s own interests, to flourish), and safety is secured through surveillance and punishment (Thakor, 2019), in the absence of trust. The “prospect of total surveillance,” Mark Andrejevic (2019, p. 9) writes, “offers to relieve us of the social burden of having to trust one another at a time when it is becoming harder than ever to do so. . .” A bonus of surveillance- and verification-led responses to safety concerns is that platforms can use these concerns to justify the scale-up of their automated tools and the increased scope of personal data collection. As we might expect once we understand the newsroom posts as public relations media, these efforts seem primarily to function as responses to high-profile media controversies or regulatory threats, focusing on convenient topics where there are reputational benefits. More important, the deep challenges of apparently conflicting human rights appear to be increasingly dealt with by defining some forms of expression as “borderline content” and suppressing its reach, a trend that reduces transparency, trust, and hence accountability for matters of concern around safety.
Acknowledgements
The authors would like to thank Cadbury Cordeaux for their excellent and diligent research assistance on this project.
Declaration of Conflicting Interests
The author(s) declared the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: Jean Burgess has consulted with Meta in an advisory capacity. Rosalie Gillett has previously received research funding from Facebook for an unrelated project.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Australian Research Council Centre of Excellence for Automated Decision-Making and Society (Grant No. CE200100005) and QUT’s Digital Media Research Centre.
Supplemental material
Supplemental material for this article is available online.
