Abstract
Social media offers an attractive site for Big Data research. Access to big social media data, however, is controlled by companies that privilege corporate, governmental, and private research firms. Additionally, Institutional Review Boards’ regulative practices and slow adaptation to emerging ethical dilemmas in online contexts create challenges for Big Data researchers. We examine these challenges in the context of a feminist qualitative Big Data analysis of the hashtag event #WhyIStayed. We argue power, context, and subjugated knowledges must each be central considerations in conducting Big Data social media research. In doing so, this paper offers a feminist practice of holistic reflexivity in order to help social media researchers navigate and negotiate this terrain.
Introduction
Recently, there has been increased interest among scholars in Big Data research, driven in part by the rise of social media, advancements in computing and technology, and new corporate practices and infrastructures for data production and circulation (Hand and Hillyard, 2014). Big social media data, however, are controlled by companies that privilege corporate, governmental, and private research firms, which poses challenges for researchers wishing to access these data. Moreover, institutional review boards have struggled to develop policies regarding context-specific ethical practices for online research. As a result, scholars have begun to consider the ethical questions of online Big Data research, and Big Data research of social media specifically (e.g., boyd and Crawford, 2012; Brock, 2015; Leurs, 2017; Zwitter, 2014). Increasing consideration is being given to the methodological implications for knowledge production and dissemination, the implications for those who share their experiences, thoughts, and opinions (“data points”) via social media and online networks, how and whether social media constitutes a “public space,” and what informed consent means for social media users.
Meanwhile, activists have been calling out practices by researchers and others within social media spaces. In December 2014, a group of activist women of color published a collective statement announcing an organized social media blackout, which was in response to the abuse, appropriation, and de-legitimation of their digital labor (Collected Authors, 2014). This statement, titled #ThisTweetCalledMyBack, a play on the classic feminist book This Bridge Called My Back (Moraga and Anzaldua, 1983), called out academics, journalists, non-profits, and “proper organizers,” among others, for ignoring, discounting, or disparaging the work of women of color, while reaping the benefits of their unpaid digital labor. As the statement’s signatories explain:
There is a refusal to legitimize the words of women of color without the backing of academia, established media, and non-profit monikers. How do we then legitimize the lens with which marginalized women of color view their lives and the spaces where they are actually allowed to assert their agency? Currently the tools women of color use to engage a movement that has long viewed them as silent subjects, relegated to frying chicken and frybread for the real movers, are devalued at best—and threatened after mined for content at worst. All of this and more take place to the tune of stalking, plagiarism, and an outright refusal to look at the interpersonal violence that we face as a result. Still, no one can quell our concern about how it is that we can expect to be respected and kept safe a physical movement space if you won’t respect and keep us safe in a digital one (Collected Authors, 2014: para. 8).
Informed by feminist methodologies, we explore how power shapes the context of discovery as well as the context of inquiry (Collins, 1994; Harding, 1987) within Big Data social media research. In this paper, we shed light on the ways our feminist commitments surface methodological and institutional challenges in doing Big Data social media research. To do so, we engage feminist holistic reflexivity regarding our own experiences designing and conducting a study involving a qualitative Big Data analysis of #WhyIStayed. While feminist methodologies are diverse, we draw upon key concepts including power, reflexivity, and subjugated knowledges (Collins, 1994; Haraway, 1988; hooks, 1989, 2000) to outline the ways feminist ethics informed not only how we navigated power dynamics, but how feminist ethical considerations give rise to new sets of challenges, some of which would not have presented themselves had we employed another epistemological or methodological framework, for example, (post)-positivism. When we embarked on this project, few studies discussed methodological challenges in conducting feminist Big Data social media research and even fewer explicitly addressed how they experienced and navigated these challenges. We hope this paper serves as an important intervention in that regard. In this paper, we turn to feminist holistic reflexivity to navigate the challenges and constraints of doing Big Data social media research. As such, we contribute to a larger ongoing conversation both inside and outside the academy about what it means to do social media research that seeks to account for the agency, subjectivity, and contextual specificities of those individuals who produce “data points.”
Feminist ethics
Feminism is defined as “the movement to end sexism, sexual exploitation, and sexual oppression” (hooks, 2000: 3). As hooks and other feminists of color argue, sexism is inextricably linked with racism, colonial histories, and capitalist exploitation of labor; thus, an intersectional approach is necessary to feminism as a movement and a practice. In the context of knowledge production, feminist scholars acknowledge that all research is value-laden and never morally or politically neutral (Collins, 1994; Haraway, 1988; Mohanty, 2013). For example, in her work related to black feminist epistemology, Collins (2000) argues knowledge is produced by and through personal experiences as they are shaped by societal ideologies. As such, knowledge is subjective and partial, rather than objective and impartial. Moreover, developing knowledge of marginalized or subordinated groups requires more work than that of privileged groups. Collins (1989) explains, “because subordinate groups have long had to use alternative ways to create an independent consciousness and to rearticulate it through specialists validated by the oppressed themselves … one cannot use the same techniques to study the knowledge of the dominated as one uses to study the knowledge of the powerful” (751). Thus, Collins epistemically privileges experience, and in particular the standpoint of Black women, as knowledge and in doing so challenges conventional conceptualizations that position knowledge as separate or distinct from everyday, lived experiences. Furthermore, feminist research is not value-neutral or objective, as feminist researchers seek to produce transformative knowledge in the interest of social justice, while avoiding, to the greatest extent possible, the reproduction of inequalities (Ahmed, 2017; Collins, 1991, 1994). The emancipatory potential for research is “a goal we can move toward through reflexive methodology in the construction of knowledge” (Hesse-Biber and Piatelli, 2012: 574).
Feminist reflexivity
Feminist reflexivity is a method or practice wherein researchers engage in an ongoing process of critical reflection on the development and outcomes of knowledge production and is central to enacting and enhancing feminist ethics (Hesse-Biber and Piatelli, 2012). Reflexivity becomes a means to interrogate one’s own positions but also to critically analyze the ways in which power is operating in the research process (Ramazanoglu and Holland, 2002). Feminist reflexivity often has been applied to qualitative research involving direct engagement and inquiry into participants’ lives, perspectives, and experiences. In this research, feminist reflexivity often focuses on the positionality of the researcher vis-à-vis the researched. For example, we might identify that we are white (heterosexual, middle-to-upper-middle class, cis-gender) women in the academy studying a hashtag event about domestic violence initiated by a black woman who had herself experienced domestic violence. While such identifications are important in recognizing the partial positions from which the research takes place, reflexivity also requires a critical interrogation of how power is operating in the research process. Feminist reflexivity typically involves considerations of power dynamics between the researcher and research communities. These considerations include a critical examination of how power dynamics shape the research context and knowledge production (Islam, 2000; Kahn, 2005). More importantly for the focus of this paper, we wish to extend the feminist concept and practice of reflexivity beyond considerations of the positionality of the researcher and researched towards a “holistic reflexivity” as it pertains to the positionality of the research itself vis-à-vis larger institutional contexts. Thus, we advocate for a feminist holistic reflexivity that interrogates how power also operates in and through institutions in ways that enable or constrain the research process.
Feminist holistic reflexivity
Feminist holistic reflexivity acknowledges power dynamics while also recognizing that power shapes the research process in ways that often cannot be fully anticipated or addressed by the researcher. The feminist practice of holistic reflexivity “exposes the exercise of power throughout the entire research process. It questions authority of knowledge and opens up the possibility for negotiating knowledge claims and introducing counterhegemonic narratives, as well as holding researchers accountable to those with whom they research” (Hesse-Biber and Piatelli, 2012: 559). Moreover, it also “requires attentiveness to how the structural, political, and cultural environments of the researcher and participants and the nature of the study affect the research process and product” (2012: 560). As such, it necessitates researchers’ attention toward power dynamics as they manifest both within and outside of the context of research.
The ways in which we engaged feminist holistic reflexivity were informed by the work of Collins (1994), who emphasized lived experience as the starting place for knowledge and the values of dialogue, care, and personal accountability. In her discussion of Black feminist epistemology, Collins centers dialogue as an important component to assessing and validating knowledge claims. Dialogue, in this context, refers to “talk between two subjects, not the speech of subject and object. It is a humanizing speech, one that challenges and resists domination” (hooks, 1989: 131). Collins asserts that new knowledge claims are rarely made in isolation, and instead, connectedness, rather than separation, is an essential component of knowledge validation. Collins also calls for an “ethic of caring,” one that recognizes the appropriateness of emotion and fosters empathy in dialogue. Given the connectedness between people, this approach lastly emphasizes personal accountability on the part of the researcher (Collins, 1994). Researchers thus bear “moral responsibility for their politics and practices” and therefore should be attentive to issues of power and potential harms in the research process (Ramazanoglu and Holland, 2002: 14).
In practice, the concepts above informed how we enacted reflexivity throughout the research process, shaping the questions we asked ourselves and the ethical issues we sought to interrogate. We relied on a number of reflexive methods to do so, including but not limited to individual journaling, collective responses to reflection questions, and recorded group discussions during weekly research team meetings (see: Linabary et al., 2017). The sections below outline our navigation and negotiation of the ethical concerns that emerged in the context of our research.
Doing feminist Big Data social media research
We discovered that conducting Big Data social media research from a feminist perspective is fraught with ethical concerns. Access to Twitter content (i.e., tweets), for example, is controlled by companies that privilege corporate, governmental, or private research entities interested in extracting Big Data, often towards capitalist gains (e.g., the use of Big Data to create algorithms designed to market products, goods, and services to online users; Hoffman and Jonas, 2017; Parks, 2014). Additionally, within the academic context, university-based Institutional Review Boards’ (IRB) regulative practices and slow adaptation to emerging online ethical dilemmas present challenges for feminist Big Data researchers who are attuned to power dynamics and the potential for exploitation of vulnerable populations (Linabary and Corple, 2018).
Power, access, and control
Following the growing voices within Critical Data Studies, we recognize that “data is a form of power” (Iliadis and Russo, 2016: 1). Corporate control of Big Data not only restricts access to and analysis of data for researchers, but it has larger sociocultural effects, particularly for marginalized populations (Hoffman and Jonas, 2017; Noble and Tynes, 2016). Scholars have described this disparity in data access as the “Big Data divide” (Andrejevic, 2014). Unlike the “digital divide,” which refers to the lack of access to digital technologies such as computers and the internet, the “Big Data divide” refers to the split between those who are able to access Big Data, the “data rich” such as marketers and corporations, and those who lack access, or the “data poor,” such as activists or researchers with little funding (boyd and Crawford, 2012). The Big Data divide identifies the vulnerability of individuals who lack access and rights related to the information collected on their very experiences. This power imbalance enables the data rich to engage in “surveillance as social sorting”, which can serve as a “powerful means of creating and reinforcing long-term [or newly generated] social differences” (Lyon, 2002: i).
Most large corporations exercise a great degree of control over their content, maintaining no obligation to make it available or inexpensive. When content is made available, it is at cost and usually sold to those with similar corporate interests such as marketing and advertising firms. Moreover, many companies are likely to refuse access when the interests of those requesting content conflict with interests of the company and/or their partners (Parks, 2014). Given these barriers, many researchers have turned to “public” sources of content, such as social media. However, even access to ostensibly public data privileges high-paying corporate bodies. For example, very few researchers have access to the Twitter “firehose,” the stream of nearly all public tweets. Boyd and Crawford (2012) compare researchers’ access to that of a “garden hose” (roughly 10% of public tweets) or a “spritzer” (roughly 1% of public tweets) (669). Feminists and other critical researchers are doubly disadvantaged as their projects are “often underfunded, not well understood, and undervalued” (Cheek, 2008: 50), making the acquisition of expensive data sets even more challenging.
These issues are particularly salient for feminist researchers concerned with matters of power and voice. First, social media provides an important space for marginalized voices to be heard (Brown Givens, 2014). It can provide a platform to speak truth to power or mobilize movements (Jackson, 2016; Loza, 2014). Yet, if feminist researchers are unable to access social media content in the same ways that corporate entities are, then the social justice potential of social media research is significantly constrained. Corporate interests often conflict with social justice interests, potentially excluding research that examines the potential for social media to advance social justice. Corporate influence is present even in recent efforts to make social media data available for social scientists studying issues such as democracy and election processes. In the case of Facebook’s newly formed Social Science Research Council, the company shares power with its team of social science experts to decide which projects merit access to Facebook user data (Diep, 2018). This is an especially important consideration to address, particularly given the ways social movements emerge, grow, and thrive via social media; both the Arab Spring and Black Lives Matter movements are examples of this (e.g., Freelon et al., 2018; Howard and Hussain, 2013; Newsom and Lengel, 2012).
Second, without change in the current Big Data ethical paradigm, existing practices can further silence or harm marginalized groups (Hoffman and Jonas, 2017; Noble and Tynes, 2016). For example, “big” social media data is often perceived as “whole” data, despite how some populations are absent or less visible in online spaces (boyd and Crawford, 2012). As Kennedy and Moss (2015) assert, “The publics generated through data are contestable: they do not simply mirror publics “out there”, but rather are constructed in particular and partial ways” (4). Furthermore, this is exacerbated as much social media research highlights the actions and implications of social media “influencers” or “opinion leaders” (e.g., Cha et al., 2010; Xu et al., 2014). Although these studies provide insight into how information is spread online, these projects emphasize the contributions of those with power in online space. As online influencers are often those with influence offline, such as political elites, media outlets, and celebrities (Dubois and Gaffney, 2014; Xu et al., 2014), these approaches rarely attend to the voices of those with less social power online and off.
As a result, corporate control of big social media data can create perceptions of culture that are then rolled into larger “national geodemographic systems that in turn provide postcode-level analysis of people’s tastes and preferences” (Beer and Burrows, 2013: 59) that come to shape systems of surveillance and control. Turow (2012) highlights how advertising firms use individuals’ social media data to calculate their marketing value, and to characterize individuals as target or waste. However, a discourse of technological determinism spurred by corporate interest often obscures these negative sociocultural ramifications (Trottier, 2012). When potential negative outcomes are discussed, they are often dismissed due to the seemingly inevitable social good that results from increased collection and analysis of Big Data. Yet, when academic researchers cannot gain access to this data given corporate control and expense, the ability to critique or change these practices is constrained.
Feminist ethics for Big Data research: Navigating power, access and control
We embarked on this study in spring of 2015, several months after the #WhyIStayed hashtag event had peaked. The hashtag #WhyIStayed was created by author and domestic violence survivor, Beverly Gooden, in response to the victim-blaming of Janay Rice, who was brutally beaten in an elevator by her then-fiancé, former National Football League running back Ray Rice. The security tape of the abuse was released on TMZ, which precipitated discussions online and in mainstream media regarding why she (and other victims of domestic violence) stayed in the relationship (Janay and Ray married shortly after the abuse occurred). Many who were tweeting and following the trending hashtag #WhyIStayed and the corresponding hashtag #WhyILeft were victims/survivors themselves, sharing deeply personal experiences of domestic violence.
When our research team set out to access the tweets from September 2014 (when the hashtag emerged and was trending) that used #WhyIStayed or #WhyILeft, we had few means by which to access the full content. As with many social media platforms, Twitter does not maintain its full historical archives in a publicly accessible manner. While anyone can capture Twitter content in real time using its Streaming API, this requires researchers to be able to anticipate a trending hashtag prior to its emergence in social media. There were some options available for accessing archived Twitter content, yet these options carried further limitations. For the content we could access, we were unable to discern the algorithms used to capture the Twitter content or the parameters used by the company distributing the content. As such, we were concerned about the unknowns of what (and who) was included or excluded. Other Big Data researchers regularly use the “available” data via the Twitter APIs (Gaffney and Puschmann, 2014), simply noting the parameters and indicating the inherent restrictions of using these services.
Yet, such an approach did not resonate with us as feminist researchers who center subjugated knowledge in our research. As feminists, we hope to avoid reproducing power inequities by further silencing the experiences of domestic violence victims/survivors that are typically marginalized. Moreover, in our group discussions we agreed that such an approach felt lacking, particularly given the focus of our research: an examination of #WhyIStayed, a hashtag event that was intended to counter the victim-blaming directed at Janay Rice and domestic violence victims/survivors in general. By using a partial set of the data, we feared that we would inadvertently silence the experiences of those who tweeted, many of whom did so to challenge societal myths regarding domestic violence or to share their personal experiences of domestic violence. This was not simply a matter of our Big Data no longer being “big” if we used a partial data set. As our feminist ethics made salient the power differentials between researcher and researched, the centering of subjugated knowledge (Collins, 1991; hooks, 1989), and the goal of social justice, we questioned what it meant to have a partial data set whose parameters for collection were unknown. Through our group discussions and collective journaling, where we frequently struggled over and processed implications of our potential methodological decisions, we were unable to resolve this ethical dilemma in a way that resonated with our feminist ethics and commitments. Thus, we decided this was not, for us, a desirable or even possible approach to accessing the data.
Moreover, of the options available to us, none were affordable, as we did not have an external funding source for this project. After many deliberations during our group discussions, we decided to purchase (using small intramural research support obtained through our university) access to the full Twitter firehose through a data management company called Gnip, which is owned by Twitter. Providing access to Twitter data, Gnip regularly contracts with large companies and corporations. A sales representative informed us that our request, which was quoted with a price of $1,000 USD, was at the bottom of the company’s cost threshold. After several months of back and forth and delayed responses, we were surprised when we received a lengthy legal contract rather than a standard sales invoice. Wary of signing a legal contract without counsel, the first author consulted with legal representatives for our university and was advised not to sign the contract unless red-line changes (changes to the contract made by the university legal team) were approved by Gnip. Following this legal counsel, we sent Gnip the red-line changes recommended by the university’s legal department. Gnip’s sales representative responded that Gnip does not make red-line changes for contracts under $250,000 USD.
Because the university encouraged edits to the contract prior to signing, but our low-cost order did not qualify for changes, we were at an impasse. We had spent months locating a way in which we could access the full data set, then several additional months negotiating with Gnip and corresponding with the university. At this point, our group discussions were quite despondent as our team seriously considered abandoning the research. We had encountered so many barriers to accessing the full content that in our group discussions we were convinced the project was untenable, primarily because we could not adequately resolve the tension between our feminist ethics and the lack of access. Other Big Data researchers may not even have encountered these barriers, for example, if they were affiliated with a corporate entity such as a marketing firm with financial resources rather than a university, or if their research methodology was such that using a partial data set did not pose an ethical dilemma. Yet, as feminist researchers studying marginalized groups, we were confronted with these challenges to access as a result. In other words, our feminist ethics (specifically the attention toward power dynamics and interest in subjugated knowledges) produced the very conditions by which we were confronted with seemingly insurmountable barriers to accessing the Big Data we required to embark on the project.
In a serendipitous turn of events, the second author discovered an organization focused on distributing data and producing analytics with the objective to end domestic violence. After we talked with the founder of this organization, they agreed to provide access to the entire #WhyIStayed Twitter content, which they had collected using their own resources. As a result of their generosity and support, we were able to move forward with the procedures and design of our study. Although navigating this challenge appears to have ended positively in terms of our ability to gain access to the full set of tweets we needed to embark on this project, it is also a happenstance ending. Without the unlikely discovery of a data company committed to eradicating domestic violence, we would have felt compelled to abandon the project as it was originally designed.
The most significant barrier to our project was cost. Large companies purchase access to online content for marketing reasons, enabling data management companies to set high prices. This raises important questions not only related to who can conduct social media research, but more importantly, who owns the experiences and narratives we as individuals share on social media and other online networking platforms. Social media ostensibly creates accessibility with regard to examining and understanding social phenomena, yet this requires technical know-how, monetary resources, and time. These factors present constraints as to what can be studied (and therefore what can be known), by whom, and for what purpose.
IRBs and the missing discourse of feminist ethics
Corporations are not the only institutions presenting challenges to Big Data social media research. University-based IRBs' regulative practices and slow adaptation to emerging online ethical dilemmas create challenges to Big Data social media research. Over the last 30 years in the United States, the effect of IRBs and ethical oversight has produced a context in which research ethics have shifted from being principally a moral discourse to a regulatory discourse (Bell, 2014; Miller and Boulton, 2007). Discourses of research ethics have shifted from considerations of what researchers ought to do in terms of morals or rules of conduct to an “ideology and instrument of governmentality” that establishes policies, practices, and administrative infrastructures that emphasize compliance, control, and surveillance (Halse and Honey, 2007). The emphasis on compliance reflects increasing concerns about university liability and risk, particularly as research efforts extend beyond national borders (Bell, 2014). The reality then becomes the protection of the university as an institution rather than as a collection of individuals (Christians, 2000). Some have suggested expansions of the IRB on US college campuses have led to “mission creep,” or an overextension of IRBs’ power where an emphasis on compliance creates a focus on superficial technicalities to the detriment of larger ethical considerations (Gunsalus et al., 2006). Specifically, feminist scholars such as Bell (2014) have argued, “researchers and those who control their activities are becoming much more focused on regulating research and are perhaps in danger of being less concerned about dealing with issues like power, which underpin feminist research agendas” (77). As a result, discourses of research ethics articulated in IRB policy are often disconnected from the ethical considerations of actual researchers and do not fully consider the marginalized individuals whom the research impacts (Halse and Honey, 2007).
Researchers who do not adopt a positivist epistemology often struggle to make the ethical considerations of their research intelligible to review boards given most IRB policies and procedures are based on a biomedical model of research (Metcalf and Crawford, 2016). IRB evaluation of non-biomedical research often does not account for differing epistemological orientations; for example, requiring researchers to include their “hypothesis” in the description of the project. Researchers may ultimately avoid research endeavors out of concern it will be rejected by an IRB or that they may not be able to conduct the study in a way that conforms to their own moral and ethical practices. For example, Halse and Honey (2005) discuss their own delays and discomfort in seeking IRB approval on a funded project because of concerns about how they would be required to describe the study’s population. In this case, increasing pressures from funders and colleagues led to compromise and compliance that still left them “uneasy and uncomfortable” (2148). Through their experiences, researchers may learn to play the game by deploying “ethics-speak,” describing their projects in terms that are more acceptable to an institution designed to evaluate biomedical research (Halse and Honey, 2007). This encourages researchers to find ways to circumvent IRB or avoid it out of fear of rejection.
Under current US federal definitions, many forms of online research involving existing data sets may not even be considered human subjects research, requiring no IRB review or approval (Buchanan, 2011; Metcalf and Crawford, 2016). Twitter content, like many other forms of online data, is generally considered public and as such does not typically require IRB approval. However, the lines of what is considered public and private are blurred, particularly in social media, creating ethical concerns about privacy that IRB policies may not adequately address (Linabary and Corple, 2018). For example, while Twitter content is publicly accessible, we question whether Twitter users recognize that what is public can be used by researchers in ways that may not have been intended by the individual who posted the content (Schmidt, 2014). Moreover, relatively permanent online content seems to differ in important ways from offline observations of public behavior, which are dependent upon a researcher’s interpretive lens. The latter does not require informed consent from participants; we argue the former should. While individuals may recognize their Twitter content as public, we question whether this alone constitutes informed consent for researchers, as researchers are not the typical intended audience for a social media user. Unfortunately, IRBs provide little ethical guidance for researchers on this issue. Thus, the onus falls on researchers themselves as decision-makers to consider the ethical implications of their projects and to subsequently self-regulate. While we are not advocating for unnecessary increased oversight by IRBs, we hope to raise ethical considerations regarding how to ensure the protection, safety, respect, and dignity of those whose lives we seek to study, given the assumptions currently operating regarding Big Data social media research.
Articulating a discourse of feminist ethics: Navigating IRBs while feminist
At the time we began this study, the IRB at our institution considered non-private social media posts and accounts as public domain. Thus, research using social media as a source of data was treated as observational and as such did not require IRB approval. Given our feminist commitments, specifically sensitivity to issues of power in the production of knowledge (Collins, 1989, 1994), and concerns regarding the potential vulnerability of domestic violence victims/survivors who tweeted using the hashtag, we wished to protect, to the extent possible, the identities of those whom we were studying. As such we sought guidance from IRB on how we might do so. During these exchanges with IRB, while negotiating the specificities of the risks and benefits of accessing and analyzing the tweets about domestic violence or authored by domestic abuse victims/survivors, we as a research team lamented the lack of guidance and insight regarding these specific dilemmas. There was little assistance or guidance offered by our IRB in the way of protecting the participants (i.e., Twitter users) or strategies on whether and how to obtain informed consent in the context of Big Data social media research. As such, we turned to the literature on feminist ethics for guidance. However, this was somewhat limited in that few publications, particularly at the time we embarked on this project, directly engaged with this specific methodological challenge in conducting feminist Big Data social media research. Ultimately, we decided to submit an application to IRB, even though this was not required, in hopes we would have at least some institutional oversight and ethical consideration of our project. The application was reviewed and approved, yet we are unaware of any impact or change in IRB’s approach to social media research as a result of our consultation and/or application.
Informed by feminist ethics and sensitive to the lived realities of domestic violence victims/survivors and power dynamics between the researchers and researched, several concerns emerged in our group discussions and collective journaling with the conceptualization of Twitter content as “public” and therefore not subject to ethical consideration as determined by IRBs. First, as we learned from the commentary of feminist scholars and activists, and in particular the women of color who authored #ThisTweetCalledMyBack, the uncritical assumption that Twitter content is perceived as “public” by users and therefore can be mined by scholars for research purposes can open up those who are marginalized to exploitation, co-option, surveillance, and violence. They write, following their social media boycott, “our inboxes now sit full with those people requesting to know where we have gone and what we are doing. Don’t we know they have dissertations to write?” (Collected Authors, 2014). This quote offers a challenge to the exploitation of the digital labor of women of color by those who may not share similar activist commitments or be invested in similar social justice projects—those who may be interested in the digital labor of women of color as a means for personal/professional advancement rather than for advancing the political cause of critiquing gender violence and racial hierarchy. The use of social media content as a form of data then raises questions regarding the intention of those who post content. While people have an understanding that when they use social media it is publicly accessible, whether they intend or wish for that content to be used for purposes beyond what they intended is indeed questionable.
Second, we recognize the sensitive nature of this particular hashtag. As such, we wished to be attentive to the power dynamics between those who responded to #WhyIStayed and ourselves as researchers. The topic of the tweets using the hashtag (tweets that were often posted by self-identified victims of domestic violence) raised additional layers of concern regarding the potential risks to those who tweeted. We had concerns that if an individual’s involvement with the hashtag was discovered by an abusive partner, it could place her or him at further risk of mental, physical, or emotional abuse. Additionally, in our weekly research meetings, we struggled with whether and how we could ensure ethical standards of “confidentiality” if we were to describe or quote tweets of participants given the ease with which online search engines would allow one to identify an author of a tweet. The option of not describing or quoting the tweets was also not a desirable solution given our feminist commitments to marginalized communities and perspectives, as well as the standards of publishing qualitative research that often demand rich description and direct quotes to support analysis and theoretical interpretations. Additionally, given the pervasive social stigma surrounding domestic violence, exposed individuals could be subject to other social or emotional harms (Fontes, 2004). As Markham and Buchanan (2012) state, “the greater the vulnerability of the community/author/participant, the greater the obligation of the researcher to protect the community/author/participant” (4). Therefore, in studying the hashtag we wished to anticipate and seek to minimize these risks to the extent possible.
Although Twitter content is considered by the IRB to be “public” and thus research utilizing Twitter does not require institutional review, our feminist commitments described above informed our decision to submit an IRB protocol application regardless of whether it was required by the institution. As part of our application, we wished to demonstrate we had approval from a key stakeholder to conduct the research, similar to the process a researcher might follow when conducting observations or interviews in offline contexts. We contacted the originator of the hashtag, Gooden, as a key “stakeholder,” to solicit her input and permission to proceed with the project. Moreover, informed by our feminist ethics and sensitive to the power dynamics voiced by the authors of #ThisTweetCalledMyBack discussed earlier in this paper, we indicated our intent to be respectful of the efforts that went into creating the hashtag and that we were attentive to the sensitivities of the context of the event (e.g., discussions of the lived experiences and perspectives of victims/survivors). Furthermore, we acknowledged the possibility that other researchers may have already contacted her, so we communicated our desire not to create an additional burden by soliciting her support. We explained that we were seeking her permission for our project and that any further involvement was welcome should she wish it. Fortunately, Gooden responded affirmatively, indicating we could use her tweets and articles to further our research efforts. We included our initial contact letter and her response in our IRB application, in a way mirroring the efforts of other researchers who seek permission from key stakeholders to conduct research prior to initiating data collection.
In our IRB application, we also described the content within the data set and our reasons for seeking IRB approval despite the public nature of Twitter content and the fact that such approval was not deemed necessary. We were interested in studying social media specifically because it provides platforms for individuals historically silenced in academic research and in the wider society. In the case of our research related to domestic violence victims/survivors, we considered aggregating data as it would potentially alleviate the risk of their identification. However, we feared silencing the experiences of these particular users. This was of particular importance not only given our feminist commitments to subjugated knowledges, but also given that, in the larger study, the object of inquiry centers on narratives of domestic violence, many of which were authored by domestic violence victims/survivors. Victims/survivors are often silenced by their abusers (through threats of violence or actual violence, gaslighting, and isolation), as well as by a culture that engages in and promotes victim-blaming, and a criminal justice system that is not well-equipped to address the complexities of domestic violence (McDermott and Garofalo, 2004; Mills, 1999). Our feminist ethics sensitized us to the very real possibility that by utilizing Big Data, our study might inadvertently produce another form of silencing victims/survivors, which is the power dynamic we had hoped to help address, not reproduce. While we wanted to avoid silencing the voices expressed in the tweets, we also were sensitive to the vulnerability of some social media users and the potential risks if we were to directly quote a tweet, for example.
To address concerns regarding confidentiality, we indicated in our IRB application that tweets would be de-identified and data would be reported in aggregate as much as possible. Moreover, any information regarding victim/survivor status would only be determined by the tweet itself and not based on further investigation on the part of the researchers. Furthermore, we conducted semi-structured interviews with individuals who participated in #WhyIStayed/#WhyILeft. This was in part to understand the meanings of the hashtag to those who participated as well as to explore some of our ethical concerns in conversation with participants. Based on our approach to feminist holistic reflexivity, we included questions in our interview guide that solicited input from interview participants regarding their concerns and suggestions related to representing tweets. From these interviews, we gained important insights that have informed how we have chosen to represent the tweets in any writing regarding our empirical analysis (which we discuss elsewhere). All of our interview participants had experienced domestic violence. Many did not use their real name or identifying information online or kept their social media profiles set to private. Several participants identified similar tensions regarding victims/survivors being able to give voice to their experiences and the desire to ensure confidentiality. Moreover, they acknowledged that such representation might expose individuals to further harm or risk, either by their abusers or by others if their identities as domestic violence victims/survivors are disclosed without their knowledge or consent, and under circumstances over which they do not have control. Given these considerations and conversations, and the difficulty of de-identifying public tweets, we opted to represent tweets within any writing produced from the study in summative and composite form, rather than directly quoting the tweets themselves.
None of these efforts or commitments were required by the IRB, nor were they offered by IRB as possible solutions to our ethical dilemmas when we solicited input. While efforts have been made by organizations like the Association of Internet Researchers to offer guidance to researchers and institutions, IRB procedures and regulations have not yet adapted to these emerging guidelines. Therefore, IRBs often have little understanding of how to guide researchers or to evaluate such projects (Linabary and Corple, 2018; Metcalf and Crawford, 2016). Rather, it was feminist ethics and our practice of feminist holistic reflexivity that shaped our navigation of these dilemmas and the strategies we employed. We found it necessary in proceeding with a social media project on domestic violence to not simply adhere to regulatory norms.
Discussion
This paper draws attention to the importance of feminist ethics for Big Data social media research, as it is sensitive to issues of power, context, and subjugated knowledges, each of which we argue must be central considerations. We outlined several challenges to pursuing feminist research using big social media data—specifically, those posed by companies that “own” and control access to this data as well as IRBs’ regulatory discourses that may offer little guidance for researchers using social media data. We argue power, context, and subjugated knowledges must each be central considerations in conducting Big Data social media research. In doing so, this paper offers a feminist practice of holistic reflexivity in order to help social media researchers navigate and negotiate this terrain.
We have drawn on our practices of feminist holistic reflexivity to shed light on the institutional structures that presented dilemmas for our Big Data project involving social media content. Feminist reflexivity has a long tradition within feminist methodologies; however, feminist reflexivity is typically understood and enacted in qualitative research through the social context of personal interaction between researchers and participants (Collins, 1994; Harding, 1987). Technological advances and the growing field of Big Data research present the need for the development of a feminist holistic reflexivity, particularly as Big Data research brings to the fore the institutional power at play within the research process. As we have demonstrated in this paper, power dynamics manifest in distinct ways, between the researcher and the researched, as well as the researcher and the institution. Thus, we argue that researchers should not “take for granted” the institutional structures and norms at play in Big Data social media research but interrogate and engage the ways that corporate and institutional power shapes access to data, the (in)visibility and well-being of the researched, and how particular knowledges are produced.
As researchers committed to social justice, the ability to seek subjugated knowledge—particularly that from marginalized groups which are too frequently silenced in mainstream outlets—is critical to the production of knowledges that can potentially address oppression and injustice (Collins, 1994; hooks, 2000). Social media, for example, provides a powerful platform for users to share experiences, to expand networks, to mobilize communities around shared goals, and to challenge powerful individuals and institutions. Yet, if we as feminist researchers are unable to draw upon people’s lived experiences given the inaccessibility of big social media data, we miss an opportunity to develop knowledge, particularly as it relates to the ways in which social media operates as a space for social justice activism. The lived experiences of those historically marginalized in research, in institutional settings such as the criminal justice and welfare systems, as well as in advocacy efforts such as those by social workers and non-profits, offer important insights into the possibilities and limitations of social media activism.
Informed by feminist holistic reflexivity, we are also deeply concerned with the ways in which our lived experiences, our social realities, our labor as expressed through our posts, our “likes,” and our tweets are commodified through the process of algorithmic data mining which results in what we call the colonization of the self. The “colonization of the self” refers to the ways in which social and online media corporations are able to access and excavate the digital representations of the self through our posts, “likes,” tweets and retweets, internet searches, Instagram photos, etc. This access is granted by users as part of the “uses and terms,” yet the choice to allow access is not made freely; to deny such access results in our inability to use these digital platforms. In other words, there is no option to reject the uses and terms and still have access to an application or platform. What we post online represents who we are, our thoughts, our experiences, our lives. Yet, it becomes “data” that corporations are then able to control, determining by whom and for what this data may be used. As this paper demonstrates, the corporate control of online data, and in particular social media data, raises critical questions regarding knowledge production itself. If this data is accessible primarily to corporations and large marketing firms, which can afford the exorbitant costs associated with purchasing access to such “data” (i.e., our lived experiences), this impacts what types of questions will be asked and who will benefit from that research. As such, it is imperative for Big Data social media researchers to engage the principles of feminist ethics, including feminist holistic reflexivity, as articulated in this paper.
Furthermore, engaging in feminist holistic reflexivity means that, despite the “public” nature of Twitter data, researchers should not simply collate and analyze tweets, and disseminate this analysis without engaging with members from online communities. We recognize the limitations of regulatory discourses as applied to online spaces wherein informed consent is expected to be obtained from all participants in a study. Adherence to this regulatory norm is not always feasible or possible when using big social media data, and as we noted above, may cause undue harm or risk when studying vulnerable populations such as victims/survivors of domestic violence. To address this challenge, our larger project includes semi-structured interviews with Twitter users who tweeted using the #WhyIStayed hashtag, which allow participants to engage with the research team in navigating this specific ethical dilemma. Interviews, beyond providing opportunities for producing “thick data” (Wang, 2013) through more nuanced understandings of participants’ perspectives and experiences within big social media data sets, represent one possible tool for engaging in dialogue with those affected by the research to understand potential risks and vulnerabilities and inform ethical decision-making (Linabary and Corple, 2018). Given how Big Data research problematizes traditional approaches to ensuring ethical research, we encourage researchers to further consider how the practice of feminist holistic reflexivity can inform and address the ethical challenges faced in online Big Data research, particularly that involving social media.
In engaging in reflexive praxis related to Big Data social media research, we encourage researchers to consider the following questions. With regard to knowledge production, Big Data social media researchers should ask: What gets researched and who does it represent? Who gets to conduct research and in whose interests? What knowledges are even possible to be produced if access and content are controlled by corporate entities? With regard to methodology, Big Data social media researchers should consider the following concerns: What are the consequences of removing data from the context intended by the user? What symbolic violence might we enact if as researchers we simply adhere to the regulatory norms of IRBs? Who is most vulnerable to that violence?
Conclusion
It is our goal that in outlining the feminist ethical dilemmas we faced and sharing our negotiations and navigations through the relatively uncharted terrain of feminist Big Data social media research, we offer future researchers a starting point, providing insight and guidance on what challenges to anticipate and possible ways to navigate those challenges. Specifically, informed by our feminist methodological commitments to disrupt power dynamics and recognize subjugated knowledges as a starting point for knowledge creation, we advocate for the use of feminist holistic reflexivity to interrogate power throughout the research process, to engage in questions regarding authority in the production of knowledge, and to consider dialogue with institutional structures and the research population. As we demonstrated, feminist ethics are especially needed when drawing on subjugated knowledge, lived experiences, and other information for research purposes, even when these are shared in publicly accessible social media spaces, as many users do not expect or anticipate their online lives to be used by researchers. While our research involving victims/survivors of domestic violence brings issues of vulnerability and risk, power dynamics, and social justice to the fore, we argue that feminist holistic reflexivity is an important practice that can aid Big Data social media researchers writ large.
Acknowledgments
The authors would like to thank the editors and anonymous reviewers for their insightful comments and constructive feedback. The authors would also like to thank Susan Scrupski and Big Mountain Data for providing access to Twitter content without which our research would not be possible. Thank you to Emily Fogle for her research assistance. We wish to acknowledge the labor of Beverly Gooden, author and originator of #WhyIStayed, and all those who tweeted their stories and shared their lived experiences with domestic violence. This labor created an important conversation about the complexities of domestic violence. A special thank you to those individuals who agreed to be interviewed and shared with us their stories in hopes that it would help other domestic violence victims/survivors.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
