Abstract
Using user-generated content from open-access platforms such as Reddit for research raises ethical questions and challenges. Research projects involving publicly available data can qualify for an exemption from human research ethics review. However, when the exemption is granted, some scholars move to the data collection phase without attending further to ethical considerations. This does not always result from negligence but can be driven by the lack of coherent guidelines or limitations of procedural ethics. Despite receiving an exemption from ethics review, researchers can still engage with ethical concerns throughout the project. This article argues that a “situated ethics approach” to researching publicly available online data, which pays attention to flexibility, reflexivity, and complexity of research ethics, should be applied to projects working with data from user-led platforms—Reddit or others. Using a reflexive process and drawing iteratively on learnings, this article describes and analyses a situated ethics framework applied to a case study of doctoral research about youth health discussions on Reddit. Through a focus on three key areas: digital context, users’ views, and project specificity, the framework inspired a set of ethical questions that can assist with applying situated ethics to other studies. This paper advocates that a “situated ethics approach” to researching publicly available online data can usefully advance debates and practice in research on user-led platforms with public data, such as Reddit.
Introduction
A key tension in the ongoing debate around internet research ethics concerns the use of publicly available user-generated data. There is a problematic trend among some researchers who, by labeling the data as publicly available, waive their obligation to address ethical concerns. They either seek exemption from human ethics review and abandon ethical deliberations once it is granted or self-define publicly available data as validating the lack of further attention to ethics (Stommel & de Rijk, 2021). Avoidance of ethical responsibilities can result in potential harm and exploitation of the users (Ess, 2020). Not engaging with internet research ethics is not always a matter of choice or negligence but can stem from a lack of awareness about situated ethics and platform-specific guidelines.
When engaging with ethical considerations, internet research scholars often follow guidelines and practices from other internet research studies or committees’ procedures, which emphasize procedural ethics (Roberts, 2015). However, focusing on the procedural approach and aligning with ethics committees’ expectations can often overshadow the research’s processual and situated nature. Overall, there is a significant need for ongoing discussions of research ethics involving publicly available user-generated data, for example, from platforms such as Reddit.
In a systematic review of Reddit studies, Proferes et al. (2021) investigated ethical approaches concerning issues of informed consent, de-identifying data, and dataset sharing. They found that most papers (86%) about Reddit do not explicitly discuss ethical considerations, and if they do, it is often only to state an exemption from ethics review, sought formally or self-decided. Even if ethical practices occur, there is minimal discussion about ethics in the papers. Moreover, Reddit data, when considered “public,” is often not de-identified for publication. Proferes et al. (2021, p. 21) found that 30% of studies used direct quotations, and 10% mentioned usernames. Such practices pose ethical dilemmas as they can contribute to tracing back Reddit users’ posts and profiles. Proferes et al. (2021) highlight the inadequacy of simply defining Reddit as “public” and advocate instead for a focus on the ethical implications of individual research. Building upon this, I argue that even if data are publicly available, and even if studies are exempt from formal ethics review, it is necessary to consider an alternative and dynamic approach. Moving beyond procedural ethics can be done by assembling strategies to surface, explore and respond to ethical questions throughout a project. Thus, another way to engage with internet research ethics is to consider a situated ethics approach guided by the particular project and researchers’ reflexivity.
The research question I ask is how situated ethics principles can guide researchers working with Reddit data. This article responds to the need to inform nuanced discussion about the ethics of Reddit research. To assist researchers with ethical challenges, I propose a situated ethics framework for Reddit. I provide questions about the platform, users, and projects that can guide researchers’ ethical approach. The proposed framework is a theoretical addition to the call for situatedness in internet research ethics. The framework builds on the existing scholarship from many prominent internet research ethics scholars (e.g., Markham, 2018; Zimmer & Kinder-Kurlanda, 2017) and internet research guidelines (e.g., Franzke et al., 2020; Markham & Buchanan, 2012; The National Committee for Research Ethics in the Social Sciences and the Humanities, 2019) that advocate for a situated ethics approach.
Therefore, this article presents the framework in practice, as it emerged from a specific research case: my PhD study of online discussions about youth health matters on Reddit. My research received an exemption from the Human Research Ethics Committee as complying with the formal requirements of public and nonidentifiable data. However, requesting an exemption was not the end of engagement with the project’s ethical considerations. My overall approach to research ethics was guided dynamically by multiple considerations: the digital research ethics literature, national privacy and human research ethics laws, ethics committees’ policies, the existing scholarship on Reddit, Reddit users’ recommendations, and consultations with an advisory Wellbeing Health & Youth Commission.
In keeping with the deeply reflexive approach of the situated ethics framework, I present the analysis and discussion in the first person, referring to how I explore and attend to the ethical dilemmas in this research. First, I discuss the procedural ethics protocols and their criticism and describe situated ethics. Then, I outline the situated ethics framework for researching Reddit. I later formulate ethical questions relevant to future studies involving Reddit and provide a detailed description of the application of the framework in my study. My conclusion reflects on the usability of the framework for other scholars.
Procedural Internet Research Ethics
Internet research ethics is dynamically evolving as new behaviors, phenomena, and platforms emerge (Buchanan, 2017). New technologies provide new challenges, researchers disagree on ethical dilemmas, and many ethics committees struggle with internet-based research. Different actors produce different ethical guidelines: for example, local, state, and international laws; internet research associations; university policies and ethics committees; platforms’ terms and conditions; cultural context; and researchers’ common sense (Sormanen & Lauk, 2016). Institutional ethics committees continue to play a significant role as gatekeepers of ethical conduct (Whelan, 2018). Research ethics committees’ core functions are to protect human subjects; monitor biases and conflicts of interest; ensure research compliance with regulations; and create institution-wide standards (Grady, 2019). In the United States, current and future federal funding depends on committees’ review of a study’s ethicality (Powell, 2002). This means that the decisions of ethics bodies also have political and financial consequences.
Some sources of ethical requirements, like ethics committees or privacy/human research laws, incline more toward a procedural ethics approach to internet research. They expect preplanned and consistent conditions for all stages of the research process, whereas circumstances on the internet change dynamically (Webb et al., 2017). Risk-oriented procedural ethics is common in legal requirements for human research. For example, the Australian National Statement on Ethical Conduct in Human Research frames research in terms of risks and benefits, outlining that focusing on and resolving risks is essential for research to be “ethically acceptable” (National Health and Medical Research Council et al., 2018, p. 16). Procedural ethics has numerous limitations, including focusing on risk management rather than monitoring for and responding to unexpected ethical issues and the contextual forces that continuously shape specific projects.
Ethics guidelines and the one-size-fits-all approach pose various ethical and practical constraints (Steinmetz, 2012). “Procedural ethics” is more “rule-bound” than processual (Franzke et al., 2020, p. 6). Following procedural ethics, some researchers focus on “obtaining ethical clearance” (Roberts, 2015, p. 315) rather than actually engaging in ethical considerations. One of the shortcomings of procedural ethics is the aim of identifying all potential risks and ways to mitigate them before commencing research (Roberts, 2015). Procedural ethics reinforces the idea of the researcher’s ability to plan and govern ethics processes, ideally with standard procedures.
Markham’s (2018) critical account of dominant error-avoidance and concept-driven models of ethics in internet research challenges blueprint ethical practices. Markham (2018) presents cases in which seemingly protective decisions resulted in harm. First, she describes a study involving an eating disorder blogger who, after providing informed consent to study her blog, started describing more dangerous starvation practices than before the researcher’s presence. An error-avoidance model with routine informed consent failed, even though the researcher had sought it to ensure the subject’s best interests. The second example describes how the institutional framing of vulnerability imposed such a definition on a blogger who published her experiences of losing a daughter to suicide under her real-life name. She did not want to be considered vulnerable or be anonymous in her advocacy. In her case, a concept-driven model, which applied a pre-set definition of vulnerability and privacy, misaligned with the participant’s identification. Markham (2018) argues that looking at ethics as defined case-by-case prioritizes caring for the research’s impact, compared with applying ethical principles only to fit into the procedural requirements. Blueprint procedural ethics definitions can be challenged with the situated ethics approach. Situated ethics is not a “silver bullet” solution to address all ethical issues. Rather, it is nonlinear, open-ended and aims to balance the practicalities of conducting research with protecting internet users by making ethical decisions
Situated Ethics Principles
Ethics processes are fluid and full of uncertainties and ambiguities and require flexibility along the way (Beninger, 2016). A situated ethics approach engages with the advantages and limitations of research by observing its processes and addressing the complexities of the interrelations of people, platforms, and places (Collin et al., 2019). Situated ethics recognizes research as informed by user practices, platform cultures, and technological affordances (Corple & Linabary, 2020). Situated ethics also requires reflexivity on the researcher’s part and attentiveness to potential biases (Hine, 2015). Situated ethics is relational, as relationships provide contexts for ethical conduct, which Nissenbaum (2004) conceptualizes as contextual integrity. Moreover, situated ethics is non-linear, as researchers can face ethical issues as the process unravels (Roberts, 2015). Situated ethics also redefines the idea of avoiding “harm”. Harm in procedural ethics is defined as psychological, physical, social, economic, or legal distress, discomfort, or inconvenience (National Health and Medical Research Council et al., 2018). Through the lens of situated ethics, harm can differ in every community, platform, topic, and even individual (Hair & Clark, 2007). In situated ethics, the ethical judgment principle (Ess, 2020), with dialogue, process, and reflectivity of the researcher, allows for making situated judgments for each project.
My situated ethics framework is largely informed by internet researchers’ current interest in situated ethics. Anabo et al.’s (2019) review of current internet research guidelines such as the Association of Internet Researchers, British Sociological Association, British Psychological Society, and the National Committee for Research Ethics in the Social Sciences and the Humanities guidelines acknowledges that ethical decisions need to account for multiple scenarios specific to each project. Researchers may resort to, for example, adapting informed consent procedures (or waiving them and proposing different protective strategies) or thinking through a multifaceted approach to privacy, flexible definitions of harms and benefits, contextual expectations of privacy, and redefining vulnerability. Many internet researchers already apply situated ethics in practice, following “personal codes of ethics” (Vitak et al., 2016). Researchers can be capable of navigating ethical decision-making and finding the best solutions to their situated dilemmas, given the right tools.
More discussion on situated ethics is needed as internet research advances and diversifies. In the following section, I introduce the situated ethics framework developed to address Reddit complexities and possibilities. The framework arose inductively as I planned my ethics approach to my PhD study. In my project, I encountered numerous ethically important moments (Guillemin & Gillam, 2004), which oriented me away from procedural ethics toward a situated ethics approach. Although formulating a “framework” may seem to reinforce procedural ethics, my framework does not aim to be a strict and comprehensive model. Instead, the proposed framework, in line with the AoIR recommendations, follows the principles of “judgment calls” (Franzke et al., 2020, p. 6), which substantiate question-oriented guidelines. Formulated questions can serve as “prompts” for researchers to be more reflexive and outline broader “domains” to reflect on without being a step-by-step instruction for resolving ethical dilemmas.
In the next two sections, I will first present a high-level overview of the framework and later describe how the framework works in practice.
A Situated Ethics Framework for Reddit
Guided by the situated ethics approach principles outlined above, I propose a situated ethics framework for researching Reddit (Figure 1). The framework consists of three dimensions: digital context, users’ views and project specificity. These dimensions emerged through careful consideration of the ethical aspects of my PhD study. Guided by internet ethical research guidelines and scholarship, and other studies that dealt with similar dilemmas, I have organized my ethical considerations into three broader “domains”. The framework also rests on three principles: flexibility, reflexivity, and complexity. These principles have been formulated and supported by the situated ethics literature. The dimensions of digital context, users’ views and project specificity emerged through the flexible, reflexive, and complex thinking characteristic of situated ethics. Moreover, resolving ethical dilemmas pertaining to each of these dimensions also requires following situated ethics principles. The relationship between principles and dimensions is co-constitutive and constantly co-present.

Figure 1. A situated ethics framework for Reddit.
The first dimension of the framework focuses on the digital context, recognizing that a uniform approach to internet research ethics is largely untenable. Being reflexive and attuned to specific digital contexts helps surface the impact of particular affordances and digital cultures of specific platforms on ethicality. Such an approach contrasts with applying guidelines written for social media in general or following ethical guidance from studies of other platforms.
The dimension of users’ views includes indirect peer guidance through the literature review about users’ attitudes and ethical approaches used in similar studies, as well as direct consultations with the platform’s users and targeted group. The principles of care and respect can be reinforced by including voices from the users, even if research is conducted without direct contact with exact data contributors.
The last dimension—project specificity—accounts for the specifics of data and targeted groups. The data can guide us significantly, as many internet research projects form original methodologies. Similarly, research with particular groups has attuned expectations, limitations, and biases, which deserve specific attention. The researcher’s role is to think through different aspects of the project that may influence the study’s ethicality.
Three major principles that guided me through my research project were complexity, reflexivity, and flexibility. Even at the early stage of internet research, Annette Markham (2006, pp. 39–40) highlighted these as key to ethical research: “Online or off, an ethical researcher is one who is prepared, reflexive, flexible, adaptive, and honest. Methods are not simply applied out of habit but derived through constant, critical reflection on the goals of research and the research questions, sensitively adapted to the specificities of the context.”
Reflexivity is a key part of the qualitative research tradition. Being reflexive involves critical reflection on the researcher’s role in the research process, participants, and context; the influence of the researcher’s background on the research; and acknowledging potential limitations thereof (Guillemin & Gillam, 2004). In my process, I have constantly reflected on my experience and positionality as a researcher, a Reddit user, and a former teenager who used social media for health. I weighed my opinions about this research against the benefit to users and the community. I frequently asked myself whether my decisions were shortcuts taken out of convenience and whether things could be done effectively and ethically at the same time.
Similarly, flexibility is an essential principle for qualitative researchers. Flexibility prioritizes the research phenomenon over methodological scrutiny and endorses finding appropriate methods that align with the research questions and inquiry (Holloway & Todres, 2003). Flexibility does not have to involve complete relativism but can be balanced through the coherence and consistency of the methods. My study’s research method was constantly evolving due to technological constraints and new insights. The same applied to the ethics itself: the more project-specific challenges I encountered, the more flexibility in the ethics processes was needed.
Qualitative research requires a principle of complexity. Complexity acknowledges the messy, interruptive, and fluid research process contrary to the scientific desire for simplification, hard evidence, and causality (McCoy, 2012). McCoy (2012) defines research projects as “encounters” with data and theory, moments where knowledge is produced through unexpected material-discursive collisions. The interdisciplinarity of my research and ethical dilemmas specific to the project resulted in a complexity that could not be ignored. Situated ethics allowed for going through each encounter with an open mind and without pre-ordered answers from procedural ethics.
I will now reflect on how the framework emerged further—from the dilemmas and decisions I made in the particular study. Following that, I will propose questions that can guide future researchers on the ethics of working with Reddit data.
Applying the Framework to Research Youth Health Discussions on Reddit
The framework emerged from my ethical concerns during my PhD research. My project investigates peer-led discussions about health in the r/teenagers, r/AskTeenGirls, and r/AskTeenBoys subreddits on the online platform Reddit. The research questions ask about conversational practices on Reddit and how youth health matters are represented in these discussions. The project explores how conversations about youth health on Reddit contribute to youth digital health, understood as an assemblage of social, material, and affective resources and forces enacted and performed (Duff, 2014; Fox & Alldred, 2017). Accordingly, I define online discussions as relational dialogues: processes based on the interrelations of people, socio-material affordances, and digital cultures as contextual resources. Dialogues comprise contextual, interactional and semiotic dimensions (Krippendorff, 2010; Linell, 2009). Hence, the research does not focus on individual users’ behavior but rather examines online discussions as collective socio-material practices. Using an unobtrusive digital ethnography (Hine, 2015) with extant data collection (Salmons, 2016a), the research collected user-generated content without direct contact with users. Data were extracted using the data scraping software ParseHub. The analysis in the study followed grounded theory and thematic analysis. This interdisciplinary project needed a holistic approach to ethics. Three dimensions (digital context, users’ views, and project specificity) guided my ethical deliberations and positionality as a Reddit researcher.
Digital Context
The dimension of the digital context reflects circumstances specific to the platform: its size and structure, availability and accessibility, users’ anonymity and identity, the third-party data policy and terms and conditions of use. A platform-specific lens into ethical decisions helps in keeping up with the rapidly evolving digital media landscape and nuances of the platform’s design and culture. Distinctive affordances of the platform surface questions that can influence ethical considerations about Reddit data (Table 1). Below, I outline how these questions were addressed in my study.
Table 1. Digital Context Critical Questions.
It is difficult to define whether Reddit falls under the category of a social media platform or is an online forum (Massanari, 2015). For example, Reddit is more anonymous and publicly accessible than social media but resembles an online forum structure, only on a larger scale. Ethical guidelines for both social media and online forums are not exhaustive or easily applicable to Reddit.
The latest platform-issued reports about Reddit list 430 million active users (in October 2019), and 430+ million posts and 2.5+ billion comments posted in 2022 (Murphy, 2019; Reddit, 2022). Reddit comprises subforums called subreddits, which are communities established and managed by users gathered around a particular topic. Subreddits have a discussion tree structure of conversations known as threads. Posts and comments feature the content, username, date and voting scores. A single post contains a body text and (depending on the subreddit) a title, an image, or a link. Reddit does not sell data to external sources. Its business model is based on investors, advertising, premium memberships, and user engagement-based awards (Johnston, 2020). Reddit privacy and terms of service documents explicitly remind the users about the platform’s public and anonymous nature, the accountability for the content they post, and the third-party access to data. Reddit documents about privacy and conditions are very short, comprehensive, and easy to read during sign-up compared to other platforms.
On Reddit, most communities are available for public browsing, and only posting and commenting are limited to registered users. Content can also be accessed, for example, from a link on a search engine. The feature of private subreddits is rarely used. Many people only view (lurk on) Reddit without active participation. Of Reddit’s registered members, 98.1% do not post or comment on the content (u/truebirch, 2019). This figure does not include nonregistered lurkers redirected, for example, from a search engine. Posts are intended to reach many unknown users and generate anonymous discussions. The most popular posts and comments are rewarded with “karma” (a tool for most active users), so Reddit users aim for many comments and votes. Reddit also prides itself on being an open-source platform, which means that its website code and all content ever posted were aggregated in a pushshift.io database, available for download (Baumgartner et al., 2020). Very recently, Reddit changed its API policy, which resulted in Pushshift being banned from accessing the API. However, following criticisms, the consultation between Pushshift and Reddit led to an agreement through which approved Reddit moderators can now request Pushshift access. At the time of publication, further discussions are being held to reinstate access for nonmoderators. Reddit now provides researchers with access to its API on request, supported by documentation from their university. Reddit also has a very liberal third-party policy on data scraping and does not block data scraping software (u/powerlanguage, 2016). In this research, I examine subreddits that are public and large: r/teenagers (2.9 million users), r/AskTeenGirls (44.9k users), and r/AskTeenBoys (49.7k users).
Anonymity is Reddit’s defining feature. Users often value Reddit for its anonymous nature; therefore, they refrain from providing identifiable details to maintain anonymity (Robards et al., 2021; Van der Nagel & Frith, 2015). Posts are published under pseudonyms, and Reddit content and user profiles do not include names, ages, countries, or friend lists. Therefore, the traceability of a Reddit user’s offline identity is extremely limited and would require illegal hacking (e.g., sending a link to the user to click to obtain an IP address). Reddit actively promotes a culture of anonymity, both in its terms and conditions and during profile setup. When a user chooses a username, Reddit suggests random ones. Studies indicate that users choose usernames unrelated to their real-life identities (van der Nagel, 2017). Redditors are proficient in navigating their anonymity using different strategies for different contexts (Triggs et al., 2021). Reddit is also strongly opposed to doxing (posting private information about users publicly), both in its policies and in speeches from the company’s spokespeople (Copland, 2021).
Following the situated ethics approach addressed the insufficiency of some existing general social media guidelines for researching Reddit specifically. The large scale and structure of Reddit communities and posts, and the platform’s terms and conditions and policies, reinforced my interpretation of Reddit as more public and anonymous than other platforms, which guided my decisions about pursuing exemption from the human research ethics review. Moreover, my preliminary keyword search indicated that users of the specific subreddits I have chosen for analysis are not posting identifiable information. However, to ensure even greater anonymity of the posters and to mitigate the risk of verbatim quotes being traced back, I propose paraphrasing source content, presenting data through “vignettes,” describing the properties of image or video data (e.g., “a YouTube video of a vlogger describing their meditation routine”), and focusing on discussion threads, not individual users’ postings.
Users’ Views
Users’ views also directly informed my approach. This included considering studies on users’ perception of using their social media posts for research, ethical approaches in other studies on Reddit, the views of Reddit users, and consultations with an advisory youth commission at my research center. Users’ views helped to embed reflexivity by incorporating a more inclusive approach to recognize and address my biases and include a range of voices in my ethical decision-making process.
Accounting for users’ views is a respectful strategy when contact with individual contributors is limited, as it is when pursuing an exemption from human ethics review and/or waiving informed consent. Incorporating users’ voices into projects dealing with Reddit data can be done by raising questions indirectly or directly (Table 2).
Table 2. Users’ Views Critical Questions.
Users’ attitudes toward using social media entries for research purposes fall under three categories: skepticism, acceptance and ambivalence (Beninger, 2016). Users are skeptical due to inaccurate contextualization of the posts. Acceptance relates to the social benefit of the study and honesty in online discussions. Ambivalence derives from the lack of control over posted content, no tools to prevent data use for research and a belief that users’ opinions do not matter. Informed consent is either considered unnecessary due to personal responsibility for posts and difficulty obtaining it or a “common decency” out of respect for privacy (Beninger, 2016, p. 65). Studies indicate that most participants want to be anonymous; however, some want to be recognized for their contributions (Fiesler & Proferes, 2018).
However, investigating the ethical approaches used by Reddit researchers reveals that the use of Reddit data for research is less well captured in the scholarship. Proferes et al. (2021) noted that the vast majority of papers on Reddit that reported on the ethics approach referred to the exemption, yet it is unclear whether it was formal or self-defined. With ethics committee-endorsed exemptions, reasons include public availability of data (Derksen et al., 2017; Park & Conway, 2017, 2018a, 2018b; Park et al., 2018b); data classified as not involving human subjects (D’Agostino et al., 2017; Sowles et al., 2018); and nonidentifiable data (Foufi et al., 2019).
I also asked Redditors about their attitudes toward using posts for research on r/AskReddit, r/TheoryOfReddit and r/teenagers (a site of this research). I announced myself as a researcher, described the research project, outlined my ethical dilemmas, and asked for Redditors’ opinions. Most users were favorable toward using their data for research. These users acknowledged Reddit’s public nature and the benefits of user-generated content. Some users flagged that research use is more protective than other forms of redistributing posts, such as reproducing them in news stories or memes with attached usernames. Some said that they value anonymity on Reddit and would like their usernames to be removed. They also differentiated between mining posts alone and mining them together with other information from the profile, and recognized the impracticality of seeking consent from all users involved in one thread. Some suggested obtaining informed consent from selected users but did not specify how to make the selection.
Because this research investigates teenagers’ subreddits, I consulted with an advisory Wellbeing Health & Youth Commission to receive input from young social media users. They highlighted that Reddit users differ greatly, from lurkers to active posters, and that researchers should engage with the perspectives of both posters and young people. The additional degree of anonymity afforded by removing usernames aligns with what they value in Reddit, namely, not having their Reddit activity linked with offline life. They highlighted that some of the posts from teenagers’ subreddits are less mature and touch upon sensitive topics, which is why they deserve more privacy, especially from adults. Also, contacting all contributors from 50 postings with dozens to hundreds of comments each was considered a hardship that justified waiving informed consent.
Project Specificity
Another dimension I used to situate my ethics approach focused on surfacing the project specificity by considering the study’s aims and methodology. This enabled researcher reflexivity by relating the ethical dilemmas to concerns attuned to my project, which is not prevalent in more general guidelines for social media research.
A more tailored ethical approach based on project specificity is often overlooked. Attending to how a project’s study design shapes its ethical dilemmas raises particular questions to consider (Table 3).
Table 3. Project Specificity Critical Questions.
Digital research can be considered either non-human text research or human subject research (Bassett & O’Riordan, 2002). If pursuing the human subject position, research must undergo a human research ethics committee review. My research focuses on the text, not the users, as I am interested in how conversations about youth health-related topics manifest and contribute to the youth digital health discourse. Consequently, my study does not focus on identity and does not trace individual user activities or collect profile information. The data therefore represent youth health discourses, and the analysis focuses on their processes, topics, and interactions, not on who is performing them.
In my study, I did not seek informed consent for several reasons. First, content on Reddit is already posted under pseudonyms unrelated to users’ names. Second, the data volume is large enough to make contacting all these users too time-consuming, and the response rate may be low. Third, this study investigates naturalistic practices, and seeking consent may discourage Redditors from further participation in the community, which could constitute group harm. Given all the measures deployed to protect users, this study is highly unlikely to cause any harm to individuals or the community. I do not consider not seeking informed consent equivalent to a lack of respect or unethical conduct. Instead, I propose other measures that can play a significant role in mitigating potential risks (Morris, 2005). I consulted with the group members and de-identified data. Moreover, the subreddits I research are not restricted, and posters are aware of the potential of a wider audience beyond community members.
Finally, research about young people and health is typically framed as high risk, requiring more ethical protections. However, elements of existing ethical procedures around youth or health-related research might be more harmful than beneficial. For example, parental consent, typical for youth-related research, could breach young people’s privacy, as they turn to peer-led platforms because they want to keep their questions away from adults. Framing health research as sensitive and young people as vulnerable may exacerbate stigma and taboo around some health experiences. In addition, members of teenagers’ subreddits voluntarily share their health narratives on Reddit, knowing its public nature and broad audience. Many seek opinions from multiple peers. Some even post or comment to collect more Reddit awards. Generally, ethics committees consider health narratives sensitive data. On Reddit, however, sharing health narratives may be a way of making young people and their health “visible” to the world.
The considerations about the research design, the topic, and the group guided me to focus on the text rather than the user and to de-identify data during collection. Using the automated data scraping software ParseHub, I downloaded the discussion tree, consisting of the original posting and multi-level comments, and excluded the username, date and time, submission ID, voting scores, and metadata. Therefore, no identifiable information is attributed to the postings. In most projects, de-identification happens after collecting data. I adjusted this practice to further detach any information on the user from the raw data. The decision not to seek informed consent was justified by considerations of practicality and the feasibility of the study. Instead, I offered other strategies, such as removing usernames and paraphrasing content, to strengthen users’ safety.
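As an illustration only, the principle of excluding identifying fields at collection time, rather than after storage, can be sketched in Python. The study itself used ParseHub, not this code. The nested-dictionary layout and every field name below (`username`, `created_at`, `submission_id`, `score`, `metadata`, `text`, `replies`) are hypothetical assumptions, not the actual export format.

```python
from typing import Any

# Hypothetical field names standing in for the categories the study excluded:
# username, date and time, submission ID, voting scores, and other metadata.
EXCLUDED_FIELDS = {"username", "created_at", "submission_id", "score", "metadata"}


def deidentify(node: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of a post or comment without the excluded fields.

    Replies are processed recursively, so the shape of the discussion
    tree is preserved while user-related fields are dropped at every level.
    """
    cleaned = {
        key: value
        for key, value in node.items()
        if key not in EXCLUDED_FIELDS and key != "replies"
    }
    cleaned["replies"] = [deidentify(child) for child in node.get("replies", [])]
    return cleaned


# A made-up thread used only to show the shape of the input.
sample_thread = {
    "username": "u1",
    "created_at": "2021-01-01T00:00:00Z",
    "submission_id": "abc123",
    "score": 12,
    "text": "Original post",
    "replies": [
        {
            "username": "u2",
            "score": 3,
            "text": "First reply",
            "replies": [{"username": "u3", "text": "Nested reply"}],
        }
    ],
}

cleaned_thread = deidentify(sample_thread)
```

Only `cleaned_thread` would be written to storage in such a setup, so the raw dataset never contains the excluded fields. Removing fields does not by itself prevent re-identification through verbatim quotes, which is why paraphrasing content remains a separate safeguard.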
Discussion
The situated ethics framework emerged in the process of designing a Reddit study on youth health-related discussions. The situated ethics approach invites particular researcher dispositions—reflexivity, complexity, and flexibility—throughout this study design and, therefore, the project. This research design was a challenging case for navigating the traditional ethics committee’s requirements. For instance, this study could have been heavily restricted by the committee’s obligations for digital, youth, and health studies. A few of the requisite ethical standards, for example, contacting all the users, might have been impractical enough to make the study impossible. Some could even, to an extent, have harmed the participants, for example, by breaching their confidentiality and anonymity with parental consent. These difficulties were thought-provoking and compelled me to introduce a situated ethics methodology. With a growing interest in researching Reddit, the proposed framework—and its application in this specific study—surfaces the complex organization of the platform, its users, data, and communities. Reddit researchers can approach their studies in a more situated way, even if receiving an exemption from ethics committees’ reviews. Engaging in flexible, reflexive, and complex research will advance the situated ethics qualities of Reddit research and contribute to research approaches more attuned to the project and platform specificities.
The situated ethics framework I propose builds on the existing scholarship around the situatedness and fluidity of internet research ethics encounters. One of the most prominent guidelines for internet research is the AoIR guidelines (Franzke et al., 2020), which attend to multiple concerns around subjects, data, platforms, and research design. AoIR guidelines argue for a “reflexive and dialogical” approach to ethics (Franzke et al., 2020, p. 23), which is now emphasized especially among social sciences researchers. Yet, researchers in some domains, such as computer science, can receive less ethics education, or the ethics approaches suggested to them are more procedural. There are attempts to educate and promote ethical attention and societal impact among computational scientists (Ashurst et al., 2021; Fiesler et al., 2021; Pillai et al., 2021). Importantly, ethics education for the HCI field could benefit from incorporating a more situated lens, but I also understand how my framework could seem “loose” to researchers used to more rigid models. Domain-specific practices do not have to be fully abandoned but can be integrated into the situated framework precisely because of its question-oriented form.
My framework narrows its questions to Reddit rather than addressing all internet research, as the AoIR or NESH guidelines do. For example, my framework may not sufficiently surface issues with Big Data projects. At the same time, it offers broader ethical dimensions than, for example, the Qualitative E-research Framework from Salmons (2016b). Similarly to guidelines such as Webb et al. (2017) for Twitter, my framework arose in the course of a research project, which supports its validity in practice. In addition, it offers an alternative to the existing practices around research ethics on Reddit or with publicly available user-generated data.
Conclusion
The ethical issues of research on Reddit are rarely engaged with in academic publications (Proferes et al., 2021). Reddit is another platform where researchers face ethical concerns in studying social media. Reddit can be perceived as a more public and less identifiable platform because its content can be read without registration and its users are pseudonymous. However, the public availability of Reddit data does not validate its use without ethical considerations. This article adds to the discussion on the importance of situated research ethics in internet research and, specifically, research on Reddit.
In traditional ethics review, researchers participate in “procedural ethics” to address ethical issues, which limits internet research by overlooking the benefits of situatedness and context-sensitivity (Roberts, 2015). At the same time, the discussion on situated ethics generally lacks formalization and tends to propose “values” rather than procedures. Instead of rigid procedures or elusive positions, my framework offers a structured application of its principles while remaining open and flexible. My reflexive situated ethics framework does not aim to solve all issues with procedural ethics but can help address some inconsistencies faced by bodies such as ethics committees or regulators, since members of ethics committees have no consensus on whether and how to review projects involving public data (Vitak et al., 2017).
When claiming an exemption from ethics review, some researchers report on the exemption without further discussing or attending to ethical responsibilities (Stommel & de Rijk, 2021). One of the reasons for not engaging in research ethics can be the practicality of the study. I believe that many researchers are not ignoring ethical concerns maliciously but know that following strict ethical procedures may entail ceasing the study. My framework offers an alternative to researchers who fear that starting the ethics process may end unfavorably.
Although prominent internet research guidelines advocate for situatedness (Anabo et al., 2019; Franzke et al., 2020; Markham, 2018), there are still many researchers who firmly believe in ethical gold standards, such as informed consent or prescribed vulnerability (Kozinets, 2019). Such rigid opinions can be one of the reasons why researchers choose procedural ethics or no ethics at all. My framework can serve as “evidence” that a reflexive and situated approach to research ethics can be endorsed by an ethics committee.
This article presents how I contextualized the ethics of my study focused on youth health-related discussions on Reddit despite receiving an exemption from human research ethics review. Guided by platform context, user attitudes, and research design, I have assembled strategies to protect Reddit users. This article outlines the situated ethics framework developed through the practical application of situated ethics principles and the researcher’s positionality. The framework, built on questions rather than prescriptions, does not only work for my study but could be applied more widely for projects with user-generated data from Reddit. Some of the questions raised here can also be helpful in projects involving other platforms with publicly available user-generated data.
Internet research is increasingly complex, and procedural, risk-management or imposed ethical guidelines are insufficient. Dynamically evolving platforms, users’ behaviors, and a particular project’s aims and design require a customized, fluid and situated approach to ethics. The proposed situated ethics framework does not aim to neatly resolve the range of dilemmas raised. Instead, it offers guidance to surface and support working through them.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
