Abstract
This article explores key ethical and security challenges related to the exploitation of open-source intelligence (OSINT) in research on online terrorist propaganda. In order to reach this objective, the most common approaches to OSINT-based projects are analysed through the lens of some of the most recognized ethical guidelines in science, which allows several core dilemmas to be identified. First of all, this study discusses how personal data protection rules apply to investigations of potentially dangerous subjects, such as members and followers of Violent Extremist Organizations (VEOs). In addition, the author examines potential threats to the safety of researchers and the scientific infrastructure used in OSINT-based projects. He also discusses the risks of incidental findings and the malevolent use of research results. Finally, drawing on existing legal regulations and good practices in other fields, as well as the author’s previous experience in OSINT-based analyses of online terrorist activities, this article explores basic means of tackling these dilemmas.
Introduction
One of the most significant challenges experienced by almost all academics engaged in research on online terrorist communication is the difficulty of collecting and properly analysing primary sources. This problem has even been blamed for the ‘stagnation’ in this field (Sageman, 2014; Schuurman, 2014: 62). A myriad of works avoided this issue by examining secondary sources exclusively (Schuurman, 2020). However, the most insightful studies that significantly pushed the boundaries of knowledge on online terrorism and political violence were usually evidence-based (Fisher et al., 2019; Marone, 2019). Many were founded on extracting primary sources from dedicated platforms specialized in sharing extremist propaganda for academic purposes, such as Jihadology (Zelin, 2021). Others were more ambitious and adopted methodologies allowing raw data to be detected, extracted and analysed independently (Macdonald et al., 2019; Zelin, 2015). Scholars who did so frequently exploited open-source intelligence (OSINT) techniques. These have been utilized, among other purposes, to map the online infrastructure maintained by terrorist organizations, understand their activities on the dark web, examine their propaganda output, and measure the scale of their operations on social media (Berger and Morgan, 2015; Fisher and Prucha, 2019; Lakomy, 2021a). In recent years, OSINT has also been used extensively by NGOs and think tanks to monitor the chatter of violent extremists in cyberspace (Tech against terrorism: Trends in terrorist and violent extremist use of the internet, 2021).
Unlike in the law enforcement and intelligence communities, which ordinarily exploit OSINT in their activities, there are very few legal regulations governing the use of these methods by researchers. This means that scholars engaged in OSINT frequently move into a legal grey zone yet to be explored. Rare cases of the detention of scientists carrying out online terrorism research prove this to be the case (Curtis and Hodgson, 2008). Moreover, general ethical guidelines in the social sciences have not been adequately adapted to the specificity of studying the internet activities of violent extremist organizations (VEOs) with OSINT tools. Surprisingly, until recently, academia paid little to no attention to addressing these challenges in detail. A scientific debate on this matter was launched only recently, by Conway’s insightful article (2021: 380). She rightly concluded that there is a dire need to identify all critical ethical dilemmas in this sub-field and the means to tackle them.
This article aims to fill this gap in the academic discourse. It explores key ethical and security challenges related to exploiting open-source intelligence in research on online terrorist propaganda. In order to reach this objective, the most common approaches to OSINT-based projects were analysed through the lens of some of the most recognized ethical guidelines in science (Ethics Self-Assessment Step by Step, 2018; Internet Research: Ethical Guidelines 3.0, 2019) as well as relevant papers on research ethics. It allowed several core dilemmas to be identified. First, this study discusses how personal data protection rules apply to investigations of potentially dangerous subjects, such as members of VEOs. In addition, it examines potential threats to the safety of researchers and the scientific infrastructure used in OSINT-based projects. It also discusses the risks of incidental findings and malevolent use of research results. Finally, drawing on existing legal regulations, good practices in other fields, as well as the author’s previous experience in OSINT-based analyses of online terrorist communication, this article explores basic means of tackling these dilemmas.
This article has been divided into five sections. The first briefly outlines the concept of open-source intelligence. It also examines some of the most popular OSINT tools utilized in online terrorist propaganda research. The second section explores the risk of encountering personal data in terrorist online communication channels and ethical means of addressing this problem. The third discusses the methods of dealing with incidental findings, such as illicit content or information about lone-wolf operations. The fourth examines potential security concerns related to researchers engaged in investigating online terrorist activities and the digital infrastructure they exploit. The final section discusses the potential malevolent use of research results.
Open-source intelligence in online terrorist communication research
Despite the growing significance of OSINT in social sciences in recent years, there have been surprisingly few academic attempts to fully explain this concept. Most of the existing and popular definitions have been coined by international organizations, government bodies, intelligence companies and law enforcement agencies, which is symbolic of who primarily utilizes these methods. The US Director of National Intelligence, for instance, characterizes it as ‘intelligence produced from publicly available information that is collected, exploited, and disseminated in a timely manner to an appropriate audience for the purpose of addressing a specific intelligence requirement’ (Williams and Blum, 2018: 1). From this perspective, OSINT is invaluable for a wide range of activities carried out by government authorities. For instance, according to Omand (2021: 290), digital intelligence gathering is used to ‘uncover terrorist networks and frustrate attacks’.
This study, however, follows a more scientific approach to OSINT and frames it as ‘a concept that addresses the search, collection, processing, analysis and use of information from open sources that can be legally accessed by any individual or organisation’ (Evangelista et al., 2021: 3). In this context, open-source intelligence, which has a non-scientific origin, constitutes a set of methods and techniques allowing the detection, extraction and analysis of raw data on the internet. Advanced tools and techniques ordinarily utilized by law enforcement agencies and the intelligence community may provide precious results for scholarly debate, as emphasized by Tow and Yeo (2005: 1–2). They have been frequently employed by studies focused on social opinion and sentiment, cybercrime and organized crime, or cybersecurity and cyber-defence (Pastor-Galindo et al., 2020: 10282–10283). From the viewpoint of methodology, these techniques constitute a hybrid approach that draws on a number of internet research methods, such as non-participant online observation (Norskov and Rask, 2011), social network analysis (Daniel et al., 2008) or web content analysis (Herring, 2010). However, some differences exist between these ordinary internet research methods and OSINT. Open-source intelligence is based on a cross-platform and cross-technology approach to collecting and analysing online data, which was originally developed for the intelligence community. It is usually combined with close attention to examining and processing individual pieces of data, sometimes leading to surprising and valuable discoveries. Thus, open-source intelligence offers a convenient solution to the problem of access to primary sources in online terrorism research. According to Pastor-Galindo et al. (2020: 10284–10285), the core benefit of this approach is related to the enormous amount of information available on the web and the high computing capacity enabling labour-intensive data collection and analysis to be carried out with relative ease. Moreover, the emergence of data mining techniques, big data and machine learning algorithms has facilitated and increased the efficiency of investigations. Finally, open-source intelligence is characterized by its broad scope.
According to most national and international-level documents, carrying out OSINT-based research on VEO-affiliated communication channels raises no legal concerns. For instance, recital 11 of the EU’s Directive on Combating Terrorism notes that merely ‘visiting websites or collecting materials for legitimate purposes, such as academic or research purposes, is not considered to be receiving training for terrorism’ (Directive 2017/541, 2017). One of the core reasons for its legality is that OSINT focuses on open sources exclusively. Fleisher (2008: 853) defines them as ‘publicly available print and digital/electronic data from unclassified, non-secret, and “gray literature” sources’. Effectively, only unrestricted data that have been published, even unintentionally, by VEOs and their followers on the internet can be subject to research. As Jardines (2016: 5) rightly emphasizes, OSINT cannot involve hacking, deception or any other fraudulent act, even if it leads to collecting valuable information that allows the frontiers of knowledge to be significantly pushed. This, in turn, means that acquiring access to and analysing restricted terrorist communication hotspots, such as closed message boards or private Telegram channels (Clifford and Powell, 2019: 11; FAQ Channels), falls outside the scope of this method. In this context, the use in OSINT of hacked or dumped data originating from terrorist domains has not been sufficiently addressed. It seems to be ethically permissible, although this strictly depends on the character of the dumped data and the purpose of its use. However, this does not apply to hacked data related to ordinary internet users, the use of which could be considered a crime.
According to Herrera-Cubides et al. (2020: 5), the available open sources and the selection of detection techniques are highly dependent on research objectives. They are also determined by the technical features of the internet layer under consideration. On the surface web, OSINT has been exploited primarily either to map the online information ecosystems maintained by violent extremist organizations (Lakomy, 2021a) or to measure the propaganda output of these groups (Milton, 2018: 2). In this context, the first step of most OSINT-based projects in this environment relies on so-called ‘Google hacking’, which is based on exploiting advanced options and commands in this search engine.1 These are utilized to increase the efficiency of search queries and facilitate access to otherwise hard-to-reach data (Bazzell, 2016: 39). In order to ensure accurate results, ‘Google hacking’ needs to be combined with carefully selected keywords related to terrorist agendas. For instance, the detection of internet addresses utilized by the Islamic State requires the use of Salafi–Jihadist nomenclature and terms related to its most popular publication series, leaders, organizational structure or ideology (Lakomy, 2021a: 361–362). Alternatively, search queries can be automated in multiple ways, usually with web crawlers built on, for instance, Selenium and Python scripts (Postels, 2020). These activities are usually concluded by the manual screening of search results (Herrera-Cubides et al., 2020: 5).
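The query-construction step described above can be sketched in Python. The function name and example keywords are illustrative assumptions; the operators themselves (site:, filetype:, intext:) are standard Google search commands, and a real project would pair such queries with a carefully curated, domain-specific keyword list.

```python
def build_dork_query(keywords, site=None, filetype=None, intext=None):
    """Compose an advanced ('Google hacking') search query from exact-phrase
    keywords plus optional operators narrowing the scope of results."""
    parts = [f'"{kw}"' for kw in keywords]  # quotes force exact-phrase matching
    if site:
        parts.append(f"site:{site}")        # restrict results to one domain
    if filetype:
        parts.append(f"filetype:{filetype}")  # e.g. locate PDF magazine releases
    if intext:
        parts.append(f"intext:{intext}")    # require a term in the page body
    return " ".join(parts)

# Hypothetical example: searching for PDF releases mentioning a given term.
query = build_dork_query(["example publication title"], filetype="pdf")
```

The same string can then be submitted manually or fed to a crawler; automating submission at scale is subject to the search engine's terms of service.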
Subsequently, OSINT may be employed to understand the most important features of individual internet addresses affiliated with terrorist organizations. Available tools and techniques allow the adoption of three strategies for analysing domains on the surface web. The first focuses on investigating the purely technical features of those websites. For instance, this approach allows registrar information (who.is) or the true IP address of a given site to be extracted. The latter, in turn, enables other websites co-hosted on the same server to be detected through reverse IP lookup (viewdns.info). It should be stressed that this tedious data gathering process can be automated and accelerated with dedicated web scrapers and reconnaissance applications. For instance, SpiderFoot allows users, among other things, to uncover external links located at a given domain, which helps to map the structure and interconnectedness of a terrorist information ecosystem. It also extracts email addresses utilized on the analysed website and assesses whether it is associated with malicious networks (Lakomy, 2021c).
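The registrar-extraction step can be illustrated with a minimal sketch that parses WHOIS-style output. The function name and sample text are hypothetical; real WHOIS responses vary considerably between registries, which is why reconnaissance tools such as SpiderFoot handle this far more robustly.

```python
def extract_field(whois_text, field):
    """Pull a single field (e.g. 'Registrar') out of raw WHOIS-style output.
    Returns None when the field is absent; matching is case-insensitive."""
    for line in whois_text.splitlines():
        if ":" in line:
            key, _, value = line.partition(":")
            if key.strip().lower() == field.lower():
                return value.strip()
    return None

# Illustrative response fragment, not real registration data.
sample = "Domain Name: example.org\nRegistrar: Example Registrar, Inc."
registrar = extract_field(sample, "Registrar")
```

In practice, the raw text would come from a WHOIS client or lookup service, and the extracted value would be one data point in mapping a domain's technical profile.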
The second approach to investigating surface web domains focuses on extracting and processing their content en masse. For this purpose, a wide range of Python scripts can be utilized. Among others, Metagoofil allows the automatic detection, collection and analysis of metadata of text files present at a given URL. Another popular tool – Recon-ng – enables a vast array of activities to be carried out, including the identification of email addresses, hidden files or subdomains (Pastor-Galindo et al., 2020: 10293). On a side note, many of these data extraction applications are, in fact, dual-use tools. They are frequently exploited for both OSINT and penetration testing (Troia, 2020: ch. 12).
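The first stage of such content harvesting, locating document files whose metadata could then be analysed, can be sketched with Python's standard-library HTML parser. The class name, extension list and sample markup are illustrative only; Metagoofil-style tools add downloading and metadata parsing on top of this step.

```python
from html.parser import HTMLParser

class DocLinkFinder(HTMLParser):
    """Collect hyperlinks pointing at document files, the kind of targets
    metadata-analysis tools subsequently download and inspect."""
    DOC_EXTENSIONS = (".pdf", ".doc", ".docx", ".xls", ".pptx")

    def __init__(self):
        super().__init__()
        self.doc_links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value and value.lower().endswith(self.DOC_EXTENSIONS):
                    self.doc_links.append(value)

# Illustrative page fragment: only the document link is retained.
finder = DocLinkFinder()
finder.feed('<a href="/magazine/issue1.pdf">Issue 1</a><a href="/about.html">About</a>')
# finder.doc_links == ["/magazine/issue1.pdf"]
```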
The third group of tools allows individual files encountered at terrorist-affiliated internet addresses to be examined. It requires either stand-alone applications or simple add-ons to popular browsers. For instance, Exif Viewer enables metadata from image files to be extracted. They may contain GPS coordinates of where the picture was taken (Bazzell, 2016: 217, 261–280). This creates certain opportunities for counterterrorism (CT) but has little use in research. Effectively, the approaches mentioned above enable learning of the whereabouts of other interconnected terrorist domains and accounts, measuring the scale of propaganda shared by a given URL, or uncovering technical features of propaganda productions.
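Once raw GPS tags have been extracted from an image (typically with a library such as Pillow or exifread), they arrive as degrees, minutes and seconds plus a hemisphere reference. The conversion to the decimal degrees used by mapping tools can be sketched as follows; the function name and sample values are illustrative:

```python
def dms_to_decimal(degrees, minutes, seconds, ref):
    """Convert EXIF-style GPS coordinates (degrees/minutes/seconds plus a
    hemisphere reference such as 'N' or 'W') into decimal degrees."""
    decimal = degrees + minutes / 60 + seconds / 3600
    # Southern and western hemispheres are expressed as negative values.
    return -decimal if ref in ("S", "W") else decimal

# Illustrative values only, not taken from any real image.
latitude = dms_to_decimal(48, 51, 29.6, "N")
```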
Aside from the surface web, OSINT has also been utilized in studies on terrorist activities in social networks, including Facebook (Ayad, 2020) and Twitter (Berger and Morgan, 2015). These platforms usually constitute a convenient environment for open-source intelligence analysis, as they are abundant in potential research subjects. Social media have been massively exploited by followers and members of violent extremist organizations (Macdonald et al., 2019). From the viewpoint of OSINT, there are two groups of platforms. The first, more privacy-focused, provides only basic integrated search tools, available to registered users exclusively. This approach was adopted, among others, by Facebook (Bazzell, 2016: 75). It also constitutes one of the potential reasons why studies of terrorist activities in this social network have been less popular in recent years. The second group offers a variety of content monitoring tools to all users, even unregistered ones. Moreover, this group allows third-party content monitoring apps to be integrated. For instance, Twitter provides visitors with advanced search options similar to ‘Google hacking’ (Schreiber, 2017). It also provides an application programming interface (API), giving easy access to public data on its users (About Twitter’s APIs). In this context, most OSINT-based scholarly works have utilized manual and automated approaches to collect raw data. Sometimes this has led to massive datasets consisting of thousands or even millions of messages (Bodine-Baron et al., 2016; Ceron et al., 2019), although Parekh et al. (2018: 19) argue that current data collection methods on Salafi-jihadist activity on social media ‘have exceptionally high rates of irrelevant account inclusions’.
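A hedged sketch of the two stages mentioned above, assembling an API search request and filtering out irrelevant results, might look as follows. The parameter names mirror those documented for Twitter's v2 recent-search endpoint, but the functions, query and sample data are illustrative assumptions, and a crude keyword filter only partially addresses the irrelevant-inclusion problem noted by Parekh et al.

```python
def build_search_params(query, max_results=100):
    """Assemble request parameters in the style of Twitter's v2 recent
    search endpoint (query itself is a placeholder)."""
    return {
        "query": query,
        "max_results": max_results,
        "tweet.fields": "created_at,author_id",
    }

def filter_relevant(tweets, required_terms):
    """Keep only tweets containing every required term; a first, crude
    pass at reducing irrelevant account inclusions."""
    return [t for t in tweets
            if all(term.lower() in t["text"].lower() for term in required_terms)]

# Illustrative data: only the first item survives the filter.
sample = [{"text": "propaganda release announcement"}, {"text": "unrelated chatter"}]
relevant = filter_relevant(sample, ["release"])
```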
Violent extremist organizations have also been present on the dark web (Weimann, 2016). This environment, hosted within overlay networks, provides users with a degree of anonymity. It cannot be accessed through ordinary browsers but requires dedicated applications (Ciancaglini et al., 2015). The privacy-oriented nature of the networks making up the dark web (Freenet, ZeroNet, I2P, TOR) makes detecting and analysing hidden services a significant challenge. Most OSINT tools are ineffective in this environment (Koch, 2019), which may explain the scarcity of evidence-based research on terrorist activities below the Clearnet. Nevertheless, other fields, such as criminology and computer science, have successfully explored some areas of the dark web. In order to do so, scholars have utilized both manual and automated methods (Owen and Savage, 2015; Topor, 2019). The former are usually founded on exploiting somewhat crude dark web search engines, whose ability to detect content is limited. The latter are based on dedicated web crawlers; however, their efficiency is likewise reduced, as they cannot overcome the anti-crawling measures employed by many hidden services (Telatnik, 2020).
Given what has been said, despite its vast potential, open-source intelligence cannot be perceived as a silver bullet of online terrorist propaganda research – it has its limitations. Academia has identified a number of dilemmas that have to be addressed prior to launching OSINT-based research projects. Among others, scholars face problems related to the enormous quantity of data encountered on the internet, which is inherently disorganized. Its proper classification and management are serious challenges. Furthermore, academics utilizing this method are susceptible to disinformation. Verifying information originating from OSINT is a difficult and time-consuming process. In many cases, it is not even possible. Last but not least, even though open-source intelligence is legal, it faces a variety of ethical dilemmas (Pastor-Galindo et al., 2020: 10285). This study focuses on exploring these latter challenges in detail.
Personal data protection of potentially dangerous subjects
As Koops et al. (2013: 677) demonstrated, OSINT-based research may, in certain situations, be considered a threat to privacy protection. This problem usually manifests itself in unlawful actions aiming to profile ‘natural persons’ with online identifiers, as addressed in article 22 in relation to recital 30 of the European Union’s General Data Protection Regulation (GDPR) (The Regulation 2016/679, 2016). However, most acts reconcile the right to personal data protection with the right to freedom of expression and information, including scientific activities. Effectively, while this issue poses a significant ethical challenge to internet research broadly understood, a set of guidelines and good practices on approaching and processing personal data in online environments has been developed over the past years (Ducato, 2020; Internet Research: Ethical Guidelines 3.0, 2019: 10–11). Among others, article 89 of the GDPR lists safeguards and derogations relating to the processing of personal data for scientific purposes and mentions, for instance, pseudonymization (The Regulation 2016/679, 2016). Moreover, article 89 para 1 and article 5 para 1 mention the principle of data minimization.
Nevertheless, these solutions do not cover the data protection issues encountered during OSINT investigations focused on terrorist activities on the internet. As previously mentioned, open-source intelligence techniques allow various types of data at terrorist-affiliated URLs to be extracted, including IP addresses, registrar information, usernames and email addresses. For operational security reasons, communication channels utilized by VEOs are usually subject to professional anonymization, as Nance and Sampson (2017: 61) accurately noted. Thus, it is extremely unlikely to encounter data that could be perceived as ‘personal’. Still, there is always a risk that some media operatives responsible for online operations, or any of their followers, will fail to take the necessary precautions, resulting in their IP addresses, email addresses or social media handles being extractable with OSINT means. These pieces of data, in turn, may reveal their true identity. Such a scenario effectively creates the problem of processing sensitive – i.e. revealing religious beliefs and political opinions – personal data of potentially dangerous subjects involved in terrorist propaganda on the internet.
Existing regulations on processing this type of data usually refer strictly to law enforcement agencies. For instance, in the European Union they have the right to process personal data and profile natural persons ‘with regard to whom there are serious grounds for believing that they have committed or are about to commit a criminal offence’ (Directive 2016/680, 2016). These regulations do not apply to academia, which means that all scholarly projects encountering the activities of identifiable internet users are carried out without a proper legal basis. Thus, the question arises: what kind of measures should be undertaken to tackle this dilemma ethically? Three types of solutions can be identified based on the current, widely adopted guidelines on processing personal data for scientific purposes.
First of all, research projects should not consist of any activities that could be interpreted as online profiling of natural persons. This approach is manifested in, for instance, avoiding the identification of owners of usernames or email addresses that terrorist-affiliated websites have published. Analysis of IP addresses of individual users or metadata (EXIF) from pictures they share may also be problematic in this regard, as it may reveal their whereabouts. This strategy allows the problem of processing personal data of terrorists to be largely avoided.
Secondly, all incidental findings related to personal data should be disregarded, pseudonymized or anonymized. Similar solutions have been widely adopted, among others, in health science (Quinn, 2017). Disregarding means that some pieces of personal data, such as a username revealing an individual’s identity, should never be included in the project’s database. However, in some cases, this attitude could negatively influence research outcomes, especially if the person under consideration plays a vital role in online terrorist communication. This risk shows that alternative options, such as anonymization or pseudonymization, should be considered. The former is the process of removing ‘personal identifiers, both direct and indirect, that may lead to an individual being identified’ (Anonymisation and Pseudonymisation). The GDPR defines the latter as ‘the processing of personal data in such a way that the data can no longer be attributed to a specific data subject without the use of additional information’. Anonymization seems to be the more appropriate option because pseudonymized data can still fall within the scope of personal data protection acts due to the existing risk of re-identification.
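The difference between the two techniques can be made concrete with a short Python sketch. The field names, salt and functions are illustrative assumptions, not a prescribed implementation:

```python
import hashlib

SALT = "project-secret-salt"  # illustrative; stored separately from the dataset

def pseudonymize(identifier):
    """Replace a username or email with a salted hash: consistent across the
    dataset, but re-identifiable by anyone holding the salt, which is why
    pseudonymized data can still count as personal data under the GDPR."""
    return hashlib.sha256((SALT + identifier).encode()).hexdigest()[:12]

def anonymize(record, identifying_fields=("username", "email", "ip")):
    """Drop direct identifiers outright; unlike pseudonymization, nothing
    retained in the record links back to the individual."""
    return {k: v for k, v in record.items() if k not in identifying_fields}

# Illustrative record: anonymization keeps only the non-identifying field.
record = {"username": "abu_example", "post_count": 42, "ip": "203.0.113.7"}
anonymize(record)  # {'post_count': 42}
```

The design choice mirrors the argument above: pseudonymization preserves the ability to track one actor across a dataset, while anonymization severs that link entirely.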
Thirdly, a principle of data minimization should be followed. Technical and organizational measures should be implemented to limit data processing to what is necessary to realize research objectives (The Regulation 2016/679, 2016, article 5, para 1, subpara e, article 89, para 1). Given that scholars have usually used OSINT to understand the online activities of VEOs, not their individual members, research activities should not go beyond these boundaries. However, this does not mean that the analysis of de-identified profiles of terrorists who are active, for instance, on social media, should not be considered, as it may lead to valuable conclusions on the strategy employed by their organization. This applies even if it was not part of the initial research objective but is justified by potential added value to the project. However, such a flexible approach to OSINT-based studies would require rigorous anonymization of all collected data. Moreover, all changes to the ongoing research projects in this regard would require additional permissions from relevant ethics commissions.
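In code, data minimization is essentially the inverse of blacklist-style anonymization: instead of stripping known identifiers, only a whitelist of fields justified by the research objective is retained at collection time. The sketch below is illustrative; the allowed fields would follow the approved research protocol.

```python
# Illustrative whitelist derived from a hypothetical research protocol.
ALLOWED_FIELDS = {"platform", "post_date", "content_type", "language"}

def minimize(record):
    """Retain only the fields the research objective actually requires,
    discarding everything else before it ever enters the database."""
    return {k: v for k, v in record.items() if k in ALLOWED_FIELDS}

# Identifying fields never reach the project's dataset.
raw = {"platform": "telegram", "post_date": "2021-05-01",
       "username": "someone", "ip": "198.51.100.4"}
minimize(raw)  # {'platform': 'telegram', 'post_date': '2021-05-01'}
```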
In this context, one more dilemma should be addressed: Why should personal data encountered at terrorist-affiliated domains be disregarded instead of reported to law enforcement agencies (LEA)? This issue is directly related to the paradox identified by Grossman and Gerrand (2021). They noted that ‘terrorism researchers can become (drawing on Emmanuel Levinas’s distinction between ethics and morality) caught between “ethical” responsibility for participants on the one hand, and “moral” responsibility for the greater good, on the other’ (p. 242). There is no doubt that reporting potential media operatives to LEA would – at least in some cases – contribute to the ‘greater good’ of the society. However, would that be ethical? The answer is equivocal. On the one hand, as previously mentioned, OSINT is susceptible to disinformation and e-identities are easily changeable. Thus, there is always a risk that encountered personal data could have been forged. Moreover, most guidelines on research ethics emphasize the necessity of preventing or mitigating harm to research participants (Taylor and Horgan, 2021). Effectively, providing such information to government authorities would create a risk of endangering innocent internet users that have nothing to do with terrorism. It could happen, for instance, due to online identity theft by a terrorist organization. Such cases occurred in the past. For instance, the Islamic State’s followers hacked random accounts on Twitter and used them for propaganda dissemination (Whittaker, 2019). On top of that, verifying this kind of data would certainly meet the prerequisites for profiling natural persons with ‘online identifiers’. Thus, adopting such a solution would be doubtful from a legal viewpoint. It would also have nothing to do with scientific activities, as interpreted by Weber (Sharpf, 2007). On the other hand, some accidental findings may lead to learning the true identities of members of terrorist organizations. 
Failure to provide this information to LEAs by scholars could be interpreted as acceptance of or indifference to such illicit activities. In this context, the proper reaction to this dilemma depends on individual circumstances related to what types of personal data were uncovered and how they are related to terrorist activity. Undoubtedly, it constitutes one of the most important ethical challenges that requires further research.
Incidental findings and the criminal liability of researchers
Another ethical challenge to OSINT-based research on online terrorist communication is a minimal yet existing risk of encountering unexpected or incidental findings. First of all, they may concern information about planned or ongoing terrorist attacks, as well as their targets. Extremists have published this kind of data on the internet in the past. On the one hand, lone-wolf operations have sometimes been preceded by social media messages discussing their motivations, allegiance and potential targets. Some terrorist attacks were even live-streamed, as the Christchurch events prove (Ibrahim, 2020). On the other hand, many VEOs have released messages encouraging internet users to carry out attacks against specific targets. For instance, the Islamic State’s Francophone magazine Dar al-Islam published ‘Wanted Dead’ infographics. They urged readers to assassinate members of the Muslim minority in France, who were identified as ‘traitors’ of the ummah (Lakomy, 2021b: 89).
Encountering these pieces of information during research creates the problem of the indirect responsibility of academics for the safety and security of targets designated by terrorists. While many international and government bodies are engaged in monitoring chatter of VEOs on the internet, there is always a risk that they will be late in detecting and processing this news. Thus, there is no certainty that proper counterterrorism measures will be introduced in a timely and efficient manner. In this context, most existing guidelines suggest that such incidental findings of significant importance – and information about terrorist attacks is undoubtedly so – should be reported to relevant authorities (Ethics in Social Science and Humanities, 2018: 14–15). This attitude can be compared to the established responsibility of scholars for reporting harm in other areas, including, for instance, child abuse. In other words, the only ethical way of solving this dilemma is manifested in the immediate transfer of the findings to the relevant law enforcement agency responsible for CT operations. However, there is one complicating factor from the viewpoint of research ethics. In most projects, participants are informed about reporting obligations of researchers. This, in turn, may influence their decision on whether they want to participate in the study. In OSINT-based projects, anonymous participants cannot be consulted and, effectively, are not able to provide their explicit consent (Internet Research: Ethical Guidelines 3.0, 2019: 10).
Open-source intelligence analysis carried out on the dark web creates even greater ethical dilemmas in this regard. Due to the unique features of this layer of internet communication, it has been massively utilized by the cyber-criminal underground. It is permeated by dark markets, drug and firearms vendors, leaked databases, as well as websites facilitating illegal activities. It has also been known to contain child pornography (O’Brien, 2014; Shillito, 2019). Moreover, as previously mentioned, investigations on the dark web suffer from its concealed, non-transparent and heterogeneous nature. Searching for and verifying terrorist-affiliated content usually requires directly visiting the websites under consideration, as they frequently lack meta descriptions. Taken together, these features mean that monitoring and analysing these environments for academic purposes must take into account the risk of incidentally encountering illicit content. Even inadvertent access to domains containing such materials may be considered a crime.
Most ethical guidelines suggest that a proper response to this problem from the research team should be founded on four principles: respect for persons, beneficence, justice and fairness, as well as intellectual freedom and responsibility (Guideline for the Reporting of Incidental and Secondary Findings to Study Participants). In this context, the European Commission’s manual on research ethics clearly states that ‘as a rule, criminal activity witnessed or uncovered in the course of research must be reported to the responsible and appropriate authorities, even if this means overriding commitments to participants’ (Ethics in Social Science and Humanities, 2018: 14). Following this rule, the only reasonable and ethical way to react to these findings is to report them to the relevant authorities immediately. Any other reaction or negligence would result in criminal liability of the research team. Moreover, all traces of such findings should be erased from the project’s database without any delay.
In addition, it should be stressed that some layers of the dark web, such as Freenet or ZeroNet, exploit peer-to-peer (P2P) technology (Wang et al., 2020). This means that domains located in these networks do not have traditional Clearnet-like hosting servers. Instead, all users visiting these websites are involved in hosting them. This creates an even greater ethical risk for researchers, as merely visiting terrorist sites is equivalent to keeping them online. In effect, the digital infrastructure used by scholars can potentially host VEO-affiliated communication channels. In other words, scientists would contribute to online terrorism while researching ways to curb it. This would constitute a clear violation of CT laws, resulting in the researchers’ criminal liability. To date, this issue has not been sufficiently addressed by the academic community. The lack of clear solutions may be perceived as a factor discouraging scholars from studying these environments. In fact, thus far, Freenet and ZeroNet have been the subjects of very few evidence-based studies.
In this context, two considerations can be made. First, research focused on these P2P-based networks should be carried out in a way that automatically blocks the co-hosting of all visited locations. Whether this is technically possible remains an open question, as current solutions only allow hosting to be stopped manually (Frequently asked questions); thus, the risk of violating the law persists. Second, academics interested in studying these environments should consult the authorities responsible for fighting cyber-crime and terrorism about their methodology prior to launching their projects. This is the only viable way of addressing these legal dilemmas.
Identifying risks to the safety of researchers
At first glance, open-source intelligence should not pose any threat to researchers’ safety, as projects are carried out online without any risk of physically encountering members of violent extremist organizations. However, a closer look reveals two qualitatively different challenges in this regard.
On the one hand, as Sarda et al. (2019: 557) note, ‘whenever we navigate the Web, we leave a trace through our IP address, which can in turn be used to establish our identity.’ Aside from an IP address, the ordinary ‘digital footprint’ left by internet users may include information about the device used for browsing, its operating system or physical location. Leaving this kind of footprint in cyberspace creates operational security (OPSEC) risks for researchers engaged in open-source intelligence. Terrorist organizations under investigation can collect and analyse these traces with digital forensic tools, enabling the computers used by scholars to be geolocated; in extreme cases, even researchers’ identities could be revealed. This problem may be especially evident when visiting websites lacking the encrypted HTTPS protocol (Crouch, 2018). This effectively creates a potential risk to researchers’ safety.
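The distinguishing power of such a footprint can be illustrated with a short sketch: hashing even a handful of ordinary browser attributes yields a practically unique identifier for a device. This is a deliberately simplified illustration of device fingerprinting, and all attribute names and values below are hypothetical examples, not data from any real investigation.

```python
import hashlib
import json

def fingerprint(attributes: dict) -> str:
    """Hash a set of browser/device attributes into a single identifier.

    Simplified illustration only; real trackers combine far more signals
    (fonts, canvas rendering, installed plugins, and so on).
    """
    canonical = json.dumps(attributes, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# Hypothetical traces an ordinary browsing session might leave behind.
traces = {
    "ip_address": "203.0.113.17",  # reserved documentation range, not a real host
    "user_agent": "Mozilla/5.0 (X11; Linux x86_64) Firefox/102.0",
    "screen": "1920x1080",
    "timezone": "Europe/Warsaw",
    "language": "pl-PL",
}

print(fingerprint(traces))  # identical attributes always produce the same identifier
```

Because the same combination of attributes reproduces the same identifier across visits, an adversary logging these traces can re-identify a returning machine even when no name is ever disclosed, which is precisely why the anonymizing measures discussed below matter.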
Fortunately, two factors significantly mitigate this risk. First, successful identification of academics engaged in OSINT-based projects usually requires proficiency in computer science and access to professional software or databases used by law enforcement. Thus, online profiling of researchers by terrorist organizations, which usually lack these capabilities (Conway, 2007a), is extremely unlikely. Second, so far, there have been no records of VEOs targeting the scientific community engaged in online terrorism research. Even the Islamic State, which was known for commenting on the academic discourse in its propaganda publications, did not carry out any attacks against scholars (Lakomy, 2021b: 70).
These considerations do not mean that scientists engaged in OSINT-based research cannot be threatened in the future. Aside from purely physical risks, VEOs can use a variety of online means to harass academics, including, for instance, e-stalking. As Conway (2021: 369) notes, ‘both jihadist and right-wing extremists have been known to engage in networked forms of abuse, some of which also has the potential to spill over into “real world” settings.’ In effect, each OSINT-based study should take the necessary precautions to address these risks, in close cooperation with the academic institutions coordinating the project. As the AoIR guidelines note, institutions should ‘develop policy detailing support procedures for researchers experiencing online threats or harassment related to their work’ (Internet Research: Ethical Guidelines 3.0, 2019: 11). Moreover, there are significant concerns related even to simply consuming terrorist propaganda: according to many guidelines, exposure to terrorist content may seriously threaten scientists’ well-being. These risks also require academic institutions to introduce proper procedures.
On the other hand, some terrorist communication channels have been known for disseminating malicious software. There have been numerous cases of scholars in this sub-field encountering malware during their investigations. This problem concerns, for instance, independent file-sharing services that have been used to distribute fake terrorist apps (al-Rawi, 2018: 746). Moreover, web pages maintained by some terrorist organizations have been known to be hosted on servers alongside hundreds of malicious domains (Lakomy, 2021c). This means that computers used by scholars in their research may be easily infected, which can have several negative consequences. To begin with, certain types of malware, such as worms and trojan horses, allow attackers to extract documents stored on hard drives, exploit active logins, and register all keystrokes (Bhardwaj and Goundar, 2020). In effect, an infection may reveal a researcher’s identity or location. Furthermore, in some cases, the same set of stolen data can be utilized for identity theft, resulting in a takeover of scientists’ email addresses or social media profiles. Finally, a malware infection disrupts data integrity: projects’ databases may be easily altered, encrypted or simply deleted.
In order to tackle these challenges, OSINT-based research of VEOs’ communication channels should be founded on a set of cyber-security standards commonly adopted by scientists specializing in tracking criminal activity on the internet. Popular solutions include, among others, exploiting ‘burner computers’ (Herpig and Reinhold, 2018: 40). These devices should be used exclusively for a single research project and contain no personal data, active logins, or any other traces that would allow identification of their owners. All computers utilized for open-source intelligence should have the necessary anti-virus and firewall software installed. Advanced cyber-security hardware, such as unified threat management (UTM) devices, is also helpful in preventing infections (Agham, 2016). Research activities should be carried out through a properly configured Linux-based virtual machine (VM), which significantly reduces the probability of serious incidents, as VMs can be easily deleted and cloned in case of a cyber-security breach. Furthermore, all OSINT activities should rely on technical and organizational solutions that ensure the integrity of databases. All collected evidence should be backed up regularly and preferably password protected. Last but not least, OSINT-based projects should make use of a variety of privacy-oriented apps that help maintain a high level of operational security. These means include, for instance, virtual private networks (VPNs), the TOR Browser, and anonymizing browser extensions such as uBlock Origin, HTTPS Everywhere or Privacy Badger (Ramadhani, 2018). A combination of these tools mitigates the risk of leaving sensitive traces on the internet.
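The data-integrity requirement mentioned above can be sketched in a few lines: a hash manifest of all collected evidence, recomputed before and after each backup or restore, reveals any alteration or deletion. This is a minimal illustration of the general practice, not a specific tool prescribed in the literature; directory and function names are hypothetical.

```python
import hashlib
from pathlib import Path

def build_manifest(evidence_dir: str) -> dict:
    """Map each file in the evidence directory to its SHA-256 digest."""
    manifest = {}
    for path in sorted(Path(evidence_dir).rglob("*")):
        if path.is_file():
            relative_name = str(path.relative_to(evidence_dir))
            manifest[relative_name] = hashlib.sha256(path.read_bytes()).hexdigest()
    return manifest

def verify(evidence_dir: str, manifest: dict) -> list:
    """Return the names of manifest entries whose contents changed or disappeared."""
    current = build_manifest(evidence_dir)
    return [name for name, digest in manifest.items()
            if current.get(name) != digest]
```

In practice, the manifest would be built when evidence is first collected and stored separately from the evidence itself (for example, on the password-protected backup medium), so that `verify` can flag tampering after any incident or restore.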
Data integrity and the malevolent use of research results by third-parties
Many recognized scientists have recently discussed the risk of misuse of research results on political violence in a way that inflicts harm on people or provides support for harmful policies (Kalyvas and Strauss, 2020). There are plenty of ethical guidelines mentioning this challenge. For instance, the European Commission’s Comprehensive strategy on how to minimize research misconduct and the potential misuse of research stresses concerns related to studies that ‘can be reasonably anticipated to provide knowledge, which could be misused for criminal, terrorist or unethical military purposes’ (A Comprehensive Strategy, 2010: 7).
This problem also applies to OSINT-based studies of online activities of VEOs, which may lead to uncovering manuals released by these groups. These productions aim to teach their followers a variety of skills necessary to carry out terrorist attacks, such as techniques of car-ramming or bomb-making. These instructions have been frequently conveyed by e-magazines, such as al-Qaeda’s Inspire (Conway et al., 2017). Many manuals have also addressed operational security problems experienced by members of violent extremist organizations. For instance, al-Qaeda-affiliated Kybernetiq magazine has provided its readers with tips and tricks on how to maintain privacy and anonymity on the internet (Barone, 2019).
Studying this kind of harmful content usually requires extracting it to the researcher’s database, which effectively creates ethical responsibility. In the case of data leaks resulting from the previously mentioned cyber-incidents, there is a risk that these materials will be misused. In other words, academics would bear partial liability for giving third parties access to instructions designed to facilitate terrorist attacks. To mitigate this risk, following the EU’s guidelines (A Comprehensive Strategy, 2010: 7), the cyber-security standards discussed above should be in place, especially the encryption of all collected evidence. Furthermore, access to the digital infrastructure used to store these instructions should be restricted exclusively to authorized personnel. Finally, all copies of downloaded manuals should be removed once all scientific objectives have been completed.
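On POSIX systems, the two organizational requirements above, restricting access to authorized personnel and removing copies once the scientific objectives are met, have simple technical counterparts, sketched below under the assumption that evidence sits in a local directory. Function and directory names are hypothetical, and plain deletion stands in for whatever disposal procedure an institution’s ethics body mandates.

```python
import os
import shutil
from pathlib import Path

def restrict_access(evidence_dir: str) -> None:
    """Limit the evidence directory and its contents to the owning account."""
    os.chmod(evidence_dir, 0o700)  # owner only: read/write/enter
    for path in Path(evidence_dir).rglob("*"):
        path.chmod(0o700 if path.is_dir() else 0o600)

def dispose_of_copies(evidence_dir: str) -> None:
    """Delete all downloaded manuals once the research objectives are met.

    Plain deletion is shown here; genuinely sensitive material may
    additionally warrant secure-erasure tooling or full-disk encryption.
    """
    shutil.rmtree(evidence_dir)
```

File-system permissions are only one layer: they complement, rather than replace, the encryption of stored evidence and the organizational access rules discussed above.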
Conclusions
There is no doubt that online terrorist propaganda research has not paid enough attention to ethics in recent years. Addressing all gaps in this regard poses a significant challenge for academia, as it requires reviewing the variety of methodologies utilized in this sub-field. Thus, the necessity of scientific dialogue, noted by Conway (2021: 380), is more than evident. This article constitutes an attempt to contribute to this debate. It should be treated merely as an introduction to complex issues related to OSINT-based research and not as a set of ultimate rules. Nevertheless, this study supports the conclusion that three general principles should be followed in order to address most of the ethical dilemmas that may arise when exploiting open-source intelligence to explore the online activities of VEOs.
To begin with, OSINT-based research projects focused on terrorist information ecosystems have to adopt the same high technical standards that have been applied in cyber-crime research for years (Hussien, 2021). This means that significant attention should be paid to measures mitigating risks to digital infrastructure security and the integrity of collected data (Mariuta, 2014). Likewise, these studies should make use of anonymizing techniques that conceal the identity and whereabouts of researchers.
Secondly, proper research data management policies should be introduced. Aside from the cyber-security standards mentioned above, organizational measures must be in place to prevent unauthorized access to databases. Moreover, an information-sharing mechanism with key stakeholders should be established. However, this must be carried out deliberately, considering both moral and ethical dimensions of research (Grossman and Gerrand, 2021), and preferably in close coordination with proper ethics bodies. This is because OSINT may uncover both data that need to be reported to the relevant authorities as soon as possible and data that should be anonymized or disregarded to follow the primum non nocere rule (Taylor and Horgan, 2021).
Last but not least, even though open-source intelligence is, by definition, legal, researchers exploiting these means should take all necessary precautions to reduce the risk of violating the law. Reaching this objective may, however, encounter cross-jurisdictional problems, as counterterrorism laws or personal data protection acts adopted by different states are frequently incompatible. This means that scholarly activities considered legal in one state may prove criminal elsewhere. These problems have been in evidence for years, even in relations between the European Union and the United States (Conway, 2007b: 25–27). The potential conflict of laws is especially important in multi-stakeholder or transnational projects, which are carried out in multiple jurisdictions simultaneously. Finding a common denominator that takes into account all relevant regulations seems to be one of the first steps to be taken before launching such projects. Scholars must pay particular attention to understanding existing legal regulations and to designing methodologies that lawfully avoid these challenges.
Footnotes
Funding
The research activities used for this article were co-financed by funds granted under the Research Excellence Initiative of the University of Silesia in Katowice.
