Abstract
The rapid progress of artificial intelligence (AI) in generative modeling is marred by widespread misuse. In response, researchers have turned to use-based restrictions—contractual terms prohibiting certain uses—as a “solution” to abuse. While these restrictions can benefit AI governance in API-gated settings, their failings are especially significant for open-source models: not only do they lack any means of enforcement, but they also perpetuate the current proliferation of tokenistic efforts toward ethical AI. This observation echoes a growing literature on ineffectual “AI ethics” initiatives and underscores the need to shift away from this paradigm. This article provides an overview of these drawbacks and argues that researchers should redirect their efforts toward deployable, effective, and theoretically grounded safeguards, such as watermarking and model alignment from human feedback, to effect tangible change in the current climate of AI.
In recent years, a large body of research has developed on generative modeling, leveraging the expressive power of deep neural networks to model complex, high-dimensional data distributions (Ruthotto and Haber, 2021). This allows generative models (GMs) such as GPT-3 (Brown et al., 2020) and Stable Diffusion (Rombach et al., 2022) to create images, text, and other content that appear realistic and plausible to humans. They have seen wide application, ranging from virtual legal assistants (Nguyen, 2023) to automated software development (Dong et al., 2023), and they have demonstrated substantial practical value in medical settings as well (Sandfort et al., 2019). However, abuse and misuse also abound: bots spamming harmful posts (Kilcher, 2022), artificial intelligence (AI) models leaking programmers’ code (Chen et al., 2021; Poritz, 2022), entrants unfairly winning art competitions with AI-generated work (Roose, 2022), and “DeepFakes” of celebrity pornography (Wiggers, 2022). While new regulations such as the EU AI Act are catching up to technological advances, regulation still lags behind the rapid development of AI technology (Wirtz et al., 2020), enabling large-scale, socially harmful misuse by bad actors.
These malpractices have prompted a considerable portion of the AI community to adopt use-based restrictions (Rombach et al., 2022; Touvron et al., 2023a)—legal provisions that prohibit certain GM uses as part of either an open-source license (i.e., a use-restricted license) or terms of service. For instance, OpenAI’s usage policies forbid the generation of malware or adult content with ChatGPT, and repeated violations can lead to suspension or termination of a user’s account. Although such restrictions can be meaningfully enforced against API-gated models, numerous instances of abuse have proven them ineffective for open-source GMs. This is no surprise: they not only lack enforceability but also feed into the growing trend of formalistic action in “responsible AI” today. We thus advocate a fundamental shift toward safeguards that are practical, effective, and theoretically robust, as an initial step in fostering accountability within the open-source community. Finally, the article acknowledges that addressing GM misuse requires not just robust technical measures but a concerted, cross-disciplinary approach that recognizes the limitations of current solutions and fosters collaboration among legal, ethical, and academic fields.
The introduction of use-restricted licenses
The response to this lack of external regulation has been a call from the research community for self-regulation through use-based restrictions. Proponents see this as a responsible step that developers should take to minimize the harm caused by their work. For instance, Stable Diffusion, an open-source text-to-image GM, adopted the OpenRAIL-M License (Rombach and Esser, 2022), prohibiting use cases ranging from disseminating misinformation to exploiting the vulnerabilities of legally protected groups. Ferrandis (2022), a contributing author of the license, argued that these restrictions “might act as a deterrent” for potential bad actors and gave developers “greater control over the use of [GMs],” a first step toward an informed and respectful AI culture. In practice, this means that users must expressly agree to these terms before gaining access, thereby creating a legally binding contract between the licensor (i.e., the developers) and the user. This measure requires minimal effort from researchers and has been adopted by an increasing proportion of new GM releases, as shown in Table 1.
Table 1. An overview of recent generative models (GMs) and the specifications regarding their releases; rows are sorted by release date.
Inadequacies of use-restricted licenses
Current discussions do not adequately address the issues unique to open-source GMs, as distinct from API-gated models, and hence overlook how infeasible it is for developers of open-source GMs to act on abuse. First, once access to a GM has been granted, the user holds its full weights and inference code, enabling large-scale deployment without ever alerting the licensor. Developers of API-gated models, in contrast, can readily identify large-scale abuse from usage statistics (Contractor et al., 2022). This makes abuse of open-source GMs difficult for researchers to identify in the first place. Second, the AI literature has yet to produce a reliable test for determining whether a given piece of content is AI-generated, let alone which model generated it. Recent research points out that successful detection hinges on a number of “favorable conditions” that do not, in general, hold (Gragnaniello et al., 2021: 2). While detection is far more actionable for API-gated models, for example by searching server logs, it remains a challenge for developers of open-source models and creates another hurdle for licensors seeking evidence for legal action. Finally, even when abuse can be identified, the available consequences amount to empty threats. The OpenRAIL license, for instance, proposes ending access as a form of deterrence. Whereas developers of API-gated models can monitor and, if necessary, revoke access upon misuse, this mechanism is impractical in the open-source context: how does one “end access” to a model that is freely available and potentially replicated across countless servers and devices? Together, these three factors create practically insurmountable obstacles for open-source GM researchers attempting to tackle abuse through use-restricted licenses.
The adoption of use-restricted licenses, then, is no more than another nominal nod to “ethical AI,” a seemingly responsible choice that is more about convenience than commitment. For instance, Kilcher (2022), an independent researcher, generated large volumes of racist, sexist, and generally toxic posts on 4chan, an anonymous online discussion forum, using an openly released GM, GPT-J. These posts caused considerable havoc online, with many readers even spreading unfounded conspiracy theories about the AI-written posts. No statement or action against Kilcher has come from EleutherAI, which released GPT-J. While this instance of abuse predates the introduction of use-restricted licenses by BLOOM (Scao et al., 2022), the apathy from researchers exemplifies an insensitivity to abuse that is common in the broader tech culture and has been well documented (Wachter-Boettcher, 2017; Paul, 2023). In addition, many works have identified a lack of ethics training in undergraduate CS-related majors (Oliver and McNeil, 2021; García-Holgado et al., 2021), where the complex socio-ethical issues created by emerging technology, intersecting with race, gender, and class, are overlooked. The insensitivity to GM abuse, coupled with this lack of ethics training, suggests that AI research takes place in an ethically void environment (Munn, 2023). It is then no surprise that researchers opt for the readily available templates of use-restricted licenses, an expedient, albeit superficial, approach to ethical challenges that sidesteps deeper considerations of responsibility in AI development.
The pressing need for an alternative
“Licenses are only useful if they are enforced,” as Contractor et al. (2022: 13) noted. This enforcement gap underscores the urgency of a shift toward practical, effective, and theoretically robust safeguards in open-source AI research. One example is the growing body of research on LLM watermarking, a safety mechanism that embeds patterns in models or their outputs that are indiscernible to humans but statistically detectable (Kirchenbauer et al., 2023; Gu et al., 2022). For instance, Gu et al. (2022) proposed a multi-task learning framework that embeds “backdoors” in LLM weights, which remain detectable even after the model has been finetuned. Such a solution can be integrated immediately into current machine learning pipelines before model release, maximizing its real-world impact without requiring extensive modifications. Supported by theoretical investigation and experimental verification, detections made with this approach can serve as evidence for further action against bad actors (Grinbaum, 2022), addressing a crucial shortcoming of use-restricted licenses. Other alternatives, such as model alignment from human feedback, have also shown promise in aligning AI-generated content with human preferences (Ziegler et al., 2019). For instance, reinforcement learning from human feedback (RLHF) has seen increasing adoption to encourage helpful, truthful, and harmless LLM outputs (Bai et al., 2022; Touvron et al., 2023b), and similar maximum likelihood-based approaches have been developed for text-to-image GMs as well (Lee et al., 2023). After pretraining, these methods can likewise be integrated into the current pipeline as finetuning, though the computational and operational costs may be significant. This approach does not address misuse after it happens but instead raises barriers to the generation of toxic and harmful text in the first place. In light of the clear deficiencies of use-restricted licenses, researchers should transition to these proven, practical safeguards that align with the operational and distributional realities of open-source GMs.
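To make the statistical nature of such detection concrete, the sketch below implements a simplified, Kirchenbauer-style detection test in Python. It is a minimal illustration under assumed parameters: the hash-based vocabulary split, the green-list fraction, and the detection threshold are hypothetical simplifications, not the authors’ reference implementation.

```python
import hashlib

# Minimal sketch of statistical watermark detection in the style of
# Kirchenbauer et al. (2023). During watermarked generation, each step
# pseudo-randomly splits the vocabulary into a "green" and a "red" list
# (seeded by the previous token) and biases sampling toward green tokens.
# Detection then checks whether a text contains implausibly many green
# tokens. All constants below are illustrative assumptions.

GREEN_FRACTION = 0.5  # assumed fraction of the vocabulary marked "green"

def is_green(prev_token: int, token: int) -> bool:
    """Deterministically decide whether `token` falls in the green list
    seeded by `prev_token`, via a hash mapped to [0, 1)."""
    digest = hashlib.sha256(f"{prev_token}:{token}".encode()).digest()
    return int.from_bytes(digest[:8], "big") / 2**64 < GREEN_FRACTION

def detection_z_score(token_ids: list[int]) -> float:
    """One-proportion z-test: deviation of the observed green-token count
    from what GREEN_FRACTION predicts for unwatermarked text."""
    n = len(token_ids) - 1  # number of (previous token, token) pairs
    greens = sum(is_green(p, t) for p, t in zip(token_ids, token_ids[1:]))
    mean = GREEN_FRACTION * n
    var = GREEN_FRACTION * (1 - GREEN_FRACTION) * n
    return (greens - mean) / var ** 0.5

# Usage: scores near 0 are consistent with unwatermarked text, while a
# z-score above roughly 4 would be strong statistical evidence of a watermark.
suspect_ids = [17, 4021, 933, 17, 88, 1534, 9, 733, 6, 1962, 40, 5]
print(f"z = {detection_z_score(suspect_ids):.2f}")
```

In a real deployment, the generator and detector would share the seeding scheme as a key, which is what allows a detection to be statistically attributed to a specific watermarked release rather than to chance.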
While these specific solutions show more promise than use-restricted licenses, their practical limitations must also be taken into account in deployment. For instance, the approach of Gu et al. (2022) is effective only when models operate as black boxes; once text has been generated, tracing its origin becomes challenging, which complicates enforcement against, for example, spreading misinformation with GMs. Although RLHF has demonstrably improved LLM metrics such as harmlessness and toxicity, assessing these outcomes is often difficult and ambiguous (Bai et al., 2022), which calls into question the net impact of such alignment (Ouyang et al., 2022). Further, given the already-released powerful models capable of large-scale disruption (Widder et al., 2022), the challenge lies not just in safeguarding future models but in addressing the potential harm from existing ones. These nuances underscore that technical solutions alone are insufficient. As the landscape of AI evolves, it is imperative that means of regulation remain proactive in adapting to the challenges posed by GMs.
Finally, an effective, well-rounded solution to GM abuse demands a collective, interdisciplinary effort across law, ethics, and academia. On the one hand, as researchers, especially those pioneering safety mechanisms like LLM watermarking, lay the technical groundwork, it is equally crucial to focus on the meaningful integration of high-level principles into datasets, models, and products. Legal experts and ethicists play a pivotal role in identifying the axiomatic principles that dictate the purpose or objective of technical solutions, ensuring that they not only function effectively but also align with societal values and norms (Hagendorff, 2020). On the other hand, we must recognize that such technical fixes adopt one particular, narrow understanding of “responsible AI use” and address only part of the problem (Munn, 2023). How does human bias influence a GM’s inductive bias in novel finetuning methods such as RLHF (Ziegler et al., 2019)? How might current GM uses perpetuate or worsen historical inequalities that intersect with race, gender, and class? These questions, unaddressed by technical solutions, are embedded in highly complex socio-technical systems and require constantly renewed perspectives to “see the new as it emerges” (Rességuier and Rodrigues, 2020: 1). As the intersection of GMs’ capabilities and societal implications grows increasingly complex, a collaborative approach is crucial to aligning technical advances with ethical guidelines and legal frameworks, ensuring that the potential of AI is harnessed responsibly.
Acknowledgements
The authors would like to extend their heartfelt thanks to Dr. Eric M. Bliman for his valuable insights and feedback, which have significantly enhanced the quality of this work.
Declaration of conflicting interests
The authors declare no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
