Abstract
The rapid progress of artificial intelligence (AI) in generative modeling is marred by widespread misuse. In response, researchers have turned to use-based restrictions—contractual terms prohibiting certain uses—as a “solution” to abuse. While these restrictions can benefit AI governance in API-gated settings, their failings are especially significant for open-source models: not only do they lack any means of enforcement, but they also perpetuate the current proliferation of tokenistic efforts toward ethical AI. This observation echoes a growing literature on ineffectual “AI ethics” initiatives and underscores the need to shift away from this paradigm. This article provides an overview of these drawbacks and argues that researchers should redirect their efforts toward deployable, effective, and theoretically grounded safeguards, such as watermarking and model alignment from human feedback, to effect tangible change in the current climate of AI.
In recent years, a large body of research has developed on generative modeling, leveraging the expressive power of deep neural networks to model complex, high-dimensional data distributions (Ruthotto and Haber, 2021). This allows generative models (GMs) such as GPT-3 (Brown et al., 2020) and Stable Diffusion (Rombach et al., 2022) to create images, text, and other content that appear realistic and plausible to humans. They have seen wide application, ranging from virtual legal assistants (Nguyen, 2023) to automated software development (Dong et al., 2023), and they have demonstrated substantial practical value in medical settings as well (Sandfort et al., 2019). However, abuse and misuse also abound: bots spamming harmful posts (Kilcher, 2022), artificial intelligence (AI) models leaking programmers’ code (Chen et al., 2021; Poritz, 2022), entrants unfairly winning art competitions with AI-generated work (Roose, 2022), and “DeepFakes” of celebrity pornography (Wiggers, 2022). While new regulations such as the EU AI Act are catching up to technological advances, regulation still lags behind the rapid development of AI technology (Wirtz et al., 2020), enabling large-scale, socially harmful misuse by bad actors.
These malpractices have prompted a considerable portion of the AI community to adopt use-based restrictions (Rombach et al., 2022; Touvron et al., 2023a)—legal provisions that prohibit certain GM uses as part of either an open-source license (i.e., a use-restricted license) or terms of service. For instance, OpenAI’s usage policies forbid the generation of malware or adult content with ChatGPT, and repeated violations can lead to suspension or termination of a user’s account. Although such restrictions can be meaningfully enforced against API-gated models, numerous instances of abuse have proven them ineffective for open-source GMs. This is no surprise: they not only lack enforceability but also feed into the growing trend of formalistic action in “responsible AI” today. We thus advocate a fundamental shift toward safeguards that are practical, effective, and theoretically robust, as an initial step in fostering accountability within the open-source community. Finally, the article acknowledges that addressing GM misuse requires not just robust technical measures but a concerted, cross-disciplinary approach that recognizes the limitations of current solutions and fosters collaboration among legal, ethical, and academic fields.
The introduction of use-restricted licenses
The response to this lack of external regulation has been a call from the research community for self-regulation through use-based restrictions. Proponents see this as a responsible step that developers should take to minimize the harm caused by their work. For instance, Stable Diffusion, an open-source text-to-image GM, adopted the OpenRAIL-M License (Rombach and Esser, 2022), prohibiting use cases ranging from disseminating misinformation to exploiting the vulnerabilities of legally protected groups. Ferrandis (2022), a contributing author of the license, argued that these restrictions “might act as a deterrent” for potential bad actors and gave developers “greater control over the use of [GMs],” a first step toward an informed and respectful AI culture. In practice, this means that users must expressly agree to these terms before gaining access, thereby creating a legally binding contract between the licensor (i.e., the developers) and the user. This measure requires minimal effort from researchers and has been adopted by an increasing proportion of new GM releases, as shown in Table 1.
Table 1. An overview of recent generative models (GMs) and the specifications regarding their releases; rows are sorted by release date.
Inadequacies of use-restricted licenses
Current discussions do not adequately address the issues unique to open-source GMs, as distinct from API-gated models, and hence overlook how infeasible it is for developers of open-source GMs to act on abuse. First, once access to a GM has been granted, the user holds its full weights and inference code, enabling large-scale deployment without ever alerting the licensor. Developers of API-gated models, in contrast, can readily identify large-scale abuse from usage statistics (Contractor et al., 2022). This makes abuse of open-source GMs difficult for researchers to identify in the first place. Second, the AI literature has yet to produce a reliable test for determining whether a given piece of content is AI-generated, let alone which model generated it. Recent research points out that successful detection hinges on a number of “favorable conditions” that do not, in general, hold (Gragnaniello et al., 2021: 2). While detection is far more actionable for API-gated models, for example by searching server logs, it remains a challenge for developers of open-source models and creates another hurdle for licensors seeking evidence for legal action. Finally, even when abuse can be identified, the available consequences amount to empty threats. The OpenRAIL license, for instance, proposes ending access as a form of deterrence. Whereas developers of API-gated models can monitor and, if necessary, revoke access upon misuse, this mechanism is impractical in the open-source context: how does one “end access” to a model that is freely available and potentially replicated across countless servers and devices? Together, these three factors create practically insurmountable obstacles for open-source GM researchers attempting to tackle abuse through use-restricted licenses.
The adoption of use-restricted licenses, then, is no more than another nominal nod to “ethical AI,” a seemingly responsible choice that is more about convenience than commitment. For instance, Kilcher (2022), an independent researcher, generated large volumes of racist, sexist, and generally toxic posts on 4chan, an anonymous online discussion forum, using an openly released GM, GPT-J. These posts caused considerable havoc online, with many readers even spreading unfounded conspiracy theories about the AI-written posts. No statement or action against Kilcher has come from EleutherAI, which released GPT-J. While this instance of abuse predates the introduction of use-restricted licenses by BLOOM (Scao et al., 2022), the apathy from researchers exemplifies an insensitivity to abuse that is common in the broader tech culture and has been well documented (Wachter-Boettcher, 2017; Paul, 2023). In addition, many works have identified a lack of ethics training in undergraduate CS-related majors (Oliver and McNeil, 2021; García-Holgado et al., 2021), where the complex socio-ethical issues created by emerging technology, intersecting with race, gender, and class, are overlooked. The insensitivity to GM abuse, coupled with this lack of ethics training, suggests that AI research takes place in an ethically void environment (Munn, 2023). It is then no surprise that researchers opt for the readily available templates of use-restricted licenses, an expedient, albeit superficial, approach to ethical challenges that sidesteps deeper considerations of responsibility in AI development.
The pressing need for an alternative
“Licenses are only useful if they are enforced,” as Contractor et al. (2022: 13) noted. This enforcement gap underscores the urgency of a shift toward practical, effective, and theoretically robust safeguards in open-source AI research. One example is the growing body of research on LLM watermarking, a safety mechanism that embeds patterns in models or their outputs that are indiscernible to humans but statistically detectable (Kirchenbauer et al., 2023; Gu et al., 2022). For instance, Gu et al. (2022) proposed a multi-task learning framework that embeds “backdoors” in LLM weights, which remain detectable even after the model has been finetuned. Such a solution can be integrated immediately into current machine learning pipelines before model release, maximizing its real-world impact without requiring extensive modifications. Supported by theoretical investigation and experimental verification, detections made with this approach can serve as evidence for further action against bad actors (Grinbaum, 2022), addressing a crucial shortcoming of use-restricted licenses. Other alternatives, such as model alignment from human feedback, have also shown promise in aligning AI-generated content with human preferences (Ziegler et al., 2019). For instance, reinforcement learning from human feedback (RLHF) has seen increasing adoption to encourage helpful, truthful, and harmless LLM outputs (Bai et al., 2022; Touvron et al., 2023b), and similar maximum likelihood-based approaches have been developed for text-to-image GMs as well (Lee et al., 2023). After pretraining, these methods can likewise be integrated into the current pipeline as finetuning, though the computational and operational costs may be significant. This approach does not address misuse after it happens but instead raises barriers to the generation of toxic and harmful text in the first place. In light of the clear deficiencies of use-restricted licenses, researchers should transition to these proven, practical safeguards that align with the operational and distributional realities of open-source GMs.
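To make the statistical nature of such detection concrete, the sketch below implements a simplified, Kirchenbauer-style detection test in Python. It is a minimal illustration under assumed parameters: the hash-based vocabulary split, the green-list fraction, and the detection threshold are hypothetical simplifications, not the authors’ reference implementation.

```python
import hashlib

# Minimal sketch of statistical watermark detection in the style of
# Kirchenbauer et al. (2023). During watermarked generation, each step
# pseudo-randomly splits the vocabulary into a "green" and a "red" list
# (seeded by the previous token) and biases sampling toward green tokens.
# Detection then checks whether a text contains implausibly many green
# tokens. All constants below are illustrative assumptions.

GREEN_FRACTION = 0.5  # assumed fraction of the vocabulary marked "green"

def is_green(prev_token: int, token: int) -> bool:
    """Deterministically decide whether `token` falls in the green list
    seeded by `prev_token`, via a hash mapped to [0, 1)."""
    digest = hashlib.sha256(f"{prev_token}:{token}".encode()).digest()
    return int.from_bytes(digest[:8], "big") / 2**64 < GREEN_FRACTION

def detection_z_score(token_ids: list[int]) -> float:
    """One-proportion z-test: deviation of the observed green-token count
    from what GREEN_FRACTION predicts for unwatermarked text."""
    n = len(token_ids) - 1  # number of (previous token, token) pairs
    greens = sum(is_green(p, t) for p, t in zip(token_ids, token_ids[1:]))
    mean = GREEN_FRACTION * n
    var = GREEN_FRACTION * (1 - GREEN_FRACTION) * n
    return (greens - mean) / var ** 0.5

# Usage: scores near 0 are consistent with unwatermarked text, while a
# z-score above roughly 4 would be strong statistical evidence of a watermark.
suspect_ids = [17, 4021, 933, 17, 88, 1534, 9, 733, 6, 1962, 40, 5]
print(f"z = {detection_z_score(suspect_ids):.2f}")
```

In a real deployment, the generator and detector would share the seeding scheme as a key, which is what allows a detection to be statistically attributed to a specific watermarked release rather than to chance.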
While these specific solutions show more promise than use-restricted licenses, their practical limitations must also be taken into account in deployment. For instance, the approach of Gu et al. (2022) is effective only when models operate as black boxes; once text has been generated, tracing its origin becomes challenging, which complicates enforcement against, for example, spreading misinformation with GMs. Although RLHF has demonstrably improved LLM metrics such as harmlessness and toxicity, assessing these outcomes is often difficult and ambiguous (Bai et al., 2022), which calls into question the net impact of such alignment (Ouyang et al., 2022). Further, given the already-released powerful models capable of large-scale disruption (Widder et al., 2022), the challenge lies not just in safeguarding future models but in addressing the potential harm from existing ones. These nuances underscore that technical solutions alone are insufficient. As the landscape of AI evolves, it is imperative that means of regulation remain proactive in adapting to the challenges posed by GMs.
Finally, an effective, well-rounded solution to GM abuse demands a collective, interdisciplinary effort across law, ethics, and academia. On the one hand, as researchers, especially those pioneering safety mechanisms like LLM watermarking, lay the technical groundwork, it is equally crucial to focus on the meaningful integration of high-level principles into datasets, models, and products. Legal experts and ethicists play a pivotal role in identifying the axiomatic principles that dictate the purpose or objective of technical solutions, ensuring that they not only function effectively but also align with societal values and norms (Hagendorff, 2020). On the other hand, we must recognize that such technical fixes adopt one particular, narrow understanding of “responsible AI use” and address only part of the problem (Munn, 2023). How does human bias influence a GM’s inductive bias in novel finetuning methods such as RLHF (Ziegler et al., 2019)? How might current GM uses perpetuate or worsen historical inequalities that intersect with race, gender, and class? These questions, unaddressed by technical solutions, are embedded in highly complex socio-technical systems and require constantly renewed perspectives to “see the new as it emerges” (Rességuier and Rodrigues, 2020: 1). As the intersection of GMs’ capabilities and societal implications grows increasingly complex, a collaborative approach is crucial to aligning technical advances with ethical guidelines and legal frameworks, ensuring that the potential of AI is harnessed responsibly.
Acknowledgements
The authors would like to extend their heartfelt thanks to Dr. Eric M. Bliman for his valuable insights and feedback, which have significantly enhanced the quality of this work.
Declaration of conflicting interests
The authors declare no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
