Abstract
This study adopts a multiple-case study design to address the question ‘Does copyright law protect automated news, and if so, how?’ in three jurisdictions: the United States, the European Union and China. Through doctrinal legal analysis of the copyright laws and document analysis of policy reports, corporate responses and other empirical evidence, this study finds that the three copyright regimes differ substantively with regard to both formal texts and informal enforcement of copyright claims to artificial intelligence (AI)-generated news. In the United States, there has been policy silence. In the European Union (EU), eager regulators have rushed to enact premature laws, producing a failed policy patchwork. In China, the state is instrumentalising both law and journalism to further its own interests. These findings suggest that the current regulatory frameworks in all cases have weakened the institution of copyright, which, in turn, has contributed to the deinstitutionalisation of journalism and the institutionalisation of algorithms.
Keywords
Introduction
With generative artificial intelligence (AI) models such as ChatGPT, DALL-E and LaMDA making headlines around the world, a new round of debates about how technology is poised to disrupt many aspects of media and communication has emerged. Newsrooms are no strangers to AI. The automated generation of journalistic content through software and algorithms, sometimes dubbed automated journalism (Carlson, 2015), has been present in news organisations for decades now (Linden, 2017). Newsrooms worldwide have embraced automation in the hope that technology would help increase speed and scale in reporting, provide additional value to readers by expanding coverage, and free journalists to do more creative and investigative work (Diakopoulos, 2019). Various levels of adoption of automated journalism have been observed in different regions, with the United States, Europe and China taking the lead (Dörr, 2016).
While automated journalism is becoming increasingly pervasive, one puzzle remains: who is the author of this automated news, and who should get paid for it? Automation raises a particular challenge for copyright, a type of intellectual property (IP) that governs authorship and ownership of literary and artistic works, including news. News copyright has always been a complex issue as it involves a balancing act between providing enough protection so that publishers and journalists have the incentives and resources to create quality journalism and safeguarding the public’s right to be informed (Siebert, 1932; Slauter, 2019). As automated content generation touches deeply upon human areas of creativity and expression (Latzer et al., 2016), which are essential in determining copyrightability (Klein et al., 2015), automated journalism adds another layer of intricacy to the issue of news copyright (Díaz-Noci, 2020; Kuai et al., 2022; Montal and Reich, 2017; Weeks, 2014).
This study on the topic of copyright protection of automated news is situated at the intersection of media, technology and law. It addresses the following three research gaps:
First, while technological development has long fascinated journalism scholars (Steensen, 2011), there is a relative absence of another important perspective in the discussions: law and policy (Helberger et al., 2022; Pickard, 2020). As legal systems and regulatory frameworks exert structuring forces on all aspects of our lives (Bannerman, 2024), and policy instruments and government interventions are inextricably bound up with the future of journalism (Pickard, 2020), this normative perspective deserves more of our scholarly attention.
Second, while there is a growing literature on how AI intersects with IP (Abbott, 2022; Guadamuz, 2017; Yanisky-Ravid, 2017), few have focused on non-fiction works, such as news (Trapova and Mezei, 2022). As Denicola (1980) pointed out, copyright law has historically ‘always dealt more comfortably with the novelist, painter, or composer, than with the historian, reporter, or compiler’ (p. 96). As the legal status of AI-generated works remains unclear in many jurisdictions, it is high time to engage in the debate and make the case for news.
Third, international perspectives and comparative approaches are underrepresented in both communication and media studies, and law and policy studies (Curran and Park, 2000; Gritsenko et al., 2022). This is problematic given that the often taken-for-granted democratic assumptions embedded in much of the social sciences do not always apply in other contexts. As copyright policymaking and AI governance become increasingly internationalised affairs and the forces shaping the digital media landscape transcend borders (Bannerman, 2024), research on the plurality of regulatory regimes and communicative phenomena across the globe becomes all the more necessary.
To address these research gaps, this study adopts a multiple-case study design to investigate the broad question ‘Does copyright law protect automated news, and if so, how’ in three jurisdictions, namely, the United States, European Union (EU) and China. By looking comparatively at the copyright regimes, which consist of regulations, norms, discourse and technology (Katzenbach, 2018), and their respective treatments towards journalism and automation through doctrinal legal analysis and document analysis, this article aims to (a) explore potential mechanisms for regulating automated news, and by extension, protecting journalism and governing AI; (b) reflect on the role of copyright in shaping the digital media landscape; (c) uncover how technology and policymaking affect larger understandings of what journalism is, how it should be and why it matters. In so doing, this study offers insights into different opportunities for news protection, adds to the global debate on algorithms and AI governance (Latzer and Just, 2020) and contributes to the conversation of how AI can help journalism fulfil its democratic aims (Helberger et al., 2022).
The mutual shaping of journalism, algorithms and copyright
Algorithms and automation have been part of journalism for decades now (Anderson, 2013). Many journalistic tasks and functions are being increasingly automated and powered by algorithms, from news gathering to production to distribution (Diakopoulos, 2019). This study engages with ‘automated journalism’, which refers to ‘algorithmic processes that convert data into narrative news texts with limited to no human intervention beyond the initial programming’ (Carlson, 2015: 417). The automated journalism outputs, or ‘automated news’, may fall under the category of ‘AI-assisted outputs’ or ‘computer-generated works’, terms often used in wider academic, industry and policy debates.
To date, a handful of studies have examined the legal challenges brought up by automated journalism. For example, Weeks (2014) discusses how automated journalism interacts with the First Amendment, Section 230 and copyright in the US context. By studying attribution, Montal and Reich (2017) investigate the practical issues of authorship, disclosure policies and legal views on automated journalism. Lewis et al. (2019) examine the potential legal hazards when algorithms produce libellous news content. By dissecting Natural Language Processing (NLP) – the core technology used in automated journalism – and the production process, Trapova and Mezei (2022) argue that copyright law in the EU cannot be extended to automated news. These studies all point to copyright as one area of law in which automated journalism poses particular challenges.
Worldwide, copyright law has the dual function of not only protecting authors and their works to encourage creation but also ensuring the public’s access to information and knowledge (Ananny and Kreiss, 2011). Copyright protects creators’ moral rights through authorship and their economic interests through property rights (Ananny and Kreiss, 2011; Klein et al., 2015). Both news and computer programmes are currently protected under copyright as literary works by international conventions. Some jurisdictions also offer protection, through copyright regulations, for databases and for text and data mining (TDM), both of which are essential for AI development. Automated content production, including automated journalism, touches deeply on key elements of copyright, such as creativity and originality (Guadamuz, 2017), authorship and ownership (Abbott, 2022) and the copyrightability of these works (Yanisky-Ravid, 2017).
The discussion of news copyright has long presented a particularly complicated jurisdictional problem, with a subject matter – ‘news’ – that is notoriously hard to define and considerations for the rights of the public (Siebert, 1932). Legislators and regulators also need to deal with the potential danger of copyright abuse, which can result in either state-controlled propaganda machines or media conglomerates monopolising information flow in different contexts (Picard, 2015; Slauter, 2019). The conversation about protecting news, with copyright considered as a potential mechanism, has gained increased prominence in recent years due to rising platform power, as tech companies have begun to act as gatekeepers to news and to capture a huge share of advertising revenues (Meese and Bannerman, 2022; Nielsen and Ganter, 2022). Historically, legal frameworks and copyright regimes have been tested and revised in response to the arrival of the telegraph, radio and the Internet (Bently, 2012; Slauter, 2019; Tworek, 2015) and have consequently played a shaping role in the development of journalism.
Theoretical framework and research questions
To investigate the complex interplay of journalism, algorithms and copyright, I draw on sociological institutionalism and regard each of them as an institution: a complex social structure constituting both formal and informal networks of rules, practices, roles and norms that create orders to sustain its values, legitimacy and endurance (DiMaggio and Powell, 1983; Just and Latzer, 2022; Katzenbach, 2012; Napoli, 2014; Reese, 2022). In addition, I draw on historical institutionalism to add a critical historical dimension that illuminates the structural and historical stakes at play (Bannerman and Haggart, 2015). In the following section, I summarise the key elements of the three institutions (journalism, algorithms and copyright) through this theoretical lens before introducing my three research questions.
First, the well-established idea of the press as the fourth estate has reinforced the notion of journalism as an institution. However, in the digital era, the institution of journalism no longer appears to be homogeneous or stable, succumbing to both internal and external forces, such as imperatives to innovate, economic pressure, organisational uncertainty, political interests and government intervention (Ananny and Kreiss, 2011; Örnebring and Karlsson, 2022; Picard, 2014). The institution of journalism increasingly needs to take into account various actors within and beyond newsrooms, such as journalists, editors, managers, technologists, businesspeople, audiences, platform companies and even non-human actors like networks and algorithms (Anderson, 2013; Carlson, 2015). The institutional crisis of journalism is exacerbated by the ‘shiny new things’ syndrome of news organisations prematurely embracing technological innovation without articulating its values and establishing shared beliefs (Helberger et al., 2022; Karlsson et al., 2023; Pickard, 2020).
Second, the institutionality of algorithms, manifested in their functionalities and effects, is largely comparable to that of institutions in general and media institutions in particular (Napoli, 2014: 343). More than a series of computational instructions programmed to fulfil a certain task or a ‘black box’ (Gillespie, 2014), algorithms as an institution can control supply and demand in the media (Napoli, 2014), generate and distribute economic wealth (Latzer et al., 2016) and contribute to the co-construction of social reality and social orders (Latzer and Just, 2020). Algorithms’ regulatory potential is famously captured by the saying ‘code is law’ coined by Lessig (2000). But algorithms as an institution are also both an instrument and a result of regulations, shapable by various actors, practices, values and orders in their institutional environment (Katzenbach, 2012; Latzer and Just, 2020). The institution of algorithms interacts particularly with the institution of journalism, as when search algorithms adopt the journalistic logic of relevance (Van Couvering, 2007), or when journalism adopts algorithmic logic in pursuit of metrics (Carlson, 2018), data (Coddington, 2015) and open source (Lewis and Usher, 2013).
Third, copyright as an institution promotes the idea of protection through a balancing act: weighing the mutual advantages of producers and users (of news and of technology) and balancing rights against obligations. From the sociological perspective, copyright as an institution is a norm, value or practice of how and which creative works are perceived and regulated, delimiting what subject matters to protect and what rights creators have (Ananny and Kreiss, 2011; Bannerman and Haggart, 2015). A historical institutionalist would also emphasise the historical roots of how copyright came into being and has been modified, both as a norm and as a more formal institution, such as domestic copyright laws, and international conventions, treaties and governing bodies like the World Intellectual Property Organization (WIPO). An example of the interaction between the institutions of copyright and algorithms is ‘algorithmic copyright enforcement’: the algorithmic detection of infringement and issuing of ‘notice and takedown’ (Perel and Elkin-Koren, 2016).
By mobilising their institutional, material and ideational resources, institutional actors can change the institutional arrangements to align with their values and promote their orders (Bannerman and Haggart, 2015). While some actors with privileged access to resources can gain legitimacy and authority for their ‘institutionalisation’, the weakening of norms and lack of shared understanding can contribute to a ‘deinstitutionalisation’ of others, sometimes conceived of as part of a process of ‘reinstitutionalisation’ (Picard, 2014). The institutions compete and collaborate in a ‘bargaining game’ to further their interests in a continuous process of differentiation and isomorphism, with an open outcome that sometimes comes with unintended or unexpected consequences (DiMaggio and Powell, 1983; Rhodes, 2007).
Granting copyright protection for automated news would mean the institution of copyright supports the institution of journalism by protecting news as a type of literary work, as well as supporting the institution of algorithms by protecting AI-generated works. In reality, things are far more complicated, as the institutions of journalism, algorithms and copyright mutually engage in a ‘game’ of copyright protection of automated news that needs to cater to an array of actors with potentially conflicting values and goals. Therefore, to better understand the tensions within and among the institutions and the potential implications of the structural arrangements, it is important to tease out the key elements at play. Thus, the first two research questions ask:
RQ1: Which actors and practices are (not) considered in the copyright regimes in relation to copyright protection of automated news?
RQ2: What values and institutional orders are (not) promoted in the copyright regimes in relation to copyright protection of automated news?
In addition, from the historical institutionalist perspective, the role of history is also accounted for through ‘path dependency’, whereby past decisions tend to constrain future institutional arrangements (Bannerman and Haggart, 2015). To account for change, institutionalists also introduce the concept of critical junctures: moments of disruption when institutional arrangements are uncertain and change is seriously considered (Capoccia, 2015). In this study, I view proliferating automated content generation, rapidly developing technologies entering uncharted legal and regulatory territories, and the current institutional crisis of journalism as critical junctures. As the institution of algorithms is increasingly established as a player in the ‘game’, to better understand the shifting dynamics, it is important to identify the mechanisms supporting or constraining the institution of algorithms and how it interacts with other institutions in these moments of change. Hence, the third and last research question asks:
RQ3: In relation to copyright protection of automated news, how do current institutional arrangements influence, and how are they influenced by, the institution of algorithms, and to what effect?
Finally, as these arrangements and consequences are situated in a particular political, social, economic and cultural setting, operating under different legal and media systems, and historical traditions of which they are a part, each context is analysed separately before being joined for comparison.
Method and data
This study adopts a cross-national multiple-case study design (Yin, 2018) in combination with legal doctrinal analysis of the laws (Tiller and Cross, 2006) and document analysis of policy reports, legislative proposals, corporate press releases and other empirical evidence (Bowen, 2009). The cases of the United States, EU and China are selected because of their relatively advanced development of automated journalism, and their role as global leaders in developing and deploying AI and automation technologies. In addition, their distinct legal traditions, media systems and different economic, political and cultural environments constitute a ‘most different systems’ research design (Anckar, 2008), which facilitates the analysis of contributing factors to the outcome of the institutional arrangements and sheds light on the similarities, differences, and particularities found in each case. Although the EU is not a nation-state, it has a harmonised regulatory and legal framework towards copyright (and AI), governed by international treaties and EU-level directives and regulations, illustrating ‘a European approach’. Therefore, the federal features of the EU as a political organisation make it comparable to national entities such as the United States and China.
As I approach the copyright regimes as a hybrid of laws, policies and discourses (Katzenbach, 2018), I rely on multiple sources of evidence, which I triangulate on three levels (see Supplemental Material Appendix for a list of documents): (a) primary data, as in copyright laws, their accompanying implementation guidelines, court cases and administrative records of copyright registration related to journalism or automation; (b) contextual materials, as in policy reports issued by the copyright offices, legislative proposals and other policy documents concerning AI and journalism; (c) discursive materials, as in consultation opinions during the legislative process, industry debate and corporate responses. The contextual and discursive materials are considered secondary data that serve to support or challenge information derived from the primary data, with the entire corpus constructed to optimise the validity of the comparison. In addition, documents governing the international framework, such as the Berne Convention, Rome Convention and TRIPS Agreement, were examined to establish the international context.
In terms of data collection, the primary data were retrieved directly from the supervising body of IP/copyright in each jurisdiction through their public records. I further conducted keyword searches in the databases of their respective legislative and administrative bodies, using search terms such as ‘automat*’, ‘algorithm*’, ‘artificial intelligence’, ‘journalis*’, ‘newspaper*’, ‘press’, ‘publisher’ and ‘news’ (search queries were adjusted to each portal), then manually collected and assessed documents for relevance to be added to the corpus. Additional documents were identified by reviewing relevant academic literature. I further collected discursive materials by monitoring professional forums as well as the official websites of key technology providers. For data analysis, following my theoretical framework, I conducted legal doctrinal analysis (Tiller and Cross, 2006) of the primary data to examine the copyright laws’ treatment of journalism and algorithms, and document analysis of the secondary data in three stages: cursory examination, thorough examination, followed by interpretation (Bowen, 2009). Coding was carried out separately for each case before I joined them for comparison and further analysis. All documents were examined in their original languages, English in the cases of the United States and EU, and Chinese in China, both of which I am fluent in.
Findings and analysis
The first research question (RQ1) examines the actors and practices considered in the copyright regimes in relation to automated news. So far, relevant legal theories remain untested in the context of automated journalism in most jurisdictions, except for two legal precedents in China. Hence, the following two sections aim to tease out two layers: first, the copyright regimes’ treatment of journalism, as in protecting news and newsworkers, and second, their treatment of algorithms, as in protecting automated content generation and other relevant algorithmic practices, before discussing the implications for automated news.
‘News’ is protected, but not everyone creating news
Internationally, copyright for news was formalised on a multilateral basis in the Berne Convention, 1 which was signed in 1886 and is still in force today. As member states, the United States, EU and China offer copyright protection for news in their respective copyright law. China was the last among the three to clarify its stance on this issue in its newly revised copyright law. In the Third Amendment to the Chinese Copyright Law 2 that came into effect in June 2021, for the first time, China articulated its copyright protection of news content. The newly revised law says ‘purely factual information’ cannot be copyrighted (rather than the previous wording of ‘news on current affairs’), but representations of those facts, news commentary and other news-adjacent content can, provided they satisfy the originality requirement. This revision is an active response to curb rising news aggregators’ free-riding on content produced by news organisations (Kuai et al., 2023).
China is not alone in dealing with technological challenges brought to news copyright in the digital era, in particular, the ongoing changes in forms of creation, ways of dissemination of works, copyright transaction models and rising platform power. In the United States, while acknowledging the lack of bargaining power of news organisations, the US Copyright Office reached the conclusion that ‘the challenges for press publishers do not appear to be copyright-specific’. 3 The non-interventionist style is underpinned by the ‘negative policy’ approach as exemplified by the First Amendment to the US Constitution (Freedman, 2010), which offers constitutional protections intended to promote free speech. Conversely, the EU takes a proactive approach to news copyright. The European Commission, in recent years, has increasingly asserted itself as a media legislator and actively shaped the digital media landscape with the Digital Services Act, the Digital Markets Act, the EU AI Act, and the proposed European Media Freedom Act. In relation to news copyright, most notably, as part of its Directive on Copyright in the Digital Single Market introduced in 2019, 4 the EU granted press publishers a new, exclusive right to authorise the reproduction and communication to the public of content they publish by commercial online services, with exceptions for hyperlinking and private uses.
However, the Directive shows a preference for larger news organisations and news agencies, with Recital (55) emphasising the need to recognise ‘[T]he organisational and financial contribution of publishers’. Indeed, the concept of the ‘press publisher’ neglects individual journalists and non-institutional creators of news. The Directive also exhibits an obsession with the licensing mechanism (Quintais, 2020), which neglects small newsrooms and individual journalists who have little negotiating power in making deals with platform companies. Such favouring of employers and organisational actors has also been found in China and the United States. In China, the latest copyright law was revised to label journalists’ work as ‘special-work-for-hire’, placing their works in the same category as engineering project design, drawings of product design, maps and computer software. In the United States, 5 in most cases, any article written by an employee of a newspaper or magazine as part of their employment is considered a work-made-for-hire, with the publisher having the legal status of author and copyright owner.
Automated news: to protect or not to protect?
In regard to automated content generation, the US copyright regime faces a particular conundrum as it has an explicit ‘Human Authorship Requirement’, 6 and the copyright law only protects ‘the fruits of intellectual labour’ that ‘are founded in the creative powers of the mind’. When a selfie taken by a monkey raised a series of copyright disputes, 7 the court affirmed that animals cannot legally hold copyright, in line with the Compendium of US Copyright Office Practices, 8 into which the case was later written as an example. The US Copyright Office has consistently denied copyright registration for works in which machines are identified as creators. 9 In the EU, in line with the Berne Convention, human authorship emerges prominently from originality in the current EU copyright framework. The Court of Justice of the European Union (CJEU) has created a practice of assessing originality, in which it has repeatedly stated that a work must be the ‘author’s own intellectual creation’, 10 reflecting its anthropocentric focus. The European Patent Office has also denied non-humans’ claims to inventorship, and thereby ownership of the IP. 11
Considering the potential necessary human input for automated journalism that could meet the threshold for originality, if protected, the assignation of authorship could lie between the programmer who develops the AI and the journalist or ‘data entrant’ who makes the necessary arrangements for the creation of such work, akin to the UK approach 12 to computer-generated works. So far, no copyright regulation differentiates the types of algorithms involved in news automation practices. In the case of unstructured data and large language models (LLMs) used for automated journalism, the output could also be treated as a derivative work of the materials the AI programme was exposed to during training. The potential wider applications of generative AI in newsrooms may further complicate the issue, for even the risk-based approach in the EU AI Act may not cover its ‘dynamic context and scale of use’ (Helberger and Diakopoulos, 2023). However, automated journalism service providers themselves are eager to make the differentiation. In addressing the issues of generative AI in journalism, United Robots (n.d.) clarifies that their news-writing bots run on a different model based on structured data where ‘factual correctness is basically guaranteed’.
Turning to China, the country provides a particularly interesting case as it established a legal precedent when a court granted copyright protection to automated journalism output. In a case regarding an article generated by Tencent’s news-writing bot Dreamwriter, 13 the Shenzhen Nanshan court found that the arrangement and selection of the data input, the setting of trigger conditions, and the selection of template and corpus style by the Dreamwriter development team were intellectual activities directly related to the specific expression of the article. As the plaintiff was both the developer and the user of the AI software, the court recognised the development team’s effort and creativity and designated the legal entity employing the team, Tencent, as the copyright owner. In another copyright dispute also involving automated news, a court denied copyright protection for the AI-generated content, emphasising that a human author’s involvement is required for copyrightability. 14 However, the court decided that the investment in automated journalism still deserved some protection and granted compensation to the software user, since the software developer had already been rewarded with payment for the use of the software. Both cases indicate that while upholding the anthropocentric view, the Chinese copyright regime favours technological innovation and is keen on rewarding and encouraging such investments and investors.
In sum, the findings show that news is protected under copyright laws in the United States, EU and China, and possible avenues for automated news protection have been identified. In addition, although copyright laws in all cases uphold the anthropocentric view, copyright enforcement differs in regard to AI-generated content. However, in all cases, the current regulatory frameworks fail to take into consideration the wide range of news content producers, and the plurality of actors and different types of algorithms (with different inputs, throughputs and outputs) possibly involved in automated news production.
Copyright’s balancing game: competing values in protecting automated journalism
In investigating the second research question (RQ2) of the values and orders promoted in relation to copyright protection of automated news, three major sets of potentially conflicting themes emerged in all copyright regimes.
First, in protecting human creative work as a practice, all copyright laws uphold the anthropocentric value of insisting on originality and creativity. This could potentially put automated news in a legal quagmire, since the fact-based nature of news and the journalistic pursuit of factualness may be at odds with being ‘original’ or ‘creative’. As the idea/expression dichotomy affirmed in the TRIPS Agreement 15 mandates, data and facts as such are not protected by copyright law. How much exposition of facts and data qualifies as an ‘expression of facts’ rather than ‘facts’ themselves? How much human input fulfils the requirement for human authorship? How much creativity meets the originality threshold? This originality requirement has been enforced differently in the United States 16 and EU (see Note 10), with China only formalising such a requirement in its latest amendment to copyright law. As regards automated news, copyrightability would need to be evaluated case by case under the current regulatory frameworks.
Second, all copyright regimes are found to favour private ordering and economic rights and, concurrently, to neglect moral rights and much of human agency. In the EU, the Copyright Directive indicates an incentive-based utilitarian theory and a focus on monetary gain, with ‘investment’ and ‘financial contribution’ cited as main motivations for the legislation. The press publishers’ right (see Note 4) could undermine some authors’ intention to share with platforms, leave small newsrooms and individual journalists with even less control of their content, negatively impact their ability to reach audiences and develop online, and hurt individual users’ access to information and freedom of expression (Quintais, 2020). In the United States, the copyright system follows the common law tradition of allowing legal persons to be considered authors and favours producers of films or employers of journalists as the copyright owner. In China, copyright law has, from its inception, been a hybrid of civil law and common law traditions and principles, but both cases discussed above cited monetary investment as the key consideration for compensation. In the case of automated journalism, as AI programmes are already protected by copyright law as literary works, the AI developer may not care who owns the automated news output. But the news organisations that use the programmes do. Developing such AI programmes in-house may also not be economically viable for many newsrooms, especially small and local ones. This economic-centric value also ties to the profit motive operating within and in opposition to journalism as a public good. In addition, downplaying moral rights is at odds with the journalistic norm of transparency. An AI bot may not care whether it is attributed as the author, but a news article without a byline or proper attribution may erode trust in journalism.
Third, all copyright regimes favour technological innovation. In China, the newly revised law has taken an inclusive approach and states that ‘works’ also include ‘other intellectual achievements conforming to the characteristics of the works’. Such an all-purpose miscellaneous provision is designed to cope with unforeseen developments, such as the emergence of a new type of work, including AI-generated works. This revision is in line with China’s stated goal to become a global AI superpower (Chinese State Council, 2017), for which setting up the legal infrastructure to encourage such innovation is crucial. In the EU, the Copyright Directive provides copyright exceptions for text and data mining (TDM), which could be beneficial for the development of AI. In the United States, the Digital Millennium Copyright Act was the first to offer exemptions from direct and indirect liability for Internet service providers and other intermediaries. While copyright, according to the US Constitution (see Note 17), is essential ‘[T]o promote the progress of science and useful arts’, it is interesting to note that the original discussion of whether news is copyrightable, dating back to the 1820s, also related to a debate over whether news qualifies as ‘science’ (Tworek, 2015: 204–205). The power imbalance between news publishers and tech companies has been further exacerbated by Section 230 of the Communications Act of 1934. All of this has put copyright’s dual role in protecting journalism and promoting technological innovation to a tougher test.
Institution of algorithms: generating assets, manipulating rules and trying to be a norm-setter
In analysing the third research question (RQ3) of how the institutional arrangements influence and are being influenced by the institution of algorithms, and to what effect, the study has found that in all cases, the current regulatory setups have limited governing effects, and the institution of algorithms is strengthening over time across the board.
In the EU, Google claimed, in compliance with Article 15 of the Copyright Directive, that it had signed licensing agreements with over 2600 publications in Europe by mid-October 2023 (Connal, 2023) to compensate publishers by offering payments for longer previews of their contents. Unfortunately, detailed information about platform companies’ agreements with press publishers is typically not publicly available. In addition, the harmonisation of legal protection for press publishers remains challenging, exemplified by the cases of Germany and Spain (Colangelo and Torti, 2019). In China, news aggregators, led by tech companies such as ByteDance, which is also the parent company of TikTok, used the policy vacuum to grow tremendously. They have since been regulated, however, and have formalised partnerships with news organisations to share advertising revenues. They have also been instrumentalised by the state to promote party propaganda (Kuai et al., 2023). In addition, it is interesting to note that Tencent has an impressive winning record at the Shenzhen Nanshan Court, which ruled in favour of the tech company in the automated news copyright case as well. Chinese netizens have nicknamed Tencent ‘Nanshan Pizza Hut’, as Pizza Hut’s Chinese name literally means ‘undefeated man’; the nickname also insinuates the close relationship between the company and the state (Fu, 2021).
In the United States, amid media policy silence (Freedman, 2010), capricious tech companies are calling the shots. For example, Google (2023c) formerly advised against ‘automatically generated content’ but revised its stance in April 2022 to object only to content ‘intended to manipulate search rankings’. Furthermore, in February 2023, Google (2023b) acknowledged that ‘automation can create helpful content’ and that its search ranking would reward ‘high-quality content’, ‘however it is produced’. This document was released the same week Google launched its own AI chatbot, Bard.
The tech companies that aim to ‘advance AI for everyone’ are also promoting notions such as ‘transparency’, a journalistic value that is sometimes disregarded by news organisations themselves. Tech news outlet CNET was caught quietly using AI to write articles that were later found to contain errors (Sato and Vincent, 2023). In this regard, it behaved similarly to content farms that create ‘news’ just to game search algorithms and monetise traffic. Nevertheless, norms of disclosure are forming. Google (2023a) recommends that ‘AI or automation disclosures are useful for content where someone might think “How was this created?”’. United Robots (n.d.) also recommended transparency measures whereby all AI-written articles carry a byline that ‘makes it unequivocally clear that it was written by a robot’. Such disclosure practice has also been recommended by OpenAI, a formerly non-profit, now ‘capped-profit’ company backed by Microsoft. Following the launch of ChatGPT, OpenAI published its research on how to mitigate threats of AI-generated content and proposed possible solutions such as ‘[G]overnments impose restrictions on data collection’ and ‘[P]latforms and AI providers coordinate to identify AI content’ (Goldstein et al., 2023). In addition, the growing prevalence of algorithms is also fuelled by the intense competition among technology developers. On 14 March 2023, Google launched an API for its large language model (LLM) PaLM (Huffman and Woodward, 2023). Only a few hours later, OpenAI (2023) released GPT-4, this time disclosing nothing about its training set, citing ‘the competitive landscape and the safety implications’ (p. 2) as reasons not to practise transparency.
In sum, the current copyright regimes have facilitated algorithms becoming asset-generating and -distributing devices, but only in the hands of those with privileged access to institutional, material and ideational resources (Bannerman and Haggart, 2015), with the viability of journalism remaining in question. In addition, while humans seem prominent in the discourses, all copyright regimes perceive them as passive recipients or aggregated commodities without much consideration for their agency. Taken together, equipped with technological infrastructure, monetary resources, lobbying power and intellectual powerhouses, and gaining legitimacy and authority by claiming generally accepted values and norms, the institution of algorithms is turbocharged in its rise as an increasingly prominent player in the ‘game’.
Concluding discussion
The comparative analysis of copyright regimes in relation to automated journalism has shown how different institutional arrangements in different contexts have resulted in a weakening institution of copyright, which has contributed to the deinstitutionalisation of journalism and the institutionalisation of algorithms. In the United States, without recognition of the copyrightability of non-human entities’ work and without a coherent conceptual structure to follow, the US copyright regime remains a battleground for profit-seeking media entities competing to monetise automated journalism. At the same time, policy silence (Freedman, 2010) could exacerbate the existing power imbalance between big and small news organisations and between news organisations and technology providers (Bannerman, 2024). In the EU, the anthropocentric view on authorship prevails, and only more sophisticated automated journalism involving more human creative choices would trigger a valid copyright claim. The Copyright Directive, as a flawed piece of legislation, could potentially hurt small newsrooms, individual journalists and users. But the implications of the EU AI Act, among other regulations, await further observation (Helberger and Diakopoulos, 2023). In China, by separating authorship and ownership, the recently revised Chinese copyright law has found a way to extend copyright protection to automated journalism in order to encourage AI innovation, but it favours investors and resource-rich tech companies over journalism. In addition, the state is increasingly assertive in aligning both journalism and technologies with its own goals in a bid to consolidate the ruling government’s control and further its national interest (Fu, 2021).
The study of the challenge brought by automated journalism to news protection has shown that technology is deeply implicated in defining the subject matter of copyright, in this case, news. The findings have demonstrated that algorithms’ challenge to news copyright is just the latest manifestation of the long-lasting debate on protecting news (Picard, 2015; Tworek, 2015). Under a weakening institution of copyright, algorithms as an institution serve as a catalyst for the deinstitutionalisation of journalism by widening the power imbalance between journalism on the one hand, and the state and/or tech companies on the other, as the laws prioritise and reinforce the position of the state, investors and the tech industry as key actors in the institutional arrangements, to the detriment of journalism. The institutionalisation of algorithms has been accelerated by some newsrooms’ prematurely eager embrace of innovation without much consideration of the legal preconditions, an underarticulation of journalistic norms and a lack of shared understanding of the roles and functions of journalism. Algorithms as an institution are strengthening their wealth-generating and -distributing capacities and constructing their legitimacy into social reality and orders by attempting to set norms and championing generally accepted values such as transparency (Latzer et al., 2016; Latzer and Just, 2020). In all cases, journalism’s long-term autonomy is under threat. This further illustrates that the current journalism crisis is not just a technological problem but a business problem and, even more so, a policy problem (Picard, 2014; Pickard, 2020).
This study makes four contributions. First, by dissecting the institutional orders regulating journalistic innovation, I brought law and policy to the fore and showed copyright’s structuring power and how legal preconditions could impact the development of journalism and AI. Copyright policymaking not only affects dynamics within newsrooms but also has implications for the distribution of power in the whole media system (Bannerman, 2024). It is also a battleground on which different normative ideals of AI are imagined and can inform, create or constrain the conditions for technological innovation. The findings suggest more caution be exercised regarding the rhetoric of copyright, and that it should not be regarded as an incontestable God-given right (Klein et al., 2015). What copyright law says and how copyright operates is often predominantly decided by large corporations, the state, or whoever has the upper hand among the institutions. The analysis points to the need for a broader regulatory imagination, and for more inclusiveness and prudence in policymaking and legislation. Second, the context-bound and comparative approach in an international setting highlights how context matters for inter-institutional negotiations (Rhodes, 2007) and appreciates the plurality of communicative phenomena across the globe. Such context-aware analysis points to the need for all actors to act together if all members of society are to share the benefits of AI and no one is to be left behind. It also highlights the need for all actors to be more articulate and explicit about what journalism can and should be in order to create a clear shared understanding of journalism, cutting across institutional borders (Karlsson et al., 2023), for the liberal democratic role of journalism cannot be taken for granted. Third, the interdisciplinarity of the study contributes to the broader debate on algorithms and AI governance.
An institutionalist view of algorithms requires consideration of the legal, social, political, economic and cultural foundations of the values and norms that set the conditions for constructing algorithms as new institutions (Katzenbach, 2012; Latzer and Just, 2020). Taking into account the legitimacy and functionality of algorithms as an institution requires a radical and comprehensive approach to regulating technologies. Establishing effective governance may require reimagining the digital media landscape, as well as the economic and legal operating systems. This is important because all power should be held accountable, whether political, economic, governmental or algorithmic. Finally, the study has confirmed the centrality of humans. Putting humans at the centre not only allows human creativity to unleash its potential but also facilitates establishing a chain of accountability, as there should be no rights without responsibilities. The scientific community could rethink what research to focus on so that AI benefits humanity. Journalists could rethink what human-centric content to produce to better serve the community and inspire trust among audiences. The results also imply that civil society should reflect on its values before embedding them into AI, take more responsibility, actively participate in policy debates, hold technology developers accountable, consider paying for journalism, cultivate media literacy and not take democracy for granted.
Finally, this study has a number of limitations. First, the number of cases could be expanded to include other interesting cases, such as the United Kingdom or Australia, to account for more diversified contexts. Second, I did not examine instances of cross-border information flow, which would be interesting considering the territorial character of copyright, especially with the potential of AI-powered translation to remove language barriers. Finally, policymaking and legislation are dynamic processes, so my assessment can only be tentative. Further research could also address the interplay between copyright and other regulations, such as tax codes, competition law and data protection law, to gain a more fine-grained picture.
Supplemental Material
Supplemental material, sj-xlsx-1-nms-10.1177_14614448241251798 for Unravelling Copyright Dilemma of AI-Generated News and Its Implications for the Institution of Journalism: The Cases of US, EU, and China by Joanne Kuai in New Media & Society
Acknowledgements
I would like to thank Michael Karlsson, Henrik Örnebring, Elizabeth Van Couvering, Rodrigo Zamith, and Edson C Tandoc Jr., who provided helpful comments on different drafts at various stages. I am thankful to the anonymous reviewers for their valuable feedback that helped to improve the article. Additionally, I am grateful for the support from colleagues at Karlstad University, Train Network, ICA Journalism Studies Division, and Communication Law and Policy Division. I would also like to extend my gratitude to the special issue editors and the New Media & Society Editorial Team for their support throughout the publication process.
Funding
This research is supported by the Anne Marie och Gustav Anders Stiftelse för mediaforskning.