Abstract

There is a need for a conceptual framework and transdisciplinary construction to help understand and instruct automated news production for integrating AI into news routines. Using a conceptual method, this article attempts to explicate the journalistic background of automated news production and proposes a methodology of artificial language philosophy and a therapeutic approach. We argue that the methodology for automated news production should be able to demystify the technological process and clarify the relationship between AI and human journalists. The article introduces linguistics and narratology to decipher the process of automated news production and finally summarizes the typical narratives for automated news production. This article’s ultimate contribution is offering a transdisciplinary perspective to dissect automated news production for further studies.

Keywords

AI news, automation, news production, methodology, applied linguistics, news narrative

Introduction

Artificial Intelligence (AI) is defined by Bringsjord and Govindarajulu (2018) as a system that can think/act rationally or can perform like humans. AI and associated technologies are being applied to news production (Hansen et al., 2017) and impact news routines, at least in some influential news agencies. For instance, the Associated Press looks for ways to deploy AI in daily workings and cooperates with startup companies to infuse AI innovation into news production. The Chinese government is pushing its state news agency Xinhua to integrate AI into its news production to build a high-tech newsroom, based on information technology and featuring human-machine collaboration.

Automated news production is a complex amalgamation of data availability, data analyzing techniques, data-driven critiques, and intelligent automation. Some scholars tend to view the amalgamation from a methodology of over-simplification and antagonism (e.g., Tatalovic, 2018). Technological determinism also complicates the situation by stressing the autonomy of technological change and the technological shaping of society (Galily, 2018).

While AI and the associated technologies are being introduced to news routines, the traditional working methods of journalism tend to be intervened by digital methodology (e.g., Coddington, 2015), and the intervention is inclined to result in the confinement of functionalism and simplism of news and societal practices. Therefore, Lewis and Westlund (2015a) develop a sociotechnical emphasis on interconnections among actors, non-human actants, audiences, and activities to find the balance of AI and humanity in news production. They contend that the relevant research should not overly emphasize human centrality and increase the tension between humans and machines.

Against the backdrop of the coexistence between intelligent machines and journalists in news organizations, automated news production methodology still needs further clarification (Lokot & Diakopoulos, 2016). Normally, there are two paths to follow to understand the methods of adopting AI in news production.

One path is to deploy automation for automation’s sake (Young et al., 2018). It represents the mindset of making meaning via social structures, processes and practices while upholding the myth of interactivity (Domingo, 2008) in data, algorithmic, social, and cultural exchanges. For example, the automated news generation system developed by Leppänen et al. (2017) only reflects the data transmission and journalistic actors/actants interactions in news routines and does not fully consider the societal influences of the automation system.

The second path is for the realization of the anticipated functions of the automated news production systems. Following this path, the deontological obligation of news production is subject to the algorithmic and computational rules, and news authority tends to decrease under some circumstances (Lewis & Westlund, 2015b). To counter this effect, Napoli (2014) underlined the importance of theoretical and industrial institutionalization of the related norms and regulations.

To find an approach to methodologically build an effective path to discern and instruct the automated news production, and to avoid the overemphasis on the myth of automated interactivity in journalism practices, this study proposes a new methodology for automated news production, discusses the application of linguistics in this domain, and summarizes the featured narratives for the news produced by AI and associated technologies, based on the analysis of the latent turns from precision journalism.

Background: The Potential Turn From Precision Journalism

News automation inherits from the precision journalism tradition (Anderson, 2018; Meyer, 1991). Precision news production centers on reporting through the manner of social science and statistics, and gradually forms different methods and genres of news reporting to tell stories with data, such as computer-assisted reporting, data journalism, and computational journalism.

Notwithstanding that the professional values of journalism are still upheld (cf. Splendore, 2016) in news production, automated news production intervenes in the traditional news routines and is apt to be subordinated to data logic. We argue that the dual directions of data transmission and the methodological conversion epitomize the potential turn of automated news production from precision journalism.

The Dual Directions of Data Transmission

The nomenclatures of data-related journalism are interchangeable in a practical sense (Splendore, 2016). To clarify the relevant concepts of different journalism genres, Anderson (2018) proposes the opposite data flows in the process of news production to demarcate the boundary between data journalism and computational journalism.

The direct purpose at the dawn of computational journalism is to “supplement the accountability function of journalism” (Flew et al., 2012) by integrating algorithms, data, and social investigation methods. Computational journalism structures semantic information and extracts syntactical items, such as acting subjects, actions, and objects, from real life. Based on computer-assisted reporting and scientific tools in journalism, it turns stories into databases to manipulate relations and correlations in the world of narration and semantic web (Anderson, 2018). From this perspective, the data constantly flow from realities to computational and scientific tools, and finally to spreadsheets.

On the contrary, data journalism usually holds the belief that news can be represented by structured information that can indicate the attributes of reality. In many cases, data journalism adapts the qualitative sources to the numerical narrative manner (Splendore, 2016). Thus, the data flow is often from the data characteristics of news events to the attributes of social reality.

The difference between computational news production and data news production should not be taken as a contradiction but be treated as a continuity coupled with changes and innovations. News production is inherently mixed with data analysis and visualization techniques (e.g., Splendore, 2016). Journalism is under the strain of bridging the rifts when co-evolving with technologies: professional expertise versus networked information, transparency versus opacity, active versus passive audience, big data versus targeted sampling (Coddington, 2015; Splendore, 2016). The dual directions of data transmission are not categorical but aim to offer a perspective of understanding the tendency of AI and associated technologies applied in the news production of data journalism and computational journalism.

The dialectical negation of opposing data flows manifests the directional duality of data in automated news production. Automated news production integrates various methods of data gathering, processing, and presentation from both computational journalism and data journalism. We put forward this duality as a perspective for effectively making sense of complex data and quantitatively comprehending the complex social reality in automated news production.

Methodological Conversion

In precision journalism, technological tools are designed to cater to human journalists for the making of “sense,” and the tools serve to extend and supplement rather than supplant human journalistic literacy (Flew et al., 2012). The “sense” is usually contextual. For example, local context is significant for concretizing news values (Thomas, 2016). The “context” is usually predicated upon the journalists’ cognitive elaboration, through processes of creation, comprehension, mental modeling, and situated awareness (e.g., Dervin et al., 2003). The methodology revolving around sense-making is widely adopted in data-related news production and is established on the interplay of manual and computational modes of operation.

As AI and associated technologies are being integrated into news routines among many news organizations, the strain from technological and algorithmic institutionalization (Napoli, 2014) is so intense that the confidence of being able to control every step of news production is increasingly baseless since the automation devices for news production are increasingly intelligent and yearning for entire automation and authority.

By automating the institutional routines and downplaying the constructive role of interference from outside of the automation systems, automated news production performs a more radical attitude against sense-making, accompanied by the possibility of self-alienation from actors and resources (such as human resources, over-fitted algorithms, and exclusive data), to a certain extent due to the complex and subtle contextual factors. These factors include public data availability, disorderly software scalability, and many forms of engagements that are progressively interconnected (Flew et al., 2012). It seems that this sense-making methodology (Dervin et al., 2003) is not very adept at explaining the following paradoxes.

The first paradox is that AI enables journalists to generate more news in shortened time while making it difficult for them to constructively participate and promptly understand the specific logic of every decision made by machine learning programs. For instance, automated news production is rebuked for lacking “slow” qualities (Thomas, 2016).

The second one is that AI empowers journalism with more and wider niche audiences while discouraging journalism from regarding news audiences as living beings and social subjectivity. For instance, news audiences are usually imbued with quantitative attributes and numerical manners (Koopman, 2019) in automated news production.

Generally, human journalists do not have to participate in automated news production if they are not eager to do so. Compared to data journalism and computational journalism, which can still be called human journalism or natural intelligence journalism because humans are indispensable for adding humanity and values to news production, news automation by AI is prominent as it can proceed without human intervention. Intelligent machines are pragmatically taken for granted as a way of existence that shows the possibility of being independent of human intervention. AI also shows a competitive edge over human resources in many aspects and has the potential to dispense with human journalists’ complementary and supplementary actions (cf. Flew et al., 2012). Thus, it is vital to underscore the intersubjectivity of news actors (including audiences).

The intersubjectivity draws forth the query into whether subjective intelligence can be bestowed on intelligent machines. The methodology of automated news production needs to admit the intersubjectivity of all news actants (Lewis & Westlund, 2015a). The ideal relationship between human journalists and robot journalists should be “the mode of existence which articulates the ‘self’ with the identity of the other” (Freitas & Benetti, 2017). First, the ideal relationship should recognize machine subjectivity and delimit the subjectivity. Second, it should encourage subjective diversity and uphold humanity. Third, it should attribute not only mentality to others but also the capability of swapping positions with others.

The new relations emerge and represent a methodological conversion of automated news production from the sense-making methodology. The new methodology ought to felicitously deal with the symbols, discourses and languages in both journalism and computational fields, as automated news production stands astride the social and computer-supported systems for dealing with experiences, descriptions, and mimesis of the world (Tomozei & Floria, 2010).

New Methodology for Automated News Production

Artificial Language Philosophy

Automated news production needs a new methodology to normalize its working process and to smoothly integrate intelligent automation systems into news routines. The problems appear to be how to know the very heart of intelligent news machines, and how to design and instruct the production procedures. For the answer, we propose to introduce the theses of artificial language philosophy and genuine phenomenology.

In tandem with the conventionalist view of language in concepts and reality, Lutz (2012) argues that the methodology of artificial language philosophy, that is, developing languages for specific purposes, can clarify and justify the applications of artificial language philosophy. Regarding the methodological usability, the methodology of artificial language philosophy is designed to yield more convenient explanations than the sense-making methodology for automated news production, on account of (1) its connections with the technologies of Natural Language Processing, Named Entity Recognition, Text-To-Speech, Automatic Speech Recognition, and Speech-To-Sign Language, and (2) its concentration on semantics and pragmatics, such as meaning postulates and linguistic intuitions. In this domain, sense-making methodology and artificial language methodology, even though probably neither exhaustive nor mutually exclusive, diverge and complement each other in different contexts and narratives.

Artificial language philosophy exemplifies the dogma that “philosophical problems are best solved or dissolved through the conventional prescription of a new language, not by the analysis of actual language use” (Lutz, 2012), which tactfully circumvents the perplexing real contexts in which sense-making is always intractable, and can focus on the trans-empirical and the logical. For instance (Clark et al., 2010, p. 65):

Every boy loves some girl who admires him

\forall x (boy (x) \to \exists y (girl (y) \land admire (y, x) \land love (x, y)))
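
As a minimal sketch of how such a prescribed logical form can be checked mechanically, the formula can be evaluated over a small finite model. The individuals and relations below are invented for illustration and do not come from Clark et al. (2010); the point is only that truth is settled by the stipulated relations, not by contextual interpretation.

```python
# Toy finite model; every name and relation here is a hypothetical example.
boys = {"tom", "sam"}
girls = {"ann", "eve"}
loves = {("tom", "ann"), ("sam", "eve")}     # (x, y): x loves y
admires = {("ann", "tom"), ("eve", "sam")}   # (y, x): y admires x


def every_boy_loves_an_admirer(boys, girls, loves, admires):
    """Evaluate: forall x (boy(x) -> exists y (girl(y) & admire(y, x) & love(x, y)))."""
    return all(
        any((y, x) in admires and (x, y) in loves for y in girls)
        for x in boys
    )


holds = every_boy_loves_an_admirer(boys, girls, loves, admires)
# Removing one admiration link leaves "sam" without a girl who admires him.
fails = every_boy_loves_an_admirer(boys, girls, loves, admires - {("eve", "sam")})
print(holds, fails)
```

The universal quantifier maps onto `all` and the existential quantifier onto `any`, so the nesting of the code mirrors the nesting of the formula.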

Taking Natural Language Generation as an example (please refer to Figure 1) to explain the application of artificial language methodology, grammars can be inspected as generating sets of strings, and morphological analysis as the “relation between natural language strings (the surface forms of words) and their internal structure (say, as sequences of morphemes)” (Clark et al., 2010), and sentences as derivation trees.

Figure 1.
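
The view of grammars as generators of string sets, and of sentences as derivation trees, can be sketched with a toy context-free grammar. The grammar, its vocabulary, and the function names below are hypothetical and serve only to illustrate the idea; they are not taken from Clark et al. (2010) or from any production system.

```python
from itertools import product

# Hypothetical toy grammar: nonterminal -> list of right-hand sides.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["the", "N"]],
    "VP": [["V", "NP"]],
    "N": [["team"], ["match"]],
    "V": [["won"], ["lost"]],
}


def derive(symbol):
    """Return every derivation tree rooted at `symbol` as nested tuples."""
    if symbol not in GRAMMAR:          # terminal: a surface word
        return [symbol]
    trees = []
    for rhs in GRAMMAR[symbol]:
        # Combine every choice of subtree for each right-hand-side symbol.
        for children in product(*(derive(s) for s in rhs)):
            trees.append((symbol, *children))
    return trees


def surface(tree):
    """Flatten a derivation tree into its surface string."""
    if isinstance(tree, str):
        return tree
    return " ".join(surface(child) for child in tree[1:])


trees = derive("S")
sentences = sorted(surface(t) for t in trees)
print(len(sentences))
print(sentences[0])
```

Because this toy grammar has no recursive rules, the set of generated strings is finite; a recursive grammar would need a depth bound before it could be enumerated in this way.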

Lutz (2012) demonstrates that derivations are based on meaning postulates and internal structures (also known as logic), both of which are invoked on the premise of conventions. As a result, the authenticity of derivations is provided by language convention and empirical research. Without language convention, empirical results can only prove meaning postulates to be useless and inapplicable, rather than false. If automated news production adopts this methodology, it can be a system of linguistic choice and conceptual formation; it will be more applicable to the fuzzy logic and diversity of social contexts, and it will not be restricted by the prejudices and paradigms of the propositional statements made by software developers or news template writers.

Therapeutic Approach

Regarding the construction of meaning postulates and internal structures in automated news writing, cognitive fractures exist between artificial inner logics and worldly contexts. Thus, the abstraction and purification of the news production process from reality might lead to the inhumanity of journalism in the future. To avoid this, an artificial language methodology should assimilate Husserl’s theory, since this is also the question raised by phenomenologists as the “burning question,” namely, “meaning or meaninglessness of human existence” (Husserl, 1970, p. 6).

Derived from quantitative journalism and precision journalism, automated news production is committed to a more rational social order (Meyer, 1991), and is striving for legitimacy by possessing investigative methods and technological attributes (e.g., Young et al., 2018). Once moving forward to a more “scientific” stance, automated news production just transfers legitimacy from social contexts to the scientific domain. This is also where the crisis lurks. As Płotka (2010) asserts, “the success of sciences is accompanied by the naiveté of human attitudes.” To attack this, genuine phenomenology propounds the therapeutic approach and insists that science is therapeutic if it is conducive to philosophically “consider human life as a concrete and individual subjective being” (Płotka, 2010).

Therapeutic science requires the attitude of non-solipsism, the focus on communicative relations between subjects, and the consideration of life as being in statu nascendi (Płotka, 2010). To fulfill these conditions, genuine phenomenology examines the “communal inquiry of time,” since the objective investigation is invariably a “stream of temporalization” (including retention, primal presentation, and protention). In this stream, self-cognition and human freedom are being pursued through the implications of therapeutic science (Płotka, 2010), which claims to be both objective and subjectivity-concerning. In other words, only when science enables humans to see themselves as acting and questioning persons does it become therapeutic.

On the premise that automated news production inherits the scientific attributes of precision journalism and by interpreting phenomenological accounts about therapeutic science as a methodology, automated news production should obey two principles. Firstly, automated news production should keep questioning objectively and introspectively as subjectivity and for the subjectivity because questioning the questioned establishes itself (Płotka, 2010). Nonetheless, in the contemporary reality of news production, phenomenologically objective investigation normally concedes to objective description, which is not therapeutic. Secondly, automated news production should reduce the presupposed recognitions and the so-called scientific positions from both algorithm and organization aspects, because the person who “lives in the natural attitude” is often enslaved by presuppositions. The phenomenological method of reducing the presumption is to open “human being for the world and the community permanently” (Płotka, 2010).

One of the first discussions about the therapeutic approach of journalism can date back to the 1980s when news automation was not so intelligent. Joslyn-Scherer (1980) underlines that being therapeutic in journalism is a journalistic specialty for a particular special-interest group. Therapeutic journalism, according to Joslyn-Scherer (1980), aims to promote the self-esteem of journalists and audiences, enhance community treatment and affiliation, and increase the educational function of journalism, depending on the interactions and relations between news organizations, professional journalists, news consumers, and community representatives.

We contend that the practices of automated news production are directed by the tradition of sense-making methodology, the recent adoption of artificial language philosophy, and the therapeutic approach. In the following part, we will discuss how to define, discern and apprehend the application of linguistics to automated news production and the classification of AI news narratives.

Application of Linguistics to Automated News Production

In the 1950s, Natural Language Processing first touched on statistics and/or Machine Learning. The difficulties for computers to process natural languages have been the ambiguity, polysemy, and inflection of languages ever since (Manning & Schutze, 2000; Pustejovsky & Stubbs, 2013). A solution is to annotate corpora for analyzing and training the algorithms. For example, the Brown corpus was generated in the 1960s and 1970s at Brown University to train algorithms.
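
To make the role of annotation concrete, the following minimal sketch trains a most-frequent-tag baseline on a tiny hand-annotated corpus. The sentences, the tag labels, and the fallback tag are all assumptions chosen for illustration; they are not drawn from the Brown corpus or from the cited works.

```python
from collections import Counter, defaultdict

# Hypothetical hand-annotated corpus: (word, part-of-speech tag) pairs.
ANNOTATED = [
    [("the", "DET"), ("bank", "NOUN"), ("raised", "VERB"), ("rates", "NOUN")],
    [("rates", "NOUN"), ("fell", "VERB")],
    [("they", "PRON"), ("bank", "VERB"), ("online", "ADV")],
    [("the", "DET"), ("bank", "NOUN"), ("closed", "VERB")],
]


def train(corpus):
    """Count tags per word and keep the most frequent tag for each word."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        for word, tag in sentence:
            counts[word][tag] += 1
    return {word: tags.most_common(1)[0][0] for word, tags in counts.items()}


def tag(words, model, default="NOUN"):
    """Tag each word with its most frequent training tag; unseen words get `default`."""
    return [(w, model.get(w, default)) for w in words]


model = train(ANNOTATED)
tagged = tag(["the", "bank", "fell"], model)
print(tagged)
```

The word "bank" appears in the toy corpus both as a noun and as a verb, so the baseline resolves the ambiguity purely by frequency. This is exactly the kind of decision that richer annotation and larger corpora are meant to improve.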

At that time, the linguistics of natural language processing was mainly for descriptive generation rather than creative generation. Moreover, the relevant studies often placed more emphasis on the automatic annotation of corpora. Pustejovsky and Stubbs (2013) attribute these inclinations to the unavailability of adequate data. The data insufficiency induces the introspective modeling of cognitive functions because this model is suitable for building and explaining rules of composing linguistic utterance and discourse under the framework of formal methodology of linguistics.

The formalist models of cognitive linguistics are naturally associated with algorithms because algorithms also depend on rules to perform operations over inputs (Taylor, 2003). However, with the development of big data and deep learning technologies, rules do not seem to be able to be defined beforehand to exhaust all the possibilities in the language processing, partly due to the heavy inflection of certain languages and the limitedness of inventory units (semantic, phonological, and symbolic reflections) for construing knowledge language. This leads to the focus shift of Generative Grammar (Taylor, 2003) from input management (defining rules; e.g., Fodor, 1983) to output control (determining constraints), or at least to the combination of input and output control.

To a certain degree, this shift represents the rising of autonomous linguistics that is subsumed under (in most cases even substitutable with) mainstream generative linguistics, and reflects some features of the Chomskyan enterprise (Taylor, 2003).

The first feature is syntactocentrism. The syntax is comprehended as a computational mechanism that generates and constructs grammatical sentences and lexical materials.

The second feature is modularity. The syntax is often encapsulated as an independent and computational module from phonology and semantics, and especially cannot be explained in semantic terms and by cognitive capacities.

Extensive exposure to data is the third feature. Although data do not have to be massive to act as a quantified corpus for linguistic analysis and language process (Pustejovsky & Stubbs, 2013), the data exposure has to be extended to cover the diverse meanings and the idiosyncratic behaviors.

Based on the assumption that meaning is independent of non-syntax perception, autonomous linguistics often exists as being antagonistic to cognitive linguistics. Autonomous linguistics does not overly emphasize the importance of construals, perspectives, foregrounding, metaphors, and frames (Lee, 2001). The antagonism could lead to the debate on “autonomy.” Comparatively, the “autonomy” for autonomous linguistics (Mathieu, 2006) implies (a) the self-containing of syntax and grammar; (b) the separation between language competence (or language knowledge) and language performance (or language use). For example, language knowledge is not directly derived from and informed by language use.

Apart from the meaning assigned by autonomous linguistics, “autonomy” also means self-structural independence (Geeraerts & Cuyckens, 2010) in the generative grammar, such as the linguistic structures (e.g., syntax and morphology). However, the self-structural independence may not justify the external independence, for example, the independence of the news ecosystem.

To resolve this concern, the debate converges on the one-to-one mapping of four relations: syntax and semantics, form and interpretation, position and function, structure and meaning. From the perspective of cognitive grammar, the mapping is a conventionalized association among semantic structures (Geeraerts & Cuyckens, 2010). From the standpoint of autonomous grammar, by contrast, the mapping is across two linguistic levels of representation (i.e., syntax and semantics), upholds syntactical independence, and falls into the defeasibility of interpreting semantics (e.g., Rizzi, 1990).

If the one-to-one mapping is established, it is plausible to apply data to connect contextual and linguistic factors and finally to automatically generate news, relying on well-annotated corpora for description and generation. The application of linguistics to automated news generation is realized through the pragmatical functions of both cognitive linguistics and autonomous linguistics. Therefore, the key issue turns to how to apply these functions to automate news generation in tandem with the progression of linguistics.

Computational linguistics makes a foray into this field and is divided into fundamentals, methods, and applications. The fundamentals comprise the computational understanding of phonology, morphology, lexicography, syntax, semantics, discourse, pragmatics and dialog, grammar, complexity, etc., in natural languages. The methods comprise maximum entropy models, text segmentation, part-of-speech tagging, parsing, word-sense disambiguation, anaphora resolution, natural language processing/understanding, speech recognition, text-to-speech synthesis, finite-state technology, lexical knowledge acquisition, corpus linguistics, linguistic annotation, etc. The applications consist of natural language generation, machine translation, information retrieval, information extraction, text summarization, discourse processing, question answering, etc.

These methods and applications act as tools for the one-to-one mapping that becomes the logical bedrock of news text generation. Particularly, computational semantics and computational psycholinguistics render the formal analysis of meanings, and computational models of cognitive mechanisms and representations (Clark et al., 2010). As an example, Padó et al. (2009) illustrate a SynSem-Integration structure which consists of the syntax model, the semantic model, and the interpolation (e.g., inserting words into a text or a conversation). The syntax model parses and ranks the probability of inputs. The semantic model ranks the plausibility of the argument structure of the verb. The two rankings are then interpolated into a general ranking to predict a humanly preferred structure (please see Figure 2).

Figure 2.

Architecture of SynSem-Integration model. Reprinted from “A Probabilistic Model of Semantic Plausibility in Sentence Processing,” by Padó et al. (2009). Copyright 2009 by Cognitive Science Society, Inc. Reprinted with permission.
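
The ranking-and-interpolation step described above can be sketched as follows. This is an illustrative simplification that assumes a plain linear interpolation with a hypothetical `weight` parameter; the actual combination scheme and parameter values used by Padó et al. (2009) are not given in this article, and the scores below are invented.

```python
def interpolate(candidates, weight=0.5):
    """Rank analyses by a weighted mix of syntactic and semantic scores.

    `candidates` maps an analysis label to (syntax_probability, semantic_plausibility).
    `weight` is the share given to the syntax score; its value is an assumption.
    """
    scored = {
        label: weight * syn + (1.0 - weight) * sem
        for label, (syn, sem) in candidates.items()
    }
    return sorted(scored, key=scored.get, reverse=True)


# Invented scores for two competing analyses of one ambiguous sentence.
candidates = {
    "analysis A": (0.7, 0.2),   # syntactically likely, semantically implausible
    "analysis B": (0.3, 0.9),   # syntactically less likely, semantically plausible
}
ranking = interpolate(candidates, weight=0.5)
syntax_only = interpolate(candidates, weight=1.0)
print(ranking[0], "|", syntax_only[0])
```

With equal weighting the semantically plausible analysis is ranked first, whereas a syntax-only ranking prefers the other one, which shows how the combined ranking can diverge from either model taken alone.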

Even though the sociotechnical view is introduced, computational linguistics cannot adequately embody the public attribute of journalism in automated news production. If computational linguistics is comprehended and utilized as the applied linguistics to automated news generation, it should adopt the criticism from critical applied linguistics to consider the specificity of journalistic contents.

In terms of the conventional disciplinary boundaries and the classical linguistic perspectives, critical applied linguistics is a transgressive approach of applied linguistics by linking language issues to general social issues (Berns & Matsuda, 2006) that include fake news, information manipulation, strategic silence, low-quality news, journalists’ job security, etc. in the journalistic context. Generally, it is devoted to the analysis and critique of “dominion (the contingent and contextual effects of power), disparity (inequality and the need for access), the difference (engaging with diversity), and desire (understanding how identity and agency are related)” (Pennycook, 2006). From the perspective of critical applied linguistics, automated news production is a type of social interaction that participates in the construction of meaning, knowledge and ideology.

Combining the standpoints of computational linguistics and critical applied linguistics, the production of automated news is an automation process of critically perceiving a mode of organizing knowledge, ideas, or experience that is rooted in language and its concrete contexts (e.g., culture, history, or institutions), by dint of computational semantics, computational pragmatics, narratology, rhetoric, etc., to constitute the subjects as multiple, conflictual, and reflective participants in social activities. In the process, it is a challenge to draw the boundary between the subjects (independent persons) and the objects (the dominated). If AI is empowered to the extent of exceeding human control and taking charge of knowledge production at a speed that outpaces the institutionalization of theories, policies, and regulations, quality journalism is likely to suffer more uncertainties.

Classification of Automated News Narratives

Narratological Perspective

There is a lack of clear distinctions between news narrative, storytelling, narrative structure, story structure, and plot in journalistic practices. Upon the coming of AI, it is more requisite to contour the boundary of automated news narrative in the aspects of theory backgrounds and linguistic expressions, for example, AI news values, AI narrative, AI narration, and AI creativity. On many occasions, the related concepts are used interchangeably by neglecting differences in practices, that is, narrative and storytelling, news literature and narrative news. Stemming from digital and multimedia storytelling, the news storytelling using AI, which we call AI storytelling, also adheres to the classical news structures, that is, inverted pyramid, hourglass, diamond, etc. AI storytelling requires evident querying on the effectiveness of attracting wider audiences’ recognition. For the answer, we attentively introduce narratology to automated news narrative studies.

Narrative news refers to the “non-fictional mediated information that follows the characteristics of stories in terms of structure, characters, and plot” (Emde et al., 2016). Emde et al. (2016) advocate narrative news to enhance news comprehension, particularly for adolescents who are relatively deficient in issue knowledge, through eliciting stronger affective and cognitive involvement. This returns to how to tell news stories and more precisely how to organize the narrative structure of news. Hence, news narrative studies need to adopt narrative theories, that is, narratology that is dedicated to the logic, principles, and practices of narrative representation (Hühn et al., 2014).

Narratology is the ensemble of theories about narratives, narrative texts, spectacles, events, and cultural artifacts for telling stories (Bal, 2009). A narrative text means “a text in which an agent or subject conveys to an addressee (‘tells’ the reader) a story in a particular medium, such as language, imagery, sound, buildings, or a combination thereof” (Bal, 2009). Narratology is an intellectual tool for interpretive description (Bal, 2009) and mediated/mediatized translation (Driessens et al., 2017). Conceptually, narratology should not be simplified as a theoretical machinery “into which one inserts a text at one end and expects an adequate description to roll out at the other” (Bal, 2009).

News is a genre of interpretive description of facts, in which the so-called objectivity is actually “a form of subjectivity in disguise” (Bal, 2009), although continuously emphasizing news values of being objective and balanced. This kind of fact description resonates with the descriptive orientation of narratology. The description is a specialty of focalization delimitated by narratology, and it is a textual (or semiotic) fragment in which features are ascribed to objects. Various labels are attached to the focalization of narratology, such as narrative perspective, narrative situation, narrative viewpoint, and narrative manner (Bal, 2009).

Apart from understanding narrative as a form of description, narratologists also define narrative according to the interrelation of events, narrators, and narratees. For example, Gerald Prince gives the following definition: “Narrative: The recounting […] of one or more real or fictitious EVENTS communicated by one, two, or several (more or less overt) NARRATORS to one, two or several (more or less overt) NARRATEES” (Fludernik, 2009).

For the news narratives of automated news production, the narrative factors (i.e., events, narrators, and narratees) need to adapt to the turn from precision journalism and to the linguistic developments behind news automation.

In an automated news narrative, news events are multifarious sets of pure variants that mark attributes. One example is the 4-tuple event representation {s, v, o, m} defined by Martin et al. (2018), “where v is a verb, s is the subject of the verb, o is the object of the verb, and m is the modifier.” These sets represent news events structurally and are then enriched with newsworthiness factors such as proximity, significance, conflict, timeliness, the unusual, prominence, and visual/aural emphasis.
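
As an illustration only, the event representation described above can be sketched in Python. The class name `NewsEvent`, the `news_values` field, and the use of `None` for an absent object or modifier are assumptions made for this sketch; they are not taken from Martin et al. (2018) or from any production system.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple

# Newsworthiness factors as listed in the text above.
NEWS_VALUES = frozenset({
    "proximity", "significance", "conflict", "timeliness",
    "the unusual", "prominence", "visual/aural emphasis",
})


@dataclass(frozen=True)
class NewsEvent:
    """Hypothetical container for one {s, v, o, m} event."""
    s: str                    # subject of the verb
    v: str                    # verb
    o: Optional[str] = None   # object of the verb (assumption: None if absent)
    m: Optional[str] = None   # modifier (assumption: None if absent)
    news_values: FrozenSet[str] = frozenset()

    def __post_init__(self) -> None:
        # Reject labels that are not among the listed newsworthiness factors.
        unknown = self.news_values - NEWS_VALUES
        if unknown:
            raise ValueError(f"unknown news values: {sorted(unknown)}")

    def as_tuple(self) -> Tuple[str, str, Optional[str], Optional[str]]:
        """Return the bare 4-tuple in the order (s, v, o, m)."""
        return (self.s, self.v, self.o, self.m)


# An invented example event, tagged with two newsworthiness factors.
event = NewsEvent(
    s="council", v="approves", o="budget", m="unanimously",
    news_values=frozenset({"proximity", "timeliness"}),
)
print(event.as_tuple())
```

Keeping the newsworthiness labels separate from the 4-tuple mirrors the description above, in which the structural representation comes first and the news values are added to it.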

Automated news narratives and classical news narratives share the same conceptual bedrock in narratology, including Text or Narrative Text (Bal, 2009), Narrator (Fludernik, 2009), Narratee (Fludernik, 2009), Narration or Narrative Act, Narrative Constitution (Hühn et al., 2014), and Layers (Levels or Tiers) of Narrative (Bal, 2009; Hühn et al., 2014). To understand the core concepts across disciplines and to prepare for the classification of automated news narratives, we summarize the meanings of Text and Narrative Constitution in narratology.

Text in narratology refers to the composition of signs in any medium or semiotic system. The narrative text highlights the finite nature and structuredness of narratives, rather than their linguistic nature or linguistic style (Bal, 2009). The signs of texts include linguistic units, video shots/sequences, audio clips, imagery, painted blots, buildings, etc. A finite assembly of signs can generate a tremendous number of variations in meaning and function (Bal, 2009).

Narrative Constitution in narratology means the composition (specifically the structural model) of narratives as it has emerged in the traditions of formalism and structuralism. It stands for the multi-level structure of the narrative and underlines the transformation from the natural order of the narrated happenings to the artificial arrangement of the narrative (Hühn et al., 2014).

Classical and automated news narratives thus share core narratological nomenclature. These shared concepts might lead to confusion in academia between narratives in which AI is integrated merely as a tool and narratives in which AI acts as a narrator, either alone or in part.

Latar (2018) treats AI as a tool for news narratives and construes automated/AI news narratives as the narratives that use AI and associated technologies for news production. He lists representative forms of storytelling, including social media storytelling, chatbot storytelling, gamified storytelling, content-sharing storytelling, VR storytelling, drone storytelling, storytelling by telepresence robot, and storytelling by software. Such a classification can hardly yield a complete framework of automated narratives, because new technologies are constantly emerging in journalism.

Although his account lacks systematic exposition and narratological consideration, Diakopoulos (2019) dwells more on narratives in which AI acts as a narrator. In general, he integrates news narratives with the subjectivity of AI and hybridizes algorithms, automation, and human journalists. He also touches on localized narrative and personal narrative.

In our view, if automated news can be explained in terms of narrative forms that differ from other kinds of discourse, such as prose and conversation, the relation and interaction between AI, narrator, and narratee should evolve with (1) the nurture of news values, for example, the discourse and meaning construction of newsworthiness (Bednarek & Caple, 2017); (2) the characterization of automated news narrative, for example, multimodality, scalability, and timeliness; (3) the technological development of AI consciousness, for example, balancing autonomy, creativity, and controllability; and (4) the philosophical examination of AI alterity (Freitas & Benetti, 2017; Tomozei & Floria, 2010), for example, balancing humanity and AI subjectivity, since in this sense it is the AI news narrator who observes and evaluates the human experience of the world.

Narrative Genres

In this study, the narrative genres of automated news are deliberately differentiated from narrative artifacts (e.g., narrative-centric, narrative-parallel, narrative-additive) and narrative devices (e.g., metaphors, similes, personification, imagery, hyperbole, alliteration), from both of which various categories of narratives can be derived, such as parallel narrative and literary narrative. Regarding the applicability of news narratives as a consequence of balancing realism literature and fictitious literature in the industrialized press (Underwood, 2008), narrative genres are investigated based on nonfictional logic (Lovato, 2018) and technological availability, while aptly taking in dramatic structures.

The news generated by AI is organized following specific procedures that are defined by self-learning and/or human supervision. From the angle of natural language processing/generation, these procedures are not explicitly demarcated from the algorithms that are capable of writing prose or poetry. To distinguish news generation from the generation of fictional content, the intentionality and norms of news making (Underwood, 2013) are accepted as the criteria for judgment, notwithstanding the vague standards (Underwood, 2013) for epistemologically apprehending the fact, truth, and accuracy of stories.

The genres of automated narratives most discussed in academia are those that are easily organized by converting structured or semi-structured data (Hovy et al., 2013) into narrative constitutions, including structured/semi-structured narrative, interactive narrative, and immersive narrative (e.g., immersive journalism). Nevertheless, the ideal range of narrative genres should also encompass community narrative and restorative narrative in order to put the therapeutic approach into effect.

The structured (or semi-structured) narrative dissects itself into parts to establish its functions. It was the first genre applied to automated news production owing to its technological compatibility with different news ontologies (Zarri, 2009), for example, Narrative Knowledge Representation Language, atomizing news (Jones & Jones, 2019), semantic units of news, and structured journalism (Caswell, 2019). It tends to be formulaic (sometimes so inflexible that it does not require machine learning, e.g., Santos, 2016) and less creative than news consumers anticipate (Melin et al., 2018; cf. Graefe et al., 2018). To increase the creativity of the structured narrative, several methods have been proposed theoretically from different perspectives and disciplines, such as linguistic creativity from a cognitive perspective (Zawada, 2006), the flexibility or inflexibility of news (Jones & Jones, 2019), and the organization of news in an event-driven form (Caswell & Dörr, 2018).
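
The formulaic character of the structured narrative can be illustrated with a minimal template sketch in Python. The record fields (`winner`, `office`, `place`, `winner_votes`, `total_votes`), the template wording, and the function name `render_lead` are hypothetical and serve only to show how structured data can be converted into a fixed narrative form; no cited system is claimed to work this way.

```python
from string import Template

# Hypothetical lead template: every record is rendered into the same sentence shape.
LEAD_TEMPLATE = Template(
    "$winner won the $office race in $place with $share% of the vote."
)


def render_lead(record: dict) -> str:
    """Fill the fixed lead template from one structured election record."""
    # Vote share in percent, rounded to one decimal place.
    share = round(100 * record["winner_votes"] / record["total_votes"], 1)
    return LEAD_TEMPLATE.substitute(
        winner=record["winner"],
        office=record["office"],
        place=record["place"],
        share=share,
    )


# Invented record for illustration; not real election data.
record = {
    "winner": "A. Example",
    "office": "mayoral",
    "place": "Sampletown",
    "winner_votes": 5400,
    "total_votes": 12000,
}
print(render_lead(record))
```

Because every record yields the same sentence shape, such output scales easily but remains formulaic, which is the limitation discussed above.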

The structurization of news also emanates from the interactive narrative (storytelling through interaction between news users and algorithms) and the immersive narrative (storytelling by giving news users experiences of being in the news environment). These two genres of automated narratives structurize and process not only news content information but also interactive and environment information. The intelligence of the interactive and immersive narratives is manifested through recording and adapting narrative constitutions according to the procedural data (generated through the information exchange process), and integrating them into extant structured or semi-structured content data.

To be specific, the interactive narrative is common in storytelling that uses social bots (Lokot & Diakopoulos, 2016) and in transmedia storytelling (Gambarato & Alzamora, 2018), and it is closely related to participatory journalism (Saridou et al., 2018). The interactive narrative accentuates the interaction between different news actors (Landert, 2014) and treats journalistic circulation (Carlson, 2020) as part of the narrative text, together with the content of the news event.

Being immersive means not only consuming news through 360° videos, metaverse technologies, etc., but also constructing real links between news users, news items, and general contexts (Cayla-Irigoyen & Aïmeur, 2010). The immersive narrative is attuned to the experiential stories (Pavlik, 2019) that news users experience immersively (e.g., an immersive narrative crafted via the subjective camera, with the camera’s location and soundtrack collection at the viewer’s eye level) or live imitatively through perceivable forms of sensationalization, visualization, and simulation, including holographic projection, VR/AR/XR, etc.

However, news structurization for automated news narratives reflects artificial language methodology only with respect to developing language for automation. To introduce the therapeutic approach and put it into practice, community narrative and restorative narrative should also be brought to the forefront, particularly for feature stories. The community and restorative narratives can add creativity to automated news narratives, since formulaic template models can be enriched by assimilating human-interest stories (Piazza & Haarman, 2011) and cognitive perspectives (Zawada, 2006). These two narratives help offset the over-emphasis on the timeliness and scalability of automated news production, and they can endow journalism with more humane factors, social responsibilities, and constructive elements (Aitamurto & Varma, 2018), for example, constructive journalism (From & Nørgaard Kristensen, 2018) and conciliatory journalism (Hautakangas & Ahva, 2018). Timeliness and scalability nonetheless make it easier to serve more communities in terms of human resources and sustainable local news business models.

Drawing on AI and associated technologies, news production has more devices to remain connected to local news audiences and engage directly with communities. To some extent, the community narrative is the product of strengthening the connection between news outlets and their audiences (McCollough et al., 2017), through sharing narrative templates within a community setting to pragmatically demonstrate textual politics and reflect power (Stapleton & Wilson, 2017).

The restorative narrative also underscores the social and spatial aspects of news production. The restorative narrative (Dahmen, 2016) tells authentic stories that bring communities together, inspire hope, and reveal healing. It is strength-based, with hard truths that show progression without giving false hope. It sustains inquiries that present universal truths and human connection.

The restorative narrative normally presents the meaningful progress of news events and their longer-term effects on individuals and communities. It gives prominence to recovery, restoration, and resilience in automated news narratives, so as to balance the immediacy of reporting breaking news, the capability of accelerating news production routines, and the sustainability of news reporting, particularly in the wake of traumatic events and systemic dysfunction (Dahmen, 2019).

Conclusions

With the application of AI and associated technologies in newsrooms, normative news routines tend to be reshaped by intelligent devices and new methods of news production. This development not only involves a socio-technical view but also requires conceptual updates.

Automated news production is a trans-domain field that draws on knowledge from different domains, namely linguistics, narratology, computation, statistics, and, last but not least, communication and journalism. Although studies on the computational generation of news or texts date back at least to the 1970s (Glahn, 1970), the field is still in great need of a theoretical framework and systemic construction from the journalistic perspective that takes related disciplines into account. Thus, we elaborate on the linguistic applications and the narrative categories, based on the potential turn of automated news production from precision journalism. This turn is marked by the dual directions of data transmission and the update of news-making methodology in the workflow of news production.

Compared with the methodology of artificial language philosophy, the traditional sense-making methodology appears to be less effective in directing automated news production. To coordinate the relationship between human journalists, news audiences, and AI, the therapeutic approach is proposed as a method for achieving a more rational social order and a more self-reflective subjectivity of human perception in automated news production.

For better understanding and instructing the application of linguistics to automated news production, we introduce autonomous linguistics, computational linguistics, and critical applied linguistics into the study field. Moreover, different automated news narratives are presented from the perspective of narratology, including structured/semi-structured narrative, interactive narrative, and immersive narrative. To be therapeutic for AI news consumers, automated news production also needs to employ community narrative and restorative narrative.

Automated news production emerges from the precision journalism tradition and is examined in studies of computational journalism, data journalism, automated journalism, AI journalism (Zhang & Pérez Tornero, 2023), etc. This article lays out a transdisciplinary framework that attentively integrates journalism, applied linguistics, and narratology. More importantly, this study offers a theoretical lens through which to dissect automated news production as a body of cross-domain knowledge that needs a logically coherent rationale to be built.

Footnotes

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

ORCID iDs

Wei Zhang

Qiusheng Tian

References

Aitamurto & Varma (2018). The constructive role of journalism. Journalism Practice, 12(6), 695–713. https://doi.org/10.1080/17512786.2018.1473041

Anderson, C. W. (2018). Apostles of certainty: Data journalism and the politics of doubt. Oxford Scholarship Online.

Bal (2009). Narratology: Introduction to the theory of narrative. University of Toronto Press.

Bednarek & Caple (2017). The discourse of news values: How news organizations create newsworthiness. Oxford University Press.

Berns & Matsuda (2006). Applied linguistics: Overview and history. In Brown (Ed.), Encyclopedia of language & linguistics (2nd ed., pp. 394–405). Elsevier Science.

Bringsjord & Govindarajulu, N. S. (2018). Artificial intelligence. In E. N. Zalta (Ed.), The Stanford encyclopedia of philosophy (fall 2018 ed., pp. 1–55). Stanford University Library of Congress Catalog Data. Retrieved May 5, 2020, from https://plato.stanford.edu/archives/fall2018/entries/artificial-intelligence

Carlson (2020). Journalistic epistemology and digital news circulation: Infrastructure, circulation practices, and epistemic contests. New Media & Society, 22(2), 230–246. https://doi.org/10.1177/1461444819856921

Caswell (2019). Structured journalism and the semantic units of news. Digital Journalism, 7(8), 1134–1156. https://doi.org/10.1080/21670811.2019.1651665

Caswell & Dörr (2018). Automated Journalism 2.0: Event-driven narratives. Journalism Practice, 12(4), 477–496. https://doi.org/10.1080/17512786.2017.1320773

Cayla-Irigoyen & Aïmeur (2010). Deep distributed news: Ontologies to the rescue of journalism [Conference session]. Canadian Conference on Artificial Intelligence, 2010: Advances in Artificial Intelligence (pp. 344–347). https://link.springer.com/chapter/10.1007/978-3-642-13059-5_43

Clark, Fox, & Lappin (2010). The handbook of computational linguistics and natural language processing. John Wiley & Sons.

Coddington (2015). Clarifying journalism’s quantitative turn. Digital Journalism, 3(3), 331–348. https://doi.org/10.1080/21670811.2014.976400

Dahmen, N. S. (2016). Images of resilience: The case for visual restorative narrative. Visual Communication Quarterly, 23(2), 93–107. https://doi.org/10.1080/15551393.2016.1190620

Dahmen, N. S. (2019). Restorative narrative as contextual journalistic reporting. Newspaper Research Journal, 40(2), 211–221. https://doi.org/10.1177/0739532919849471

Dervin, Foreman-Wernet, & Lauterbach (2003). Sense-making methodology reader: Selected writings of Brenda Dervin. Hampton Press.

Diakopoulos (2019). Automating the news: How algorithms are rewriting the media. Harvard University Press.

Domingo (2008). Interactivity in the daily routines of online newsrooms: Dealing with an uncomfortable myth. Journal of Computer-Mediated Communication, 13(3), 680–704. https://doi.org/10.1111/j.1083-6101.2008.00415.x

Driessens, Bolin, Hepp, & Hjarvard (Eds.). (2017). Dynamics of mediatization: Institutional change and everyday transformations in a digital age. Palgrave Macmillan.

Emde, Klimmt, & Schluetz, D. M. (2016). Does storytelling help adolescents to process the news? Journalism Studies, 17(5), 608–627. https://doi.org/10.1080/1461670X.2015.1006900

Flew, Spurgeon, Daniel, & Swift (2012). The promise of computational journalism. Journalism Practice, 6(2), 157–171. https://doi.org/10.1080/17512786.2011.616655

Fludernik (2009). An introduction to narratology. Routledge.

Fodor, J. A. (1983). The modularity of mind. MIT Press.

Freitas & Benetti (2017). Alterity, otherness and journalism: From phenomenology to narration of modes of existence. Brazilian Journalism Research, 13(2), 2710–2727. https://doi.org/10.25200/bjr.v13n2.2017.989

From & Nørgaard Kristensen (2018). Rethinking constructive journalism by means of service journalism. Journalism Practice, 12(6), 714–729. https://doi.org/10.1080/17512786.2018.1470475

Galily (2018). Artificial intelligence and sports journalism: Is it a sweeping change? Technology in Society, 54, 47–51. https://doi.org/10.1016/j.techsoc.2018.03.001

Gambarato, R. R., & Alzamora, G. C. (Eds.). (2018). Exploring transmedia journalism in the digital age. IGI Global.

Geeraerts & Cuyckens (Eds.). (2010). The Oxford handbook of cognitive linguistics. Oxford University Press.

Glahn, H. R. (1970). Computer-produced worded forecasts. Bulletin of the American Meteorological Society, 51(12), 1126–1131.

Graefe, Haim, Haarmann, & Brosius, H. B. (2018). Readers’ perception of computer-generated news: Credibility, expertise, and readability. Journalism, 19(5), 595–610. https://doi.org/10.1177/1464884916641269

Hansen, Roca-Sales, Keegan, J. M., & King (2017). Artificial intelligence: Practice and implications for journalism (pp. 1–21). Tow Center for Digital Journalism Publications. https://doi.org/10.7916/D8X92PRD

Hautakangas & Ahva (2018). Introducing a new form of socially responsible journalism: Experiences from the Conciliatory Journalism Project. Journalism Practice, 12(6), 730–746. https://doi.org/10.1080/17512786.2018.1470473

Hovy, Navigli, & Ponzetto, S. P. (2013). Collaboratively built semi-structured content and artificial intelligence: The story so far. Artificial Intelligence, 194, 2–27. https://doi.org/10.1016/j.artint.2012.10.002

Hühn, Pier, Schmid, & Schönert (Eds.). (2014). Handbook of narratology. De Gruyter.

Husserl (1970). The crisis of European sciences and transcendental phenomenology. An introduction to phenomenological philosophy. Translated by Carr. Northwestern University Press.

Jones & Jones (2019). Atomising the news: The (in)flexibility of structured journalism. Digital Journalism, 7(8), 1157–1179. https://doi.org/10.1080/21670811.2019.1609372

Joslyn-Scherer (1980). Communication in the human services: A guide to therapeutic journalism. Sage Publications.

Koopman (2019). How we became our data: A genealogy of the informational person. The University of Chicago Press.

Landert (2014). Personalisation in mass media communication. John Benjamins.

Latar, N. L. (2018). Robot journalism: Can human journalism survive? World Scientific Publishing.

Lee (2001). Cognitive linguistics: An introduction. Oxford University Press.

Leppänen, Munezero, Granroth-Wilding, & Toivonen (2017). Data-driven news generation for automated journalism [Conference session]. Proceedings of the 10th International Conference on Natural Language Generation (pp. 188–197). https://doi.org/10.18653/v1/W17-3528

Lewis, S. C., & Westlund (2015a). Actors, actants, audiences, and activities in cross-media news work: A matrix and a research agenda. Digital Journalism, 3(1), 19–37. https://doi.org/10.1080/21670811.2014.927986

Lewis, S. C., & Westlund (2015b). Big data and journalism. Digital Journalism, 3(3), 447–466. https://doi.org/10.1080/21670811.2014.976418

Lokot & Diakopoulos (2016). News bots: Automating news and information dissemination on Twitter. Digital Journalism, 4(6), 1081822. https://doi.org/10.1080/21670811.2015.1081822

Lovato (2018). The transmedia script for nonfictional narratives. In Gambarato & Alzamora (Eds.), Exploring transmedia journalism in the digital age (pp. 235–252). IGI Global. https://doi.org/10.4018/978-1-5225-3781-6.ch014

Lutz (2012). Artificial language philosophy of science. European Journal for Philosophy of Science, 2(2), 181–203. https://doi.org/10.1007/s13194-011-0042-6

Manning, C. D., & Schutze (2000). Foundations of statistical natural language processing. MIT Press.

Martin, Ammanabrolu, Wang, Hancock, Singh, Harrison, & Riedl (2018). Event representations for automated story generation with deep neural nets. The Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18), 32(1), 868–875. https://doi.org/10.1609/aaai.v32i1.11430

Mathieu (2006). Autonomy. In Brown (Ed.), Encyclopedia of language and linguistics (2nd ed., pp. 624–626). Elsevier Science.

McCollough, Crowell, J. K., & Napoli, P. M. (2017). Portrait of the online local news audience. Digital Journalism, 5(1), 100–118. https://doi.org/10.1080/21670811.2016.1152160

Melin, Back, Sodergard, Munezero, M. D., Leppanen, L. J., & Toivonen (2018). No landslide for the human journalist - an empirical study of computer-generated election news in Finland. IEEE Access, 6, 43356–43367. https://doi.org/10.1109/access.2018.2861987

Meyer (1991). The new precision journalism. Indiana University Press.

Napoli, P. M. (2014). Automated media: An institutional theory perspective on algorithmic media production and consumption. Communication Theory: CT: A Journal of the International Communication Association, 24(3), 340–360. https://doi.org/10.1111/comt.12039

Padó, Crocker, M. W., & Keller (2009). A probabilistic model of semantic plausibility in sentence processing. Cognitive Science, 33(5), 794–838. https://doi.org/10.1111/j.1551-6709.2009.01033.x

Pavlik, J. V. (2019). Journalism in the age of virtual reality: How experiential media are transforming news. Columbia University Press.

Pennycook (2006). Critical applied linguistics. In Brown (Ed.), Encyclopedia of language and linguistics (2nd ed., pp. 283–290). Elsevier Science.

Piazza & Haarman (2011). Toward a definition and classification of human interest narratives in television war reporting. Journal of Pragmatics, 43(6), 1540–1549. https://doi.org/10.1016/j.pragma.2010.12.005

Pustejovsky & Stubbs (2013). Natural language annotation for machine learning. O’Reilly Media.

Płotka (2010). Therapeutic potential of transcendental inquiry in the Husserlian philosophy. Santalka, 18(3), 81–91. https://doi.org/10.3846/coactivity.2010.29

Rizzi (1990). Relativized minimality. MIT Press.

Santos, M. C. D. (2016). Automated narratives and journalistic text generation: The lead organization structure translated into code. Brazilian Journalism Research, 12(1), 150–175. https://doi.org/10.25200/bjr.v12n1.2016.921

Saridou, Panagiotidis, Tsipas, & Veglis (2018). Semantic tools for participatory journalism. Journal of Media Critiques, 4(14), 281–294. https://doi.org/10.17349/jmc118221

Splendore (2016). Quantitatively oriented forms of journalism and their epistemology. Sociology Compass, 10(5), 343–352. https://doi.org/10.1111/soc4.12366

Stapleton & Wilson (2017). Telling the story: Meaning making in a community narrative. Journal of Pragmatics, 108, 60–80. https://doi.org/10.1016/j.pragma.2016.11.003

Tatalovic (2018). AI writing bots are about to revolutionise science journalism: We must shape how this is done. Journal of Communication Science, 17(01), E–7. https://doi.org/10.22323/2.17010501

Taylor, J. R. (2003). Linguistic categorization. Oxford University Press.

Thomas, H. M. (2016). Lessening the construction of otherness. Journalism Practice, 10(4), 476–491. https://doi.org/10.1080/17512786.2015.1120164

Tomozei, C. I., & Floria (2010). Questions regarding alterity in social collaborative networks. Broad Research in Artificial Intelligence and Neuroscience, 1(1), 70–75.

Underwood (2008). Journalism and the novel. Cambridge University Press.

Underwood (2013). The undeclared war between journalism and fiction. Palgrave Macmillan.

Young, M. L., Hermida, & Fulda (2018). What makes for great data journalism? A content analysis of data journalism awards finalists 2012–2015. Journalism Practice, 12(1), 115–135. https://doi.org/10.1080/17512786.2016.1270171

Zarri, G. P. (2009). Representation and management of narrative information. Springer-Verlag London Limited.

Zawada (2006). Linguistic creativity from a cognitive perspective. Southern African Linguistics and Applied Language Studies, 24(2), 235–254. https://doi.org/10.2989/16073610609486419

Zhang & Pérez Tornero, J. M. (2023). Introduction to AI journalism: Framework and ontology of the trans-domain field for integrating AI into journalism. Journal of Applied Journalism & Media Studies, 12(3), 333–353. https://doi.org/10.1386/ajms_00063_1

Dissecting Automated News Production From a Transdisciplinary Perspective: Methodology, Linguistic Application, and Narrative Genres
