Effects of Visual Framing in Multimodal Media Environments: A Systematic Review of Studies Between 1979 and 2023

Abstract

Framing is considered an important theoretical perspective for analyzing the influence of media information. Despite the explicit inclusion of visual elements within established frame definitions, visuals have long been marginalized in framing research. Based on a systematic review of 552 articles addressing visual framing, this article closely investigates 72 empirical studies on visual/multimodal framing effects. Drawing on framing as a multidimensional process, we discuss how existing studies have examined the sensory, affective, cognitive, and behavioral dimensions of impact. We explain the rising importance of visual communication and multimodality and conclude with implications for mass communication and visual journalism.

Keywords

visual framing, multimodal framing, framing effects, media effects, systematic review

In the ever-evolving landscape of media communication, framing research has become a critical lens through which the shaping and interpretation of mediated information can be understood (see D’Angelo et al., 2019). Framing, conceptualized here as a multidimensional model of media effects, provides a framework for exploring how media influence perception, cognition, and behavior. The process involves several stages that unfold during exposure to and reception of mediated messages. Recognizing a transformative shift in both the representation of media information and framing research traditions, this systematic literature review focuses on the effects of visual framing in multimodal media environments. Incorporating multimodality, which reflects changes in the way information is presented and consumed, becomes essential to a comprehensive understanding and analysis of framing processes. Having conceptualized framing as a multidimensional model of media effects, our investigation begins by elucidating the significance of multimodality in framing processes, setting the stage for the examination of modality-specific properties of images and visual frames. This foundational step seems critical for addressing the evolving nature of media communication and the challenges posed by the multimodal media landscape.

Motivated by the growing body of research on visual and multimodal framing, our systematic review aims to consolidate and synthesize the scattered state of knowledge in media and communication studies. We approach this research with two central questions:

Research Question 1 (RQ1): What is the current state of visual framing research?

Research Question 2 (RQ2): To what extent have the sensory, affective, cognitive, and behavioral dimensions of the multidimensional framing process been explored in studies of visual framing effects?

In the following sections, we describe our methodological approach and coding scheme and present overarching findings related to visual framing as a field of research. We then systematically categorize and analyze studies of visual framing effects, dissecting findings based on the sensory, affective, cognitive, and behavioral dimensions of multimodal framing. We conclude with a discussion of challenges for future research in mediated communication. Our systematic exploration not only synthesizes existing knowledge but also lays the groundwork for advancing the discourse on visual framing effects in the dynamic realm of multimodal media environments.

Framing as a Multidimensional Process of Media Effects

A frame is defined as “a central organizing idea or story line that provides meaning to an unfolding strip of events” (Gamson & Modigliani, 1987, p. 143), which affords interpretation by establishing connections between interpretable units of visual, textual, and auditory information. According to Entman (1993), “to frame is to select some aspects of a perceived reality and make them more salient in a communicating text” (p. 52). The underlying idea is that in media contexts, certain aspects of reality are selectively emphasized, providing the recipients with interpretative patterns that both support and influence information processing. Frames therefore not only define a problem and identify its cause, but also make moral judgments and suggest possible solutions (Entman, 1993). By framing issues and events in a particular way, the media can influence public opinion in favor of or against certain political, economic, and social trends (de Vreese, 2003).

As a model of media effects, framing operates by encouraging active selection and meaning-making. It encompasses the fundamental principle of simplifying complexity in the process of perception and processing, which is present in nearly all forms of communication (Geise & Baden, 2015). Given this premise, framing can be understood as a multidimensional process consisting of various intertwined subprocesses, including (1) sensory, (2) affective, (3) cognitive, and (4) behavioral dimensions of influence.

Before discussing these dimensions in more detail, it is important to explain why we have chosen to focus on sensory, affective, cognitive, and behavioral dimensions in the study of visual and multimodal framing effects. As outlined below, these dimensions represent key aspects of human response to visual framing in multimodal media settings and provide a nuanced understanding of framing effects. The first, the sensory dimension, addresses how visual stimuli engage sensory modalities, providing insight into immediate, physiological responses and laying the foundation for subsequent cognitive and affective processes. The second, the affective dimension, is concerned with the emotional responses elicited by visual framing. It is crucial because emotions play a central role in shaping attitudes, preferences, and decision-making. The third, the cognitive dimension, examines how visual frames affect cognitive functions, helping to unravel the mental mechanisms that influence the understanding, knowledge construction, and information processing guided by frames. Finally, the behavioral dimension addresses observable actions and reactions that result from visual framing, thus providing a tangible link between framing processes and real-world outcomes that inform practical applications. Taken together, these dimensions provide a comprehensive view of how individuals engage with and respond to visual and multimodal framing. Further systematic exploration of these dimensions will allow a more differentiated assessment of the impact of visual framing on various facets of human response and behavior. It will also enrich our understanding of the complex processes involved and inform practical applications in the development of communication strategies and the design of media messages.

As illustrated in Figure 1, this process begins with (a) the sensory perception of a media frame, which captures the recipient’s attention and directs focus to the multimodal media information, simultaneously initiating the processing phase. This step involves directing visual attention, stimulating associated physiological responses, and selectively perceiving the communication stimuli offered (Geise & Baden, 2015). At the same time, it is accompanied by interrelated cognitive evaluation stages (Seo & Dillard, 2019b), including the spontaneous assessment of the personal relevance of the information provided, which subsequently influences the continuation or interruption of visual attention (Dahmen, 2012; Smith et al., 2021). In addition, an evaluation of the emotional tone, an initial risk assessment, an evaluation of the controllability of the underlying information, and an attribution of responsibility may follow (Kühne & Schemer, 2015). According to appraisal theories (e.g., cognitive appraisal theory), (b) emotions arise from these evaluations (Lazarus, 1991; Scherer, 2005). For example, if the media information is deemed relevant, a frame that emphasizes the perpetrators of a problem may evoke anger, while a frame that focuses on the victims may evoke sadness (Kühne, 2013). As information processing continues, the emotions activated at this stage can then influence subsequent (c) “higher-level” cognitions, which are understood as (elaborated) mental propositions or units of knowledge about an object or topic (Ajzen & Fishbein, 1980), such as the formation of judgments, opinions, or behavioral intentions. These behavioral intentions can then lead to (d) behaviors or changes in behavior, which to some extent are aftereffects of framing.

Figure 1.

Framing as a Multi-Dimensional Process.

It is important to note that the assumption of chronologically sequential framing subprocesses provides only a partial explanation for the complexity of human information processing. These subprocesses not only overlap but also interact (e.g., visual attention and relevance judgments). Nevertheless, distinguishing between them provides analytical value by making us aware of the specific subprocesses and subdimensions that are interwoven in the framing process. In addition, it suggests avenues for empirical research on different framing effects from this process-oriented perspective by providing concepts for methodological operationalization.

Framing in Multimodal Media Environments

Despite its integrative potential, framing has long been studied primarily in textual contexts (D’Angelo et al., 2019; Geise & Baden, 2015). The initial focus on linguistic frames is not self-evident, as many of the established conceptions of framing explicitly include visuals as framing devices. For example, in his early theoretical reflection, Gitlin (1980, p. 7) defined media frames as “persistent patterns of perception, interpretation, and presentation, selection, emphasis, and exclusion by which symbol handlers routinely organize discourse, whether verbal or visual.” Gamson and Modigliani (1987) also explicitly referred to visual images, together with metaphors, exemplars, catchphrases, and depictions, as five symbolic devices in a frame package (see also Borah, 2011; Tankard, 2001; de Vreese, 2005).

Because contemporary media environments are rife with multimodality (Wessler et al., 2016), images were put “back into the frame” (Geise & Baden, 2015), and research has devoted more interest to the visual and multimodal aspects of framing. In many media contexts, visual and textual messages appear together, and recipients receive and process them contemporaneously. Such offerings are multimodal because they are—according to the definition of multimodality—characterized by a “communicative interaction of meaning encoded in different modalities (e.g., sound, image, text)” (Geise & Baden, 2015, p. 4; Kress & Van Leeuwen, 2021). The concept of multimodality not only addresses the multifaceted interplay of information embedded in different modalities, but also acknowledges the fact that each modality offers mode-specific communicative potentials, functions, and limitations (Kress & Van Leeuwen, 2021).

To further elucidate the concept of visual framing in multimodal media contexts and explore its implications, we will use news articles as a representative example of how textual and visual elements work together to construct multimodal media representations. Within a multimodal news article, visual frames (typically conveyed by a press photograph, occasionally by a picture spread or short videos) coalesce with textual frames (typically including the headline, subhead, caption, and article text) to create a distinctive multimodal media unit (e.g., Rodarte et al., 2023; Xu & Löffelholz, 2021). According to framing theory, both the visual and the textual frames in news articles have the capacity to emphasize specific interpretations and evaluations of news events, shaping how recipients perceive and process the information and influencing their understanding, categorization, and evaluation of the reported events.

News images and article text serve as archetypal tools in multimodal framing, allowing news creators to shape their audience’s view of the world (Caple, 2017). Multimodal framing emerges as a particularly appropriate theoretical framework as it allows for the simultaneous consideration of the different elements of visual, textual, and auditory communication (Geise, 2017; Geise & Baden, 2015; Geise & Coleman, 2015). This is important because each modality brings its own specific features to the communication process (Kress & Van Leeuwen, 2021). These attributes need to be recognized in their own right, but also for their interplay with each other (Powell et al., 2015), which potentially creates additional layers of meaning beyond the scope of a single modality. This includes the possibility that visual and textual frames, as autonomous modes of communication, may not only complement and reinforce each other’s messages but also present conflicting interpretations. For example, in a news story that uses both images and text, “the text may convey one set of meanings and the images another” (Kress & Van Leeuwen, 2021, p. 41). Nevertheless, from an information-processing perspective, both the visual and textual “information packages” provide elements that readers must synthesize to decipher the central meaning or framework(s) of a news piece (Geise & Baden, 2015). Visual framing effects thus occur when elements within an image, the image itself, or a collection of images endorse a certain meaning or interpretation of the depicted content, and recipients adopt it (Geise, 2017). From the perspective of perception and emotion psychology as well as visual communication and media effects research, a number of mechanisms come into play that suggest a particular impact potential of visual frames (e.g., Coleman, 2010; Fahmy et al., 2014; Geise & Baden, 2015; Messaris & Abraham, 2001). These mechanisms, resulting from the specific “logic” of the visual mode, are summarized below.

Modality-Specific Properties of Images and Visual Frames

Although images have been neglected in framing research in the past, they form part of the routine of journalism practice and mass communication in which photojournalists and editors select news images that highlight certain points of view, convey a certain interpretation of the news, or imply a moral judgment (Coleman, 2010; Entman, 1993). As Messaris and Abraham (2001) have pointed out, journalists employ various visual framing strategies, such as “choosing one view over another when examining a photograph, cropping and altering the resulting image in different ways, and/or choosing to display one image out of many taken at the same time and place” (p. 217).

Various journalistic framing strategies lead to different representations of visual frames in news media. Following this idea, Rodriguez and Dimitrova (2011) proposed an often-cited framework for understanding the four levels of visual framing. The denotative level refers to the different subjects and objects portrayed in the visuals. The stylistic-semiotic level delves into the conventions involved in visual representation, such as the positioning of the camera shot. The connotative level attends to not only the depiction but also the implicit symbols and cues of an interpretative nature. The ideological level uncovers the intentionality and the latent discourse through visual framing and reflects the ideas and power relationship behind the visual portrayals of perceived realities. Due to modality-specific properties, the processing and reception of visuals take place differently compared with other modalities of communication, particularly texts.

First and foremost, visual communication operates on a different associative logic than the linear, argumentative reasoning of textual communication (Messaris, 2003). While text reading follows a “sequential processing system,” the perception of visual information relies on a “parallel system” that utilizes heuristic information processing (Paivio, 1991). Images, particularly photographs, serve as integral communicative units that present a time-bound, condensed, concrete, and vivid representation of the subject matter. This condensed format efficiently encapsulates a wealth of detailed information in a coherent, easily understood framework that closely resembles the intended representation. As a result, images are perceived holistically, almost automatically, and with less conscious cognitive effort (Rodriguez & Dimitrova, 2011). Visuals are thus, for the most part, quickly processed, easily accessed, and readily understood (Coleman, 2010). Kress and van Leeuwen (2021) attribute this to the fact that visual communication has a distinct semiotic quality in that it encodes and decodes information as tangible representations rather than transforming it into abstract signs. Accordingly, the perception of visual images depends on the recognition of similarity relations between the image and its subject, without the need for prior familiarity with specific representational conventions (Messaris & Abraham, 2001). This indexical quality imbues images with a sense of testimony, enhancing their credibility and establishing their “potential value as evidence” (Messaris, 1998, p. 130). In addition, images command a higher level of attention and salience, resulting in rapid and often prolonged mental anchoring even with peripheral exposure (Messaris, 2003; Paivio, 1991). Images, therefore, are not merely particularly intrusive in low-involvement situations, such as quickly scanning news stories; the heuristic processing of visuals was also shown to bias systematic information processing (Chaiken & Maheswaran, 1994; Tversky & Kahneman, 1974).

Given these modality-specific characteristics, visuals, especially photographs and videos, are more prone to elicit framing effects. Using their associative and heuristic logic, images quickly establish a visual frame that is particularly salient and rarely questioned by viewers. Because visual images carry a sense of authenticity similar to real-life experiences, they can lead viewers to implicitly believe that they are closer to the truth than other modes of communication (Messaris & Abraham, 2001). Visual frames, therefore, have the ability to convey meanings that might meet greater audience resistance if conveyed through words alone (Messaris & Abraham, 2001, p. 215). In these settings, visually suggested patterns of interpretation serve to simplify message comprehension by reducing complexity, inadvertently influencing how recipients interpret and evaluate the information. As a result, visual communication is often considered more “effective” than text-only communication (Kress & van Leeuwen, 2006).

In addition, visuals usually activate higher levels of attention and generate stronger psychological arousal, which is often referred to as the “picture superiority effect” (Childers & Houston, 1984). From this perspective, pictorial information was found to have a superior influence, even when the information is incidental and not mentioned in textual content at all (Gibson & Zillmann, 2000). Research has accordingly shown that visual frames can effectively shape recipients’ interpretation of textual information by triggering specific cognitive patterns (Gibson & Zillmann, 2000), potentially exerting a superior influence on message perception and interpretation (Geise, 2017; Geise & Baden, 2015; Powell et al., 2015). Therefore, in a typical multimodal news article, a strong image paired with text is expected to capture visual attention, evoke emotions, and shape citizens’ perceptions of political issues and individuals (Grabe & Bucy, 2009; Graber, 1996; Powell et al., 2018).

Visual Framing in Multimodal Media Environments: A Research Overview

The scientific discourse on framing has led to an immense body of research over the past decades. Due to the increasing importance of images in multimodal media environments, research in the past two decades has been devoted more intensively to the analysis of visual frames. Rather than treating framing as a “fractured paradigm,” we characterize the state of research as quite dispersed, which is reflected, on the one hand, in a great diversity of phenomena studied as visual frames and, on the other hand, in a variety of disciplinary approaches. Under these circumstances, a systematic literature review helps to gain a renewed overview of the existing research field. Based on the theoretical framework described above, this review can provide further insight into how existing studies have addressed sensory, affective, cognitive, and behavioral dimensions when examining visual framing effects in multimodal media environments.

RQ1: What is the state of visual framing research?

RQ2: To what extent have the (a) sensory, (b) affective, (c) cognitive, and (d) behavioral dimensions of the multidimensional framing process been examined in studies of visual framing effects?

In answering these research questions, our systematic review contributes to the advancement of multimodal framing research by synthesizing and critically analyzing a substantial body of existing studies. By comprehensively examining the available literature, this review provides a holistic understanding of the current state of knowledge in the field and helps to identify specific knowledge gaps in the existing communication literature. In addition, the review provides methodological insights by evaluating common methodological practices and challenges, as well as less frequently used but insightful approaches, thus guiding researchers in designing future studies on multimodal framing.

Method

Literature Search

Adhering to the established systematic review protocol (Moher et al., 2009), two reputable scientific databases, namely EBSCO (Communication and Mass Media Complete) and Web of Science (category: Communication), were chosen to conduct the literature search. Google Scholar was used to verify and supplement the search results with additional articles. The scope of the literature search is communication scholarship in general. There was no pre-defined time frame, as this study intended to collect all relevant publications that were available in the databases. In line with previous studies (Bock, 2020; Brantner et al., 2012) that evaluated visual framing research published prior to 2017, this review employed three specific search terms: “visual framing,” “visual frame,” and “visual framing effect.” The scope of the publications reviewed included journal articles, review articles, and book reviews.

Table 1 shows the results of the literature search on September 1, 2023. The earliest content study addressing visual frames in news media dates back to 1979, while the earliest study dealing with visual framing effects was published in 1980. After removing duplicate records and cross-checking with Google Scholar search results, 552 publications were identified for the systematic review. It should be noted that this systematic review adopts a broad understanding of visual framing. That is, publications were included for review as long as they examined messages from legacy media (i.e., newspapers, magazines, and television), online news sites, social media, movies, advertisements, political campaign materials, books, or spatially immersive media (e.g., virtual reality).

Table 1.

Literature Search Results for Systematic Review.

Key words	EBSCO (Communication and Mass Media Complete)	Web of Science (category: Communication)
Visual framing	180	410
Visual frame	124	411
Visual framing effect	17	59

Note. After removing duplicates and cross-checking with Google Scholar results: N=552.
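The duplicate-removal step can be illustrated with a minimal sketch. The article does not describe how duplicates were identified, so the matching rule used here (normalized title plus publication year) and the three example records are purely hypothetical assumptions for illustration.

```python
import re


def normalize_title(title):
    """Lowercase a title, replace punctuation with spaces, collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", title.lower())
    return " ".join(cleaned.split())


def deduplicate(records):
    """Keep the first record for each (normalized title, year) pair.

    Assumption: two database hits with the same normalized title and year
    refer to the same publication. The review may have used another rule.
    """
    seen = set()
    unique = []
    for record in records:
        key = (normalize_title(record["title"]), record["year"])
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


# Hypothetical hits: the first two are the same article found in both databases.
hits = [
    {"title": "Visual Framing: An Example", "year": 2015, "source": "EBSCO"},
    {"title": "visual framing - an example", "year": 2015, "source": "Web of Science"},
    {"title": "Another Hypothetical Study", "year": 2020, "source": "Web of Science"},
]
unique_hits = deduplicate(hits)
```

A rule of this kind is deliberately conservative: it merges only exact matches after normalization, so near-duplicates with differing titles would still need manual checking.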

Coding Scheme

The codebook consists of formal and content variables. Formal variables include year of publication, authors’ names, country (of the first author’s affiliation), title of publication, source, and type of publication (empirical study, theoretical or methodological essay, or book review).

Content variables include fields of communication research (six subfields, such as communicator research, content research, and media effects or audience research); research methods (including observation, interviews, surveys, discourse analysis, content analysis, eye-tracking, computational methods, and mixed methods); and units of analysis (including text, photograph, cartoon or artistic illustration, data visualization, and multimodal units such as textual-visual news and audiovisual news). Drawing on the abovementioned theoretical differentiation, four variables were specifically developed for understanding how existing studies have empirically examined the multidimensional process in visual/multimodal framing effects. We dichotomously (1/0) coded the variables: sensory, affective, cognitive, and/or behavioral dimensions of framing. Moreover, based on the levels of visual framing (Rodriguez & Dimitrova, 2011), studies were coded dichotomously (1/0) according to the level(s) of visual analysis they address, including denotative, stylistic, connotative, and/or ideological levels.
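As an illustration of the dichotomous coding described above, a coded study could be represented as follows. This is a minimal sketch, not the authors' codebook: the class name, field names, and the example study identifier are hypothetical, and only the four dimensions and four levels are taken from the text.

```python
from dataclasses import dataclass

DIMENSIONS = ("sensory", "affective", "cognitive", "behavioral")
LEVELS = ("denotative", "stylistic", "connotative", "ideological")


@dataclass
class CodedStudy:
    """One coded study; every dimension and level is stored as 1 or 0."""
    study_id: str
    dimensions: dict
    levels: dict


def make_study(study_id, dimensions=(), levels=()):
    """Build a record, coding each named dimension or level as 1 and the rest as 0."""
    unknown = (set(dimensions) - set(DIMENSIONS)) | (set(levels) - set(LEVELS))
    if unknown:
        raise ValueError(f"unknown codes: {sorted(unknown)}")
    return CodedStudy(
        study_id=study_id,
        dimensions={d: int(d in dimensions) for d in DIMENSIONS},
        levels={lv: int(lv in levels) for lv in LEVELS},
    )


# Hypothetical study coded on two dimensions and one level of visual analysis.
example = make_study(
    "hypothetical-001",
    dimensions=("sensory", "cognitive"),
    levels=("denotative",),
)
```

Because the codes are not mutually exclusive, a single study can carry a 1 on several dimensions or levels at once.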

Intercoder Reliability Test

Intercoder reliability was assessed using a random sample of approximately 10% (N=50) of the collected articles. Two independent coders reached very satisfactory results, with all variables ranging from Krippendorff’s α=0.94 to α=0.99. On average, the overall reliability was α=0.97.
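For readers unfamiliar with the coefficient, the following minimal sketch computes Krippendorff’s α for one nominal variable coded by two coders without missing values, using α = 1 − D_o/D_e (observed over expected disagreement). The function name and the ten example codes are hypothetical; the article does not report which software or data produced its values.

```python
from collections import Counter


def krippendorff_alpha_nominal(coder_a, coder_b):
    """Krippendorff's alpha for two coders, nominal data, no missing values."""
    if len(coder_a) != len(coder_b):
        raise ValueError("both coders must rate the same units")
    n_values = 2 * len(coder_a)  # total number of codes assigned
    # Each disagreeing unit contributes two off-diagonal coincidences.
    observed = 2 * sum(1 for a, b in zip(coder_a, coder_b) if a != b)
    # Pooled frequency of every category across both coders.
    totals = Counter(coder_a) + Counter(coder_b)
    expected = sum(
        totals[c] * totals[k] for c in totals for k in totals if c != k
    )
    if expected == 0:
        raise ValueError("alpha is undefined when only one category is used")
    return 1.0 - (n_values - 1) * observed / expected


# Hypothetical 1/0 codes for ten articles; the coders disagree on the last one.
coder_a = [1, 1, 0, 0, 1, 0, 1, 0, 1, 1]
coder_b = [1, 1, 0, 0, 1, 0, 1, 0, 1, 0]
alpha = krippendorff_alpha_nominal(coder_a, coder_b)
```

In this made-up example a single disagreement in ten units already lowers α to about 0.81, which gives a sense of how much agreement the reported values of 0.94 and above imply.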

Results

Visual Framing as a Research Field

Between 1979 and 2023, a total of 552 articles referred to visual framing as indicated by the applied keywords. Of these, empirical studies account for the majority (n=465, 84%), followed by 69 theoretical or methodological essays (13%) and 18 book reviews (3%).

From 1979 to 2005, fewer than ten articles were published per year, implying a low level of attention to visual framing research. This finding is consistent with Matthes’ (2009) review of the framing literature for the period 1990-2005, which revealed that only 5% of studies included visuals in their analysis. After 2005, the number of publications rapidly increased and has multiplied year after year since 2015. Brantner et al. (2012) presented the first paper with a qualitative review of 29 visual framing studies. They concluded that around 2010 the theory-building of visual framing was still being established. The increasing popularity of visual framing research was also noted in a previous review article (Bock, 2020).

Looking at the fields of research, the largest share of studies analyzed visual frames as part of media content (n=259, 47%). A communicator perspective was applied in 79 studies (14%), and media effects were examined in 72 articles (13%), followed by 45 studies about the medium (8%), such as digital platforms, journalistic drones, and virtual reality. The least investigated field was audience research (n=33, 6%). In addition, 15 publications (3%) combined more than two research fields.

Visual framing scholars focus on a number of recurring themes, which is consistent with existing review articles (Bock, 2020; Brantner et al., 2012). The most popular topics were war, conflict and crisis (n=82, 15%), photojournalism practices and advertising (n=81, 15%), presidential election and political campaign (n=67, 12%), identity and stereotype (n=63, 11%), social movement and protest (n=40, 7%), and health-related issues (n=40, 7%).

Regarding research methods, more studies followed a quantitative approach (n=217, 47%) than a qualitative one (n=205, 44%). The most frequently used methods were critical discourse analysis (n=177, 38%) and content analysis (n=137, 30%). Surveys (or survey-based experiments) were used less often (n=66, 14%). The least used methods were interviews (n=17, 4%), eye-tracking (n=10, 2%), observation (n=8, 2%), computer-assisted analysis (n=4), and focus groups (n=3).

Interestingly, the review shows that the majority of studies treated multimodal media offerings (such as textual-visual news, audiovisual reports, or news video clips) as the unit (or stimulus) of analysis (n=258, 56%). Visual framing research, therefore, rarely considers visuals in isolation, but acknowledges that visuals are usually embedded in textual or linguistic contexts.

A considerable number of studies were primarily dedicated to the visual level and mostly used photographs as units of analysis (n=132, 28%). Cartoons or artistic illustrations (n=23, 5%) and data visualizations (n=7, 2%) were rarely analyzed. In 15 visual framing studies (3%), the text was the primary unit, for example, the captions of news images or videos, or headlines in the newspaper layout. Ten studies (2%) referred to more than two units of analysis, and 20 articles (4%) used other units such as performance, exhibition, and virtual environment.

Studies on Visual Framing Effects

Drawing on framing as a multidimensional process, this review further identified 72 empirical studies that were specifically dedicated to visual/multimodal framing effects. Figure 2 illustrates the number of studies that examine the sensory, affective, cognitive, and behavioral dimensions of the framing process. The areas where the ellipses overlap represent studies that have examined more than one dimension. Most studies measured how visual/multimodal news influenced cognition (n=65, 90%). A considerable number of studies have also evaluated the affective effects of visual/multimodal frames (n=28, 39%). Only 14 studies (19%) examined the sensory dimension in visual/multimodal framing effects. Only 4 studies (6%) measured the behaviors triggered by visual/multimodal frames. The sub-sections below provide a closer overview of these empirical studies (see Appendix for an overview of studies categorized by dimensions).
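The reported shares can be checked with a short calculation. The counts below are taken from the paragraph above; the variable and function names are illustrative, and the final step assumes that each of the 72 studies was coded on at least one dimension, which the text does not state explicitly.

```python
TOTAL_EFFECT_STUDIES = 72

# Number of studies coded 1 on each dimension, as reported in the text.
DIMENSION_COUNTS = {
    "cognitive": 65,
    "affective": 28,
    "sensory": 14,
    "behavioral": 4,
}


def share_percent(count, total):
    """Share of studies as a whole-number percentage."""
    return round(100 * count / total)


shares = {
    name: share_percent(count, TOTAL_EFFECT_STUDIES)
    for name, count in DIMENSION_COUNTS.items()
}

# Dimension codes beyond one per study. A positive value is only possible
# if some studies were coded on more than one dimension (assumption: every
# study carries at least one code).
extra_codings = sum(DIMENSION_COUNTS.values()) - TOTAL_EFFECT_STUDIES
```

The shares add up to well over 100% because the dimension codes are not mutually exclusive, which is what the overlapping ellipses in Figure 2 depict.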

Figure 2.

Venn Diagram of Multi-Dimensions in Studies on Visual Framing Effects.

Sensory Dimensions of Visual Framing in Multimodal Media Environments

Since the vast majority of framing research focuses on post-receptive observations of cognitive framing effects, this review found only 14 articles examining the immediate sensory perception as part of visual/multimodal frame processing. Only four of them (Bucher & Schumacher, 2006; Geiger & Reeves, 1993; Smith et al., 2021; Vázquez et al., 2021) measured sensory perception alone, while other studies combined the examination of sensory aspects with affective (Gómez-Carmona et al., 2021) or cognitive dimensions (Dahmen, 2015; Segesten et al., 2022).

Eye-tracking was used as an efficient tool to capture sensory perception. However, researchers also explored other methods. Early studies on TV news recorded participants’ visual attention using reaction times to a secondary task as the measure (Geiger & Reeves, 1993). Neuroscience researchers used brain imaging to capture the encoding of visual scenes in long-term memory for TV commercials (Rossiter et al., 2001).

Eye-tracking studies reported mixed results regarding dwell time on the textual and visual modalities, but most scholars observed that recipients spent more time reading textual captions or stories than viewing images (Bucher & Schumacher, 2006; Lee & Ho, 2018; Smith et al., 2021; Vázquez et al., 2021). However, compared with textual messages, visual frames attract more and quicker attention in the reception process. In multimodal messages, “eye-catching” images are regularly the first elements that attract the gaze and draw it to the accompanying texts (Bucher & Schumacher, 2006; Vázquez et al., 2021), often creating a first impression of the communicated content (Rodgers & Thorson, 2000). Visual frames, therefore, reduce selectivity and potentially cause a stronger physiological activation during the first stages of sensory reception.

Most scholars share the finding that recipients perceive multiple modalities, such as visual images and texts in news, together and as complementary to each other, and generate meaning from this nested process of multimodal communication (Geise & Baden, 2015). Bucher and Schumacher (2006), for example, demonstrated that recipients read headlines for a more detailed description of the news picture and actively included this information in their subsequent interpretation of the textual information. This supports the idea that visual and textual information should not be studied in isolation, but in their multimodal interaction.

While news frames are predicted to cause a change in the allocation of visual attention and thus control sensory perception, different types of frames are expected to drive visual attention differently (Smith et al., 2021). In an early eye-tracking study, Dahmen (2012) showed that divergent types of visual frames lead to different sensory responses to identical news texts. Recipients viewed emotionalizing photographs longer than neutral ones. The altered intensity of visual perception also influenced the intensity with which recipients further engaged with the multimodal content at subsequent processing stages (Dahmen, 2012).

The effects on sensory perception, however, also depend on recipients’ prior attitudes. Gómez-Carmona et al. (2021) examined the moderating role of issue concern. Participants with high concern for the environment viewed related media frames more quickly, with greater focus, and more often. Using mobile eye-tracking technology, Ohme et al. (2022) compared visual attention devoted to news viewed on a mobile device versus a desktop computer. They found that visual attention did not differ significantly between devices, but memorization of political information was lower when news posts were seen on a smartphone.

2. Affective Dimensions of Visual Framing in Multimodal Media Environments

Affective dimensions of visual framing have been empirically analyzed in only a limited number of studies (n=28). In line with framing research’s predominant focus on mental processes and models of memory-based judgment formation, none of these studies investigated emotional responses to frames alone; most examined them together with cognitive effects.

Methodologically, affective dimensions of visual framing were mostly examined with standardized, post-receptive survey items (n=27; e.g., Coleman & Banning, 2006; Fung, 2019; Henderson et al., 2022; Iyer et al., 2014; Krause & Bucy, 2018; Parrott et al., 2019). Only a few authors used qualitative approaches, such as focus groups (e.g., Midberry, 2020). No study in this review applied physiological or process-related behavioral measures (such as facial expression) to examine emotional framing effects.

When examining the affective dimensions of framing effects, popular themes include war, conflict, and crisis (n=7), health messages (n=7), migrants or refugees (n=3), environmental issues (n=3), and advertising (n=3). Across these topics, the examined studies show that (a) visual/multimodal frames often lead to emotional effects in viewers, which (b) in turn, affect the subsequent processing of information, such as cognitive framing effects (e.g., Powell et al., 2015; Seo & Dillard, 2019a).

Comparing the affective impact of monomodal versus multimodal news items, Pfau et al. (2006) found that news items combining photographs and captions elicited greater emotional responses (such as puzzlement, anger, and sadness) and lower political support than items conveyed via text alone. Later studies (e.g., Iyer et al., 2014; Powell et al., 2015) also showed that visual frames in particular can trigger specific emotions such as pity, fear, or anger that further shape the evaluation of political issues.

Throughout, research reveals that different frame types lead to divergent emotional responses. For example, Brantner et al. (2011) examined the effects of visual frames on the evaluation of news texts about the Gaza conflict. They found that divergent visual frames (victims vs. politicians) led to divergent affective evaluations of the identical news text. Similarly, Parrott et al. (2019) examined affective effects induced by news photos with either a human-interest frame (“immigrants as everyday people”) or a political frame (showcasing politicians). While the political frame increased negative emotions and led to negative attitudes, the human-interest frame fostered positive emotions and attitudes. Midberry et al. (2022) investigated how frames influence discrete empathy, behavioral intentions, and efficacy in reaction to visual solutions journalism. They found that textual frames yielded lower levels of sadness, anger, and disgust than frames with problem-centered images. In contrast, it was image pairs showing both problems and solutions that elicited the highest emotional response, including hope, compassion, and empathy.

Examining the emotional dimensions of framing also reveals that emotions develop in line with viewers’ prior attitudes. Krause and Bucy (2018) analyzed how people affectively evaluate, interpret, and make sense of differentially framed images of fracking. Findings demonstrated that valence responses were heightened when individuals processed visual frames that aligned with their existing level of issue agreement. Fracking supporters were more positive toward economic benefit frames, while fracking opponents expressed significantly more negative thoughts in response to environmental risk framing (Krause & Bucy, 2018).

3. Cognitive Dimensions of Visual Framing in Multimodal Media Environments

Because theory suggests that visual frames in particular influence information processing and shape cognition (Geise & Baden, 2015), cognitive effects are addressed in almost all identified studies (90%, n=65). The recurring contexts are political campaigns or elections (n=13), advertising (n=11), health issues (n=9), war and conflict (n=8), environmental issues (n=6), and social movements and protests (n=5).

From a methodological perspective, scholars primarily used standardized survey items to measure cognitive effects (n=58), mostly subsequent to the reception of frames (e.g., Dan & Arendt, 2021; Hameleers et al., 2020; Jeon et al., 2019). Here again, qualitative approaches are rarely applied, although they can deepen the understanding of framing processes. For instance, Weber et al. (2022) combined an experimental study with retrospective follow-up interviews to gain deeper insights into the decision-making process of recipients when confronted with virtual reality images versus traditional text formats. Bode (2021) used a discourse analysis of viewer comments on social media and regarded them as indicators of cognitions related to viewers’ understanding of deepfakes.

Researchers have paid attention to a wide range of cognitions that are influenced by the visual/multimodal presentation of messages. Many studies conclude that the visual frame sets the direction and interpretive horizon in which the accompanying linguistic information is evaluated. In an early study, Zillmann et al. (1999) demonstrated that visual frames shape recipients’ evaluation and interpretation of an issue in the direction of the frame. The finding has been replicated in follow-up studies (e.g., Gibson & Zillmann, 2000). Arpan et al. (2006) also found that conflict-oriented protest images produced more negative evaluations than peaceful, positively framed images and, in turn, shaped viewers’ further evaluations of protests. The framing effects of visual representation on the evaluation of political actors were shown by Coleman and Banning (2006). Reception of visually framed TV news influenced the perception of the two covered candidates in a corresponding direction.

While framing theory predicts that visual frames prompt less cognitive reflection while achieving higher recall performance and superior memorability (Messaris & Abraham, 2001; Rodriguez & Dimitrova, 2011), there is limited research on visual framing effects in the context of memorization and information retrieval (Rossiter et al., 2001), learning (Ohme et al., 2022), or decision-making (Weber et al., 2022).

To explore frame-dependent effects on recipients’ cognition, researchers mainly manipulated the visual depiction in news images (n=54, 83%; e.g., Bolsen et al., 2019; Lee & Ho, 2018; Sontag, 2018), while effects of stylistic formats were less studied (n=13; e.g., Schindler et al., 2017; von Sikorski, 2018). Nonetheless, a few studies show that stylistic frames can also influence cognitions: visual cues to ideology embedded in political images, for example, were found to affect citizens’ evaluations of political actors (Dan & Arendt, 2021; von Sikorski, 2022).

Only a handful of studies examined the symbolic/connotative aspect of visual content, such as semiotic icons in health messages (Lazard et al., 2017), visual grammar in cartoons (Abdel-Raheem, 2017), or subtle ideological cues in political communication (Dan & Arendt, 2021). Lazard et al. (2017) compared the effects of symbolic, indexical, and iconic images on the perceived severity and effectiveness of different diseases. Findings suggest that symbolic icons may be most effective for diseases that are not easily visualized otherwise (such as addiction), but iconic and indexical icons, which mostly represent an object’s likeness, seemed more effective for health effects attributable to specific symptoms.

While most studies demonstrated framing effects triggered by the visual frame, some studies argued that the accompanying text predisposes the meaning of the visual content (e.g., Domke et al., 2002; Pfau et al., 2006). More recent studies, however, imply that interactions between visual and textual frames are complex and need to be differentiated across effect dimensions (e.g., Hameleers et al., 2020). In one of the few “explicit” multimodal framing studies, Powell et al. (2015) compared the effects of a monomodal visual, a monomodal textual, and a multimodal frame on sub-Saharan conflict. In the monomodal conditions, they found stronger effects of visual frames on recipients’ salience perception and attention allocation, while the text frame more strongly shaped cognitive elaboration of the topic. In the multimodal setting, the text frame increased approval of military intervention independent of the media frame, while the visual frame had a stronger effect on recipients’ willingness to engage personally, independent of news texts (Powell et al., 2015). Boomgaarden et al. (2016) support the idea that the textual frame has a stronger influence on the cognitive elaboration of the competence of political candidates than the visual one; for the more affective evaluation dimension of personal integrity, visual and textual frames showed a mutually reinforcing effect. Thus, Boomgaarden et al. (2016) conclude that future studies should more closely investigate which impact dimensions and aspects of political judgment formation are activated more by visual frames and which by textual frames.

4. Behavioral Dimensions of Visual Framing in Multimodal Media Environments

Although framing theory suggests that visual frames are particularly strong drivers of frame-related actions (Entman, 1993; Grabe & Bucy, 2009), behavior as an outcome of framing has only scarcely been studied. In the systematic review sample, only four articles measured the behavioral outcomes triggered by visual/multimodal frames (Meng, 2019; Powell et al., 2021; Ryan, 2012; Weber et al., 2022).

Using eye-tracking to investigate how visual signaling elements in software documentation influenced task accuracy and completion, Meng (2019) showed that participants working with visual tutorials executed assigned tasks more correctly and successfully. To study the selection and avoidance of political news, Powell et al. (2021) simulated a realistic social media environment where participants could select news items about politics, sports, or entertainment. When participants selected political posts, the researchers measured issue agreement and discrete emotions. Once selected, multimodal political news aroused stronger emotions and led to higher issue agreement than text messages, regardless of an article’s attitude congruence. Nevertheless, when compared with news presented solely in text, multimodal articles tended to encourage selective exposure and news avoidance rather than mitigating these tendencies.

Intending to investigate the effect of immersive VR imagery on voting behavior, Weber et al. (2022) simulated a voting decision in an experimental survey. Results show that participants exposed to the VR format cast more YES votes regardless of the pro or con arguments of the content presented; the virtual presentation therefore substantially influenced people’s decision-making and related actions.

One understandable reason that actions, understood as individual observable responses in a given situation, are examined in so few studies is that they are very difficult to capture, especially when they do not immediately follow reception. Aiming to resolve this dilemma, a number of studies used action dispositions and behavioral intention as indicators of a person’s attitude and readiness to perform a given behavior later (Warshaw & Davis, 1985). For instance, Powell et al. (2015) claimed that it “remains unclear” how graphic visuals affect “public opinion and behavior” (p. 997), but then measured the intentions to discuss, to donate, to sign a petition, and to protest as proxies of frame-related actions. The measurement of behavioral intentions as framing outcomes involves a variety of different concepts, including the tendency of information seeking (Midberry et al., 2022), willingness to participate or act politically (Geise et al., 2021; Olesen, 2018; Shih & Lin, 2017; von Sikorski & Knoll, 2019), likelihood of further social media engagement (Midberry et al., 2022; Mourão & Brown, 2022), adherence to certain medical treatments (Fung, 2019; Kang & Lin, 2015; Sontag, 2018), or purchase intentions (Chang, 2012).

Challenges for Research and Mediated Communication

Contemporary media communication inherently involves multiple modes (Wessler et al., 2016). While many studies have focused on analyzing multimodal framing from the perspectives of framing theory and visual framing, our systematic literature review reveals a recent shift toward considering multimodal layers as units of analysis. Scholars have demonstrated the powerful influence of framing effects in both visual representations and verbal descriptions across a range of topics. They have emphasized that textual and visual frames have distinct effects, but it is primarily the interplay of multimodal elements that shapes recipients’ perceptions and interpretations. However, the multimodal relationship between image and text has primarily been explored by testing different combinations of frames (e.g., multimodal vs. monomodal) or abstract categories (e.g., congruence vs. incongruence). A critical task for future research is therefore to examine more closely the dynamic interactions between textual and visual elements within the multimodal framing process.

While visual communication scholars have already discussed other ways in which modes interact, including illustration, complementation, argumentation, extension, and reinforcement, future research should more closely examine the resulting types of potential interactions between elements in multimodal messages. In addition, as our overview illuminates, visual framing researchers so far have shown relatively little interest in the effects of connotative and ideological visual cues. Yet, due to the unique characteristics of visual images, symbolic representations, latent stereotypes, and ideological intentions seem more likely to induce framing effects—and their influence seems particularly crucial in times of increasing polarization and propaganda, war, and crises.

Our systematic review also provides exciting methodological insights that can be harnessed for further framing research. For example, we found that although cognitive framing dimensions are investigated in almost all studies, qualitative methods that could more deeply account for framing effects at the level of interpretation and evaluation performance are rarely used. Our overview also shows that the affective dimensions of visual framing have rarely been empirically analyzed—and if so, then with the help of standardized, post-receptive survey items. This offers an exciting starting point for future research, especially if studies focus on affective framing effects using physiological or computer-based observation methods. Finally, our review reveals that behavior as an outcome of framing has only scarcely been studied. This research gap also offers potential for further research, especially as framing theory suggests that visual frames are particularly strong drivers of frame-related actions (Entman, 1993; Grabe & Bucy, 2009).

In examining the effects of multimodal arrangements, research has long emphasized that individuals are not passive recipients, but actively select and interpret information from different modalities in messages, experience emotions, make judgments, and process information based on their existing knowledge and attitudes. This has implications for visual journalism and mass communication, as it may be a challenge for (photo)journalists, news editors, and other visual and multimodal content creators to consider the perspective of recipients. Furthermore, how people create, interpret, and respond to visual content in the media is also influenced by socio-political and cultural differences. Our systematic review found that the majority of visual framing studies have been conducted in Anglo-American or European-centric contexts. Further research is imperative to mitigate knowledge biases. This is not only important from a scientific perspective but also from a practical one. Visual journalists and framing scholars need to understand how cross-national or intercultural similarities in visual/multimodal framing effects are driven by shared values or mass communication and media production practices. Conversely, they should also recognize how contextual factors can lead to divergent interpretations of a multimodal media frame.

As our research review shows, the concept of visual and multimodal framing has gained importance not only in journalism studies, but also in areas of mass communication such as political communication, health communication, public relations, and advertising. Adopting a multimodal perspective is beneficial for both media researchers and practitioners, as it helps to identify the diverse preferences and needs of news audiences, harness the strengths of different communication tools, and contribute to the creation of more effective and engaging content in multimodal media environments.

On the one hand, media professionals need to consider how to effectively structure media content to capture viewers’ attention, convey emotional content, and facilitate recipients’ comprehension and interpretation. On the other hand, by showing that visual content is of immense importance throughout the entire reception process, research findings highlight a significant responsibility for media practitioners in creating, selecting, checking, combining, and contextualizing these images. After all, every representational process involves a “translation or conversion of some kind; a process of inscription, transcription, and/or fabrication whereby the phenomenon or concept is captured, transformed, or even re-created” (Pauwels, 2006, pp. 4–5), and visual communication embedded in multimodal news framing is no exception. Visual producers must be aware of the potential impact across sensory, emotional, cognitive, and behavioral dimensions. This adds complexity and challenge to the process of multimodal media communication, but it also underscores the tremendous influence of images in media. Their use goes way beyond illustrative purposes. As part of multimodal media framing, images are prominent, they are relevant, and they are fundamental, but they should be used with thoughtful consideration.

Appendix

Overview of Empirical Studies on Visual/Multimodal Framing Effects.

Multi-dimensions in visual framing effects	N	Empirical studies
Sensory	4	Bucher and Schumacher (2006), Geiger and Reeves (1993), Vázquez et al. (2021), Smith et al. (2021)
Sensory & affective	1	Gómez-Carmona et al. (2021)
Sensory & cognitive	6	Barnes (2019), Dahmen (2015), Lee and Ho (2018), Ohme et al. (2022), Rossiter et al. (2001), Segesten et al. (2022)
Sensory & behavioral	1	Meng (2019)
Sensory & affective & cognitive	2	Dahmen (2012), Geise et al. (2021)
Affective	0
Affective & cognitive	23	Brantner et al. (2011), Coleman and Banning (2006), Domke et al. (2002), Fung (2019), Henderson et al. (2022), Hung (2001), Kang and Lin (2015), Krause and Bucy (2018), Iyer et al. (2014), Lazard et al. (2017), Midberry et al. (2022), Midberry (2020), Niu et al. (2020), Parrott et al. (2019), Pfau et al. (2006), Powell et al. (2015, 2018), Powell, Boomgaarden, et al. (2019), Seo et al. (2013), Seo and Dillard (2019a, 2019b), Shih and Lin (2017), Sontag (2018)
Affective & behavioral	1	Ryan (2012)
Affective & cognitive & behavioral	1	Powell et al. (2021)
Cognitive	32	Abdel-Raheem (2017), Abraham and Appiah (2006), Adams et al. (1980), An and Stern (2011), Arpan et al. (2006), Bekkers and Moody (2014), Bingaman et al. (2021), Bode (2021), Bolsen et al. (2019), Boomgaarden et al. (2016), Chang (2012), Dan and Arendt (2021), Gibson and Zillmann (2000), Jeon et al. (2019), Hameleers et al. (2020), Hart (2018), Brown and Mourão (2021), Mourão and Brown (2022), Olesen (2018), Powell, van der Meer, et al. (2019), Schindler et al. (2017), Shriver-Rice et al. (2022), Thomson and Greenwood (2017), von Sikorski (2018, 2022), von Sikorski and Knoll (2019), von Sikorski and Ludwig (2018), Vultee et al. (2022), Wanta (1988), Yu et al. (2024), Zhang (2020), Zillmann et al. (1999)
Cognitive & behavioral	1	Weber et al. (2022)
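
As a rough consistency check, the row counts in the table above can be tallied per dimension and compared with the totals reported in the running text (14 sensory, 28 affective, 65 cognitive, and 4 behavioral studies out of 72). The following Python sketch is only an illustration under stated assumptions: the dictionary `rows` and the helper `studies_covering` are hypothetical names introduced here, and the numbers are copied from the N column of the table, not taken from the original coding data.

```python
# Counts copied from the N column of the appendix table.
# Keys: the combination of effect dimensions a row covers.
# Values: the number of studies listed in that row.
rows = {
    ("sensory",): 4,
    ("sensory", "affective"): 1,
    ("sensory", "cognitive"): 6,
    ("sensory", "behavioral"): 1,
    ("sensory", "affective", "cognitive"): 2,
    ("affective",): 0,
    ("affective", "cognitive"): 23,
    ("affective", "behavioral"): 1,
    ("affective", "cognitive", "behavioral"): 1,
    ("cognitive",): 32,
    ("cognitive", "behavioral"): 1,
}


def studies_covering(dimension):
    """Sum the studies in every row whose combination includes `dimension`."""
    return sum(n for dims, n in rows.items() if dimension in dims)


# Each study appears in exactly one row, so the row sum is the sample size.
total = sum(rows.values())

per_dimension = {
    d: studies_covering(d)
    for d in ("sensory", "affective", "cognitive", "behavioral")
}

# Share of studies that address cognitive effects, in whole percent.
cognitive_share = round(100 * per_dimension["cognitive"] / total)

print(total, per_dimension, cognitive_share)
```

Under these assumptions, the tally gives 72 studies in total and 14, 28, 65, and 4 studies for the sensory, affective, cognitive, and behavioral dimensions, with cognitive effects addressed in about 90% of the sample, which matches the figures given in the text.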

Data Availability

Supplemental materials (codebook, codesheet, list of publications) are available from the corresponding author, upon reasonable request.

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

ORCID iDs

Stephanie Geise

Yi Xu

Author Biographies

Stephanie Geise is Professor of Communication and Media Studies at the Centre for Media, Communication and Information Research (ZeMKI), University of Bremen. Asking how multimodal media messages influence political thinking and action, she investigates media effects of visual and multimodal communication. She was chair of the Visual Communication section of the German Society for Journalism and Communication Studies (DGPuK).

Yi Xu is a Postdoc Researcher at the Institute of Communication Science (IfKW), Friedrich Schiller University Jena, Germany. She received her doctoral degree from the Technische Universität Ilmenau, where her dissertation focused on a cross-national comparison of multimodal news frames in media coverage of national images on social media. Her research interests include multimodal framing, journalism studies, and crisis communication.

References

*Abdel-Raheem (2017). Can cartoons influence Americans’ attitudes toward bailouts? Visual Communication Quarterly, 24(3), 179–191. https://doi.org/10.1080/15551393.2016.1230472

*Abraham & Appiah (2006). Framing news stories: The role of visual imagery in priming racial stereotypes. The Howard Journal of Communications, 17(3), 183–203. https://doi.org/10.1080/10646170600829584

*Adams, R. C., Copeland, G. A., Fish, M. J., & Hughes (1980). The effect of framing on selection of photographs of men and women. Journalism Quarterly, 57(3), 463–467. https://doi.org/10.1177/107769908005700313

Ajzen & Fishbein (1980). Understanding attitudes and predicting social behavior. Prentice-Hall.

*An & Stern (2011). Mitigating the effects of advergames on children. Journal of Advertising, 40(1), 43–56. https://doi.org/10.2753/JOA0091-3367400103

*Arpan, L. M., Baker, Lee, Jung, Lorusso, & Smith (2006). News coverage of social protests and the effects of photographs and prior attitudes. Mass Communication & Society, 9(1), 1–20. https://doi.org/10.1207/s15327825mcs0901_1

*Barnes, S. R. (2019). Studies in the efficacy of motion graphics: The relation between expository motion graphics and the presence of naïve realism. Visual Communication, 18(1), 135–158. https://doi.org/10.1177/1470357217739223

*Bekkers & Moody (2014). Accountability and the framing power of visual technologies: How do visualized reconstructions of incidents influence public and political accountability discussions? The Information Society, 30(2), 144–158. https://doi.org/10.1080/01972243.2013.873749

*Bingaman, Brewer, P. R., Paintsil, & Wilson, D. C. (2021). “Siri, show me scary images of AI”: Effects of text-based frames and visuals on support for artificial intelligence. Science Communication, 43(3), 388–401. https://doi.org/10.1177/1075547021998069

Bock, M. A. (2020). Theorising visual framing: Contingency, materiality and ideology. Visual Studies, 35(1), 1–12. https://doi.org/10.1080/1472586X.2020.1715244

*Bode (2021). Deepfaking Keanu: YouTube deepfakes, platform visual effects, and the complexity of reception. Convergence, 27(4), 919–934. https://doi.org/10.1177/13548565211030454

*Bolsen, Palm, & Kingsland, J. T. (2019). Counteracting climate science politicization with effective frames and imagery. Science Communication, 41(2), 147–171. https://doi.org/10.1177/1075547019834565

*Boomgaarden, Boukes, & Iorgoveanu (2016). Image versus text: How newspaper reports affect evaluations of political candidates. International Journal of Communication, 10, 2529–2555.

Borah (2011). Conceptual issues in framing theory: A systematic examination of a decade’s literature. Journal of Communication, 61(2), 246–263. https://doi.org/10.1111/j.1460-2466.2011.01539.x

Brantner, Geise, & Lobinger (2012). Fractured paradigm? Theories, concepts and methodology of visual framing research: A systematic review [Paper presentation]. The 62nd International Communication Association (ICA) Annual Conference, Phoenix, AZ, United States.

*Brantner, Lobinger, & Wetzstein (2011). Effects of visual framing on emotional responses and evaluations of news stories about the Gaza conflict 2009. Journalism & Mass Communication Quarterly, 88(3), 523–540. https://doi.org/10.1177/107769901108800304

*Brown, D. K., & Mourão, R. R. (2021). Protest coverage matters: How media framing and visual communication affects support for Black civil rights protests. Mass Communication and Society, 24(4), 576–596. https://doi.org/10.1080/15205436.2021.1884724

*Bucher, H.-J., & Schumacher (2006). The relevance of attention for selecting news content. An eye-tracking study on attention patterns in the reception of print and online media. Communications, 31(3), 347–368. https://doi.org/10.1515/COMMUN.2006.022

Caple (2017). Anyone can take a photo, but: Is there space for the professional photographer in the twenty-first century newsroom? In Franklin (Ed.), Future of journalism in an age of digital media and economic uncertainty (pp. 368–378). Routledge.

Chaiken & Maheswaran (1994). Heuristic processing can bias systematic processing: Effects of source credibility, argument ambiguity, and task importance on attitude judgment. Journal of Personality and Social Psychology, 66(3), 460–473. https://doi.org/10.1037//0022-3514.66.3.460

*Chang, C.-T. (2012). Missing ingredients in cause-related advertising: The right formula of execution style and cause framing. International Journal of Advertising, 31(2), 231–256. https://doi.org/10.2501/IJA-31-2-231-256

Childers, T. L., & Houston, M. J. (1984). Conditions for a picture-superiority effect on consumer memory. Journal of Consumer Research, 11(2), 643–654. https://doi.org/10.1086/209001

Coleman (2010). Framing the pictures in our heads. In D’Angelo & J. A. Kuypers (Eds.), Doing news framing analysis: Empirical and theoretical perspectives (pp. 233–261). Routledge.

*Coleman & Banning (2006). Network TV news’ affective framing of the presidential candidates: Evidence for a second-level agenda-setting effect through visual framing. Journalism & Mass Communication Quarterly, 83(2), 313–328. https://doi.org/10.1177/107769900608300206

*Dahmen, N. S. (2012). Photographic framing in the stem cell debate: Integrating eye-tracking data for a new dimension of media effects research. American Behavioral Scientist, 56(2), 189–203. https://doi.org/10.1177/0002764211419489

*Dahmen, N. S. (2015). Watchdog, voyeur, or censure? An eye-tracking research study of graphic photographs in the news media. Journalism Practice, 9(3), 418–432. https://doi.org/10.1080/17512786.2014.950883

*Dan & Arendt (2021). Visual cues to the hidden agenda: Investigating the effects of ideology-related visual subtle backdrop cues in political communication. The International Journal of Press/Politics, 26(1), 22–45. https://doi.org/10.1177/1940161220936593

D’Angelo, Lule, Neuman, W. R., Rodriguez, Dimitrova, D. V., & Carragee, K. M. (2019). Beyond framing: A forum for framing researchers. Journalism & Mass Communication Quarterly, 96(1), 12–30. https://doi.org/10.1177/1077699018825004

de Vreese, C. H. (2003). Framing Europe: Television news and European integration. Aksant.

de Vreese, C. H. (2005). News framing: Theory and typology. Information Design Journal, 13(1), 51–62. https://doi.org/10.1075/idjdd.13.1.06vre

*Domke, Perlmutter, & Spratt (2002). The primes of our times? An examination of the ‘power’ of visual images. Journalism, 3(2), 131–159. https://doi.org/10.1177/146488490200300211

Entman, R. M. (1993). Framing: Toward clarification of a fractured paradigm. Journal of Communication, 43(4), 51–58. https://doi.org/10.1111/j.1460-2466.1993.tb01304.x

Fahmy, Bock, & Wanta (2014). Visual communication theory and research: A mass communication perspective. Springer.

*Fung, T. K. F. (2019). The role of counterfactual thinking in narrative persuasion: Its impact on patients’ adherence to treatment regimen. Health Communication, 34(12), 1482–1493. https://doi.org/10.1080/10410236.2018.1500432

Gamson, W. A., & Modigliani (1987). The changing culture of affirmative action. In R. G. Braungart & M. M. Braungart (Eds.), Research in political sociology (pp. 137–177). JAI Press.

*Geiger & Reeves (1993). The effects of scene changes and semantic relatedness on attention to television. Communication Research, 20(2), 155–175. https://doi.org/10.1177/009365093020002001

Geise (2017). Visual framing. In Rössler (Ed.), The international encyclopedia of media effects (pp. 1–12). John Wiley & Sons.

Geise & Baden (2015). Putting the image back into the frame: Modeling the linkage between visual communication and frame-processing theory. Communication Theory, 25(1), 46–69. https://doi.org/10.1111/comt.12048

Geise & Coleman (2015). Ethical challenges of framing in persuasive communication, in words and pictures. In Duffy & Thorson (Eds.), Persuasion ethics today (pp. 199–221). Routledge.

*Geise, Panke, & Heck (2021). Still images—Moving people? How media images of protest issues and movements influence participatory intentions. The International Journal of Press/Politics, 26(1), 92–118. https://doi.org/10.1177/1940161220968534

*Gibson & Zillmann (2000). Reading between the photographs: The influence of incidental pictorial information on issue perception. Journalism & Mass Communication Quarterly, 77(2), 355–366. https://doi.org/10.1177/107769900007700209

Gitlin (1980). The whole world is watching: Mass media in the making and unmaking of the New Left. University of California Press.

*Gómez-Carmona, Muñoz-Leiva, Liébana-Cabanillas, Nieto-Ruiz, Martínez-Fiestas, & Campoy (2021). The effect of consumer concern for the environment, self-regulatory focus and message framing on green advertising effectiveness: An eye tracking study. Environmental Communication, 15(6), 813–841.

44. Grabe, M. E., & Bucy, E. P. (2009). Image bite politics: News and the visual framing of elections. Oxford University Press.

45. Graber, D. A. (1996). Say it with pictures. The Annals of the American Academy of Political and Social Science, 546(1), 85–96. https://doi.org/10.1177/0002716296546001008

46. *Hameleers, Powell, T. E., van der Meer, T. G. L. A., & Bos (2020). A picture paints a thousand lies? The effects and mechanisms of multimodal disinformation and rebuttals disseminated via social media. Political Communication, 37(2), 281–301. https://doi.org/10.1080/10584609.2019.1674979

47. *Hart (2018). ‘Riots engulfed the city’: An experimental study investigating the legitimating effects of fire metaphors in discourses of disorder. Discourse & Society, 29(3), 279–298. https://doi.org/10.1177/0957926517734663

48. *Henderson, Tukachinsky Forster, R., Kalili, & Guillory (2022). Visual framing effects of news coverage of police use of deadly force on intergroup relationships. Howard Journal of Communications, 33(1), 24–44. https://doi.org/10.1080/10646175.2021.1927261

49. *Hung (2001). Framing meaning perceptions with music: The case of teaser ads. Journal of Advertising, 30(3), 39–49. https://doi.org/10.1080/00913367.2001.10673644

50. *Iyer, Webster, Hornsey, M. J., & Vanman, E. J. (2014). Understanding the power of the picture: The effect of image content on emotional and political responses to terrorism. Journal of Applied Social Psychology, 44(7), 511–521. https://doi.org/10.1111/jasp.12243

51. *Jeon, Y. A., Koh, H. E., Ahn, & Coleman (2019). Stigma activation through dis-identification: Cognitive bias triggered by mass media photos of people with obesity. Journal of Applied Communication Research, 47(5), 485–504. https://doi.org/10.1080/00909882.2019.1682181

52. *Kang & Lin, C. A. (2015). Effects of message framing and visual-fear appeals on smoker responses to antismoking ads. Journal of Health Communication, 20(6), 647–655. https://doi.org/10.1080/10810730.2015.1012242

53. *Krause & Bucy, E. P. (2018). Interpreting images of fracking: How visual frames and standing attitudes shape perceptions of environmental risk and economic benefit. Environmental Communication, 12(3), 322–343. https://doi.org/10.1080/17524032.2017.1412996

54. Kress & Van Leeuwen (2021). Reading images: The grammar of visual design. Routledge.

55. Kühne (2013). Emotionale framing-effekte auf Einstellungen: Ein integratives modell. Medien & Kommunikationswissenschaft, 61(1), 5–20. https://doi.org/10.5771/1615-634x-2013-1-5

56. Kühne & Schemer (2015). The emotional effects of news frames on information processing and opinion formation. Communication Research, 42(3), 387–407. https://doi.org/10.1177/0093650213514599

57. Lazard, A. J., Schmidt, Byron, M. J., Peters, Boynton, M. H., & Brewer, N. T. (2017). Icons for health effects of cigarette smoke: A test of semiotic type. Journal of Behavioral Medicine, 40(4), 641–650. https://doi.org/10.1007/s10865-017-9833-3

58. Lazarus, R. S. (1991). Progress on a cognitive-motivational-relational theory of emotion. The American Psychologist, 46(8), 819–834. https://doi.org/10.1037//0003-066x.46.8.819

59. *Lee, E. W. J., & Ho, S. S. (2018). Are photographs worth more than a thousand words? Examining the effects of photographic–textual and textual-only frames on public attitude toward nuclear energy and nanotechnology. Journalism & Mass Communication Quarterly, 95(4), 948–970. https://doi.org/10.1177/1077699017741090

60. Matthes (2009). What’s in a frame? A content analysis of media framing studies in the world’s leading communication journals, 1990–2005. Journalism & Mass Communication Quarterly, 86(2), 349–367. https://doi.org/10.1177/107769900908600206

61. *Meng (2019). Effects of visual signaling in screenshots: An eye tracking study. Technical Communication, 66(4), 396–411.

62. Messaris (1998). Visual aspects of media literacy. Journal of Communication, 48(1), 70–80. https://doi.org/10.1111/j.1460-2466.1998.tb02738.x

63. Messaris (2003). Visual communication: Theory and research. A review essay. Journal of Communication, 53(3), 551–556. https://doi.org/10.1111/j.1460-2466.2003.tb02608.x

64. Messaris & Abraham (2001). The role of images in framing news stories. In Reese, S. D., Gandy, O. H., & Grant, A. E. (Eds.), Framing public life: Perspectives on media and our understanding of the social world (pp. 215–226). Lawrence Erlbaum.

65. *Midberry (2020). Compassionate horror or compassion fatigue? Responses to human-cost-of-war photographs. International Journal of Communication, 14, 4406–4427.

66. *Midberry, Brown, D. K., Potter, R. F., & Comfort, R. N. (2022). The influence of visual frame combinations in solutions journalism stories. Journalism & Mass Communication Quarterly. https://doi.org/10.1177/10776990221109235

67. Moher, Liberati, Tetzlaff, Altman, D. G., & PRISMA Group. (2009). Preferred reporting items for systematic reviews and meta-analyses: The PRISMA statement. Annals of Internal Medicine, 151(4), 264–269. https://doi.org/10.7326/0003-4819-151-4-200908180-00135

68. *Mourão, R. R., & Brown, D. K. (2022). Black Lives Matter coverage: How protest news frames and attitudinal change affect social media engagement. Digital Journalism, 10(4), 626–646. https://doi.org/10.1080/21670811.2021.1931900

69. *Niu, Jeong, D. C., Brickman, Nam, Liu, & Stapleton, J. L. (2020). A picture worth a thousand texts? Investigating the influences of visual appeals in a text message-based health intervention. Journal of Health Communication, 25(3), 204–213. https://doi.org/10.1080/10810730.2020.1731631

70. *Ohme, Maslowska, & Mothes (2022). Mobile news learning—Investigating political knowledge gains in a social media newsfeed with mobile eye tracking. Political Communication, 39(3), 339–357. https://doi.org/10.1080/10584609.2021.2000082

71. *Olesen (2018). More than words?: A test of the effect of emotionally charged photographs. Nordicom Review, 39(2), 35–48. https://doi.org/10.2478/nor-2018-0010

72. Paivio (1991). Images in mind: The evolution of a theory. Harvester Wheatsheaf.

73. *Parrott, Hoewe, Fan, & Huffman (2019). Portrayals of immigrants and refugees in US news media: Visual framing and its effect on emotions and attitudes. Journal of Broadcasting & Electronic Media, 63(4), 677–697. https://doi.org/10.1080/08838151.2019.1681860

74. Pauwels (2006). A theoretical framework for assessing visual representational practices in knowledge building and science communication. In Pauwels (Ed.), Visual cultures of science: Rethinking representational practices in knowledge building and science communication (pp. 1–25). Dartmouth College Press.

75. *Pfau, Haigh, Fifrick, Holl, Tedesco, Cope, Nunnally, Schiess, Preston, Roszkowski, & Martin (2006). The effects of print news photographs of the casualties of war. Journalism & Mass Communication Quarterly, 83(1), 150–168. https://doi.org/10.1177/107769900608300110

76. *Powell, T. E., Boomgaarden, H. G., De Swert, & de Vreese, C. H. (2015). A clearer picture: The contribution of visuals and text to framing effects. Journal of Communication, 65(6), 997–1017. https://doi.org/10.1111/jcom.12184

77. *Powell, T. E., Boomgaarden, H. G., De Swert, & de Vreese, C. H. (2018). Video killed the news article? Comparing multimodal framing effects in news videos and articles. Journal of Broadcasting & Electronic Media, 62(4), 578–596. https://doi.org/10.1080/08838151.2018.1483935

78. *Powell, T. E., Boomgaarden, H. G., De Swert, & de Vreese, C. H. (2019). Framing fast and slow: A dual processing account of multimodal framing effects. Media Psychology, 22(4), 572–600. https://doi.org/10.1080/15213269.2018.1476891

79. *Powell, T. E., Hameleers, & van der Meer, T. G. L. A. (2021). Selection in a snapshot? The contribution of visuals to the selection and avoidance of political news in information-rich media settings. The International Journal of Press/Politics, 26(1), 46–68. https://doi.org/10.1177/1940161220966730

80. *Powell, T. E., van der Meer, T. G. L. A., & Peralta, C. B. (2019). Picture power? The contribution of visuals and text to partisan selective exposure. Media and Communication, 7(3), 12–31. https://doi.org/10.17645/mac.v7i3.1991

81. Rodarte, A. K., Hyunsik Kim, & Lukito (2023). Representing “the people”: What can social media images reveal about populist propaganda in Brazil? Social Media + Society, 9(2), 1–15. https://doi.org/10.1177/20563051231177962

82. Rodgers & Thorson (2000). The interactive advertising model: How users perceive and process online ads. Journal of Interactive Advertising, 1(1), 41–60. https://doi.org/10.1080/15252019.2000.10722043

83. Rodriguez & Dimitrova, D. V. (2011). The levels of visual framing. Journal of Visual Literacy, 30(1), 48–65. https://doi.org/10.1080/23796529.2011.11674684

84. *Rossiter, J. R., Silberstein, R. B., Harris, P. G., & Nield (2001). Brain-imaging detection of visual scene encoding in long-term memory for TV commercials. Journal of Advertising Research, 41(2), 13–21. https://doi.org/10.2501/JAR-41-2-13-21

85. *Ryan, T. J. (2012). What makes us click? Demonstrating incentives for angry discourse with digital-age field experiments. The Journal of Politics, 74(4), 1138–1152. https://doi.org/10.1017/S0022381612000540

86. Scherer (2005). What are emotions? And how can they be measured? Social Science Information, 44(4), 695–729.

87. *Schindler, Krämer, & Müller (2017). Looking left or looking right? Effects of newspaper layout style on the perception of political news. European Journal of Communication, 32(4), 348–366. https://doi.org/10.1177/0267323117718463

88. *Segesten, A. D., Bossetta, Holmberg, & Niehorster (2022). The cueing power of comments on social media: How disagreement in Facebook comments affects user engagement with news. Information, Communication & Society, 25(8), 1115–1134. https://doi.org/10.1080/1369118X.2020.1850836

89. *Seo & Dillard (2019a). The persuasive effects of two stylistic elements: Framing and imagery. Communication Research, 46(7), 891–907. https://doi.org/10.1177/0093650215626979

90. *Seo & Dillard, J. P. (2019b). A process analysis of message style and persuasion: The effects of gain-loss framing and emotion-inducing imagery. Visual Communication Quarterly, 26(3), 131–145. https://doi.org/10.1080/15551393.2019.1638785

91. *Seo, Dillard, J. P., & Shen (2013). The effects of message framing and visual image on persuasion. Communication Quarterly, 61(5), 564–583. https://doi.org/10.1080/01463373.2013.822403

92. *Shih, T.-J., & Lin, C.-Y. (2017). Developing communication strategies for mitigating actions against global warming: Linking framing and a dual processing model. Environmental Communication, 11(6), 840–858. https://doi.org/10.1080/17524032.2016.1154886

93. *Shriver-Rice, Fernandes, Johns, L. N., Riopelle, & Vaughan (2022). Young adults’ reactions and engagement with short-form videos on sea level rise. Environmental Communication, 16(1), 63–78. https://doi.org/10.1080/17524032.2021.1963800

94. *Smith, L. R., Clavio, & Lang (2021). Does visual framing drive eye gaze behavior? The effects of visual framing of athletes in an increasingly visual social media world. Media Psychology, 24(4), 562–579. https://doi.org/10.1080/15213269.2020.1765810

95. *Sontag, J. M. (2018). Visual framing effects on emotion and mental health message effectiveness. Journal of Communication in Healthcare, 11(1), 30–47. https://doi.org/10.1080/17538068.2018.1435017

96. Tankard, J. W. (2001). The empirical approach to the study of media framing. In Reese, S. D., Gandy, O. H., & Grant, A. E. (Eds.), Framing public life: Perspectives on media and our understanding of the social world (pp. 95–106). Lawrence Erlbaum.

97. *Thomson, T. J., & Greenwood (2017). Beyond framing: Influences of subject–photographer interactions on visual journalism. Journalism Practice, 11(5), 625–644. https://doi.org/10.1080/17512786.2016.1152908

98. Tversky & Kahneman (1974). Judgment under uncertainty: Heuristics and biases: Biases in judgments reveal some heuristics of thinking under uncertainty. Science, 185(4157), 1124–1131. https://doi.org/10.1126/science.185.4157.1124

99. *Vázquez, M. S., El Khatib, Hamrick, & Attardo (2021). On the order of processing of humorous tweets with visual and verbal elements. Internet Pragmatics, 4(1), 150–175. https://doi.org/10.1075/ip.00060.sim

100. *von Sikorski (2018). The effects of darkness and lightness cues in the visual depiction of political actors involved in scandals: An experimental study. Communication Research Reports, 35(2), 162–171. https://doi.org/10.1080/08824096.2018.1425984

101. *von Sikorski (2022). Visual polarisation: Examining the interplay of visual cues and media trust on the evaluation of political candidates. Journalism, 23(9), 1900–1918. https://doi.org/10.1177/1464884920987680

102. *von Sikorski & Knoll (2019). Framing political scandals: Exploring the multimodal effects of isolation cues in scandal news coverage on candidate evaluations and voting intentions. International Journal of Communication, 13, 206–228.

103. *von Sikorski & Ludwig (2018). The effects of visual isolation on the perception of scandalized politicians. Communications, 43(2), 235–257. https://doi.org/10.1515/commun-2017-0054

104. *Vultee, Burgess, G. S., Frazier, & Mesmer (2022). Here’s what to know about clickbait: Effects of image, headline and editing on audience attitudes. Journalism Practice, 16(1), 1–18. https://doi.org/10.1080/17512786.2020.1793379

105. *Wanta (1988). The effects of dominant photographs: An agenda-setting experiment. Journalism Quarterly, 65(1), 107–111. https://doi.org/10.1177/107769908806500114

106. Warshaw, P. R., & Davis, F. D. (1985). The accuracy of behavioral intention versus behavioral expectation for predicting behavioral goals. The Journal of Psychology, 119(6), 599–602. https://doi.org/10.1080/00223980.1985.9915469

107. *Weber, Dingerkus, Fabrikant, S. I., Zampa, West, & Yildirim (2022). Virtual reality as a tool for political decision-making? An empirical study on the power of immersive images on voting behavior. Frontiers in Communication, 7, 1–17. https://doi.org/10.3389/fcomm.2022.842186

108. Wessler, Wozniak, Hofer, & Lück (2016). Global multimodal news frames on climate change: A comparison of five democracies around the world. The International Journal of Press/Politics, 21(4), 423–445. https://doi.org/10.1177/1940161216661848

109. Löffelholz (2021). Multimodal framing of Germany’s national image: Comparing news on Twitter (USA) and Weibo (China). Journalism Studies, 22(16), 2256–2278. https://doi.org/10.1080/1461670X.2021.1994445

110. *Yu, Ponomarenko, & Liska, L. I. (2024). How to allocate white space in ad design? The impact of product layouts on perceived entitativity and advertising performance. Journal of Advertising, 53(2), 215–229. https://doi.org/10.1080/00913367.2023.2175280

111. *Zhang (2020). Effects of freedom restoration, language variety, and issue type on psychological reactance. Health Communication, 35(11), 1316–1327. https://doi.org/10.1080/10410236.2019.1631565

112. *Zillmann, Gibson, & Sargent, S. L. (1999). Effects of photographs in news-magazine reports on issue perception. Media Psychology, 1(3), 207–228. https://doi.org/10.1207/s1532785xmep0103_2

References marked with an asterisk indicate studies included in the systematic literature review.