Abstract
This article examines how video journalism produced by the elite press is using forensic techniques and aesthetics as part of the effort to reinvent journalistic authority in a fragmented media and political sphere. I first discuss some earlier moments in which news coverage of events adopted a media-forensic epistemology and style, and then turn to the formation of the New York Times Visual Investigations team, a group at the leading edge of this type of journalism today. I provide an analysis of one of the team’s investigative reports, a 40-minute account of the January 6 Capitol riot assembled from vernacular video, surveillance footage, police bodycam video, and other non-news source materials. In both its formal aspects and its subject matter, the piece represents an important example for understanding an emerging form of forensic journalism. While the January 6 Capitol riot was not the first time news coverage of a violent event adopted a forensic style and epistemology, the forensic-media coverage of the riot represents a unique conjuncture. A new convergence of media-technological developments and journalistic practices shaped how the storming of the Capitol was experienced, investigated, and covered as a media event.
Keywords
News organizations and professional journalists are struggling to respond in effective ways to the fragmentation of the public sphere and the rise of anti-media populism in the social media age (Carlson et al., 2021; Gutsche and Hess, 2020; Panievsky, 2022; Rasmussen, 2013; Schlesinger, 2020). Algorithmically driven social media platforms have diminished the agenda-setting capacity of the elite press, undermining any illusion of consensus formation among fragmented and diffuse publics. The “collapse of the old news order” (Waisbord, 2018: 1868), along with orchestrated efforts to spread misinformation and amplify political division, has scattered public attention and created conditions of epistemic chaos. The brand of populism that saw Trump’s rise in the US developed within a fragmented media ecology, reflecting a very different form of mediatized politics than that defined by 20th-century commercial television (Hallin, 2019).
One response to contemporary conditions of mediatized and fragmented politics, and the related decline of trust in journalism, has been the adoption of media-forensic techniques as a means of assembling accounts of events and establishing their epistemic authority. This type of media-forensic journalism was notably on view in news coverage of the January 6 US Capitol riot. On January 12, 2021, the Wall Street Journal published a piece titled “Video Analysis: How a Pro-Trump Mob Overran Capitol Police.” The 6-minute report analyzed videos taken at the scene, offering close-up perspectives of the action from within the crowd. The analysis used freeze-framing and highlighting techniques to draw attention to key details, and satellite image maps to trace the path of the rioters from the Trump rally to the Capitol. The credits indicated that the piece was produced with the help of Storyful, a social media news agency founded in Ireland in 2009 and acquired by News Corp. in 2013. Other major news outlets followed with their own media-forensic analyses of riot footage, including a series of reports from The New York Times that similarly used highlighting, satellite image maps, and videos scraped from social media to reconstruct the event.
These types of video reports, now being produced by major media outlets, have the look, sound, and feel of open-source investigations, a term of choice to designate the use of eyewitness video and other publicly available sources to document atrocities and state crimes. A set of developments enabled the rise of this newly mediatized form of human rights fact-finding, including social media, smartphones, and publicly available satellite imagery (Koettl et al., 2020). Open-source investigators have appropriated forensic techniques and epistemology, and in many cases have pioneered new methods at the intersection of legal, documentary, and artistic production. These techniques of audiovisual analysis resemble those used in the forensic investigation of crime, but the investigative methods typically employed by the state are turned on the state and other powerful actors to document and expose their wrongdoing. The architect, scholar, and activist Eyal Weizman (2017) refers to these practices as “counterforensics” to invoke their oppositional orientation. Independent organizations conducting these investigations include WITNESS, Bellingcat, and the Forensic Architecture collective.
Open-source investigations have caught the attention of journalism and media scholars, much of whose discussion has focused on their effectiveness as a means of challenging disinformation campaigns and holding states and other powerful actors accountable for human rights violations (Dubberley et al., 2020; Freeman, 2020). These emerging media practices have disrupted the once-settled practices of human rights fact-finding, producing a knowledge controversy “in which much is possible and much is at stake” (McPherson et al., 2020: 69). Smith and Watson (2022: 3) see the Forensic Architecture group as carrying forward the “camera as weapon” tactics of the Latin American Third Cinema movement, arguing that when media recordings are edited together with testimonials, maps, and other materials, the result can serve as an effective form of political and legal pressure. Open-source investigation has become a genre of conflict reporting, according to Sandra Ristovska (2021, 2022), with human rights advocacy adopting journalistic conventions and becoming what she calls a “proxy profession.” Müller and Wiik (2023) argue that the use of open-source methods has opened new possibilities for investigative journalism, integrating novel actors and technologies into journalism practice and pushing its professional boundaries.
Indeed, if human rights activists have adopted some of the institutional norms of journalism to gain influence and efficacy, the reverse is also true: the elite press has adopted the analytic and discursive techniques of open-source investigation as part of its own battle for epistemic authority. It is significant that a growing number of audiovisual news reports have the look, sound, and feel of open-source investigations. The similarities give this type of video journalism the impression of innovative, resourceful, and independent reporting. But while there is overlap, forensic journalism produced and supported by the institutional press should not be conflated with the work of independent organizations like Bellingcat or Forensic Architecture (nor should the work of these groups be conflated). For one thing, the source materials that press organizations can access to assemble their accounts are not always limited to “open sources,” but instead often include sources that require more legal or newsroom resources to acquire. These reports also benefit from the skills and paid professional labor of newsrooms, with institutional logics shaping how newsrooms manage and incorporate open-source materials (Ristovska, 2021: 644). There are both opportunities and constraints in the way news media are adopting open-source techniques to produce this new form of forensic journalism.
To examine the opportunities and constraints of forensic journalism, I focus on the work of the New York Times Visual Investigations team, a group at the leading edge of this type of reporting today. The combination of skills and professional backgrounds that this team brings to the newsroom is historically unique, and the distinctive type of audiovisual journalism they are producing stands out among the variety of ways that journalists and news organizations are claiming epistemic authority to legitimize the news in the digital era (Carlson, 2017). To consider what is unique about this emerging form of video journalism, I offer an extended analysis of one of the longest pieces the team has produced to date: a 40-minute account of the January 6 Capitol riot, titled Day of Rage: How Trump Supporters Took the U.S. Capitol (Khavin et al., 2021).
The Capitol riot unfolded in a massively mediatized environment, generating an enormous volume of video by the participants themselves, the Capitol building’s security cameras, and police body cameras, as well as reporters, independent filmmakers and videographers, stringers, and bystanders. Hundreds of criminal cases associated with January 6, from misdemeanors to seditious conspiracy, collectively relied on thousands of hours of video evidence, overwhelming prosecutors and defense lawyers (Lynch, 2021). A significant portion of the January 6 Committee hearings was dedicated to screening video evidence of the day’s violence. The volume of video from such a wide range of sources and vantage points, over an extended period of time, presented both unique opportunities to reconstruct the event and major challenges to doing so. When asked in an interview to comment on how much video the Visual Investigations team used to produce Day of Rage, senior story producer Malachy Browne explained:
thousands and thousands of videos, and many hundreds if not thousands of hours of footage . . . . I think we’ve got terabytes of Parler data, and we took a subset of that and hosted it internally and worked with our data engineers to automatically transcribe it but also extract the coordinates . . . over 500 individual pieces of content were used in the making of it, but many, many multiples of that in analyzing it. (Quoted in Wright and Cohen, 2022)
In assembling more source material and providing more context, Day of Rage diverges from other video news segments, like the Wall Street Journal piece, producing an account that looks more like a documentary feature. The film has helped elevate the Times team to the journalism pantheon, earning a Peabody Award, a duPont-Columbia Award, and an Edward R. Murrow Award. It also received an Emmy nomination and a nomination for a Critics Choice Award, and was short-listed for an Oscar nomination. But more important than the recognition it received, for the purposes of my argument here, is what it suggests about the way journalism is adopting media-forensic techniques and aesthetics as a way of confronting the epistemic chaos of an intensively mediatized and fragmented political landscape.
Forensics and journalism
The January 6 Capitol riot was not the first time that news coverage of a violent event adopted a forensic style and epistemology. Forensics and journalism have long shared certain epistemic tendencies, including an orientation to event reconstruction that relies on media recordings as evidence, and a mode of evidence presentation that purports to stay tightly wedded to the facts, jettisoning general knowledge claims in favor of very particular ones. To understand what is unique about the forms of media-forensic journalism taking shape today, it helps to consider some earlier moments when journalism made use of media recordings acquired from non-news sources and adopted a forensic epistemology to make sense of them.
The close analysis of a vernacular film as evidence, at the intersection of news production and criminal investigation, extends back to the Zapruder film of the Kennedy assassination. Notably, it was not federal investigators but journalists who first conducted a close analysis of the film to reconstruct the shooting (Morris, 2013). Another major moment when vernacular video, recorded serendipitously, became the center of combined legal and public scrutiny was of course the Rodney King beating. George Holliday’s videotape of the beating is a well-known example of “eyewitness video” produced in the era of analog camcorders. While the video appeared to provide incontestable proof of police brutality, its courtroom analysis reframed the brutal beating through the “professional vision” of policing (Goodwin, 1994). Yet thousands remained unconvinced by this interpretation and took to the streets.
Surveillance footage is another type of source material that has lent itself to the adoption of media-forensic techniques in news coverage of events. An early example in which television news broadcast surveillance footage of a crime occurred in 1974, when Patricia Hearst was famously recorded on security cameras participating in a bank robbery with her kidnappers (Doyle, 2003). It was a high-profile event made more memorable and visually accessible to a mass audience thanks to being recorded by the bank’s security cameras. The surveillance footage was leveraged across sensational television news coverage and in her notorious criminal trial. A still shot of Hearst that the FBI rendered from the footage remains an iconic piece of photojournalism.
The products of media forensics appeared more frequently in the news as the field of video forensics began to make more effective use of recorded surveillance video as evidence. The field of video forensics has its origins in analog surveillance systems, but it grew, along with demand for it, after the transition to digital formats and storage. Since the late 1990s, the press has made increasing use of stills taken from surveillance footage, as an unauthored, nonhuman form of photojournalism. A still shot taken from airport security cameras showing 2 of the 9/11 hijackers, produced by a company called Salient Stills, circulated in the press after the attacks. This type of photograph is produced using a technique called frame averaging, which involves combining multiple video frames to make details like faces and license plates more visible. Repurposing surveillance video as photojournalism requires applying the specialized techniques of video forensics.
In the US, the regular appearance of media forensics in the news has occurred in tandem with major structural changes in media systems, as well as the increasingly entangled role of surveillance and social media in the investigation of violent events. An important moment of intersection between media forensics and audiovisual journalism came in April 2013, after two bombs exploded near the finish line of the 117th Boston Marathon, killing three people and injuring over 200. The ensuing investigation and manhunt can only be described as a forensic-media spectacle, with the effort to identify and capture the bombers appearing to play out live on news and social media. The forensic mediation of the investigation migrated back and forth from the FBI’s Computer Analysis Response Team to the police-media spectacle that played out for the viewing audience.
Audiences observing this news coverage also experienced the rise of forensic-themed content in American popular culture across the spectrum of fictional crime drama and reality-based, true-crime genres, including shows like America’s Most Wanted (1988–2011) and Forensic Files (1996–2011), and the wildly popular crime drama CSI (2000–2015). The conceit of this pervasive content invited viewers to the scene of the crime to participate in the investigation and evaluate the evidence. So, it was perhaps unsurprising when a group of Reddit users felt compelled to participate in the Boston bombing investigation by combing images on social media, leading to the disastrous misidentification of two suspects (Madrigal, 2013). The internet proved less of a public sphere for democratic participation than a platform to participate in detective fantasies and “digilantism” (Nhan et al., 2017).
The Boston bombings occurred at a conjuncture of social and media-technological developments, at a particular stage in the spread of video surveillance, smartphone capacities, social media platforms, and digital forensics. These developments together created the conditions in which the violence was experienced, investigated, and covered as a media event. Journalism scholars have analyzed the role of eyewitness images and forms of citizen journalism that occurred during and after the bombing (Allan, 2014; Mortensen, 2015). Others have studied how “digilantes” used social media to conduct their crowdsourced investigation, for the purposes of elucidating ways that these publics might better help the police conduct criminal investigations (Nhan et al., 2017). Less well understood is the treatment of the investigation as a media event, the unique moment of mediatization it represented, and the political implications of constantly hailing citizens as detectives and spies (Reeves, 2017). How the relationship between journalism and forensics continues to change is a question I will explore now, turning to more recent developments.
The Visual Investigations team
Coming 4 years after the Boston Marathon bombing investigation and its staging as a forensic-media event, the formation of the New York Times Visual Investigations team in April 2017 marked a transitional moment in the emerging relationship between journalism and forensics. Ristovska (2021: 632) describes the group as “the first newsroom team dedicated solely to open-source investigation . . . presented to the public in an online video format.” The Visual Investigations website explicitly describes the work of the team as “a new form of explanatory and accountability journalism that combines traditional reporting with more advanced digital forensics” (Visual Investigations, n.d.). Digital forensics (also referred to as video forensics, forensic video analysis, media forensics, or digital multimedia forensics) encompasses a set of technical practices for analyzing and synchronizing media recordings, establishing timelines, and identifying people and material objects represented in video and other recordings. While it could be argued that “all forensics are media forensics” because the forensic sciences rely on cameras and visualizing devices (Keenan, 2022), it is also the case that media forensics have their own set of specialized techniques, applied to media recordings as objects of forensic analysis.
The Times launched its Visual Investigations feature in 2017 after making hires from organizations specializing in social media reporting and open-source investigations. Among the first was Malachy Browne, who joined the Times from Reported.ly, the social media reporting arm of First Look Media. Browne was a founding member of the Ireland-based start-up Storyful in 2011. In addition to news reporting and editing experience, Browne has a background in computer programming and software development, skills particularly suited to media forensics. Another hire was Barbara Marcolini, who joined the Times from Storyful, where she verified open-source content and developed discovery tools for Latin America. Marcolini earned a master’s degree in Entrepreneurial Journalism from the City University of New York, a program founded in the face of rising employment precarity for journalists. Marcolini joined the Times in 2017 and then moved to freelancing in early 2021.
Other members of the team have backgrounds and training in human rights advocacy, suggesting significant cross-over career trajectories (Ristovska, 2021, 2022). Christoph Koettl joined the Times from Amnesty International, where he served as a Senior Analyst. Haley Willis had worked as a Digital Verification Corps Project Manager at the Human Rights Center at Berkeley Law School. Christiaan Triebert worked for Bellingcat and earned a master’s degree from King’s College London, where he wrote his thesis, “On the Promise and Peril of Open Source Evidence.” In addition to drawing some of its members from open-source and human rights organizations, the Times team has also collaborated on investigations with Bellingcat and Forensic Architecture.
Most of the work the Visual Investigations team has produced since its inception is different from the kind of forensic-media spectacle on view following the Boston Marathon bombing. It is also very different from the failed open-source investigative efforts of the Reddit “digilantes” (Nhan et al., 2017). Many of their video reports have focused on debunking the denialism of state actors about covert state-sponsored violence. Targets of their investigations include the Syrian regime’s denial of its chemical attacks in the civil war, the Saudi government’s denial of its role in the disappearance of journalist Jamal Khashoggi, Israel’s claim that the killing of a Palestinian medic in Gaza was an accident, the Nigerian military’s claim of self-defense in the killing of protestors, and more. When the voices of state representatives are included in these reports, their statements are not used to provide balance or achieve objectivity norms but to expose their deception. But while their reports diverge from conventional crime coverage or the professional vision of policing, their alignment with the news values of the New York Times means that their work does not always sit squarely in the category of counterforensics, where the orientation is more decisively oppositional (Weizman, 2015: 232).
Media scholars have begun to analyze the work of the Visual Investigations team. Ristovska (2021, 2022) conducted interviews with the team in her study of open-source investigation as an emerging genre of conflict reporting, focusing on how the workings of the Times newsroom impact the use of eyewitness video. She found that rather than always democratizing journalistic practice or elevating marginalized voices, the use of amateur video in news production can exploit the often highly precarious work of witnessing. Bjerknes (2022: 968) analyzed 14 reports produced by the Visual Investigations team in 2020, arguing that the team’s work exhibits an “investigative way of seeing events.”
My analysis contributes to these studies by focusing on one of the team’s longest and most extensive reports produced to date, Day of Rage: How Trump Supporters Took the U.S. Capitol (Khavin et al., 2021). Like Ristovska, I examine the use of eyewitness video to assemble this report, but with less attention to whether the reporting gives voice to marginalized perspectives than to understanding the combined aesthetic and epistemic effects of journalism’s use of vernacular video. Like Bjerknes (2022), I analyze content produced by the Visual Investigations team, with a similar interest in the way this form of professional vision combines the logics of forensic and journalistic investigation. However, Day of Rage departs from the shorter features that Bjerknes examined, both in its subject matter and in the scale of the investigation the team conducted.
Method of analysis
My method for analyzing Day of Rage is informed by Eyal Weizman’s (2017) discussion of forensics as an aesthetic practice (see also Fuller and Weizman, 2021). Weizman (2017: 94) explains that aesthetics “traverses three sites of forensic operation, the field, the lab/studio and the forum,” and that aesthetics operates in different ways in each of these sites. “Material aesthetics” occurs at the level of the field, or the scene of the crime, where “material objects – bones, ruins or landscapes – function as sensors and register changes in their environment” (Weizman, 2017: 94). In the lab/studio, investigative aesthetics involves slowing down time and intensifying “sensibility to space, matter, and image” (Weizman, 2017: 94). And in the forum, aesthetic practice occurs in the “modes of narration and the articulation of truth claims” (Weizman, 2017: 94); the forum “is the place where the investigation is presented/contested, namely the courts, tribunals and sometimes even the public domain via the media” (Weizman, 2015: 232).
The aesthetic practices of forensics traverse not just the field, the lab/studio, and the forum, but also the forensic report, where the results of forensic analysis are formalized, or given a form. These reports, which take a variety of forms, embody and display forensic aesthetics. The forensic report also offers cues for understanding the aesthetics of the field it examines, and the aesthetic practices involved in producing it. My analysis of the forensic report Day of Rage is informed by actor-network theory, tracing the actors, both human and nonhuman, that work together and in tension to bring the account to fruition, and treating this text as a “full-blown mediator” rather than a transparent or “silent intermediary” (Latour, 2007: 81). I focus on two types of video source materials that are repurposed as both evidence and content: vernacular or participatory video and surveillance video. I also examine the media-forensic techniques used to analyze and assemble the source materials, including freeze-framing and highlighting, reconstructing the event timeline, and vertical mediation (Parks, 2018). These sources and methods provide the basis for reconstructing the event from a fragmented array of media recordings, embedding forensic epistemology and aesthetics into the documentary’s mode of address and narrative structure. My aim in examining the formal dimensions of this piece of video journalism is to understand how media-forensic techniques and aesthetics are being used in video news production as a way of claiming epistemic authority under conditions of epistemic chaos.
Day of Rage
Day of Rage: How Trump Supporters Took the U.S. Capitol (Khavin et al., 2021) was published on the New York Times website on June 30, 2021, 6 months after the Capitol riot, and posted on YouTube the following day. The film opens with footage of people traveling to DC by bus, by pickup truck, and by plane. A busload of people recites the Pledge of Allegiance; a planeload chants “stop the steal.” A sequence of images displays handmade signs and printed hats with variations on the same slogan. An establishing shot shows the Capitol building at night, in cell phone video taken through a windshield. The video clip is narrated by a man’s voice from inside the car, saying “We’re in Washington DC. Capitol building dead in front of us.” Malachy Browne’s voiceover then explains that, as part of a 6-month investigation, “the New York Times has collected and forensically analyzed thousands of videos, most filmed by the rioters themselves.” His voiceover proceeds to frame the report as a factual counterpoint to the campaign of denialism that followed the riot from Trump and the Republicans, some of whom claimed that it was a “false flag” event led by the anti-fascist group Antifa, with others downplaying the events by claiming that there was no violence against the police or threat to lawmakers. A Republican congressman is shown in a Zoom video claiming that the storming of the Capitol looked like just “a normal tourist visit.” His comments are followed by video of the rioters violently fighting with police, with Browne’s voiceover stating: “A tourist visit this was not, and the proof is in the footage.”
The documentary provides an account of how the riot unfolded, using video and audio from a wide range of sources, accompanied by voiceover narration. It includes video segments that depict assaults by rioters on the police, who are shown to be massively outnumbered and ill-prepared. The account tracks members of two far-right nationalist groups, the Proud Boys and the Oath Keepers, and periodically points to Trump’s culpability. The film keeps viewers close to the action, using segments of video that move them around to different areas of the building, showing both sequential and simultaneous activity at multiple locations. The report tracks the multiple breaches, the evacuation of lawmakers, and the multiple scenes of violence. We witness the shooting death of Ashli Babbitt at an interior doorway, and the violent scene at the Inauguration entrance, where a concentrated battle played out for hours, officers were dragged into the crowd, and one rioter was trampled to death. The story ends when, 4 hours after the riot began, more police forces finally arrive at the Capitol and swiftly sweep people out of the building.
The most obvious forensic technique that appears in Day of Rage is the use of highlighting to draw attention to specific details in images. Highlighting and freeze-framing are used throughout the film to make specific individuals more visible in the video frames, and across different video source materials. These techniques also direct viewer attention to specific objects that individuals are holding or wearing, like radios, weapons, identifying patches, and body armor. Freeze-framing briefly stops time, and highlighting aims to intensify the viewers’ sensibility to the objects, people, and actions that might otherwise go unnoticed. The highlighted evidence shows that there were specific groups present at the Capitol on January 6, and that those groups were organized and prepared for violence. Using freeze-framing and highlighting to draw attention to details allows the video segments to function as both demonstrative evidence and compelling pieces of narrative structure.
The field or scene of the crime in this case was the Capitol building and its surroundings, supporting Weizman’s (2017: 52) architectural standpoint that buildings are “among the best sensors of societal and political change.” The Capitol building is the architectural and symbolic center of everything that took place, and for this reason, it is central to the forensic analysis presented in the report, including the film’s aesthetic display of forensic techniques. Much of the violence seen and heard in the video source material is the battle between the rioters and the police, but once the rioters reach and then breach the Capitol, the violence shifts to the damage being inflicted on the building. Videos show rioters busting through doors and smashing windows. In one shot, a rioter punches a window with his bare fists, inches away from a Capitol police officer. The Capitol building, along with the bodies of the people present, bears the physical traces of this violent event.
To provide viewers with a scale-level, spatial sense of the field, Day of Rage displays architectural diagrams of the Capitol building, some of them animated to show movement and dimensionality. These architectural diagrams, combined with video segments, give viewers a sense of the layout of the building and the way the rioters moved through it. Building diagrams locate the action and map its progression, pinpointing specific locations by editing the diagrams together with footage taken at those locations, at times superimposing diagrams on the video. The animated 3D diagrams look like graphics generated from satellite imagery, providing the account with verticality, a form of “vertical mediation” (Parks, 2018) that lends epistemic authority to the report.
The report provides audiovisual access to the building and its surroundings through the judicious use of videos from mobile, hand-held cameras situated within and among the crowd. Some of this video appears to come from the cellphones of people participating in the riot. This kind of vernacular, participatory video posted online made the media-forensic reconstruction of the event possible in a way that professional videography and other sources could not, in the variety as well as the immediacy and intimacy of perspectives captured. In an interview on the podcast On Assignment (Wright and Cohen, 2022), Malachy Browne responded to the interviewer’s comment that watching the film, “you almost feel like you were there”:
The power of the footage that was captured that day is that you’re, as you say, you’re in the bus with them, you’re almost alongside, you’re carried along on the journey and right through the day, through the halls of Congress. You’re hearing and you’re feeling what people believe and why they’re there and what their motivations are, like right throughout the piece. (Quoted in Wright and Cohen, 2022)
Some of the video segments used in the report are clearly the recordings from participants’ cellphones. In one example, we see cellphone video taken by someone who is present inside the Capitol Building with Proud Boy leader Joe Biggs, the two of them enjoying the moment together. With a camera pointed at Biggs, we hear the voice of the person holding the cellphone ask him, “Biggs, what do you gotta say?” Looking straight at the camera, he lowers his mask, revealing a big smile on his face as he replies, “This is awesome!” The cellphone shot offers a sense of being inserted into an intimate moment between friends.
And yet, if watching Day of Rage gives the impression that much of the video source material was filmed by the rioters themselves, it is often ambiguous precisely who was behind the camera. The impression that the video came from rioters is created in part by footage that shows people holding up their cellphones to record their surroundings. For example, one video clip shows a man recording a video of another man declaring his intention to storm the Capitol. Visible in the frame is the videographer’s cellphone, mounted on a stabilizer. It is evident that the man who is visible doing the recording in this shot is a participant when he hollers a yell of support in response to the other man’s declaration. Yet, it is never clear who is recording this exchange between the two men. In another example, we see one of the Proud Boys, Dominic Pezzola, smashing through a window with a police shield. Moments later, we see a brief shot of Pezzola inside the building holding up his cell phone to film himself and his surroundings. However, we never see the video that Pezzola recorded, nor do we see the person holding the camera that recorded Pezzola’s actions. The report conveys the participatory mediation of the riot partly through a more conventional style of videography.
Day of Rage displays reflexive moments throughout the report, where the active documentation of the event by participants is visible in the footage itself. Yet the ambiguity of some of the video sources, edited together with participatory cellphone video, also conveys a sense of transparent immediacy. The mediating role of the cameras disappears in passive viewing, as does the production process, to give viewers an impression of witnessing for themselves.
The participant-generated videos are the source materials that give the film the look, sound, and feel of an open-source investigation. But while some of the sources used in Day of Rage were accessible to anyone on social media, other source materials required the resources of the New York Times newsroom to acquire. In one of his podcast interviews about the film, Browne noted that his team worked with sources in Congress who were involved in the impeachment trial to obtain the security camera footage inside the Capitol (Shields, 2022). They also relied on the organization’s legal leverage: “our lawyers also helped us out. They joined a motion that went to court to unseal the evidence that was attached to indictments,” much of it body-cam video from officers engaged in direct physical contact with the rioters (Browne, quoted in Shields, 2022). Independent open-source investigators typically do not have access to an in-house legal team of the kind available at The New York Times. Security camera footage from inside the building and police body-cam video unsealed by court order do not count as open-source materials, but they do contribute to the immediacy and investigative aesthetics of the finished film.
Security camera footage that the team obtained from Congressional insiders is especially important to the structure of the narrative and to the type of visual, temporal, and spatial perspective the film provides. Once the narrative arc reaches the point where the building is breached by the rioters, surveillance footage is used to offer direct visual access to activity happening inside the Capitol, synchronized with hand-held camera videos and audio recordings. Surveillance video shots provide unique visual access to the field, elevating the perspective to a view above the bodies of participants.
The Capitol’s security cameras function something like the eyes of the building, giving optics to its rooms and passageways. Detached from anyone’s hands, the building’s security cameras, and the building itself, are nonhuman “material witnesses” to the mob violence, in the sense that Susan Schuppli (2020) has theorized in her materialist conception of witnessing. In criminal law, the term “material witness” refers to a person: “a witness whose [testimony] is likely to be sufficiently important to influence the outcome of a trial.” 1 For Schuppli (2020: 3), material witness is a concept for making sense of the way nonhuman entities “archive their complex interactions with the world,” undergoing transformations “that can be forensically decoded and reassembled back into a history.” Her concept of material witness is intended to challenge the distinction between testimony and material evidence, and to allow for “the expressive agency of matter” (Schuppli, 2020: 11). Day of Rage uses video from the building’s security cameras as material witnesses in this sense, offering a nonhuman form of relevant testimony to what took place.
The use of surveillance footage also gives viewers a sense of liveness, the impression of observing the scene in real time. Cinema studies scholar Thomas Levin (2002) argues that the central formal quality and rhetorical claim of surveillance video is one of temporal indexicality, the claim or assumption of a direct connection between the recording and the timing and temporal unfolding of a moment in the past. Surveillance video seems to have a direct link to past moments of time because it appears to offer detached, real-time, and unedited access to the action, a sense of rewinding reality and hitting the play button. As Levin (2002: 592) puts it, “the temporal indexicality of real time surveillance has become an important new idiom of cinema’s ‘reality effect’.” Although Levin focuses on realist fictional film, his argument holds true of news and documentary production, but in this case the truth claims are, in fact, indexical.
Syncing and sequencing security video together with other video and audio recordings contributes to the spatial, temporal, and auditory indexicality of the report. In one example, a surveillance camera shot shows a hallway where members of Nancy Pelosi’s staff rush into a conference room to lock themselves in. The film cuts to hand-held video of rioters moving towards Pelosi’s chambers, chanting her name. Browne’s voiceover indicates that it is 12 minutes later. A cellphone video shows rioters moving into a hallway. We then see the same security camera angle that captured Pelosi’s staff rushing into a room, the hallway now filling with Trump supporters. One of them, highlighted in a circle, pounds on the door of the room where Pelosi’s staffers are hiding, trying to break in. The silent surveillance video is synced with an audio recording from one of the staffers’ cell phones, making audible the thumping sound of the man’s arm as it hits the door multiple times. We hear one of the staffers whisper that the rioters are “trying to find her.”
This seamlessly synchronized and sequenced compilation of video and audio recordings was no small editing feat. One of the main tasks of forensic video analysis is to place video recordings on a timeline that corresponds to a precise measurement and sequence of past time. Audio is often a key factor for synchronizing video taken from different angles at the same time. Determining the precise timing in the past when a video fragment was recorded is important in cases involving a single video, but this kind of temporal work is especially critical, and more complex, when multiple video segments need to be synchronized and sequenced. The precise timing of recordings can be a pivotal source of dispute, so establishing their temporal indexicality often requires multiple forms of verification. In the production of forensic video journalism, it requires careful editing work.
Much more could be said about the use of forensic techniques and aesthetics in Day of Rage to reconstruct the scene of the crime unfolding in and around the Capitol building. Less can be gleaned from the film itself about the lab/studio, or newsroom, where the production of the forensic report took place. The reports of the Visual Investigations team typically include brief commentary about the source materials and techniques used to assemble the accounts. These references to sources and type of analysis are significant because they stand in for expert witnesses and their embodied epistemic authority. The disembodied voice of Malachy Browne often serves this role.
Members of the team have also made a point of publicly discussing their work in materials available on the New York Times website, on YouTube, and in press interviews. This “metajournalistic discourse” (Carlson, 2017) includes explanations about the way the team chooses which incidents to cover, and how they go about conducting their investigations. This reflexive commentary suggests an effort to build trust and credibility with audiences. In the podcast interviews I have already cited, Malachy Browne provided insights into the combined investigative and production processes involved in producing Day of Rage. While he directed the film and his voiceover narrates it, he made clear that it was a collective effort across the Times newsroom: there were “10 people directly involved in making the film – producers, animators, editors” as well as “a supporting cast of many more” (quoted in Shields, 2022). He also described some of the backend work involved in reconstructing the event timeline: “We had this asset spreadsheet but then we also had a sort of timeline spreadsheet where we were moment-by-moment plotting what was going on” (quoted in Shields, 2022). From these interviews we learn that the team also used livestreams to establish the timeline of the riot and the way it unfolded in space: “We had a few dozen livestreams, and those livestreams went for 4 hours, 5 hours, one of them was 8 hours long. And they were really helpful because they allowed us to follow the crowd around the Capitol” (Browne, quoted in Shields, 2022). Livestreams shared online by people present at the Capitol allowed team members to perform the role of journalistic witnessing at a distance, also enabling them to reconstruct the temporality of the event and triangulate recordings.
Some of the key moments of dramatic action in Day of Rage were segments taken from a 39-minute cellphone video that bore the visible watermark “Jayden X,” the pseudonym of a young man named John Sullivan. Sullivan gave his continuous-shot video the title, Shooting and Storming of the US Capitol in Washington DC, and posted it on YouTube. “John Sullivan, via YouTube” appears in the closing credits of Day of Rage, but otherwise segments of his watermarked video are edited together with other footage in a way that renders his role as videographer more or less invisible. One of the reasons Sullivan’s video was such a valuable media asset is that he recorded the shooting death of Ashli Babbitt as she tried to vault herself through a busted-out window into a hallway where lawmakers were being evacuated. In Day of Rage, the scene is set with Browne’s voiceover warning viewers that a graphic scene of violence is about to occur. We then hear Sullivan’s disembodied voice trying frantically to warn the people present that an officer has his gun drawn just inside the building. The camera is pointed at the gun in the hands of outstretched arms as it fires, the shot quickly panning to the right just in time to display Babbitt falling backwards and onto the floor.
The use of segments of Sullivan’s video without obvious attribution bears out Sandra Ristovska’s (2022: 644) observation that “less privileged actors have more access and . . . take great risks to produce eyewitness images” but rarely get much credit. If we never learn who Jayden X is in Day of Rage, elsewhere he received more attention. Art critic J. Hoberman (2021) reviewed Sullivan’s video in Artforum, comparing it to the Zapruder film and calling it “cinema as forensic evidence.” Whether Sullivan sees himself as artist, journalist, or detective, the New York Times owes a debt to his riot videography.
Conclusion
I have argued that the Visual Investigations team’s coverage of the January 6 Capitol riot is revealing of the way an emerging form of forensic journalism is responding to the epistemic chaos of the contemporary mediatized and fragmented political sphere. Day of Rage is both a product of these mediatized conditions and an effort to confront their political fallout. When asked in an interview what his team wanted to accomplish in compiling footage of the Capitol riot, director Malachy Browne replied: “What we hoped was that this was going to be an indisputable account of what happened that people can refer back to. And you can see it and hear it for yourself. All the conspiracies around it are disproved by the documentary,” adding that they wanted to produce an account that would have “continuing relevance” (quoted in Shields, 2022). While it is likely that Day of Rage will have continuing relevance, it is unlikely that any account of the event will ever be indisputable. Thanks to their collective expertise and the institutional resources they can draw from to produce their work, the Visual Investigations team can bring to the forum a convincing network of converging evidence. The combination of vernacular video, surveillance footage, architectural diagrams, and forensic techniques lends this form its compelling audiovisual aesthetics and epistemic authority. But no matter how well presented the material and media evidence, forensic journalism must compete for credibility and attention under the same conditions of epistemic chaos and political fragmentation that led the rioters on their rampage into the Capitol building.
This is not to dismiss the work of the Visual Investigations team, or the technologies and practices they are developing. Their work is receiving well-deserved recognition for challenging disinformation and appropriating forensic techniques to develop a new form of investigative journalism. Understanding the changing forms of news in these times is a matter of critical relevance to the future of democratic societies. Today, as Lisa Parks (2018) argues, it is necessary to rethink what is meant by “media coverage” to account for the way a wider range of media technologies are used to produce knowledge about the world. For Parks (2018: 7), media coverage still refers to “the practice of documenting and producing interpretive accounts of world events,” but it now encompasses other forms of mediatization, including the vertical mediation of satellites, drones, and security cameras for reporting, mapping, and monitoring. Rethinking media coverage also means understanding how journalists make use of an ever-expanding array of source materials to create interpretive accounts of events, as well as the backend technologies and practices required to do so. “Technology not only makes journalistic mediation possible,” writes Carlson (2017: 152), “it structures what this mediation can look like and how journalists can imagine their work.” The Visual Investigations team is reimagining media coverage and compelling media scholars to do so as well.
Footnotes
Acknowledgements
I wish to thank Dan Hallin, Jason Hill, Olga Kuchinskaya, Brian Martin, Chandra Mukerji, Sandra Ristovska, Christo Sims, and the anonymous reviewers for their helpful comments on this article.
Correction (February 2024):
Article updated online to correct the last sentence of first paragraph “10th-century” to “20th-century” in the online version.
Funding
The author received no financial support for the research, authorship, and/or publication of this article.
