Abstract
Video-based social science research is thriving. Across disciplines and topic areas, researchers use twenty-first century video data to gain novel insights into how social processes and events unfold on the ground. In recent years, “video data analysis” (VDA) has emerged as a methodological framework to facilitate this type of video-based research. The special issue “The Present and Future of Video-based Social Science Research: Innovations in Video Data Analysis” presents methodological innovations that speak to some of the most pressing debates around VDA. Contributions showcase the range of disciplines and research fields VDA is used in, from social interactions and collective behavior to neighborhoods, policing, and public health. This introductory article outlines two areas of growth in VDA methodology that the articles of this special issue speak to: taking advantage of scale and detail in VDA, and situating VDA in the canon of research methods.
Introduction
As the availability of videos and images continues to proliferate in societies across the world, the relevance of video data for social science research increases in kind. Ready-made videos (i.e., videos filmed for non-research purposes using mobile phones, CCTV, body worn cameras, dashboard cameras, drones, or other camera technology) can serve as a tremendous resource for social science researchers. In addition, custom-made videos that have been filmed specifically for research purposes have never been easier to collect in an unobtrusive way. Beyond this availability and ease of access, video data have further strengths (LeBaron et al. 2018; Lindegaard and Bernasco 2018; Nassauer and Legewie 2019, 2021, 2022):
they allow unique insights into social dynamics and processes; they enable greater transparency in research; they contain highly detailed information; they offer the ability to re-watch recorded situations and events; and they enable teams of researchers to analyze the same recording of a situation, process, or event.
In light of these affordances, researchers in the social sciences increasingly use video data to study social life. Video-based research is highly diverse in disciplines and topic areas, goals, methodology, analytical focus, and many more aspects. But there are also clear common threads and much potential for cross-pollination, which creates both the need and the opportunity for a common methodological reference point and shared vocabulary to facilitate communication of methods and findings. The multi-disciplinary framework, video data analysis (VDA; Nassauer and Legewie 2019, 2021, 2022), aims to contribute to this effort.
Under its umbrella, VDA seeks to collect studies characterized by prominent use of video data and a focus on the content of recordings that capture social processes and outcomes as they happen.
The special issue “The Present and Future of Video-based Social Science Research: Innovations in Video Data Analysis” reflects the diverse disciplines and topic areas engaged in VDA and explores cutting-edge methodological debates and innovations around VDA. Each article, in its respective discipline(s) and topic area(s), opens innovative perspectives on the use of twenty-first century video data and beyond. This article contextualizes the contributions to the special issue by discussing two areas of growth in VDA methodology to which the contributions speak: taking advantage of scale and detail in VDA and situating VDA in the canon of research methods.
Video Data Analysis in a Nutshell
VDA is a collection of tools, ideas, and guidelines rather than a narrowly defined approach. Its goal is to connect rather than divide video-based researchers who aim to study social phenomena at the level of situational dynamics. Hence, in practice VDA is used in a great variety of research scenarios: as a standalone method or together with other approaches; in small-N case study designs and large-N projects; and in exploratory, inferential, and experimental research.
In the past decade, the use of approaches that fall under the VDA umbrella has grown exponentially across the social sciences (for an ever-expanding collection, visit www.videodataanalysis.com/research-articles-2/). Application areas include crime and deviant behavior (Bramsen 2017; Lindegaard, Vries, and Bernasco 2018; McCluskey et al. 2019; Nassauer 2018, 2019; Sytsma and Piza 2018), gender (Edmonds and Pino 2023; Mazières, Menezes, and Roth 2021), ethnic and racial relations (Dietrich and Sands 2023; Fairbairn et al. 2013; Malik, Hopp, and Weber 2022; Timmermans and Tavory 2020), health (Asan and Montague 2014; Hoeben et al. 2021), education and learning (Aspelin and Eklöf 2023; Böheim et al. 2020; Bohn et al. 2022; Gentrup et al. 2020), and political science (Antas and Kozień 2018; Brierley, Kramon, and Ofosu 2020; Chang and Peisakhin 2019), among many others.
Despite the many recent substantive applications and methodological reflections, there are a number of areas for further methodological growth. The contributions in this special issue all use video or image data as their primary data, but take the analysis to very different and new places, and thereby address different methodological challenges and discussions, in particular around how researchers can take advantage of scale and detail in VDA and how we can situate VDA in the canon of research methods. In the following sections, we briefly introduce these two growth areas and connect them to the articles in this special issue.
Taking Advantage of Scale and Detail in VDA
Because they are rich in detail, video data require a lot of time and resources to analyze, which creates challenges for VDA researchers (Gentrup et al. 2020; Golann, Mirakhur, and Espenshade 2019; Hoeben et al. 2021). The issue is most obvious when handling large-scale video corpora, but it can also arise when focusing on a few cases for in-depth analysis (e.g., when observing complex scenarios, such as events that stretch over several hours and have thousands of participants). This high demand in time and resources often limits which types of VDA studies are feasible, as researchers have to settle for smaller samples when doing quantitative analyses or less detail when doing in-depth case analyses. A crucial growth area for VDA is hence to make analysis of video data more scalable while still using their detailed information, both for large-N studies and in-depth case analyses. Fortunately, the field of computer vision—the automated processing of image or video data through the use of software algorithms—has made great strides in recent years, and already offers tools to expedite some types of analysis in VDA (for introductions, see Webb Williams, Casas, and Wilkerson 2020; Torres and Cantú 2022; Nassauer and Legewie 2022:211–36).
Several articles in this special issue tackle crucial aspects of scalability and detail in VDA. The article by Hwang, Dahir, Sarukkai, and Wright describes a cost-efficient way to produce the training data that most computer vision applications need (either to train models from the ground up, or to fine-tune existing models to a specific use case). The usual approach would be to provide a relatively high level of training to a small group of individuals and have them label a subset of videos or images. Instead, the authors’ approach utilizes simple tasks and pairwise comparisons. The individuals labeling data in this approach are recruited through crowd-sourcing platforms and require minimal training, which reduces the financial and time investment necessary to produce training data. The authors apply their approach to image data (specifically, the detection of trash levels in US cities, based on Google Street View images), but the same techniques can be employed when labeling video data in a great range of research scenarios. The approach thereby helps researchers apply computer vision in VDA research, where it can facilitate tasks such as the identification of relevant cases or videos and the coding of video data.
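To illustrate how pairwise comparisons can be turned into labels, the following minimal sketch aggregates crowd-sourced judgments of the form "image A shows more trash than image B" into one latent score per image. It uses a Bradley-Terry model, which is a common choice for this kind of data but is an assumption here and not necessarily the procedure used by Hwang and colleagues. The function name, the image identifiers, and the example judgments are hypothetical.

```python
from collections import defaultdict


def bradley_terry(comparisons, n_iter=200, tol=1e-9):
    """Estimate a latent score per item from (winner, loser) pairs.

    Applies the standard iterative updates for the Bradley-Terry model.
    Each pair must contain two distinct items. Scores sum to 1.
    """
    wins = defaultdict(int)         # number of comparisons each item won
    pair_counts = defaultdict(int)  # number of comparisons per unordered pair
    items = set()
    for winner, loser in comparisons:
        wins[winner] += 1
        pair_counts[frozenset((winner, loser))] += 1
        items.update((winner, loser))
    if not items:
        return {}

    scores = {item: 1.0 / len(items) for item in items}
    for _ in range(n_iter):
        new_scores = {}
        for i in items:
            denom = 0.0
            for pair, n in pair_counts.items():
                if i in pair:
                    (j,) = pair - {i}
                    denom += n / (scores[i] + scores[j])
            new_scores[i] = wins[i] / denom if denom > 0 else scores[i]
        total = sum(new_scores.values())
        new_scores = {k: v / total for k, v in new_scores.items()}
        converged = max(abs(new_scores[k] - scores[k]) for k in items) < tol
        scores = new_scores
        if converged:
            break
    return scores


# Hypothetical judgments: the first image in each pair was rated as
# showing more trash than the second.
judgments = (
    [("img_a", "img_b")] * 3 + [("img_b", "img_a")] * 1
    + [("img_b", "img_c")] * 3 + [("img_c", "img_b")] * 1
    + [("img_a", "img_c")] * 4
)
trash_scores = bradley_terry(judgments)
ranking = sorted(trash_scores, key=trash_scores.get, reverse=True)
```

The resulting scores place all images on a common scale, so they could serve as continuous training labels or be binned into categories, depending on the research design.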
Goldstein, Legewie, and Shiffer-Sebba also tackle scalability and detail, focusing on analysis of video data. The authors develop “3D Social Research” (3DSR), a new computer vision tool that allows scalable and precise measurement of physical distance, movement in space, and movement rate of individuals during social interactions. These measurements allow crucial new insights in social interaction research, specifically regarding aspects of kinesics and proxemics (i.e., how people use physical and personal space and how they move their bodies). 3DSR can be used as the focal point of analysis or as a complementary perspective on situational dynamics, and it can be employed in both large-N studies and studies focusing on in-depth case analyses. The approach can therefore contribute to scaling up analysis in VDA studies interested in kinesics and proxemics across all disciplines and topic areas.
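As a rough illustration of the kind of measurement described above, the sketch below computes interpersonal distance and movement rate from three-dimensional position tracks. It does not use or reproduce the 3DSR tool. It assumes that some upstream step has already produced one (x, y, z) coordinate in meters per person and video frame, and all function and variable names are hypothetical.

```python
import math


def interpersonal_distance(track_a, track_b):
    """Per-frame Euclidean distance (in meters) between two people."""
    return [math.dist(p, q) for p, q in zip(track_a, track_b)]


def movement_rate(track, fps):
    """Average speed (meters per second) along one person's track."""
    if len(track) < 2:
        return 0.0
    path_length = sum(math.dist(p, q) for p, q in zip(track, track[1:]))
    duration = (len(track) - 1) / fps
    return path_length / duration


# Hypothetical tracks: person A walks along the x-axis, person B stands still.
person_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
person_b = [(0.0, 3.0, 4.0), (0.0, 3.0, 4.0), (0.0, 3.0, 4.0)]

distances = interpersonal_distance(person_a, person_b)
speed_a = movement_rate(person_a, fps=2)
speed_b = movement_rate(person_b, fps=2)
```

Per-frame distances of this kind could then be summarized per interaction (for example, as a minimum or mean distance) and related to proxemic categories of interest.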
The contribution to this special issue by Bernasco and colleagues demonstrates how computer vision can be employed to solve the scalability problem in a concrete example: adherence to social distancing directives during the COVID-19 pandemic. The authors describe the full research process of using computer vision in a social science context, from producing a training data set to building and training an algorithm for automated coding of videos, testing the algorithm's reliability, and applying the algorithm to a corpus of CCTV video stills. The article illustrates that computer vision can already make significant contributions to social science research today, and holds even greater promise in the near future.
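One step in such a workflow, testing an algorithm's reliability, can be sketched as a comparison between automated codes and human codes for the same video stills. The example below computes Cohen's kappa, a chance-corrected agreement measure. The choice of kappa is an assumption made for illustration, since the summary above does not specify which reliability measure Bernasco and colleagues use, and the codes shown are invented.

```python
from collections import Counter


def cohens_kappa(codes_a, codes_b):
    """Chance-corrected agreement between two coders on the same units."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("Both coders must code the same non-empty set of units.")
    n = len(codes_a)
    # Share of units on which the two coders agree.
    observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    # Agreement expected by chance, given each coder's category frequencies.
    freq_a = Counter(codes_a)
    freq_b = Counter(codes_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / n**2
    if expected == 1.0:
        return 1.0  # both coders used a single identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical codes: 1 = distancing violated, 0 = distancing respected.
human_codes = [1, 1, 0, 0]
algorithm_codes = [1, 0, 0, 0]
kappa = cohens_kappa(human_codes, algorithm_codes)
```

In practice, such a comparison would be run on a held-out set of stills that were not used to train the algorithm.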
Situating VDA in the Canon of Research Methods: Triangulation, Concepts, and Measurements
Another challenge and growth area for VDA concerns situating it in relation to other data types and as a means to scientific insight. What are potential weak points of VDA compared to other types of data, and how can researchers most fruitfully triangulate data types? How can VDA concepts and measurements be employed to speak to important current and future theoretical debates in the social sciences?
One aspect of situating VDA vis-à-vis other data types involves exploring what information videos provide reliably compared to other types of data, what insights we can best glean from video data, and what systematic limitations researchers should consider. Tackling this area, McCluskey and Uchida's contribution to this special issue investigates how videos from body-worn cameras compare to in-person systematic social observation when collecting data on police-citizen encounters. They find significant gaps in audio and video information of the filmed situations in body-worn camera footage, but also identify substantial capacity of such data for assessing important aspects of situational analysis, including sequential developments, causal ordering, and the duration of events. The article helps situate body-worn camera data within the methodological toolbox for analyzing police-citizen encounters while also contributing important insights to broader debates around the validity of video-based measurements.
Another aspect of situating VDA is finding fruitful ways of data triangulation. For instance, understanding the data-generating process for ready-made videos (i.e., how such video data are generated) can be crucial in drawing valid conclusions from these data. The combination of in-person observation and ready-made video data used in McCluskey and Uchida's article offers one way to probe data-generating processes.
Regarding situating VDA as a means of scientific insight, a crucial step is to reflect on concepts used in VDA and the validation of measurements derived from video data. Validation involves exploring issues of validity of data and measures, and relating directly observed data points to some unobserved but important phenomenon. In VDA, this means being able to pinpoint exactly what something directly observable in video data tells us about a more abstract concept of interest. As but one example, let us assume that a team of researchers is interested in studying racial bias, and prior research suggests that conscious or subconscious reservations toward ethnic and racial minorities will manifest in non-verbal cues during social interactions (e.g., Choi, Poertner, and Sambanis 2019; Goff, Steele, and Davies 2008; Zhang, Gereke, and Baldassarri 2022). An important step in this research would be to connect patterns observable in videos of social interactions to established concepts and research perspectives on the topic, such as attitudinal measurement instruments that capture issues around racial bias (e.g., experiences of racism, McNeilly et al. 1996; notions of white privilege, Pinterits, Poteat, and Spanierman 2009) or the notion of “white space” (Anderson 2015). Eventually, one goal for VDA can be to build a library of transparent and validated coding schemes that capture phenomena of frequent interest and can be employed out of the box or used as a reference point when analyzing video data. 3DSR, as presented by Goldstein, Legewie, and Shiffer-Sebba in this special issue, can provide the basis for capturing aspects of kinesics and proxemics in video data and relating them to established measures, or comparing them across national and cultural contexts. Thus, 3DSR can be an important tool for validation studies.
While validation in a narrow sense is mostly an issue for large-N studies focused on statistical inference, current and prospective VDA users who employ in-depth case analysis can profit from more validated video data measurements, too. One example of a validated measure that is often used in in-depth VDA case analyses is Ekman's notion of facial expressions as indicators of human emotion (Ekman 2007). Hence, even though in-depth case study research uses different techniques to ensure the validity of insights drawn from data, scholars in this field can still profit from advances in validation efforts in VDA.
The Future of VDA Research
We find ourselves in a new era, with more and more aspects of social life being captured on video, from the mundane to the extraordinary. In the coming years, this trend will only intensify. The aim of this special issue is to showcase some of the latest methodological innovations and debates in the thriving field of video data analysis. The articles in the issue, written by some of today's most prominent methodologists in video data analysis, introduce cutting-edge solutions to core challenges of VDA research. They also provide readers a glimpse into the fascinating and dynamic world of video-based social science research.
Exciting debates and innovations in VDA methodology are not limited to the two areas of growth covered in this special issue. For instance, with recordings increasing exponentially, one open question is how we can manage video recordings, given that files are usually quite large compared to most other types of data (e.g., tabular data, text data, or even audio files). Golann and colleagues (2019), who collected a total of 11,470 hours of video data for the New Jersey Families Study, are in the process of creating a data repository for their large-scale video project. Questions of data management and accessibility will become more salient in the VDA space over the coming years.
Another growth area relates to sampling in VDA. As video corpora increase in size and large-N VDA applications become more common, researchers need to reflect on issues of sampling in the context of video-based social science studies. How can we assess and improve representativeness in inference-based VDA studies? So far, sampling issues have been largely overlooked in methodological discussions of VDA.
We see a further growth area in mixed-methods VDA, which becomes increasingly relevant for connecting empirical insights. Currently, scholars in the field might study the same topic from a qualitative in-depth angle and from a quantitative VDA angle; recent examples are studies of police use of force based on in-depth case analyses (Nassauer 2023) and on large-N quantitative analysis (Piza and Sytsma 2022). VDA would profit from engaging more with the mixed methods literature and reflecting on how to combine approaches in systematic and meaningful ways.
Lastly, VDA scholars can work to expand techniques for presenting research findings. How can we harness the potential of video data to increase transparency and informativeness when presenting findings? What formats for non-linear, interactive communication of results do video data enable? In reflecting on such questions, VDA research could contribute to enhancing the classic journal article format of scientific publication, not only for VDA studies, but also beyond (for some reflections on this topic, see Nassauer and Legewie 2022:185–204).
Many more important issues and challenges exist, and will continue to emerge as VDA further matures. This brief reflection on further growth areas serves to stress a point demonstrated by the articles in this special issue: methodological research in VDA and adjacent methods will not run out of issues, challenges, and exciting new possibilities to use the tremendous and unique source of insight that is twenty-first century video data in social science research.
Acknowledgements
The authors are grateful to Christopher Winship and Felix Elwert for their encouragement and guidance in editing this special issue. Lisa Charron was an endless source of support and an invaluable partner in the editorial process. Further, they owe a debt of gratitude to all anonymous reviewers who provided their keen insights and constructive comments to improve each contribution to this issue. Last but not least, they would like to thank the authors who submitted their innovative work to the special issue and engaged in thinking about the special issue's theme, “The Present and Future of Video-based Social Science Research: Innovations in Video Data Analysis”.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Data Availability Statement
Data sharing not applicable to this article as no datasets were generated or analyzed during the current study.
