Abstract

Innovations in Digital Research Methods, edited by Peter Halfpenny, emeritus professor of sociology at the University of Manchester, and Rob Proctor, professor of social information in the department of computer science at the University of Warwick, brings together the contributions of 27 British and Australian academicians from the fields of sociology, psychology, computer science, statistics, and spatial analysis. The textbook offers instructors and both graduate and undergraduate students a well-documented survey of how social researchers can best use new e-data sources and emerging digital technologies in today’s hyper-publishing climate. This is a welcome addition to any social research methods class or researcher unfamiliar with the challenges of locating, collecting, and analyzing “Big Data” given the paucity of such topics in popular U.S. social research methods textbooks (see Babbie 2016; Schutt 2019). I should say at the outset that many of the logistical and financial barriers discussed by the various contributors are specific to government or nonprofit agencies in the United Kingdom and Europe. Having said that, readers will find an abundance of information (albeit a bit technical for undergraduates perhaps) that easily mirrors the types of problems North American quantitative and qualitative researchers have become more concerned about in recent years (Burns and Wark 2020; Mills 2018). I am confident the text is an important contribution to discussions at any level on the use of digital tools.
Divided into 13 standalone chapters, the textbook illustrates the e-Research landscape by closely examining topics such as how to access various e-data sources, how to adapt survey research methodologies immersed in big data, data management techniques, the use of statistical software packages, the comingling of quantitative and qualitative e-data sources, spatial analysis opportunities, ethical concerns, and finally, a philosophical challenge to sociologists immersed in the hyperreality of digital lives (Baudrillard [1981] 1994). Additionally, the contributors have provided student resources to supplement each chapter on the publisher’s website, including extensive bibliographies and detailed online resources that clearly demonstrate the book’s worth to researchers, instructors, and students.
Students and instructors may find the introductory chapter less illuminating than the others because editors Halfpenny and Proctor devote a great deal of space to discussing the UK’s e-Infrastructure and e-Science projects. Additionally, undergraduate students may find most of the chapters quite dense and difficult to read. As a reviewer, I found the overall text challenging because many of the chapters were structured around the trials and tribulations of obtaining data from UK sources. That said, I suggest readers and instructors in the United States simply envision the U.S. Census Bureau’s often maddening data download sites in place of the UK-based examples in the book to keep their attention focused. Lest I demonstrate too much bias in my assessment of the textbook’s Euro-specific examples, each chapter’s discussion of an agency or organization is footnoted (on the page cited) with the applicable website address, which can prove useful for cross-national comparisons. A strong advantage of this text is the range of research projects and digital tools discussed, and the variety of government sources mentioned had me imagining how I might adapt similar designs going forward.
Chapter two, “The Changing Social Science Landscape,” by Kingsley Purdam and Mark Elliot, offers insights into problems scientists face collecting data from various information silos, commonly referred to as the mining of “big data.” They offer an insightful observation that as researchers, we have until recently expressed “the notion of data as something we have,” but currently and for the foreseeable future, we must recognize data as something we are “immersed and embedded in” (p. 26; italics in the original). Thus, we must seriously consider spending more research development time contemplating the types of data sources, collection methods, and analytical tools used. The authors offer a typology of different data forms that proves useful to researchers developing a collection process and highlight how the uninitiated can develop a transparent schema. Purdam and Elliot conclude with a discussion of validity and reliability arguments on the use of big data, including recognition that the boundaries between subjects and their inferred data representations often appear opaque. They caution that a rush to publish and subsequent economic challenges can tempt some to rely on social media without carefully weighing theoretical frameworks. Overall, their chapter is a solid narrative of the opportunities and challenges of using e-data sources.
Chapter three, “Exploiting New Sources of Data,” by Mark Elliot and Kingsley Purdam, provides comprehensive case studies that examine elections, civil unrest, migration, homelessness, surveillance issues, social behaviors, and genetics linked to health and well-being. The examples describe the various ethical concerns and tasks researchers struggle with when collecting representative data from various media. Interestingly, the authors discuss the crossover into social research of eye-tracking techniques generally used in the cognitive sciences. In one such case study, researchers tracked subjects’ responses to consumer product images that included references to carbon footprints alongside other product information, which allowed them to infer subjects’ awareness of competing social concerns. The chapter provides clear examples of the growing opportunities for interdisciplinary research and the use of new analytics in combination with new theoretical orientations.
Chapter four, “Survey Methods: Challenges and Opportunities,” by Joe Murphy, discusses the very real difficulties associated with survey research today. Specifically, Murphy examines the anecdotal evidence of poor survey research responses amid populations inundated with daily e-marketing schemes. In chapter five, “Advances in Data Management for Social Survey Research,” Paul S. Lambert discusses the rapid growth of new sources of large-scale societal- and quality group-level data, the statistical software packages most used for analyses, and the development of data management control and storage techniques. The chapter specifically focuses on providing readers an understanding of the mundane intricacies of collecting and coding data from various sources. A highlight of the chapter is the author’s listing of DAMES (Data Management through e-Social Science), a UK product launched to develop new online resources through a series of compiled case studies and applications in data management. In addition, there are complete listings and descriptions of Methodbox, a project dedicated to health-related data; NeiSS (National e-Infrastructure for Social Simulation), which coordinates and stores data for simulation models on a large scale; PolicyGrid, a data management project that established information retrieval systems of social surveys; and CESSDA (Council of European Social Science Data Archives), an umbrella organization that houses national data for distribution.
In chapter six, “Modelling and Simulation,” Mark Birkin and Nick Malleson present case studies that characterize four types of model-based approaches to social science research: spatial interaction model (mathematical), statistical model, demographic model, and crime model. Although the authors acknowledge that these computational models are not new to the social sciences, they argue that a transformation over the last several decades in terms of use and computing power has resulted in profound benefits to social science in general. However, researchers must be aware that collection and ownership of data going forward will become highly prickly issues. In chapter seven, “Contemporary Development in Statistical Software for Social Scientists,” Lambert and colleagues provide a short overview of the major statistical packages used in the UK (primarily SPSS) and a description of Stat-JR, a recent software program (free to UK academics only) that acts as a general user interface in conjunction with a researcher’s other statistical packages. I suggest this is the least necessary chapter for readers or students given the availability and documentation for U.S. software packages in use today, including comprehensive YouTube videos on, for example, the open-source R package.
Chapter eight, “Text Mining and Social Media: When Quantitative Meets Qualitative and Software Meets People,” by Lawrence Ampofo and colleagues, is similar in structure to previous chapters: brief but specific to software packages and case study examples. The authors describe text mining and the use of text as data in various analytical frameworks. An overview of text analyses in the social sciences is offered, including sections on conducting qualitative research with UK-specific computer-assisted software programs. Readers will find some jewels of interest in the section on sentiment analysis, which offers first-time semantic researchers insights into the complexities and benefits of textual analysis. However, the authors rightly caution that the software packages and consequent algorithms cannot substitute for trained human coders who can identify specific contexts for identified words.
Chapter nine, “Digital Records and the Digital Replay System,” by Paul Crabtree and colleagues, reminds readers of the many data points captured each time any one of us begins our daily routines. The authors describe how, as many researchers are aware, our use of electronic devices and other media, whether at work, home, or play, including camera surveillance in postindustrial countries, has produced complex layers of digital footprints. They point out that the pervasiveness of today’s computer technologies and the internet offers researchers an ocean of electrons (logged every step of the way) for theorizing, harvesting, and analyzing. However, researchers should not take lightly the assumption that gaining access to someone’s digital footprint will bolster reliability and validity claims. Readers will find the authors offer up crowdsourcing data examples that demonstrate how researchers can increase public participation.
Chapter 10, “Social Network Analysis,” by Robert Auckland and Jonathan J. H. Zhu, is a brief but comprehensive survey of social network analysis. Most importantly, the authors provide descriptions and website links to the various tools for the collection of online network data. Each tool is specific to a researcher’s interest in the direction (i.e., directed or undirected) of particular social ties. They next offer readers a substantial list of websites that provide tools for analyzing and visualizing networks that have already been identified and downloaded. There are also noteworthy discussions on construct validity and sampling. In chapter 11, “Visualizing Spatial and Social Media,” Michael Batty and colleagues offer important insights into geographic information systems (GIS) and graphical user interfaces (GUIs). The authors offer a brief introduction to the use of two-dimensional and three-dimensional map data and desktop systems and review the various geospatial websites available, including the U.S. Census Bureau’s population density maps and MapTube, which allows visualization of street-level comparisons with neighborhood- and county-level data.
Chapter 12, “Ethical Praxis in Digital Social Research,” by R. J. Anderson and Marina Jirotka, makes a strong case that the use of big data sources and emerging technologies does not preclude serious ethical considerations. They argue for formalized training in research ethics specific to the collection of e-data, regardless of the perceived distance between the researcher and the source data collected. However, they warn that traditional conceptions of ethics as bivalent are less helpful in today’s environment and need to become more adaptable to the rapidly evolving social conditions brought about by digital interactions, retrievals, and storage techniques. Although the authors do not condemn the use of the various data footprint collection modalities in use today, they offer several exemplary in-depth case studies surrounding privacy concerns and emergent social issues as social scientists’ investigations become more nuanced. For example, one case study describes the T3 Project conducted by researchers from Harvard University and the University of California Los Angeles that collected Facebook profiles of the complete intake of Harvard’s students in 2006; updated profiles were recollected in 2007, 2008, and 2009. Although the Institutional Review Boards at both universities cleared the project based on a guarantee of subjects’ anonymity (and an assumption that profiles would lack any predictable pattern that could identify subjects), a researcher from the University of Wisconsin later demonstrated that individual profiles, geographic location, and organizational setting could be inferred by simply studying the project’s codebook (Zimmer 2010). The authors’ examination and discussion of unsuspected lapses in protecting subjects’ privacy and confidentiality is well worth sharing with students and colleagues.
Mike Savage, in Chapter 13, “Sociology and the Digital Challenge,” illustrates that the use of digital sources is in no way a “straightforward project” for sociology (p. 298). He argues that too often, researchers view the tasks emerging from the digital age in an “epochal frame where it is deemed to have intrinsic powers to remake social science” instead of using their sociological imaginations (p. 303). He asserts that the problem for researchers is to unmask the mystery surrounding big data and locate the design, collection, and analysis of data within its historical context so that meaning attributions become more transparent to other researchers. Specifically, he discusses the structural influences that inform researchers’ perspectives, such as the narrative, accounting, and glance that impose a sense of objectivity and validity based on a researcher’s knowledge assertions. Overall, this chapter would add value to any instructor’s efforts to demystify big data and help students better comprehend the necessity for reflexive sociology (Bourdieu and Wacquant 1992).
As previously mentioned, the collection of articles provides substantive insights into digital sources and techniques for both quantitative and qualitative researchers at the undergraduate and graduate levels. The overall layout of the text allows instructors to select chapters according to their syllabus design and in no way requires a linear approach or a cover-to-cover reading of the text’s materials. This textbook, either in whole or in select chapters, would be a great addition to research methods and data analysis courses. The editors can be applauded for their ability to recruit so many knowledgeable contributors. Likewise, the contributors can be commended for providing sufficiently detailed descriptions and explanations (including footnotes and web addresses) that further our understanding of the identification, collection, and use of big data and e-sources.
