Abstract
In this article, we argue that the advent of data mining techniques and big data in media and communication studies presents problems that involve fundamental methodological questions, requiring us to revisit existing ways in which the link between theory, operationalization and data is explained and justified. We note that the discourse of instrumental optimization surrounding big data clouds epistemic debates about their appropriate integration into scholarly explanations, and argue that a discussion of these problems can usefully depart from the distinction between the two main types of data mining models: supervised and unsupervised. We argue that both types pose specific challenges and give examples of ways these have been productively overcome. In particular, we argue that while big data approaches have introduced novel opportunities for research, they have fundamentally been incorporated into media and communication studies in ways that comply with existing, prototypical explanatory schemes. Our examples link specific empirical studies to general strategies of scientific explanation, focusing on neo-positivist, critical realist and interpretivist explanations.
