Editorial introduction: Towards a machinic anthropology

Abstract

Bringing together a motley crew of social scientists and data scientists, the aim of this special theme issue is to explore what an integration or even fusion between anthropology and data science might look like. Going beyond existing work on the complementarity between ‘thick’ qualitative and ‘big’ quantitative data, the ambition is to unsettle and push established disciplinary, methodological and epistemological boundaries by creatively and critically probing various computational methods for augmenting and automatizing the collection, processing and analysis of ethnographic data, and vice versa. Can ethnographic and other qualitative data and methods be integrated with natural language processing tools and other machine-learning techniques, and if so, to what effect? Does the rise of data science allow for the realization of Levi-Strauss’ old dream of a computational structuralism, and even if so, should it? Might one even go as far as saying that computers are now becoming agents of social scientific analysis or even thinking: are we about to witness the birth of distinctly anthropological forms of artificial intelligence? By exploring these questions, the hope is not only to introduce scholars and students to computational anthropological methods, but also to disrupt predominant norms and assumptions among computational social scientists and data science writ large.

Keywords

Computational ethnography, quali-quantitative methods, mixed methods, digital methods, social data science, machine anthropology

This article is part of the special theme on Machine Anthropology. For a full list of all articles in this special theme, see: https://journals.sagepub.com/page/bds/collections/machineanthropology

Introduction

Over the last decade, computational techniques developed by data scientists have transformed social science (Centola, 2018; Jemielniak, 2020; Lazer et al., 2009a, 2009b; Mills, 2019; Mohr et al., 2015; McFarland et al., 2016; Pentland, 2015; Salganik, 2019; Sekara et al., 2016; Veltri, 2019). This has been especially prominent within sociology and political science (Grimmer and Stewart, 2013) but also economics (Gentzkow et al., 2019) and psychology (Goel et al., 2010). Machine-learning methods have revolutionized the quantitative study of text data, just as the field of network science has revitalized the field of social network analysis (Centola, 2018; Sekara et al., 2016). Anthropologists and other social or humanities researchers relying on ethnographic methods have, however, been conspicuously absent from these developments. While the founders of modern anthropology ‘were brought up as part of a larger and far more quantitative’ (Munk, 2019: 163) tradition, socio-cultural anthropology has earned a reputation as ‘one of the least “mathematized” and “computerized” of the social sciences’ (Cunningham, 1996: 401). Indeed, ‘many ethnographers and qualitative researchers more broadly have been reluctant to integrating computing into their “craft”’ (Abramson, 2016: 255), presumably because quantitative methods are widely associated with ‘positivist’ and other purportedly epistemologically and ethically flawed approaches. No matter whether it is due to the limited statistical and computational competencies among many anthropologists, or whether it involves a more fundamental (and largely tacit) aversion to anything quantitative (Feldman, 2017), the fact is that the potentials (as well as the perils) of a computational or indeed machinic anthropology remain largely unexplored. It is just this lacuna that this special theme issue seeks to address.

Can ethnographic data and methods be combined with machine learning? Does the rise of social data science allow for a realization of Levi-Strauss’ old dream for a computational anthropology – and if so, to what benefit (and with what risk)? Bringing together researchers from academia and beyond who have spearheaded the combination of ethnographic and big data, this special theme builds on and goes beyond existing work on the complementarity between ‘thick’ and ‘big’ data by exploring what a fully-fledged integration or even fusion between anthropology and data science might look like. In doing so, the ambition is not only to contribute to the development of novel computational anthropological methods, but also to subvert established conventions within and bifurcations between quantitative and qualitative social science.¹

Anthropology and data science: Three strategies

In broad terms, one can distinguish between three different strategies for how anthropologists as well as scholars from cognate disciplines relying on ethnographic approaches² have over the last decade engaged with the so-called Big Data revolution, namely what might be called the anthropology of (big) data science, the anthropology with data science, and the anthropology as – or by – data science. While there are several overlaps between these three strategies when it comes to both analytical approach and personnel, each can be said to represent a distinct positioning within, and attitude towards, social scientific engagements with computational methods and data science in the academy and beyond. Let us now consider these in turn, with emphasis on the third strategy, which echoes the aspiration of the present special theme issue. As we shall see, while there are only a few existing examples of studies that fall within this latter category, it is arguably also this strategy that holds the greatest transformative potential for both qualitative social science and data science alike.

The anthropology of big data/data science has so far been most dominant. Numerous scholars have studied data science and data scientists ethnographically (e.g. Bell et al., 2015; Madsen et al., 2018; Douglas-Jones et al., 2021; Kockelman, 2020; Mackenzie, 2017), as part of a wider interest in ‘critical data’ (Blok and Pedersen, 2014) or ‘critical algorithm’ (Seaver, 2017) studies. These approaches serve as a corrective to the techno-futurism (McGillivray et al., 2020) of ‘big data’ narratives. Yet, while recent ethnographic work on vernacular data practices (e.g. Hobbis and Hobbis, 2022) offers a welcome reality-check to the alarmism of much critical work on the political economy of digital data, the anthropology ‘of data’ literature is still dominated by ‘a distant, neutral gaze’ (Paff, 2022) posited to be at safe distance from the big data reality studied. Crucially, this sense of epistemological-cum-ethical superiority and distance extends to the algorithms and other computational techniques deployed by the data scientists themselves. While this is unsurprising given the skepticism towards quantitative data and quantification among modern-day socio-cultural anthropologists and critical sociologists, it is still unsatisfactory. Indeed, there is something profoundly non-anthropological about the knee-jerk way in which anthropologists tend to reduce quantitative data and methods to mere objects of critique. As several scholars have pointed out (Fortun et al., 2017; Knox and Nafus, 2018; Paff, 2022), the entanglements between anthropology and data science cannot be reduced to a relationship between an etic subject that studies and an emic object of study. There is a need for more respect, curiosity and symmetry between the two communities – including toward their respective methods.

The anthropology with big data is a step in this direction. Representing different attempts to bring disparate qualitative and quantitative datasets into dialogue through ‘integration’ (Charles and Gherman, 2019) and ‘complementarity’ (Blok and Pedersen, 2014), this ‘quali-quantitative’ (Venturini and Latour, 2010) literature includes Bornakke and Due’s (2018) ‘big-thick blending’ of observational and camera data, Christin's (2020) triangulation between ethnographic and algorithmic data, and Blok et al.’s (2017) ‘stitching’ of fieldwork notes and sensor data. Yet, because these and similar studies (e.g. Beaulieu, 2017; Ford, 2014; Lowrie, 2018; Ruckenstein, 2019) tend to be pilot work and are thus hard to generalize from, new generations of social data scientists are left ‘in the dark’ as to how to design ‘process[es] of complementing big and thick insights’ and what is the most ‘practical method for integrating big and thick data’ (Bornakke and Due, 2018: 13).

Several contributions to this special issue take steps in this direction. Heeding Breiger et al.’s (2018) call for a ‘low-tech formalization for text analysis’, Isfelt et al. combine netnographic material about green activists in Denmark with computationally generated Twitter data in order to write a ‘micro-history of ideas in real-time’. Under the heading ‘thick quali-quantitative data’, Albris et al. reflect on how digitized logbooks can optimize the collection, processing and analysis of (n)ethnographic data. Finally, based on research focused on the micro-sociological aspects of international diplomacy, Adler-Nissen and her colleagues illustrate how the political scientific study of social media data can benefit from treating them as both singular data points in a larger pattern and as fluid objects embedded in broader social processes. In each of these cases (as in other studies in the same mould, such as Moats and Borra, 2018; Pretnar and Podjed, 2019), an established field (anthropology, international relations) is augmented by juxtaposing ethnographic and computational methods in hands-on ways, which may inspire more quali-quantitative applications in the future.

Anthropology as data science is almost terra incognita. Perhaps the reason why is that this approach requires ‘enacting [the qual-quant difference] into our own practice as anthropologists’ (Paff, 2022). Whereas, in the two above strategies, anthropology is reproduced as a distinct discipline with its own methodological tools (fieldwork) and epistemological assumptions (e.g. about quantitative methods), this third strategy seeks to disrupt, transform and expand what anthropology is, or rather could be. Of course, quantitative anthropology has a long history, even if it has always been marginalized in the discipline (Chibnik, 1999; Pedersen, 2021; Schaffer, 1994). After all, people who do fieldwork count things all the time, no matter whether they recognize this or not (Pedersen, 2019). Famous figures from classic British (e.g. Gluckman, 1961) and American (Driver and Kroeber, 1932) anthropology promoted quantitative data and methods, and the history of the discipline is awash with attempts to introduce a more formalized and mathematicised collection, processing and analysis of ethnographic data, although these have generally left little impact.³ However, it was Levi-Strauss who first called for a computational anthropology. ‘The fundamental requirement of anthropology’, he put it, ‘is that it begins with a personal relation and ends with a personal experience, but … in between there is room for plenty of computers’ (cited in Hymes, 1965; see also Levi-Strauss, 1963). Yet, due to a combination of lacking processing power, technical skills, and institutional backing, he never ‘developed … a systematic program of investigation based upon the repertory of basic mathematical structures’ (de Almeida et al., 1990: 370). Instead of using computers in concrete anthropological research, ‘the imagined computer allow[ed] Lévi-Strauss’ ideal method to exist, in theory’ (Seaver, 2014).⁴

Only a limited number of recent works represent a genuine anthropology as data science approach. These include Hsu's (2014) spirited plea for ‘unleashing’ quantitative data ‘from the disciplinary compartmentalization of science’ to ‘discover new interpretative and speculative territories’ and Brooker's no less enthusiastic suggestion to ‘incorporate bots into the sociological mold to harness them for sociological service’ (2019: 2). After all, as Brooker also elaborates upon in his commentary to this special theme issue, algorithms are imbued with ‘the potential to perform a wider range of sociologically [and by implication anthropologically] relevant functions’ (2019: 1234). Munk et al.'s contribution to this issue, where three authors, in their own words and in deliberately playful fashion, seek to build ‘an ethnographic algorithm capable of passing for a native’, is a case in point (indeed, during the workshop in Copenhagen, Munk and colleagues brought with them a physical prototype of an ‘anthropological machine’!). Indeed, deliberate playfulness is a characteristic feature of much research in the intersection between data science and qualitative social science, including my own (e.g. Blok et al., 2017).⁵ Yet, one might (self)critically note, the ultimate goal of a computationally enhanced critical anthropology must be to transcend the binary between the ‘playful’ and the ‘earnest’ by harnessing AI methods to address fundamental social scientific questions.

Doja and colleagues’ idiosyncratic combination of ‘fuzzy logic, probabilities, machine learning, and maps manipulation’ (as they put it in this issue) is probably the most earnest attempt yet to realize Levi-Strauss’ old computational dream. Unlike word embedding models that train neural networks to ‘provide insight into the relationship between individual words and the overall conceptual structure undergirding a text’ (Kozlowski et al., 2019), Doja et al. directly seek to simulate the generative structural logic allegedly undergirding the spatio-temporal unfolding of all myths (Bruchansky, 2019). But in more overarching terms, as a ‘Turing test of Amerindian mythology’ (Santucci et al., 2020), their neo-structuralism fundamentally resembles the ‘purely relational approach to modeling’ (Kozlowski et al., 2019) that has recently gained traction among computational cultural sociologists (e.g. Evans and Aceves, 2016). Certainly, there is a sense in which unsupervised machine learning is ‘a kind of folk structuralism – that if we number-crunch culture in a sophisticated enough way, the “latent” … mathematical structures of our (secretly computational) minds will be uncovered’ (Castelle, 2018; see also Pedersen and Nielsen, 2018; Santucci et al., 2020).

Towards a machinic anthropology

Other scholars have theorized and typologized the relation between anthropology/ethnography and data science. Let us now consider these to nail down more precisely what is specific about the machine anthropology agenda. Munk and Winthereik (2022) outline their vision for a ‘computational ethnography’, whose aspiration is to both ‘appropriat[e] digital media as its empirical material and us[e] computational techniques for gathering and analyzing this material’. This definition of computational ethnography calls to mind well-known understandings of digital methods by Marres (2017) and Rogers (2019). While not a problem as such, it does raise the question of why a new term (‘computational ethnography’) is needed, just as it leaves unresolved how to conceive of social data science studies that use digital devices as sources of quantitative data in their own right, and less as objects of critical meta-analysis (e.g. Anderson et al., 2009; Lohse et al., 2022). Indeed, this is probably the main difference between digital methods and machine anthropology. Whereas the former predominantly uses preexisting, often commercial digital software and visualization tools for its critical analysis of the affordances of digital infrastructures and natively digitalized data, the object and aim of machine anthropology is not restricted to digital phenomena. Instead, the focus here is the digital as a means, using computers for the processing and analysis of data.⁶ Here, unstructured data afforded by digital devices and platforms (e.g. social media posts, or biometric or geolocation data logged in wearable sensors) are used as a source of methodological development, including the programming of specific-purpose algorithms that are built from scratch to collect (e.g. scrape), pre-process (e.g. clean), process (e.g. mine) and analyse (e.g. model) this data.

Munk (2019) and Paff (2022) both capture this data sciency aspiration. Thus the variant of quali-quant research that Munk calls ‘algorithmic sensemaking’ does ‘not involve any conventionally qualitative work but rather solicits sensemaking to quantitative community detection and pattern recognition’ (2019: 165). An apt example is Munk et al.'s contribution to this issue, where the incongruity between big and thick data is used to test the limits of both. Turning now to Paff, his recent call for an ‘anthropology by data science’ closely resembles what I in the previous section called the ‘anthropology as data science’ approach. Suggesting that ‘anthropology and data science do not possess fundamental theoretical or philosophical differences’, Paff calls for the incorporation of machine-learning techniques in ‘ethnographies and other anthropological research’ (2022). This claim – that there is a correspondence between anthropology and data science in terms of methodology, epistemology and metaphysics – echoes the call for the 2020 ‘Machine Anthropology’ workshop.⁷ Yet, as I am going to suggest now, what is called for is a more nuanced rendering of the similarities and the differences between qualitative social science, quantitative social science and data science, which delineates and reflects on the strengths as well as the weaknesses of each of the three approaches.

To be sure, the data-driven attitude of data scientists does come much closer to the explorative aspiration of anthropology and other grounded-theory-informed approaches than the theory-driven hypothesis-testing characteristic of mainstream quantitative social science. As Sapienza and Lehmann put it in this issue, as ‘data scientists … we are not hypothesis-driven… We are looking for questions that can be convincingly answered by our dataset’ (see also Milner, 2018). Here, the old anthropological ideal of ‘taking people seriously’ (Malinowski, 1961) via bottom-up and radically empiricist research reappears among computer scientists doing ‘AI in the wild’ (Dyson, 2019). Yet, as shown in publications in this journal (e.g. Kitchin, 2014) and others (e.g. Radford and Joseph, 2020; Shmueli, 2010), this does not mean the ‘end of theory’ (Anderson, 2008). Social science theory, after all, ‘is useful not only in generating hypotheses, but also in selecting an appropriate way of measuring constructs with big data’ (Lazer et al., 2021). That is to say, the role of theory in computational social science/social data science research has to do with the all-important issues of construct (Cronbach and Meehl, 1955) and measurement (Adcock and Collier, 2001) validity.

Carlsen and Ralund's contribution to this issue is a case in point. Via a critical discussion of Nelson's ‘computational grounded theory’ (2021), they present a detailed protocol for a state-of-the-art quali-quantitative analysis of large-scale text data. As its name indicates, the computer-assisted learning and measurement (CALM) protocol leverages the advantages of unsupervised machine-learning techniques for improving especially the more explorative phases of computational text analysis, while at the same time systematically deploying qualitative methods and measures to mitigate the problems with the topic modeling method, which has become widely used both within and outside the academy over the last 15 years or so (DiMaggio et al., 2013; Mohr and Bogdanov, 2013). Indeed, because it allows for the iterative integration of qualitative insights into data work, model building and the final analysis, CALM offers perhaps the most concerted attempt made so far to put together a directly applicable framework for a so-called ‘abductive logic of inquiry’ (Brandt and Timmermans, 2021) in the study of digital phenomena and/or data.⁸

But how then to conceive of quali-quantitative analyses, like Blok et al.'s political ‘micro-history’ (this issue), where new concepts are formulated via an iterative oscillation between data, model and theory to describe and theorize a particular state of affairs? While such studies evidently qualify as ‘grounded theory’ in both Nelson's and ‘pre-digital’ senses of this term, their scope and aspiration seem to differ slightly from the methods of abduction according to recent sociological accounts of this concept (Brandt and Timmermans, 2021). Here, the bottom-up discovery and development of new concepts is done, not to contribute to existing big theory (as Brandt and Timmermans would have it), but to take the specific phenomenon under investigation theoretically seriously without necessarily having any impetus towards generalization (cf. Holbraad and Pedersen, 2017). We are here reminded of Levi-Martin's notion of ‘mathematical sociology’, which presents a pragmatist alternative to established sampling strategies and representativity within quantitative social science research. As Levi-Martin himself puts it, ‘[we] want be able to mathematize this group, with its number of isolated people right here, right now. If that doesn’t do justice to the population of all possible sets of groups, then so be it. Mathematical sociology isn’t about inference in this sense of sampling, and we shouldn’t let statisticians come in and smash [our] more delicate constructions’ (Levi-Martin, 2020: 27). Which raises the question: Perhaps anthropology could be mathematical too? Not in the naturalist sense of cognitive anthropologists (e.g. Sperber, 1985), but in the pragmatist sense of Levi-Martin and Chicago colleagues (e.g. Abbott, 2004). This would not only allow for a contemporary version of Levi-Strauss’ old computational structuralist vision; it might also open the way for a fusion between the radically empiricist commitment of much contemporary anthropology and sociology's continuing commitment to big theory building.

Certainly, there are several low-hanging ethnographic fruits. Consider ethnographic fieldnotes, which still tend to be collected and processed in predominantly manual ways, even with the availability of software like NVivo.⁹ Yet, their unstructured nature makes them particularly compatible with unsupervised machine-learning methods (Nelson, 2020: 7). What is more, as Albris et al. point out in this theme issue, once qualitative researchers begin analyzing fieldnotes via computational methods, they can suddenly ‘ask different kinds of questions, such as: Does the style of fieldnotes depend on where the fieldwork took place? Does the length of a fieldnote depend on the time span in which it took place? Does group-based fieldwork impact … fieldnotes?’ Still, barring a few recent exceptions (Abramson et al., 2018; Astrupgaard et al., 2022; Marathe and Toyama, 2018), NLP methods have not been used systematically on ethnographic fieldnotes.
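To give a sense of how modest the technical threshold for such questions can be, the following is a minimal sketch in Python, not a method taken from any contribution to this issue. It assumes that fieldnotes have been digitized as records with a site, a start and end time, and a text; the field names, the toy records and the helper functions are all hypothetical and introduced purely for illustration. It asks two of the questions quoted above in their simplest possible form: whether note length co-varies with the time span of the observation, and whether it differs by field site.

```python
from collections import defaultdict
from datetime import datetime
from math import sqrt

# Toy records invented for illustration; real fieldnotes would be loaded
# from digitized logbooks or files. All field names are assumptions.
NOTES = [
    {"site": "market", "start": "2020-01-27 09:00", "end": "2020-01-27 10:00", "text": "note " * 10},
    {"site": "market", "start": "2020-01-27 13:00", "end": "2020-01-27 15:00", "text": "note " * 20},
    {"site": "office", "start": "2020-01-28 09:00", "end": "2020-01-28 12:00", "text": "note " * 30},
    {"site": "office", "start": "2020-01-28 13:00", "end": "2020-01-28 17:00", "text": "note " * 40},
]

TIME_FORMAT = "%Y-%m-%d %H:%M"


def word_count(note):
    """Length of a fieldnote, measured as whitespace-separated words."""
    return len(note["text"].split())


def duration_hours(note):
    """Time span of the observation that the note covers, in hours."""
    start = datetime.strptime(note["start"], TIME_FORMAT)
    end = datetime.strptime(note["end"], TIME_FORMAT)
    return (end - start).total_seconds() / 3600


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / sqrt(var_x * var_y)


def mean_length_by_site(notes):
    """Average fieldnote length (in words) for each field site."""
    lengths = defaultdict(list)
    for note in notes:
        lengths[note["site"]].append(word_count(note))
    return {site: sum(values) / len(values) for site, values in lengths.items()}


if __name__ == "__main__":
    durations = [duration_hours(n) for n in NOTES]
    lengths = [word_count(n) for n in NOTES]
    print("length vs. time span (Pearson r):", pearson(durations, lengths))
    print("mean length by site:", mean_length_by_site(NOTES))
```

A correlation coefficient and a table of means are of course only a starting point. Whether word count is a meaningful proxy for ‘length’ or ‘style’, and whether a handful of notes supports any inference at all, remain questions of construct and measurement validity of the kind discussed above.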

But the potential contribution of machine anthropology goes beyond the automatization and augmentation of fieldnote collection, processing and analysis. As Glavind and Bjerre-Nielsen suggest (this issue), a very significant (but so far largely ignored) interdisciplinary advantage of ethnographic data is the fact that they can be used to ‘validate [quantitative] data by establishing a ground truth … to examine whether the data measures what the researcher think it measures’ (see also Grigoropoulou and Small, 2022; Marda and Narayan, 2021). Indeed, the impact of qualitative research would probably increase significantly if its practitioners tapped into data science narratives of ‘ground truth’ as ‘information gathered via direct observation …[used as] …the standard with which to compare the performance of a model’ (Corwin and Erickson-Davis, 2020).

But there is yet one further implication of the machine anthropological project. At issue is whether ethnographic data should, in fact, be deemed ‘small’ in the first place. As Glavind and Bjerre-Nielsen go on to argue, ethnographic data has ‘high depth (“high M”) since, for each individual or setting, the ethnographer can potentially list hundreds, possibly thousands, of details’. Moreover, they typically have ‘a temporal dimension … (“high T”) … from observing individuals in a specific setting for a couple of hours [or] following the same individuals across settings for months or years’. In other words, ethnographic data is ‘“big” in the same way that “big data” is big, even though N is small’ (see also Pedersen, 2019). This has huge ramifications for social science. If field notes and other qualitative data are ‘big’ on this alternative measure of size, it not only underscores the earlier mentioned need to introduce computational methods in their collection, processing and analysis, but it also raises the more fundamental issue of whether such data should always be conceived as qualitative in the first place. Perhaps what is needed is more quantitative ethnographic data (standardized records of systematic participation and/or observation), and less reflexive, poetic and intersubjective ethnography – in short, less qualitative fundamentalism and epistemological exceptionalism?

For such a distinctly anthropological and distinctly quantitative research agenda to succeed, scholars and students subscribing to an anthropological identity will have to do away with some of their most deeply held beliefs. In particular, they will need to make a separation between ethnographic data and anthropological analysis. Indeed, this might be the main difference between machine anthropology and the cognate projects discussed above, where ethnography and anthropology tend to be used as synonyms with little semantic difference. Conversely, throughout this introduction, I have sought to operate with a principled and systematic distinction between ethnography and anthropology (Agar, 2006; Ingold, 2014). Indeed, it seems to me, it is precisely in this separation between anthropology as a distinct analytical method and mode of theorizing on the one hand, and ethnography as a certain empirical method and data collecting and processing form on the other, that the potentials and promises of a future machine anthropology can be located.

We can, then, think of machine anthropology as a strong version of computational anthropology in Munk and Winthereik's sense (2022). Leveraging the technical advancements brought about by the data science revolution and combining these methodological innovations with an epistemological re-orientation towards a mathematical sociology, the machine anthropology project aspires to expand the very scope of anthropological inquiry by embracing quantitative thinking and computational methods. For anthropology to embrace its machinic potential, it will require a widening of the discipline's data, methods and identity. In addition to data obtained through ethnographic fieldwork (be they ‘thick’ qualitative or ‘big’ quantitative), the mathematical machine anthropologist must also be open towards other registers of data, ranging from large corpora of scraped tweets (Breslin et al., 2022) to experimentally sampled and collected as well as statistically processed and modelled sensor data (Lohse et al., 2022). For such a more-than-qualitative transformation and extension of anthropology to happen, it will involve questioning some of the most deeply held convictions of scholars and students from this discipline, including a bracketing of ethnography as anthropology's primary – and to some, only – method. Only when, or rather if, this happens, will computers cease to be merely ‘good to think [anthropology] with’ – whether in the modernist imaginary espoused by Levi-Strauss or the postmodernist stereotype of an evil, quantitative other popular in certain quarters of the academy – and become vehicles for the emergence of distinctly anthropological forms of machine learning and AI.

Footnotes

Acknowledgments

This work was made possible by funding from the DISTRACT Advanced Grant project (grant 834540 from the European Research Council). Apart from the contribution from Blok et al., all articles and commentaries are the product of the Machine Anthropology Workshop, which was held at the Copenhagen Center for Social Data Science (SODAS) on the 27th and 28th of January 2020 to inaugurate the DISTRACT project. In addition to the contributions to the present theme issue, the workshop also included presentations by Krista Lagus and Minna Ruckenstein, Ajda Pretnar and Dan Podjed, Marie Cury and Sebastian Barfort, Daniel Souleles and Nicholas Skar-Gislinge, as well as by Andreas Refsgaard. The author would like to thank all these people, as well as Andreas Roebstorff, Eva Iris Otto, Sophie Smitt Sindrup Grønning, and Emilie Munch Gregersen and everyone in the DISTRACT team (including Thyge Enggaard), for their invaluable academic and/or administrative assistance in making this workshop a successful inauguration of the ERC project. The author is also indebted to Jennifer Gabrys and Matthew Zook for their perceptive comments on a draft of this introduction, and to Jennifer for her advice, assistance (and patience!) in co-editing this theme issue with me. A special thanks also to John Levi-Martin for stimulating discussions on the topic of machine anthropology.

Declaration of conflicting interests

The authors declared no potential conflicts of interest with respect to the research, authorship and/or publication of this article.

Funding

The authors disclosed receipt of the following financial support for the research, authorship and/or publication of this article: This work was supported by the ERC (grant number 834540).

ORCID iD

Morten Axel Pedersen

Notes

References

Abbott

(2004) Methods of Discovery: Heuristics for the Social Sciences. Contemporary Societies. New York: W.W. Norton & Co.

Abramson

(2016) What in/is the world is/of big data. Fieldsights–Cultural Antropology. Available at: https://culanth.org/fieldsights/what-in-is-the-world-is-of-big-data.

Abramson

Joslyn

Rendle

, et al. (2018) The promises of computational ethnography: Improving transparency, replicability, and validity for realist approaches to ethnographic analysis. Ethnography 19(2): 254–284.

Adcock

Collier

(2001) Measurement validity: A shared standard for qualitative and quantitative research. American Political Science Review 95(3): 529–546.

Agar

(2006) An ethnography by any other name. Forum Qualitative Sozialforschung/Forum: Qualitative Social Research 7(4).

Anderson

(2008) The end of theory: The data deluge makes the scientific method obsolete. Wired Magazine 16(7).

Anderson

Nafus

Rattenbury

, et al. (2009) Ethnographic Praxis in Industry Conference Proceedings: 123–140.

Arora

, et al. (2018) Ethnographic Praxis in Industry Conference Proceedings: 224–244.

Astrupgaard

Gregersen

Sandbye

(2022) See You Later, Thick Data?: How we experimented with doing collaborative fieldwork as part of an interdisciplinary research project. In: Anthrodendum. Available at: https://anthrodendum.org/author/distract/ (accessed 17 November 2022).

10.

Beaulieu

(2017) Vectors for fieldwork: Computational thinking and new modes of ethnography. In: Hjorth

Horst

Galloway

, et al. (eds) The Routledge Companion to Digital Ethnography. London: Routledge, pp.55–65.

11.

Bell

Gregg

Seaver

(2015) Data, Now Bigger and Better!. Chicago.

12.

Blok

(2020) Commentary: Why (and how to) experiment with digital social data? STS Encounters 11(1): 118–140.

13.

Blok

Carlsen

Jørgensen

, et al. (2017) Stitching together the heterogeneous party: A complementary social data science experiment. Big Data & Society 4(2): 205395171773633.

14.

Blok

Pedersen

(2014) Complementary social science? Quali-quantitative experiments in a big data world. Big Data & Society 1(2): 205395171454390.

15.

Bornakke

Due

(2018) Big–thick blending: A method for mixing analytical insights from big and thick data sources. Big Data & Society 5(1): 205395171876502.

Brandt and Timmermans (2021) Abductive logic of inquiry for quantitative research in the digital age. Sociological Science 8: 191–210.

Breiger, Wagner-Pacifici and Mohr (2018) Capturing distinctions while mining text data: Toward low-tech formalization for text analysis. Poetics 68: 104–119.

Breslin, Blok, Enggaard, et al. (2022) “Affective publics” performing trust on Danish Twitter during the COVID-19 lockdown. Current Anthropology 63(2): 211–218.

Brooker (2019) My unexpectedly militant bots: A case for programming-as-social-science. The Sociological Review 67(6): 1228–1248.

Bruchansky (2019) Machine learning: A structuralist discipline? AI & Society 34(4): 931–938.

Castelle (2018) Social Theory for Generative Networks (and Vice Versa). Available at: https://castelle.org/pages/social-theory-for-generative-networks-and-vice-versa.html.

Centola (2018) How Behavior Spreads: The Science of Complex Contagions. Princeton: Princeton University Press.

Charles and Gherman (2019) Big data analytics and ethnography: Together for the greater good. In: Big Data for the Greater Good. New York: Springer, pp.19–33.

Chibnik (1999) Quantification and statistics in six anthropology journals. Field Methods 11(2): 146–157.

Christin (2020) The ethnographer and the algorithm: Beyond the black box. Theory and Society 49(5): 897–918.

Corwin and Erickson-Davis (2020) Experiencing presence: An interactive model of perception. HAU: Journal of Ethnographic Theory 10(1): 166–182.

Cronbach and Meehl (1955) Construct validity in psychological tests. Psychological Bulletin 52(4): 281–302.

Cunningham (1996) Machine learning applications in anthropology: Automated discovery over kinship structures. Computers and the Humanities 30(6): 401–406.

Curran (2013) Ethnographic Praxis in Industry Conference Proceedings: 62–73.

Cury, et al. (2019) Ethnographic Praxis in Industry Conference Proceedings: 254–281.

de Almeida MWB, Arcand, Jorion, et al. (1990) Symmetry and entropy: Mathematical metaphors in the work of Levi-Strauss [and comments and reply]. Current Anthropology 31(4): 367–385.

DiMaggio (2015) Adapting computational text analysis to social science (and vice versa). Big Data & Society 2(2): 205395171560290.

DiMaggio, Nag and Blei (2013) Exploiting affinities between topic modeling and the sociological perspective on culture: Application to newspaper coverage of US government arts funding. Poetics 41(6): 570–606.

Douglas-Jones, Walford and Seaver (2021) Introduction: Towards an anthropology of data. Journal of the Royal Anthropological Institute 27: 9–25.

Driver and Kroeber (1932) Quantitative Expression of Cultural Relationships. Berkeley: University of California Press.

Dyson (2019) AI That Evolves in the Wild: A Talk By George Dyson. Available at: https://www.edge.org/conversation/george_dyson-ai-that-evolves-in-the-wild.

Enggaard, Lohse, Pedersen, et al. (2023) Dialectograms: Machine Learning Differences between Discursive Communities. arXiv. Available at: https://arxiv.org/abs/2302.05657

Evans and Aceves (2016) Machine translation: Mining text for social theory. Annual Review of Sociology 42(1): 21–50.

Feldman (2017) Big data and ethnology. Anthropology Today 33(3): 1–2.

Fielding (2012) Triangulation and mixed methods designs: Data integration with new research technologies. Journal of Mixed Methods Research 6(2): 124–136.

Ford (2014) Big data and small: Collaborations between ethnographers and data scientists. Big Data & Society 1(2): 205395171454433.

Fortun and Marcus (2017) Computers in/and anthropology: The poetics and politics of digitization. In: The Routledge Companion to Digital Ethnography. New York: Routledge, pp.37–46.

Fuhse, Stuhler, Riebling, et al. (2020) Relating social and symbolic relations in quantitative text analysis. A study of parliamentary discourse in the Weimar Republic. Poetics 78: 101363.

Gentzkow, Kelly and Taddy (2019) Text as data. Journal of Economic Literature 57(3): 535–574.

Gluckman (1961) Ethnographic data in British social anthropology. The Sociological Review 9(1): 5–17.

Goel, Hofman, Lahaie, et al. (2010) Predicting consumer behavior with web search. Proceedings of the National Academy of Sciences 107(41): 17486–17490.

Grigoropoulou and Small (2022) The data revolution in social science needs qualitative research. Nature Human Behaviour 6: 904–906.

Grimmer, Roberts and Stewart (2021) Machine learning for social science: An agnostic approach. Annual Review of Political Science 24: 395–419.

Grimmer and Stewart (2013) Text as data: The promise and pitfalls of automatic content analysis methods for political texts. Political Analysis 21(3): 267–297.

Hobbis (2022) Beyond platform capitalism: Critical perspectives on Facebook markets from Melanesia. Media, Culture & Society 44(1): 121–140.

Holbraad and Pedersen (2017) The Ontological Turn: An Anthropological Exposition. Cambridge: Cambridge University Press.

Hsu (2014) Digital ethnography toward augmented empiricism: A new methodological framework. Journal of Digital Humanities 3(1): 3–1.

Hymes (1965) The Use of Computers in Anthropology. The Hague: Mouton.

Ingold (2014) That’s enough about ethnography! Hau: Journal of Ethnographic Theory 4(1): 383–395.

Jemielniak (2020) Thick Big Data: Doing Digital Social Sciences. Oxford: Oxford University Press.

Kitchin (2014) Big data, new epistemologies and paradigm shifts. Big Data & Society 1(1): 205395171452848.

Knox and Nafus (2018) Introduction: Ethnography for a data-saturated world. In: Ethnography for a Data-Saturated World. Manchester: Manchester University Press, pp.1–30.

Kockelman (2020) The epistemic and performative dynamics of machine learning praxis. Signs and Society 8(2): 319–355.

Kozinets (2019) Netnography: The Essential Guide to Qualitative Social Media Research, 3rd ed. Thousand Oaks, CA: SAGE Publications.

Kozlowski, Taddy and Evans (2019) The geometry of culture: Analyzing the meanings of class through word embeddings. American Sociological Review 84(5): 905–949.

Lazer, Hargittai, Freelon, et al. (2021) Meaningful measures of human society in the twenty-first century. Nature 595(7866): 189–196.

Lazer, Pentland, Adamic, et al. (2009b) Computational social science. Science (New York, N.Y.) 323: 721–723.

Lazer, Pentland, Adamic, et al. (2009a) Life in the network: The coming age of computational social science. Science (New York, N.Y.) 323(5915): 721–723.

Levi-Martin (2020) Thinking Through Statistics. Chicago: University of Chicago Press.

Levi-Strauss (1963) Structural Anthropology. New York: Doubleday Anchor Books.

Lohse and Gregersen (2022) Measuring Attention Ethologically: How an Interdisciplinary Team of Social Data Scientists and Anthropologists Conducted a Collaborative Field Experiment. Paper presented at Attention: An Interdisciplinary Workshop, LSE, 14–15 Sept. 2022.

Lowrie (2018) Algorithms & automation: An introduction. Cultural Anthropology 33(3): 349–359.

Mackenzie (2017) Machine Learners: Archaeology of a Data Practice. London: MIT Press.

Madsen, Blok and Pedersen (2018) Transversal collaboration: An ethnography in/of computational social science. In: Ethnography for a Data-Saturated World. Manchester: Manchester University Press, pp.183–211.

Malinowski (1961) Argonauts of the Western Pacific. New York: E.P. Dutton.

Marathe and Toyama (2018) Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems: 1–12.

Marda and Narayan (2021) On the importance of ethnographic methods in AI research. Nature Machine Intelligence 3(3): 187–189.

Marres (2017) Digital Sociology: The Reinvention of Social Research. Hoboken: John Wiley & Sons.

McFarland, Lewis and Goldberg (2016) Sociology in the era of big data: The ascent of forensic social science. American Sociologist 47: 12–35.

McGillivray, Alex, Ames, et al. (2020) The challenges and prospects of the intersection of humanities and data science: A white paper from The Alan Turing Institute.

Mills (2019) Big Data for Qualitative Research. New York: Taylor & Francis.

Milner (2018, April 24) Newton didn’t frame hypotheses. Why should we. Physics Today.

Moats (2021) Rethinking the ‘Great Divide’: Approaching interdisciplinary collaborations around digital data with humour and irony. Science & Technology Studies 34(1): 19–42.

Moats and Borra (2018) Quali-quantitative methods beyond networks: Studying information diffusion on twitter with the modulation sequencer. Big Data & Society 5(1): 205395171877213.

Moats and Seaver (2019) “You social scientists love mind games”: Experimenting in the “divide” between data science and critical algorithm studies. Big Data & Society 6(1): 205395171983340.

Mohr and Bogdanov (2013) Special issue title: Topic models and the cultural sciences. Poetics 41(6): 545–569.

Mohr and Rawlings (2015) Formal methods of cultural analysis. In: International Encyclopedia of the Social & Behavioral Sciences. Elsevier, pp.357–367.

Mohr, Wagner-Pacifici and Breiger (2015) Toward a computational hermeneutics. Big Data & Society 2(2).

Munk (2019) Four styles of quali-quantitative analysis: Making sense of the new nordic food movement on the web. Nordicom Review 40(s1): 159–176.

Munk and Winthereik (2022) Computational ethnography: A case of COVID-19’s methodological consequences. In: The Palgrave Handbook of the Anthropology of Technology. New York: Springer, pp.201–214.

Nelson (2020) Computational grounded theory: A methodological framework. Sociological Methods & Research 49(1): 3–42.

Nelson, Burk, Knudsen, et al. (2021) The future of coding: A comparison of hand-coding and three types of computer-assisted text analysis methods. Sociological Methods & Research 50(1): 202–237.

Paff (2022) Anthropology by data science. Annals of Anthropological Practice 46(1): 7–18.

Paredes, Rufino Ferreira, Schillaci, et al. (2017) Proceedings of the 2017 ACM Conference on Computer Supported Cooperative Work and Social Computing, 2017: 1562–1575.

Pedersen (2019) Anthropology of/as repetition. History and Anthropology 30(2): 226–232.

Pedersen (2021) Going native in data science. In: The Moral Work of Anthropology: Ethnographic Studies of Anthropologists at Work. London: Berghahn Books, pp.133–168.

Pedersen and Nielsen (2018) Revisiting structuralism. Paper presented at the Annual Meeting of the American Anthropological Association, San Jose, 2018.

Pentland (2015) Social Physics: How Social Networks Can Make Us Smarter. London: Penguin.

Pink, Horst, Postill, et al. (2015) Digital Ethnography: Principles and Practice. London: SAGE.

Pretnar and Podjed (2019) Data mining workspace sensors. Contributions to Contemporary History 59(1): 179–197.

Radford and Joseph (2020) Theory in, theory out: The uses of social theory in machine learning for social science. Frontiers in Big Data 3.

Rogers (2019) Doing Digital Methods. North Tyneside: SAGE.

Ruckenstein (2019) Tracing medicinal agencies: Antidepressants and life-effects. Social Science & Medicine 235: 112368.

Ruppert, Law and Savage (2013) Reassembling social science methods: The challenge of digital devices. Theory, Culture & Society 30(4): 22–46.

Salganik (2019) Bit by Bit: Social Research in the Digital Age. Princeton: Princeton University Press.

Santucci J-F, Doja and Capocchi (2020) A discrete-event simulation of Claude Lévi-Strauss’ structural analysis of myths based on symmetry and double twist transformations. Symmetry 12(10): 1706.

Schaffer (1994) From Physics to Anthropology, and Back Again. Chicago: Prickly Pear Press.

Seaver (2014) Structuralism: Thinking with Computers. In: Savage Minds. Available at: https://savageminds.org/2014/05/21/structuralism-thinking-with-computers/.

Seaver (2017) Algorithms as culture: Some tactics for the ethnography of algorithmic systems. Big Data & Society 4(2): 205395171773810.

Sekara, Stopczynski and Lehmann (2016) Fundamental structures of dynamic social networks. Proceedings of the National Academy of Sciences 113(36): 9977–9982.

Shmueli (2010) To explain or to predict? Statistical Science 25(3): 289–310.

Sperber (1985) On Anthropological Knowledge: Three Essays. New York: Cambridge University Press.

Veltri (2019) Digital Social Research. John Wiley & Sons.

Venturini and Latour (2010) Proceedings of Future En Seine: 87–101.

Ziewitz (2017) A not quite random walk: Experimenting with the ethnomethods of the algorithm. Big Data & Society 4(2): 205395171773810.