Abstract
While “scaling up” is a lively topic in network science and Big Data analysis today, my purpose in this essay is to articulate an alternative problem, that of “scaling down,” which I believe will also require increased attention in coming years. “Scaling down” is the problem of how macro-level features of Big Data affect, shape, and evoke lower-level features and processes. I identify four aspects of this problem: the extent to which findings from studies of Facebook and other Big-Data platforms apply to human behavior at the scale of church suppers and department politics where we spend much of our lives; the extent to which the mathematics of scaling might be consistent with behavioral principles; moving beyond a “universal” theory of networks to the study of variation within and between networks; and how a large social field, including its history and culture, shapes the typical representations, interactions, and strategies at local levels in a text or social network.
Network science research in the computational, social, and biological sciences is increasingly focused on datasets of thousands and even millions of nodes and comparably massive sets of connections among them—for example, in gene interaction networks or social media datasets. Well over a decade ago my colleagues and I, falling in step with many other researchers, began asking, “How well do the different analytical techniques and algorithms ‘scale up’ to large networks … ?” (Breiger et al., 2003: 5). Traditional concepts of network centrality, for example, and attendant shortest-path and betweenness metrics, are often impractical to compute for large-scale networks, even on very fast computers. More fundamentally, the phenomenology of taking account of all possible links, which is what these metrics do, may well be appropriate for a small face-to-face group or for several dozen trading partners, but inappropriate for the structuring and operation of networks at very large scale. Much of the success of Big Data science has consisted of formulating for large datasets algorithms that are more efficient and appropriate, and that “scale up” only linearly with the number of nodes and edges in a graph (Palmer et al., 2003).
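The cost difference at stake can be made concrete with a minimal sketch. It is an illustration under stated assumptions, not the method of any study cited here: it uses mean shortest-path distance (the inverse of closeness centrality) as a stand-in for the family of shortest-path metrics, assumes a connected, undirected, unweighted graph stored as an adjacency dictionary, and all function names are hypothetical. The exact version runs one breadth-first search per node; the sampled version runs only `k` searches from randomly chosen pivot nodes and rescales the result.

```python
import random
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def exact_mean_distance(adj):
    """Mean distance from each node to all others.

    One BFS per node, so the cost grows as n * (n + m):
    fine for a small group, prohibitive for millions of nodes.
    """
    n = len(adj)
    return {u: sum(bfs_distances(adj, u).values()) / (n - 1) for u in adj}


def sampled_mean_distance(adj, k, seed=0):
    """Estimate the same quantity from k random pivots.

    Only k BFS runs are needed, so for fixed k the cost grows
    linearly with the number of nodes and edges. Assumes the
    graph is connected (every node is reached from every pivot).
    """
    n = len(adj)
    rng = random.Random(seed)
    pivots = rng.sample(sorted(adj), k)
    totals = {u: 0 for u in adj}
    for p in pivots:
        for u, d in bfs_distances(adj, p).items():
            totals[u] += d
    # Rescale the pivot average so that k == n reproduces the exact value.
    return {u: totals[u] * n / (k * (n - 1)) for u in adj}


# Hypothetical toy graph: a path 0 - 1 - 2 - 3 - 4.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(exact_mean_distance(path))
print(sampled_mean_distance(path, k=2))
```

With `k` equal to the number of nodes the estimate coincides with the exact value; with smaller `k` it trades accuracy for a running time that no longer multiplies by the number of nodes.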
While “scaling up” is a lively topic in network science and Big Data analysis today, my purpose in this essay is to articulate an alternative problem, that of “scaling down,” which I believe will also require increased attention in coming years. “Scaling down” is the problem of how macro-level features of Big Data affect, shape, and evoke lower-level features and processes.
A premise of a great deal of network science and Big-Data analysis of online behavior is that “the web sees everything and forgets nothing” (Golder and Macy, 2014: 132). Large-scale studies of Internet behavior often make use of what is in this sense thought to be an unmediated study of social interactions, and it is not at all rare for authors of such studies to claim, from the analysis of millions of Facebook posts, findings about human behavior that are said to be “in contrast to prevailing assumptions” in social science such as Festinger’s (1954) social comparison theory formulated from research on small human groups (Kramer et al., 2014: 8788, 8790).
As I envision it, the alternative problem of “scaling down” addresses four often-interrelated features of Big Data and network science research that are routinely ignored or accorded insufficient attention, to the detriment of progress in research. First, whereas many studies have been undertaken of massively large systems such as social networking sites, an under-researched question is the extent to which the behavioral findings of these studies “scale down,” i.e. apply to human groups and organizations of moderate size (dozens or hundreds), where most human social life takes place and is likely to continue to do so. This is the question of the extent to which Big-Data research applies to human behavior at the human scale of church suppers and department politics in which we spend much of our lives. Second, what are the behavioral processes that lead to the macro-level outcomes? The research community has produced stunningly impressive and workable mathematical models of how processes at lower levels (among individuals, say) might cumulate to high-level complexity (e.g. Lusher et al., 2013; Morris, 2003), or how bags of words from multiple topics might spill together to form texts (Blei, 2012). However, there has been precious little attention paid to formulating micro-processes that reflect actual behavior. Big Data has no analog to behavioral economics, the study of when and why actors follow or depart from the postulated model (Thaler, 1994). Third, network science and Big Data often see themselves as scaling “up” to generalizations that are freed from the shackles of particular texts, and to findings that apply universally to “all” networks whether power grids, gene interactions, or Facebook friending. Scaling “down” would recognize the possibility that, the bigger the dataset in the case of a particular research question, the greater are the opportunities to search for variation within the case, to contextualize its features in such a way as to lead to a distinctive form of case-based generalization (George and Bennett, 2005). Fourth, “scaling down” refers to the problem of how a large social field, including its history and culture, shapes the typical representations, interactions, and strategies at local levels in a text or social network.
In brief: (a) the degree of applicability of Big Data research to small- and moderate-sized social groups, (b) the study of when actors behave as if the mathematical mechanisms postulated to generate Big Data were true, (c) the relative utility of binning Big Data into local contexts, and (d) the production of local action from macro-level processes are all problems in “scaling down.” I will say a bit more in turn about each of the four aspects of “scaling down” that I have identified.
In research that I see as related to the studies mentioned above, Lazega et al. (2008) formulate a multi-level social network analysis via linked design: French cancer labs have ties (such as mobility of researchers among them), scientists have network ties (such as working together), and scientists are affiliated with labs. This formulation presents what I would like to identify as “a duality of scaling up and down” with an emphasis on actors’ strategies, inter-organizational control mechanisms, and a distinctively institutional theory of their coevolution that is being developed brilliantly by Lazega and colleagues (especially, Lazega, 2015; Lazega and Prieur, 2014).
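The linked design described above can be sketched as a small data structure. This is only an illustrative reading of the three kinds of ties named in the text, not Lazega and colleagues' actual data or analysis; the lab and scientist labels, the toy ties, and the two helper functions are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy data for the three linked relations.
lab_ties = {("LabA", "LabB")}                               # lab-to-lab ties
scientist_ties = {("s1", "s2"), ("s2", "s3"), ("s1", "s3")}  # scientist-to-scientist ties
affiliation = {"s1": "LabA", "s2": "LabA", "s3": "LabB"}     # scientist -> lab


def scale_up(scientist_ties, affiliation):
    """Aggregate individual ties to the lab level.

    Counts scientist-level ties that cross each pair of distinct labs.
    """
    counts = defaultdict(int)
    for a, b in scientist_ties:
        lab_a, lab_b = affiliation[a], affiliation[b]
        if lab_a != lab_b:
            counts[tuple(sorted((lab_a, lab_b)))] += 1
    return dict(counts)


def scale_down(lab_ties, affiliation):
    """Project lab-level ties onto individuals.

    For each scientist, returns the labs to which their own lab is tied,
    i.e. the organizational context the scientist acts within.
    """
    neighbors = defaultdict(set)
    for lab_a, lab_b in lab_ties:
        neighbors[lab_a].add(lab_b)
        neighbors[lab_b].add(lab_a)
    return {s: set(neighbors[lab]) for s, lab in affiliation.items()}


print(scale_up(scientist_ties, affiliation))
print(scale_down(lab_ties, affiliation))
```

The affiliation mapping is what lets the same data be read in both directions: upward, as individual collaborations cumulating into inter-lab structure, and downward, as inter-lab structure framing each scientist's local options.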
In conclusion, I have identified four interrelated features of “scaling down,” the problem of how macro-level features of Big Data affect, shape, and evoke lower-level features and processes. Too often, problems of scaling down remain merely in the background of Big Data and network science studies. Recognizing and addressing them should lead to additional progress in advancing the study of what Lazer et al. (2014) term an “all data revolution,” wherein innovative analytics using data from all traditional and new sources are developed and used to further our understanding of our world.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
