Abstract
Communication of scientific findings is fundamental to scholarly discourse. In this article, we show that academic review articles, a quintessential form of interpretive scholarly output, perform curatorial work that substantially transforms the research communities they aim to summarize. Using a corpus of millions of journal articles, we analyze the consequences of review articles for the publications they cite, focusing on citation and co-citation as indicators of scholarly attention. Our analysis shows that, on the one hand, papers cited by formal review articles generally experience a dramatic loss in future citations. Typically, the review gets cited instead of the specific articles mentioned in the review. On the other hand, reviews curate, synthesize, and simplify the literature concerning a research topic. Most reviews identify distinct clusters of work and highlight exemplary bridges that integrate the topic as a whole. These bridging works, in addition to the review, become a shorthand characterization of the topic going forward and receive disproportionate attention. In this manner, formal reviews perform creative destruction so as to render increasingly expansive and redundant bodies of knowledge distinct and comprehensible.
The communication of scientific findings is fundamental to scholarly discourse. Isolated findings are understood only when they are viewed in relation to other scholarly output. Any particular claim is without substance unless it is interpreted contextually. What theories does it support? Which research agendas does it contradict? How does it fit into the overarching structures of scientific knowledge? Scholarly discourse relies on the relational interpretation of research findings to codify claims. Such interpretive work is present in virtually every piece of scientific writing, but formal academic review articles represent a quintessential means through which scientific findings are brought into context with one another and sense is made of collected research. Reviews aim to gather the relevant published findings in a domain of inquiry and to synthesize those findings into a coherent body. If we want to know how scientific discourse progresses, how ideas move from tentative propositions to accepted knowledge, formal reviews offer us a window into the mechanisms involved.
Much of the work in social studies of science that investigates relational synthesis focuses on novel juxtapositions of scientific ideas (Fleming 2001; Foster, Rzhetsky, and Evans 2015; Latour 1987; Leahey and Moody 2014; Uzzi et al. 2013). This research consistently finds that scientific projects that posit unexpected relationships between domains—as long as they have positive findings—receive greater attention and are more richly rewarded than projects that explore more commonplace connections. The implication is that scientists who are able to bring disparate domains into conversation are more likely to generate significant innovation by defining a newly relevant area of research. Scientific research that synthesizes divergent knowledge moves away from Kuhn’s (1970) normal science, introducing unconventional ideas and potentially defining new domains of research.
But most of the curatorial work that takes place in scientific discourse is not oriented toward finding novel or surprising combinations. The bulk of synthetic work takes the opposite approach: it is concerned with assessing and interrogating the immediate relationships among bodies of research. Such acts of curation have long been recognized as playing an important role in scientific discourse, addressing, among other things, the need for scientists to handle an ever-increasing deluge of new research findings. Scholars faced with the reality that they will be unable to consume everything published in their own specialization, much less in subfields outside of their expertise, will turn to curated expositions of relevant research communities to help contextualize and expand their own work.
This phenomenon is not new. Price (1986) describes the emergence of the scientific paper (as opposed to published book) in the seventeenth century as a remedy for the perceived overabundance of research available for scientists to absorb. Such papers, published in periodical format that today is the norm for scientific journals, “had the stated function of digesting the books and doings of the learned all over Europe. Through them the casual reader might inform himself without the network of personal correspondence, private rumor, and browsing in Europe’s bookstores, formerly essential” (Price 1986:57). Scientific output outpaces researchers’ ability to consume and make sense of it. Synthetic work addresses this problem not only by sorting through a relevant literature and summarizing its findings, but, as we will argue, by contextualizing those findings among themselves and within a broader scholarly field.
Scholarly review articles are perhaps the most explicit form of synthetic production in contemporary scientific publication. 1 The curatorial work they do is not incidental: it is the expressed purpose of their existence. Review articles target a range of audiences, attempting to be valuable not only to communities actively engaged with their topic but also those that may be wholly unfamiliar with the relevant scientific domain. They aim to “inform interested readers who have limited knowledge of [a] topic, whether students new to the field or seasoned researchers from other domains” (Freeman and Jeanloz 2015). Beyond simply identifying a relevant body of literature, authors of review articles are expected to tell their readers how, exactly, the works they cite relate to one another. The authoritative account of research findings provided by formal reviews is intended to help insiders and outsiders alike make sense of those findings.
A good review article tells a story about what ideas are important and why, and synthesis of this type, although necessary, need not be neutral. In presenting an “official” account of a complex set of research to readers from diverse disciplinary backgrounds, 2 reviews must translate a specialized discourse into a more accessible description, and that translation, like any, is apt to add an interpretive aspect to the exposition. A review “selects from [the relevant] papers, juxtaposes them, and puts them in a narrative that holds them together, a narrative with actors and events but still without an ending. It draws the reader into the writer’s view of what has happened, and by ordering the recent past, suggests what can be done next” (Myers 1991:46). By synthesizing the findings of an emerging research area into a coherent narrative, reviews can affect the future direction of research in that area. The unique discursive orientation of reviews, focused on clarity of synthesis for newcomers from a position deeply embedded within existing research, suggests a generative role in the production of scientific output.
The question of what review articles “do” thus becomes: how do review articles affect the knowledge they digest and the research areas they interrogate? In this article, we address this question by studying, at large scale, the histories of publications cited by review articles, examining the attention papers receive both before and after they are included in a review’s synthesis. We argue that reviews are much more than mere summaries of relevant findings: they induce novel structure on a research domain and define relevant actors, alliances, divisions, and omissions within the literature. The structure they impose comes to represent the domain and becomes a scaffolding around which future discourse is formed.
We will show that academic review articles perform curatorial work that substantially transforms the emerging research areas they aim to summarize. Using a corpus of millions of journal articles, we analyze the consequences of review articles for the publications they cite, focusing on citation and co-citation as indicators of scholarly attention. Our analysis shows that being included in a review article is, on average, detrimental for scientific publications and leads to a dramatic overall decrease in future citations. However, reviews bring increased attention to a research area as a coherent whole, and they impose a novel structure on that area’s discourse moving forward.
By analyzing the co-citation networks of both reviewed and unreviewed areas of knowledge, we show that reviews dramatically simplify a specialized domain of knowledge, focusing future scholarly attention onto a few key publications and the relations between them at the expense of the broad majority of the research in a domain. Reviews help establish and relate a set of exemplars in a specialized domain, structuring subsequent conversations within and outside of the existing research area. In short, reviews perform a type of creative destruction: 3 in identifying a coherent subdomain centered on a set of exemplars, they diminish the effect of the non-exemplars going forward. The articulation of a topic as a legitimate scientific research area comes at the expense of a pared-down conception of what that research area entails. We discuss the implications of these findings, arguing that review articles’ work of synthesis suppresses many of the core discussions and conflicts in a specialized subdomain of knowledge, and that in doing so they help to constitute that subdomain as a coherent scholarly field.
Theoretical Frame
The assumed goal of a review article is to create a bite-sized synthesis of work from a distinct research area. A glance at the table of contents for a recent volume of the Annual Review of Sociology is illustrative: “Wealth Inequality and Accumulation” (Killewald, Pfeffer, and Schachner 2017); “The Development of Transgender Studies in Sociology” (Schilt and Lagos 2017); “The Second Demographic Transition Theory: A Review and Appraisal” (Zaidi and Morgan 2017). Each of these review articles is a window into a specific, relatively well-defined subdomain of research.
The separation of scientific research into diverse research areas is well established in the sociology of science and knowledge. These implicit areas, which researchers have variously termed invisible colleges (Price 1986), epistemic communities (Holzner 1972), and scientific collectivities (Woolgar 1976), among many other terms, arise from the need for some level of insulation from the torrent of scientific output. Specialized domains emerge as scholars form relationships, reputations, and regularized modes of communication among their peers (Chubin 1976; Collins 2009; Kuhn 1970; Lievrouw 1992). Tight-knit research areas can act as a hotbed for knowledge production and scientific development, but they can also promote research segregation, becoming inaccessible to the uninitiated (Collins 1998; cf. Callon et al. 1983). Many areas of research would remain invisible to the broader community without the aid of expert summary.
Consequently, the common (and often implicit) theoretical view of review papers sees them simply as tools to identify the most important results emerging from a specialized domain. Indeed, in the rare cases where review articles are discussed in scholarly literature, they are often treated as mere synopses: “Review papers . . . can be considered simply as summary reports of research results in the specialty” (Morris and Van der Veer Martens 2008:260). In this view, authors of review papers sort through the scientific output in an area, discerning the significant findings and omitting the false starts and intermediary research that would be of less use to an outsider. Review articles, the argument goes, are little more than a “packing down” (Price 1986:178) of the contributions of a research area: they act as an undiscriminating spotlight, illuminating relevant work without interpretation.
But scholars concerned with the dissemination and use of scientific knowledge argue that the act of transporting ideas from one specialized domain to another, as review articles do, is not neutral. Synthesizing knowledge makes a body of work accessible to those not familiar with it and entails active translation. Translators of specialized knowledge do not merely expose that knowledge, they respond to their audiences’ perceived expectations, transforming and presenting a body of work in a way they hope will be palatable (Latour 1987; Star and Griesemer 1989).
A considerable literature examines the types and prevalence of scientific translation, focusing in part on the phenomenon of knowledge encapsulation often referred to as black-boxing (Whitley 1970). For a collection of research to be useful outside of its original, narrow domain—the argument goes—it must be perceived as an unproblematic “black box” so that outside researchers can use the specialized knowledge without concern about the details of the provenance of that knowledge. For science to be truly cumulative, its outcomes need to be encapsulated as uncontentious knowledge (Fuchs and Spear 1999), building an increasing corpus of consensus-based facts (Shwed and Bearman 2010). Black-boxed knowledge may be utilized by scientists without consideration of the roadblocks, conflicts, or methodological innovations (the inner workings of the box) that went into its creation (Latour 1987).
Black-boxing presents a plausible hypothesis for the impact of review articles on knowledge translation and consumption. As a domain of specialized research resolves internal disputes and reaches consensus on its core questions, a review article may distill the findings, crafting portable language for those findings to be utilized by a wider community (see, e.g., Oreskes’s [2004] review of anthropogenic climate change). In terms of the attention that research in the specialized domain receives, a review article may “poach” citations from the work it references. Faced with a limited budget for citations, scholars interested in utilizing the findings from one corner of a research area could instead cite the review itself as a black-boxed representation of the area as a whole.
This idea of black-boxing, however, may oversimplify the process of scientific discovery and dissemination. Although there are clear instances of scientific consensus that have reached the point of undissected fact (see Latour 1987; Shwed and Bearman 2010), a purely cumulative vision of science in which questions are posed, hypotheses tested, and answers eventually added to the body of scientific fact is incomplete. As many scholars argue, scientific research subsists as much on conflict and dialogical contention as it does on agreement and consensus-seeking (Abbott 2001; Bourdieu 1975). A review article that treats a research specialty as a unified whole—as a black box—may be less useful to scholars trying to engage with and understand a topic area than one that packages that specialty as a conversation with diverse viewpoints and internal divisions. Black-box representations are appropriate for translation of science for lawmakers and the public (Callon et al. 1983; Star and Griesemer 1989), but research scientists, who constitute the primary audience for the academic review article, may seek the discursive structure of a field as much as they do its established findings.
For scholars and students trying to understand a new research domain, there is value in knowing something of the internal workings, at least in schematic form, of a black-box machine. The relevance of a specialized domain’s work can be understood through the processes by which particular researchers and their findings came to dominate the discourse within that domain (Bourdieu 1990; Kim 2009). Kuhn’s (1970) conception of a scientific paradigm emphasizes the role of exemplars for structuring disciplines, but many scholars argue that the same structuring occurs at a smaller scale, constituting the less extreme evolution of scholarly knowledge that builds on previous work without subverting its paradigmatic core (Frickel and Gross 2005; Hedgecoe 2002; Price 1986). In this view, reviews should not be understood as external or peripheral to a field of study, but as active participants in that field. Scientific discourse is distinctly reflexive: exemplary publications can act as signposts for a specialized body of knowledge, both describing and defining the connections among a set of researchers. As scholarly subdomains develop, the shape of their discourse is based on a shared understanding of this relational structure and its history—the canonical citations, theoretical divisions, agreed-upon terminology, and myriad other features that allow researchers to feel they are on the same page (Myers 1985). In making a particular claim about the relations that define a research area, review articles are engaging in a specific intervention in the evolution of that area (Ketcham and Crawford 2007; Myers 1991; Sinding 1996).
This suggests a third possibility for the structuring role of review articles in the digestion of an emerging research area. Reviews may define an opinionated representation of a subdomain, singling out a set of key exemplars and the relationships between them to tell a specific story of the past, present, and future of their area. Like a black box, this type of review translates a research specialty by omitting details of its history, but by focusing on a representative set of exemplary objects, such a review would be selective in its omissions. Like a subway map, it would present its audience with a caricatured version of its subject, highlighting important landmarks and, by tracing the relationships between them, describing in broad strokes the shape of the area they inhabit.
Analytic Strategy
Through rigorous empirical analysis of a large corpus of scholarly output, we will show that formal review articles have significant and consequential influence on scientific discourse at multiple levels. The transformations we describe occur at several distinct analytic scales, which we address with a series of statistical models. As a first description, we will examine how reviews affect the trajectories of the individual articles they cite. A naive perspective would regard reviews as little more than digests of a subdomain, simply bringing the most relevant research to the attention of a larger audience. In contrast, we will show that reviews heavily curate a subdomain, drawing attention away from the majority of the articles they cite. The bulk of the existing work in a research area experiences loss of attention as a result of its inclusion in a formal review.
Having established the strong curatorial influence of reviews, the subsequent analyses focus on their structural consequences for the specialized domains they address. Do reviews universally suppress attention for the publications they cover, or do they selectively promote a subset of that work? To address these structural questions, we examine the relational networks of individual research areas before and after they become the subject of review. We show that reviews do draw an area into tighter relation with itself, but they also erase much of the structure that defined it. Reviews shift the focus of a research area from a collection of small specialized clusters onto a relative few research exemplars. After review, a handful of previously noncentral articles become the hubs around which a subdomain revolves, and the subdomain’s focus shifts to relations with these hubs at the expense of relations between nonhubs. We further demonstrate that such hubs are unlikely to be central to any specific specialized cluster in a research area but tend to hold bridging positions between those clusters. These analyses show that review articles dramatically restructure the patterns of attention that specialized domains receive by constructing a simplified narrative of the major discourses in those domains.
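The relational networks referred to above rest on co-citation: two publications are co-cited whenever a later paper cites both of them. A minimal sketch of how such edge weights can be tallied follows. The function name `cocitation_counts`, the input format (one reference list per citing paper), and the toy identifiers are hypothetical and are not drawn from the analysis itself.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(reference_lists):
    """Count how often each pair of cited works appears in the same reference list.

    reference_lists: iterable of lists of cited-work identifiers,
    one list per citing paper (hypothetical input format).
    """
    counts = Counter()
    for refs in reference_lists:
        # Deduplicate and sort so each unordered pair has one canonical key.
        for a, b in combinations(sorted(set(refs)), 2):
            counts[(a, b)] += 1
    return counts


# Toy data: three citing papers and the works each one cites.
citing_papers = [
    ["p1", "p2", "p3"],
    ["p1", "p2"],
    ["p2", "p3"],
]
edges = cocitation_counts(citing_papers)
for pair, weight in sorted(edges.items()):
    print(pair, weight)
```

Comparing such weighted edge lists for the periods before and after a review appears is one way to see whether attention concentrates on a few hubs.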
A key step in studying the effects of review articles is their identification among a sea of published works. The category of “review article” is far from clear-cut. 4 It is often not obvious whether a particular publication is intended as a review, and a great many articles do at least some review work in citing previous literature. However, the analysis presented here is not concerned with literature review as a characteristic of research publications in general, but with formal review as a specific form of academic publication. Articles that perform the “summing-up” discussed in the previous section have a particular authoritativeness among published articles. Approaches using citation patterns or keyword matching can identify reviews as a style of publication (see, e.g., Ketcham and Crawford 2007), but it is the role of formal reviews in the ecosystem of journals that is most relevant to our analysis.
In light of this, we consider review articles to be anything published in an Annual Reviews (AR) journal. 5 Annual Review articles do not follow the standard peer-review process of most academic journal publications; they are written by authors considered experts in a field who are invited by the journal’s editorial board to submit a review. These publications are overtly situated as authoritative sources of the state of the fields they review. Although there are many overt and authoritative review articles that do not appear in AR journals (including discipline-specific sources), the deliberately conservative approach we take minimizes the number of falsely identified review articles at the expense of an increased likelihood that a “genuine” review will not be categorized as such. Erring on the side of “false negatives” implies that the effects shown in our analysis will be biased toward the null; the chance that we conclude reviews have no effect on knowledge production when in fact they do is significantly higher than the converse.
In line with previous research (Price 1986; Small 1973; Small and Sweeney 1985), we focus on scholarly attention as measured by citations. Our analyses are based on data from Clarivate Analytics Web of Science (WoS). 6 The WoS has extensive coverage of journal publications across a wide array of scientific disciplines. Although it includes information on over 100 years of publications, the coverage is far from complete for most of that time. To ensure the data we use are representative, we limit our analysis to work published from 1990 through 2016. Relatively few scientific publications are ever included in a review article, especially articles published in less widely read journals. We therefore restrict our analyses to a subset, albeit a large subset, of the full WoS corpus. We generated our sample by calculating the 50 academic journals most-cited by each of the 52 Annual Reviews we consider, and collecting every article published in those journals. Review citations are heavily skewed toward top journals, so this sampling covers the large majority of articles likely to be reviewed—retaining 80 percent of all cited articles in reviews. 7 The final sample yields approximately 5.9 million articles published in 1,155 journals across the 27 years of the sample. Table 1 displays summary statistics for the sample, broken down by subject area.
Table 1. Summary Statistics for the Full Sample, by Journal Subject Area and Total
Note: Many journals are listed in multiple subjects, so total counts will be less than the sum of the subject counts. In addition to the number of journals and number of articles, the number of articles cited at least once and at least twice by a review article are shown. Median and inter-quartile range across articles are shown for the mean annual number of citations received, the number of works cited, and the proportion of those citations that reference an article in the WoS database.
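The journal-sampling step described above (take the journals most cited by each review outlet, then pool them) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the pipeline used for the analysis: the function name `top_cited_journals`, the input format, and the toy data are hypothetical.

```python
from collections import Counter


def top_cited_journals(review_citations, k=50):
    """Return the union of the k journals most cited by each review outlet.

    review_citations: dict mapping a review outlet's name to a list of
    journal names, one entry per citation it makes (hypothetical format).
    """
    sample = set()
    for cited_journals in review_citations.values():
        counts = Counter(cited_journals)
        # most_common(k) yields (journal, count) pairs, highest count first.
        sample.update(journal for journal, _ in counts.most_common(k))
    return sample


# Toy data: two review outlets, with k=2 to keep the example readable.
toy = {
    "Annual Review A": ["J1", "J1", "J2", "J2", "J2", "J3"],
    "Annual Review B": ["J2", "J4", "J4", "J5", "J5", "J5"],
}
journals = top_cited_journals(toy, k=2)
print(sorted(journals))
```

Because the per-outlet lists overlap, the pooled set is smaller than the 52 × 50 = 2,600 upper bound, which is consistent with the 1,155 journals reported for the final sample.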
Promote or Poach?
The discussion of theoretical perspectives on the translation of scholarly knowledge presented above suggests several distinct ways reviews might affect the research specialties they cover. As an important first pass at disentangling these ideas, we ask a relatively simple question: what are the implications of inclusion in a review article for future citations? Can an article that is discussed in a review expect to enjoy newfound awareness in a wider scholarly circle, or will it fall into relative obscurity as attention is diverted toward a smaller set of literature? More succinctly, do review articles promote or poach citations?
The question is simple, but it is vital to untangling the different theoretical processes discussed above (and the mechanisms they imply). If reviews simply shine uncritical spotlights on a literature, bringing wider visibility to research that was previously obscure, then we should expect the publications highlighted to enjoy increased attention in the form of citations. In contrast, if reviews are taking a more active role in the interpretation or creation of scientific knowledge, as suggested by the theories of black-boxing and exemplars, then their effect on the articles they cite will be more complex. If reviews indeed construct unproblematic black boxes for scientific topics, then future work will be more likely to ignore the individual publications cited, either citing the review as a placeholder for the entire area or not bothering to cite the ideas at all (Garfield 1977). Similarly, if reviews identify a small subset of exemplars as the core of a field, successfully reconfiguring discourse on the subject along specific, simplified lines, the average effect on future citations received by the subjects of review should also be negative or ambiguous.
Answering this question is not as straightforward as it might appear. Myriad factors confound and complicate the effect under consideration. The number of citations an article receives is highly time sensitive, and much of its year-to-year variation is linked to waxing and waning attention (Redner 2004). This suggests the identification of a “review effect” on future citations needs to take into account the natural cycle of attention for any given publication, seeking out discontinuities in the pattern of citations. Furthermore, being cited by a review article can hardly be considered an independent event in the lifecycle of a publication, and care must be taken to account for confounding factors that may be related to an article’s citation by reviews and non-reviews alike.
We account for these complications using a hierarchical negative-binomial model of citations over time for a large set of articles in the dataset. The model predicts the number of citations an article will receive in a given year, based on covariates specific to the year, the article, and the journal of publication. The model, specified in Equation 1, has three nested levels: years, articles, and journals. Each journal contains many articles, and each article exists for multiple years. The dependent variable, denoted
The remaining covariates in the model are included primarily as controls for the main effect of interest, although we will show they indicate some interesting patterns on their own. At the coarsest level, the model has variables associated with each article’s publication journal, indexed with subscript j. We model each journal’s status using its impact factor, 10 represented in Equation 1 with
The next level of the model, indexed with subscript i, includes variables specific to each article that help explain the number of citations it receives. Because citation practices vary considerably even within subject areas, these covariates capture details of the publication style of each article.
The model includes several covariates at the lowest level, representing relevant features of an article that vary over time. In addition to the outcome variable and the number of reviews (described above), temporal patterns are accounted for by a quadratic function on the number of years since publication (
Finally, to be sure we are capturing the effect of being included in a published review rather than some feature of the types of articles that are likely to be reviewed, we include a covariate intended to proxy the cumulative reviewability (
The model described in Equation 1 allows considerable structured flexibility through the random-effects vectors ηi and vj. β0ij, β1i, and β2i describe the average evolution of citations over time, but each article’s underlying citation curve is accounted for through random variations in the values of η0i, η1i, and η2i. The main explanatory coefficient of interest is β3ij, which measures the impact of citation by a formal review article on future citations. We are especially concerned with the ways this effect may be heterogeneous, so we take particular care to account for variation in this parameter between articles. In addition to allowing article- and journal-level characteristics to moderate this effect, the random-effect term η3i models variation in β3ij that is unexplained by covariates. 13
Finally, recognizing that the number of citations an article receives may depend strongly on the venue of publication for a given article, each journal similarly has a random vector vj to allow otherwise unmodeled variation. Estimation of such a model poses significant computational difficulties, due both to the overall size of the sample (88,590,720 observations across 5,901,566 articles), and the need to estimate random effects for each article in the sample. 14 We therefore restrict the estimation to five-percent subsamples of the articles, weighted to over-sample less prolific subjects and years. To avoid bias related to the heavy right skew of the time variable, we consider only articles with at least 10 years of citation information. The resulting sample has 2,931,604 year-level observations nested within 144,097 articles in 1,069 journals. Repeated estimation on independent subsamples yields virtually identical results, indicating the analysis is robust to the choice of subsample.
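As a reading aid, the three-level structure described in this section can be sketched as below. This is an assumed, simplified form consistent with the verbal description, not a reproduction of Equation 1. The symbols $y_{tij}$ (citations received by article $i$ in journal $j$ in year $t$), $\mu_{tij}$, $\theta$, $t$ (years since publication), $R_{tij}$ (cumulative review citations), $\alpha_{00}$, $\alpha_{10}$, $\alpha_{20}$, and $v_{0j}$ are placeholders introduced here for illustration. Only $\beta_{0ij}$, $\beta_{1i}$, $\beta_{2i}$, $\beta_{3ij}$, $\alpha_{30}$, $\eta_{0i}$ through $\eta_{3i}$, and the journal vector $v_j$ are named in the text.

```latex
\begin{aligned}
y_{tij} &\sim \mathrm{NegBin}(\mu_{tij}, \theta) \\
\log \mu_{tij} &= \beta_{0ij} + \beta_{1i}\, t + \beta_{2i}\, t^{2}
                 + \beta_{3ij}\, R_{tij} + (\text{other year-level covariates}) \\
\beta_{0ij} &= \alpha_{00} + (\text{article- and journal-level covariates})
              + \eta_{0i} + v_{0j} \\
\beta_{1i} &= \alpha_{10} + \eta_{1i} \\
\beta_{2i} &= \alpha_{20} + \eta_{2i} \\
\beta_{3ij} &= \alpha_{30} + (\text{article- and journal-level moderators}) + \eta_{3i}
\end{aligned}
```

Under a log link of this kind, each coefficient acts multiplicatively on expected yearly citations, which is why the review effect is reported as a percentage change.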
Table 2 presents the model estimates. The large negative estimates for the intercept and time-related coefficients tell a dismal but predictable story about the evolution of citations over time (see Figure 1): most publications are rarely cited at all, and their chances of receiving citations tend to peak at about the fifth year after publication. The quite large standard deviations (10.03, 22.70, 5.34) for these article-level random effects, however, indicate there is considerable variability among articles’ citation curves, as indicated by the thin lines in Figure 1. More important for the current analysis are the estimates affecting β3ij, which measures the effect of inclusion in review articles on future citations for each article. The strong negative estimate for α30 (labeled “Reviews” in Table 2) suggests the first time a given paper is cited by a review, that paper can expect approximately 11 percent fewer citations in every subsequent year. 15 Additional citations from other Annual Review articles further reduce expected future citations—an article reviewed for the third time will receive about 30 percent fewer citations over the remainder of its lifetime.
Table 2. Multilevel Negative-Binomial Model Estimates of Model 1
Note: Coefficients reflect expected effects on total yearly citations received. All covariates except reviews standardized to have zero mean and unit standard deviation.
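The percentages quoted above follow from the multiplicative form of a log-link count model, and a short worked example makes the arithmetic explicit. The coefficients `b0`, `b1`, and `b2` below are hypothetical values chosen so that the curve peaks in year five; they are not the estimates in Table 2. The per-review coefficient `b3` is back-computed from the reported 11 percent reduction rather than taken from the table.

```python
import math


def expected_citations(years, n_reviews, b0, b1, b2, b3):
    """Mean yearly citations under a log link with a quadratic time trend."""
    return math.exp(b0 + b1 * years + b2 * years ** 2 + b3 * n_reviews)


# Hypothetical time-trend coefficients (illustration only).
b0, b1, b2 = -1.0, 0.5, -0.05
# Per-review coefficient implied by an 11 percent drop per review citation.
b3 = math.log(1 - 0.11)

# The quadratic in the exponent peaks at its vertex.
peak_year = -b1 / (2 * b2)

# Multiplier on expected citations after 1, 2, and 3 review citations.
multipliers = {k: math.exp(b3 * k) for k in (1, 2, 3)}

print(f"peak year: {peak_year:.1f}")
for k, m in multipliers.items():
    print(f"{k} review(s): x{m:.3f} ({100 * (m - 1):+.1f}%)")
```

Three review citations compound to a multiplier of 0.89³ ≈ 0.705, which matches the statement that an article reviewed for the third time receives about 30 percent fewer citations.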

Figure 1. Predicted Citations per Year (setting all other covariates to zero)
This effect is highly heterogeneous across articles, due to predicted differences in article characteristics (e.g., article length, journal of publication) and the large standard deviation in the random-effect term η3i (2.20). Figure 2 summarizes this variation, showing the distribution of the net multiplicative “review effect” across all articles that received at least one review citation. The large majority of publications that are included in Annual Review articles can expect a substantial dampening effect on their future citations, with a median value of exp(β3ij) = .617—an expected decrease of nearly 40 percent for each review citation received. Figure 2 also shows that the distribution has a long right tail. This indicates that a substantial minority of articles (around 30 percent) are predicted to have at least a small increase in their future citations; of these, only around 12 percent are predicted to have their future citations increase more than twofold.

Net Multiplicative Effect of Annual Review Citation on Subsequent Citations Received, Corresponding to the Predicted Value of exp(β3ij) for All Articles Receiving at Least One Annual Review Citation
Figure 3 illustrates this pattern with six sample articles. In most cases, reviews lead to significantly diminished predicted citations (panels a through e). But for some articles, like the one depicted in panel f, citation by review predicts a small increase in future citation. The variability in the main effect of interest is an important outcome of the analysis so far: although the effect of inclusion in a review article is detrimental for the large bulk of scholarly publications, a small minority “rise to the top” and enjoy considerably expanded attention. Indeed, the remainder of our quantitative analysis will focus on uncovering the structural characteristics that differentiate these fortunate few articles from the rest of the published literature.

Actual versus Predicted Number of Citations Received by Six Sample Articles
In relation to the theoretical perspectives outlined above, the implications of these results are significant. In contrast with the “spotlight” theory, review articles tend to diminish the attention their citations can expect to receive. As these results suggest, and as we will investigate, reviews have this diminishing effect on the majority of articles they cite, with only a handful of reviewed publications gaining increased attention in the future. The results of the regression make it clear that reviews do not act as neutral observers, merely raising the awareness of a body of literature. They are not simple spotlights that bring attention to a scholarly domain. In fact, reviews exercise curatorial power. By representing a research specialty for general scrutiny, reviews perform a selection on existing research. Through their curation, reviews turn eyes away from (most of) what they highlight as important in a research area.
Restructuring Discourse
Having shown that review articles have a heterogeneous and largely stifling effect on the individual articles they cite, we now turn to an analysis of their effects on the research area they aim to summarize. Formal review undercuts the attention given to published work, on average, but considerable variation exists in the magnitude of this effect. Indeed, some of the articles cited by a particular review may bear the brunt of the obscuring effect, while others could experience minimal impact or even a boost to their future citations.
The remainder of this article will focus on the relationship between the structure of scholarly subdomains and the positions of individual works within those structures. Dissecting such structural effects is crucial to distinguishing among the theoretical mechanisms summarized above. If reviews package knowledge into concise, unproblematic units (as the black-box theory suggests), then scholarly attention should shift away from the individual publications in a field more or less uniformly, focusing instead on the apparently settled concepts the field has produced. If instead, as we argue, formal reviews actively remake domains of knowledge, constructing new perspectives and new understandings of that knowledge, then the structure of scholarly attention in a domain should change in form and not just magnitude. As we will show, areas of knowledge that are recast in novel ways by review articles undergo a transformation in their topology: which findings are central, which are peripheral, and how they relate to one another all shift in observable ways. We begin to uncover these differential effects by examining the structural changes that occur within scholarly discourse when a research domain finds itself the subject of a formal academic review.
The Structure of Reviewed Work
To interrogate the structural features of reviewed research domains, it is necessary to identify a representative collection of publications within the domain. Reviews generally concern emerging domains of scientific activity that are considered to speak to one another. The emerging domain can represent different scholarly moments: an emerging topic that is not yet widely recognized, the recasting of a developed subfield, or even articulation of an existing school of thought. All reflect areas of intellectual activity that the reviewer posits as related in some way. As such, a review defines a specialized domain and its subdomains through a body of related published work.
The reliable identification of specialized scholarly domains of this sort is a long-standing problem in research on scientific processes (Morris and Van der Veer Martens 2008). Although the idea of a scientific research specialty can seem straightforward from the perspective of an individual researcher, the appropriate definition of a specialized domain starkly differs depending on the research question being asked. A dominant thread in the identification of specialized domains focuses on the text of scientific output, using either domain-specific terminology (Foster et al. 2015; Rzhetsky et al. 2015) or statistically determined lexicons (Anderson, McFarland, and Jurafsky 2012; Munoz-Najar Galvez, Heiberger, and McFarland 2020) to locate researchers and publications that use similar language in similar ways.
Such methods have been put to productive use, but their emphasis on lexical similarity has certain shortcomings in relation to the questions we pose here. Reviews are often concerned with emerging research areas that may not yet have an established lexicon. The research they describe may be the work of a small subcommunity in an established field that has not differentiated its terminology from its parent domain. Moreover (as we will illustrate in our analyses), the establishment of specialized research areas is often the result of a conceptual merging of existing groups of scholars who might use different language to talk about the central ideas they have in common. 16
In light of this, we use a comparatively simple approach to identifying sets of related articles by utilizing the reference lists from published articles in the corpus. For each publication in the data (review or non-review), we define its reference set as the subset of its cited works that are contained within the WoS corpus. These reference sets exploit the expert knowledge within scholarly domains to find groups of related publications. The reference sets we define are necessarily incomplete collections of work in a field, but they are likely to capture the most important work being done in a particular domain. References have long been utilized as an effective way to identify interrelated research (Chen, Ibekwe-SanJuan, and Hou 2010; Gmür 2003; Mullins et al. 1977), and these reference sets provide a practical scaffolding for analyzing the structural dynamics of the research domain targeted by a review article.
Using reference sets as representative subsets of a specialized domain, we construct co-citation networks to describe the structure of the scholarly conversation within that domain. A co-citation network represents each publication with a node and creates an edge between a pair of nodes for each other publication that cites them both (Ennis 1992; Moody 2004; Price 1986; Small 1977, 1986; Stokes and Hartley 1989). 17 By focusing on co-citation between the set of articles that authors consider core to a topical area, we are able to discern structural characteristics that the simple citation counts from the previous analysis cannot. Co-citation reveals the relationships between scientific research as it is perceived by the scholars who are most engaged with and invested in that research (McCain 1986; White and Griffith 1981). Authors cite work for any number of contrasting reasons (Garfield 1979; Jurgens et al. 2018; Krampen et al. 2007), and high rates of co-citation between publications do not necessarily indicate agreement between those publications’ claims. But citation provides a signal that the referenced work bears direct relevance toward a publication’s arguments, be it supportive or argumentative. Frequent co-citation is therefore a strong indication that a pair of publications are in conversation with the same texts.
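The construction described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the pipeline used for the analysis: the function name, the toy identifiers ("P1", "A", and so on), and the input layout (a mapping from each citing publication to its reference list) are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(citing_refs, reference_set):
    """Count, for each pair of works in reference_set, how many citing
    publications cite both of them. Each such count is an edge weight."""
    counts = Counter()
    for refs in citing_refs.values():
        # Keep only cited works that belong to the reference set of interest.
        shared = sorted(set(refs) & reference_set)
        for a, b in combinations(shared, 2):
            counts[(a, b)] += 1
    return counts


# Toy data: three citing publications and their reference lists.
citing_refs = {
    "P1": ["A", "B", "C"],
    "P2": ["A", "B"],
    "P3": ["B", "C", "X"],  # "X" lies outside the reference set and is ignored
}
edges = cocitation_counts(citing_refs, {"A", "B", "C"})
print(dict(edges))  # {('A', 'B'): 2, ('A', 'C'): 1, ('B', 'C'): 2}
```

Pairs are stored with their endpoints in sorted order so that each undirected edge has a single key.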
When aggregated across a set of articles, co-citation relations enable a host of network-analytic measures to be used, revealing the structure of similarity-in-use of a group of articles. The core of the published research in an area has an internal structure that is revealed by these patterns of co-citation among the individual works, and co-citation is especially well suited to identify the key scholarly roles and communities we will discuss. Figure 4 illustrates the structure revealed by the co-citation network for one such review (to be discussed in more detail below). If reviews have substantial structural effects on the reference set of research represented in a topic—fragmentation or unification, centralization or democratization—those changes will be revealed by comparing the set’s co-citation network before and after the review is published. 18

The Reference Set and Co-citation Network Associated with the Annual Review of Entomology Review “Geographic Structure of Insect Populations: Gene Flow, Phylogeography, and Their Uses” (Roderick 1996)
Identifying Structural Transformation
Structural change in networks can be difficult to consistently measure. Many measures, such as those relating to automatically identified communities, are descriptively rich but not robust to comparisons between different networks. Other measures (e.g., assortativity) offer consistent comparisons across time and communities but do not afford clear interpretations in the context of the co-citation networks under consideration here. We restrict our attention to two features of network structure, closeness and transitivity. Together, these features yield rich structural descriptions of how sets of references get co-used, and they provide robust measures of within-network change and between-network difference.
Our measure of network closeness is based on the average, weighted path length within a network; it is a description of how “narrow” a group of publications is. Does the set of publications expand over a wide range, with certain articles virtually unrelated to others? Or do the works address very similar ideas, touching on different aspects of the same topic? Weighted path length captures this idea succinctly: if two publications are topically similar then they are likely to be either co-cited often or both be co-cited frequently with an intermediary work. Starkly dissimilar works, in contrast, will likely require many “jumps” along co-citation edges in the network to reach one another.
We measure the length of a path in the network as the sum of the inverted 19 co-citation counts of the edges included in the path
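To illustrate the idea, the sketch below treats each edge as having length 1 / (co-citation count) and averages the shortest weighted path over all connected pairs. It is a simplified illustration, not the measure as implemented for the analysis: the function names are hypothetical, the restriction to connected pairs is an assumption, and the normalization against rewired networks discussed later is omitted.

```python
import heapq


def build_adjacency(cocitations):
    """Turn {(a, b): count} into a symmetric adjacency map of edge lengths,
    where each length is the inverted co-citation count."""
    adj = {}
    for (a, b), count in cocitations.items():
        length = 1.0 / count
        adj.setdefault(a, {})[b] = length
        adj.setdefault(b, {})[a] = length
    return adj


def shortest_path_lengths(adj, source):
    """Dijkstra's algorithm: weighted distance from source to each reachable node."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for nbr, length in adj[node].items():
            candidate = d + length
            if candidate < dist.get(nbr, float("inf")):
                dist[nbr] = candidate
                heapq.heappush(heap, (candidate, nbr))
    return dist


def mean_path_length(adj):
    """Average shortest weighted path length over all connected pairs of nodes."""
    nodes = sorted(adj)
    total, pairs = 0.0, 0
    for u in nodes:
        dist = shortest_path_lengths(adj, u)
        for v in nodes:
            if u < v and v in dist:
                total += dist[v]
                pairs += 1
    return total / pairs if pairs else float("nan")


# A and B are co-cited 4 times, B and C twice, A and C once.
adj = build_adjacency({("A", "B"): 4, ("B", "C"): 2, ("A", "C"): 1})
print(mean_path_length(adj))  # 0.5
```

In this toy network the shortest route from A to C runs through B (0.25 + 0.5 = 0.75) rather than along the direct but weak A-C edge (length 1.0), which is the sense in which frequently co-cited intermediaries bring works "closer."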
The second network characteristic we investigate is clustering, the degree to which the papers listed in a review come to be co-cited in clusters. In a network with high clustering, the individual articles are embedded in tight-knit groups, with members of those groups having a high probability of being connected to one another through co-citation. A network with low levels of clustering, in contrast, is more heterogeneous, allowing certain publications to enjoy privileged positions in the global structure.
We measure network clustering (also referred to as a network’s transitivity) using an extension to the standard equation of the clustering coefficient for weighted graphs. If publication A is frequently co-cited with publications B and C, the clustering coefficient summarizes how frequently publications B and C are co-cited with each other—it is a straightforward measure of triadic closure. It is calculated by dividing the number of closed triplets (triangles) in the network by the number of connected triplets (triangles and open two-paths):
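The unweighted form of this ratio can be sketched as follows. This is a deliberate simplification: the weighted extension used in the analysis is not reproduced here, and the function name and the adjacency layout (a mapping from each node to the set of its neighbors) are assumptions made for illustration.

```python
from itertools import combinations


def transitivity(adj):
    """Share of connected triplets that are closed.

    A connected triplet is a node together with two of its neighbors;
    it is closed when those two neighbors are also linked to each other.
    """
    closed = 0
    connected = 0
    for node, nbrs in adj.items():
        for a, b in combinations(sorted(nbrs), 2):
            connected += 1
            if b in adj[a]:
                closed += 1
    return closed / connected if connected else 0.0


# A triangle A-B-C with one extra work D attached only to C.
adj_sets = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}
print(transitivity(adj_sets))  # 0.6
```

Here the triangle contributes three closed triplets (one centered on each of A, B, and C), while the two open two-paths through C to D bring the total number of connected triplets to five.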
Together, these two measures do a remarkably good job of summarizing much of the overall structure of a co-citation network. Although a simple two-dimensional space obviously cannot capture the nuances of an entire network of relationships, average path length and the clustering coefficient can indicate a wide array of network topographies of interest. Figure 5 illustrates some common network configurations associated with high and low values of these measures. Networks with low clustering coefficients and short path lengths can be characterized by star structures, with central hubs connecting groups of vertices that are not connected to one another. Networks with many long, sprawling paths that do not connect back on themselves will have a low clustering coefficient and a large average path length. A high clustering coefficient with short average paths between nodes is the measure of a “small world” network (Watts and Strogatz 1998), in which tight, insular clusters are connected by bridging nodes that span subgroups. Finally, a network with long average paths but a relatively high clustering coefficient will have dense clusters of vertices separated by longer sequences of connected, intransitive vertices.

Typical Structures Found in Networks with High and Low Clustering Coefficients and Long and Short Average Path Lengths
Modeling change in networks is complicated by the interdependencies between their various structural features. A network’s size tends to be closely linked to the density of network relations, which itself is highly correlated with transitivity, and so on. Certain robust statistical models designed to address such interdependencies exist, including dynamic exponential random graph models (Krivitsky and Handcock 2013) and stochastic actor-oriented models (Snijders 2005). These models are excellent for uncovering the dynamics within a particular community, but they are ill-suited for comparing dynamics across communities. Moreover, the complexity of most network-evolution models does not scale to the order of millions of distinct networks containing tens of millions of vertices. We therefore take a simplified approach to the estimation of structural changes associated with scholarly review.
We created a dataset with each observation representing the reference set of one article in the corpus (reviews and non-reviews), measuring for each co-citation network its structural characteristics in the seven years leading up to publication of the focal article, and the change in those structural characteristics observed over the seven subsequent years. We then used a multivariate regression (see Equation 4) to predict the changes to the structure that can be explained by review. 22 This modeling approach allows us to compare the evolution of Annual Review reference sets to general non-review articles and to matched review-like articles. It is also robust to the pitfalls of network prediction in several ways. As described earlier, our primary measures of interest, change in mean path length and change in clustering coefficient, are calculated with respect to deviations from the average over random rewirings of the network. This is a widely used approach in the network literature to construct variables that are not sensitive to the particularities of an individual community (e.g., Kolaczyk and Csárdi 2014:5.3). In addition, our measures of change focus on the difference between the before- and after-publication structures, so by including covariates for the initial structure of each network, we are effectively controlling for that network’s idiosyncrasies.
Equation 4 specifies the multivariate linear model in detail, with reference sets indexed by i and
The model described in Equation 4 is a straightforward way to identify changes in network structure that coincide with the publication of Annual Review articles, but care must be taken to justify the broader argument we make about the effect of formal review on research domains. Academic discourse is a distinctly reflexive process—every stage of research is carried out with an awareness of the context in which it will be viewed by others. No part of the creation or dissemination of scholarly work is done in a vacuum, an observation that is especially true for review articles. Annual Review publications are often targeted directly at emerging fields that are likely experiencing characteristic structural transformations on their own. Disentangling the types of changes that are the result of formal review from those that would have taken place independently in a field requires careful consideration of both the theorized mechanisms in play and the analytic methods used. We therefore estimate the model in Equation 4 on two versions of the data, each emphasizing a different aspect of the transformations under consideration. The first version uses a representative subsample of the full corpus—reviews and non-reviews alike—to describe the overall features of structural change we observe. The second version uses a more restrictive sample of articles, matching Annual Review articles to Annual Review-like articles published elsewhere, to capture the narrow effect of Annual Review journals.
The initial estimation of the model parameters compares Annual Reviews to all other articles in the corpus using a weighted 5 percent sample from the corpus. A subsample is necessary for computational efficiency—the calculation of network statistics like average path length as well as estimation of the multivariate regression are impractical on the full set of citation data. 23 The sample is taken at the level of the referencing article for each reference set and is weighted to ensure (a) representation of less prolific subject areas and (b) retention of all Annual Review articles. In addition, degenerate networks (very sparse or having fewer than 20 referenced articles in the Web of Science) for which the structural statistics are not calculable were removed from the sample.
The first two columns of Table 3 present estimates 24 for the model using the 606,979 observations in this sample. The control variables suggest some interesting patterns, but the primary covariate of interest for our analysis is the indicator for sets of publications referenced by review articles. For both outcome variables, the coefficient on
Results Describing Structural Changes to Reference Clusters
Note: The left two columns of results show estimates using a weighted, 5 percent sample of the full data; the remaining columns show the same results estimated on the propensity score–matched sample. The final two columns use co-citation networks that exclude articles citing the root article. Values in parentheses represent 95 percent credible intervals on all estimates.
However, as mentioned earlier, we must be careful about the specific comparison this initial analysis makes. The question we hope to address is this: how does publication of a formal review affect the scientific discourse in a field? But the comparison implicit in the representative sample contrasts formal reviews with all other scholarly articles. To interrogate the effects of formal review specifically, we ought to compare “officially sanctioned” reviews to review-like articles that are not endorsed by a publisher like Annual Reviews. This is important because the estimates produced on a representative sample (like those just described) may confound the effect we are seeking with a more general pattern of the evolution of research specialties. It is plausible that specialized subfields develop in predictable ways and that the estimates in the first two columns of Table 3 are an artifact of the tendency for reviews to target fields at specific moments in their development. In this situation, the
To account for this, we estimate the model described in Equation 4 again, this time on an unweighted, propensity score–matched subsample of the data. 25 The propensity score used for matching is built from a logistic regression predicting formal review articles using the structural characteristics of the articles’ reference co-citation sets (Equation 5). In addition to the structural features used as controls in Equation 4, we add predictors that are likely to be associated with reference lists of review versus non-review articles. Because reviews may tend to focus on novel subfields, we account for structural features of the co-citation network that are indicative of newly emerging areas of research: the mean age of the cited references (
Before discussing the results of the propensity score–matched analysis, it is worth considering the results of the propensity score model itself (presented in Table 4). At the point of publication, the set of works that review articles reference are distinct from a regular citation community in important ways. Unsurprisingly, reviews are strongly associated with larger reference lists than are standard articles (
Logistic Regression Estimates for the Propensity Score Model (Equation 5)
These results suggest the literature cited by reviews forms a constellation of distinct, active communities; small research clusters, each engaged in vigorous conversation within itself, are linked by more sporadic connections that tie them together into a topically coherent whole (as summarized in the top panel of Figure 6). As a whole, estimates from the propensity model describe what many would expect of a newly emerging scientific specialty: clusters of vibrant research activity spanning a short time frame and just beginning to realize their connections with one another. This suggests the propensity score we use to produce the matched sample is identifying the types of research communities that could be selected for an Annual Review article but were not. 27 Analysis of this matched sample therefore allows us to look for divergence in the histories of research communities that were the subject of such formal review and those that were not.
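To make the matching step concrete, the sketch below pairs each review with the unmatched non-review whose propensity score is closest. The text does not specify the matching algorithm, so everything here is an assumption made for illustration: greedy one-to-one nearest-neighbor matching, the caliper of 0.05, the function name, and the toy identifiers and scores.

```python
def greedy_match(treated, controls, caliper=0.05):
    """Greedy 1:1 nearest-neighbor matching on propensity scores.

    treated, controls: dicts mapping an article id to its propensity score.
    Returns a list of (treated_id, control_id) pairs; a treated unit with no
    unused control within the caliper is left unmatched.
    """
    available = dict(controls)
    pairs = []
    # Process treated units from highest to lowest score (an arbitrary but
    # deterministic ordering chosen for this sketch).
    for t_id, t_score in sorted(treated.items(), key=lambda kv: -kv[1]):
        if not available:
            break
        c_id = min(available, key=lambda c: (abs(available[c] - t_score), c))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]  # each control is used at most once
    return pairs


reviews = {"R1": 0.80, "R2": 0.40}
non_reviews = {"N1": 0.78, "N2": 0.42, "N3": 0.10}
matched = greedy_match(reviews, non_reviews)
print(matched)  # [('R1', 'N1'), ('R2', 'N2')]
```

The non-review N3, whose reference set looks least like that of a review, is never used, which mirrors the intent of restricting the comparison to review-like communities.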

Simplified Illustration of the Structural Changes to Co-citation Networks Associated with Review Articles
The model estimates presented in the last two columns of Table 3 are based on the matched sample of about 6,500 pairs of reviews and non-reviews (N = 12,990). Strikingly, these results display the same overall effect of reviews for the matched pairs as we see in the 5 percent sample. Formally reviewed research areas experience a significant negative change in both their average path lengths (about 11 percent of a standard deviation shorter) and clustering coefficients (about 7 percent of a standard deviation lower) when compared to similar reference sets that were not the subject of formal review. However, note that the estimated effect of review on clustering coefficient is significantly smaller in magnitude when estimated on the matched sample (–.070 versus –.208), suggesting that at least some of the dramatic “declustering” we observe in these communities would have taken place even in the absence of a formal review. Nonetheless, when taken together with the propensity-score estimates themselves, these results describe a striking restructuring of the discourse in a field. Although reviewed networks were already very narrow, the decrease in path lengths suggests they contract even further. The negative estimate for the effect of review on change in clustering coefficient shows that the individual clusters characteristic of the small-world sets that are formally reviewed in an Annual Review journal become considerably less cohesive. 28
The estimates presented in Tables 3 and 4 describe a network that becomes much more centralized, with more of the literature in the specialty relating exclusively to a small subset of publications. After a review is published, more of the co-citation relations are centered on fewer of the cited works (a structure Gondal [2011] suggests is typical of a newly emerging research domain). Communities of literature that are the subject of a review article collapse into a hub-and-spoke structure, as illustrated in Figure 6. Edges from the hubs to any one of the peripheral works become more common, and edges between those peripheral works diminish. This transformation suggests reviews are performing an act of selection: certain works are singled out as exemplary in a scientific subject, so much so that the remaining work already published in that area is cited only in relation to the newly anointed exemplars. The story that a review tells about an emerging field, a narrative of its past, present, and future as a coherent specialty (Sinding 1996), shapes that field in consequential ways. The legitimacy of a formal review in an Annual Review journal grants its authors considerable influence over the development of a research domain.
Identifying Exemplars
Still, if review articles are indeed performing curatorial work on the articles they consider, centering a few as exemplars of a field while sidelining others, what criteria does that curation use? Do reviews simply amplify the attention received by articles that are canonical to a topic, drawing further accolades to the already celebrated? Or do reviews perform a more dramatic form of synthesis, drawing previously marginal work into the spotlight to compose a novel portrait of a specialized domain of knowledge? We address this question by examining the structural features of individual articles that lead to changes in the attention they receive.
Many of the articles a review cites do not become significantly more or less central to their subject area as a result, but some migrate from the periphery to the center of the domain’s focus. Our aim is to identify the articles that gain significantly in this regard, and to identify which features of their initial position help them achieve increased attention. We focus on two facets of initial network position associated with the importance of a vertex in its community: central nodes and bridging nodes. Structurally, a central publication is one that is co-cited frequently within a relatively tight cluster of other publications that are themselves co-cited frequently with one another. This recursive definition means a central article is one that is cited alongside many other publications in the network and is in the core of a cohesive cluster of publications that are all frequently cited by the same body of published work. Eigenvector centrality (Bonacich 1987) is ubiquitous in the literature, as it succinctly captures this recursive notion of structural importance: an article is central if it is tightly connected to other central articles.
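The recursive definition can be made concrete with a short power-iteration sketch. It is illustrative only and assumes a connected, undirected network stored as a symmetric mapping of edge weights. The function name is hypothetical, and adding each node's own score at every step (iterating on A + I rather than A) is a convergence device chosen for this sketch that leaves the leading eigenvector unchanged.

```python
import math


def eigenvector_centrality(weights, iterations=200, tol=1e-10):
    """Power iteration for the leading eigenvector of a weighted network.

    weights: {node: {neighbor: weight}}, symmetric and connected.
    Returns scores normalized to unit Euclidean length.
    """
    nodes = sorted(weights)
    score = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # A node's new score is its own score plus the weighted scores of
        # the nodes it is connected to.
        new = {
            n: score[n] + sum(w * score[m] for m, w in weights[n].items())
            for n in nodes
        }
        norm = math.sqrt(sum(v * v for v in new.values()))
        new = {n: v / norm for n, v in new.items()}
        if max(abs(new[n] - score[n]) for n in nodes) < tol:
            return new
        score = new
    return score


# A star: hub H is co-cited with three works that are never co-cited together.
star = {
    "H": {"L1": 1.0, "L2": 1.0, "L3": 1.0},
    "L1": {"H": 1.0},
    "L2": {"H": 1.0},
    "L3": {"H": 1.0},
}
centrality = eigenvector_centrality(star)
print(round(centrality["H"], 4), round(centrality["L1"], 4))  # 0.7071 0.4082
```

The hub scores highest because it is the only vertex tied to every other vertex, while the three peripheral works receive equal, lower scores.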
In contrast to central nodes, network bridges exist in the spaces between clusters in a network. In the literature, bridges are discussed in terms of brokerage (Fleming, Mingo, and Chen 2007) and structural holes (Ahuja 2000). To capture the degree to which a specific article acts as a bridge in a reference set, we measure the local transitivity of each vertex. 29 It is important to note that transitivity has an inverted relation to bridging: a vertex with low transitivity holds a highly bridging position in its network and vice versa. The transitivity of a given vertex is the proportion of the pairs of its neighbors that are themselves connected to one another (we use the generalization of this measure for weighted networks from Barrat and colleagues [2004]). Co-citation networks, as a class of affiliation networks, tend to have high numbers of closed (transitive) triads, making local transitivity an especially powerful measure of bridging in the communities we are studying.
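As a rough illustration of this measure, the sketch below computes a weighted local clustering coefficient in the form commonly attributed to Barrat and colleagues: each pair of a vertex's neighbors contributes the sum of the two edge weights joining them to the vertex, and only pairs that are themselves connected count toward the numerator. The exact normalization should be treated as an assumption to check against the original formulation, and the function name and toy network are hypothetical.

```python
from itertools import combinations


def weighted_local_transitivity(weights, node):
    """Weighted local clustering of one vertex (Barrat-style sketch).

    weights: {node: {neighbor: weight}}, symmetric.
    Returns a value in [0, 1]; low values indicate a bridging position.
    """
    nbrs = weights[node]
    k = len(nbrs)
    if k < 2:
        return 0.0  # no pairs of neighbors to close
    strength = sum(nbrs.values())
    closed_weight = sum(
        nbrs[j] + nbrs[h]
        for j, h in combinations(sorted(nbrs), 2)
        if h in weights[j]  # the two neighbors are themselves co-cited
    )
    return closed_weight / (strength * (k - 1))


# B is co-cited with A, C, and D; only A and C are co-cited with each other.
net = {
    "A": {"B": 3.0, "C": 1.0},
    "B": {"A": 3.0, "C": 1.0, "D": 2.0},
    "C": {"A": 1.0, "B": 1.0},
    "D": {"B": 2.0},
}
print(round(weighted_local_transitivity(net, "B"), 3))  # 0.333
print(weighted_local_transitivity(net, "A"))  # 1.0
```

Vertex B, which links D to the A-C cluster, has the lower transitivity and therefore the more bridging position, consistent with the inverted relation noted above.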
To discern which publications are boosted by review articles relative to the reference set, we use a publication-level regression. Equation 6 specifies a multilevel representation of the model, where i indexes articles embedded in the reference set indexed with j. For each referencing article, we calculate centrality and bridging statistics for every article in its co-citation network before and after publication of the referencing article, and we use these statistics to predict those articles’ change in citations received. The dependent variable is computed as the difference in the total number of citations each article receives in the seven-year windows before and after publication of the review article.
To allay spurious patterns that may emerge from variability between reference set networks, we standardize the measures of citation change, centrality, and transitivity to ensure they describe each publication’s evolution relative to the reference set in which it is embedded. Thus, each value is centered at the set’s mean and divided by the set’s standard deviation. Such group-mean centering is vital for interpretation of the model: we are interested in the change for each article relative to the other articles in its reference set, and group-mean centering allows us to measure that change while minimizing the confounding effects of set-specific characteristics. Articles with high values of the dependent variable are those that gained the most after being cited by the focal article, for instance, by moving from a position of relative obscurity to one of high visibility in the field.
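The standardization step can be sketched as follows. This is a minimal illustration with hypothetical names and toy values. Whether the population or the sample standard deviation was used is not stated in the text, so the use of the population form here is an assumption, as is the choice to return zeros for a set with no variation.

```python
from statistics import mean, pstdev


def standardize_within_sets(values_by_set):
    """Center each value at its reference set's mean and divide by the
    set's standard deviation (group-mean centering and scaling)."""
    standardized = {}
    for set_id, values in values_by_set.items():
        center = mean(values)
        spread = pstdev(values)  # population SD: an assumption of this sketch
        if spread == 0:
            # No within-set variation: every article sits at the set mean.
            standardized[set_id] = [0.0 for _ in values]
        else:
            standardized[set_id] = [(v - center) / spread for v in values]
    return standardized


# Toy centrality scores for the articles in two reference sets.
raw = {"review_1": [1.0, 2.0, 3.0], "article_9": [10.0, 10.0]}
z = standardize_within_sets(raw)
print([round(v, 3) for v in z["review_1"]])  # [-1.225, 0.0, 1.225]
print(z["article_9"])  # [0.0, 0.0]
```

After this transformation, a value describes an article's standing relative to the other articles in its own reference set, regardless of how large or dense that set's network is.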
As independent variables, we use measures of each work’s eigenvector centrality and transitivity in the co-citation network that preceded the focal article’s publication (as well as their interaction). We also include the total number of citations each article received in the seven years preceding its citation by the focal article. Each of these article-level variables is group-mean centered—a value of
Finally, we include two non-structural covariates to account for potentially confounding characteristics of each referenced article.
Because we aim to measure the difference in these predictions for networks referenced by formal reviews, we interact all the covariates with a dummy indicating whether the focal article was published in an Annual Review journal. In addition, we restrict our sample to the matched pairs of referencing articles described earlier (see Table 3). Although the current model (Equation 6) is specified at the level of the individual cited publication, the causal effect of interest remains at the level of the referencing article. Using the reference–set level matched pairs allows us to compare the changes in individual article positions within review and non-review sets, and the estimates indicate the degree of divergence between scientific specialties that were chosen for review and those that were not. The full model is estimated for 589,735 articles across about 13,000 reference sets using OLS with cluster-robust standard errors to mitigate the within-network interdependencies of observations. 30
Table 5 reports the coefficient estimates (along with 95 percent credible intervals) for the model. The first six variables, those not interacted with
Coefficient Estimates and 95 Percent Credible Intervals from a Linear Regression Predicting Relative Change in Article Eigenvector Centrality
Local transitivity is similarly negatively associated with increased centrality. For articles with low transitivity, the articles they are co-cited alongside are not likely to be co-cited alongside one another, suggesting these low-transitivity articles act as bridges in a network. The results of this model indicate that even when not cited in a review, bridging articles are likely to become more central over time. Articles that share an author with the referencing article are also more likely to have a positive change in their eigenvector centrality, whether or not the referencing article is a formal review. Finally, articles that are themselves reviews are predicted to become more central after being cited by non-review and review articles alike.
These patterns tell us something about the general structural evolution of scientific specialties, but we are most interested in the comparison between research areas that receive formal reviews and those that do not. Once we account for the types of research fields that are prone to review, what is the residual difference in fields that are chosen for review by an Annual Review journal? The remainder of results in Table 5—those that include an interaction with
First, we see no evidence that reviews have an independent effect on the centrality of self-cited articles or of articles that are themselves reviews. Moreover, reviews do not seem to change the trajectory of highly cited publications, at least in comparison with similarly structured communities that were not cited by a review. However, the coefficients measuring the structural positions of articles before being referenced tell a different story. The estimated effects of eigenvector centrality, local transitivity, and their interaction are all significantly different from zero, and they suggest review articles will tend to disproportionately reward publications that hold a central position or act as bridges in their co-citation network. The negative estimate on the interaction term can be interpreted to mean articles that are both central and occupy bridging positions (those like vertex B in Figure 7) are especially likely to become more central after being cited by a formal review. To get a sense of the magnitude of these effects, one can think of an article that is in the top fifth percentile of eigenvector centrality and bottom fifth percentile of transitivity in its network before being cited by a review. Such an article will be expected to experience an increase of more than 10 percent of a standard deviation in eigenvector centrality in the seven years following review, being co-cited much more frequently alongside the other important work in the domain.
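As an illustration of what it means for an article to be central in this sense, the sketch below computes eigenvector centrality by power iteration on a small hypothetical network in which one vertex, labeled "B" here purely for illustration, links two clusters. The network, the labels, and the use of an unweighted graph are assumptions of the example rather than details of the analysis.

```python
import math

def eigenvector_centrality(adj, max_iter=1000, tol=1e-12):
    """Eigenvector centrality of an undirected graph via power iteration.

    Each step adds a vertex's own score to the sum of its neighbors' scores
    (iterating on A + I), which leaves the leading eigenvector unchanged
    and avoids oscillation.
    """
    nodes = sorted(adj)
    x = {v: 1.0 for v in nodes}
    for _ in range(max_iter):
        new = {v: x[v] + sum(x[u] for u in adj[v]) for v in nodes}
        norm = math.sqrt(sum(val * val for val in new.values()))
        new = {v: val / norm for v, val in new.items()}
        if sum(abs(new[v] - x[v]) for v in nodes) < tol:
            return new
        x = new
    raise RuntimeError("power iteration did not converge")

# Hypothetical co-citation ties: two clusters joined through vertex "B".
edges = [
    ("a1", "a2"), ("a1", "a3"), ("a2", "a3"),
    ("c1", "c2"), ("c1", "c3"), ("c2", "c3"),
    ("B", "a1"), ("B", "a2"), ("B", "c1"), ("B", "c2"),
]
adj = {}
for a, b in edges:
    adj.setdefault(a, set()).add(b)
    adj.setdefault(b, set()).add(a)

centrality = eigenvector_centrality(adj)
for v in sorted(centrality, key=centrality.get, reverse=True):
    print(v, round(centrality[v], 3))
```

In this toy network the bridging vertex is also the most central one, which is the combination of positions that the interaction term singles out.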

Figure 7. Illustration of the Interaction of Eigenvector Centrality and Local Transitivity in a Network
Together, the estimates from the two models (Tables 3 and 5) suggest review articles redefine how cited works are interrelated going forward by weakening the existing, tight-knit clusters of research, and by recentering the conversations that relate these clusters to one another. An important caveat to these findings lies in the small magnitude of the coefficient estimates in Table 5. It is clear the model is not describing the majority of the variation in centrality change among these communities (the R² for the model is around .26). Nevertheless, the findings are significant for understanding the effects of formal review articles. The dynamics of citation are immensely complex, determined by a multitude of scholarly, social, institutional, and structural factors. The above analysis examines only the effects of the network’s structure, but it still uncovers significant regularities in the way those networks change. The contrasting predictions for reviewed and non-reviewed communities reveal the acute, atypical restructuring that results from formal scholarly review, even when accounting for the types of changes typical for emerging specializations. Referring to Figure 6, it is exactly the bridging articles that are most likely to become the central hubs in the reshaped network that reviews create.
Looking Closer
These statistical analyses paint a vivid picture of how formal scholarly review contributes to the structuring of emergent research domains. Research output is treated much differently by the larger academic community after it has been reviewed. Specific findings are drawn into a broader conversation with one another, with fewer stark divides between different approaches to the same topic. Research projects that are relevant to multiple conversations become exemplars that relate disparate threads into a single cohesive discourse. To interrogate this process in more detail, and to cement the ideas in a practical framework, it is useful to dissect specific cases of domain transformation associated with a formal review.
We selected three such cases for dissection: “Integrated Assessment Models of Global Climate Change” from the Annual Review of Energy and the Environment (Parson and Fisher-Vanden 1997); “Geographic Structure of Insect Populations: Gene Flow, Phylogeography, and Their Uses” from the Annual Review of Entomology (Roderick 1996); and “Surface Treatments of Polymers for Biocompatibility” from the Annual Review of Materials Science (Elbert and Hubbell 1996) (see Figure 8). The close examination of individual cases serves two purposes. First, in a large-scale quantitative analysis such as this, looking at the outcomes as they play out in real scenarios gives substance to the abstract results revealed by the models. Clear examples of the structural changes described in this article can aid in understanding how they are realized within actual publication communities. These cases are not intended to validate the mechanisms of change we describe (a group of three exemplary samples from such a large corpus would be ill suited for that purpose); rather, they allow a partial elaboration of a dynamic structural evolution. Any case, including those shown in Figure 8, will afford multiple, potentially contradictory explanations of the mechanisms at play. We chose these cases not to be representative of the specific, diverse circumstances of academic fields, but to be illustrative of the transformations that the quantitative analyses uncovered at scale.

Figure 8. Illustration of Network Change in Three Cases
The second purpose for showing a selection of real-world examples is to underscore some of the limitations of the story of structural change we describe. The evolution of any social network is an inherently complex process. The reflexivity and multifaceted nature of scholarly publication mean co-citation networks display especially complicated dynamics. The mechanisms we describe here—those of scholarly centralization and the promotion of bridging exemplars—form just one component of the process of disciplinary evolution. However, as the examples in Figure 8 illustrate, myriad other factors influence the type and magnitude of reshaping these fields undergo. The success of a research project and its impact on a field are influenced by forces related to institutional affiliations, personal relationships between scientists, geographic location, political and cultural climate, technological innovations, and any number of other particularities. In spite of the complexity and heterogeneity of scholarly fields’ evolution, formal review appears to exert consistent influence over the shape of that evolution.
Each of the three panels in Figure 8 represents one Annual Review publication. The cases were chosen to demonstrate the types of transformations suggested by the quantitative analysis from the previous section and to span a variety of scientific disciplines. In each panel, the network on the left represents the co-citation structure of the reference set collected in the seven years immediately preceding publication of the review, and the network on the right represents that same set over the seven years after the review was published. Perhaps most striking when comparing the networks before and after review is the consistent change in structure with clustered, small-world networks transforming into centralized, core–periphery networks. The size of the vertices in the figure represents the number of citations received by the article, demonstrating the high level of centralization in the post-review reference sets. Although the networks become more dense overall after review, the co-citation relations accumulate predominantly among a small core of publications. We argue that this centralization is a result of reviews breaking down the boundaries between insulated research clusters and lifting a smaller number of publications up as a “hub” holding the emerging scientific subfield together.
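The before-and-after comparison described here can be approximated with a short sketch: build co-citation ties among a review's reference set from the reference lists of papers published in each seven-year window, then summarize how concentrated the ties are. Everything in the example is hypothetical (paper identifiers, years, and the exact window boundaries), and Freeman's degree centralization is used only as one convenient summary; the source does not state that this is the measure behind Figure 8.

```python
from collections import Counter
from itertools import combinations

def cocitation_counts(citing_papers, reference_set, start_year, end_year):
    """Co-citation ties among reference_set from papers in [start_year, end_year).

    citing_papers is an iterable of (publication_year, list_of_cited_ids).
    Returns (pair_counts, citation_counts).
    """
    pair_counts, citation_counts = Counter(), Counter()
    for year, refs in citing_papers:
        if not (start_year <= year < end_year):
            continue
        cited = sorted(set(refs) & reference_set)
        citation_counts.update(cited)            # vertex size in the figure
        pair_counts.update(combinations(cited, 2))
    return pair_counts, citation_counts

def degree_centralization(pair_counts, nodes):
    """Freeman degree centralization of the unweighted tie structure.

    Ranges from 0 (all vertices have equal degree) to 1 (a perfect star).
    """
    n = len(nodes)
    if n < 3:
        return 0.0
    degree = {v: 0 for v in nodes}
    for a, b in pair_counts:
        degree[a] += 1
        degree[b] += 1
    d_max = max(degree.values())
    return sum(d_max - d for d in degree.values()) / ((n - 1) * (n - 2))

# Hypothetical review published in 2000 citing five articles.
review_year, window = 2000, 7
reference_set = {"p1", "p2", "p3", "p4", "p5"}
citing_papers = [
    (1990, ["p1", "p5"]),          # outside both windows, ignored
    (1995, ["p1", "p2", "x9"]),    # "x9" is not in the reference set
    (1996, ["p1", "p2"]),
    (1997, ["p4", "p5"]),
    (1998, ["p3", "p4"]),
    (2001, ["p1", "p3"]),
    (2002, ["p2", "p3"]),
    (2003, ["p3", "p4"]),
    (2004, ["p3", "p5"]),
]

pairs_before, cites_before = cocitation_counts(
    citing_papers, reference_set, review_year - window, review_year)
pairs_after, cites_after = cocitation_counts(
    citing_papers, reference_set, review_year + 1, review_year + 1 + window)

print("before:", round(degree_centralization(pairs_before, reference_set), 3))
print("after: ", round(degree_centralization(pairs_after, reference_set), 3))
```

In the toy data the post-review ties all run through a single article, so the centralization score rises from a clustered structure toward a perfect star, mirroring the core–periphery pattern described in the text.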
The examples in Figure 8 are consistent with our explanation, but they also illustrate the domain-specific particularities that underlie the pattern of centralization—the processes by which reviews reconfigure the field are by no means uniform. In each case, the network diagram on the left maps the contours of a burgeoning subfield defined by the institutions, disciplinary norms, and existing research agendas that constitute it, and the diagram on the right describes a unified, centralized structure that might represent a more established area of research. But each of these three transformations is contingent on the specific context of its own domain.
Edward A. Parson and Karen Fisher-Vanden’s 1997 review (top panel) surveys three distinct approaches to integrated assessment (IA) models of climate change, all in the context of the more traditional atmospheric climate modeling that does not focus on political, economic, or social factors. George K. Roderick’s 1996 review (middle panel) describes several distinct research clusters concerned with the interaction between geography and the evolution of insect species, linking them together with recent work on genetic analyses. And Donald L. Elbert and Jeffrey A. Hubbell’s 1996 article (bottom panel) summarizes a class of biomaterial surface treatments that exists at the boundary between the otherwise often distinct biological and material sciences. Each of these situations is embedded in and responsive to a different scholarly setting, but they all share a common feature of illustrating connections between research domains that in other respects might have little in common. These three cases are typical of Annual Review articles in that they appear to meld a disparate archipelago of research clusters into a singular island of the targeted subfield.
The merging process is elucidated by examining the roles of key publications in the co-citation networks. The numbered vertices in each panel of Figure 8 mark the publications that are most central to the post-review co-citation network. With few exceptions and across the three examples, the articles that end up near the core of the network seven years after the review occupied central bridging positions when the review was published. In practical terms, this means that as reviews selected specific publications for their relevance to more than one of the disparate communities those reviews were linking, those same publications became key citations in the academic literature that followed. However, the drift toward the core that these boundary articles experience coincides with an unraveling of the dense research clusters they bridge. The most dramatic examples of this process in the cases listed here are the “climate modeling” (top), “species interaction” (middle), and “biochemistry and cell biology” (bottom) clusters in Figure 8. Each of these clusters represents a distinct and highly cohesive research domain that is cited by the review. However, after the review is published, these clusters become sparse, being characterized more by their articles’ connections to the new hub than to each other.
Although these cases are just single examples, they demonstrate the processes that underlie the transformations characteristic of reviews. Each review in Figure 8 is tied to a major reshaping of the conversation surrounding its topic. In every case, the literature became more integrated in the eyes of those who cited it. The climatological research investigating atmospheric processes became more solidly engaged with integrated assessment models of economic and political change. Publications discussing insect evolution and geography began to reference the then-new work on genetic “microsatellites” in insect populations. However, this integration came at the expense of the cohesion that had been inherent in the distinct communities. By shifting the focus of discourse from the narrow scope of the highly specialized subdomains to the integrated whole, a larger portion of the conversation shifted to relationships between the distinct literatures rather than within them. The consequences of this shift were significant. The cost of greater integration was the minimization of the highly specialized work in smaller communities, and the complete marginalization of research that did not fit into the new, centralized narrative.
Synthesizing Scientific Specialties
Our analyses describe the significant discursive transformations, both dramatic and subtle, that accompany curated academic review. Articles in Annual Review journals affect the future of the works they cite. We show that, contrary to conventional wisdom, the majority of reviewed publications are cited less than if they had been omitted from the review. Paradoxically, reviews tend to draw attention away from the specific articles they cite, an outcome that holds across disciplines and publication cultures.
One might explain this outcome by suggesting a process of knowledge encapsulation: reviews describe scientific specialties that have reached their conclusion and can be incorporated as resolved scientific fact. In this view, reviewed articles represent a step on the path to a conclusive finding and need not be cited once that finding has been achieved. This sort of black-boxing of knowledge may be present in scholarly discourse, but we argue that there is a more complex, and in many ways more significant, process taking place. Our examination of the structural changes that occur in reviewed scientific specialties suggests reviews shift discourse in a manner that simultaneously simplifies and collapses a knowledge community. A reviewed specialty may receive more attention overall, but that attention is directed toward a small group of exemplars. Peripheral research is included primarily through its relation to a central hub. The metamorphoses specialties undergo make them more closely connected while limiting the kinds of discourse they use.
Review articles mark a particular moment in the evolution of an academic specialty—a moment when a domain of research that is still relatively young has become established enough to warrant attention beyond that of its pioneering scholars. Annual Review articles are written about areas that have a sizeable literature, a range of committed scholarship, and often consist of multiple complementary or competing internal specializations. The transformations they describe are not as grand as the Kuhnian paradigm shifts or black-boxing of facts studied at length by scholars of science. Moments like those are marked not by reviews but by textbooks and encyclopedias. Rather, formal reviews indicate the meso-scale transformation of knowledge—from marginal collections of research to legitimate scientific specialties—that is missing from the macro-historical accounts focused on crisis, paradigmatic conflict, and revolution. Our work concentrates on review and synthesis as a site of negotiation of the definition and organization of a research area that, along with other forms of discourse like published replies and responses, may be integral to the meso-level process by which fields and their contents evolve and change rather than revolutionize.
Still, legitimation of knowledge requires translation of its principal claims and accepted theoretical stances into a more general framework. Novel domains are initially messy, full of contradictions, confusion, and exploration. Not until an area of knowledge has been schematized into a coherent and simplified framework will it enjoy acceptance in the wider community. Research specialties must make sense within existing disciplinary logics to be recognized within a wider scholarly context. This type of sense-making is the aim of all review articles, but our claim is that formal, invited Annual Review articles provide a distinctly authoritative source for this type of schematization. The quantitative analyses we describe compare Annual Review publications to non-AR articles that take the same summarizing form. The results indicate there is something distinctive about the authority conveyed by an Annual Review. The specific mechanism of this authority is not entirely clear: it could be the simple renown of AR journals, or it could be the high prestige of the authors they recruit. But it is clear from our analysis that this authority allows the formal reviews to exercise curatorial control over the future direction a scholarly field takes.
It is not enough to simply publish an article that aims to summarize the important work in an emerging specialization. Reviews published in an Annual Review journal present a distinct form of legitimized scholarly knowledge that allows them not just to observe an evolving subfield, but to alter its course. They essentialize the small-scale conversations that constitute a specialty, ignoring the minutiae that have little relevance to discipline-wide discourse and painting with broad strokes the different camps engaged with the reviewed topic. This erasure of detail is necessary to draw those camps into relation with one another as a singular whole. These types of restructuring, essential to the constitution of a legitimate research domain, are precisely those observed in the analyses we present here. The changes to scholarly discourse that accompany AR reviews create a constellation of simplified topics, tracing connections between them through exemplary publications and casting distinct bodies of work as a holistic gestalt.
But the sense-making entailed in scholarly reviews is more than simple translation. Novel areas of research are characterized by a lack of consensus not only on formal findings, but on their identity as a research specialty (Hill and Carley 1999). Much of the work of establishing a specialty lies in negotiating the definition of the underlying norms of discourse—a story with which to frame the conversation (Goffman 1974; Morrill and Owen-Smith 2002; Sinding 1996). Because of this, Annual Review articles occupy a privileged position in a newly formed area of research. By externalizing one particular definition of a scholarly situation and consecrating it in a published review, their particular form of academic discourse has disproportionate social influence over ensuing works that relate to the specialty. Review articles’ interpretations set an agenda for the future of scholarly domains (Myers 1991). In so doing, reviews do not just define what was, but what could be and will be.
Our findings offer a counterpoint to the view of scientific development as a process of conflict, challenge, and supplanting of paradigmatic frameworks. Sense-making efforts are ubiquitous in published scientific discourse. Researchers are constantly engaging in synthesis, attempting to make sense of observed scholarly developments by positioning them relative to other work and thus (re)defining some part of the field. In this way, science engages in a process of continuous self-reflection.
Formal review articles, although they occupy a niche in the domain of academic publishing, are ideal illustrations of such synthetic processes of knowledge creation. We have shown that formal reviews are much more than simple summaries of scientific subfields. By curating the published research in an area, reviews highlight certain connections between publications while obscuring others, dramatically simplifying a domain of knowledge. They focus scholarly attention around a few key publications and the relations between them at the expense of the broad majority of the research in a domain. Upon inclusion in a review article, the seminal research in a domain is apt to become forgotten, replaced by work that drew connections between existing ideas rather than generating new ones. This suggests the substance of scientific progression may be located somewhere between revolutionary shifts in paradigms of thought at one extreme and the ordinary science of cumulative advancement at the other. The synthetic work that is foundational to scientific discovery is an ongoing process of redefined frames imposed through micro-erasure. This continual, destructive restructuring of discourse constitutes the churning substrate on which significant swaths of knowledge are created.
Appendix
Replications of Estimates in Table 3
| | 5% replication 1 | 5% replication 1 | 5% replication 2 | 5% replication 2 |
|---|---|---|---|---|
| (Intercept) | −.013 (−.016, −.010) | −.032 (−.035, −.029) | −.014 (−.016, −.011) | −.029 (−.032, −.026) |
| | −.148 (−.168, −.127) | −.209 (−.229, −.188) | −.148 (−.167, −.128) | −.207 (−.228, −.185) |
| | −.135 (−.138, −.132) | −.026 (−.029, −.023) | −.134 (−.137, −.131) | −.023 (−.026, −.020) |
| | .008 (.007, .009) | .006 (.005, .008) | .008 (.007, .009) | .005 (.003, .006) |
| | −.106 (−.109, −.104) | −.084 (−.087, −.082) | −.107 (−.109, −.105) | −.082 (−.085, −.080) |
| | .003 (.002, .005) | .028 (.026, .030) | .003 (.002, .005) | .027 (.025, .028) |
| | −.657 (−.659, −.655) | .001 (−.001, .003) | −.66 (−.662, −.658) | 0 (−.002, .002) |
| | .032 (.030, .034) | −.621 (−.623, −.619) | .031 (.029, .034) | −.622 (−.624, −.620) |
| Res. Std. Dev. | .78 (.778, .781) | .803 (.802, .805) | .779 (.777, .780) | .801 (.800, .803) |
Note: Each replication represents an independent, weighted, 5 percent subsample of the complete dataset.
Acknowledgements
The authors would like to thank Bas Hofstra, Vamshi Krishna, Hjalmar Carlsen, Sebastian Munoz-Najar Galvez, Tamara Gilkes, Taylor LiCausi, Zack Almquist, Daniel Reese, Sanne Smith, Elizabeth Reddy, and James Murphy for their invaluable feedback and assistance.
Funding
This research was funded in part by NSF award #1633036 and SMA-1829240.
