Abstract
The purpose of grant peer review is to identify the most excellent and promising research projects. However, sociologists of science and STS scholars have shown that peer review tends to promote solid low-risk projects at the expense of more original and innovative projects that often come with higher risk. It has also been shown that the review process is affected by significant measures of chance. Against this background, the aim of this study is to theorize the notions of academic judgment and agonistic chance, and to present and analyze situations in which expert reviewers are faced with the challenge of trying to decide which grant proposals to select when there is strong disagreement. The empirical analysis is based on ethnographic observations of ten panel groups at the Swedish Research Council in the areas of natural and engineering sciences. By focusing on disagreement, the study provides a more in-depth understanding of how agonistic chance creeps into the peer review process and becomes part of the consensus that is created.
Introduction
We are all agreed that your theory is crazy. The question which divides us is whether it is crazy enough to have a chance of being correct. My own feeling is that it is not crazy enough.
—Niels Bohr
It is sometimes only a feeling that separates experts in how they view a new research idea, and this feeling can have quite different effects depending on the situation. Several studies have shown that evaluation of grant proposals’ quality is influenced by a seemingly unavoidable measure of chance, which is tied to the variation and interplay between various expert judgments (Boudreau et al. 2016; Cole, Cole, and Simon 1981; Chubin and Hackett 1990; Graves, Barnett, and Clarke 2011; Roumbanis 2017). In the sociology of science research focused on peer review, scholars have long paid attention to what they have come to call “the luck of the reviewer draw,” which means that different groups of reviewers will essentially always come to substantially different conclusions. Experiments have shown that certain proposals that received the highest ranking in (and were granted funding by) one group would have been denied funding had the judgments been made by a completely different group (Cole, Cole, and Simon 1981). What is identified and recognized as the highest scientific quality can vary considerably, both within and between panel groups. The selection of particular reviewers assigned to a certain subset of proposals can have considerable consequences for the final decisions. For example, Mayo et al. (2006) showed that four of the ten top-rated applications identified through ranking had a greater than 50 percent chance of not being funded depending on which specific pair of reviewers performed the assessment. In another study conducted by Pier et al. (2018), forty-three individual reviewers’ ratings and written reports concerning the same twenty-five grant applications showed “no agreement in how reviewers ‘translated’ a given number of strengths and weaknesses into a numerical rating” (Pier et al. 2018, 2952). These results clearly demonstrate how unreliable and arbitrary the basis for a funding decision can be. 
Recently, Jerrim and De Vries (2020) analyzed data from 4,000 grant applications and 15,000 reviewer reports and revealed a remarkably low level of consistency among the scores assigned by different reviewers, with a correlation between reviewer scores of only 0.2. Indeed, a number of studies have presented convincing evidence supporting the notion that the current method of scoring proposals introduces a significant level of arbitrariness and chance into the peer review process. However, the fact that reviewers evaluate the quality of proposals differently is perhaps not so unexpected because when they are confronted with multidisciplinary issues involving scientific complexity and uncertainty, “even competent, honest, and disinterested scientists may arrive at different conclusions” (Mumpower and Stewart 1996, 194). In a panel group, the preconditions for creating consensus require compromises to be made—individual scores must be calibrated and disagreements must be resolved. Consensus is often a fundamental procedure for peer review in panel groups. Still, striving for consensus has often been shown to lead to the elimination of truly innovative, original, interdisciplinary, and/or riskier projects, in favor of “solid” ones that are easier to agree on initially (Hackett 1990; Luukkonen 2012; van den Besselaar, Sandström, and Schiffbaenker 2018).
Regardless of whether we choose to view the occurrence of variation and disagreement in peer review as a problem of reliability or as a natural consequence of the cognitive diversity found in most fields of research, we cannot disregard the fact that it has direct effects on the decisions that shape the fate of individual researchers/research groups as well as the future of scientific progress (Müller and Kaltenbrunner 2019; Roumbanis 2019). To be sure, various efforts are constantly being made by funding organizations to improve and safeguard the review process’s reliability. But despite these efforts, the fact remains that judgment of quality is more or less a matter of qualified interpretation based on reviewers’ embodied expertise. For what is scientific quality, actually, when viewed in relation to comparatively elusive criteria such as innovation, originality, and feasibility? There is, indeed, a basic difficulty associated with “delimiting precisely what is meant by scientific quality, which concerns the very nature of research itself” (Hemlin 2009, 186; see also Hammarfelt 2017). In the process of grant peer review, these fundamental issues are brought to a head because it concerns projects that have yet to be carried out, and in relation to which evaluation of scientific quality can only be a kind of proxy for what would seem to be promising in the future.
The aim of this study is to elucidate negotiation situations in which reviewers are faced with the challenge of trying to manage disagreement. By focusing on disagreement in particular—how it emerges and how it is handled—my aim is to promote a more qualitative understanding of how chance discreetly creeps into the peer review process and becomes an integrated part of the consensus that is created. The data used for the study are based on ethnographic observations of ten Swedish Research Council (SRC) panel groups in the area of the natural and engineering sciences. The first step in the study is to theorize the concept of academic judgment. I aim to elucidate the dynamic and creative evaluative context that all research essentially finds itself in and that, in all its variational richness and tension, also implicitly pervades grant peer review. The second step is to outline a theory of agonistic chance, which is related to the actual occurrence of substantial disagreements and what happens during the struggle to reach consensus. The third step is to use empirical examples to illustrate how reviewers manage disagreement in different negotiation situations. The complex of problems I wish to address in particular concerns the fact that an individual’s expertise-based feeling or professional intuition sometimes stands in strong opposition to the group’s collective expertise, something that does not always promote the most innovative research ideas.
Theorizing Academic Judgment and Agonistic Chance
The Essential Tension in Science and Peer Review
There exists within all basic research what philosophers of science and sociologists have come to call the “essential tension” between tradition and innovation (Kuhn 1977; see also Andersen 2013; Foster, Rzhetsky, and Evans 2015; Hackett 2005; Parker and Hackett 2012). In the peer review process, this inherent element of tension constitutes something that can affect, in various ways, every reviewer who has been assigned the task of evaluating and comparing the quality of a number of research proposals; how well versed is the author(s) of a given proposal in the most recent, cutting-edge knowledge in the field and how original is the new idea being presented? Which project has the greatest potential of leading to substantial contributions in the future? The fact that reviewers often use fairly similar rules and standards when making their judgments can usually be explained by another fact, which is that they belong to and are integrated into similar epistemic (sub)cultures, networks, and research environments (Peterson 2017; Knorr Cetina 1999; Lamont 2009). But these rules and standards are constantly being challenged by the creativity, skepticism, and openness of scientific thought, or as Poincaré (1907) pointed out long ago: “Is not human intelligence, more specifically the intelligence of the scientists, susceptible of infinite variation?” (p. 12). Even though exchanges of views between researchers are generally based on the existence of established concepts and theories, they can also be extremely complex and encompass various kinds of ambiguity, vagueness, and polysemy (McMahan and Evans 2018).
A closer look at how science is practiced on a day-to-day basis reveals the dynamic nature of consensus-making. For example, Peterson (2015, 1219) observed that “Rather than a picture of high consensus and cognitive integration, the molecular biology lab was a scene of constant, sometimes chaotic, change.” The intellectual climates cultivated at university departments, research centers, and laboratories often naturally follow the reviewers into their discussions during panel group meetings. Moreover, even idiosyncrasies and personality traits can play a role in how new theories or research findings are received (Kuhn 1977). Basically, the “essential tension” exists in all scientific activity but is at times particularly tangible. Oftentimes, several of the most innovative and original proposals wind up somewhere in the evaluative gray area because the ideas they contain easily give rise to different opinions about how they should be interpreted and ranked. Luukkonen (2012, 57) stressed that “[w]hat will tip such proposals to one or the other side of the boundary may depend on contingent factors. It should also be noted that the customary rule of deferring to expertise implies that the fate of a proposal can depend on the views of only one or two experts.”
Cognitive Particularism and the Creativity of Academic Judgment
What is it though that, in certain situations, causes reviewers to evaluate new research projects in completely different ways? The very act of making a judgment constitutes a creative moment of discernment. de Certeau (1984, 72) wrote that “acknowledging an art at the root of thought, makes judgment a ‘middle term’…between theory and praxis.” Evaluating the quality of grant proposals can doubtless be seen as a special form of interpretive art that opens the door to considerable variations and differences between different perspectives. Naturally, reviewers do not always make similar interpretations, which helps to create different boundary relations in the ranking of proposals. Travis and Collins (1991, 327) pointed out that, regarding academic judgment, both agreement and disagreement can be tied to what is called cognitive particularism, that is, “the existence of cognitive boundaries within and between scientific specialties and disciplines.” What this concept focuses on in particular is the fact that reviewers naturally prefer research that is close to their own expertise and intellectual preferences. Here, it would seem appropriate to mention the epistemological crux of confirmation bias—researchers’ tendency to positively judge research that is in line with their own established understandings, standards, and expectations (Hergovich, Schott, and Burger 2010; Lee et al. 2013). The very issue of cognitive particularism is far from trivial. On the contrary, it calls forth the idea that the connection between the degree of intellectual distance and the judgment of scientific quality is never completely given in the peer review process. A reviewer who is intellectually closest to a research topic may be the toughest judge and may dismiss project ideas that another reviewer, who is intellectually more distant, would find exciting and innovative (Boudreau et al. 2016; Roumbanis 2019).
In certain cases, the reviewer closest to the topic may more easily see subtle, exciting aspects of a proposed project—aspects that another reviewer, with more distance, may not immediately discern.
It is often difficult to pinpoint exactly what will intellectually attract and engage some reviewers but not others. Interview studies have shown that researchers with long and successful careers often emphasize a specifically developed sensitivity, a feeling, for what they perceive to be truly brilliant research ideas or exciting problems to investigate (Merton 1973; Lamont 2009; Zuckerman 1996). But peer review is not only a matter of judging the quality of proposals but also equally of judging the plausibility of colleagues’ assessments and of reevaluating one’s own judgment in relation to new information that emerges during discussions.
In certain situations, reviewers with about the same intellectual distance to the topic and the same scientific expertise interpret the content of a given proposal completely differently. When these differences occur, the aforementioned cognitive sensitivity and the expert feelings it generates may play a significant role. Mumpower and Stewart (1996) pointed out that the source of disagreement is often that two or more experts combine relevant information somewhat differently, or that they attach importance to different aspects of the information in a complicated evaluative situation. The ideas and assertions presented in a proposal may give rise to, for example, different kinds of associations and insights, and several previous experimental studies have shown that it is “impossible to account for all the variance in individuals’ judgements solely in terms of cues, at least when the judgements have an intuitive component” (Mumpower and Stewart 1996, 197). Reviewers are constantly using their own expert intuition both when they separately examine proposals, and again when they negotiate within the group. Scientific qualities of many types can fill a reviewer with a particular feeling of intellectual enthusiasm and delight about a given research idea. O’Loughlin and McCallum (2019, 333) suggested that quality criteria such as “coherence, fruitfulness, non-ad-hoc-ness, and internal consistency as genuinely aesthetic criteria…are also sought in theory selection and development in science.” On the whole, both the epistemic-aesthetic and intuitive aspects of academic judgment are of crucial importance to understanding how variation and disagreement can emerge in the evaluation process.
Outline of a Theory of Agonistic Chance
Chance never emerges out of nothingness but instead grows in the dynamics between the countless courses of events, actions, opinions, and communication flows that are constantly occurring in social life (Bandura 1982; Ermakoff 2015; Popper 1957; Sauder 2020). Scientific work and academic judgment are not exceptions here; chance is embedded, in many different ways, in the creative exchange of ideas taking place between researchers. But how should we conceptualize chance in relation to the peer review process? Here, I present a theoretical framework and a definition of chance, with a focus on the context-specific attributes. Smith (1993) used a concept of chance that I find particularly appropriate to start from and to elaborate on, namely, agonistic chance—a special subtype of “social chance” that signifies “unforeseen consequences of social interaction” and should not be confused with mathematical randomness or mechanical chance. Smith’s (1993) primary motive for mapping out the sociological meaning of chance is to underscore the theoretical importance of using it in social analysis: “sociological models which include chance avoid assumptions of either total chaos or total regularity” (p. 528). Taking social chance into serious account gives us, in other words, a dynamic factor that can be fruitfully combined with other concepts such as agency, process, and organization, in order to deepen our understanding of the complex nature of contemporary societies. Agonistic chance arises from “contests where people compete with one another” (Smith 1993, 527). These are all important ideas, but what I think is missing from Smith’s otherwise interesting account is a closer interpretation of agonistic chance at the group level. It is crucial, therefore, to emphasize the organizational conditions that usually influence the contests and competitions people are involved in (see, e.g., Arora-Jonsson, Brunsson, and Hasse 2020). 
Hence, I will outline a theory of agonistic chance and its contextual attributes, with a special focus on how chance arises in groups organized to make consensus-based decisions. The meeting is the central stage on which this dynamic plays out.
First of all, agonistic chance is closely tied to real struggle or competition (Greek: ἀγών [agon]) and the practice of resolving disagreements. The occurrence of conflicting judgments regarding what should, or should not, be viewed as an exciting and promising research idea is the key to understanding how agonistic chance works in an expert panel group. Disagreements constitute important cutoff points during the negotiations. Analyzing how different cases of disagreement are managed can deepen our understanding of the concrete effects of “the luck of the reviewer draw” (Cole, Cole, and Simon 1981). This, however, requires insight into the fact that disagreement in peer review is never an isolated event. On the contrary, disagreement in this context is always related, in various ways, to other cases of disagreement and to the difficult question of which proposals should ultimately be prioritized. Also, factors such as the order in which proposals are discussed can sometimes have random effects on how different cases of disagreement are resolved. Other things can also play a role in determining how agonistic chance is manifested during the review process, things such as which specific arguments are or are not presented, and what aspects are emphasized when comparing proposals that are teetering on the threshold. Accordingly, agonistic chance is entangled in the very act of exchanging meaningful opinions and creating consensus. Expertise-based feelings can have subtle effects, causing the negotiations to take rather different paths. These deeper intuitive aspects of academic judgment function as the “intellectual fuel” that puts agonistic chance to work in situations when differences turn into concrete disagreement.
I will now present my theoretical outline of agonistic chance by describing five context-specific attributes: (i) evaluative crossroads, (ii) aporetic position, (iii) radical compromise, (iv) collective risk-taking, and (v) fateful events. Let me start with the first and most basic premise. When a case of substantial disagreement emerges in the group and the reviewers struggle to find a resolution, the time pressure that is constantly looming will eventually push them toward what I call an evaluative crossroads. When a group of reviewers has reached this crossroads, the decision becomes more unstable, with some reviewers experiencing a feeling of ambivalence—that they could just as well “go either way” (Roumbanis 2017, 112). This scenario is most likely to occur when one of the three reviewers is forced to choose a side despite a feeling of puzzlement and genuine difficulty in passing judgment; he or she is then caught in an aporetic position between the two colleagues with opposing views. Taking a stand from an aporetic position—on the knife’s edge—leads to a more or less arbitrary decision. Hence, when a panel group has ended up at an evaluative crossroads, this constitutes a hotbed for agonistic chance to arise in the process. Another important attribute emerges when reviewers stand in opposition. It is tied to what I will call a radical compromise, that is, when one of the reviewers in the “triad of reviewers” unexpectedly goes totally against his or her own deepest feeling about the scientific value of the proposal under scrutiny. In other words, a radical compromise occurs at the moment when one of the reviewers involved in a disagreement finally gives up his or her personal conviction in favor of the common interest of the group. Both aporetic positions and radical compromises have their relational origin in the evaluative crossroads, in that they emerge from the “tug-of-war” going on in the panel group.
These two concepts can help to explain the dynamic conditions that play a role in producing agonistic chance in the peer review process.
Another context-specific attribute is the issue of collective risk-taking. In the contemporary research on peer review, there is a common understanding that proposals perceived to be extraordinarily innovative and/or unorthodox are more prone to trigger reviewer disagreement. For example, as Lamont (2009, 154) wrote, “luck is especially important in discussions of the more creative proposals, for which usual standards do not apply and which require collective risk-taking.” Assessment of risk involved in highly creative proposals leads to strategic uncertainty and ambivalence, which, combined with time pressure, can in many cases have a direct impact on how disagreements are resolved. Finally, we should also consider the importance of unexpected events, which can sometimes have a rather significant impact on the management of disagreement within a panel group. I will simply call these events fateful events because they can determine, to some extent, the future pathways of one or a few proposals hovering around the funding threshold. In certain evaluative crossroads, these fateful events may play a decisive role in determining how consensus is finally reached. In other words, these types of events can be important components in explaining the unforeseen consequences of interaction that are taking place in the peer review process.
What happens through the multiple variations and thresholds created by the large number of judgments made is of inevitable importance to the concrete cases of disagreement during the meeting. Here, I believe it is important to highlight the analytical difference between agonistic chance, which is tied to the social interaction during the meeting, and the general social chance that takes place before the meeting when each of the reviewers scrutinizes applications individually. Agonistic chance is, in my view, the most significant subtype of social chance, that is, the “unforeseen consequences of social interaction” (see Smith 1993), and is identified by its necessary link to the evaluative crossroads that arises because of strong disagreements in the group. The elements of chance that affect the individual reviewers in the initial phase of the evaluation process are thus embedded in the mean values and rankings that guide the deliberations within the panel group. And this is exactly what makes agonistic chance relational. In a previous study (Roumbanis 2017), for example, I showed how the experience of chance and the “magic of numbers” can be indirectly related to what I called collective anchoring effects in the peer review process. My theoretical purpose was to make sense of how both similarities and differences in individual scores, anchored before the meeting and mediated by average scores and rankings at the group level, can have a rather unexpected and peculiar impact on the outcome. In this way, I tried to theorize the dynamic relation between social chance and consensus from an aggregated-interactional point of view. In the present paper, the concept of agonistic chance brings us even closer to the very heart of the struggle in the evaluation process, and works as a complementary perspective on the social interplay taking place between different academic judgments.
Methods and Materials
Observational studies on panel groups are rare. Few researchers have been allowed entrance into the meeting rooms of national research councils or private funding agencies. Secrecy and confidentiality are the main reasons for this empirical lacuna. The majority of studies on grant peer review have instead been based on analysis of interviews (applicants, reviewers, and administrators), written documents (proposals, review reports, and peer review handbooks), and bibliometric data (van Arensbergen, van der Weijden, and van den Besselaar 2014; see also Seeber et al. 2021). However, at present, many allocation decisions are made based on the work done by panel groups—one important reason being the large number and breadth of the pool of applications.
The data on which the present study is based were collected during the late summer of 2013, when ten of the SRC’s total of nineteen panel groups in the natural and engineering sciences were observed during their respective two-day meetings. This opportunity to observe these ten groups resulted in a relatively large amount of data. The ten groups were selected so as to achieve disciplinary spread and variation in the proportion of international and domestic reviewers. Each panel group had between ten and thirteen reviewers. In some of the panel groups, all of the reviewers were from Scandinavian countries and spoke “Scandinavian” instead of English. The SRC’s rules require that at least 20 percent of reviewers in each panel group be foreign. Although one group had as many as 75 percent foreign reviewers, most of the groups were dominated by reviewers from Swedish universities. Because several of the groups’ meetings were to take place in parallel during the period under study, I decided to ask two colleagues to help me conduct the observations in some of the groups. Their efforts allowed me to collect considerably richer empirical data, from which I have benefited greatly in my analytical work. Another advantage of having three researchers conduct the observations was that I could compare and discuss my own observations with those of my two colleagues following the meetings.
Ethnography of Meetings
From the very outset, the point of departure for the present study was to describe—with as open a mind as possible—what took place during the groups’ respective meetings. Studying formal meetings has been shown to be a fruitful method that can provide detailed insights into how discussions and decision-making play out in practice (Sandler and Thedvall 2017; Schwartzman 1989; van Vree 2011). What happens during the meetings often gives a rather good picture both of what is at stake in an organization and of the exchanges of views that may have consequences for many more people than those present. Accordingly, studying the peer review process by observing panel group meetings offers a unique opportunity to learn about one of the most important selection and control mechanisms in today’s research community. Briefly, these meetings offer an ideal setting for observing competing academic definitions of excellence and scientific quality. The results finally agreed upon by all the group members during the two-day meetings provided the basis for subsequent funding decisions, which are formally made by the SRC Board. However, the Board usually follows the recommendations made by the collective body of expertise in each panel group, so the time-consuming construction of consensus per se is crucial (Roumbanis 2017). Yet, peer review can be organized in different ways, for example with ad hoc reviewers only or with panel groups in which all the reviewers read all the proposals.
The strategy my colleagues and I agreed upon prior to the observations was to write down as much as possible of what was said and what happened during the meetings. We tried to do our work as unobtrusively as possible so as not to disturb the review process. Although there was always a potential risk that certain individuals under observation might unconsciously change their behavior, our overall impression was that the reviewers were very busy with their own work and seemed unaffected by our presence. On the whole, most of them seemed relatively relaxed about having us in the room. Before every meeting, I gave the members of each panel group a short presentation of my research project. I addressed my general interest in group interaction, negotiations and consensus-making, but avoided going into specific details. The majority of the panelists seemed pleased and confident after my presentation.
What played out during the meetings, when the review work was fully underway, constitutes the most important data in the present study. Still, quite a bit of interesting information emerged during lunch and coffee breaks when the reviewers talked to each other. These conversations sometimes touched on aspects of the review work as well as on all kinds of different, unrelated matters. During these breaks, my colleagues and I also had opportunities to participate in several conversations with the reviewers as well as administrators from the SRC. These interactions also served as a rich source of knowledge about how these individuals thought about their work, and they have contributed both directly and indirectly to my sociological understanding of the peer review process. Because the information I obtained from the many hours of observations was so rich and the conversations we had with the reviewers as well as administrators were so valuable, I decided not to conduct any formal interviews. Furthermore, I did not ask the SRC if I could look at all the hundreds of proposals before or after the meetings because they all involved complex topics in, for example, theoretical physics, quantum chemistry, and mathematics, which would not have made much sense to me anyway. To be sure, I was also more interested in group dynamics and evaluation practices than in the content of the individual proposals per se. All of the individual scores, mean values, and group rankings were visible on the large wall-mounted screen in the meeting room, which made the process transparent and quite easy to follow, even for an external observer like myself.
The SRC Panel Groups
On the SRC website, one can read the following statement about the research grants that are generally considered to be the most prestigious in Sweden: “The purpose of the project grant is to give researchers the freedom to formulate by themselves the research concept, method and implementation, and to solve a specific research task within a limited period. The SRC rewards research of the highest quality.” In this case, “highest quality” corresponds to what the SRC also refers to as “Excellence.” Explicit quality criteria such as “innovation and originality” are central aspects that must be assessed alongside implicit criteria such as “soundness” or “correctness,” which fall under the explicit criterion “Scientific quality.” A total of 1,218 applications were submitted and subsequently distributed between the nineteen panel groups within the natural and engineering sciences. From this pool of applications, 228 were eventually granted funding, giving a success rate of 18.7 percent (SRC 2014). To be sure, the size of the overall budget is always a strong determinative factor, in that it sets the limits for how many proposals can ultimately be recommended for funding by each group (Huutoniemi 2012; Roumbanis 2017).
The size of the budget is of course beyond the control of the panel groups. Also beyond their control are the rules and guidelines surrounding the peer review process. The most important points are written in a handbook that the SRC gives to all reviewers and that is intended to be the basis for the panel groups’ work (SRC 2019). I will not discuss in detail everything in the handbook, but I will touch on some of the more important rules shaping the conditions of review work. One of these rules concerns conflicts of interest, something the SRC takes very seriously and does its best to prevent. Anyone with a conflict of interest must leave the room and must not review the proposal in question if any bias exists at all. Whether or not a reviewer can be considered biased, however, is ultimately a matter of conscience that the reviewer him-/herself must account for. As a panel member, you are obliged to report any conflict of interest that may affect your impartial assessment (positive/negative) of the proposals you have been asked to evaluate. During the meetings observed for the present study, the issue of conflict of interest was usually handled in a routine manner. In a couple of isolated incidents, there were negotiations concerning whether someone in the group could be considered to have such a conflict. One US study (Gallo, Lemaster, and Glisson 2016) showed that many reviewers tend to underestimate several potential risks for conflict of interest and, therefore, fail to report this to the funding agency’s administrators. Thus, it is reasonable to conclude that some conflict of interest can creep into the system despite both the multiple precautionary measures taken by the organization and the reviewers’ good intentions. When proposals require additional expertise for some reason—for example, a conflict of interest or a paucity of specialist knowledge—one or two external reviewers are often called in. 
It is, however, always up to the panel group to determine what role these external reviewers’ statements should play during the group’s deliberations.
Another factor particularly affecting the evaluation of proposals can be tied to how the review methods themselves are designed. The use of numerical scores, for example, together with different kinds of quality criteria and ranking procedures, as well as the number of reviewers who read each proposal, has been shown to affect the prospects of projects that are considered safe versus those that are considered risky. Certain review methods can result in safer, more “solid” proposals having greater chances of being funded than risky projects. The major governmental research councils in Scandinavia and several other countries around the world often use methods inspired by US funding agencies such as the National Institutes of Health and National Science Foundation. Langfeldt (2006) described the effect of review methods on reviewers’ scope of action. According to her, if a funding agency employs average ranking and open decision-making, this approach tends to give greater scope for scientific diversity and innovative projects. This is because enthusiastic reviewers can, to a somewhat greater extent, change their colleagues’ opinions about the proposals they initially considered too risky, peripheral, or immature. Langfeldt also pointed out that having a larger number of reviewers per group who score all of the proposals—in combination with having a finely graded scale and average ranking—tends to promote a more thoroughgoing and predictable final outcome. It reduces the risk of arbitrariness and inconsistency affecting the decision. Still, the question of how many reviewers are optimal is far from settled. The balance that must be struck concerns reaching a decision of the highest possible reliability and quality without the process costing too much in terms of time and resources (see, e.g., Snell 2015).
The SRC employs a relatively open discussion format that takes into account an average ranking based on the scores given by three reviewers, sometimes including one or two external reviewers. A rougher scale is used (1–7), but several of the observed groups improvised by using decimals in their work to sort and rank the proposals. The quality criteria being scored are as follows: (i) novelty and originality, (ii) scientific quality, (iii) merits of the applicant, (iv) feasibility, and (v) overall grade. The “feasibility” criterion is scored on a three-point scale. The verbal tags for the scores on the seven-point scale are as follows: 7 = “outstanding,” 6 = “excellent,” 5 = “very good to excellent,” 4 = “very good,” 3 = “good,” 2 = “weak,” and 1 = “poor.” The reviewers are also expected to make their own, individual ranking of the applications they read. The results of this procedure function as an important supplement to the panel’s scores and mean values when the reviewers are comparing the applications. These individual rankings are often used as arguments during the negotiations (“This is my number two of 27” or “I ranked that one next to last”). Each panel group is led by a chairperson, who plays an important role as discussion leader and as a kind of moderator during the negotiations. There is also a vice chair selected for each panel; he or she takes charge of the discussions if the chair has to leave the room owing to, for example, a conflict of interest. Prior to the meeting day, between 30 and 50 percent of the applications, that is, those with the lowest scores, have already been weeded out because they would certainly not qualify for funding anyway. This is a time-saving mechanism. The actual number of applications the panel members had to evaluate as a group during the meetings we observed varied between forty and seventy-five. Every proposal is presented by the reviewer assigned to serve as its so-called rapporteur.
The rapporteur has primary responsibility for this proposal and begins by offering a brief summary of the research ideas, merits, and the advantages and disadvantages of the project description, finally presenting the score he or she assigned it. At this point, the two other reviewers explain how they have evaluated and scored the proposal, and this is followed by deliberations and adjustment of the scores. Our observations revealed that many proposals were dealt with relatively quickly (very often in less than two minutes) and in a routine manner. Nevertheless, in the four cases of disagreement that I use as illustrative examples here, the discussions were very prolonged. It should be noted, however, that these cases were neither exceptional nor common. In each of the ten panel groups we observed, a number of disagreements occurred. I selected the most distinctive cases, for which there was sufficient material to recreate the scenario in which the disagreement played out.
Results and Analysis
Disagreement, Aporetic Positions, and Agonistic Chance
According to SRC rules, the groups must complete their work within the allotted time. A rather large proportion of the time during meetings is spent discussing proposals that have relatively high scores and are on the threshold of being granted funding. This part of the sorting work is particularly important to the decision-making process as small changes in scoring can change the entire ranking. For this reason, great precision is required. Another category of proposals given more time consists of those receiving completely different evaluations, where the work entails investigating whether there is sufficient scope for a substantial reevaluation. The discussions typically flow without any real obstacles, but not always. Protracted and intense exchanges of views do occur in certain situations. I will now present the first case of disagreement. The chairperson of the panel group in question pointed out early, on the first day, that there was great spread in the scores and that, for this reason, they would need to have several tough discussions to reach consensus. Some of the hardest negotiations that I observed took place in this group. To illustrate this, I will use two proposals that came up in succession during the meeting. In this way, I will try to elucidate how disagreement can be part of a larger game—a game involving the new ranking the group is struggling to create. My intention is to let the meaning of agonistic chance emerge. In the situation I will try to provide some insight into, one of the reviewers had given high scores to a proposal he truly believed deserved funding. In his view, this was a very innovative and daring proposal. The other two reviewers, however, had given much lower scores for somewhat different reasons.
“I ranked him high last year too. He comes from the US originally, but now works at X University. He has an independent position there and I have a feeling he’s really trying to do something original. He has his own ideas. He hasn’t produced so many articles, but I like that he makes active choices. This is a really good project!”
“This proposal is next to last in my ranking. Sure it’s new. But in my view the likelihood of it flying is low. He hasn’t convinced me at all. I might be too critical, but….”
“Yeah, it’s a high-risk project, and I don’t dare promise it will fly.”
“I think it seems like a fun project, exciting. But I don’t know much about this area.”
“There are some pretty crazy things, but it’s fun.”
“Maybe funding something so high-risk isn’t a good idea.”
The above exchange was followed by a thorough discussion of risk and how it should be evaluated in relation to the degree of innovation. How should they think about innovation versus the risk that the project might fail? Someone in the group pointed out that certain proposals at the top of the ranking also entailed quite a bit of risk and that this is perhaps always unavoidable with projects that are trying to break new ground. Another reviewer added that he found innovation extremely difficult to evaluate. For instance, how can you determine the difference in innovation between two projects that are very similar? There was a general feeling of ambivalence within the group. Reviewer 1, above, tried in various ways to convince his colleagues that this project was not just a copy of established research but on the contrary demonstrated great courage and a genuine effort to be innovative. This was the main reason, he pointed out resolutely, he wanted to fund this proposal in particular. Still, another colleague soon responded by saying that if reviewer 1 truly believed this project idea was of such great interest, then perhaps the group should consider moving it up all the same. However, reviewer 2 did not agree to move the proposal up in the ranking, the reason being that he thought reviewer 1 was wrong and that there was too much risk involved in the project. The two reviewers struggled to convince reviewer 3, who at this point seemed more uncertain than ever before. After a while, the chairperson reminded them that they had to hurry because they had many other applications to discuss. She tried to force the three reviewers to compromise; they had reached an evaluative crossroads. Reviewer 3 said she really felt ambivalent, and she repeated that the topic of this proposal was not in her area of expertise. At this point, reviewer 3 was caught in an aporetic position. 
Finally, she decided to support reviewer 2 despite her frustration over not being sure that was the correct decision. And so, this ordeal was over, and they decided to adjust the scores such that the proposal ended up beneath the threshold.
Given that, in peer review, no proposal is judged in isolation—but on the contrary is always considered in relation to several other proposals—it is interesting to look at the situation that immediately followed to illustrate how disagreement can have ripple effects. In the situation below, the roles were reversed and reviewer 1 presented his strong objections to a proposal reviewer 2 praised very highly:
“This is a very interesting proposal. The project is original and innovative. She’s well qualified and I think her list of publications is strong.”
“This didn’t feel so new to me. What is it that’s new?”
“I reacted to the fact that she doesn’t talk about how she is going to build everything up. It’s the more trivial part she has to describe better,…but sure, she has published some good stuff. But she should have formulated a model in kinetic terms.”
“I’m very concerned about the lack of innovation and risk-taking in several of the proposals that I read. Should the Research Council throw out people who really dare to do their own research and refuse to follow the mainstream? We must be able to talk about this! The people who sit in safe research environments and are doing safe research shouldn’t get all the money! And this is also the reason why I don’t think this project deserves to be funded. It’s not bold enough.”
“Seems difficult to increase her ranking then?”
“This is a very solid project, but it’s also innovative. I’m saying this based on my feeling that she will succeed. That’s how I see it.”
In the discussions surrounding the two proposals I have put forward, both reviewers 1 and 2 referred to their “feelings” about the respective proposals’ high quality and degree of innovation. This can be interpreted as a reference to their own personal scientific intuition being a kind of guarantee for the projects’ hidden potential. Still, neither of the proposals was granted funding, though my impression was that both of them might well have crossed the threshold if the circumstances had been somewhat different. At the same time, the entire scenario provides a picture of the importance of uncertainty and agonistic chance during negotiations, by showing how individual reviewers’ attitudes not infrequently have a crucial influence on the fate of several proposals. The discussions in the group sometimes concerned how various qualifications or statements should be evaluated. But concerning the central part of the evaluation—interpreting the purely scientific content—an invisible boundary line in the negotiations seemed to emerge. The group could come no further than to agree to disagree, and so they ended up at an evaluative crossroads. Even though reviewer 3 in this last case initially liked the project and later also explicitly agreed with reviewer 2 that it was of high quality, he nevertheless had some reservations. Still, he balanced between the roles of being either supportive or nonsupportive in this particular case (he did indeed seem to be able to go either way). He was “trapped” in an aporetic position, and for a moment, he seemed rather perplexed by the situation. But after having listened once again to reviewer 1’s critical comments about this proposal not being exciting enough and so forth, reviewer 3 decided not to support it. Everything took place while maintaining a professional spirit and a pragmatic disposition. 
At the same time, an atmosphere of resignation could be perceived among the reviewers, who were forced to see some of their favorite proposals eliminated. All of this is an inescapable part of peer review work: systematically weeding out even proposals that certain individuals in the group, relying on their own wisdom and expertise, find highly interesting and promising. On a more general level, this is a matter of an individual reviewer’s positive feelings never really being enough to raise a proposal over the threshold for funding when the competition is too great. But this always occurs in combination with the fact that, in other negotiation situations, reviewers can sometimes imagine shifting in one direction or the other depending on how such a move might help in resolving conflicts (see, e.g., Roumbanis 2017). Agonistic chance implies that management of disagreement in one situation can have different kinds of consequences for how disagreement is resolved in several other situations. In the panel group from which I have taken examples, it was palpable how disagreement often lingered in the air even when it was time to discuss subsequent proposals. To develop our understanding of agonistic chance in the peer review process, in the next section, I will put things in a somewhat different light, my goal being to gain increased insight into the relational nature of chance.
Disagreement and Radical Compromise
Coincidences often play a rather important role in the various stages of the review process; they also contribute to our further understanding of agonistic chance. As I have already mentioned, most time is typically spent on proposals that are perceived to have a fairly reasonable chance of being funded. These are often the proposals that have received high average scores. They are often the most difficult to separate, or as one reviewer pointed out, referring to a few proposals positioned near each other in the ranking: “They are really close, so in some sense it is impossible to choose.” Here, I will describe a case from another panel group in which the situation was somewhat different from the cases discussed above. The group’s chairperson was highly critical of the content of a proposal he had evaluated. At the same time, the other two reviewers liked the proposal a great deal, causing the negotiations to get tougher.
“This is a long-standing problem, and he [the grant writer] will make an effort to continue thinking about this unsolved problem in a really innovative way. His theoretical approach is brilliant. That’s why I gave him high scores, 7, 6, 7, 3, and 6. Maybe the overall score should even be a 7?”
“I had sixes all the way, and 3 for feasibility. This is someone who really thinks outside the box!”
“When I first read the proposal, I liked it. But then, after my second reading, I saw some serious problems that gave me the feeling that there is something wrong. It doesn’t make sense! So, my grading is based on disappointment. Is this really doable?”
After the above exchange between the three reviewers, it seemed rather unlikely this proposal would be prioritized for funding. The chairperson was very clear in communicating his negative views on the project, which were reflected in his low scores. His personal attitude was also rather strict, and he had a natural authority in this particular group. The other two reviewers, however, stuck to their opinion that this was an excellent project that absolutely ought to be funded.
“My feeling was that this is a great project and that the framework for developing the problem is good. I can’t see any real issues concerning the feasibility.”
“All the math is correct. But going back to fundamentals, deriving the first law, the second law and so forth,…my conclusion is that it probably won’t work. Well, I like the originality and it’s an important problem. But I had trouble with the scientific quality, something important is missing.”
After a while, the discussion had ground to a complete halt. There was nothing wrong with the merits of the applicant—all three were in agreement there. But regarding other things, there was complete disagreement. Nonetheless, time went by, and they were forced to move on. How could consensus be achieved? The simplest solution would certainly be to let the negative judgment decide the outcome, just as it had done for several proposals they had gone through previously. One of the administrators even asked whether it might be worthwhile for someone else in the group to read the proposal, stating that “You really do disagree a lot.” Yet, this is not what happened. Instead, the chairperson gave in, though not without objection:
“The merits is not 7, it’s more like 5. Now I have given up everything!”
“What about scientific quality? I think it should be a 6.”
“I agree.”
“I can’t really judge the quality of a project if I believe it’s wrong. But I’m happy to give up everything. Either it is completely wrong or really, really, new. If it’s correct, then it might be a contribution.”
In the chairperson’s view, there was too much uncertainty and risk associated with the project as a whole. He nevertheless joined his colleagues in the collective risk-taking, and they agreed on giving scores of 7, 6, 5, 3, and a summary score of 6, which meant that this proposal—one that seemed doomed to failure at the outset—was granted funding.
But what was it exactly that caused the reevaluation of innovation and risk to take a different turn regarding this particular proposal in comparison with the cases taken from the abovementioned group? Was it simply that “two positives against one negative” got to rule the day as opposed to “one positive against two negatives”? Uncertainty on the part of some of the reviewers was a factor influencing the outcome. But why did the struggle end up in this unforeseen construction of consensus? The answer is that a radical compromise had been made—something that can never be fully comprehended. In several other previous situations during the negotiations, the chairperson of this group had been rather forceful in holding up certain proposals and cutting down others. Disagreement had built up such that he, purely with regard to the negotiations, was in debt to the others and needed to show his willingness to compromise, even though this was difficult from an intellectual and scientific standpoint. The reason why this chairperson decided to let this proposal get through—but none of the other ones he had actively brought down at the finish line—would seem to be in part a result of the discreet and multifaceted effect agonistic chance has during the process. But a radical compromise cut through what we anticipated, namely, that the chairperson would not let this proposal pass. And yet, in the end, he did support it. If this proposal had been considered earlier, perhaps it would not have been granted funding. But that is mere speculation. Another more important issue is the problem of inconsistency, which can bring new meaning to the disagreement I illustrated above. When reviewers are forced to improvise in protracted negotiations, it can make them more prone to passing inconsistent judgments.
At the end of this discussion, one of the other reviewers in the group pointed out that she had noticed a great deal of inconsistency in how some of her colleagues applied their quality standards to different proposals. In her view, this inconsistency made the process look more like a lottery.
Disagreement and Fateful Events
I will now present another case of disagreement. In the panel group in question, reviewer 1 had given 6s and 7s to a cross-disciplinary proposal he felt definitely presented a world-class project. He said that this proposal was next best in his personal ranking and that the principal investigator (PI) on the project was among the top in his field of research. The other two reviewers who had evaluated the proposal were more moderate in their scoring and were generally not quite as enthusiastic.
“He is developing new important mathematical models for computer simulations of x, y, z,…and I’m confident he will make valuable contributions! This is a project of the very highest quality, and he has many strong publications in top international journals. He’s a real superstar!”
“I’m not as enthusiastic as you are. It’s a very big project. How is he going to do all the things he describes in the application? He hasn’t convinced me.”
“I think it’s a fascinating project. But I feel a bit skeptical about the theoretical framework […] Is it really plausible to assume A if B? I’m not sure. But okay, this is not my main area of expertise, so I may have misunderstood the details.”
“The theoretical framework is the state of the art in this field of research. I want to emphasize that this is a very novel approach.”
“Do you want to add anything to the discussion, Reviewer 2?”
“I don’t agree with Reviewer 1. But I’m not going to fight about it.”
After the deliberation, some of the reviewers did not seem to be particularly happy with the preliminary scores (5, 6, 6, 3, and 6) assigned to this proposal. To be sure, no one could deny that the PI on the project had a very good reputation in his field, with frequently cited articles in Science, PNAS, and several other highly ranked journals. After reviewer 1 had succeeded in convincing his two colleagues of the project’s fantastic potential, there was much to indicate that it had a good chance of being funded. Still, the feeling of uncertainty remained among the reviewers regarding the ranking because they could see how it influenced other proposals that they believed were more deserving of funding. But all of a sudden, a fourth reviewer entered the discussion. This reviewer had not read the proposal but pointed out that the project topic was definitely within his field of expertise. Because some doubt and uncertainty still remained after the discussions, he offered to read the proposal during the evening. The chairperson agreed to this, and the group continued its work. This was, indeed, a fateful event, as it gave a new twist to the struggle in the group. To understand how agonistic chance is generated and can take effect, it is particularly important to pay attention to such events in the context of scientific evaluation.
On the following day, when this proposal was to be discussed again, the new reviewer offered devastating criticism. Among other things, he argued that the cross-disciplinary dimension of the proposal clearly revealed a lack of understanding of developments at the forefront of one of the disciplines and that the proposal was based on a much too reductionist perspective with regard to one decisive theoretical aspect. Reviewer 1 commented on this new information by saying “I accept your viewpoints, but I still believe this is a very good proposal. I can agree, though, to move the proposal down a bit on the list.” In this case, it was clear that reviewers 1 and 4 had completely different theoretical perspectives. Another reviewer whose favorite proposal had been lowered in the ranking the day before gave voice to her frustration: “We spent eight hours discussing this yesterday. Our views might have changed slightly. The top proposals have overall 7s, and then there is a group of proposals that have overall 6s. The next row of proposals in the ranking have 5s. Are we happy with this? Should the 6s really be 6s?”
Concluding Discussion
One important piece of the puzzle in understanding the impact of chance in peer review can be found in the question of how members of the panel group manage disagreement. That has been the main topic of the present study. From a theory of science perspective, the issue of reviewer disagreement has often been perceived as both perplexing and deeply problematic, as it leads to a kind of unavoidable relativization of the concept of scientific quality. What does it mean that a single proposal can at once be considered brilliant and mediocre, depending on who is examining its contents? In grant peer review, the notions of quality and excellence are fundamentally context-dependent. The cognitive particularism/distance and the intellectual sensibility (“epistemic-aesthetic feelings”) of each individual reviewer can play a decisive role in the construction of consensus. But the true quality of a research project can never be fully observed in a proposal because it is impossible to predict future discoveries and breakthroughs in science (see, e.g., Roumbanis 2021). In light of this general condition, expert disagreements seem to be an inescapable part of the selection process. Still, there can also be a genuinely productive side of disagreement, as it creates zones of intensified intellectual engagement (McMahan and Evans 2018). Nevertheless, in the peer review process, strong disagreements can have considerable effects that influence the distribution of resources and indirectly the fate of scientific progress. In fact, a number of recent studies on peer review have discovered surprisingly low levels of agreement in how reviewers judged scientific quality and translated it into numerical scores (Pier et al. 2018; see also Brezis and Birukou 2020; Jerrim and De Vries 2020). It is, however, during panel group negotiations that the great variations in scores may develop into disagreements. 
The main reason for using panel groups in the first place is that they constitute a social adjustment procedure—overlapping areas of expertise can create, through dialogue and deliberations, a more reliable final outcome—that serves to reduce uncertainty and justify decisions (Huutoniemi 2012; Lamont 2009).
But despite all this, the basic question remains: what happens in practice when people are in disagreement, but nonetheless have to make a decision? And the answer is that new elements of uncertainty, arbitrariness, ambivalence, and new collective forms of bias emerge at the group level instead. What I have attempted to show in the present study is that, in many cases, disagreement leads to the elimination of some of the reviewers’ favorite proposals. It is often very difficult to negotiate proposals up in the ranking when only one reviewer’s judgment is completely positive; after all is said and done, the purpose of this evaluation method is to weed out the majority of proposals that should not be granted funding. This may be considered mechanical. But appearances are deceiving because social chance is embedded in this organized decision-making procedure. First, the fact that some proposals end up in a certain negotiation position can be explained: small and large differences in judgment are translated into numerical scores that affect the preliminary ranking. This is a matter of certain judgments and scores happening to coincide (before the meeting), creating a certain kind of evaluative breeding ground for upcoming decisions (Roumbanis 2017). In the present study, my goal was to focus more directly on what happens when the selected reviewers have to jointly handle substantial disagreements under time pressure. For this purpose, I developed an outline for a theory of agonistic chance, with a special emphasis on its context-specific attributes. I introduced a conceptual toolbox to frame agonistic chance, a toolbox that contains concepts such as evaluative crossroads, aporetic positions, radical compromise, collective risk-taking, and fateful events. These concepts make up an interpretative framework for understanding the dynamic relation between strong disagreement and consensus-making during the panel group meeting.
Agonistic chance occurs particularly in situations when disagreement emerges concerning how one, two, or several proposals should be evaluated and ranked. Different variations can play a crucial role during the struggle to reach consensus; things that are said or not said can affect which proposals are granted or not granted funding. It is also very important not to forget the deeply relational nature of agonistic chance. In some situations, previous cases of disagreement can spread to upcoming proposals about which the reviewers also disagree. This is a manifestation of agonistic chance and singles out the unforeseen consequences of the struggle itself during the negotiations. During the meetings, when disagreement arose in the panel group, power struggles, emotions, and coincidences all played a part in the genesis of agonistic chance. But what may actually amplify agonistic chance and give it a certain direction during negotiations are what I have called fateful events. Such events may be decisive in shaping the pathways the competing proposals end up on in the process of ranking.
In the present study, I have tried to show how risk is carefully considered in the review process; risk is not ignored but is instead a natural topic of discussion during the meetings. On the other hand, a delicate balancing act is required to discern the reasonableness and value associated with investing in a project that is encumbered by great uncertainty and that may well fail. A number of previous studies have shown that highly original and challenging proposals are more likely to go unfunded because they tend to be perceived as hard to understand, strange, risky, or impossible to carry out (Chubin and Hackett 1990; Luukkonen 2012; van den Besselaar, Sandström, and Schiffbaenker 2018). The truth is that most individuals involved in the review process are trying to do their very best to arrive at the most satisfactory solution. Even highly innovative and bold proposals can get funded, sometimes against all odds. I tried to illustrate this with the third case. If one of the reviewers involved in a disagreement eventually goes against his or her deepest feelings, then that marks what I have called a radical compromise, in relation to collective risk-taking in the group. Or, in some cases, it may be that a third reviewer ends up in what I called an aporetic position between the two colleagues in disagreement. This position may eventually push him or her to choose sides if the other two cannot resolve their differences.
But if the truly daring, “crazy” project ideas are to be funded, several panel group members must see the same potential, and herein lies the snag in the peer review process. In all basic research aimed at taking on the unknown, nothing short of new, daring ideas is required. However, which of these ideas will result in pioneering findings is difficult to forecast. Even seemingly small and unimportant contributions can sometimes be of invaluable significance for future breakthroughs. It is, therefore, not so strange that great variation and disagreement occur when expert reviewers meet to determine the fate of many promising applications. Agonistic chance is born in the wake of disagreement.
Acknowledgments
A special thanks to Adrienne Sörbom and Staffan Furusten for their generous help with the data collection. I would also like to thank Moa Bursell and Richard Swedberg for valuable discussions concerning the topic of this paper. Thanks also to the editorial team at ST&HV and the anonymous reviewers for critically and constructively scrutinizing my manuscript. And finally, I would like to express my sincerest gratitude to all the members of the ten panel groups, for allowing me to realize this study.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
