Abstract
Evaluation is ubiquitous in current (academic) science, to the extent that it is relevant to talk about an evaluation regime. How did it become this way? And what does it mean for scientists, groups, organizations, and fields? Picking up on the inspiring debate in a previous issue of this journal, the four articles in this special section study the causes and consequences of the current evaluation regime in (academic) science in greater depth, contributing new insights as well as opening important new routes for further investigation. This introductory essay provides background and a framework for the special section and points out some key takeaways from the articles included.
The evaluation of scientific research by means of general and mostly quantitative indicators, at the individual, group, organizational, and national levels, is a salient feature of contemporary academic life. Several critical analyses have lately been published, extending to the governance and funding practices that make use of evaluation and their (mis)alignment with academic self-organization and the long-term objectives of scientific research (e.g. Biagioli and Lippman, 2020; Espeland and Sauder, 2016; Hallonsten, 2021; Muller, 2018; Pardo-Guerra, 2022; Watermeyer, 2019). The topic is also contentious in research policy and governance, with several national and international bodies addressing excess evaluation and its potentially harmful effects on academic freedom and research productivity. A proper scholarly understanding of the background and causes of the current situation has, however, only begun to develop. Likewise, the scholarly analysis of the consequences of ubiquitous research evaluation is still at an early stage and in need of more systematic contributions.
In this special section, a number of scholars contribute such analyses. The contributions deal with both the historical and institutional origins and causes of the current evaluation regime in (academic) science and the consequences of this development.
Background
In early 2021, this journal initiated a debate by publishing a provocative essay claiming that the ubiquitous practice of evaluating science is based on misunderstandings with historical and institutional origins, and that current research evaluation mostly misses the mark because it is insufficiently aligned with the social and intellectual organization of scientific knowledge production (Hallonsten, 2021). The honest intent of that provocative essay was to stimulate a continued conversation, in the pages of this journal and elsewhere, on this very important topic.
The debate that followed (SSI Volume 60, No 3) both revealed a broad interest in the topic and clearly displayed the need for further work. It highlighted many important aspects of research evaluation and revived several important discussions of this contentious phenomenon, but its reach was nonetheless somewhat restricted: the 12 shorter debate pieces were aptly written in response to the original essay, and their most important contributions were new questions and the opening of new perspectives.
This special section picks up some of those threads, in an attempt to continue the discussion – this time with stronger empirical foundations and clearer analytical ambitions. Four articles make complementary contributions toward a further understanding of important aspects of the causes and consequences of the current ubiquity of evaluation in science. As the four contributions demonstrate, each in its own way, this ubiquity is strong enough to warrant talk of an evaluation regime in academia.
Causes
One of the claims most frequently heard in the debate over research evaluation is that it has been imposed on academia and academics by politics and bureaucracy, in an attempt to increase control over previously autonomous academics (e.g. Hallonsten, 2021; Muller, 2018; Münch, 2014). The argument is somewhat oversimplified, since research evaluation has grown to its current prominence not only as a result of the colonization of science by other spheres of society but also through internal developments (Edlund et al., this issue; Schneider et al., 2021). Nonetheless, there is enough historical evidence to warrant the conclusion that academic science has indeed been put under increased scrutiny from politics, bureaucracy, and the economy over the past hundred years.
The ambitions behind this development have been both pragmatic and ideological, depending somewhat on viewpoint. The structural transformation of society and the economy, which brought harsher (international) competition and made knowledge one of the key resources of advanced economies, intensified the need to ensure that only productive scientists and useful science receive financial support and institutional backing. Other accounts blame a broader ideological shift in Western societies that began in the late 1970s – usually summarized as the advent of neoliberalism – for the changes to the governance of academic science that involved a broad rollout of evaluative practices.
Two of the articles in this special section address these arguments and add useful knowledge and complementary perspectives on the development. Peter Edlund, Linda Wedlin, and Lars Engwall demonstrate, using a general model of the governance of organizations, that research evaluation is neither exogenous to academia nor attributable to one specific development path or actor group. Quite the opposite: as the authors conclude, several actors and actor groups, on different levels and in different parts of the system, bear collective responsibility for the present situation – including not only the political and bureaucratic structures governing academia but also market actors that produce and promote the ways and means of evaluation, and practitioners who actively and passively use and reproduce them. In particular, it seems that the relationships between these actors are what most strongly create and sustain the evaluation regime, through a variety of responses to incentives and impulses created in the dynamic interplay of politics, management, administration, market forces, media, and research itself. Providing a much-needed analysis of these interrelations, and thus a far more nuanced and better-underpinned answer to the question in the title of their article – Who is to blame? – Edlund et al. make a very useful contribution toward a deeper and better understanding of the causes of the current evaluation regime in academic science.
Similarly, Björn Hammarfelt and Olof Hallonsten take on the question of who is to blame, but from a different angle, questioning the proposition that research evaluation is ‘neoliberal’ or has been driven by a ‘neoliberal’ political and ideological agenda. They operationalize this both through a conceptual discussion of how ‘neoliberalism’ is understood in scholarly works, and whether it is useful as an analytical tool, and through an overview of four different national evaluation systems and how they were implemented. The article concludes that neoliberalism, in order to function as a root cause of the proliferation of research evaluation practices, must be given a narrow definition as a limited set of concrete principles for organizing society rather than a wide interpretation as a hegemonic ideological project. The empirical overview provides an additional basis for this argument, showing that many features of national research evaluation systems can indeed be attributed to the influence of ‘neoliberal’ policies, but only if they are discussed in some detail and in tandem with an analytical treatment of ‘neoliberalism’ as a conceptual tool with some explanatory value but no all-encompassing capacity.
Edlund et al. are keen to point out that the complex web of actors and actor groups behind the proliferation of evaluation in academic science includes practitioners, that is, researchers themselves. They – we – both assist other actors (actively and passively) in developing metrics and use those metrics, sometimes enthusiastically, which builds demand as well as legitimacy and makes evaluation part of everyday academic life.
Consequences
The important observation that the academic community has played a part in the growth of research evaluation over the past decades involves not only a sharing of the blame but also more profound issues pertaining to the changing dynamics of the social and intellectual organization of the sciences. Evaluation changes the behavior of those who are evaluated (Merton, 1936; Ravetz, 1971: 295–296), even at early stages of its implementation. The risk that scientific practice and science evaluation co-evolve to fit each other and serve each other’s needs, producing neatly evaluable but possibly unoriginal and unimportant scientific results, has been highlighted in previous studies (e.g. Martin, 2011; Tourish, 2019; Weingart, 2005). Causes and consequences are conflated, and never has the need for deeper studies, built on rich empirical material and firm theoretical grounding, been more clearly articulated.
Fortunately, this special section also contains such contributions. Oliver Wieczorek, Richard Münch, and Daniel Schubert have conducted an ambitious and deep study of the development of British sociology through two consecutive national evaluations, the Research Assessment Exercise (RAE) in 2008 and the Research Excellence Framework (REF) in 2014, to find out how these major evaluative interventions have affected the field. Using a large dataset and a statistical topic model to identify commonalities among article abstracts, they are able to demonstrate a rather low degree of diversity, especially at sociology departments that scored high in the evaluations, and a significant general realignment of topics toward what is commonly referred to as the ‘REFable’ – that is, toward what fits the criteria and standards of evaluations. A powerful contribution to the study of what is usually called ‘Campbell’s Law’ – that any measuring of a social process will distort and corrupt the process itself (Campbell, 1979) – the article by Wieczorek et al. (this issue) spurs the formulation of important new research questions pertaining to the behavior of scientists, departments, universities, fields, and whole research systems in response to evaluations.
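For readers who want a concrete sense of what such an analysis involves, the following minimal sketch illustrates one common way of fitting a topic model to article abstracts and quantifying topical diversity. It uses scikit-learn’s LDA implementation and an entropy-based diversity measure purely for illustration; the corpus, model choice, and diversity measure are our assumptions, not a reconstruction of Wieczorek et al.’s actual pipeline or data.

```python
# Minimal illustration: fit a topic model to abstracts and measure
# topical diversity. NOT Wieczorek et al.'s pipeline; the corpus,
# model choice, and diversity measure are all assumptions.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Hypothetical corpus: each entry is one article abstract.
abstracts = [
    "social mobility and class reproduction in post-industrial Britain",
    "gendered labour markets and the welfare state",
    "ethnographic study of urban youth subcultures",
    "quantitative analysis of educational inequality",
    # ... in practice, thousands of abstracts per assessment period
]

# Bag-of-words representation of the abstracts.
vectorizer = CountVectorizer(stop_words="english")
X = vectorizer.fit_transform(abstracts)

# Fit an LDA topic model; the number of topics is a modelling choice.
lda = LatentDirichletAllocation(n_components=3, random_state=0)
doc_topics = lda.fit_transform(X)  # rows: documents, cols: topic shares

# One simple diversity measure: Shannon entropy of the average topic
# distribution. Lower entropy means output concentrated on fewer
# topics, i.e. less topical diversity.
mean_dist = doc_topics.mean(axis=0)
entropy = -np.sum(mean_dist * np.log(mean_dist + 1e-12))
print(f"Topical diversity (entropy): {entropy:.3f}")
```

Comparing such a diversity score across departments or assessment periods is one way, among several, to make claims like ‘lower diversity at high-scoring departments’ empirically tractable.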
Academic science has its own quality assurance mechanisms, and there is quite obviously a potential conflict between these internal standards and the metrics on which most current research evaluation is based. Stuart Macdonald claims, in his contribution to this special section, that journal peer review – that is, the specific type of peer review used in the selection of manuscripts for publication in scientific journals – has been compromised by the use of quantitative performance indicators and rankings. The result is widespread ‘gaming’ of citations: by scientists, by organizations, by journal editors, and by publishers. Discussing the ubiquitous Journal Impact Factor (JIF) and how it is ‘gamed’ by authors, editors, and university management, Macdonald (this issue) argues not only that these dubious practices involve tricks for getting papers published and extensively cited but also that the JIF itself breeds behavior that goes squarely against the scientific ideals of originality, creativity, and critique. The article devotes specific attention to the field of medicine, where the problems seem especially severe and where the financial interests of the pharmaceutical industry add complexity and further sources of gaming, or even corruption. Closing with the suggestion that medicine is leading the way in academic science’s general deterioration into depravity, through continuous compromises to the integrity of scientists, journals, and universities, Macdonald warns that the consequence of evaluative metrics for science as a whole may very well be its collapse.
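To make concrete why the JIF lends itself to such gaming, it may help to recall how the indicator is conventionally computed (a standard definition added here for background; it is not drawn from Macdonald’s article):

$$\mathrm{JIF}_{Y} = \frac{\text{citations received in year } Y \text{ by items published in years } Y\!-\!1 \text{ and } Y\!-\!2}{\text{number of citable items published in years } Y\!-\!1 \text{ and } Y\!-\!2}$$

Both sides of the fraction are manipulable: the numerator can be inflated through coercive citation and self-citation arrangements, while the denominator can be shrunk by classifying content as editorials, letters, or other ‘non-citable’ material that nonetheless attracts citations.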
Learnings
Radically dystopian views aside, there is much to be learned from the ample and consequential examples provided by Macdonald’s interesting and colorful contribution as well as by the other articles. What all four have in common is that they challenge prevalent views and conceptions of how science is organized and how special interests and stakeholders align (and misalign) in the collective process of developing scientific knowledge and evaluating its quality and relevance. They thereby fulfill an important ambition behind this special section, namely to make a substantial contribution to the advancement of our knowledge about the causes and consequences of the current evaluation regime in (academic) science, and to do so in genuinely original, creative, and critical ways.
Allowing ourselves a slightly normative bent toward the end of an otherwise strictly descriptive introduction, it is apt to reiterate that a key consequence of the ubiquity of research evaluation today seems to be the redefinition of science from process to product (Hallonsten, 2021: 18). This commodification of scientific knowledge takes many shapes (Radder, 2010), but we should note that research evaluation today is carried out almost exclusively by counting peer-reviewed journal articles and the citations of these in other peer-reviewed journal articles. This has contributed to the rather bizarre habit of viewing peer-reviewed journal articles as commodities, and as such priced, quantified, and traded. It should come as no surprise if this view of scientific knowledge restricts creativity and stymies important debate by compelling scientists to craft their journal publications so that they close the case on whatever topic they concern, strangling conversation instead of stimulating it.
It is not unlikely that future analyses will show that such a redefinition of science from process to product is also among the causes of the current evaluation regime – in fact, Edlund et al. (this issue) are on to something similar when describing how both universities and individual scientists play along in the commodification driven by evaluators and evaluation developers – but it is a devastating change in the public understanding of science. A process perspective is necessary in order to understand many of the key features of scientific knowledge production, ranging from philosophical considerations of truth and evidence (Leng and Leng, 2020) to the organizational-sociological understanding of how scientific work is done (Hallonsten, 2022): only by acknowledging that science is collective, cumulative, and continuous can we make sense of its seemingly unfathomable clutter of results, claims, contradictions, replications and refutations, points and counterpoints, competition, and collaboration.
In this perspective, too, the four contributions to this special section are exemplary. While they all make important and interesting contributions to our knowledge about the causes and consequences of the current evaluation regime in (academic) science, they also give rise to important and interesting further questions. Can the most damaging effects of performance evaluation in science be avoided? Can academic science regain its autonomy, in general and specifically in terms of greater power over the quality appraisal that the surrounding society will likely continue to expect? Or will evaluative metrics weed out all of science’s crucial and expected diversity and ingenuity? Will they corrupt the integrity of scientists, and of the institutions of science, to the degree that they lose all steadfastness and reliability?
Perhaps the reader needs no reminder that it is the combination of results and further queries, or of certainty on some key points and openly declared uncertainty on others, that constitutes a proper contribution to the scientific commons. No journal article, nor any issue or section of a journal, will ever be able to conclude the conversation on any topic once and for all. Making a substantial contribution to the fruitful continuation of that conversation is, however, precisely what scientific publishing is all about. In this capacity, the articles in this special section have fulfilled their sacred duty. We all look forward to the continued conversation.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
