Abstract
More than 20 years after the first round of the OECD’s Programme for International Student Assessment (PISA), it has become one of the most important large-scale international assessments, at a global level and in particular in Europe. Thus, a growing number of scholars have examined or discussed its research design. Nonetheless, a key feature of PISA seems to have been less examined by critical works: the discourse of the research in the PISA literature. Indeed, the PISA reports appear to be scientific literature, both in textual form and in terms of the intentions expressed. Based on the reports published by the OECD since 1999, this article examines, in line with the French tradition of discourse analysis, the characteristics of research writing in PISA and the explanations of its conceptual approach, test design or result analysis. This analysis shows the differences between the research discourse in PISA and typical scientific rhetoric, particularly in terms of constructing a contradictory theoretical dialogue or a discussion of the limits of its approach. Finally, the article questions how these differences indicate a particular construction of education research within PISA and its possible relationship with the global or European dynamics of this scientific field.
In recent years, educational research has taken a great interest in international large-scale assessments and, more specifically, in the Programme for International Student Assessment (PISA) conducted by the Organisation for Economic Co-operation and Development (OECD). Indeed, it would have been difficult for scholars not to take note of the success of this triennial survey, one manifestation of which is the widening of its coverage: 43 countries and economies were involved in the first round of its test (2000), whereas by 2019 Andreas Schleicher, Director for Education and Skills at the OECD and supervisor of PISA, could claim that ‘today, PISA brings together more than 90 countries, representing 80% of the world economy, in a global conversation about education’ (OECD, 2019b: 5).
This success is linked to a wider expansion of transnational school accountability and benchmarking policies, particularly noticeable within Europe (e.g. the Bologna Process and the European Higher Education Area, the European Qualification Framework, the Common European Framework of Reference for Languages, etc.). Presenting a research project on the role of PISA in framing and steering education policy at a European level, Ozga (2012: 167) explains that in the education policy space, ‘policy actors use new policy instruments and project the idea of “Europe” through the redesign of institutions, the organisation of networks and the flow of comparative knowledge and data’. Ozga (2012: 166) also states that the OECD’s programme has a role in the Europeanisation process, ‘in shaping new spaces of governance, while also enabling a move from political to technical accountability’.
In this framework, PISA has become one of the schemes most regularly referred to concerning performance comparisons in scientific debate (Volante and Fazio, 2018). Numerous collective publications have consequently been devoted to providing an overview of the theoretical and empirical debates on PISA, taking a more or less critical view. For example, the volume edited by Hopmann et al. (2007) explores the key issues of the construction and use of PISA; Pereyra et al. (2011) assemble texts dealing with questions of comparisons, assessment tests and knowledge of school systems generated by the programme; the book edited by Meyer and Benavot (2013) gathers works studying the role of the programme in new forms of global educational governance (on this point, see also, more recently, Teodoro, 2022); Volante (2018) highlights conceptual and methodological problems of this survey and some national and transnational issues of its use; and the volume edited by Waldow and Steiner-Khamsi (2019) analyses, in different countries, policy makers’ and media interpretations of PISA data and results according to their own national narratives and views.
Various journals have also devoted special issues to PISA. For example (the list does not claim to be exhaustive): the Revue française de pédagogie published ‘PISA : analyses secondaires, questions et débats théoriques et méthodologiques’ (Rochex, 2006a); Educational Research and Evaluation published ‘Cross-Cultural Comparison of Group-Related Educational Inequality: The PISA 2000 Study’ (Peschar, 2006) and ‘Cross-National Studies in Student Performance with PISA and TIMSS Data’ (Dronkers et al., 2009); the European Educational Research Journal brought out ‘Gender and PISA’ (Jóhannesson, 2009) and ‘Assessing PISA’ (Ozga, 2012); the International Journal of Science and Mathematics Education published ‘First Cycle of PISA (2000-2006). International Perspectives on Successes and Challenges: Research and Policy Directions’ (Anderson et al., 2010) and ‘Science and Mathematics Literacy: PISA for Better School Education’ (She et al., 2018); Educação & Sociedade published ‘PISA, política e conhecimento em educação’ (Carvalho, 2016).
Pons (2011) described the diversity of the academic reception of the PISA survey in three European countries according to their theoretical traditions and their organisation of the relationships between science and the political and administrative system. But the link between research and PISA is perhaps related to the fact that the programme produces what can be called a discourse of the research, in the double sense of a research discourse (the design of the survey, its theoretical and empirical construction, its methods for processing and interpreting the assessment results, etc.) and a discourse on the research (how PISA envisages the use of its data and outcomes by researchers). However, this discourse of the research in PISA does not seem to have been examined for its own sake, and it is to such an endeavour that this article aims to contribute: if educational researchers show a scientific interest in PISA in various ways, what, conversely, can be said about the way in which PISA seems to perceive and construct research in its programme?
Of course, since PISA’s inception, its scientific aims and theoretical and methodological approach have been the subject of substantial critical studies in works discussing the conceptual and technical underpinnings of its survey (e.g. Goldstein, 2004; Hanberger, 2014; Prais, 2003; Wuttke, 2007), the motivations, uses and limitations of the programme for secondary analyses (e.g. Bautier et al., 2006; Gorur, 2011; Olsen and Lie, 2011), the implications of PISA for theoretical works on educational policy (e.g. Grek, 2009; Sellar and Lingard, 2014) or its conception of the contents assessed (e.g. Bart and Daunay, 2016; Lau, 2009; Romainville, 2002; Takayama, 2018). Thus, several articles synthesise the major scientific issues raised by PISA (e.g. Fernandez-Cano, 2016; Rochex, 2006b; Sjøberg, 2015; Zhao, 2020). The last of these authors even denounces the ‘illusion of science’ (Zhao, 2020: 253) created by PISA through, among other things, using ‘data representation methods to make its reporting attractive and look scientific, with excellent visuals and all sorts of numbers’.
Hence, the features of the research writing in PISA deserve in my view to be examined as such, not only to highlight a particular facet of this programme (especially productive in terms of writing, as will be seen below), but also to improve understanding of the research conception conveyed by this programme. Focusing on the construction and functioning of the scientific discourse in PISA is consequently also a means of shedding specific light on its approach to research, through its expository and argumentative methods. This article therefore aims to contribute to the scientific debates about PISA mentioned above, by identifying and studying the discourse of the research in this programme. This seems all the more important as the PISA discourse is now a major reference in educational research and also because the OECD discourse has become increasingly entwined with other institutional discourses, in particular those of the Council and Commission of the European Union (Michel, 2017).
In this article, it will be seen in particular that the PISA discourse of research tends to neutralise scientific controversy by presenting its theoretical and methodological choices as self-evident and not entering into discussion with the critical works published on its evaluation. It will also be seen that despite the empirical limits of the programme, PISA gives a predictive and prescriptive value to its analyses, relating the results of its evaluation to orientations of public education policies. These characteristics of the PISA research discourse may therefore be remote from the ordinary framework and limits of scientific debates in education.
This paper is organised as follows: a first section sets out the theoretical and methodological approach which led to these results. The second section addresses the way in which PISA details the knowledge project it pursues and the frame of scientific dialogue it builds. The third section analyses the conditions in which the programme reports on the specificities of its survey data and the investigations they allow. The fourth section explores the discursive means used by PISA to present its results with a view to their insertion in political debates and public policy. Finally, the differences between the research discourse within this programme and the more usual scientific literature are examined, and the relationship between this specific approach to education research within PISA and the global or European orientations of this scientific field is discussed.
Theoretical and methodological approach
This article aims to explore the results and discuss the main issues raised by a study of the discourse of the research in PISA. This study is part of a larger research project (Bart and Daunay, 2016, 2018) involving a didactic study of the PISA tests and, in particular, focusing on the theoretical definition and methodological treatment of reading literacy in this assessment. This project sought to analyse the content (knowledge, tasks, skills, subject, etc. – cf. Daunay et al., 2015) and functioning of items, marking guides and accounts of results taken from the PISA literature. The project was based more particularly on the critical research tradition of the francophone field of subject didactics concerned with assessment issues (e.g. Brissaud and Lefrançois, 2016; Chevallard, 1986; Mercier, 2007), a tradition which has notably described the difficulties raised by such tests and their relationship with the habitual evaluation practices of school subjects (Bain, 2003; Bodin, 2005; Roditi and Salles, 2015). Among the various works on PISA referenced in the Introduction above, this approach is thus more broadly related to research that has studied the shortcomings and limits of the content evaluated in the programme (e.g. Dohn, 2007; Lau, 2009; Sjøberg, 2015).
This research project required analysing all the PISA reports available up to 2018, extended to 2020, thanks to the OECD’s proactive policy of disseminating its publications on the Internet 1 (with restrictions, however, as will be seen below). This documentation – of which the PISA texts cited in the final references of this article give a limited overview – brings together, for each round of the test, a theoretical and methodological assessment framework, some thematic volumes of results, a technical report and several notes. 2 As will be shown later, these texts appear to be research discourses, both in textual form and in terms of the intentions expressed.
But the in-depth analysis of PISA texts carried out in the context of the aforementioned didactic study of the PISA test (Bart and Daunay, 2016, 2018) led me to consider more broadly the functioning of the discourse of research in that evaluation. This PISA scientific discourse seemed both to diverge from the norms of publications in education research (e.g. the self-evidence with which the methodological choices are presented) and to manifest a specific conception of research (e.g. the prescriptive orientation of PISA texts).
I therefore undertook to subject the corpus of PISA texts – several thousand pages, described above – to a critical analysis. This procedure belongs to the French tradition of discourse analysis (Maingueneau and Angermüller, 2007; Pêcheux, 1981) applied to the writings of international organisations (Gobin and Deroubaix, 2010; Rist, 2002).
In this approach, the working of the research discourse to be described is that of PISA understood as an ‘institutional order’ (Grek, 2012: 244) producing a particular problematisation, methodology and discourse which go beyond the contributions of the actors involved in this survey. Thus, this article focuses on the discourse of the research produced by the PISA institution, grasped through one of its preferred materialisations (the reports), and not on the points of view of the actual authors of these texts (who are not always clearly mentioned in the PISA literature). As indicated in the Introduction, this discourse of the research is understood both in the sense of research discourse, which relates to the theoretical and methodological frameworks of PISA or to its results, and in the sense of the discourse on the research, which relates to the way PISA conceives educational research or secondary analysis of its data.
The analysis of the PISA writings first examined the thematic content of the discourses of the programme, focusing on the presentations of the principles of the test, the knowledge and skills targeted by the survey, the tasks by means of which students are assessed, and the statistical methodologies for processing the data or the main results. This analysis consisted in identifying the construction of these dimensions in the PISA texts and in grasping the logic of their functioning and their possible variations. Here, it was rather the paraphrastic recurrences that were identified (Maingueneau, 1991: 71–105).
The analysis also addressed the textual means by which these elements are presented. Here, it was more the enunciative modalities of the discourse of the programme that were identified, by characterising a scientific ethos of certainty and self-evidence that corresponds more broadly to the social construction of the authority of such an institutional discourse (Monte and Oger, 2015).
The aim was thus to reconstruct more clearly the contours of the writing-up of PISA research, by systematically analysing the programme literature from the point of view of the major questions raised by the usual organisation of scientific texts (development and discussion of the theoretical framework, interests of the survey put forward, modes of presentation of investigations and their limitations, data processing, links with other works or with secondary uses). On the basis of the research literature devoted to this programme, this required particular attention to be paid to the uncertainties and tensions at work in the PISA research discourse, and even to the paradoxes and contradictions that these writings manifest.
This article thus presents the main results of that analysis. It will be seen that the discourse of research in PISA tends to limit the scope of its exposure to scientific discussion, whether at the theoretical or the empirical level. These analyses, which, as indicated, are based on a review of the reports published by the OECD in successive rounds of its assessment, will be supported here by several extracts from this documentation which seemed to be more broadly indicative of the discursive working of the programme.
Narrowness of the theoretical controversy in the PISA publications
It was in the 1990s that the governments of OECD member states launched the PISA programme to build an assessment that would regularly produce indicators of student competences in three main areas of literacy: reading, scientific and mathematical. As indicated in the report explaining the assessment framework of the first edition of PISA 2000 (OECD, 1999: 7), the implementation of the programme was then related to a clear political purpose:
While it is expected that many individuals in participating countries, including professionals and lay-persons, will use the survey results for a variety of purposes, the primary reason for developing and conducting this large-scale international assessment is to provide empirically grounded information which will inform policy decisions.
However, the publications of PISA also clearly present this programme as a scientific one. Not only does the assessment framework of the first edition of PISA 2000 stipulate that researchers (along with policy makers and educators) are targeted by the PISA results and data (OECD, 1999: 7, 21), but, above all, experts are presented as contributors to this programme: a number of researchers are involved in the development of PISA – members of the institutes and other types of organisations of the consortium selected for each edition of PISA, members of the various panels or expert groups of the programme, analysts and writers of the PISA reports, etc.
Thus, PISA is described as ‘above all, a collaborative effort, bringing together scientific expertise from the participating countries and steered jointly by their governments’ (OECD, 1999: 3). In each assessment cycle, the OECD PISA reports detail the members of the consortium, staff and experts involved in developing the test (e.g. OECD, 2019b: 345–351). According to the OECD (1999: 3), ‘through participating in these expert groups, countries ensure that the PISA assessment instruments are internationally valid’ and ‘have strong measurement properties’. Furthermore, this programme claims to provide ‘the most comprehensive and rigorous international assessment of student learning outcomes to date’ (OECD, 2019b, back cover).
Reading these reports, however, gives the impression that these logics of reliability and scientificity are not worked out through any close debate among experts. The PISA publications provide little evidence of the conceptual confrontations and the empirical or technical controversies found in the theoretical or methodological reflections of scientific communities, as shown for example by the publications cited in the Introduction to this paper.
This may be a consequence of a programme that presupposes a ‘common language and a vehicle for discussing the purpose of the assessment and what it is trying to measure’ through ‘the development of a consensus around the framework and the measurement goals’ (OECD, 1999: 18). Thus, the PISA 2012 mathematics framework ‘was written under the guidance of the Mathematics Expert Group’ (10 members from seven countries), but a draft ‘was circulated for feedback to over 170 mathematics experts from over 40 countries’ with a view to securing ‘more extensive input and review’ (OECD, 2013: 24). Moreover, the consortium in charge of managing framework development ‘also conducted various research efforts to inform and support development work’ (OECD, 2013: 24). The question arises of how far this discursive form is conducive to shedding light on and putting into discussion what Lafontaine and Monseur (2012), two researchers involved in PISA, describe as the ‘shared certainties [of] the research paradigm’ (p. 48, my translation) embedded in this kind of programme.
For example, PISA’s scientific, mathematical and reading literacy frameworks are very largely built on Anglo-American psychological and cognitive theoretical references (for example, in reading literacy: OECD, 2019a: 22–55, 57–66). On the one hand, however, the theoretical writing of PISA does not clearly show the work of choosing these orientations over other possible research approaches (other trends in psychology, linguistics, sociology, didactics, etc.), nor does it clearly show the reasons behind these choices or their limitations. On the other hand, in these discourses, there is no real cumulative dialogue with the critical works that have discussed the relevance of these theoretical choices over the course of the PISA rounds. This is particularly the case with the conception of the areas assessed. From the beginning of the PISA project, researchers from different countries studied the design of the scientific, mathematical or reading literacy in PISA and raised some theoretical and methodological concerns about it (e.g. Bain, 2003; Dohn, 2007; Dolin, 2007; Hatzinikita et al., 2008; Lau, 2009). Nonetheless, the PISA literature does not provide any clear discussion about these issues.
More broadly, the key characteristics of the assessment are rather presented as self-justifying or self-explanatory, with some self-evidence and assurance, and with little expression of doubt or possible lacunae. This is particularly the case for two fundamental principles of PISA, which run counter to the choices of previous large-scale international assessments: to assess students close to the end of their compulsory schooling, and to refer the tests not to school years or curricula but to knowledge and skills for their future ‘real’ life (e.g. OECD, 2003: 13–14). Goody (2001) pointed out the limitations and weaknesses of this global approach to ‘life’ and literacy from the beginning of the OECD programme:

There is no competency without an end. The end cannot be competence for ‘an individually and socially valuable life’ in the abstract since such lives differ enormously both within and between cultures (Goody, 2001: 186).
However, although Goody (2001) set out his critical point of view within the framework of an OECD project, the PISA reports do not, it seems, refer to it. 4
This form of neutralisation of debates that the writing of PISA seems to carry out is perhaps even more evident if we look at the treatment of a discussion that was published in 2003 in the Oxford Review of Education. Prais (2003) disputed the reliability of the results for British pupils in PISA 2000, and a member of the PISA consortium (Adams, 2003) provided him with some answers, to which Prais (2004) in turn responded. However, although technical indications can be found in the PISA reports that were subsequently published, no explicit reference to Prais’s critiques or to the debates with Adams seems to be made in these reports. 5 Nor do they specify what is being done about the other points of controversy raised by Prais (2003).
It should be noted, moreover, that PISA rarely enters into an explicit dialogue with scientific controversies that are nevertheless characteristic of research activities. For example, Hopmann and Brinek (2007: 13–15) reported on unsuccessful attempts to engage with PISA. Policy Futures in Education (Volume 12, No. 7, 2014) also illustrated this difficulty, on a more ‘institutional’ level, by publishing the ‘Open Letter to Andreas Schleicher’ (Meyer, 2014b) signed by a very large number of educational researchers and experts around the world, together with Schleicher’s (2014) response and the responses of Goldstein (2014), Thrupp (2014) and Meyer (2014a).
Yet, within the PISA reports, the strength of the scientific logics and standards asserted in the programme does not seem to have generated any tension with the political origin and purpose of the assessment system on which the programme is based. Rather, the whole operation is presented from a consensual angle, as if the linking of political priorities and scientific requirements were self-evident:
Participating countries take responsibility for the project at the policy level. Experts [. . .] also serve on working groups that are charged with linking the PISA policy objectives with the best available substantive and technical expertise in the field of internationally comparative assessment (OECD, 1999: 3).
This is also suggested by the programme’s lack of discourse on the problem posed by the limited dissemination of the items that make up the PISA assessments, which will be discussed in the next section.
Lack of discussion on the empirical concerns within PISA
Since the first round of its assessment in 2000, the PISA survey has been ‘designed to collect information through three-yearly assessments and presents data on domain-specific knowledge and skills’ combined with ‘information on students’ home background, their approaches to learning, their learning environments’ (OECD, 2013: 14). The diversity of participating countries, the hundreds of thousands of pupils surveyed, the continuity of the tests over time and their methodological rigour thus generate, as the OECD (1999: 21) puts it, ‘a rich array of data’. This remarkable empirical source is then subjected to ‘state-of-the-art technology and methodology for data handling’ (OECD, 2013: 14), with a view to producing quantitative indicators of pupil performance, comparing results across countries and exploring the relationships between these skill levels and variables of a social, economic, educational, etc. nature.
This survey produces in particular a large number of results tables, figures and other graphs in the (numerous and substantial) reports published by the OECD. For example, the 352 pages of the first of the six volumes of results from PISA 2018 present nearly 100 figures and tables (leaving aside the 59 tables of country performance trends), most of which also have a Web link to download the corresponding data files (OECD, 2019b: 14). In addition, there are numerous boxes, appendices and technical notes detailing the steps and operations carried out (e.g. presentation of the PISA target student population and sampling procedures in the same volume: OECD, 2019b: 157–161). The sheer volume of these data in PISA illustrates the weight of the quantification processes in accountability policies, a weight that is perhaps, paradoxically, commensurate with the indeterminacy and uncertainty that bear on the steering of educational systems (Mangez and Vanden Broeck, 2021).
But this abundant information does not cover all the assessment data and materials: readers of the reports have access only to a few examples of the units and items of the test released by PISA; 7 this is also the case for the marking guides intended for the coding of responses and the actual responses of students. Finally, no complete test booklet is made available. This choice of secrecy is presented as a way of keeping enough units identical from one PISA round to another, so that the results are considered comparable; Goldstein (2018), however, points out flaws of this ‘test equating’ technique.
However, while a ‘Reader’s Guide’ (OECD, 2019b: 21–23) warns readers of the first volume of PISA 2018 results about the methods of calculation of ‘international averages’ and the use of ‘rounding figures’ for ‘standard errors’, and advises that ‘this volume discusses only statistically significant differences or changes’, nothing explicitly informs them that the assessment materials remain essentially inaccessible, 8 or of the limitations that this implies in terms of the interpretation and intelligibility of the analyses produced by PISA.
Nonetheless, this point is essential. Evaluations or measures are of course not neutral; they are built upon theoretical and methodological choices, as detailed above. Thus, researchers must be able to examine the instruments and raw data extensively in order to discuss the relevance of the interpretations that underpin the programme’s results. As Sjøberg (2007: 212) says, ‘an achievement test is never better than the quality of its items’. Some academics have called for full transparency of the process, with publication of all the booklets and items (Goldstein, 2004), although the PISA reports do not refer to this debate. Thus, many researchers have criticised the PISA items released, 9 but, as seen in the previous section for the conceptual or methodological approach, here too no presentation of units and items proposed in the PISA reports enters into dialogue with analyses of this kind.
The absence of PISA discourse on the limits created by the secrecy of materials is similarly remarkable as regards the possibilities of secondary uses and statistical reprocessing of its data (Olsen and Lie, 2011). It is indeed possible to freely download the PISA data (with the exception of assessment instruments and raw materials resulting from administration of the tests). The OECD thereby offers ‘a valuable knowledge base for policy analysis and research’ (OECD, 2003: 11). Moreover, the OECD (2013: 172) stresses the importance of ‘building a sustainable database for policy-relevant research’ and it legitimately stipulates that ‘in the long term, one of the major benefits of the PISA database will be the availability of trend data’ (OECD, 2013: 173).
But nowhere do these discourses propose an equally clear discussion of the decision to limit the dissemination of assessment materials, or of the methodological limitations this creates for the project of building ‘a reliable, sustainable, comparative database that allows researchers worldwide to study basic as well as policy-oriented questions’ (OECD, 2013: 169). Nowhere in the PISA reports is there any detailed discussion of the fact that the accessible data are not the transcription of the students’ factual responses (especially those to the open constructed-response items) but their coding.
The question of how far, and with what precautions, it is possible to describe categories of student responses to questions that are not always known, and without the actual responses of the tested students being available, therefore appears here to be resolved by the self-evidence and assurance of the writing. Nothing, however, prevents one from choosing to rely on PISA and the solidity of its procedures, as several researchers do (e.g. Baudelot and Establet, 2009). 10 Indeed, the OECD (2019a: 13) argues that ‘stringent quality-assurance mechanisms are applied in translation, sampling and data collection’ and that, ‘as a consequence, results from PISA have a high degree of validity and reliability’. Nonetheless, it is clear that a reflection on the PISA results or an equally rigorous secondary analysis should be able to reason about pupils’ responses, and to conduct a critical examination of the forms of questioning and marking (e.g. Dolin, 2007; Roditi and Salles, 2015). It is also clear that, like any research dynamic, such debates would primarily benefit the PISA methodology by highlighting the strengths of the approach and possible ways of improving it.
In view of the number of pages explaining how these units, items and their coding guides are developed, translated, field-tested, calibrated, etc. (e.g. OECD, 2017: 29–55, 91–99), the lack of coverage of these issues in the PISA discourse cannot be explained by the low importance that the programme might give to the data construction process. It may then be wondered whether, paradoxically, the explanation is not the opposite: the strength attributed to the development of assessment instruments, to the control of bias, to compliance with the ‘stringent quantitative standards of technical quality and international comparability’ (OECD, 2019b: 21), to the ‘quality-assurance procedures’ applied (OECD, 2019b: 30) perhaps leads to this non-debate on access to assessment materials. It is as if, once all the psychometric guarantees of data robustness had been presented, researchers were expected to reprocess the data, compare scores, identify factors and calculate correlations, without having to, or being able to, consider the questions asked in units and the relevance of the choices of coding.
There is therefore a risk of reproducing the presuppositions of the PISA investigative tools and naturalising the construct, as well as the results produced 11 (Vrignaud, 2008). There is also a risk of participating in the production of ‘knowledge that may be mathematically defensible but perhaps ontologically absurd’ (Gorur, 2016b: 651). This is all the more crucial as ‘PISA appears to have become a modern day Delphic Oracle which governments consult to get policy direction’ (Gorur, 2011: 77). This is especially the case for the governments of European countries and for the European Union, as indicated in the Introduction above. The next section will address this issue by examining the prescriptive discourse of PISA.
A prescriptive emphasis despite the methodological limitations
The weakness of the discourse on these questions is all the more surprising given that PISA does not hesitate to point out certain limitations of its programme, as we shall now see. First of all, it should be remembered that, as the OECD (2019b: 63) explains, ‘the goal of PISA is to provide useful information to educators and policy makers concerning the strengths and weaknesses of their country’s education system, the progress made over time, and opportunities for improvement’. Moreover, this ‘policy orientation’ is the first of ‘PISA’s unique features’ that the OECD (2019a: 15) details in its list of ‘What makes PISA unique’: it ‘links data on student learning outcomes with data on students’ backgrounds and attitudes towards learning, and on key factors that shape their learning’, which ‘exposes differences in performance and identifies the characteristics of students, schools and education systems that perform well’ (OECD, 2019a: 15; emphasis added).
PISA uses two means to collect such context and background indicators. On the one hand, the programme makes use of system-level data collected regularly by the OECD for each country (e.g. indicators such as the structure or expenditure of each school system, labour market characteristics, the recruitment and training of teachers, etc.). On the other hand, it administers questionnaires to the participating students and their school principals (OECD, 2019a: 17–18). The student booklet seeks information about age, gender, engagement and motivation towards school, languages spoken, family environment, economic, social and cultural characteristics, etc. The school questionnaire asks questions about the institution’s organisation, its resources, the teaching practices, the characteristics of the teaching staff and students, etc. Moreover, each round of PISA offers optional questionnaires to participating countries. 12
Since PISA’s inception, the aim has therefore been to study the relationship between these data and the performances assessed, in order to identify political strategies for improving learning (for a critique of this line of reasoning in PISA, see for example Ercikan et al., 2015; Feniger and Lefstein, 2014). Indeed, the volumes of results of each PISA round describe ‘some indications for policy’ (OECD, 2001: 183), ‘some policy implications’ (OECD, 2010: 105), and ‘the questions education policy makers should ask’ (OECD, 2020: 191).
But in contrast to the discourses seen in the previous section, which tend to highlight the methodological quality of the OECD assessment, the accuracy of its measures or the reliability of its results, PISA seems to be more cautious in establishing ‘political considerations’ on the basis of the results of its programme.
For example, the first report of the first PISA assessment made it clear that ‘as in other analyses of this kind, [. . .] correlations cannot be interpreted in a causal sense since many other factors may be at play’ (OECD, 2001: 178). More recently, PISA explained that ‘it is exceedingly difficult to draw causal inferences, such as concluding that a particular education policy or practice has a direct or indirect impact on student performance, based on [. . .] data of the kind collected in PISA [. . .]’ (OECD, 2013: 172).
On this point, 13 PISA even states that ‘much of the value of the programme is based on a constant interplay between PISA as a monitoring survey and more rigorous kinds of effectiveness research done elsewhere’ (OECD, 2013: 172, emphasis added).
One might then ask how this caution about linking performance to the contextual indicators and background characteristics fits with the predictive discourse of PISA (see Popkewitz, 2011: 31). For example, this kind of discourse, without any particular empirical precaution or theoretical underpinning, relates 15-year-old students’ performance to ‘the challenges they may encounter in future life’ (OECD, 2013: 13, emphasis added), to the ‘competencies [they] will need in the future’ (OECD, 2013: 13, emphasis added), and even to ‘unknown future challenges for which teaching of today’s knowledge is not sufficient’ (OECD, 2013: 121, emphasis added).
Nonetheless, PISA warns that, while some ‘patterns of statistical association between achievement, on the one hand, and family, school, and other educational influences, on the other’ are identified as ‘strong’, ‘there is also a basis for further investigation by countries to confirm whether such a phenomenon exists when more direct research is applied’ (OECD, 2009: 150, emphasis added). However, the question arises as to whether, once these limitations have been raised, there is not then a slippage in which ‘PISA findings obtained and exposed even in the proper reports are badly interpreted as causal inferences ad nauseam’ (Fernandez-Cano, 2016: 4).
For example, here are some titles of the results reports for PISA 2018: What Students Know and Can Do (OECD, 2019b; Volume I), Where All Students Can Succeed (OECD, 2019c; Volume II), Effective Policies, Successful Schools (OECD, 2020; Volume V). Beyond the problems raised by this ‘generation of standardised templates and protocols to guide practices’ that contributes to the ‘seeing like PISA’ phenomenon (Gorur, 2016a: 608, based on Scott, 1998), the confidence displayed by these titles is astonishing in view of the explicit limitations of PISA scientific investigations.
Thus, the fifth volume with its unambiguous title, Effective Policies, Successful Schools, contains a chapter titled, as we have seen above, ‘The questions education policy makers should ask’ (OECD, 2020: 191–203). There are 10 or so main questions, such as ‘Are schools equipped to teach – and are students ready to learn – remotely?’ (OECD, 2020: 192), ‘Do all students have equal opportunities to learn at school?’ (OECD, 2020: 188), ‘Is giving parents more school choice better for an education system as a whole?’ (OECD, 2020: 200) or ‘What kinds of assessment and evaluation policies make a real difference for schools and school systems?’ (OECD, 2020: 201). Readers therefore have every reason to believe that PISA can reliably answer these quasi-closed questions. 14 Yet this is hardly consistent with PISA’s own insistence that its results cannot be interpreted in a causal sense, 15 as has been shown above in this section.
Likewise, it seems to be left to the reader to reconcile the apparent certainty of a title such as Effective Policies, Successful Schools (OECD, 2020), or of a question like ‘What kinds of assessment and evaluation policies make a real difference for schools and school systems?’ (OECD, 2020: 201; emphasis added), with the ‘caveats’ set out in a box in the same volume, entitled ‘Interpreting the data from students and schools’ (OECD, 2020: 42). These ‘caveats’ (OECD, 2020: 42) refer especially to the fact that the report is built upon data self-reported by principals and students rather than provided by external observations. Moreover, this box specifies that ‘there are other limitations, particularly those concerning the information collected from principals [. . .] that should be taken into account when interpreting the data’: ‘Although principals can provide information about their schools, generalising from a single source of information for each school and then matching that information with students’ reports is not straightforward. Also, principals’ perceptions may not be the most accurate source for some information related to teachers, such as teachers’ morale and commitment’ (OECD, 2020: 42; emphasis added).
So, how can we estimate the effectiveness of such Effective Policies, Successful Schools or the reality of such a ‘real difference’ (OECD, 2020)? All the more so as this box states (OECD, 2020: 42) that ‘comparisons of results between resources, policies and practices, and [. . .] performance across time [. . .] should also be interpreted with caution’ because the PISA survey does not allow one to distinguish between two cases: the case where ‘the relationship between [. . .] performance, and resources, policies and practices is stronger because they are available to higher-performing students/schools/systems’ (emphasis added) and the case where ‘a particular set of resources, policies and practices may have been used more extensively in 2018 than earlier, and may have promoted student learning more in 2018 than before’ (emphasis added).
In brief, a certain prescriptive emphasis, reinforced by the authority (editorial, scientific, political, institutional, etc.) of PISA 16 (Meyer and Benavot, 2013), undoubtedly leads this set of discourses to go beyond the scope offered by its empirical foundation, 17 a limit that the programme itself makes explicit. For Hopmann and Brinek (2007: 13), ‘any policy making based on [PISA] data [. . .] cannot be justified’ and ‘the use of PISA [. . .] by the OECD [. . .] goes far beyond what is scientific evidence or simply well done research’. However, at least at a European level, Grek (2009: 35) describes ‘the acceptance of PISA – and the parameters and direction that it establishes – along with its incorporation into domestic and European policy making’ and observes, in particular, that ‘PISA data are used to justify change or provide support for existing policy direction’.
Conclusion
In conclusion, it seems important to recall that this article has focused on the PISA literature with the aim of providing a study of its discourse of research and its theoretical and methodological statements. It has thus examined this international survey from a critical perspective informed by a wide diversity of works, considering that it is the responsibility of researchers to question and shed light on the workings of an institutional assessment involved in powerful processes of the globalisation and Europeanisation of educational policies.
The article has therefore examined a number of conceptual frameworks, thematic volumes and technical reports published by the OECD since PISA’s launch at the end of the 1990s. The analyses have identified the main characteristics of the discourse of research in PISA and described the ways in which it departs from typical scientific rhetoric: an argued dialogue between contradictory points of view, a problematised and cumulative process of knowledge construction, a reasoned discussion of the methodological contributions and limitations of the study carried out, and a descriptive orientation in data analysis and interpretation.
We have seen that, despite a rich display of statistical data, bias controls, measurement standards and conceptual references, the PISA discourse often tends to present its theoretical, empirical or analytical choices as undeniable and self-evident. This article has also described PISA’s lack of discourse on the scientific controversies surrounding its assessment and on the problems posed by certain methodological issues, such as the limited dissemination of the assessment units. The research writing of PISA is further characterised by the self-evidence and assurance with which the programme asserts both its political logic and its scientific rigour, as well as the convergence it describes between theoretical questions and the concerns of public authorities. We have observed this phenomenon particularly when PISA draws policy recommendations from its outcomes despite the limitations of reliability that it itself states.
All these aspects diverge from the norms of scientific writing, which require precisely the questioning and denaturalising of one’s own statements. Research discourse should set out the limits of its approach and of the validity of its results and, correlatively, define a possible frame for putting them up for debate: as Johsua (1996) showed, this is a central requirement if the conclusions of a study are to count as contributions to the scientific process. From this perspective, the discursive functioning of PISA could then manifest what makes the difference between the production of assessment results and their construction into research results.
This working of the discourse of the research in PISA is perhaps the sign that the programme targets far less an audience of researchers than one of political and public actors, contrary to what PISA itself argues, as explained above. Thus, Mangez and Hilgers (2012: 197) analyse the reception of PISA in the education policy fields across Europe and consider that the ‘communication strategies’ implemented by the OECD ‘show that the “product” is not meant to be “consumed” primarily by the research community’. More broadly, for these authors, it is an indication of PISA’s position close to the heteronomous pole of the field of educational knowledge (Mangez and Hilgers, 2012: 197), which has grown stronger in recent decades, in contrast to its more autonomous pole, and which produces ‘knowledge for the economy, knowledge for policy [. . .,] knowledge-based governing tools’ (Mangez and Hilgers, 2012: 195).
These oppositions of course relate to tensions long identified in comparative educational research between an approach centred on theoretical development and another oriented towards political problematics (Schriewer, 1989). But these tensions also more broadly reflect developments in science itself. The strengthening of the heteronomous pole of educational research, to which PISA belongs, is thus explained, according to Mangez and Hilgers (2012), by a transformation of the relationship between the scientific fields and the social, political or economic demands made of research. According to Gibbons et al. (1994), since the 1950s, research has been characterised by a decline of its independence in the definition of its objects of study and the use of its means (‘Mode 1’ of knowledge production) and growth in works implemented and financed in relation to problems defined outside the academic world (‘Mode 2’).
Thus, PISA is not only a scientific instrument for educational governance; it also constructs education research itself as an instrument. This is what Rochex (2008: 84, my translation) calls the ‘temptation’ to reduce research in international assessments ‘to a function of expertise’, ‘at the expense of a “critical” stance and thinking that are traditionally important in the sociology of education’ – and are important, more broadly, for all educational research. This could explain the opposition identified by Hopfenbeck et al. (2018) in their systematic review of PISA-related English-language articles published from 2000 to 2015, in a sense quite similar to the findings of Mangez and Hilgers (2012) in the European context. According to Hopfenbeck et al. (2018: 347), ‘there exists a somewhat unsettling divide across the articles’ between those in the ‘secondary data analysis’ category (the most frequent theme) and those in both the ‘critique’ and ‘impact/policy’ categories: papers in the first category ‘use PISA data as a foundation from which to build additional levels of newly constructed knowledge to bolster even further the already vast amount of information generated by PISA alone’ while papers in the two others point out ‘structural weaknesses and cracks in the foundations of ongoing PISA constructions’.
Finally, it may be wondered whether, paradoxically, PISA, which claims to be a significant international scientific collaboration on education issues based on a spirit of consensus, does not tend to reinforce the polarisation of educational research. This would be an unintended, but not the least important, effect of PISA on European and global educational research and its scientific discourse. But one would then need to ask more generally, as do Savage et al. (2021: 313–314), whether, in mainly attending to PISA, educational research, particularly that with a critical orientation, does not tend to reify this programme and the power of the OECD, while leaving aside other essential aspects of the functioning of educational policies and educational systems. In other words, scientific works require debates and controversies, which is precisely what tends to be lacking in the PISA research discourse that we have analysed.
Footnotes
Declaration of conflicting interests
The author declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: An earlier version of this work was published in the Revue française de pédagogie (Bart, 2015). This article offers an expansion and update of the theoretical and empirical framework of the analysis. It has been supported by the European Center for Humanities and Social Sciences (MESHS-Lille, France).
