Abstract
In this paper, we analyse the trajectory of the French testing policy in education since 1973. Given its statistical tradition and its ability to produce its own evaluation tools, France is an interesting case for examining the capacity of national education systems to meet international standards of testing. Anchored in a sociology of public action perspective, we show that the development of testing in France is the outcome of specific policy configurations that themselves depend on various types of factors. Using materials drawn from four qualitative research studies on testing and evaluation, we argue that this policy trajectory can be interpreted as a statisation process in which state administrations and political leaders both increased their power over society and imposed their categories and own interests on policy actors. This statisation led to a rationalisation and a politicisation of testing. The development of testing did not lead in France to a deep transformation of governance patterns: rather, it merged into traditional modes of regulation of education and confirmed them to some extent. Testing is thus an interesting way to study the propensity of the French education system to redefine global problems according to domestic stakes.
Introduction
Testing of pupils’ achievement, be it developed at the national level or not, is very often studied through an international or a global approach. In that perspective, various theories have been provided to explain both the emergence of testing and its political meaning. Testing is alternatively regarded as an emblematic measure of the implementation of an evidence-based policy in education (e.g. Wiseman, 2010), as a key device to promote a “new [neoliberal] global educational order” (Laval and Weber, 2002) or as one policy potentially paving the way for post-bureaucratic modes of governance of education systems (Maroy, 2012). For others, testing is one of the numerous forms taken by a new governance of education through numbers (Felouzis and Hanhart, 2011), through measurement policies (Normand, 2011) or through statistical data and quality assurance (Ozga et al. 2011). More recently again, some even theorised the emergence of a new global panopticism through testing (Lingard et al., 2013) and the development of a new “global testing culture” in education (Smith, 2016).
These works are very helpful to highlight common trends and study globalisation or Europeanisation in education, to overcome methodological nationalism, to compare different national experiences and learn more from each of them through comparison, to understand the reasons why a culture of evaluation may emerge (or not) in some countries and, above all, to interpret the political significance of this new governing mode through testing.
Nevertheless, these works do not really allow the researcher to understand the forms taken by testing in specific national policies, the trajectory of testing in various domestic policy contexts and the domestic processes of legitimation of testing. Yet these three aspects are essential to interpret correctly the political meaning of the growth of testing, to compare countries in a relevant way and to understand the potential globalisation at work. For instance, several works have shown that the development of testing increased the power of the central national state. However, this trend does not have the same political meaning in traditionally decentralised states, such as England or the USA, where it corresponds to a major change of public governance, as in traditionally centralised systems such as France. Thus, the mere occurrence of testing in a growing number of countries is not sufficient proof of the emergence of a common global testing culture.
That is why this article provides a framework in order to rethink these three aspects. It studies the trajectory of the French national testing policy of pupils in primary and secondary education, or more precisely the devices organising the diverse standardised evaluations of pupils’ achievement (SEPA) that were implemented in different periods and how these SEPA were legitimised. As far as SEPA are concerned, France is an interesting “most likely” case (Lijphart, 1971). It is indeed a country with a strong tradition of public statistics (Desrosières, 1998), which has developed its own tools since 1973, which tried to export its model in Europe and in Africa, especially in the 1990s, and which is often described in formal Eurydice international comparisons of evaluation devices as a country with various evaluation tools. Thus, France can be expected to meet international standards of testing; if it does not, this will be meaningful for our discussion on globalisation through testing. As will be explained in the first two sections, this trajectory of testing in France is understood as the outcome of specific policy configurations that enable its development to a greater or lesser extent and that are studied on the basis of qualitative materials drawn from four research studies carried out since 2004.
Our argument is that this trajectory corresponds to a statisation process (Payre and Pollet, 2013) in which state administrations and political leaders both increased their power in society and imposed their categories and own interests on policy actors in education. This statisation took different forms according to the periods and the policy configurations at work but, overall, it led both to a rationalisation and a politicisation of testing, the two trends being sometimes compatible, sometimes not. Consequently, testing policy did not lead in France to a transformation of governance patterns but it rather merged into traditional modes of regulation of education and confirmed them to some extent. Testing is thus an interesting way to study the propensity of the French education system to redefine global problems according to domestic stakes.
After presenting our theoretical framework and our methodology, we distinguish five policy configurations from 1973 to 2017 and illustrate the kind of statisation at work in each period. We then discuss the political meaning of these trends in the conclusion.
Theoretical framework
In this article, we study the trajectory of the French testing policy as a succession of policy configuration outcomes. This approach is part of a policy sociology (Ozga, 1987) or a sociology of public action (Lascoumes and Le Galès, 2007) in education (van Zanten, 2011) that tries to understand how and why education policies are conceived, implemented and evaluated and what effects they produce.
A trajectory can be defined as a movement in time and space. It has been conceptualised differently from one scientific discipline to another. In the social sciences, it often designates a succession of social positions occupied by a person in his/her social life, for instance within the school system (the school trajectory, or career, of a pupil) or on the housing market (the residential trajectories of families). In an international issue that we coordinated on the trajectory of school inspections in Europe, several authors used this notion to question the surprising durability of inspections in a context of deep changes in the governance of education (Pons, 2014). They often used neo-institutionalist theoretical tools to conceptualise this notion of trajectory (such as path dependency or gradual change), but this perspective was not the only one, since others regarded the trajectory of different inspectorates as the result of specific strategies of internationalisation or as the outcome of a professional legitimation process implemented by inspectors. As far as policy trajectory specifically is concerned, Stephen Ball was among the first authors to insist on the necessity to “capture the dynamics of policy across and between levels” and to study “the ways in which policies evolve, change and decay through time and space and their incoherence” (Ball, 1997: 266). This study can be conducted on the basis of various theoretical frameworks. In a recent comparison of the trajectory of the accountability policies in French and Quebec education to which we contributed, this policy trajectory was conceptualised as the combination of three processes: path dependence on early choices, policy bricolage and translation of transnational imperatives in domestic contexts (Maroy and Pons, 2019).
Our approach is close to these last two contributions and conceives the trajectory of the French testing policy as the outcome of a succession of specific policy configurations.
Policy configuration refers here to the notion of configuration developed by Norbert Elias (1978) and designates a set of factors within the policy process that produces specific interdependencies between policy actors. One major analytical challenge of this approach is to identify the factors that shape these interdependencies. Following our previous work using this notion (Buisson-Fenet and Pons, 2014), we distinguish four types of factors. The first one is political and refers mainly to the political making of public policies (here mainly the evaluation policy itself). The second is institutional. By defining the formal rules of the game, institutional designs strongly predetermine the roles and routines of policy actors, their margins and spaces of liberty and their degree and form of coordination. Here, it will mainly refer to the institutionalisation of SEPA and the institutional properties of evaluators. The third is professional: the way in which policy actors define their job (professional identity), embody it in specific activities (professional skills and competencies) and struggle for it (professional legitimation) plays an important role in their involvement in the policy process. The last is cognitive and designates all the representations, ideas and pieces of knowledge that policy actors may share, and that may influence their intervention in a policy process.
As illustrated below, these policy configurations all shape a movement of statisation of the SEPA, of their purposes and of the methods, the tools and the pieces of knowledge produced through them. Statisation (“étatisation” in French) can be defined as the institutionalisation of state power, the state being regarded as a centralised, differentiated, institutionalised, autonomous and sovereign political system (Badie and Birnbaum, 1983). At this general level, the notion of statisation is close to the so-called “state building” often quoted in international literature and so to the numerous theses that have tried to explain the emergence of the state form. It is also close to several historical works that stressed the globalisation of the “school form” (“forme scolaire”) and the increasing power of the state over education (Meyer and Ramirez, 2000; Vincent, 1994). Yet, as Michel Offerlé (1997) argued, analysing statisation does not only mean analysing how the state penetrates society but also the statisation of the state itself, that is to say the process of bureaucratisation of its own activity.
Three kinds of works contributed to the analysis of this double process (Payre and Pollet, 2013). The first focuses on the historical origins of some local initiatives to determine their influence on state policies and understand the drivers of statisation. This approach was particularly developed to analyse the appropriation and the generalisation by the state of local initiatives in urban policies. Nevertheless, this approach is not relevant in the case of SEPA since the development of public statistics in education in France was mainly top down and came from the initiatives of central administrations. The second approach of statisation, inspired by Max Weber, studies the bureaucratisation of the society by the state on the basis of an empirical sociology that stresses both the “inventive hesitations” (“errements inventifs”) of state agents and their work of delimitation of publicness, of stateness and of the frontier between the centre and its peripheries (Offerlé, 1997). Alternatively, this sociology may focus on the role of policy networks taking place between state administrations and private actors or on the role of myths, speeches and government sciences in the definition of state representations. We propose to call this kind of statisation an institutional statisation. The last approach regards statisation as the outcome of the imposition of specific policy categories that participate in the definition of policy problems requiring state intervention and that contribute to make the public targeted by this intervention exist. The best example of this double process is probably provided by the category of “unemployment”. We will talk about a cognitive statisation to designate this process even if cognitive processes are also at work in the institutional statisation.
To summarise, studying statisation implies seeing the state as immersed in a perpetual process of redefinition of the frontiers of its legitimate sphere of action, as an actor in perpetual interaction with non-state actors and processes. This process can be more or less favoured by processes of politicisation. Politicisation occurs when policy devices – such as the SEPA – are more and more linked with the implementation of a specific political offer coming from political actors, such as parties. In axiological terms, it implies increasing debates on the values and the goals of these devices. Depending on the context, this politicisation can restrict the development of the SEPA or, in contrast, favour their rationalisation. The latter can be formal – in that case it consists of cleaning up the device and improving its internal coherence at the risk of its simplification – or material, when policy makers try to increase its empirical relevance (Benamouzig, 2005).
Methodology
This article is based on materials drawn from four different qualitative research studies conducted since 2004 and brought together here in a synthetic perspective. The first one is a survey of scientific literature about standards and standardised tests done with Nathalie Mons. This survey was an opportunity to cover the English-speaking literature on standards, to interview French policy makers about SEPA and to compare four French-speaking education systems (Mons and Pons, 2006). The second study is a four-year PhD research on policy evaluation in France that allowed us to collect many public and non-public documents about SEPA and the personal archives of one statistician, and to conduct intensive semi-structured interviews with people from the DEPP 1 (n = 32) and their interlocutors (n = 66) (Pons, 2010a). The third study was conducted in 2010 and focused on the implementation of new SEPA in primary education. It was based on additional interviews (n = 7), on document analysis and on the exploitation of a small dataset of dispatches (n = 69) from a press agency that specialised in education (Pons, 2010b). It was updated in 2012 with a new dataset of 70 dispatches. The last research study consisted of comparing policies of accountability in France and in Quebec 2 (Maroy and Pons, 2019). It was an opportunity to confirm, through interviews and document analysis, the slowdown of the French testing policy between 2012 and 2017 and to send a short questionnaire to regional statistical services of the ministry to check the degree of implementation of SEPA at the level of the “académies”. 3
Testing policy as a succession of policy configurations
In this section, 4 we present the five successive configurations that have characterised the trajectory of French testing policy since 1973. For each of them, we synthesise the main factors of interdependence between the actors in accordance with our theoretical framework. These elements show a regular process of statisation that has taken various empirical forms over time and that we present in the final section.
Internalising academic competencies (1973–1988)
While the first initiatives in intelligence testing in France date back to the early 20th century, the 1970s is a relevant starting point for our analysis. In 1973, the central office of statistics of the French ministry of education received a new mandate of evaluation. The year after, an office specifically devoted to “pedagogical assessments” was created. Its holder, a young administrator recently graduated from the prestigious National College of Administration (ENA), was given full liberty to organise its activities. For one year, he and a statistician undertook a kind of “scientific tourism” in France and in Europe (especially in Belgium, England and Switzerland) to learn from scholars’ experiences in testing. This enabled them to recruit a research officer and other administrators without specific technical skills in testing, to launch a collaboration with some specialists from the French National institute of pedagogical research and to implement with them the first national test of pupils in French and in mathematics between 1974 and 1976.
This new orientation of the ministry was the consequence of four main interconnected factors. The first refers to the successive transformations of lower secondary education that had taken place since 1944 and that culminated in 1975 with the creation of a single school system. Many observers wanted to know if this new system and the new curriculum that it implied would improve pupils’ achievement. The second is the development of several works in psychology and docimology that have shown the biases of teachers’ assessments and that have regularly pleaded for alternative forms of assessment since the 1930s. The third is the implementation from 1968 in all French ministries of a new form of budget planning that invited public administrations to evaluate ex ante the effects of their decisions before programming them. Even if this operation, called BCR (Budgetary Choices’ Rationalisation), was limited in the education sector, it allowed the Ministry of Education to recruit a research engineer and to develop its statistical power. The last factor is the political opportunism of the statisticians themselves, whose head managed to convince the minister that national tests would provide interesting results to appreciate the effects of his reforms and that this operation was possible and not so costly, since the office had already been running a panel of pupils in secondary education since 1972.
Despite the negative reactions coming from some of the influential general inspectors, who criticised this quantitative approach to assessment, a second national test in primary education was implemented in 1979. In the 1980s, the SEPA were more frequent but their periodicity, their purposes and the levels and disciplines that they tested were never stabilised (see Table 1). Yet, two kinds of assessment rapidly emerged: “diagnosis assessments” and “record assessments”. 5 The diagnosis assessments are exhaustive, relatively easy and implemented at the beginning of a learning cycle to allow teachers to identify rapidly pupils with difficulties. They are conceived as a tool for professionals in schools. The record assessments are sample-based, summative and conceived for political leaders to provide them with a first measure of the outcomes of pupils and through them of the whole school system. These evaluations were not compulsory and were intermittent. They were conceived primarily to support the implementation of new curricula in the 1980s or to give political leaders specific feedback on their policies (this was the case, for instance, for the assessments of early-learning studies).
Table 1. The standardised evaluations of pupils’ achievement from 1980 to 1989.
Source: Levasseur (1996).
ISCED: international standard classification of education.
This configuration had some advantages. It allowed professionals to improve their mutual knowledge on pupils without having to cope with administrative consequences. Gathering different people within pluralist groups whose task was to conceive the tests favoured communication between professionals that hardly knew each other at that time (such as inspectors and statisticians, for instance) or whose communication was rarely freed from hierarchical relations (such as between teachers and inspectors). This configuration also favoured spontaneous appropriations of the SEPA logics by professionals in the académies even if this appropriation took various directions.
Nevertheless, this configuration raised several issues. Methodologically, it caused a fragmentation of the data available and it required statisticians to reconstitute samples each time. This exposed them to various forms of criticism, such as the impossibility of having a longitudinal approach, the lack of control of some contextual explanatory variables or, in contrast, the regular condemnations of their figures-based approach to education. Politically, since the testing tools were not strongly institutionalised, it made statisticians dependent on political leaders whose interest in this approach was uneven from one minister to another. While Alain Savary (1981–1984) wanted to have regular feedback on pupils’ achievement, Jean-Pierre Chevènement (1984–1986) was less interested in this approach and René Monory (1986–1988) made a strategic use of testing, that is, using it to influence teachers’ practices by circumventing the traditional institutional dialogue with unions. This paved the way for important decisions, such as the end of the scientific collaboration with the researchers who represented France in the Association for the Evaluation of Educational Achievement and the withdrawal of scholars from the SEPA devices.
This first period was characterised by a double statisation process. The latter was first cognitive since the ministry progressively internalised skills and competencies in pupils’ assessment that were initially developed outside it. This internalisation started from “scientific tourism” and went on with an official collaboration with researchers for the conception of the tests, which stopped in 1984 when the ministry decided to produce its own tools alone. This statisation was also institutional, since the “inventive hesitations” of the founders of the office of evaluation resulted in an extension of the state frontier that integrated SEPA. This double process went with a low politicisation of the testing policy and a low rationalisation of the SEPA themselves.
The promises and pitfalls of systematisation (1989–1997)
The situation rapidly changed in 1989. In March, amid a broad policy debate on the so-called growing illiteracy of pupils, the minister Lionel Jospin announced the systematisation of diagnosis assessments in CE2 (year 8) and 6ème (year 11) for the following year. This announcement came after several weeks of negotiation with teachers’ unions to prepare the future bill of 1989, whose last title was devoted to evaluation. During these negotiations, teachers’ unions recalled the importance of pedagogical liberty in their eyes and stressed their fear of seeing the SEPA become tools for evaluating teachers themselves. They won three concessions from the ministry: the lack of legal obligation for teachers to consider the results of the SEPA, the non-publication of these results per school to avoid parents’ consumerism and the privilege given to diagnosis assessments and to the SEPA conceived to support teachers’ practices and not evaluate or sanction them.
Despite these limitations, the systematisation of the “CE2-6ème assessments” device and the Act of 1989 considerably changed the configuration at work. While the former period was characterised by organisational hesitations, personal initiatives and the possibility for the opponents of the SEPA to ignore them, from 1989 the SEPA took a major place in the ministerial schedule and were clearly integrated in the whole evaluation policy. Massive training sessions on the SEPA were organised at various policy levels. Several workshops and professional symposia were organised locally, and their proceedings sometimes published by regional pedagogical centres. The SEPA became one of the key policy tools promoted by the head of the DEPP of that time, Claude Thélot, whose ambition was to transform this department into an institutional interface between research, the administration, the media and political leaders. For him, diagnosis assessments were a key channel of dissemination of a new “culture of evaluation” within the bureaucratic French education system. In 1992, a diagnosis assessment in “seconde” (upper secondary education, year 15) was launched. Two years later, a first evaluation of the reception of the SEPA in schools done by the DEPP revealed that 93% of teachers in CE2 and 6ème used the SEPA in discussions with parents, 82% to update their teaching and 67% to make their pedagogy evolve. The DEPP started to present the SEPA as a whole “observatory of pupils’ achievement” (Levasseur, 1996) and Claude Thélot even conceptualised on their basis a specific vision of evaluation that must produce the “mirror effect” (Pons, 2010a). According to this theory, evaluation can rarely provide professionals with strong and stable causal relations because education processes are too complex.
Consequently, it can only confront professionals with the outcomes of their choices and invite them, once they accept to look in the mirror of the evaluation study, to find themselves, as professionals, the solutions to improve the results of their pupils. This theory deeply influenced the French model of evaluation policy that is sometimes presented in typologies of accountability policies as a reflexive (Dupriez and Mons, 2011) or a soft one (Maroy and Voisin, 2014).
Nevertheless, the priority given to diagnosis assessments had its own limits. Since they were exhaustive, these assessments were costly in material terms (production and mass printing of testing documents, contracts with transport companies, delivery of specific IT equipment to enter data in schools) and in time (persons entirely devoted to supervisory tasks in the department). Their heavy logistics never allowed the DEPP to send back the results to schools before November. This was too late for teachers, who used other means to identify pupils’ difficulties. In addition, this costly priority meant relegating the record assessments to the background, whereas they were supposed to be a key steering tool. This led some policy makers at various policy levels to use diagnosis assessments as quality indicators even if they were not conceived for that purpose.
During this second period, the statisation process continued. Institutionally, it consisted of the penetration in the schooling process of pupils of a new state device, the aim of which was to evaluate pupils and position them vis-à-vis others in their class, in their académie and in the whole nation. Cognitively, a specific dichotomy was imposed – diagnosis assessments versus record assessments – whereas it did not really correspond to the scientific literature. This movement was favoured by an increasing politicisation of the testing policy, now integrated in a whole evaluation policy, and resulted in a formal and material rationalisation of the SEPA through the “CE2-6ème assessment” device.
The uncertainties of politicisation (1997–2007)
Claude Allègre’s ministry (1997–2000) clearly broke this dynamic. The minister reorganised the department, publicly criticised the data that the latter produced – for instance on the proportion of under-achieving pupils in reading, which he considered underestimated – and defended another model of evaluation, which was more external and academic based. This break introduced much uncertainty on statistical production. While the diagnosis assessments went on, the record ones were clearly slowed down during the period, even if two emblematic assessments were conducted in 1997 and 1999. 6
Moreover, several works questioned the regulatory power of diagnosis assessments in the 2000s. While a second study conducted by the DEPP in 2005 concluded that the majority of teachers still declared that these assessments contributed to make their professional practice evolve – thanks to the aggregation of general categories proposed in the questionnaire – an ethnographic research study in three académies showed that teachers had many difficulties integrating these tests in their pedagogy and in their communication with parents (Derouet and Normand, 2003). On the basis of a several-year research study in the Lille académie, Lise Demailly (2003) highlighted a resistance of teachers towards these tests, which they justified by corporatist reasons (fear of being evaluated themselves through these tests and losing their pedagogical autonomy), by general values (the refusal of technologies coming from the private sector and leading to unfair evaluations) and by a global distrust towards statistics.
As far as head teachers were concerned, various works illustrated that while they recognised that outcomes-based steering was now part of their mission and that these tests might increase their possible influence on teachers, they also strongly criticised the bureaucracy that they implied and their problematic insertion in the everyday professional relations with teachers and parents (Barrère, 2009). Hence, the integration of these tests in school projects was very uneven from one school to another (Verdière, 2001). Agnès van Zanten (2001) even argued that when these tests were used, it was not to support a pedagogical project but rather to maintain a good reputation (in the case of top performing schools) or to fabricate a good image (in the case of unfairly stigmatised schools). Concerning local authorities, while some municipalities actively used diagnosis assessments as an observation tool to bypass the traditional educational experts and assert themselves as legitimate interlocutors on pedagogical issues (Dutercq, 2000; van Zanten, 2001), their use remained very uneven from one territory and policy level to another. Lastly, concerning académies, these tests were sometimes used as steering indicators, whereas they were not conceived for that purpose, and this instrumentalisation was criticised both by the DEPP and by the general inspectors (IGEN-IGAENR, 2005).
Consequently, the years from 2000 were characterised by many political uncertainties concerning the future of the SEPA, which were illustrated both by the numerous reorganisations of existing tools and by the inflation of new short-lived devices. In 2001, for instance, the diagnosis assessment in “seconde” (year 15) was abandoned and a new one was launched in “5ème” (year 12). A new test of that kind in CE1 (year 7) was experimented in 2004–2005 within a global policy programme for reading. It was re-experimented the year after officially to better capture pupils’ difficulties, and it was generalised in 2006–2007 before being abandoned in 2008. The test in CM2 (year 10) launched in 2005 in the continuation of a new education act was stopped this year too. Lastly, the former tests in CE2 (year 8) were abandoned in 2006.
Concerning record assessments, a new cycle entitled CEDRE 7 was implemented in 2003 to evaluate pupils’ competencies at the end of primary and secondary education in various disciples. Based on national samples, their methodology was close to that of international comparisons and took better into account international research in psychometrics. Nevertheless, they were not conceived to inform at the same time some expected fields of the new budgetary documents introduced in 2006 by the LOLF, 8 so that the ministry had to create other ad hoc sample-based tests to fill these documents. Experimented in 2005–2006, they were generalised in 2007, initially as a temporary device.
The 1997–2007 period was thus characterised both by a strong politicisation of the testing policy – visible for instance in the uncertainties introduced by Allègre’s ministry and in the inflation of new devices in the years after – and by a weak rationalisation of the numerous SEPA available, whose lifetimes were generally short. This double movement paved the way for an ambivalent statisation process, in which the SEPA devices were more and more defined according to the state’s (ever-changing) needs, but with a decreasing regulatory power over professionals and bureaucrats.
Political instrumentalisations (2008–2012)
The instability of the SEPA went on between 2008 and 2012 because of political instrumentalisations of these tools by policy actors. In November 2007, the minister Xavier Darcos (2007–2009) announced the creation of two national exhaustive tests in CE1 (year 7) and CM2 (year 10). These new tests were expected to measure pupils’ achievement, to provide parents with feedback on their child, to be an indicator of teaching efficiency and to estimate the school system efficiency. Initially, these new tests were supposed to solve various problems of the past. For the first time, they were explicitly linked with an on-going curricular reform (whereas before, tests were conceived a posteriori). They would make it possible to improve the statistical database in primary education, which had met several difficulties in the past (strikes by primary school leaders who refused to fill in statistical documents, contestation of a former intrusive database on pupils, etc.), and so to improve the LOLF indicators. Lastly, they were expected to bridge the gap between, on the one hand, former record assessments, which were conceived to steer the system but only at the national level, and, on the other, diagnosis assessments, which enabled statistical analyses at various levels since they were exhaustive, but not in a steering perspective since their purpose was different.
These new tests introduced three main breaks in the testing policy trajectory. Firstly, they signalled the end of the former emblematic “CE2-6ème assessments” design since the responsibility of the implementation of the tests in “6ème” was transferred to the académies in 2008. Secondly, they constituted a new form of test – neither diagnostic nor summative – whose conception was far from traditional psychometric canons and closer to the immediate political needs of the ministry. Symbolically, their conception was not given to the DEPP but to another central department (the DGESCO), whose mission is to implement the ministry policy. Thirdly, they provoked for the first time a burning and durable controversy within the education field between the ministry and its partners that went on until the next presidential elections (Dutercq and Lanéelle, 2013; Pons, 2010b).
This controversy developed in three main stages. From October 2008 to May 2011, it was essentially fuelled by methodological considerations. These new tests were criticised for being created discretionarily by the ministry cabinet, announced in haste without any preliminary consultation of professional organisations, conceived according to overly binary logics of correction (pupils know or do not) and implemented at a poor time in the school year. For instance, the test in CM2 (year 10) was supposed to be organised in February, which is too late to help teachers struggling against pupils’ difficulties identified by the test and too early to constitute a summative assessment of the whole school year. From May 2011 to December 2011, these methodological elements did not disappear, but opponents combined them with criticisms of other SEPA devices. The announcement by the government that another exhaustive test in “5ème” (year 12) would be piloted in September 2012 was received by teachers’ unions as an illustration that the ministry did not take into account their former methodological criticisms. In September 2011, the High Council of Education published a report in which it strongly criticised the evaluation tools that were used to measure the degree of pupils’ competencies in the LOLF documents. These criticisms found a new echo from January 2012 with the official opening of the presidential campaign. While the Right in power regularly reaffirmed its policy and even published a circular organising the next school year several days before the first round of the elections, the Left rapidly reassured teachers by confirming that it would abrogate the CE1–CM2 assessment device after the election.
Consequently, between 2008 and 2012, the politicisation of the testing policy clearly increased, since the interventions of political leaders not only created uncertainty about the future of the existing SEPA as before, but also materialised in explicit attempts to instrumentalise these tools for specific political and institutional purposes. Therefore, the rationalisation of the SEPA devices was particularly weak and even negative in a way. The trajectory of the testing policy can still be regarded as a statisation process if we also take into account in this process the resistances that state initiatives can bring about.
Political withdrawal and technical stabilisation (2012–2017)
From 2012, the testing policy was clearly put in the background by political leaders. The position of the new minister Vincent Peillon (2012–2014), who announced in a press conference in August 2012 that he was abrogating the tests in CE1 and CM2 and who reintroduced the distinction between diagnosis and record assessments, did not pave the way for the implementation of a new device. The wide consultation organised by the Left after the presidential elections, entitled “the Refoundation”, the aim of which was to prepare the Act of 2013, did not give rise to many discussions on the SEPA except to reassert that the choices made in the former period were poor ones. Significantly, the new national council for the evaluation of the school system (CNESCO) created by this Act did not specifically address this question during the quinquennium and no new ad hoc SEPA device was created.
This political cooling allowed the DEPP to improve the stability and the cumulativeness of its evaluation tools. New sample-based assessments were conceived in 2014 to improve the evaluation of pupils’ basic competencies in order to better inform the LOLF budgetary documents. The CEDRE device was consolidated, with a regular record assessment on a specific discipline each year. This allowed the DEPP to develop diachronic assessment systems. The latter rest not only on LOLF indicators and CEDRE assessments, but also on the repetition of the 1987 survey that we mentioned above and on the exploitation of specific panels of pupils who are periodically tested. These initiatives were also opportunities to improve the methodological foundations of the SEPA and to better incorporate the latest findings of international research, an issue illustrated by the DEPP (Rocher and Simonis-Sueur, 2015).
Nevertheless, this political cooling did not favour the development of a testing policy at policy levels other than the national one, for instance in the académies. Our collective research on the accountability policies in three major académies confirmed that the only success indicators that these regional authorities consider for their governance or their institutional dialogue with the central administration are the success rates for national exams. The short questionnaire that we sent to the statistical departments of 30 académies confirmed that the implementation of regional tests, as has been the case in the académie of Caen at the 6ème level since 2011, is an exception. Most académies simply do not implement such tests even if some of them sometimes conceive ad hoc studies on specific topics.
In that context, it remains difficult to appreciate the effects of this testing policy orientation on the actors of the school system. Another study of the French policy debate in education confirms that the LOLF is very little debated and that this debate tends to focus on budgetary considerations and not on the question of evaluation (Pons, 2017). The CEDRE device seems to be relatively consensual, which does not prevent policy actors from instrumentalising it for the sake of their own policy stances.
This period of low politicisation of the testing policy undoubtedly favoured a strong rationalisation of the SEPA devices but also a reduction of their importance at various levels. This led to a narrow statisation in which the testing policy is essentially defined according to the immediate cognitive and institutional needs of the state central power.
Testing as a statisation process
Finally, the trajectory of the French testing policy can be regarded as a succession of five different policy configurations, which are synthesised in Table 2. If these configurations more or less encouraged the politicisation of this policy and the rationalisation of the SEPA devices, they all drew a statisation process that took different forms according to the periods. It started with the internalisation of technical competencies that were developed beyond the state sphere. It went on with the imposition of specific categories and processes and with an increasing integration of the state’s own needs in the conception and objectives of the SEPA, and then led to the contestation of these devices and to their concentration on the needs of the central state only.
Table 2. Policy configurations: a final synoptic view.
Nevertheless, this statisation process does not imply that only one logic of accountability was at stake during the whole period. Soft accountability is relatively constant: the SEPA were never linked in France with hard institutional, financial or administrative consequences. Yet, this logic has been progressively challenged since the mid-2000s with the LOLF and the Right’s policy between 2007 and 2012. In addition, reflexive accountability through the SEPA was central during two decades (1980s and 1990s) but not really before and far less after. It is even surprising to notice that, finally, this trajectory has resulted in recent years in a situation in which there is neither reflexive accountability (since the SEPA are defined essentially for the purposes of the central power) nor, strictly speaking, soft accountability (since the SEPA constitute some indicators of the LOLF).
Learning from long-term policy trajectories
We chose to study France because it was a priori a most likely case of development of a testing policy. In contrast, the analysis of the long-term trajectory of the testing policy in this school system shows that, in the end, this policy deviates in several ways from international standards: it is implemented irregularly over time according to the configurations and the unequal interest of political leaders; it is not always linked to an overall policy of accountability; and it does not inscribe itself easily into a linear scheme of methodological rationalisation that would see the tools implemented moving continuously closer to the latest advances in international research (only certain assessment tools are included in this scheme, in particular “record assessments”). What theoretical lessons can we draw from the refutation of this highly probable case through a long-term policy trajectory analysis? We propose three as a final theoretical discussion (rather than a conclusion).
This trajectory analysis makes it possible to take much more account of the historicity of public action, that is, both the capacity of history to influence current policy choices and the selective mobilisation of past experiences by the actors in the policy process to impose their vision of the problems to be resolved (Laborier and Trom, 2003). This consideration invites the researcher to show a salutary critical distance from the many “turns” that are sometimes associated in the international literature with the implementation of tests, whether international or not (performance, comparative, quality or topological turns, for instance). 9 A testing policy is inevitably part of a long history, a trajectory that conditions the possibility and the very forms of not only its implementation, but also its effects. In the case of France, contrary to a rapid inference that the implementation of tests would be a sign of the country’s alignment with a global (testing) culture, of its inevitable imprisonment in a global panopticism or of a neo-liberalisation of schools, the trajectory of the testing policy, like that of accountability in this country more generally, remains a neostatist one, leading to a strengthening of the power of the state (Maroy and Pons, 2019).
Moreover, the entry through the successive configurations makes it possible to grasp, on the theoretical as well as the methodological level, the construction of local sites that are more or less context generative when they are exposed to global imperatives (Lingard, 2006). The idea here is not to enclose France in the image of the Gallic village still resisting the invading forces of globalisation, but rather to show that the succession of configurations studied in this article produces a strongly generative political and institutional context, conducive to a vernacular globalisation of the French school system.
Nor does this last remark mean that the globalisation or Europeanisation of educational policies at work through the development of testing, and through the networks of experts who carry it, has no effect on French testing policy. The surveys show several examples of one-off borrowings: the “record assessments” have gradually been aligned with the item response models used in international surveys, the Right justified the new assessments implemented in primary education in 2008 by France’s poor results in international surveys, etc. However, the long-term trajectory approach invites us to grasp the nature of this effect in greater detail. In the French case, exposure to a new world testing culture did not result in a profound transformation of the modes of governance of the school system (for example, towards a greater focus on student performance), but it did provide technical and political opportunities, among others, that actors caught up in domestic policy configurations could seize (or not, or in different ways) depending on the context.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship and/or publication of this article.
