Abstract
Developing summative assessment literacy for valid instruction is a qualification requirement for language teachers in Sweden. Yet novice teachers may be unprepared for how to implement regulations in practice. They may even experience a reality shock when facing large classes in which knowledge levels and motivation vary substantially or when adjusting to entrenched assessment cultures in schools. Previous research has measured summative assessment literacy with several theoretical criteria, but no studies have focused on the alignment between regulations and teachers’ perspectives on summative assessment in Sweden. Hence, this investigation addressed 15 novice language teachers’ experiences from on-campus courses, teaching practice and current workplaces. Data were collected from graduates of a Swedish university, who are working at schools in eastern central Sweden. In view of regulations and standards, the findings were then analyzed. Deliberative curriculum theory and teacher cognition constituted the conceptual framework, while constructivist grounded theory provided a methodology for analyzing the qualitative data produced. The findings show that instruction was inadequate regarding aspects such as deliberation and test construction in teacher education. The novice language teachers appreciated teaching practice. However, they argued that the quality of mentoring could vary considerably and recommended that courses devoted to assessment for bridging theory, regulations, recommendations and practice as well as stricter, structured guidelines for teaching practice should be implemented. The main conclusion is that criterion-referenced standards require early timing and ample space for instruction devoted to summative assessment in teacher education.
Plain language summary
Being competent when dealing with test construction, testing and grading in light of assessment research and regulations is imperative for language teachers. However, novice teachers may lack the advanced competence for this skill, named summative assessment literacy, which is outlined in regulations for university and school. Initial teacher education is accountable for providing adequate instruction in various areas that concern both theory and practice with regard to teaching, assessing and learning. This is also important for continuous professional development. Charting novice language teachers’ perspective on opportunities and challenges in teacher education can therefore shed light on what should be addressed and improved in professional education. In the present study, interview conversations were conducted with 15 novice language teachers at various schools located in eastern central Sweden. In general, the findings show that the novice teachers would welcome more focus on instruction in test construction, cooperative summative assessment or deliberation for teachers as well as how to implement assessment research and regulations in practice. Moreover, they would welcome more coherent policies for instruction in summative assessment during teaching practice in school. The findings suggest a dire need for providing sufficient space and time for instruction in summative assessment early on and continuously in teacher education. Some proposals for improvement are included.
Introduction
Swedish educational standards make demands on alignment between targets, activities and assessment (Bonnevier et al., 2017; SKOLFS, 2018; Skolverket, 2021). This is a crucial approach in criterion-referenced assessment and testing or “the determination of the characteristics of student performance with respect to specified standards” (Glaser, 1963, p. 519). However, interpretive perspectives on “curriculum alignment and coherence” diverge in Swedish schools and implementing strategies for testing requires extensive theoretical and practical skills (Lindberg et al., 2018; Molander & Hamza, 2021; Sundberg, 2022, p. 82). Adherence to fundamental standards such as voluntary guidelines in the Common European Framework of Reference for Languages (CEFR) (Council of Europe, 2001) can also vary. Research has shown that there may be challenges involved in the implementation process, particularly at tertiary level (Baldwin & Apelgren, 2018). This can result in “uneven scoring of students’ assessments” across the country, when teachers act as assessors, and prompts a need for professional development (Pont et al., 2015, pp. 52–53). In this light, novice teachers face a number of challenges when entering the profession. A logical inference is that courses in teacher education must provide student teachers with adequate summative assessment literacy (SAL) to cope with multifaceted aspects, such as standards for aligned instruction, deliberation, and test design.
Hence, addressing standards in school policies alongside practitioners’ perspectives on summative assessment (SA) provides a relevant line of inquiry for consideration. As Fulcher (2016) argued, transnational standards must be applicable for local education. This issue—how regulations for education are interpreted, employed and enforced in various parts of the world—should engage any number of stakeholders. A qualitative investigation of Nordic teachers’ perspectives can therefore provide some additional dimensions when evaluating the implementation of standards.
This research aims to investigate how Swedish school policies and European recommendations for SA relate to novice language teachers’ local perspectives. Formative aspects of these guidelines are largely beyond the scope of the present paper. However, several of the responses from the teachers included such references. Two research questions guided the investigation:
What do novice teachers perceive of developing summative assessment literacy in relation to school policies and recommendations?
What challenges do novice teachers report in relation to their summative assessment practices?
This investigation juxtaposes formal discourse published by authorities with in-service teachers’ informal, personal statements or assertions. Thus, guidelines serve as standards when interpreting the teachers’ perceptions. Such an approach necessitates the inclusion of theoretical findings regarding SAL. This is controversial in traditional grounded theory, but in constructivist grounded theory (CGT, see Methods) a theoretical framework can locate “the specific argument” that authors wish to make (Charmaz, 2014, p. 311).
Previous Research
On the one hand, previous research on novice language teachers’ perspectives on SA vis-à-vis standards for education in Sweden and beyond is scarce. Nevertheless, four ancillary mixed-methods studies within the project entitled Developing summative assessment literacy: A longitudinal study of pre-service and novice language teachers charted SAL instruction as well as pre- and in-service teachers’ knowledge base of SA in theory and practice (Hilden et al., 2022, 2024; Yildirim, Oscarson et al., 2024; Yildirim, Stjernkvist et al., 2024). Further examples that focus either on SA or on standards are Edwards’s (2017) analytical rubric of SAL for science teachers in New Zealand and Cizek’s (2020) study of American standards for testing and validity in the sense of “understanding test score meaning and justifying test score use” (p. i). Moreover, Papageorgiou (2016) mapped opportunities and challenges regarding alignment to standards such as the CEFR and Dimova et al. (2022) investigated standards and practices for local language testing. In addition, an interview article and a research review advocated teacher autonomy, professional development as well as a sharper focus on SA in Swedish teacher education (Fröjdendahl, 2018; Lundahl et al., 2015).
On the other hand, research on assessment literacy is extensive. Stiggins (1991) introduced the term and drew attention to the field. At the time, he argued that instruction in “assessing learning outcomes” and “standards for high-quality assessment” should be introduced in the US (pp. 534, 539). Then, the members of the Assessment Reform Group (1989–2010) advocated structured standards of classroom assessment in theory and practice (see e.g., Black & Wiliam, 1998; Gardner et al., 2008). Basically, they reacted against the prevailing culture named teaching-to-the-test in the UK and focused on progressive formative assessment that would increase learner autonomy. Integrating formative assessment and SA has later been advocated by, for instance, Looney (2011) and Lau (2016).
Subsequently, the language assessment literacy (LAL) aspect was introduced concerning both standards and what “language teachers and testers need to learn, unlearn and relearn” (Coombe et al., 2020, n.p.; Fulcher, 2012; Inbar-Lourie, 2013). In a mixed-methods study, Vogt and Tsagari (2014) investigated European schoolteachers’ LAL and detected a need for more comprehensive training in assessment and testing. Moreover, Bøhn and Tsagari (2021) examined Norwegian instructors’ perspectives on LAL in teacher education, while Tsagari gauged such literacy among teachers of English and proposed a number of improvements for local instruction.
Furthermore, Levi and Inbar-Lourie (2020) underlined some prominent characteristics of language education and assessment today: “migration, globalization and technological innovation [or] ubiquitous technology (u-learning)” (p. 170). The authors argued that these multifaceted and complex phenomena affect the field of language assessment to the extent that “a reconceptualization of the assessment knowledge base or LAL of prospective and practicing teachers” is required (Levi & Inbar-Lourie, 2020, p. 171). Further studies, such as Bachman (2014), drew attention to how such circumstances can affect language test construction. In a similar vein, Fulcher and Harding (2021) advocated education that caters for contemporary conditions, while DeLuca et al. (2018) highlighted the assessment turn in which target-oriented processes interact in “the accountability and standards-based movement” (p. 356). In this reform, alignment between assessment strategies affects contemporary educational perspectives in terms of regulations, theories and practice.
Similarly, “instructional alignment” (S. A. Cohen, 1987) or congruence between targets, instruction, activities and SA will guide the present investigation. Co-constructed “interview conversations” (Charmaz, 2014, p. 59) are then introduced to examine whether intentions in official standards correspond with novice language teachers’ perceptions of, and challenges for, SA in their professional role as teachers and assessors in Sweden. This approach can fill a research gap in the field of LAL.
Theoretical and Conceptual Framework
Deliberative curriculum theory and teacher cognition serve as the conceptual framework for this investigation.
Deliberative Curriculum Theory
Deliberative curriculum theory advocates local interpretation of policies. Here, deliberation is defined as a “rather complex, fluid, transactional discipline” that can address situational opportunities and challenges in an organic, multifaceted education field (Schwab, 2013, p. 595). An early proponent of the term deliberation from the late 1960s onward, Schwab critiqued a reliance on abstract “theoretical constructions” when addressing “actual problems of teaching and learning” (Schwab, 2013, pp. 592–593). Instead, he introduced a practical method, which revolves around deliberating on solutions for local and specific issues in education. This approach includes the “eclectic” or a recognition of theory for the sake of deliberation but in which the practical concerns of schooling may adjust “certain weaknesses of theory as a ground for decision,” if necessary (Schwab, 2013, p. 599). For him, “a defensible curriculum” will have to consider a plethora of stakeholders, conditions and issues in society, since “in practice, they constitute one complex, organic agency” (Schwab, 2013, pp. 608–609). Along the same lines, Reid (2006) advocated deliberation as a channel for pursuit and for taking responsibility in an active engagement with the world (p. 1). Teachers are of central significance in this pursuit, but the responsibility extends to “civic interest” and, hence, further groups, such as students and caregivers, need to be involved. This is even of central importance, since “the processes by which curriculum problems find their solutions depend on the creation of a focus of interest deriving from the activity of a community” (Reid, 2006, p. 72). Hence, deliberation is a democratic endeavor.
Turning to research on curriculum theory in Sweden, the deliberative approach matches the mode of writing in, and the essence of, decentralized, democratic and egalitarian regulations for education (Englund, 2000, 2015; SFS, 2010:800, Ch. 1, sections 4 & 5). However, researchers have both advocated deliberation and critiqued the practical implementation of this curriculum philosophy in the country (Englund, 2022; Sundberg, 2022; Wahlström & Sundberg, 2017).
Teacher Cognition
Teacher cognition can be defined as a general “construct” that aims to detect “unobservable dimensions of teaching” (Borg, personal communication, 17 January, 2022). In other words, it can reveal “unseen influences […] internal to the teacher, such as their beliefs, knowledge, feelings, perceptions, attitudes, and thoughts” (Borg, 2019, p. 1150). Similarly, the present investigation offered space for the participants—both for the interviewer and the interviewees—in various ways to articulate, or even to become aware of, circumstances in light of requirements and expectations. After all, the novice schoolteachers in this investigation were in “transition from student of teaching to teacher of students” (Ovens et al., 2016, p. 353). At times, this transition for newly qualified teachers can be defined as a state of limbo, which involves unforeseen challenges when planning for, conducting, making decisions and providing evidence for decisions about SA. In this investigation, teacher cognition gives shape to the data collection and provides a fundamental complement to the theoretical and methodological framework. It functions as a channel for detecting teachers’ SA perspectives as well as how they can align with regulations and practice.
Furthermore, exploring their perspectives on SA and on the role of teacher education as well as on what remains to be learned is important for understanding and defining specific preferences, needs or even frustrations that novice teachers may experience. As Borg stated, individuals’ perspectives may be subject to change for various reasons over time, but for researchers and teacher educators this approach can nevertheless lead to valuable insights about individual language teachers’ “tacit, personally-held, practical system[s] of mental constructs” (Borg, 2015, p. 40).
Standards for Summative Assessment
The documents in the present paper include the Education Act (SFS, 2010:800), curricula for lower (years 7–9) and upper secondary school (Skolverket, 2011, 2022) as well as relevant support materials, such as general recommendations for grading by the Swedish National Agency for Education (SKOLFS, 2022:417). Moreover, the CEFR and the Companion volume as well as a handbook for implementation are included (Beacco et al., 2011; Council of Europe, 2001, 2020; Piccardo et al., 2011).
Education Act and Curricula
General instructions about SA in the Education Act and curricula, developed by the Swedish National Agency for Education, for lower and upper secondary schools to date are concise. In lower secondary school learners need guidance for developing learner autonomy (Skolverket, 2022, Ch. 2, section 7), while in upper secondary school, learners are expected to:
take responsibility for their learning and study results, and
assess their study results and need for development in relation to the requirements of education. (Skolverket, 2011, Ch. 2, section 5)
Instructions like these invite local stakeholders to deliberate on how to implement aligned strategies for testing and grading. Aims, content and grading criteria are listed in the syllabi for each language subject. Teachers are then expected to develop assessment procedures in the various municipalities and schools. They are responsible for informing students about principles for grading, awarding grades based on holistic assessments for each subject in question, and providing reasons for their decisions, while headteachers are held accountable for valid grading (SFS, 2010:800, Ch. 3, sections 14, 15, 16, & 17).
European Recommendations for Standards-Based Assessment
CEFR is a recommended standard for language assessment “across the curriculum” with descriptors for local benchmarking (Council of Europe, 2020, p. 14). For example, a support project aimed at bridging the gap between language policies and classroom practice. One chapter addresses assessment as an “integral part of language teaching and learning, not merely a final step in the process nor just a judgment about an activity accomplished” (Piccardo et al., 2011, p. 42). The report outlines aspects of continuous assessment and how teachers and students share responsibility for the implementation of such strategies. This can empower both stakeholders and facilitate “transparency of the whole process” (Piccardo et al., 2011, p. 45).
Turning to the original document, CEFR, and the new descriptors, there are recommendations for various approaches to aligned analytic and holistic assessment strategies. This framework can be used for:
the specification of the content of tests and examinations.
stating the criteria for the attainment of a learning objective, both in relation to the assessment of a particular spoken or written performance, and in relation to continuous teacher-, peer- or self-assessment.
describing the levels of proficiency in existing tests and examinations thus enabling comparisons to be made across different systems of qualifications. (Council of Europe, 2001, p. 19)
Hence, CEFR aims to guide teachers and further professionals in target-oriented language teaching and test design. Chapter 9 is devoted exclusively to assessment and the terms that follow. “Feasibility” defines the level of good judgments with regard to scope and timing in analytic or holistic assessment, “validity” reflects the quality of the alignment with targets and what the evaluation of the test may reveal over time, while “reliability” makes demands on the consistency of the assessment; in this context, the level of “accuracy of decisions made in relation to a standard” points to the framework as a reference point (Council of Europe, 2001, p. 177, emphasis in original). Then, the new descriptors can be of assistance when “planning backwards from learners’ real-life communicative needs, with consequent alignment between curriculum, teaching and assessment” for local preference and relevance (Council of Europe, 2020, p. 29).
Methods
This is a qualitative, ancillary investigation to a larger mixed-methods research project that addresses pre-service and novice language teachers’ SAL. As mentioned previously, this investigation relies on analyses of official documents and on findings from conversations with newly qualified language teachers. Employing “purposive sampling,” relevant documents were consulted, and invitations were sent to novice language teachers via e-mail (L. Cohen et al., 2018, p. 218). The teachers who responded were provided with open-ended questions prior to the conversations.
Participants
Fifteen participants, 10 female and five male novice language teachers, took part in 12 interview conversations and, in one case, submitted a written reflection. In terms of education programs, they had graduated from:
the Bridging teacher education program (BTEP) 1.5 years,
the Regular one, 5.5 years (Swedish as major) or
the Regular one, 5 years (further languages).
The participants worked as lower- and/or upper secondary schoolteachers of Swedish, Swedish as a second language (SSL), English, Modern languages and, in one case, as a native language teacher of Finnish. In this investigation, the conversations were held individually and in one focus group at schools in eastern central Sweden. A separate conversation with a teacher who could not attend the focus-group session and a written reflection from a teacher who was unable to attend an individual interview because of the Covid-19 pandemic were also included. Combining analyses of standards and of semi-structured interview conversations enabled the authors to detect both the novice language teachers’ learning perspectives on SA and their educational attainment level regarding SAL. Table 1 introduces the participants and languages in their degrees. The teachers were given names by the letters of the alphabet in the same order as the interview conversations were conducted.
Table 1. Participants.
Data Collection and Analysis
CGT was employed to provide a thematic analysis of elicited “public documents” and of co-constructed “interview conversations” about SA (Charmaz, 2014, p. 59). The documents and interviews were analyzed as “texts” in a progression from general “observation” of the material to a “focused, selective phase” of coding regarding standards and perspectives on SA (Charmaz, 2014, pp. 45, 113; L. Cohen et al., 2018, p. 551). Regulations may be “far removed from […] interviews,” but they are nevertheless expected to interact and can be juxtaposed when examining to what extent they converge or diverge (Charmaz, 2014, p. 45).
Furthermore, triangulation, or what Charmaz called “obtaining rich data,” by combining documents and interpretations of interview conversations can enhance the validity or trustworthiness of qualitative enquiries, but at the same time there is a subjective angle to consider, since researchers in CGT “construct […] data through [collected] observations, interactions, and materials” (Charmaz, 2014, pp. 3, 23). Hence, the research paradigm in this investigation is interpretive and exploratory (L. Cohen et al., 2018, pp. 19–20, 643–647). It accepts the contention that “objectivity is refracted through the researcher’s eyes” but still acknowledges the notion that adhering to what the data may convey reduces bias (L. Cohen et al., 2018, p. 25). Interview conversations may be adjusted depending on how the discourse evolves and can lead to “new insights for both the researcher and the participant” (Charmaz, 2014, p. 100). Engaging novice teachers and teacher educators in awareness-raising conversations may even generate constructive solutions for the parties involved.
This investigation goes beyond demands for pure induction in grounded theory. Previously, critics have raised issues about “presuppositions and prescriptions” (Charmaz, 2014, p. 242; Goldkuhl & Cronholm, 2010; Themelis et al., 2022). Thornberg (2012) even argued that the “grounded theorist has to accept the impossibility of pure induction and at the same time recognize the analytical power of the constant interplay between induction (in which he or she is never tabula rasa) and abduction” (pp. 247–248). In this context, abduction in the Peircean sense represents a dynamic process that may lead to the detection of unanticipated insights, inferences and hypotheses when analyzing the data. Abduction also allowed the authors of this investigation to employ a flexible, interactive approach to the selected texts. Still, a theoretical framework was included for the sake of scope, delimitations and clarity.
In CGT, the initial phase includes analyzing findings from the data by constructing codes. In this investigation, “in vivo codes” were constructed from patterns detected in the written and oral texts (Charmaz, 2014, p. 343, emphasis in original). Themes in the larger project revolved around theories and recommendations for implementing SA in school as well as general components of SAL (see Hilden et al., 2022, 2024; Yildirim, Stjernkvist et al., 2024), whereas five themes crystallized in the interview conversations:
summative assessment in teacher education
summative assessment in school
assessment cultures in schools
deliberation in testing and
test construction and the role of national tests in schools.
Using NVivo, the authors formulated these themes during the course of the investigation, since they proved to be significant in the conversations and reflected the participants’ perspectives or concerns. Here, the researchers’ focus emanated from “abductive reasoning” (Thornberg, 2012, p. 247) based on standards, research and novice language teachers’ perspectives on SA for the school subjects concerned. In other words, an “iterative approach” was employed, which involves a circuitous or non-linear and progressive process, in the analysis of data (Kennedy & Thornberg, 2018, p. 49; Themelis et al., 2022). Having reported the main findings, the authors then formed hypotheses (see Discussion and conclusions) that can have implications for SA instruction (Thornberg & Forslund Frykedal, 2019).
Findings
Findings of the interview conversations are presented in line with the five thematic areas used as headings in the tables below. These areas crystallized through issues raised by the interviewer and the participants. They cover the research questions about novice teachers’ perceptions of developing SAL as well as challenges in their SA practices. Aspects dealing with standards will be touched on below but will be given further attention in Discussion and conclusions.
The responses of novice teachers showed limitations of advanced knowledge of SAL and of theoretical terminology. Table 2 presents some possible explanations for this predicament in teacher education. A2 argued that on-campus courses could improve in terms of bridging theory and practice. Another novice language teacher, H2, complained about the limited focus on SAL in language education—both on campus and during teaching practice—and found legally defensible testing extremely challenging. During the conversation, H2 had to be reminded of how the various assessments function individually and how they are supposed to interact. Still, when the interviewer reminded her about how SA, or the target of learning, should be preceded by relevant formative strategies, this information refreshed her memory. During the remainder of the interview conversation, however, she struggled to recall and provide correct definitions of terminology such as validity, alignment, transparency, and accountability.
Table 2. Perceptions of Summative Assessment in Teacher Education.
Turning to teaching practice and test construction, a lack of validity in teacher education became apparent. To some extent during teaching practice, A2 said she could learn about assessment and grading of national tests. Still, this experiential learning “came to her by chance” and there was no systematic and structured instruction either on campus or during teaching practice in Swedish. A mandatory course on SA would “really have been brilliant,” she stressed. C2, D2 and M2 reasoned along similar lines. C2 found it unfortunate that test design was not covered but added that teaching practice provided the best learning opportunities for SAL. D2 exclaimed that there could be extreme differences between the various mentors’ SAL in schools and that a comprehensive course in assessment therefore would be a valuable addition to teacher education.
Moreover, I2 advocated the inclusion of fixed standards in on-campus courses, since the quality, or rather the understanding, of what is required by mentors and student teachers varied substantially during teaching practice. M2 asked for a course “dedicated explicitly” to SA. The few seminars on campus that dwelled on learning goals were inadequate, he thought, since they tended to focus on discussions about lesson planning or tasks rather than on grading. He appreciated the fact that the extensive field of formative assessment required a “fair amount of time” but had noted a disturbing imbalance between the attention given to this field compared with SA. One of the participants in the focus-group conversation, L2, even thought that the literature and proponents of formative approaches in school gave him the impression that such approaches are incompatible with SA or “perhaps rather” that the former approach is to be recommended in order to avoid grade anxiety and stress.
Furthermore, as a qualified language teacher, A2 experienced a reality shock at school due to lack of preparation: “there is such an emphasis on assessment in school,” she added. L2 noted that several of her fellow students were bewildered by what they regarded as a gap between studying on campus vis-à-vis teaching practice. Nevertheless, she acknowledged that bridging or marrying theory to practice can be a serious challenge for teacher education.
Some novice language teachers addressed further complexities that they faced in school. E2 stressed that each class of students is unique and that theories have limitations against the background of diversity in education. When the interviewer then asked questions about what could be improved in teacher education as regards SAL, E2 mentioned that learning about theory, structure, and planning was useful, but the course literature did not prepare her for what she faced in school after graduation.
As for K2, she stated that her placement in her third and final teaching practice was a stroke of good fortune but this was less so during the previous practicums. Nevertheless, as a novice teacher she was overwhelmed when having to assess the achievements of students with heterogeneous knowledge levels of language in classrooms with more than 30 teenagers. Teaching as such was no problem, but fairness in assessment may be difficult to maintain in circumstances like these, she added. K2 asserted that teacher education should introduce a practical university course based on theories and relevant policies on formative and SA.
Common denominators in the responses about SA in school had to do with practical concerns of schooling. Some illustrative samples are provided in Table 3. The participants addressed challenges either with assessment and grading in English or SSL for newly arrived migrants and refugees with extremely low proficiency or with the transition from teacher education to high achievers. D2 would welcome distinct guidelines and “clear analytic rubrics for each aspect of writing and for various genres” from the Swedish National Agency for Education. I2 also struggled with the interpretation of knowledge requirements. An autodidact, he said he learned how to categorize and structure the goals in time.
Table 3. Perceptions of Summative Assessment in School.
B2 contrasted her previous experiences of summative strategies abroad with how she had embraced contemporary formative approaches to learning in Sweden. Speaking of her own teaching philosophy, B2 had come to understand the relevance of alignment, validity, transparency, authenticity, and peer assessment. She argued that SA “wings the whole thing together” and that her responsibility as a language teacher was to guide the students in formative and summative processes toward the targets. Her final words and self-evaluation included confidence and hope: “I think I’m succeeding to help them value the process, we’re getting there.” E2 presented an alternative perspective on SA, since formative approaches, including progressive documentation, were the preferred norm in her communicative methods. Well-planned tests in her teaching would also invariably lead from summative examinations to further formative pursuits, she asserted.
As sample responses in Table 4 show, the novice teachers drew attention to contrasting perceptions of SA in schools. C2 mentioned striking contrasts between the assessment culture in an upper secondary school in which he worked previously and at his current workplace in adult education. At the former school, he claimed, the teachers did not trust the students, since they developed several strategies for cheating and so the teachers’ countermove was to give unannounced tests in the classroom. However, at the latter school, the headteacher introduced formative assessment as the prime policy in which transparency would produce an atmosphere of trust between the teaching staff and the students so that summative testing would not come as a surprise for the learners.
Table 4. Perceptions of Assessment Culture in Schools.
For D2, the assessment culture at the English-speaking school in which she was working was well structured thanks to effective leadership. Still, the strategies for languages may differ from those of mathematics for instance, she stressed, since teachers in the latter subject more often employed summative tests. On the one hand, thanks to the academic culture at this school, D2 was familiar with terminology such as holistic and analytic assessment, validity, alignment and transparency:
We [the teaching staff] are dealing with all of those aspects, and we even use the same terminology that you mentioned. Still, we focus mostly on feedback, feed-forward and all that [formative strategies] and also flipped classroom in seminars. (D2)
On the other hand, a lack of self-efficacy was noticeable regarding grading and conveying information about students’ achievements when communicating with caregivers, but she added that “some of my colleagues still feel overwhelmed by such duties, even if they have been working for twenty years.”
F2 focused on her experiences from teaching SSL and Swedish in mixed classes for newly-arrived migrants and refugees and reported further perspectives or dilemmas concerning SA. She faced various challenges when grading tasks produced by students whose knowledge of Swedish was poor. First, she was puzzled by the assessment culture at the school in which she worked. Teaching-to-the-test was the undisputed strategy both among her colleagues and students. Secondly, when co-assessing assignments with senior colleagues, she recalled instances in which the inter-rater reliability could either prove to be high, acceptable or low in collaborations. Thirdly, she experienced a communication breakdown when setting grades in some cases. G2 encountered similar challenges at another school in courses that prepare the newly-arrived students for upper secondary school. The norm was to test skills analytically and systematically and to prevent school failure, which stood in stark contrast to her experiences in teaching practice.
Well, I suppose that the assessment culture at this school can be described as a collective will to assist students in the classroom so that they can pass the tests. Hardly any of the students are aiming for an A [the highest grade] and this is not something we discuss, instead we are trying to reduce the number of F grades [in the school subject SSL]. This is also our goal for English. (G2)
When facing formative strategies, the students failed to take responsibility for their learning—“this is a totally different planet,” she added. Furthermore, at this school there were neither any opportunities for continuing professional development, nor any systematic deliberations concerning assessment among the staff, except for co-assessment.
Table 5 reflects the significance of collegial conversations about testing. Moderation and collaboration—or what can be defined as deliberative approaches to the curriculum—were of crucial importance to novice language teachers such as C2. Together with colleagues, he had also taken initiative for cooperative learning. Conversations about teaching and learning or deliberations regarding grading were an integral part of a typical workday for the teaching staff at his school, C2 underlined. The headteacher actively encouraged such collaboration. Moreover, he said he learned more about how to award grades from fellow teachers in practice than from, what he regarded as, the vague knowledge requirements for schools defined by the Swedish National Agency for Education.
Table 5. Perceptions of Deliberation in Testing.
Similarly, K2 would welcome clear guidelines from the National Agency for Education and from instructors at teacher education about testing. In contrast to C2, however, she did not describe her work environment as a place in which conversations and deliberation about such concerns came naturally. The most challenging aspect for her, she stated, was not teaching, but assessing and grading in the scale A through E for pass and F for fail. At university, she enjoyed taking part in some deliberation with fellow students about grades for the national test, but she had noticed and heard about rather dramatic differences between how teamwork and collaboration, or deliberation, functioned in various schools. Over time, K2 said that she had received some assistance and support from colleagues and headteachers, but she argued that it was rather thanks to being mature in terms of age that she had managed to solve issues to do with testing and grading.
Table 6 shows samples of frequent comments on test construction and national tests. As for test construction, the responses revealed lack of sufficient attention to this area during teacher education. For example, C2 stated that there had been no such instruction, apart from a certain focus on terminology such as validity and reliability. Instead, he relied on the national tests and, what he regarded as, the clear descriptions that accompany them concerning what to test and how to grade. When the interviewer outlined opportunities for learning about test development at university, he stated that such a course would be welcome. Cognitive theories would be particularly useful for achieving a better understanding of learning perspectives, he added.
Table 6. Perceptions of Test Construction and the Role of National Tests in School.
In the conversation with K2, the interviewer asked about her thoughts on the confidential nature of national tests. She argued that lack of transparency in testing can be detrimental and even be interpreted as patronizing toward teachers. Of her own accord K2 had just applied to a course in professional development for test construction and assessment in SSL. Developing tests, she added, takes an inordinate amount of time and “we would really, really need to have access to several reading comprehension tests,” she underlined.
The participants displayed various attitudes to national tests. For E2, the tests enable teachers to adhere to a national standard which “obviously must be relevant for what we are expected to assess,” she surmised. However, H2 presented mixed views on them because of the negative washback involved when spending considerable time and effort on these high-stake tests. Still, they facilitate reasonably effective and reliable grading, which is helpful, H2 thought.
D2 would welcome access to test banks with clear national standards for assessing assignments in school. There were some challenges that she had detected, for instance in connection with online teaching during the recent pandemic, such as plagiarism. Hence, she refrained from employing tests that were available on the Internet. Instead, she sometimes designed minor relevant tests without informing the students in advance. I2 offered an alternative perspective. He was grateful for the theoretical instruction on campus which had enabled him to evaluate ready-made tests and, in simple terms, to be able to produce language tests for school. However, he also lamented the lack of test construction in teacher education. Such skills would have been useful during teaching practice and for having deliberations about testing with fellow students, he suggested.
In the focus-group conversation, L2, two participants emphasized how courses in SSL had provided them with practical skills for testing grammar patterns and similar linguistic concerns, while the instruction in Swedish allegedly lacked a focus on test construction and grading and was instead devoted to formative feedback and feed-forward. Another member of this group, who majored in Swedish, then added that practical teaching methods were on the agenda in her case. Neither formative nor summative practical aspects were raised in the Swedish courses, she stated, while formative assessment and a few aspects to do with test construction were covered in modern languages. Still, she appreciated and advocated for the focus on theoretical terminology at university, since such knowledge can provide a route to practical knowledge. Regarding courses in language education, she valued the scant opportunities for conducting deliberations with fellow students in exercises to do with grading national tests.
In general, the interview conversations revealed common denominators but also disparate experiences and perspectives among the novice language teachers. The lion’s share of the participants gave prominence to teaching practice as the optimal scene for developing their SAL. The third and final teaching practice provided the best opportunities for putting assessment theory into practice, they argued, and assessment and grading were also named specifically in the intended learning outcomes. At the same time, however, they thought that the quality of mentoring and of school placements varied considerably and would welcome stricter and more structured instruction in test construction and deliberation on campus.
Discussion and Conclusions
Teacher education is accountable for providing adequate instruction so that qualified teachers can demonstrate “specialised knowledge about assessment and grading” in the Swedish school system (SFS, 1993:100, Annex 2, para. 4). In the aftermath of school reforms from the 1990s onward, Sweden adopted aligned target-oriented instruction and learner-centered validity (SKOLFS, 2018; Skolverket, 2021), which are ubiquitous reference points for local deliberation (Bonnevier et al., 2017; S. A. Cohen, 1987; Dimova et al., 2022; Sundberg, 2022). There are also transparent instructions for planning and conducting aligned, valid assessment in place (SKOLFS, 2022:417; Skolverket, 2021). Thus, standards for SA in Sweden are formulated to enable deliberations for specific language subjects and contexts, but there are various assessment cultures in schools to consider.
In this investigation, official documents, assessment theories and personal accounts of SA were juxtaposed for the sake of examining how such “texts” match (Charmaz, 2014, p. 45). Probing into the elusive areas of teacher cognition, or “beliefs, knowledge, feelings, perceptions, attitudes, and thoughts” revealed several challenges for the stakeholders (Borg, 2019, p. 1150). General findings indicate that Swedish official regulations for assessment in schools may be aligned with contemporary research on assessment and European recommendations, but the implementation and enforcement should be addressed systematically throughout teacher education and by local management of schools. Two research questions emerged during the course of this research: firstly, how the participants perceived developing SAL in relation to standards and, secondly, what challenges they encountered in practice. Both are addressed below. The five thematic areas listed in Findings provide a structure for this account. Some hypotheses about how to improve SA instruction are included.
The interview conversations revealed a common denominator in the novice language teachers’ perspectives, namely that instruction for advanced SAL can be improved in terms of focus, time, and space. Some teachers appreciated the importance of theoretical studies for their own practical implications, while the majority would welcome stricter and more stringent or comprehensive SA instruction both for pre-service teachers and mentors in teaching practice so that the intended learning outcomes can be met. This is relevant in a country that adheres to target-oriented philosophies in teaching and learning as well as to criterion-referenced standards for assessing, testing and grading. If teacher educators expect schoolteachers to plan for alignment and validity or congruence between targets, instruction, activities, and assessment in their language subjects, then they must ensure that the same applies to their own on-campus instruction and teaching practice. However, more research on teacher education in Sweden should be conducted by asking questions such as why the novice language teachers may not be comfortable in their SAL and what can be done to alleviate the situation. Introducing courses devoted to assessment specifically at initial stages and throughout the teacher education programs is one solution. Developing a solid foundation for assessment literacy for teachers may even have positive consequences for helping students in school to manage exam stress and language anxiety.
Moreover, when commenting on SA in schools, several participants saw teaching practice as optimal for learning about assessment or for marrying theory to practice but added a caveat. The quality of the mentoring was uneven, they argued, and they recommended that the university set more stringent standards for teaching practice, both on campus and in schools. Here, there is a further potential misalignment and threat to validity to be considered. However, the Ministry of Education addressed this problem by introducing trials with teacher training schools in Sweden between the years 2014 and 2019 (SFS, 2014:2). Instructors offered courses in professional development of 7.5 credits to mentors at training schools. These activities are ongoing, which may both lead to improvements for pre-service teachers concerning the quality of mentoring and increase the alignment between on-campus courses and teaching practice in schools.
Furthermore, the participants in this investigation invariably shared their thoughts on how they had to make decisions about SA against the backdrop of varying knowledge and learning curves as well as motivation levels in schools. They were not prepared for how to manage such circumstances regarding teacher-and-student relationships or classroom leadership. The novice teachers were nonplussed by the gap between on-campus courses or theory and teaching practice. Interpreting knowledge requirements proved to be another challenge for them. Encouraging the community to engage in deliberative interactions can provide one solution to these concerns about practical issues in the workplace (Reid, 2006; Schwab, 2013). Extending deliberative skills to students, caregivers and further parties concerned may also solve potential problems to do with learners’ sense of responsibility and study motivation. Above all, learning about how to manage professional deliberation, which includes “eclectic” considerations for standards, theoretical sources and for civic responsibilities, may empower these teachers (Schwab, 2013, pp. 599–602).
Turning to assessment culture in schools, the teachers faced contrasts in terms of approaches to formative and summative strategies. This perspective complies with findings on how local interpretations of the regulations diverge, which may even produce “knowledge segregation [in] high- and low-performing classroom contexts” (Sundberg, 2022, p. 82, emphasis in original). Such challenges can be linked to management culture in schools or to students’ knowledge levels and study motivation, but they may nevertheless include acute issues to do with SAL. For example, in a high-achieving school an academic culture prevailed, which meant that theoretical terminology concerning assessment was employed and implemented. This was empowering for the participant in question, but she admitted nevertheless that her self-efficacy regarding SAL was low. In another, low-achieving school teaching-to-the-test was the norm and in yet another school, the ambition among the teaching staff was to test skills exclusively in summative ways. In the latter case, the students could not be persuaded to take responsibility for their own learning and formative strategies failed for this reason. Ultimately this also suggests that the school staff adhered neither to the Swedish regulations in the curricula nor to research about standards-based assessment and learners’ commitment in the learning process.
Taking part in deliberations about grading with fellow students, instructors and mentors at seminars and during teaching practice was an aspect that the participants thought had improved their SAL the most during their student days. They underlined a further aspect of deliberation, or what they called informal conversations, about assessing and grading as a cherished aspect of the daily activities in school. Still, all participants revealed a sense of insecurity connected with SA also in this context and would welcome clearer national guidelines. It is promising that the teachers were prone to and appreciated deliberation, but the findings also showed that their craving for clarity and simplicity revealed a lack of knowledge and insights about opportunities for local interaction. Their perspectives confirmed that deliberative curriculum theory had not been on the agenda in a systematic fashion when learning about Swedish policy documents and recommendations from Europe. This is corroborated by Englund, who raised concerns about how the deliberative curriculum was not properly implemented in the Swedish school system. He even complained about how the “democratic offensive” of the 1980s and a period characterized by “teachers’ professionalism” and of decentralization had given way to a strong “top-down governance” (Englund, 2015, p. 54). Hence, university-led courses in deliberative curriculum theory with practical exercises and assignments may be useful for improving teacher agency.
Moreover, the participants argued that test construction was missing on the agenda in teacher education. The ones who majored in SSL or in English stated that they had learned about strategies for assessing, testing and grading when using pre-made online tests or when designing assignments and teaching unit plans. Still, the teachers’ self-assessment revealed that their knowledge base of test construction was found wanting. Instead, some maintained that the structure, clarity and comprehensive instructions of the national tests gave them solace. This perspective may however also be problematic for a variety of reasons. National tests in Sweden are aimed at providing teachers with guidelines for grading student achievement and they may be aligned with the general requirements in the curricula, but they can nevertheless fail to cater for varying local needs and diversity across the country. Owing to their design with focus on each of the four skills, the national tests may also prompt undesirable washback effects. According to European recommendations, testing the four skills, listening, reading, speaking and writing separately “has increasingly proved inadequate in capturing the complex reality of communication” (Council of Europe, 2020, p. 33).
National tests in Sweden may even prove to be potential threats to validity, if teachers in their daily routines opt for “continuous assessment,” backward planning for authentic assignments, alignment for “local preference and relevance” or “transparency and coherence between the curriculum, teaching and assessment” (Council of Europe, 2001, p. 185; Council of Europe, 2020, pp. 27, 29). On the one hand, the confidential nature of the national tests in Sweden may have positive effects to do with their reputation as objective indicators of knowledge and skills as well as their status as high stakes for future endeavors. On the other hand, it can impinge on the practical implications for the teachers and test takers, since there is little advance information about the topic. Even if the design of the national tests may be known from samples available on the Internet, there are limited opportunities for preparation. Placing considerable weight on the results from a single, largely non-transparent, language test can also be misdirected in target-oriented education. As Glaser (1963) argued in another context, criterion-referenced standards focus on test takers’ “achievement continuum” and not necessarily on final test results (p. 519; see also Piccardo et al., 2011).
In conclusion, the findings of this investigation suggest that a practical approach to theory, national regulations and recommendations in CEFR, test construction and the art of deliberation should be mandatory in teacher education and for professional development among mentors in teaching practice. In light of their commitment to deliberation, the novice language teachers would be empowered by a course on how to implement SA in their respective language subjects for various circumstances and levels in school. As Bachman (2014), Fulcher and Harding (2021), or Levi and Inbar-Lourie (2020) argued in their respective geographical contexts, there is a need for an expanded knowledge base for pre- and in-service teachers’ SAL owing to the dynamic progression in contemporary assessment research as well as increased population mobility, digital transformation, and ubiquitous learning.
Limitations of the Investigation and Suggestions for Future Research
The present investigation juxtaposed fundamental standards and perspectives on SA in Sweden. It offered space for the participants to articulate, or even to become aware of, circumstances and conditions regarding SAL. The limited number of participants and the fact that they were recruited to participate voluntarily can be a weakness in terms of quantitative representativity. At the same time the cohort in question represented a variety of age groups and several language subjects, even if the lion’s share majored in English. Hence, the participants have met various instructors and mentors both in teacher education and teaching practice.
Moreover, the scope of the investigation emerged from two observations that indicate potential dilemmas. On the one hand, Swedish curricula and further official documents are intended to inform and guide stakeholders across the nation and at the same time they invite individuals to implement local standards for instruction, assessment and learning in accordance with the Education Act. Hence, they should function as frameworks for deliberation in specific local contexts.
On the other hand, perspectives in interviews may be subjective, even if they can reveal individual truths that need attention. This subjectivity concerns both the interviewer and the interviewees. As Charmaz (2014) argued in another context, “people not only invoke [oral discourses] to claim, explain, and maintain, or constrain viewpoints and actions, but also to define and understand what is happening in their worlds” (p. 85). This approach resembles Borg’s teacher cognition as a tool for detecting perspectives that may impinge on language teachers’ daily activities but that may not be known outside a close circle of colleagues, friends and family. In this light, further research on perspectives articulated by professionals involved in teacher education, such as instructors on campus as well as mentors and headteachers in school, would be useful for creating alignment and finding a common ground.
Acknowledgements
We thank the novice language teachers for devoting precious time and space to our investigation.
Author’s Note
The authors of this investigation are established academics in teacher education, but they were not involved as university instructors in this project. All of the authors contributed to design of the research project entitled Developing summative assessment literacy: A longitudinal study of pre-service and novice language teachers funded by the Swedish Research Council. Project no. 2018–04008. In the present investigation entitled Regulation in practice: Standards and novice language teachers’ perspectives on summative assessment in Sweden, the lead author conducted the interview conversations based on the questions and issues raised by the research team. She then developed a coding system for analyzing the data and was responsible for the pre-writing, drafting as well as the editing and revising processes, while the co-authors participated in the formative stages of the process. Professor Yildirim, research leader in the project, contributed the most to the editing process.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was funded by the Swedish Research Council under Grant number 2018-04008.
Ethical Considerations
A certificate from the University of Gothenburg can be sent to Sage Open, if required.
Statement Re. the Supporting Data
We can provide reviewers with documents that contain transcriptions of the interview conversations with the participants in the present investigation.
Data Availability Statement
The data that support the findings of this investigation are available on request from the corresponding author. They are not publicly available due to ethical restrictions.
