Abstract
This systematic literature review critically evaluates 14 empirical studies published over a 14-year span (2006–2019) to answer questions about the models and the effects of transformational school leadership on student academic achievement. The analysis of the related literature utilized vote counting and narrative synthesis to delineate the status quo of the current research field. It was found that the majority of these studies were conducted in Western and English-speaking countries and that the studies, which utilized different research methods and models, reported mixed results. Recommendations for future research directions include use of an integrated leadership framework and complexity in the study of leadership in schools.
Introduction
Student achievement has been at the forefront of major legislation and discourse in the education field and in society more broadly. “In this new era of accountability, school leaders are expected to increase student achievement and make substantial academic growth for all students” (Quin et al., 2015: 72). Meanwhile, how to improve student achievement has long been the core mission of policymakers and the focus of school leadership scholars who strive to capture approaches to leadership that are effective for this purpose (Nichols et al., 2012; Sun and Leithwood, 2012). Scholars have also recognized that leadership is “one of the key determinants of student achievement” (Dutta and Sahney, 2016: 941) and that, of all in-school factors that have an impact on student achievement, leadership is second only to classroom teachers in influencing student learning (Leithwood et al., 2008).
There have been multiple reviews of empirical research on the direct and indirect effects of leadership on student outcomes, with the majority of studies focusing on either instructional or transformational leadership (TL) (Kwan, 2020). Furthermore, accounts of the impact of leadership on student outcomes differ depending on the methods these studies used to examine the effect of leadership. Leithwood and Jantzi (2005) conducted a literature review of studies published between 1996 and 2005 in order to understand the “nature” (p. 177) of TL, its antecedents, and “variables that both moderate and mediate its effects on students” (p. 177). They found that TL was complex in nature and identified 41 mediating variables that include teacher characteristics, such as job satisfaction and teacher commitment, organizational structures and conditions, and student characteristics. This review included 15 studies that analyzed the effects of TL on student learning, with nine studies about the effects of TL on students’ academic achievement and six studies about the effects of TL on student engagement. Though the impact on student achievement seemed mixed, Leithwood and Jantzi (2005) did not elaborate on the reasons behind such disparate findings and conclusions. According to Robinson et al.'s review (2008) of qualitative studies, instructional leadership had a major influence on student outcomes in school turnaround efforts and was thus considered superior to TL. Reviews of quantitative studies show a somewhat different impact of principal school leadership on student outcomes, with studies varying in the significance of the impact they report. Interestingly, researchers using direct effect models found small positive effects on student outcomes that differed from those found using indirect models (Witziers et al., 2003). Witziers et al. (2003) warned researchers about the limitations of the direct effects model in studying the effect of leadership on student outcomes.
Muijs (2011) conducted an overview of research on the impact of leadership (transformational, distributed, and instructional) on student outcomes. The review did not provide either a clear timeline of the studies included or the inclusion criteria for the studies. Muijs (2011) found that most studies that looked for a direct effect of leadership on student outcomes showed that leadership variables in the studies reviewed were “only modestly to weakly related to outcomes” (p. 46). However, Muijs (2011) concluded that leadership has a significant indirect impact on student outcomes with TL, distributed leadership, and instructional leadership showing evidence of some impact. Muijs (2011) also noted significant weaknesses in the research base which include dualism, that is, transactional leadership characterized as negative and TL as positive, over-prescriptivity, lack of international studies, limited methodologies, and poor measurement.
In light of these mixed results of the literature reviews, we focus on published studies that examined only TL theory and its impact on student academic achievement. Additionally, we expand our search to include international studies in hopes of providing a broader knowledge base on TL theory and its impact on student outcomes. Our hope is to provide updated research data on the effects of TL and the models utilized to study this particular theory's impact on student academic outcomes. This is the major difference from the previous literature reviews discussed above. This literature review intends to analyze the evidence on the following questions regarding TL and student achievement: (1) What conceptual frameworks of TL are researchers using in their studies? (2) What is the effect of TL on student achievement? (3) How is the effect of TL manifested, that is, what models (direct or indirect impact) are the studies utilizing in studying TL and student achievement? We include studies that were published after Leithwood and Jantzi's (2005) review of research on TL and schools to avoid replication.
Conceptual framework: TL models
TL is one of the most popular approaches to leadership and occupies a central role in research across many fields such as management, social psychology, education, nursing, political science, and industrial engineering (Northouse, 2019). TL's emphasis on intrinsic motivation and follower development might explain to some extent the popularity of TL since the 1980s (Bass and Riggio, 2006). TL is a process that changes and transforms people and organizations. It was first introduced by Burns, a political sociologist, in his seminal 1978 work titled Leadership. Burns (1978) sees leaders and followers as inextricably linked and bound by mutual needs and the transformation process. He distinguished between transformational and transactional leadership, with the latter being concerned with and engaged in quid pro quo exchanges. TL is often juxtaposed with transactional leadership as a process that raises the level of motivation and morality in both a leader and a follower.
In the mid-1980s, Bass (1985) refined the work of Burns by including emotional elements from House's charismatic leadership theory (Northouse, 2019). Bass and Avolio (1994) proposed a model of TL arguing that transformational leaders motivate followers to do more than expected through four factors: (a) idealized influence, (b) inspirational motivation, (c) intellectual stimulation, and (d) individualized consideration. More importantly from the empirical research standpoint, Bass (1985) developed an instrument, the multifactor leadership questionnaire (MLQ), that measures TL. Studies conducted across different countries and organizations have consistently shown an additive effect of TL on followers and organizations.
Bennis and Nanus (1985) viewed the concept of TL as a process. They developed a model that consists of four strategies that transforming leaders use to transform organizations. The interaction between transforming leaders and followers undergirds this model. Kouzes and Posner (2002) developed their model of TL, which consists of five practices that enable leaders to accomplish extraordinary success. The five practices serve as strategies for exemplary leadership (Northouse, 2019): modeling the way, inspiring a shared vision, challenging the process, enabling others to act, and encouraging the heart. The model is prescriptive in nature and suggests practices that leaders can learn in order to become effective. To measure these practices/behaviors, Kouzes and Posner developed the leadership practices inventory (LPI).
TL entered the field of education research more recently and it has become popular as an idealized form of leadership (Hallinger, 2003; Leithwood and Jantzi, 2005). Leithwood et al.'s model (2001) adapted and expanded Bass's model of TL in order “to better capture the consequences for leaders of working in school organizations” (Leithwood and Jantzi, 2005: 180). Their TL model includes seven behaviors or dimensions: (1) building school vision and establishing school goals; (2) providing intellectual stimulation; (3) offering individualized support; (4) modeling best practices and important organizational values; (5) demonstrating high-performance expectations; (6) creating a productive school culture; and (7) developing structures to foster participation in school decisions. According to Sun and Leithwood (2012), TL in educational contexts has aimed to absorb and integrate more leadership models, such as instructional leadership and managerial or transactional leadership. This incorporation makes it a more comprehensive leadership model in educational settings (Sun and Leithwood, 2012).
Models of study
How to study the effects of TL in schools and its outcomes has been a focus for educational scholars. In a literature review study of principal leadership, Hallinger (2008) developed an effects model of educational leadership to sort dissertation studies on principal leadership and educational outcomes into five main types of causal models: (A1) direct effects model, (A2) direct effects with antecedents model, (B1) mediated effects model, (B2) mediated effects with antecedents model, and (C) reciprocal effects model. We modified Hallinger's models to suit the purposes of this literature review as shown in Figures 1 and 2. The first modification to Hallinger's model is the use of TL as a variable rather than principal leadership more generally. The second is its link to student achievement rather than educational outcomes more generally.

Figure 1. Direct model of transformational leadership (TL) and student achievement.

Figure 2. Indirect model of transformational leadership (TL) and student achievement.
Method
We used Hallinger's (2014) conceptual framework for conducting systematic literature reviews to guide this literature review and organized our study around the following five questions that he proposed:
(1) What are the central topics of interest, guiding questions, and goals? The central topic of this review is TL and its impact on student academic achievement. The purpose of this literature review is to analyze the evidence on the following questions regarding TL and student academic achievement: (a) How is TL defined in research studies? (b) What is the effect of TL on student academic achievement? (c) How is the effect of TL manifested, that is, what models of causality are the studies utilizing in studying TL and student academic achievement?
(2) What conceptual perspective guides the review's selection, evaluation, and interpretation of studies? We use the major theories of TL, namely Burns (1978), Bennis and Nanus (1985), Bass and Avolio (1994), Kouzes and Posner (2002), and Leithwood and Jantzi (2005), as the framework of TL described in the initial part of the article.
(3) What are the sources and types of data employed in the review? In an attempt to conduct a literature review that does not overlap with the studies covered in “A review of transformational school leadership (TSL) research 1996–2005” by Leithwood and Jantzi (2005), summarized earlier in this article, we restricted our timeline to studies from 2006 to 2019. Other criteria for inclusion in this review were:
Publication status: Only studies published in peer-reviewed journals.
Research methods: Research studies regardless of methods (quantitative, qualitative, and/or mixed methods studies).
Publication period: Research studies published from 2006 to 2019.
Publication language: Only articles published in English.
Focus: Only studies that focused on TL and student academic achievement in K-12 settings.
We bracketed “transformational leadership” and “student achievement,” which yielded 40 articles. Limiting the year of publication lowered the number of articles to 12. We ran searches on the Education Source search engine due to (1) its comprehensive coverage of education journals, including the major English-language journals that primarily publish on educational leadership, (2) access to full text, and (3) abstracts that cover journals for elementary, middle, and high school levels of schooling. Additionally, we searched the 2006 to 2019 issues of Educational Administration Quarterly, Educational Management Administration & Leadership, and Journal of School Leadership, which increased the number of articles only slightly. Finally, we scanned the articles’ references to see whether there were any published studies we might have missed.
(4) What is the nature of data evaluation and analysis employed in the review? Once we collected these articles, we conducted a collective review of the abstract of each article to determine whether it met the criteria we set for inclusion in this literature review. Our objective, despite its limited focus, was clear: to review the studies on TL and student achievement. In this process, we found that some articles did not focus on these topics despite the keywords used to describe them. After excluding the irrelevant articles, 14 articles in total were included in this literature review (see Table 1), which is a similar number to Leithwood and Jantzi's finding of 15 studies on TL and student learning.
Table 1. Studies on TL and student achievement (2006–2019).
LPI: leadership practices inventory; MLQ: multifactor leadership questionnaire; NLS: England’s National Literacy Strategies; NNS: England’s National Numeracy Strategies; PLQ: principal leadership questionnaire; TL: transformational leadership; TSL: transformational school leadership; TSLS: TSL scale.
(5) What are the major results of the review? We present the results of the review in four parts: (1) general findings based on descriptive analysis of each study; (2) models of TL and the instruments utilized to measure TL; (3) concept and measurement of student achievement; (4) major findings of the impact of TL on student academic achievement. Using the vote-counting method and narrative synthesis, we analyzed each study based on Hallinger's criteria and summarized the findings from these studies. Finally, we point out possible future directions for research and discuss the implications for policymakers.
Data extraction and method of analysis
Hallinger (2014) emphasizes the importance of outlining extraction procedures in systematic literature reviews. First, we extracted descriptive information on each article such as author(s), year of publication, locus, journal, and methodology (see Table 1). We used narrative synthesis (Snilstveit et al., 2012) to review the studies, which entailed extracting the information from each article, summarizing each article based on the purpose of the study, and drawing conclusions based on the findings reported by the authors. This stage of the study involved extracting the definition of TL used by the researchers and how they measured TL and its impact on student achievement (see Table 1). To guarantee the quality of the evidence and reduce the potential for bias, data extraction and analysis of the reviewed studies were conducted in two rounds. The first author conducted the first round of analysis, and the second author re-examined the information extracted from the reviewed studies. In cases of disagreement, the two authors re-analyzed the studies and discussed their interpretations until agreement was reached.
Following Leithwood and Jantzi's method of analysis, a vote-counting method was used to summarize the results on the direct and indirect impact of TL on student achievement. Although meta-analysis is the widely acknowledged approach in literature synthesis, it is not applied in our study for several reasons. According to Cooper et al. (2008), meta-analysis should be the option when the objective of a synthesis is to summarize studies in one field and to make a general statement about the link between variables. Meta-analysis is inappropriate when conceptual and methodological approaches on a topic change over time, or when the major goal is to critically evaluate studies to identify central problems in a field (Cooper et al., 2008). In this review, we found three conceptual frameworks of TL that have been adopted in studies of TL and student achievement: (1) Bass and Avolio's (1994) four-factor model (idealized influence, inspirational motivation, intellectual stimulation, and individualized consideration), (2) Leithwood and Jantzi's (2006) three-factor model (setting directions, developing people, and redesigning the organization), and (3) Kouzes and Posner's (2002) five-practice model (modeling the way, inspiring a shared vision, challenging the process, enabling others to act, and encouraging the heart). Additionally, while most of the studies were quantitative, one mixed-methods study and two qualitative studies were included in this literature review. More importantly, our aim is to critically appraise extant studies to point out the future direction of research in this field. Thus, vote counting and narrative synthesis seem to be more appropriate for our review of the studies.
Findings
Based on the analysis of the 14 articles, we group the findings into four sections: (1) general findings based on descriptive analysis of each study; (2) models of TL and the instruments utilized to measure TL; (3) concept and measurement of student achievement; and (4) major findings of the impact of TL on student academic achievement.
General findings
There is a relatively small number of published studies on TL and student academic achievement (n = 14) over a 14-year span (2006–2019) even though the discourse on the impact of leadership on student outcomes, especially achievement, is prevalent internationally. The majority of the studies on TL and student achievement were conducted in Western and English-speaking countries, such as the US (n = 5), UK (n = 3), Canada (n = 1), and Australia (n = 1), with two studies in India (n = 2) and one study each in Israel (n = 1) and South Africa (n = 1) (see Table 1).
Models, theories, and instruments of TL
The majority of the studies used indirect models, with 4 out of 14 articles employing the direct model. The studies that tested the indirect impact of TL on student academic achievement used several mediating factors, including both school-level variables (i.e. school climate, improvement in school conditions, and organizational citizenship behavior) and individual-level variables (i.e. teacher commitment, teacher collective efficacy, and teacher job satisfaction). In addition, we examined the theories of TL and measurements used to study TL in these studies. The instruments utilized to study the impact of TL can be categorized into four groups (see Table 1):
TSL scale (TSLS) developed (or informed) by Leithwood and Jantzi (2006) was used in five studies (n = 5) (e.g. Boberg and Bourgeois, 2016; Day et al., 2016; Dutta and Sahney, 2016; Leithwood and Jantzi, 2006; Valentine and Prater, 2011).
MLQ 5S and 6S developed by Bass and Avolio (1994) were used in five studies (n = 5) (e.g. Adams et al., 2017; Cerni et al., 2014; Nash, 2010; Nir and Hameiri, 2014; Shatzer et al., 2014).
LPI developed by Kouzes and Posner (2003) was used in one study (n = 1) (e.g. Quin et al., 2015).
Self-developed instruments drawing upon different previous studies on TL were used in one study (n = 1) (e.g. Ross and Gray, 2006).
As can be seen in Table 1, Leithwood and Jantzi's TSLS and Bass and Avolio's MLQ still play a dominant role in studies on TL. Most of the studies reported the validity and reliability of the measurement, with coefficients that ranged from 0.42 to 0.98. The majority of the studies directly quoted the validity and reliability from previous studies when using established instruments, while two studies did not report any data on the reliability and validity of their instruments. The reliability reported for Bass and Avolio's MLQ instrument differs markedly among the studies that report reliability coefficients based on their own data. According to Cerni et al. (2014), the reliability of the MLQ instrument overall was α = 0.85 but the instrument's dimensions ranged from α = 0.41 to α = 0.72. Shatzer et al. (2014), on the other hand, reported high reliability scores for each of the dimensions, with scores ranging from α = 0.80 to α = 0.94. Granted, the participants in these two studies were different: Shatzer et al. (2014) had elementary school teachers while Cerni et al. (2014) studied school principals. The sample sizes were also different, with 590 teachers participating in Shatzer et al.'s (2014) study and 88 administrators participating in Cerni et al.'s (2014) study. The latter sample size is considered small (Bollen, 1989). Finally, the locus of the studies was different, as Shatzer et al. (2014) conducted their study in the USA while Cerni et al. (2014) conducted their study in Australia. Another reason for such a discrepancy could be the reporting of only four dimensions rather than the full scale that psychometricians recommend (Muenjohn and Armstrong, 2008). Leithwood and Jantzi’s (2006) TSLS was used in five studies. The researchers in three studies provided similar reliability and validity scores, ranging from 0.73 to 0.88 on each dimension, while Boberg and Bourgeois’s (2016) study reported high reliability and validity scores that ranged from α = 0.93 to α = 0.98. Overall, the TSLS appears to show consistency in reliability and validity scores.
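The reliability figures compared above are reported as α. Assuming these are Cronbach's alpha coefficients (the reviewed text does not name the statistic), the following minimal sketch shows how such a coefficient is computed from item-level responses and why it depends on the respondents sampled. The function name and the response data are hypothetical illustrations and are not taken from any of the reviewed studies.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Return Cronbach's alpha for a list of respondents.

    `responses` is a list of rows; each row holds one respondent's
    scores on the k items of a scale (hypothetical data layout).
    """
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item (column) across respondents.
    item_variances = [variance(column) for column in zip(*responses)]
    # Sample variance of each respondent's total scale score.
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


if __name__ == "__main__":
    # Hypothetical ratings from five respondents on a three-item dimension.
    ratings = [[4, 5, 4], [3, 3, 2], [5, 5, 5], [2, 3, 2], [4, 4, 3]]
    print(round(cronbach_alpha(ratings), 2))
```

Because the coefficient is built from variances observed in a particular sample, the same instrument can yield different values for different respondent groups and sample sizes, which is consistent with the contrast drawn above between the two MLQ studies.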
Concept and measurement of student achievement
The majority of the reviewed studies used standardized test results to measure student achievement. The only exception was Adams et al.'s (2017) cross-sectional study that measured student learning capacity. Also, two studies did not include student achievement data. Instead, one study focused on student engagement as a variable, and the other on autonomy, competence, and grit. Specifically,
Four studies (n = 4) used students’ test scores in different subjects as the measurement of student achievement (e.g. Boberg and Bourgeois, 2016; Dutta and Sahney, 2016; Nash, 2010; Nir and Hameiri, 2014).
Three studies (n = 3) adopted the percentage of students’ performance on certain examinations (e.g. Barker, 2007; Cerni et al., 2014; Leithwood and Jantzi, 2006).
Two studies (n = 2) utilized a composite score that calculated the students’ test results and other performance measures (e.g. Quin et al., 2015; Valentine and Prater, 2011).
Three studies (n = 3) examined student achievement by measuring the change in student academic performance over time (e.g. Day et al., 2016; Ross and Gray, 2006; Shatzer et al., 2014).
Finally, nine out of the 12 studies measured student achievement at the school level, one study measured student achievement at the student level, and two studies measured student achievement at both the school and student levels. Also worth noting is that six studies did not measure student intake or context.
Major findings
Vote counting of the direct effect model of TL on student achievement (n = 4)
Three studies (n = 3) reported a direct and significant effect of TL on student outcomes (e.g. Nash, 2010; Quin et al., 2015; Valentine and Prater, 2011).
One study (Shatzer et al., 2014) found that instructional leadership had a statistically significant effect on student achievement while TL did not have a significant impact on student outcomes.
The number of studies in this category is small. Only four studies examined the direct effect of TL on student achievement with three studies using multiple regression (Nash, 2010; Shatzer et al., 2014; Valentine and Prater, 2011) while one used t-test statistics (Quin et al., 2015). Three of the studies reported a significant positive impact of TL on student achievement.
Vote counting of the indirect effect model of TL on student achievement (n = 8)
Five (n = 5) studies reported indirect positive impact (e.g. Adams et al., 2017; Boberg and Bourgeois, 2016; Dutta and Sahney, 2016; Nir and Hameiri, 2014; Ross and Gray, 2006).
Three (n = 3) studies reported neither direct nor indirect effect of TL on student outcomes (e.g. Cerni et al., 2014; Day et al., 2016; Leithwood and Jantzi, 2006).
There are eight articles that fall in this category of studies. The studies used mediating and moderating factors in their models of TL and student outcomes. The majority of the studies used some form of path modeling to determine the effect of TL on student academic achievement. Table 2 summarizes the sample size, data analysis technique, and results of the 14 studies on the impact of TL on student outcomes. The studies are listed from the earliest studies in the sample to the most recent ones.
Table 2. Summary table of TL on student achievement.
CTE: collective teacher efficacy; OCB: organizational citizenship behavior; PSSPN: Principal Support for Student Psychological Needs; SEM: structural equation modeling; SES: socioeconomic status; TL: transformational leadership; CFI: comparative fit index; TLI: Tucker–Lewis index; RMSEA: root mean square error of approximation.
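The vote counts reported above for the direct and indirect models can be reproduced with a simple tally. The sketch below is only an illustration of the vote-counting procedure: the study-to-category assignments are transcribed from the lists above, while the labels "positive" and "none" and the function names are our own shorthand rather than terms used by the reviewed studies.

```python
from collections import Counter

# (study, model, outcome) records transcribed from the vote-counting lists.
# "none" marks studies reporting no significant effect of TL.
STUDIES = [
    ("Nash, 2010", "direct", "positive"),
    ("Quin et al., 2015", "direct", "positive"),
    ("Valentine and Prater, 2011", "direct", "positive"),
    ("Shatzer et al., 2014", "direct", "none"),
    ("Adams et al., 2017", "indirect", "positive"),
    ("Boberg and Bourgeois, 2016", "indirect", "positive"),
    ("Dutta and Sahney, 2016", "indirect", "positive"),
    ("Nir and Hameiri, 2014", "indirect", "positive"),
    ("Ross and Gray, 2006", "indirect", "positive"),
    ("Cerni et al., 2014", "indirect", "none"),
    ("Day et al., 2016", "indirect", "none"),
    ("Leithwood and Jantzi, 2006", "indirect", "none"),
]


def vote_count(studies):
    """Count studies for each (model, outcome) combination."""
    return Counter((model, outcome) for _, model, outcome in studies)


def share_positive(studies, model):
    """Return the fraction of studies using `model` that report a positive effect."""
    outcomes = [outcome for _, m, outcome in studies if m == model]
    return sum(o == "positive" for o in outcomes) / len(outcomes)


if __name__ == "__main__":
    for key, n in sorted(vote_count(STUDIES).items()):
        print(key, n)
    print("direct:", share_positive(STUDIES, "direct"))
    print("indirect:", share_positive(STUDIES, "indirect"))
```

A tally of this kind records only whether each study reported a significant effect. It does not weight studies by sample size or effect size, which is why the narrative synthesis that follows examines each study individually.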
Narrative synthesis of studies utilizing direct model
Positive impact studies (n = 3)
One of the simplest studies (no moderating variables were considered), with 15 principal participants, found a significant and positive relationship between TL (MLQ) (Bass, 1985) and student achievement (state standardized test) in third-grade reading and math as well as fifth-grade math but no significant relationship with fifth-grade reading. The author (Nash, 2010) did not provide an effect size. The low number of participants and the use of a convenience sample in one geographical area make this study's predictive power highly questionable. Another study with a modest sample of 92 high school teachers (Quin et al., 2015) explored the effect of TL on student achievement and the differences between principal practices (Kouzes and Posner's model) of high- and low-performing schools. t-Test analysis of the data showed that principals in the high-performing schools utilized all five TL practices more regularly and effectively than leaders in low-performing schools. Inspiring a shared vision and challenging the process had the highest impact on student achievement. The authors did not report an effect size, nor did they provide information on the characteristics of the schools included in the study. Rather, they attributed the different levels of achievement to the principals’ use of these practices. Valentine and Prater (2011) found that the demographic model (gender, educational level, principal experience, socioeconomic status (SES) of schools, and community type) alone accounted for 13% of the variance in language arts scores, 27% of the variance in math, 28% of the variance in science, and 25% of the variance in social studies. Data were collected from 155 principals and from teachers in 131 of these 155 schools. Adding the principal leadership factors to the demographic variables increased the variance explained in all four subtests. All nine factors of managerial, instructional, and TL had positive and significant relationships with student achievement but five factors were more potent: two factors of instructional leadership (instructional improvement and curricular improvement) and three factors of TL (fostering group goals, identifying a vision, and providing a model), with the latter three having the greatest relationship to student achievement. The study is limited to one state in the US but the sample size is large and the response rate (44.1%) is within the respectable range.
No impact study (n = 1)
Using a moderating effect direct model (school context and principal demographics), one study (Shatzer et al., 2014) that consisted of 590 teachers in 37 elementary schools compared the impact of transformational and instructional leadership on student achievement measured by a criterion-referenced test (CRT). Their findings showed that instructional leadership (CRT-raw scores: 45%; CRT-progress scores: 27%) explained more of the variance in student achievement than did TL (CRT-raw scores: 29%; CRT-progress scores: 22%). After controlling for school context and principal demographics, instructional leadership accounted for a large and significant amount of the variance in CRT-raw scores and TL accounted for a proportion that was non-significant but larger than the control variables. This study was also conducted in the USA.
Narrative synthesis of studies utilizing indirect model
Positive impact studies (n = 5)
Two studies tested mediated models of collective teacher efficacy (CTE), TL, and student achievement and engagement. The results of both studies support the indirect model and show the centrality of CTE in academic achievement and in student emotional engagement, which in turn contributes to academic achievement. Ross and Gray (2006) tested a model hypothesizing that principals contribute to student achievement indirectly through CTE, that is, teacher commitment and beliefs about their collective capacity. A total of 3042 teachers from 205 elementary schools in two districts in Ontario, Canada participated in the study. The results support the view that principal effects on achievement occur through leadership contributions to teachers’ perceptions of their capacities: CTE and teacher commitment to professional values. The indirect effect of leadership on achievement was small: for every 1.0 standard deviation increase in TL, there was a 0.222 standard deviation increase in student achievement. The study found no statistically significant direct effect of TL on student achievement in reading, math, and writing. Boberg and Bourgeois (2016) tested a serial mediation model of the direct and indirect effects of integrated TSL on student learning. The model assumes that leadership influences student learning through teacher behaviors and student engagement. Their convenience sample consisted of students and teachers from a public charter school in the south-central USA, with 5392 students providing data on student engagement while 596 teachers completed Leithwood and Jantzi's instrument on the TL of principals, a CTE instrument, and an organizational citizenship behaviors (OCBs) survey. Their results show that TL has no direct effect on either student achievement or engagement. However, the combined indirect effects of TSL on the three student outcomes were all significant and positive. Moreover, the model with three serial mediators (CTE, OCB, and student engagement) provided insight into how leadership affects achievement indirectly through its direct effect on teacher collective efficacy beliefs (CTE). OCBs have no direct effect on student achievement.
Another study by Dutta and Sahney (2016) examined the role of teacher job satisfaction and school climate in mediating the differential effects of leadership practices on student achievement. The model hypothesized that school leaders get things done primarily through other people (i.e. teachers), thus achieving their effect on student achievement through indirect paths. Path modeling was applied to validate a mediated-effects model using cross-sectional survey data (306 principals and 1539 teachers) obtained from 306 schools in the two Indian metropolitan cities of New Delhi and Kolkata. They found that teacher job satisfaction had a direct and significant effect on student achievement, while a supportive social and affective environment and a congenial physical environment appeared to have a positive impact on student achievement. Instructional and TL behaviors were not directly associated with student achievement. The researchers did not report the effect size.
In a mediated indirect model with a moderating variable (SES), Nir and Hameiri (2014) tested the mediating effects of the use of power bases on the relationship between principals’ leadership (transactional, transformational, and passive) and school effectiveness as measured by student achievement in 191 public elementary schools in Israel. Participants in the study included 954 tenured teachers. They found the strongest relationship between TL and student outcomes (r = 0.52, p < 0.001), while passive leadership was negatively associated with school outcomes (r = −0.47, p < 0.001). Also, TL was positively related to the use of all three soft power bases (expertise and personal reward, information, and legitimacy of dependence and referent), while it was negatively related to the use of harsh power bases (r = −0.77, p < 0.001). Furthermore, findings showed that the expertise and personal reward (r = 0.49; p < 0.01) and referent (r = 0.45; p < 0.01) power bases were positively related to school effectiveness, while harsh power bases were negatively related to school effectiveness (r = −0.35; p < 0.01). Their comparative analysis showed that leadership and power bases differentiated between effective and ineffective schools.
While most of the studies focused on the impact of TL on student achievement as an outcome, Adams et al.'s (2017) study investigated how school principals nurture student learning capacity by analyzing cross-sectional data from 3175 students in 70 schools in a Southwestern city of the USA. They measured the relationship between perceived student psychological needs and autonomy, competence, and grit with TL used as a control variable. The findings showed a strong relationship between autonomy-support and competence-support (r = 0.74, p < 0.01) and medium relationships between autonomy-support and grit (r = 0.40, p < 0.01) and between competence-support and grit (r = 0.47, p < 0.01). In addition, a statistically significant and positive relationship between perceived student psychological needs and student-perceived autonomy-support (γ03 = 0.16, p < 0.01) was identified. Principal support accounted for the majority of school variance. The findings of this study were consistent with the indirect impact of principals on student outcomes: school leaders shaped the students’ mindset and behaviors through teachers.
No impact studies (n = 3)
Cerni et al. (2014) also developed a model to examine whether the rational and experiential systems and TL practices, located in the center of the causal chain, predict teachers’ job satisfaction (absenteeism and turnover) and student learning outcomes (results in reading and math, school discipline, and absenteeism). The study was conducted in Sydney, Australia, with 88 participating principals, and the authors found no relationship between TL and the proxy measures of teachers’ job satisfaction. However, there were positive relationships found between the laissez-faire factor and the percentage of the total number of working days taken as sick days (r = 0.19, p < 0.05), the percentage of full-time academic staff who took sick leave (r = 0.20, p < 0.05), and the percentage of turnover (r = 0.19, p < 0.05). They found no relationship between TL and student learning outcomes; however, a significant positive relationship was found between leader satisfaction (leadership outcome measure) and student learning outcomes.
Leithwood and Jantzi (2006) tested the effects of a transformational model of school leadership on teachers’ motivation, capacity and context, their classroom practices, and student learning in the context of large-scale efforts initiated by the UK government to improve local schooling in England. Using random sampling, two representative samples of 500 schools were selected, one sample to provide evidence from teachers on literacy strategies and one to provide evidence about teachers’ numeracy strategies. To assess the direct and indirect effects of leadership on motivation, capacity, and situation, as well as the effects of all these variables on altered teacher practices, Leithwood and Jantzi (2006) used structural equation modeling (SEM; LISREL) as a path analytic technique that allows for testing the validity of causal inferences for pairs of variables while controlling for the effects of other variables. They concluded that TL “had very strong direct effects on teachers’ work settings and motivation with weaker but still significant effects on teachers’ capacities. Second, TL had a moderate and significant effect on teachers’ classroom practices” (p. 223); this effect was weaker than that of either teacher capacity (the strongest effect) or teacher motivation, but substantially stronger than that of teachers’ work settings. Finally, their model, as a whole, showed no direct effect on student achievement gains and could not explain the variations in student achievement.
Day et al. (2016) conducted a three-year mixed-method national study on the associations between the work of principals in effective and improving primary and secondary schools in England and student outcomes, which were defined by their national examination and assessment results. Research started with a survey that focused on key aspects of TL strategies (e.g. setting directions and visions) and instructional leadership strategies (e.g. managing teaching and learning) and the perspectives of principals and key staff on change and academic and other kinds of student outcomes (e.g. non-academic areas such as engagement, motivation, behavior, and attendance). The survey participants included principals and key staff (two per school at primary level and five per school at secondary level) among the sample schools. Then, 20 in-depth case studies with principals, key staff, and stakeholders were conducted to complement the quantitative study. Results from the quantitative phase indicated that “synergistic influences” may be promoted through the combination and accumulation of various relatively small effects of leadership practices that influenced different aspects of school improvement processes in the same direction. As a result, these “synergistic influences” promoted better teaching and learning and an improved culture, especially in relation to pupil behavior and attendance and other pupil outcomes such as motivation and engagement. Furthermore, the SEM model indicated that neither instructional leadership strategies nor TL strategies alone were sufficient to promote improvement. The qualitative data reinforced the finding that “there is no single leadership formula for achieving success” (p. 253). Day et al.'s (2016) study also showed that school principals used common practices of leadership regardless of context. They were able to achieve and sustain successful pupil outcomes, but the degree of success was likely influenced by the relative advantage/disadvantage of the communities from which their pupils were drawn.
Qualitative studies (n = 2)
Barker (2007) conducted a qualitative case study at an exceptional school in the southern part of England. It aimed to challenge rather than confirm the theory that certain types of leadership necessarily led to improved student attainment. The interview data showed that a clear and shared vision had been established; a multiskilled head had created greater interconnectedness; leadership had been widely distributed; decision-making empowered staff at all levels; opportunities were engineered to give people early responsibility; ambitious goals were set and individuals received intellectual stimulation and individualized support; there was an institutional buzz with a strong emphasis on individual needs; and coaching and mentoring activities made use of performance data. The reported quality of the relationship between students and teachers showed that this type of leadership influenced behavior and expectations indirectly and so created a positive climate for student learning. Inspection and observation reports provided further evidence that the school's leadership had created an environment in which colleagues were encouraged to enhance the quality of learning and teaching. Barker (2007) criticized the reliance on standardized testing to measure organizational effectiveness and further concluded that evidence of superior organizational performance is limited because changes in family background and students’ own ability, rather than leadership, may explain a significant part of the school's improved effectiveness.
Makgato and Mudzanani (2019) conducted a qualitative study to investigate school principals’ leadership styles and the educational performance of learners in high-performing and low-performing schools in Vhembe District, Limpopo, South Africa. The study intended to explore teachers’ perceptions of principals’ leadership at their school in relation to learners’ performance rather than to prove any correlation between the high- and low-performing schools and the principals’ leadership styles. Ten schools in Vhembe District were purposefully selected as the sample sites, with five teachers from each school. The results showed that the democratic leadership style was the most frequently used by principals of both high- and low-performing schools, followed by the TL style. The authors conclude that the “coexistence of transformational, democratic, and instructional leadership is highly recommended for improvement of learners’ educational performance as well as teachers’ job satisfaction” (p. 102). The democratic leadership style, followed by the TL style, had a positive impact on the learners’ educational performance, but the laissez-faire and autocratic leadership styles had a negative impact.
Discussion and conclusion
The purpose of this review was to critically evaluate the existing literature on the models and effects of TSL on student academic achievement and point out directions for future research. Overall, studies utilizing different research methods and models reported mixed results. More than half of the studies (8 out of 14) confirmed the positive and significant impact of TL on student achievement, with three studies identifying a direct effect and five studies identifying an indirect effect. The mediators included both school-level and individual-level factors. Although a small number of studies found no impact of TSL on student achievement, they still provided useful insights, particularly the mixed-method and the qualitative studies.
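The vote-counting tally summarized above can be reproduced with a short sketch. The counts (14 studies, of which three reported a direct positive effect and five an indirect positive effect) come from this review; the function and variable names are hypothetical and introduced only for illustration.

```python
def vote_shares(tallies, total):
    """Return each category's share of all reviewed studies."""
    if sum(tallies.values()) > total:
        raise ValueError("category counts exceed the number of studies")
    return {category: count / total for category, count in tallies.items()}


TOTAL_STUDIES = 14

# Studies reporting a positive and significant effect of TL, by model type.
positive_votes = {"direct effect": 3, "indirect effect": 5}

shares = vote_shares(positive_votes, TOTAL_STUDIES)
positive_count = sum(positive_votes.values())
positive_share = positive_count / TOTAL_STUDIES
```

Vote counting of this kind records only the direction and significance of each study's finding; it does not weight studies by sample size or effect size, which is one reason the tally is paired with narrative synthesis in this review.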
An important theme that came from the non-impact studies was the comparison between the effect of TSL and instructional leadership on student achievement. For instance, Shatzer et al. (2014) found instructional leadership explained more of the variance in student achievement than did TL, which was in line with Dutta and Sahney’s (2016) study. Shatzer et al. (2014) pointed out that, compared with the dimensions of TSL that develop a supportive school culture and facilitate change, instructional leadership presented more specific practices to focus on changes in the core curriculum that are easy for principals to understand and utilize. This indicates that TL theory may need to adapt to the practical demands of school leaders to be better applied in schools. Day et al.'s (2016) mixed-method study concluded that “there is no single leadership formula for achieving success” (p. 253). Similarly, Makgato and Mudzanani's (2019) qualitative study recommended the combination of transformational, democratic, and instructional leadership to improve learners’ performance and teachers’ job satisfaction. In effect, scholars in recent years have continued to call for an integrated model that “reconciles managerial strategies of coordination and control traditionally associated with instructional leadership behaviors (ILBs) with the empowering and distributed approaches often associated with TL behaviors (TLBs)” (Boberg and Bourgeois, 2016: 359). In this sense, future researchers may need to consider incorporating more unique elements from other leadership concepts to make the concept of TSL more integrated.
Barker's (2007) study, on the other hand, challenged the role of leadership in improving student achievement rather than confirming the contribution of certain types of leadership. He pointed out that other factors, such as students’ own ability and their family background, rather than leadership, may explain part of the school's improved effectiveness. This conclusion responds to some scholars’ criticism of the focus of existing studies about leadership. The majority of the studies conducted continue to be focused on the principal as “the center of expertise, power, and authority” (Hallinger, 2003: 330) even though this “form of leadership assumes that the central focus of leadership ought to be the commitments and capacities of organizational members” (Bush and Glover, 2014: 557). This is especially the case for the studies that used a direct model of studying the impact of TL on student outcomes. Consequently, these studies ignored the role of other leaders in schools, such as assistant principals, leadership teams, and classroom teachers, and factors that are related to students themselves. Studies that used complex models consisting of multiple variables explained larger percentages of variance and showed that leadership is distributed and has an impact on student outcomes indirectly.
The importance of context has been emphasized in the literature. The findings of this review show that the context, that is, the various loci in the English-speaking countries and quite a few non-English-speaking countries, did not have any obvious effect on the results. Clearly, this is an area that warrants further research and examination. Regardless of the effect of TL on student achievement in the USA and other western countries such as the UK and Canada, research on the effectiveness of TL is becoming more international, as can be seen from the locus of these studies. This finding suggests that we are not approaching the end of the TL era; instead, international research about the effect of TL on student outcomes seems to be in its infancy given the limited studies in non-western countries so far. Our review of 14 peer-reviewed articles worldwide on the effects of TL on student achievement over a time span of 14 years (2006–2019) leads us to point out several possible directions for future scholarship.
First, advancing the measurement in the research of TSL and student achievement
By examining the measurement in these 14 studies, we found that Bass and Avolio's (1994) four-factor model (idealized influence, inspirational motivation, intellectual stimulation, and individualized consideration) and Leithwood and Jantzi's (2006) three-factor model (setting directions, developing people, and redesigning the organization) were the most frequently adopted conceptual frameworks in these studies, regardless of locus. More importantly, the majority of the studies directly quoted the validity and reliability from previous studies when using established instruments, while two studies did not report any data on the reliability and validity of their instruments. It is widely acknowledged that culture (societal and organizational) has a strong influence on leadership practices (Hallinger and Leithwood, 1996); hence TL may have different patterns of manifestation in non-western cultures. In this regard, approaches to measuring TL in other cultures need to be developed: (1) Rather than undertaking a survey, we recommend experimental designs in future TL research by incorporating cultural norms and values of different countries and their contexts. This will require researchers to identify specific TL practices in a certain context as the first step, then engage expert panels to determine TL practices that can be studied in comparative and quasi-experimental studies. (2) Rather than using a single TL survey, “integrated leadership” that combines other measures from various leadership types is recommended. A great number of studies have highlighted the impact of distributed leadership and instructional leadership on the improvement of teaching and learning, with Leithwood et al. (2008) concluding that “school leadership has a greater influence on schools and pupils when it is widely distributed” (p. 34). Robinson et al. (2008) found that the impact of instructional leadership is three to four times that of TL. In effect, there has been some overlap between the broad concept of distributed leadership and TL in terms of fostering collaboration and involvement in decision-making (Robinson, 2008). Also, there is an increasing convergence of transformational and instructional leadership research in education (see Kwan, 2020; Li and Liu, 2020). In this sense, a more comprehensive concept of “integrated leadership” seems to be a new direction for educational leadership research, rather than either a single transformational or a single instructional leadership approach.
When it comes to measurement of student achievement, the majority of the studies utilized test scores of some specific subjects (i.e. math, language, and arts) on different levels (i.e. nationwide and schoolwide). Nevertheless, “educational success is not simply limited to cognitive abilities and scholastic achievements” (Wong and Ng, 2020: 254). For instance, in recent years, the “21st century competencies” and “future-readiness” were proposed by various organizations, educational theorists, and policy makers, which were considered to be global educational imperatives for students to confront the rapidly changing economic, social and environmental landscapes worldwide (Tan et al., 2017). Thus, future research should move beyond the use of single tests of certain subjects and study student learning more holistically. These new concepts may have laid new paths to study TL and student outcomes. Moreover, the influence of student intake needs to take a more central role when studying TL. Furthermore, studies that focused on student characteristics such as SES and race provided much-needed knowledge on principal leadership. The studies that examined student intake continue to show its impact on student achievement. Lower SES was negatively related to teachers’ capacity and student achievement. While students’ SES is a determining factor when it comes to student achievement, school principals in Day et al.'s (2016) study demonstrated their ability to lead and manage successfully and to overcome the extreme challenges of the high-need contexts in which some of them work. Barker (2007) also had similar findings. Success in schools serving both advantaged and disadvantaged student populations seems to be built through the synergistic effects of the combination and accumulation of a number of strategies that are related to the principals’ judgments about what works in their particular school context. Findings in Nash's (2010) study also showed a positive relationship between TL and achievement of students of color.
Second, jumping out of the “model-black-box” in research of TL and student outcomes
The analysis of the 14 papers revealed a rigid preference in model adoption. Despite calls by many scholars for utilizing comprehensive effects models in researching TL, the studies in this literature review reveal a continuation of the same models of study. The findings from the present review suggest that studies with the direct model are more likely to confirm the positive impact of TL on student achievement; however, studies that employ the indirect model provide a richer and more complex picture of the effects of TL on student outcomes, even in cases of mixed results. Hallinger (2011) pointed out that studies with the mediated effects models that incorporate more antecedent variables might provide more explanations for certain outcomes. Thus, future studies would enrich the field by using mediated effects with the antecedent model or reciprocal effects model. We caution researchers to avoid examining a single outcome through mediating/moderating variables because “transformational leadership style builds on the transactional base in contributing to the extra effort and performance of followers” (Bass, 1998: 5). Hence, future researchers may pay more attention to the augmenting effect of TL, taking into consideration outcomes such as developmental gains or improvement and changes in behavior or thinking. Though similar recommendations have been made in the past, the analysis of the 14 papers and findings in this literature review shows that what is published continues to suffer from similar pitfalls in this area of research. We strongly encourage future scholars to free themselves from the “model-black-box” approach and advance the field of educational leadership by expanding the domain of TL research.
Third, strengthening research design and employing more sophisticated data analysis techniques
The evidence from the mixed-method study (Day et al., 2016) highlights the value of adopting mixed methods as an approach to study leadership, since it goes beyond the over-simplistic promotion of a specific leadership style or theory or conceptual framework as an adjectival approach to school improvement. The majority of the studies in this review were quantitative in nature. Thus, we recommend the adoption of a mixed-method approach in the research of TL and student outcomes. More centrally, beyond these particular types of models or frameworks of leadership, leaders’ strategies and actions and their personal qualities (values and relationships) are also important. Qualitative data in a mixed-method study may help to uncover and unpack the complexity and all the layering of leadership practice, judgment, and decision making.
In terms of analysis in quantitative studies, more sophisticated and advanced data analysis techniques are recommended in future studies, particularly agent-based modeling (ABM). Even though leadership is defined as a dynamic process, few studies examined the role of time as an important component in this process (Day, 2014). For instance, the majority of leadership studies have been static and have relied heavily on the use of survey methodology (Dinh et al., 2014). This was also the case for the studies reviewed here. Even longitudinal studies usually collected data at two or three points covering a period of less than a year (Dulebohn et al., 2012). These situations point to the methodological limitations of current leadership studies. According to Castillo and Trinh (2018), ABM not only provides the analytical capacity to explain the role of time in the process of leadership, but also documents the recursive process of agents’ mutual influencing that can yield fruitful insights into transformations that occur between proximal and distal leadership development (Day and Dragoni, 2015). To this end, ABM may offer a novel way to understand the effects of TL on student outcomes. It should also be noted here that the majority of the reviewed studies seemed to rely too heavily on the analysis of data from single-channel sources (i.e. teachers, principals, or students), with some of the studies utilizing limited sample sizes (i.e. only several principals’ self-reports). Future studies should include more perspectives and larger databases to bolster the field's confidence in their findings and implications for practice.
Finally, this review has provided practical implications for school leaders and policy makers. The finding that the majority of the studies concluded that TL has a positive and significant impact on student achievement highlights the need for, and significance of, adopting TL behaviors and practices among school principals and leadership teams. Thus, it appears to be necessary for leadership preparation programs to incorporate TSL practices in their curriculum. Regarding the small number of studies that questioned the effectiveness of principals’ TSL, policy makers and school principals need to understand that principals may influence student outcomes; however, this influence relies on the support and contribution of leadership as a team effort rather than an individual endeavor. It should be noted that qualitative research is needed on the processes through which school leadership develops teacher leadership and a school culture conducive to student development and learning.
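As a purely illustrative sketch of what an agent-based approach might look like, the toy simulation below treats teachers as agents whose efficacy is nudged at each time step toward a level modeled by the principal and toward that of a randomly chosen peer. Every name, parameter, and update rule here is an assumption made for illustration; it is not the model described by Castillo and Trinh (2018) and it is not calibrated to any data from the reviewed studies.

```python
import random


def simulate_efficacy(n_teachers=20, steps=50, leader_influence=0.05,
                      peer_influence=0.10, leader_level=0.8, seed=0):
    """Toy ABM: return the mean teacher efficacy at each time step."""
    rng = random.Random(seed)
    # Each teacher agent starts with a hypothetical efficacy score in [0.2, 0.6].
    efficacy = [rng.uniform(0.2, 0.6) for _ in range(n_teachers)]
    history = [sum(efficacy) / n_teachers]

    for _ in range(steps):
        updated = []
        for own in efficacy:
            # Mutual influence: each agent interacts with one random peer.
            peer = efficacy[rng.randrange(n_teachers)]
            new = (own
                   + leader_influence * (leader_level - own)
                   + peer_influence * (peer - own))
            # Keep scores on the assumed 0-1 scale.
            updated.append(min(1.0, max(0.0, new)))
        efficacy = updated
        history.append(sum(efficacy) / n_teachers)
    return history


history = simulate_efficacy()
```

The returned trajectory records how the group mean evolves over time, which is the kind of process information that cross-sectional surveys cannot provide. A substantive ABM would need theoretically grounded update rules and empirical calibration.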
Footnotes
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
