Abstract
This study aimed to provide a quantitative synthesis of the effect of the Sport Education Model (SEM) on basic need satisfaction, intrinsic motivation, and prosocial attitudes in physical education (PE). We conducted a systematic review and meta-analysis of experimental studies conducted before August 2020. The initial search yielded 6061 articles, with 25 articles (n = 2937) meeting the inclusion criteria. The articles were analyzed in five separate analyses using two- and three-level random-effects models and Hedges’ g effect size. The study showed the SEM to have a positive heterogeneous medium effect on autonomy (g = 0.43; CI 95% [0.12, 0.74]), competence (g = 0.42; CI 95% [0.17, 0.67]) and relatedness (g = 0.57; CI 95% [0.28, 0.85]) need satisfaction, intrinsic motivation (g = 0.63; CI 95% [0.37, 0.89]), and prosocial attitudes (g = 0.46; CI 95% [0.09, 0.83]). All a priori categorical moderators were statistically non-significant. The analyses indicate that the SEM is more need-supportive and promotes intrinsic motivation and prosocial attitudes more than the skill-drill, direct, and traditional instructional styles used in PE. However, high-quality experimental and comparative trials testing the efficacy of the SEM on broad outcomes are needed. Specifically, the concept of novelty, potential negative outcomes, and essential behavioral outcomes, such as physical activity, should be included in future research. Further, the fidelity of the interventions should be measured and reported with more transparency and detail.
Introduction
To provide youth with an authentic sport experience in schools, physical education (PE) teachers often adopt the Sport Education Model (SEM) (Siedentop, 1994). In the model, students are placed into teams where they compete for the entirety of the learning unit (season), which usually is at least 12 lessons long. In their teams, students serve in various roles throughout the season, such as referee, coach, athlete, or scorekeeper. In addition to organizing the learning unit as a sport season (a), the SEM has the five other key characteristics of (b) affiliation, (c) formal competition, (d) record keeping, (e) festivity, and (f) a culminating event (Siedentop, 1994).
The teaching strategies in the SEM are not clearly specified and can vary with different contextual factors of the learning unit (see e.g. García López and Kirk, 2021). However, according to Siedentop (1994), PE teachers use mainly three instructional strategies in the SEM: direct instruction, cooperative learning, and peer teaching. Direct instruction is usually used in the first lessons of the model, as students are still becoming familiar with the model. Cooperative learning typically occurs as students become more comfortable with the model. For example, after the first lessons, students have the opportunity to make decisions within their teams regarding practice strategies and playbook. In addition, peer learning takes place, for example, as students share the responsibility of their teams’ success. While a range of teaching strategies can be used, the SEM is generally viewed as a student-led approach (Hastie, 2016; Wallhead and O’Sullivan, 2005).
The goal of the SEM is to develop “competent, literate, and enthusiastic sportspersons” (Siedentop, 1994: 4). Competence refers to the ability to discern and execute skills and strategies of games (Siedentop et al., 2011). Literacy is defined as a student’s ability to distinguish between good and bad practices of sports culture (Siedentop et al., 2011). Enthusiasm refers to a student’s “desire to participate because they have come to value the experiences and enjoyment derived from participation” (Siedentop et al., 2011: 5). Although these desired student outcomes are best measured via behavior, scholars have commonly examined them through established social and psychological constructs, such as motivation, enjoyment, and prosocial attitudes, with self-report questionnaires. Specifically, the desired outcomes of competence and enthusiasm are associated with constructs established in self-determination theory (SDT) (competence and intrinsic motivation), while the literacy outcome is often assessed through sport-specific prosocial attitudes (Chu and Zhang, 2018; Hastie, 2016; Perlman and Karp, 2010).
Self-determination theory
SDT is arguably the most widely used theory for examining motivation in education (Deci and Ryan, 1985, 2000; Ryan and Deci, 2020). The theory posits that students are naturally inclined to learn, grow, and become connected with others, but that these processes require supporting conditions (Ryan and Deci, 2017, 2020). Specifically, SDT argues that the satisfaction of the three basic psychological needs of autonomy, competence, and relatedness is vital for desirable human development (Vansteenkiste et al., 2020). Autonomy refers to engaging in behaviors with a full sense of volition, ownership, and initiative (Deci and Ryan, 2000). The need for competence concerns the experience of mastery and efficacy, while the need for relatedness refers to a sense of belonging with other people in a meaningful way (Deci and Ryan, 1985, 2000). Educational research based on SDT primarily focuses on how different educational environments support or thwart the three basic needs (Deci and Ryan, 2000; Ryan and Deci, 2020). In general terms, need-supportive conditions satisfy students’ basic needs and promote intrinsic motivation, while need-thwarting conditions frustrate basic needs and lead to amotivation or extrinsic motivation (Deci and Ryan, 2000; Ryan and Deci, 2017). From the view of SDT, intrinsic motivation is the optimal form of motivation, as it relates to activities done for inherent interest and enjoyment without the presence of external forces (Deci and Ryan, 2000). In the PE context, research has shown need satisfaction and intrinsic motivation to be strongly linked with various desirable student outcomes, such as physical activity and engagement (Vasconcellos et al., 2020; Xiang et al., 2017; Zhang et al., 2011).
The SDT constructs of competence need satisfaction and intrinsic motivation are conceptually associated with the competence and enthusiasm learning outcomes of the SEM, and are hence widely researched outcomes of the SEM (e.g. Cuevas et al., 2016; Wallhead and Ntoumanis, 2004). Further, the SEM has several characteristics that can be seen to support the needs for relatedness and autonomy. Namely, the SEM’s focus on student affiliation, peer teaching, and student responsibility, compared to traditional PE teaching, has been hypothesized to support these needs (Perlman, 2010; Wallhead and Ntoumanis, 2004). On the other hand, some have argued that the SEM’s emphasis on formal competition can be detrimental to students’ need for competence and intrinsic motivation (Cuevas et al., 2016; Wallhead and Ntoumanis, 2004).
Prosocial attitudes
The term “prosocial behaviors” refers to voluntary acts intended to help or benefit another individual or group of individuals (Eisenberg and Fabes, 1998). In PE, helping, encouraging, and respecting classmates are examples of prosocial behaviors (Jennings and Greenberg, 2009). Examples of prosocial behaviors in sports include acts of sportspersonship such as shaking hands after a competition or showing respect toward teammates, referees, opponents, or the laws of the game (Kavussanu et al., 2013). Notably, prosocial behaviors have the potential to benefit others and improve sport experience (Grusec et al., 2002). Relating to the SEM specifically, prosocial behaviors can be viewed as part of the literacy learning outcome. While these outcomes are best measured at a behavioral level, PE research has mainly employed self-report instruments to tap into students’ prosocial attitudes (see Cheon et al., 2018, for an exception).
Purpose of the study
The SEM is one of the most extensively studied instructional models in PE. The plethora of systematic reviews of the SEM literature speaks to the model’s popularity (Bessa et al., 2019; Chu and Zhang, 2018; Evangelio et al., 2018; Hastie, 2011, 2014, 2016; Kinchin, 2006; Wallhead and O’Sullivan, 2005). Looking across the reviews, a number of trends emerge: (a) research on the SEM is expanding; (b) the expansion of research has produced more sophisticated research designs across various contexts; (c) scholars display a propensity to measure the student outcomes of competence, literacy, and enthusiasm through basic need satisfaction, intrinsic motivation, and prosocial attitudes; (d) the SEM generally promotes social and psychological well-being when compared to traditional models, such as skill-drill and technique-based models.
Given these trends, a meta-analysis offers a logical step forward for a number of reasons. Firstly, the narrative reviews (e.g. Bessa et al., 2019) have established that the SEM promotes the social and psychological well-being of students, but a meta-analysis is necessary to reveal the magnitude and detail of those effects. Secondly, meta-analysis is able to consider how study characteristics such as country, gender, season length, and fidelity impact the relationship between the SEM and student outcomes. Thirdly, a comprehensive and systematic quantification across studies allows researchers and practitioners to better understand the generalizability of the SEM. Fourthly, a meta-analysis has the ability to uncover trends related to research design and reporting that can be used to guide future research. Lastly, to our knowledge, scholars are yet to perform a focused meta-analysis on SEM studies (see Sierra-Díaz et al., 2019, for a meta-analysis of PE instructional models and motivation). Therefore, the purpose of this study was to provide a quantitative synthesis on the effect of the SEM on basic need satisfaction, intrinsic motivation, and prosocial attitudes in PE. To this end, this study had two broad research questions: (a) what is the effect of the SEM on basic need satisfaction, intrinsic motivation, and prosocial attitudes compared to traditional PE instruction?; (b) what study features moderate the effects detailed in research question one?
Methods
Article identification
This meta-analysis was conducted according to the Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) statement guidelines (Moher et al., 2009). The process of article identification, screening, eligibility and inclusion can be seen in Figure 1.

Figure 1. Flowchart of the study selection.
To begin, the inclusion and exclusion criteria were established to avoid reporting bias. The inclusion criteria required articles to: (a) be written in English or Spanish; (b) be peer-reviewed research publications from academic journals, unpublished manuscripts, conference publications, book chapters, or dissertations; (c) be published before August 2020; (d) be placed in a PE setting; (e) be intervention studies using control and experimental groups with pre- and post-measures; (f) be intervention studies testing the effect of the SEM; (g) statistically report one or more of the following outcomes: autonomy, competence, relatedness, intrinsic motivation or enjoyment, and prosocial outcomes (e.g. respect and fair play). Articles were excluded if the control and experimental groups were incompatible (e.g. control group: traditional instruction, experimental group: the SEM combined with teaching personal and social responsibility).
The first phase of data collection was article identification. In this phase, the following databases were used to perform the search: Web of Science, PsycINFO, PsycARTICLES, ERIC, SPORTDiscus, GreyLiterature, ProQuest Dissertations and Theses database, and Google Scholar. The search was performed between July and August 2020 using the following keyword combinations: sport education, model, instruction, teach*, motivation, self-determination, basic need*, autonomy, competence, relatedness, prosocial, friendship, collaboration, support, sportsmanship, respect, physical, education, school, experimental, trial, comparison. The exact keyword combinations are listed in the supplemental files. Filters were applied to meet the inclusion criteria. Duplicates were identified and removed.
The next phase of data collection was article screening. During this phase, the first and second authors scanned the title and abstract of each article and removed articles if they did not meet the inclusion/exclusion criteria. At this stage the authors also set aside any literature reviews on the SEM to be reviewed for additional studies in the next stage.
In the third phase of data collection, the first and second authors read the full text of the remaining articles to assess their eligibility. Articles were removed if they did not meet the inclusion/exclusion criteria. During this phase, the authors also used a number of strategies to expand the search to locate any missing studies. Firstly, the reference list of the fully reviewed articles was scanned for missing studies. Secondly, the literature reviews saved in the previous phase were read in their entirety to identify any missing studies. Additional articles were identified through reference lists and literature reviews (n = 20). These articles were also read in their entirety and were included in the meta-analysis if they met the inclusion/exclusion criteria. In total, the first and second authors read the full text of 201 articles identified through the database search, reference lists, and literature reviews.
Across all fully reviewed papers (n = 201), initial interrater agreement in applying the inclusion and exclusion criteria (Cohen’s unweighted kappa computed in R [version 4.0.0]; R Core Team, 2020) was high (.97, CI 95% [.93, 1], z = 13.8, p < .001; raw percentage agreement 99%). The few disparities between the authors in applying the inclusion and exclusion criteria were resolved via discussion. Both authors agreed that one article would be eligible if additional information could be obtained from its author; the attempt to obtain this information was unsuccessful. Moreover, the author of an already eligible article was contacted for more detailed information on the unreported scores relating to autonomy, competence, and relatedness satisfaction; this request was successful.
The first authors of all the included articles were contacted through publicly available email addresses to acquire unpublished data. If an author’s email address was not working or was not publicly available, the private message function of the ResearchGate website was used as the method of contact. These authors were contacted a maximum of three times with one week between the contact efforts. Only one author could not be contacted because no current contact information could be found.
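As an illustration of the agreement statistic reported above, the following minimal sketch computes an unweighted Cohen’s kappa for two raters’ include/exclude decisions. The function name and the ten screening decisions are hypothetical and are not the study’s data; the authors computed kappa in R, and the Python below is only a stand-in for the same definition.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters' categorical decisions."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: share of items on which both raters agree.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions,
    # summed over categories.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    p_chance = sum(counts_a[c] * counts_b[c] for c in counts_a) / n ** 2
    if p_chance == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_observed - p_chance) / (1 - p_chance)

# Hypothetical screening decisions for 10 articles (not the study's data).
first_author = ["in", "in", "out", "out", "out", "in", "out", "out", "in", "out"]
second_author = ["in", "in", "out", "out", "out", "out", "out", "out", "in", "out"]
kappa = cohens_kappa(first_author, second_author)
print(f"kappa = {kappa:.2f}")
```

In this toy example the raters agree on 9 of 10 articles, yet kappa is noticeably lower than the raw percentage agreement because part of that agreement is expected by chance.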
In total, after all the search and reliability procedures, 25 articles were deemed eligible. Lastly, the cited by and related articles functions in Google Scholar were used to identify additional studies corresponding to the 25 articles included in the quantitative analysis. No additional studies were located using this feature. The complete flowchart representing the article identification process is provided in Figure 1.
Effect size calculation
Hedges’ g was used as the measure of the effect size (Hedges and Olkin, 1985). The effect sizes were calculated by subtracting the mean change in the control condition from the mean change in the intervention condition and dividing the difference by the pooled standard deviation of the pre- and post-test scores (Hedges and Olkin, 1985; Lipsey and Wilson, 2001). As the correlation of pre- and post-test scores was not available in the studies, a recommended estimate of 0.5 was used instead (Follmann et al., 1992). The analyses with alternative correlations of 0.7 and 0.9 did not have a substantial influence on the results.
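The calculation described above can be sketched as follows. This is one plausible reading of the description, not the authors’ exact procedure (they used the esc package in R): the standardizer is assumed to pool all four reported standard deviations, and the sampling variance uses a large-sample approximation in which the assumed pre-post correlation r appears. The function name, dictionary keys, and summary statistics are hypothetical.

```python
import math

def hedges_g_prepost(treat, ctrl, r=0.5):
    """Hedges' g for the difference in mean pre-to-post change between groups.

    `treat` and `ctrl` are dicts with the hypothetical keys
    m_pre, m_post, sd_pre, sd_post, n. `r` is the assumed pre-post correlation.
    Returns (g, approximate sampling variance).
    """
    n_t, n_c = treat["n"], ctrl["n"]
    # Difference in mean change (intervention minus control).
    diff = (treat["m_post"] - treat["m_pre"]) - (ctrl["m_post"] - ctrl["m_pre"])
    # Assumption: pool all four reported SDs, weighting each by its group's n - 1.
    num = sum((grp["n"] - 1) * (grp["sd_pre"] ** 2 + grp["sd_post"] ** 2)
              for grp in (treat, ctrl))
    den = 2 * (n_t + n_c - 2)
    sd_pooled = math.sqrt(num / den)
    d = diff / sd_pooled
    # Hedges' small-sample correction factor.
    j = 1 - 3 / (4 * (n_t + n_c - 2) - 1)
    g_value = j * d
    # Assumed large-sample approximation; r only affects this variance.
    var_g = 2 * (1 - r) * (1 / n_t + 1 / n_c) + g_value ** 2 / (2 * (n_t + n_c))
    return g_value, var_g

# Hypothetical summary statistics (not taken from any included study).
sem = {"m_pre": 3.0, "m_post": 3.6, "sd_pre": 1.0, "sd_post": 1.0, "n": 30}
control = {"m_pre": 3.0, "m_post": 3.1, "sd_pre": 1.0, "sd_post": 1.0, "n": 30}
for r in (0.5, 0.7, 0.9):
    g_value, var_g = hedges_g_prepost(sem, control, r)
    print(f"r = {r}: g = {g_value:.3f}, variance = {var_g:.4f}")
```

Under these assumptions the point estimate does not depend on r; only the sampling variance, and hence the weight of each effect in the pooled model, changes when the correlation is varied in a sensitivity analysis.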
All the studies included in the quantitative analysis reported group means for pre- and post-scores and their standard deviations for an experimental and a control group. In the studies reporting only a total number of participants (k = 4), participants were assumed to be split evenly between the conditions. An increase in need satisfaction, intrinsic motivation, or prosocial attitudes resulted in a positive effect size.
The reliability of the outcome scores (k = 79) was assessed with an unweighted Cohen’s kappa for all extracted raw data (means, standard deviations, and number of participants in each group), which were used to compute the individual and pooled effect sizes. The preliminary unweighted Cohen’s kappa of the two authors for all the extracted data was high (.92, CI 95% [.90, .94], z = 402, p < .001; raw percentage agreement 91.9%). Before the raw data were transformed into effect sizes and variance estimates with the esc package (Lüdecke, 2019) in R (version 4.0.0) (R Core Team, 2020), full rater agreement was achieved by locating and resolving discrepancies, which were mainly due to typing errors and other minor mistakes.
Selection and coding of the moderators
An a priori moderator selection based on reason and logic was conducted to explain the expected variation in the effect sizes (Rosenthal and DiMatteo, 2001). A total of six moderators were coded: age (under 15 years or 15 years and over), length (below 13 lessons or 13 lessons and more), sport (one or multiple sports included in the intervention), control condition (skill-drill game approach, direct instruction, or traditional), fidelity (not reported or reported), and study quality (below average or average and above) as assessed with the Medical Education Research Study Quality Instrument [MERSQI] (Reed et al., 2007). As only one study did not have mixed-gender class groups, an a priori moderator of gender distribution was not analyzed. See supplemental file for the moderator table.
The agreement of moderator coding between the two authors was initially almost perfect (Cohen’s unweighted kappa = .98, CI 95% [.95, 1]; raw percentage agreement 98.8%). The two differences in coding, which were due to typing errors, were reconciled before the analyses.
Study quality assessment
The quality of all included studies was analyzed with the widely used MERSQI tool (Reed et al., 2007). The MERSQI has been used previously in educational studies outside of medical education and has been reported to be a valid instrument to assess the quality of experimental research (Bai et al., 2020; Reed et al., 2008). The MERSQI assesses the quality of a study via 10 items organized into the six categories of study design, sampling, type of data, validity of evaluation instrument, data analysis, and outcomes. The possible study quality scores normally range from 5 to 18, but to fit the PE context, the tool was slightly modified by excluding patient/health care outcome as a possible outcome, which restricted the maximum score to 17. The total quality score of each study is shown in Table 1.
Table 1. Study features of the analyzed studies.
Note. Studies marked with * report different findings from the same experiment listed above.
SEM: Sport Education Model; BB: Basketball; DI: Direct Instruction; HB: Handball; IM: Intrinsic Motivation; SDG: Skill-Drill Game; TGfU: Teaching Games for Understanding; TPSR: Teaching Personal and Social Responsibility; Trad: Traditional; UF: Ultimate frisbee; VB: Volleyball.
Statistical analysis
Two different approaches were used to analyze the pooled effect sizes for the five outcomes. Two-level random-effects models (Hedges and Olkin, 1985; Hunter and Schmidt, 2000), with inverse-variance weighting of each effect and restricted maximum likelihood (REML) estimation of heterogeneity, were fitted separately for autonomy and relatedness using R (version 4.0.0) (R Core Team, 2020) and the metafor package (Viechtbauer, 2010). For competence satisfaction, intrinsic motivation, and prosocial attitudes, separate three-level random-effects models with REML estimation were fitted with the metafor package to cluster the individual effect sizes at the study level, because in some cases multiple effects from the same study were analyzed. The three-level approach has been shown in simulation studies to deal adequately with the dependency (i.e. correlations) of the effect sizes (Van den Noortgate et al., 2015). Models excluding either the second or the third level, one at a time, were also tested. Although the three-level model did not outperform the simpler models in all cases (see supplemental file for sensitivity analyses), the more complex model was retained in keeping with recommended practices for dealing with non-independence of effect sizes (Van den Noortgate et al., 2015).
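To make the inverse-variance weighting concrete, the sketch below pools effect sizes under a two-level random-effects model for a given between-study variance. It is a simplification: the between-study variance (tau2) is passed in as a known value, whereas the analyses described above estimated it with REML in metafor, and the three-level structure is not modeled. The function name and the effect sizes are hypothetical.

```python
import math
from statistics import NormalDist

def pool_random_effects(effects, variances, tau2, level=0.95):
    """Inverse-variance pooled mean under a two-level random-effects model.

    `tau2` is the between-study variance, supplied by the caller here
    (the study estimated it with REML). Returns (mean, se, (ci_low, ci_high)).
    """
    # Each effect is weighted by the inverse of its total variance:
    # sampling variance plus between-study variance.
    weights = [1.0 / (v + tau2) for v in variances]
    total = sum(weights)
    mean = sum(w * y for w, y in zip(weights, effects)) / total
    se = math.sqrt(1.0 / total)
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    return mean, se, (mean - z_crit * se, mean + z_crit * se)

# Hypothetical effect sizes and sampling variances (not the study's data).
effects = [0.10, 0.35, 0.60, 0.95]
variances = [0.04, 0.05, 0.03, 0.06]
mean, se, ci = pool_random_effects(effects, variances, tau2=0.08)
print(f"g = {mean:.2f}, 95% CI [{ci[0]:.2f}, {ci[1]:.2f}]")
```

A larger tau2 makes the weights more equal across studies and widens the confidence interval, which is why strongly heterogeneous outcomes yield wide intervals even with many effects.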
The Q statistic and I² were used to assess the heterogeneity of the effects (Higgins et al., 2003). The precision of effect sizes (g) was indicated by 95% CIs. A significant Q statistic indicates heterogeneity between the effects, and I² indicates the percentage of the observed variance that is due to heterogeneity between studies rather than sampling error. For the three-level models, the heterogeneity was assessed separately at levels two and three. The sum of the heterogeneity at those levels is the non-sampling error variance in a three-level model. Heterogeneity of the effect sizes was indicated if the total Q reached a significance level of p < .05 and sampling error accounted for less than 75% of the observed variance (Hedges and Olkin, 1985; Lipsey and Wilson, 2001).
A priori determined moderators were entered one at a time as independent variables in regression analyses to explain the possible heterogeneity in the effects for each outcome. Each moderator was limited to two levels (three for the control condition), and interactions between moderators were not tested because of the inadequate number of effects for each outcome (Deeks et al., 2011).
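The heterogeneity statistics can be illustrated with a short sketch. It computes Cochran’s Q and the Q-based I² of Higgins et al. (2003), plus the DerSimonian-Laird moment estimate of the between-study variance as a simple stand-in for REML. Software that derives I² from a REML variance estimate may report somewhat different values than this Q-based formula. The function name and data are hypothetical.

```python
def heterogeneity(effects, variances):
    """Cochran's Q, its df, Q-based I-squared (%), and DerSimonian-Laird tau2."""
    weights = [1.0 / v for v in variances]
    w_sum = sum(weights)
    # Fixed-effect (common-effect) weighted mean used as the reference point.
    fixed_mean = sum(w * y for w, y in zip(weights, effects)) / w_sum
    q = sum(w * (y - fixed_mean) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # Share of total variation beyond what sampling error alone would produce.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    # DerSimonian-Laird moment estimator (swapped in for REML in this sketch).
    c = w_sum - sum(w * w for w in weights) / w_sum
    tau2_dl = max(0.0, (q - df) / c) if c > 0 else 0.0
    return q, df, i2, tau2_dl

# Hypothetical effect sizes and sampling variances (not the study's data).
q, df, i2, tau2 = heterogeneity([0.10, 0.35, 0.60, 0.95], [0.04, 0.05, 0.03, 0.06])
print(f"Q = {q:.2f} (df = {df}), I2 = {i2:.1f}%, tau2 = {tau2:.3f}")
```

Under the decision rule described above, effects would be treated as heterogeneous when Q is significant and I² exceeds 25%, that is, when sampling error accounts for less than 75% of the observed variance.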
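For a two-level categorical moderator, a weighted regression with an intercept and one dummy variable reduces to comparing the weighted mean effects of the two categories. The sketch below shows that special case with the between-study variance treated as known; it does not reproduce the three-level models or the REML estimation used in the study, and the function name, coding, and data are hypothetical.

```python
import math
from statistics import NormalDist

def binary_moderator_test(effects, variances, groups, tau2=0.0):
    """Compare pooled effects between two moderator categories coded 0 and 1.

    Equivalent to weighted least squares with an intercept and one dummy,
    using weights 1 / (variance + tau2) treated as known.
    Returns (difference, se, z, p).
    """
    sums = {0: [0.0, 0.0], 1: [0.0, 0.0]}  # per group: [sum of w, sum of w*y]
    for y, v, grp in zip(effects, variances, groups):
        w = 1.0 / (v + tau2)
        sums[grp][0] += w
        sums[grp][1] += w * y
    if sums[0][0] == 0.0 or sums[1][0] == 0.0:
        raise ValueError("both moderator categories need at least one effect")
    mean0 = sums[0][1] / sums[0][0]
    mean1 = sums[1][1] / sums[1][0]
    diff = mean1 - mean0
    se = math.sqrt(1.0 / sums[0][0] + 1.0 / sums[1][0])
    z = diff / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return diff, se, z, p

# Hypothetical coding: 0 = fewer than 13 lessons, 1 = 13 lessons or more.
effects = [0.20, 0.45, 0.30, 0.70, 0.55]
variances = [0.05, 0.04, 0.06, 0.05, 0.04]
length = [0, 0, 0, 1, 1]
diff, se, z, p = binary_moderator_test(effects, variances, length, tau2=0.08)
print(f"difference = {diff:.2f}, z = {z:.2f}, p = {p:.3f}")
```

With only a handful of effects per category and a sizeable tau2, the standard error of the difference is large, which illustrates why such tests have limited power.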
Egger’s test (Egger et al., 1997), using the standard error of the observed outcomes as a predictor, and the rank correlation test (Begg and Mazumdar, 1994) were used to detect publication bias via asymmetry in the funnel plots. For the three-level models, the inverse of the variances of the observed outcomes was used as the predictor of funnel plot asymmetry instead of a standard Egger’s test. Further, an analysis of Cook’s distances (flagging studies with values greater than the median plus six times the interquartile range of the Cook’s distances, or studies that differed markedly on visual inspection of the plots) was used to identify overly influential studies and to evaluate the need for sensitivity analyses (Viechtbauer and Cheung, 2010). The number of unpublished null effects needed to raise the p-value of the observed effects above .05 was estimated as the random-effects fail-safe N+ for all the outcomes. The value of fail-safe N+ signifies the minimal number of additional null effects from studies of average sample size required to render the mean effect size non-significant (Rosenberg, 2005). The results of the publication bias and sensitivity analyses are in the supplemental file.
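The regression form of Egger’s test with the standard error as predictor can be sketched as a weighted regression of the effect sizes on their standard errors. This simplified version treats the sampling variances as known and uses a z-test, so it is an assumption-laden illustration rather than a reproduction of the software defaults used in the study; the function name and data are hypothetical.

```python
import math
from statistics import NormalDist

def egger_regression(effects, variances):
    """Weighted regression of effect sizes on their standard errors.

    Weights are 1 / variance, treated as known. Returns
    (intercept, slope, z, p), where slope, z, and p refer to the
    standard-error predictor; a slope far from zero suggests asymmetry.
    """
    se = [math.sqrt(v) for v in variances]
    w = [1.0 / v for v in variances]
    s_w = sum(w)
    s_wx = sum(wi * x for wi, x in zip(w, se))
    s_wxx = sum(wi * x * x for wi, x in zip(w, se))
    s_wy = sum(wi * y for wi, y in zip(w, effects))
    s_wxy = sum(wi * x * y for wi, x, y in zip(w, se, effects))
    denom = s_w * s_wxx - s_wx ** 2
    if math.isclose(denom, 0.0, abs_tol=1e-9 * s_w * s_wxx):
        raise ValueError("standard errors must vary across effects")
    slope = (s_w * s_wxy - s_wx * s_wy) / denom
    intercept = (s_wxx * s_wy - s_wx * s_wxy) / denom
    se_slope = math.sqrt(s_w / denom)
    z = slope / se_slope
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return intercept, slope, z, p

# Hypothetical data in which less precise studies report larger effects.
std_errors = [0.1, 0.2, 0.3, 0.4]
effects = [0.2, 0.3, 0.4, 0.5]
intercept, slope, z, p = egger_regression(effects, [s ** 2 for s in std_errors])
print(f"slope = {slope:.2f}, z = {z:.2f}, p = {p:.3f}")
```

In this toy example the slope is positive, yet with four effects the test is far from significant, which shows how little power asymmetry tests have when the number of effects is small.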
Results
Effects of the SEM on need satisfaction, intrinsic motivation, and prosocial attitudes
The 13 observed outcomes for competence need satisfaction ranged from –.14 to .87, with 85% of the effects being positive. Based on the three-level random-effects model, the mean change in competence satisfaction following the analyzed SEM interventions was .42 (CI 95% [.17, .67], t = 3.62, p = .003). The average mean change differed significantly from zero but was strongly heterogeneous (Q = 47.6, df = 12, p < .001, total I² = 75.7%).
For autonomy need satisfaction, the outcomes of 11 studies ranged from –.06 to 1.68, with 82% of the studies having positive estimates. The mean change based on the random-effects model was .43 (CI 95% [.12, .74], z = 2.72, p = .007). The Q-test indicated the true outcomes to be heterogeneous (Q = 56.74, df = 10, p < .001, I² = 86.9%).
The nine outcomes for relatedness need satisfaction were between .03 and 1.47, with all estimates being positive. The mean change based on the random-effects model was .57 (CI 95% [.28, .85], z = 3.88, p < .001). According to the Q-test, the true outcomes are heterogeneous (Q = 43.97, df = 8, p < .001, I² = 82.7%).
The 19 observed outcomes from 14 studies for intrinsic motivation ranged from –.19 to 1.42, with 84% of the effects being positive. The three-level random-effects model indicated that the mean change in intrinsic motivation following the SEM interventions was .63 (CI 95% [.37, .89], t = 5.11, p < .001). The average mean change was strongly heterogeneous (Q = 89.6, df = 18, p < .001, total I² = 81.8%).
For prosocial attitudes, the 27 observed outcomes from 11 studies were between –.33 and 2.34, with 70% of the effects being positive. According to the three-level random-effects model, the mean change in prosocial attitudes following the analyzed SEM interventions was .46 (CI 95% [.09, .83], t = 2.56, p = .017). The average mean change was strongly heterogeneous (Q = 280.3, df = 26, p < .001, total I² = 92.1%). Forest plots of the five outcomes are presented in Figures 2 and 3. Detailed descriptions of the study characteristics are in the supplemental files.

Figure 2. Forest plots for competence, autonomy, and relatedness need satisfaction.

Figure 3. Forest plots for intrinsic motivation and prosocial attitudes.
Moderator analyses
The significant Q statistics and high values of I² for all the outcomes indicated that the variability in the outcomes of the SEM interventions was not due only to sampling error in the independent studies. The initial moderator analyses were conducted as regression analyses using six a priori determined moderators: participant age, length of intervention, fidelity reporting, quality of study (MERSQI), control group type, and sport (single sport, multiple sports) (see supplemental file). None of the a priori moderators were significant for any of the outcomes. Therefore, we decided to test one exploratory moderator, namely whether the study was conducted in Spain, as several studies were conducted there. Like the other moderators, this moderator was not significant for any of the outcomes.
Discussion
Our results indicate that the SEM is on average more need-supportive and promotes intrinsic motivation and prosocial attitudes when compared to other widely used PE instructional styles, namely direct instruction, the skill-drill game approach, and the “traditional style” of teaching. More specifically, the study showed the SEM to have a positive heterogeneous medium effect on autonomy (g = 0.43; CI 95% [0.12, 0.74]), competence (g = 0.42; CI 95% [0.17, 0.67]) and relatedness (g = 0.57; CI 95% [0.28, 0.85]) need satisfaction, intrinsic motivation (g = 0.63; CI 95% [0.37, 0.89]), and prosocial attitudes (g = 0.46; CI 95% [0.09, 0.83]). Despite the average positive effect of the SEM on all the examined outcomes, it is important to note that except for relatedness, in some contexts the outcome of the SEM can be negative.
In relation to the three psychological needs, the most relevant finding concerns competence. The SEM has several characteristics that are likely to contribute to helping students feel more competent in relation to the comparison models. Firstly, in the SEM, students serve in a variety of roles (athlete, coach, referee, team manager), which gives them the opportunity to display their personal strengths in a valued social setting. Secondly, the systematic record keeping feature of the SEM provides each student an opportunity to receive feedback on their performance throughout the learning unit (De Muynck et al., 2017; Mouratidis et al., 2008). Thirdly, the peer teaching component of the SEM might allow students to engage more fully with the content because they are required to observe their peers, demonstrate skills, and provide feedback (Hastie, 2016). Fourthly, the SEM lessons in the analyzed studies were clearly structured and planned, which in itself is a feature of a competence supporting learning environment (Aelterman et al., 2019).
Although the overall effect of the SEM on competence was significant and positive, it should be noted that in some studies the effect was actually negative when compared to the other models (see Perlman, 2011; Perlman and Karp, 2010). This variability could be attributed to the fact that the SEM does not prescribe teachers or student coaches specific instructional strategies for developing skill and tactical knowledge (Hastie, 2016). These differences may be exacerbated by the fact that some teachers are untrained and, therefore, limited in their content and pedagogical content knowledge, which may hinder their ability to teach higher order skills and tactics (Hastie, 2011, 2016; Wallhead and O’Sullivan, 2005).
The SEM also had a positive medium effect on autonomy and relatedness need satisfaction when compared to traditional models. One explanation for improvements in student autonomy could be the student-led feature of the SEM. Most of the studies included in this analysis reported that a third of the lessons were student led. Student-led lessons promote a sense of volition (Bechter et al., 2019). Specifically, the ability for students to make their own choices (e.g. select team names, colors, roles) has been shown to foster autonomy in PE (De Meester et al., 2020). The formal competition stage of the SEM unit may be particularly important, as it requires students to organize their own practices and prepare for festive activities. This requires creativity, experimentation, and self-regulation of behaviors, all key characteristics of autonomy (Teixeira et al., 2019). However, autonomy was not uniformly improved by the SEM, suggesting that under certain conditions the model may not be as autonomy supportive as the comparison models.
The only outcome having all positive effects was relatedness need satisfaction. The inherent characteristics of the SEM, especially the pursuit of affiliation, are well-aligned with a relatedness supporting environment. The teams the students work in from the beginning until the end of the unit are likely to foster a deep sense of connection (Cox et al., 2009; Perlman and Karp, 2010). Furthermore, the activities teams engage in also promote relatedness. Throughout the season, students encounter challenges (e.g. losses) that require them to collaborate, devise strategies, and problem solve together (Chu and Zhang, 2018). Previous literature has indicated that the SEM may promote discrimination and exclusion based on student gender, skill level, or social status (Hastie, 2016), but our findings support more recent literature that indicates the SEM is generally an inclusive model if implemented correctly (Harvey et al., 2020). Beyond the scope of this study, it is also interesting to consider how broadly the SEM facilitates relatedness. For example, does the SEM promote feelings of relatedness between teams or just within teams? Does the SEM promote relatedness between students and teachers or just between students? These are questions that researchers may look to pursue.
The effect of the SEM on intrinsic motivation was most pronounced out of all the examined outcomes. According to SDT, the satisfaction of the three basic needs (competence, autonomy, and relatedness) leads to more intrinsic motivation (Cox et al., 2009; Ryan and Deci, 2017; Vasconcellos et al., 2020). This offers one explanation for the larger effect on intrinsic motivation. However, there might be other contributing factors. For example, the festive element of the SEM is by definition fun and links directly with intrinsic motivation. In addition, students may feel more intrinsically motivated by the SEM over other models due to the increased ownership they have over the learning process. Interestingly, our results indicate the SEM promotes students’ intrinsic motivation more than interventions designed to support students’ basic needs and motivation by different communication strategies (Manninen et al., in review). This finding suggests that factors such as the unit design, peer relationships, values, and novelty (Hassandra et al., 2003) may be more important for student motivation than teachers’ communication strategies.
According to the analyses, the SEM promoted prosocial attitudes more than the traditional models. Certain features of the SEM might play a role in facilitating prosocial attitudes. For example, formal competitions, team practices, and festive activities present students with numerous opportunities to practice their prosocial skills (fair play, respect, cooperation). In addition, many teachers expand the record keeping feature of the SEM beyond the tracking of in-game statistics and win–loss records to include displays of sportspersonship or teamwork. Lastly, like intrinsic motivation, a need-supportive environment has been shown to increase students’ prosocial behavior in PE (Cheon et al., 2018).
Of the five outcomes included in this meta-analysis, prosocial attitudes were the most affected by the contextual features. Only 70% of the effects were positive, and Egger’s test and the rank correlation test indicated that publication bias may influence the results for prosocial attitudes. Inconsistent findings may be explained by the model’s emphasis on competition and affiliation, which has the potential to increase students’ engagement in antisocial behaviors (Graupensperger et al., 2018). For example, students with a “win at all cost” attitude may cheat or show disrespect to teammates who are less skilled. Ultimately, prosocial outcomes are highly dependent upon a teacher’s ability to foster a cordial environment and clarify student roles (Guijarro et al., 2020; Hastie, 2016). As such, the SEM should include features that promote students’ reflection on ethics and morals surrounding sports.
To our surprise, none of the a priori moderators reached statistical significance. In addition, the exploratory moderator of Spain as a country was not significant. However, large heterogeneity is typical in educational meta-analyses (e.g. Braithwaite et al., 2011; Lazowski and Hulleman, 2016). The large heterogeneity, together with the relatively small number of effects in each moderator category, resulted in limited power to detect statistically significant differences between the groups. To address this, more experimental research is needed to increase the power to detect possible influencing factors. Moreover, there might be factors influencing between-study heterogeneity that the moderators could not detect. For example, the study by Perlman (2010) had only amotivated students as participants, whereas the other studies were done with typical classes.
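The link between heterogeneity, the number of effects per category, and power can be illustrated with a rough calculation. The sketch below is not the multilevel model used in this study. It assumes a simple z-test between two subgroup means, a known between-study variance (tau2) shared by both subgroups, and an identical sampling variance (v) for every effect. The function names and all input values are hypothetical and chosen only for illustration.

```python
import math

def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def subgroup_power(diff, tau2, v, k1, k2, z_crit=1.959964):
    """Approximate two-sided power of a z-test comparing two subgroup means.

    diff   : true difference between subgroup mean effects (Hedges' g units)
    tau2   : between-study variance, assumed known and equal in both subgroups
    v      : within-study sampling variance, assumed equal for every effect
    k1, k2 : number of effects in each moderator category
    """
    # Variance of each subgroup's pooled mean when all effects share one weight.
    var1 = (tau2 + v) / k1
    var2 = (tau2 + v) / k2
    z = diff / math.sqrt(var1 + var2)
    return (1.0 - normal_cdf(z_crit - z)) + normal_cdf(-z_crit - z)

# Hypothetical inputs, not values taken from the included studies.
for k in (4, 8, 16, 32):
    print(k, round(subgroup_power(diff=0.3, tau2=0.2, v=0.05, k1=k, k2=k), 3))
```

Under these assumptions, power rises as more effects are added to each category and falls as tau2 grows, which is the pattern described above.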
We argue that more steps should be taken to ensure model fidelity. Fewer than half of the studies included in this report provided objective measures of model fidelity. The studies that did report model fidelity were limited to reporting a binary (yes/no) use of different features of the SEM. These structural features do not detail the specific characteristics of how students were taught sports skills and tactics. For example, were the lessons primarily student led or teacher led? How did student coaches teach their peers sports skills? The SEM does not prescribe these characteristics, which allows for teachers’ flexibility, but the vague fidelity reporting undermines the validity of the results of the SEM research.
Unresolved issues and future studies
Although the SEM increased all the analyzed outcomes, a substantial number of questions persist regarding the effectiveness of the SEM. Firstly, in all but one study, the students had not been taught using the SEM before. Thus, it might be that some of the effects are due to the novelty of the SEM compared to traditional lessons (González-Cutre and Sicilia, 2019). However, without measuring novelty as a concept, it is difficult to determine whether the effects of the SEM would have been different if the students had had prior experience with the model.
Another characteristic of the interventions that needs addressing is the relatively long duration of the learning units. On average, the intervention lasted for about 10 weeks and included 20 lessons. The extended length of the model may be difficult for teachers to implement, since some national PE curriculums simply do not allow teachers to dedicate long periods of time to one or two sports (Harvey et al., 2020; see Finnish National Board of Education, 2014). In addition, in the majority of the analyzed studies, the control group practiced only one or two sports during the study duration. Practicing one sport for this long could be perceived as less motivating or need-supportive than usual in a direct instruction format.
Need support, intrinsic motivation, and prosocial attitudes are commonly examined outcomes of the SEM (e.g. Cuevas et al., 2016; Wallhead et al., 2014). However, we recommend that other key outcomes also be included, namely physical activity inside and outside of PE (Hastie, 2011; Rocamora et al., 2019). Lifetime physical activity is a main outcome of PE (e.g. NASPE, 2004), yet physical activity levels were only measured in a few of the studies included in this analysis (Hastie and Trost, 2002; Rocamora et al., 2019). The limited studies that have measured physical activity in the SEM have shown promising results (Hastie and Trost, 2002). Nevertheless, there are features of the SEM that have the potential to reduce physical activity levels. For example, students may be less physically active if they are serving in the role of referee, team manager, or spectator (during the culminating event).
To further pinpoint the usefulness of the SEM, we argue that negative PE outcomes should also be considered in the future. Examples of such measures would be amotivation or aggression. The inclusion of positive and negative outcomes as well as behavioral outcomes (e.g. physical activity but also competence, literacy, and enthusiasm) would provide a clearer picture of the benefits and limitations of the SEM.
Limitations
Although the study employed detailed procedures, there are limitations that should be taken into account. Firstly, the small number of effects for most of the outcomes is an obvious limitation. In addition, the results of the analyses are rooted in the methodological features of the included studies, which lacked randomized allocation of participants to the conditions, proper fidelity procedures, and various control conditions. Together these factors leave too much room for speculation about the validity of the study results of the original articles.
The Egger’s tests, rank correlation tests, and fail-safe N metric suggested that there is a limited possibility for publication bias with the retrieved studies, especially for prosocial attitudes. However, caution regarding the publication bias results should be taken, as the number of analyzed studies was relatively small, and the heterogeneity among effects high (Sterne et al., 2000).
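For readers unfamiliar with Egger’s test, the following minimal sketch shows the underlying regression: each standardized effect (effect divided by its standard error) is regressed on its precision (one divided by the standard error), and an intercept far from zero points to funnel-plot asymmetry. This is an illustration only and not the procedure used in this study. It treats all effects as independent, so it ignores the dependence handled by the three-level models. The function name and the data are hypothetical.

```python
import math

def egger_regression(effects, std_errors):
    """Egger's regression of standardized effects on precision.

    Returns (intercept, t_statistic, degrees_of_freedom). An intercept far
    from zero suggests funnel-plot asymmetry (small-study effects).
    """
    y = [g / se for g, se in zip(effects, std_errors)]  # standardized effects
    x = [1.0 / se for se in std_errors]                 # precision
    n = len(y)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual variance and the standard error of the intercept (ordinary least squares).
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    dof = n - 2
    se_intercept = math.sqrt(rss / dof * (1.0 / n + mean_x ** 2 / sxx))
    return intercept, intercept / se_intercept, dof

# Hypothetical effect sizes and standard errors, not data from this meta-analysis.
g = [0.41, 0.49, 0.61, 0.69, 0.80]
se = [0.10, 0.20, 0.30, 0.40, 0.50]
print(egger_regression(g, se))
```

The t statistic would be compared against a t distribution with the returned degrees of freedom. With only a handful of effects that reference distribution is wide, which is one reason the test has limited power in small, heterogeneous samples.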
Conclusions
This meta-analysis suggests that the SEM is more effective in promoting basic need satisfaction, intrinsic motivation, and prosocial attitudes compared to other more traditional PE instructional styles. The model seems to especially improve students’ relatedness need satisfaction and intrinsic motivation toward PE. The heterogeneity of the effects as well as the shortcomings in study reporting and methods, however, prevent firm conclusions and highlight the relatively low quality of SEM research. Further, we argue that physical activity as well as potential negative outcomes should always be measured in models-based research. Excluding essential outcomes such as these makes it difficult to balance the pros and cons of using the SEM in PE.
Supplemental material
Supplemental Material, sj-docx-1-epe-10.1177_1356336X211017938 for The effect of the Sport Education Model on basic needs, intrinsic motivation and prosocial attitudes: A systematic review and multilevel meta-analysis by Mika Manninen and Sara Campbell in European Physical Education Review
Footnotes
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Supplemental material
Supplemental material for this article is available online.
