Abstract
Although there is a plethora of tools available to assess children's movement competence (MC), the literature suggests that many have significant limitations (e.g. not practical for use in ‘real-world’ settings). The FMS2 assessment tool has recently emerged as a valid and feasible alternative to existing measures. The aim of this study was to examine the construct validity of the FMS2 among 10- to 13-year-old children in primary school in Ireland. As part of this study, 246 children (50% boys, Mage = 11.25 years) were assessed using the FMS2 assessment tool at two time points, eight weeks apart. Structural validity was assessed using confirmatory factor analysis. Results indicated acceptable model fit for each individual factor (locomotor, object manipulation, and stability) at both time points, although stronger fit statistics were observed at the second time point. Further, each factor demonstrated measurement invariance between the two time points. When combined into a single model, model fit was inadequate, suggesting that the individual components, while related, are distinct constructs. Boys demonstrated significantly higher MC scores than girls at both time points, while no significant age-related differences were observed. The findings of this study suggest that the FMS2 assessment tool has good structural validity, and is consistent when measuring children's MC proficiency over time.
Introduction
In recent times, a plethora of research studies from across the globe has illustrated that a significant proportion of children are failing to meet the World Health Organisation's (WHO) recommendation to engage in an average of 60 minutes of moderate-to-vigorous physical activity (PA) daily (Guthold et al., 2020; Woods et al., 2023; World Health Organisation, 2020). As a consequence, children are more likely to experience numerous negative health outcomes, such as obesity, increased susceptibility to chronic diseases (e.g. cardiovascular disease, type 2 diabetes; Warburton and Bredin, 2016), and poor physical fitness (Chaput et al., 2020), as well as an increased likelihood of experiencing negative psychological and social impacts (Murphy et al., 2020; Strauss et al., 2001).
Movement competence (MC) is defined as the development of skill proficiency to ensure successful performance in a range of physical activities (Bisi et al., 2017). The importance of MC to children's PA has been well established (Eddy et al., 2020). MC is a multi-faceted construct encompassing both fundamental and functional movement skills and has been positively associated with health outcomes. Children with greater proficiency in these skills are more likely to engage in higher levels of PA (Logan et al., 2018), have a healthier body mass index (Duncan et al., 2013; Slotte et al., 2015), demonstrate greater health-related fitness (Behan et al., 2022), and achieve greater academic attainment (De Waal and Pienaar, 2020; Jaakkola et al., 2015). Despite these associations, research consistently shows that children's proficiency in these skills remains low (Behan et al., 2019; Lopes et al., 2020; O’Brien et al., 2016).
In response to consistently low levels of MC worldwide, research groups, government departments and key stakeholders have developed novel interventions and strategic plans with the aim of enhancing children's MC. For example, interventions such as Moving Well-Being Well (Ireland; Gavigan et al., 2023a), SPARK (United States; McKenzie et al., 2016), A+ (Hong Kong; Chan et al., 2016), and Project Energize (New Zealand; Mitchell et al., 2013) have been implemented across the globe. In the United Kingdom, the government has called for greater collaboration between schools and various national healthcare services to provide more MC-focussed, community-based programmes and initiatives to enhance public health (Eddy et al., 2020). In Ireland, a new physical education (PE) framework for both primary and post-primary schools (NCCA, 2022) has been launched placing movement competency as a core element, further positioning MC as an invaluable gateway to lifelong PA.
These global initiatives would greatly benefit from a suitable assessment tool capable of accurately and reliably assessing children's MC in the variety of settings where interventions take place (e.g. schools, sports clubs, clinical settings). At present, despite numerous tools being available to assess MC, the implementation of MC measurement in real-world settings (e.g. PE classes) is limited (Platvoet et al., 2018). This is due to significant limitations in existing tools, such as varied amounts of empirical evidence supporting their reliability and validity (Bardid et al., 2019; Eddy et al., 2020; Klingberg et al., 2019), often incomplete measurement of MC as a whole (e.g. stability is not assessed in certain MC measurement tools) (Rudd et al., 2015), lack of feasibility for use in many ‘real-world’ settings, such as schools or sports clubs (Eddy et al., 2020; Platvoet et al., 2018), and failure to consider movement patterns related to an increased risk of injury in children (Miller et al., 2018).
The FMS2 assessment tool has recently been developed as a targeted solution to the existing barriers that prevent practitioners from utilising MC assessments (Gavigan et al., 2022). Informed by a Delphi poll, with consensus from a panel of worldwide experts across a range of environments (research, education, sport, athletic therapy, and physiotherapy), this tool employs core features which aim to make it time-efficient, easy to administer, practical, fun, and more dynamic to better represent skill execution during play/PA (Gavigan et al., 2022). Preliminary findings investigating the structural and convergent validity of this tool have shown positive results (Gavigan et al., 2023a). To further substantiate the use of this tool in practical settings, validity and reliability must be further established. To this end, the aim of this study is to examine the construct validity of the FMS2 assessment tool among 10- to 13-year-old primary school children in Ireland.
Methods
Participants
Prior to the commencement of the study an a priori power analysis using G*Power was conducted to determine the minimum sample size required to achieve a power of 0.80 at a significance level of 0.05. Based on an anticipated effect size (Cohen's d = 0.4), the analysis indicated that a minimum sample of 111 participants was required. A total of 246 children (50% boys, Mage = 11.25 years) participated in this study, which exceeded the required threshold, ensuring adequate power to detect significant effects.
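The logic of an a priori power analysis can be sketched as follows. This is a minimal illustration using the normal approximation for a two-sided comparison of two group means. The source does not state which test family was selected in G*Power, so the test family, the function name `n_per_group`, and the approximation itself are assumptions, and the sketch is not expected to reproduce the reported minimum of 111 participants.

```python
import math
from statistics import NormalDist


def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate per-group sample size for a two-sided, two-group mean comparison.

    Normal approximation: n = 2 * ((z_{1 - alpha/2} + z_{power}) / d) ** 2.
    Hypothetical helper for illustration; not the G*Power routine used in the study.
    """
    if d <= 0:
        raise ValueError("effect size d must be positive")
    z = NormalDist()                      # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)    # critical value for the two-sided test
    z_power = z.inv_cdf(power)            # quantile matching the target power
    return math.ceil(2 * ((z_alpha + z_power) / d) ** 2)


# Effect size, alpha, and power as stated in the text.
print(n_per_group(0.4, alpha=0.05, power=0.80))
```

An exact calculation based on the noncentral t distribution typically returns a slightly larger value than this approximation, and other test families (e.g. ANOVA or correlation-based tests) lead to different required sample sizes for the same inputs.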
Data were collected in three suburban, non-fee-paying primary schools (one mixed, one boys' school, one girls' school), in County Dublin, Ireland. Prior to the study, school principals were contacted via email by the lead researcher inviting 4th, 5th, and 6th class pupils (age range 10–13 years) from their schools to participate in the study. Following consultation with the respective class teachers, all teachers agreed to participate. Upon acceptance, the lead researcher visited the principal of each school to give an overview of the study design and rationale, and to provide the classes with informed parental consent and child assent forms. Only children who provided parental consent and child assent were eligible for participation. Ethical approval for this study was granted by the Dublin City University Research Ethics Committee (REC/2022/240).
Instruments
The FMS2 assessment tool consists of three subtests and 15 test items (Gavigan et al., 2022). Of these, four are devoted to object manipulation (catch, kick, throw, and strike), five are devoted to locomotor skills (run/cut, vertical jump, horizontal jump, diagonal hop, and skip), and six are devoted to stability skills (beam balance, single-leg balance, bear crawl, crab walk, tennis-ball balance, and head-turn walk) (Gavigan et al., 2022). The assessment has two layers (versions) of marking criteria (see Supplemental Material) that can be used, based on the level of expertise of the assessor (basic and more complex versions). For the purpose of this study, the more basic criteria (Level 1) were used. The tool incorporates a dual approach, employing both process- and product-oriented assessment criteria. Each skill is assessed using a binary scoring system, where a score of ‘1’ is awarded when a component is performed correctly and ‘0’ when it is not. This scoring method results in an overall score for each skill that reflects both the execution process and the outcome of the movement. The final score for each subscale (locomotor, object manipulation, stability) is the sum of its individual components, contributing to an aggregate score that represents the child's overall MC. The sum of the observed performance criteria for each subscale comprises the total raw score (basic version: 0–100 points). Participants do not perform a practice trial. Most skills are performed and assessed twice: once on the dominant and once on the non-dominant side. Skills that are performed bilaterally include three locomotor skills (run/cut, diagonal hop, and skip), four object manipulation skills (catch, kick, throw, and strike), and four stability skills (single-leg balance, bear crawl, crab walk, and tennis ball balance; see Supplemental Material for more information). 
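The scoring logic described above (binary marks per criterion, summed per skill, then per subscale, then overall) can be sketched as below. This is an illustrative sketch only: the `Trial` structure, the function names, and the criterion names are hypothetical and are not taken from the FMS2 marking criteria or its digital application.

```python
from dataclasses import dataclass, field

SUBSCALES = ("locomotor", "object_manipulation", "stability")


@dataclass
class Trial:
    """One observed performance of a skill (hypothetical structure)."""
    skill: str
    subscale: str
    side: str  # "dominant", "non-dominant", or "both"
    criteria: dict = field(default_factory=dict)  # criterion name -> 0 or 1

    def score(self) -> int:
        # Each criterion is marked 1 (performed correctly) or 0 (not).
        for name, mark in self.criteria.items():
            if mark not in (0, 1):
                raise ValueError(f"{name}: marks must be 0 or 1, got {mark!r}")
        return sum(self.criteria.values())


def subscale_scores(trials):
    """Sum trial scores within each subscale."""
    totals = {name: 0 for name in SUBSCALES}
    for trial in trials:
        totals[trial.subscale] += trial.score()
    return totals


def composite_score(trials):
    """Overall raw score: the sum of the three subscale scores."""
    return sum(subscale_scores(trials).values())


# Illustrative trials; the criterion names are invented, not the FMS2 criteria.
trials = [
    Trial("throw", "object_manipulation", "dominant",
          {"steps_with_opposite_foot": 1, "follows_through": 0, "hits_target": 1}),
    Trial("throw", "object_manipulation", "non-dominant",
          {"steps_with_opposite_foot": 0, "follows_through": 0, "hits_target": 1}),
    Trial("horizontal_jump", "locomotor", "both",
          {"swings_arms": 1, "lands_on_both_feet": 1}),
]
print(subscale_scores(trials))
print(composite_score(trials))
```

Storing the dominant and non-dominant attempts as separate trials keeps side-specific scores available for reporting while still allowing a single summed subscale score.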
Although bilateral execution of some skills, such as throwing, is less common in many sports than that of other skills such as kicking or striking, it provides a broader understanding of movement adaptability and bilateral proficiency, which are key components of MC development; for this reason, bilateral trials are included in the FMS2 assessment. To make the assessment more feasible to use, a digital application was developed to support its administration and was piloted prior to the commencement of this study (see Supplemental Material).
Data collection
Researchers in this study were divided into three teams for the data collection process. Team 1 consisted of an academic with expertise and experience in MC, a PhD candidate, and one third-year undergraduate PE student teacher. Team 2 consisted of a post-doctoral researcher with expertise and experience in MC, and two fourth-year undergraduate sports science students. Team 3 consisted of an academic with expertise and experience in MC, and three third-year undergraduate PE student teachers. The lead researcher acted as a ‘floater’ between these groups.
Prior to data collection, all data collectors completed a training module developed by the lead researcher (combination of online and in-person). As part of this training, data collectors had to undergo a consistency assessment to ensure high-quality data were collected and recorded. All research team members were required to attain a minimum of 95% inter-observer agreement on a set of data (that was pre-coded by the lead researcher) prior to commencement of data collection.
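A consistency check of this kind can be sketched as a simple percent-agreement calculation against the pre-coded reference marks. The source does not specify how agreement was computed, so simple item-by-item percent agreement, the function names, and the example data are assumptions made for illustration.

```python
def percent_agreement(reference, observed):
    """Percentage of items on which an observer's marks match the reference marks."""
    if len(reference) != len(observed):
        raise ValueError("reference and observed must have the same length")
    if not reference:
        raise ValueError("at least one coded item is required")
    matches = sum(r == o for r, o in zip(reference, observed))
    return 100.0 * matches / len(reference)


def meets_threshold(reference, observed, threshold=95.0):
    """True when agreement reaches the required minimum (95% in the text)."""
    return percent_agreement(reference, observed) >= threshold


# Hypothetical binary marks for 20 criteria: the trainee disagrees on one item.
reference_marks = [1] * 20
trainee_marks = [1] * 19 + [0]
print(percent_agreement(reference_marks, trainee_marks))
print(meets_threshold(reference_marks, trainee_marks))
```

Percent agreement does not correct for chance agreement; a chance-corrected statistic such as Cohen's kappa would be an alternative if a stricter criterion were wanted.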
Data collection occurred at two time points, eight weeks apart (March and May 2024). The data collection protocol and personnel remained the same on both occasions. Any children who were in attendance on the day but did not consent to participate in the study completed the assessments during class time with their classmates, but their data were not recorded. On average, data collection for each class group took 55 minutes (range 45–66 minutes).
Data analysis
Data were analysed using SPSS Version 27 and AMOS Version 23 for Windows. Descriptive statistics were computed for the composite score and subtest scores of the FMS2 assessment tool.
Gender and age differences
Differences in MC proficiency according to gender and age were examined using a two-way between-groups analysis of variance (ANOVA).
Construct validity
To analyse the construct validity of the FMS2, confirmatory factor analysis (CFA) was conducted using AMOS Version 23 (Arbuckle, 2011). Robust maximum likelihood estimation (Satorra and Bentler, 2001) was used to estimate model parameters because the data exhibited a multivariate non-normal distribution. Model fit was assessed using the standardised root mean square residual (SRMR), root mean square error of approximation (RMSEA), comparative fit index (CFI), and normed fit index (NFI), using a combination strategy as advised by Hu and Bentler (1999) whereby a model is evaluated as acceptable when, for example, an acceptable SRMR is accompanied by another acceptable fit index. The SRMR represents the average discrepancy between the observed sample and hypothesised correlation matrices. Values below 0.05 are typically considered indicative of good model fit (Lopes et al., 2018). The RMSEA assesses the degree of model misspecification per degree of freedom, adjusted for the number of estimated parameters in the model (i.e. the complexity of the model). Values below 0.06 indicate good model fit (Hu and Bentler, 1999), while others typically interpret values between 0.05 and 0.10 as representing reasonable or acceptable fit (Browne and Cudeck, 1992; MacCallum et al., 1996). The CFI indicates the degree of fit between the hypothesised and null models, adjusted for sample size. The NFI reflects the proportion of the joint amount of data variance and covariance that can be explained by the measurement model being tested. CFI and NFI values above 0.95 are considered indicative of a good model fit (Hu and Bentler, 1999).
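For readers less familiar with these indices, the sketch below shows how RMSEA, CFI, and NFI are commonly computed from the chi-square statistics of the tested model and of the baseline (null) model, followed by one possible reading of the combination strategy. The formulas are the standard textbook definitions rather than output from AMOS; some software uses N rather than N − 1 in the RMSEA denominator. The function names, the cut-off defaults, and the example chi-square values are assumptions for illustration, not results from this study.

```python
import math


def rmsea(chi2, df, n):
    """Root mean square error of approximation (N - 1 convention)."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def cfi(chi2_model, df_model, chi2_base, df_base):
    """Comparative fit index, relative to the baseline (null) model."""
    d_model = max(chi2_model - df_model, 0.0)
    d_base = max(chi2_base - df_base, 0.0)
    denom = max(d_model, d_base)
    if denom == 0.0:
        return 1.0  # neither model shows misfit beyond its degrees of freedom
    return 1.0 - d_model / denom


def nfi(chi2_model, chi2_base):
    """Normed fit index: proportional reduction in chi-square from the baseline."""
    return 1.0 - chi2_model / chi2_base


def acceptable_by_combination(srmr, rmsea_value, cfi_value, nfi_value,
                              srmr_cut=0.05, rmsea_cut=0.06, incr_cut=0.95):
    """One reading of the combination strategy: an acceptable SRMR plus at
    least one other acceptable index. Cut-offs follow those quoted in the text."""
    others = (rmsea_value < rmsea_cut, cfi_value > incr_cut, nfi_value > incr_cut)
    return srmr < srmr_cut and any(others)


# Hypothetical chi-square values for a sample of 246 (not taken from Table 1).
r = rmsea(chi2=30.0, df=20, n=246)
c = cfi(chi2_model=30.0, df_model=20, chi2_base=400.0, df_base=28)
f = nfi(chi2_model=30.0, chi2_base=400.0)
print(round(r, 3), round(c, 3), round(f, 3))
print(acceptable_by_combination(srmr=0.04, rmsea_value=r, cfi_value=c, nfi_value=f))
```

Keeping the cut-offs as parameters makes it straightforward to apply the more lenient RMSEA range (up to 0.10) that some authors treat as acceptable.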
CFA for each factor individually (i.e. stability, locomotor, object manipulation) at each time point was conducted to assess the measurement model of each factor. Refined models for each factor were then combined into a second-order hierarchical model (referred to as the ‘full model’) to assess the higher-order structure of MC (Table 1). Following this, measurement invariance for each factor from time point 1 (T1) to time point 2 (T2) was assessed. Finally, a full model containing all three factors was assessed for adequate model fit at T1, T2, and then between T1 and T2 for measurement invariance.
Table 1. Fit statistics for each factor at T1 and T2.
Note: CFI: comparative fit index; TLI: Tucker–Lewis index; RMSEA: root mean square error of approximation; SRMR: standardized root mean square residual; χ2: chi-square test of model fit; df: degrees of freedom; NFI: Normed fit index.
Internal consistency
The internal consistency of the FMS2 assessment tool was assessed for each construct using Cronbach's alpha to determine the reliability of the item sets. Cronbach's alpha values above 0.70 are generally considered acceptable, indicating that the constructs measured by the FMS2 tool are reliable (Tavakol and Dennick, 2011).
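As an illustration of how this statistic is obtained, the sketch below computes Cronbach's alpha from a participants-by-items score matrix using the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of total scores). The function name and the example scores are hypothetical; the study itself used SPSS.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of rows (one row of k item scores per participant)."""
    k = len(item_scores[0])
    if k < 2:
        raise ValueError("at least two items are required")
    if any(len(row) != k for row in item_scores):
        raise ValueError("every participant needs a score for every item")
    columns = list(zip(*item_scores))                 # one tuple of scores per item
    item_variances = [variance(col) for col in columns]
    total_variance = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical scores for five participants on four items of one subscale.
scores = [
    [4, 5, 3, 4],
    [2, 3, 2, 2],
    [5, 5, 4, 5],
    [3, 3, 3, 2],
    [1, 2, 1, 2],
]
alpha = cronbach_alpha(scores)
print(round(alpha, 2), alpha >= 0.70)
```

The sample (n − 1) variance is used for both the item and total-score variances; using the population variance for both gives the same alpha because the ratio is unchanged.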
Results
Gender and age differences
A two-way between-groups ANOVA was conducted to explore the impact of gender and age on the composite scores of the FMS2 assessment at T1 and T2 (Table 2). The interaction effect between gender and age group was statistically significant at T1 (F(3, 246) = 2.93, p = .034, partial η2 = .036), but not at T2 (F(3, 246) = 0.93, p = .964, partial η2 = .001).
Table 2. Mean composite scores for overall MC by age and gender at T1 and T2.
Note: η2: partial eta-squared – effect size; Loco: locomotor; OM: object manipulation; Stab: stability; MC: movement competence.
Maximum composite score (100).
There was no statistically significant main effect for age at T1 (F(4, 246) = 1.405, p = .233, partial η2 = .023) or T2 (F(4, 246) = 1.724, p = .145, partial η2 = .028). There was a statistically significant main effect for gender at both T1 (F(1, 246) = 12.263, p = .001, partial η2 = .049) and T2 (F(1, 246) = 6.443, p = .012, partial η2 = .026), with boys scoring significantly higher than girls at T1 (mean difference = 7.48) and T2 (mean difference = 5.99).
For the subscales, a similar pattern was observed. The locomotor subscale showed a significant main effect for gender at T1, F(1, 246) = 9.83, p = .002, partial η2 = 0.04, but no significant effect for age or interactions. The object manipulation subscale also exhibited a significant gender effect at T1, F(1, 246) = 8.92, p = .001, partial η2 = 0.255, but no significant effects for age. The stability subscale demonstrated a non-significant gender effect, F(1, 246) = 0.14, p = .705, partial η2 = 0.001, across both time points.
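The effect sizes reported alongside each F test can be related back to the test statistic. The sketch below uses the standard identity partial η2 = (F × df_effect) / (F × df_effect + df_error). The function name is hypothetical, and the example assumes that the second value in F(1, 246) is the error degrees of freedom, which the source does not state explicitly.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta-squared recovered from an F statistic and its degrees of freedom."""
    if f_value < 0 or df_effect <= 0 or df_error <= 0:
        raise ValueError("F must be non-negative and both df values positive")
    numerator = f_value * df_effect
    return numerator / (numerator + df_error)


# Locomotor gender effect at T1 as reported in the text: F(1, 246) = 9.83.
print(round(partial_eta_squared(9.83, 1, 246), 2))  # 0.04
```

By common convention, partial η2 values of roughly .01, .06, and .14 are described as small, medium, and large effects, which places the gender effects reported here in the small-to-medium range.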
Construct validity
For each factor, CFA was conducted to test the measurement model of the factor at T1 and T2. All three factors showed acceptable model fit at T1 (RMSEA: 0.062, CFI: 0.91, TLI: 0.88, NFI: 0.84) and T2 (RMSEA: 0.049, CFI: 0.95, TLI: 0.94, NFI: 0.89), with stronger fit statistics for all factors observed at T2. Measurement invariance from T1 to T2 was assessed individually for each factor (Table 3). The locomotor and stability factors (Model 1) were non-invariant over time, while object manipulation demonstrated measurement invariance. Because of this non-invariance, indicator loadings for the locomotor and stability factors were inspected and indicators with low loadings at T2 (<.34) were removed. This led to the removal of vertical jump (0.327) from the locomotor factor and single-leg balance (0.29) and beam balance (0.331) from the stability factor. Following the removal of these indicators, a second CFA for each of these factors was conducted (Loco Model 2 and Stability Model 2). Both models showed better fit than the initial versions (Table 1). Measurement invariance was also satisfied for the revised factors (Table 3). Finally, the full model was specified using Stability Model 2, Loco Model 2, and Object Model 1. First, a version was specified containing the underlying MC construct. However, this model was inadmissible (negative error variances). The model was therefore re-specified with correlated factors instead (Full model, Table 1). Although invariant over time (Table 3), this model did not demonstrate acceptable model fit at T1; model fit was acceptable at T2 (Table 1).
Table 3. Measurement invariance for each model between T1 and T2.
Internal consistency
The locomotor subscale, comprising five items, demonstrated acceptable internal consistency with a Cronbach's alpha of 0.75. The object manipulation subscale, consisting of four items, showed acceptable reliability with an alpha of 0.78. The stability subscale, which included six items, had acceptable internal consistency with an alpha value of 0.70. These values confirm that the items within each construct are sufficiently related and contribute to the reliability of the FMS2 assessment.
Discussion
The primary aim of this study was to examine the construct validity of the FMS2 assessment tool among 10- to 13-year-old primary school children in Ireland. The results provide several key insights into children's MC, including gender differences, age-related development, and the overall utility of the FMS2 tool.
It is evident from the literature that developing proficiency in all facets of MC is important for children, and can have wide-reaching benefits for their overall health and wellbeing (Barnett et al., 2009; Behan et al., 2022; Duncan et al., 2013; Duncan and Stanley, 2012; Fitton Davies et al., 2022; Slotte et al., 2015). The findings of this study revealed that overall, the children's MC was low. This can be seen in the mean overall FMS proficiency scores which ranged from 48.15 to 62.70, out of a possible 100. This finding aligns with a number of studies involving similar cohorts that exhibit slow progression or a lack of any significant development in MC during later childhood, with many failing to achieve mastery in fundamental and functional movement skills (Behan et al., 2019; Hardy et al., 2013; Lester et al., 2017; Lopes et al., 2020; O’Brien et al., 2016). This pervasive low level of MC is worrying, and underscores an urgent need for schools and community-based programmes to prioritise the enhancement of children's MC in their curricula and teaching.
In addition to the overall level of MC being below the desired level, gender differences were also evident, with boys demonstrating higher proficiency than girls at both time points. This finding is consistent with existing literature documenting gender disparities in MC (Barnett et al., 2010; Tsuda et al., 2020). These disparities are echoed by recent studies such as the Children's Sport Participation and Physical Activity study in Ireland, which shows that more boys than girls are meeting the PA guidelines, and that more boys than girls participate in both school and community sport (Behan et al., 2019; Hardy et al., 2013; Lester et al., 2017; O’Brien et al., 2022). These differences could stem from various social, cultural, and environmental factors, such as social norms that encourage boys more than girls to participate in PA (Hopkins et al., 2022; Kretschmer et al., 2023). Regardless of how these differences emerge, if MC development is a gateway to promoting increased PA (Eddy et al., 2020), it is imperative that greater attention be given to developing gender-relevant interventions that promote and support girls’ MC development and engagement in PA equally with boys. Further, schools and sports clubs should provide equal opportunities and encouragement for both genders, potentially through programmes specifically designed to boost girls’ confidence and interest in PA.
Another critical finding is that the children's MC did not appear to improve significantly with age within the studied range. The lack of significant change could be a deficiency in sensitivity of the tool, or it could simply suggest that children are not developing their physical skills as expected as they move through the primary-school years (Table 2). Several factors could contribute to this stagnation, including insufficient PA, inadequate PE programmes, or a lack of exposure to diverse PA. Future research should explore this further to establish the reason for this plateau in proficiency. During the school years, teachers play a critical role in developing children's MC through PE classes (Strong et al., 2005). Yet, for many generalist primary school teachers, teaching PE lessons can be daunting, and many lack the confidence and competence to effectively enhance children's skills (Ní Chróinín, 2017). Thus, policy makers should ensure that PE remains a priority in school curricula and ensure that all teachers have the necessary training, resources, and supports to effectively support children's MC development.
One resource essential for supporting practitioners in developing MC is access to an assessment tool which is accurate, reliable, and feasible to use in the school setting. Previous literature on the FMS2 assessment tool suggested that, as well as being time efficient and ecologically valid, it had good structural and convergent validity (Gavigan et al., 2023b), albeit with a small sample size. Findings from this study demonstrate that the tool was time efficient, with 4th class (25 children per class) taking an average of 62 minutes, 5th class (25 children per class) taking an average of 53 minutes, and 6th class (16 children per class) taking an average of 31 minutes. This aligns with findings previously reported by Gavigan et al. (2023b). Further, the acceptable to strong internal consistency values observed across the three subscales suggest that the FMS2 tool is a reliable measure for assessing MC in children. The findings of this study also indicate that each individual subcomponent (i.e. locomotor, object manipulation, and stability) showed acceptable model fit. Yet, when combined into a single model, the fit was inadequate. This finding suggests that these components, while related, may be distinct constructs, and that each component should perhaps be assessed and reported individually to provide a more accurate and actionable evaluation of a child's MC.
In addition to this, the model demonstrated measurement invariance between the two time points, indicating that the FMS2 tool consistently measured MC over time. This consistency is crucial for longitudinal studies and for evaluating the effectiveness of interventions, or children's progress over extended periods. The better model fit at T2 compared to T1 is curious, and might be attributed to increased familiarity and confidence of the assessors with the tool, which would point to the importance of initial training and practice for assessors. In the school context, this also highlights the need for teachers to be given a tool that is user-friendly and relatively easy to use. Future versions of this tool will seek to incorporate computer vision to automate/semi-automate this assessment process, but further research will be required to develop and refine this technology.
Another notable finding was that removing the vertical jump from the locomotor model improved overall model fit. This aligns with other assessments, such as the Test of Gross Motor Development – 3rd Edition (Ulrich, 2019), which also excludes the vertical jump. The present findings suggest that the performance criteria for the vertical and horizontal jump overlap considerably. This interpretation is further supported by both skills demonstrating acceptable internal consistency (Cronbach's alpha = 0.74). Together, these findings suggest that future assessments may benefit from omitting the vertical jump to streamline the evaluation process without compromising measurement accuracy. Simplifying the assessment tool by removing redundant items can make it more practical and less time-consuming, encouraging broader adoption and more consistent use in various settings. The object manipulation indicators consistently showed a good fit at both time points, supporting the robustness of this component in the FMS2 tool. This consistency across time reinforces the reliability of this subcomponent and suggests that these skills are measured effectively by the FMS2 tool. Given the strong performance of the object manipulation indicators, future research could explore the specific factors contributing to this reliability and how they can be applied to improve other components of the assessment.
Limitations
While this study provides initial evidence for the construct validity of the FMS2 assessment tool, several limitations should be acknowledged. First, the findings pertain specifically to the Level 1 (basic) criteria used in the assessment; further validation of the Level 2 (complex) criteria is required. Second, this study did not assess inter- and intra-rater reliability or test–retest reliability, which are critical for determining whether practitioners can consistently administer and interpret the tool. Future studies should examine these reliability measures to support broader implementation.
Additionally, although the construct validity of the FMS2 was examined in children aged 10–13 years, further research is needed to determine its applicability to younger children (8–9 years old). Finally, clearer practitioner guidelines should be developed to ensure that the FMS2 results are effectively interpreted in educational and coaching settings.
Despite these limitations, the study provides a foundation for the development and advancement of the FMS2 as an MC assessment tool. Ongoing research will help refine the instrument, enhance its reliability and applicability, and support its future use in evidence-based assessment.
Conclusion
A plethora of research has highlighted the key role MC plays in providing children with the tools they need to lead and sustain a physically active life (Barnett et al., 2009; Behan et al., 2022; Hardy et al., 2013). Yet, the results from this study suggest that despite an enhanced focus being put on MC in recent times, many children are still well below the desired level of MC proficiency. This low proficiency level is particularly evident in young girls, among whom recent research has demonstrated lower proficiency in MC (Bolger et al., 2021) and subsequently a lower likelihood of meeting the PA guidelines (Woods et al., 2023). In order to address this trend, schools, sports clubs, and community-based PA programmes should increase their efforts to promote and develop children's MC. Having a valid, reliable, and feasible assessment tool is an important component of this process. The findings of this study suggest that the FMS2 assessment tool has good structural validity, and shows consistency in its ability to measure children's MC proficiency over time. This, coupled with previous research on this tool suggesting that it is time-efficient and ecologically valid, makes it a viable option for practitioners to use in time-restricted environments such as schools. Future research should look to consolidate these findings in other environments, and further explore the reliability and feasibility of the tool in ‘real-world’ settings.
Supplemental Material
Supplemental material, sj-docx-1-epe-10.1177_1356336X261426308 for Examining the construct validity of the FMS2 assessment tool by Nathan Gavigan, Sarahjane Belton and Una Britton in European Physical Education Review
Footnotes
Acknowledgements
The research team would like to offer our sincere thanks to the schools, principals, teachers and children for their participation in the research study. In addition, we would like to thank all those who assisted with the data collection process to make this study possible. Finally, we would like to thank Shane from Athletech Limited for building the first prototype version of the FMS2 digital application, which greatly helped with the data collection process.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Supplemental material
Supplemental material for this article is available online.
