
Abstract

Background:

Caring for health from childhood is a major challenge. To date, machine learning (ML) algorithms have been introduced to several fields of knowledge, but in education they remain a novel perspective. This review aims to evaluate the effectiveness of ML applications on data registered by inertial measurement units during the school-hours physical activity of children from preschool to secondary education. Furthermore, the review aims to explore how ML is used to process and interpret these data for outcomes such as motor competence, physical activity intensity, sedentary behavior and academic/developmental indicators.

Methods:

Following PRISMA guidelines, we systematically searched the PubMed, Web of Science, SCOPUS, SPORTDiscus and ProQuest Central databases.

Results:

13 studies met the inclusion criteria, covering preschool to secondary education settings across multiple countries. The methodological quality ranged from moderate to high (11–17/18 MINORS points). ML algorithms, mainly Random Forest, Support Vector Machines, Gradient Boosting and Convolutional Neural Networks, were successfully applied to classify or predict various outcomes such as motor competence, physical activity intensity, sedentary behavior and developmental or academic indicators.

Conclusion:

Reported accuracies ranged from approximately 70% to 99%, demonstrating the strong potential of wearable sensor data combined with ML to objectively monitor and assess school-related physical activity.

Keywords

adolescent, child, exercise, health promotion, machine learning, schools, sedentary behavior

Introduction

Physical activity is fundamental to human nature. It improves muscular and cardiorespiratory fitness, bone health and mental fitness while reducing the risk of heart disease, diabetes, hypertension, obesity and fractures. Insufficient physical activity is among the leading risk factors for mortality. The World Health Organization (WHO) therefore recommends that people of all ages engage in regular physical activity.^1,2 However, children and adults worldwide are struggling to meet the WHO guidelines.^3,4

For children, physical activity is beneficial for development in 3 major areas: motor skills, cognitive competency (such as creativity, attention and mental abilities) and social competency.⁵ Therefore, monitoring children’s physical activity is valuable for better understanding their physical and mental development, along with the risk factors that emerge with insufficient physical activity levels.⁶ Given the seriousness of these risk factors, which include depressive symptoms and suicidal behavior, accurate assessment is crucial.⁷ Subjective methods such as questionnaires and parental reports have historically been used, but have been shown to be insufficiently reliable due to recall bias.⁸ Such methods are also time-intensive, have poor generalizability and yield substantial misclassification errors in children.^9-11

Information technology, such as accelerometers and other wearable sensors, collects objective and continuous data on physical activity. Thus, this technology has been considered useful as an unbiased way of validating the subjective methods previously relied upon.¹² Wearable sensors have provided the opportunity to monitor children’s physical activity throughout school hours, which are a significant part of children’s daily life. Until recently, the technology has been limited in differentiating between types of physical activity. However, machine learning (ML) and deep learning (DL) algorithms have been applied to data collected by wearable sensors in attempts to improve the practical usability of these instruments. This has resulted in an automated process, leading to dramatic time-efficiency gains,¹³ and improved activity classification by detecting subtle differences in movement patterns.^14-16 This has also enabled a more detailed analysis of different intensity levels of physical activity and their effect on health indicators.¹⁷ The validity and reliability of ML-based wearable technology have been examined for assessing physical activity in preschool- and school-age youth.¹⁸ However, a wide variety of physical activities are performed spontaneously in real-life settings, which can negatively affect the performance of ML algorithms.¹⁹

Several reviews have examined the application of ML and DL algorithms to children’s physical activity data from wearable technology and have revealed significant progress within the field.^20,21 This progress has provided the following benefits with regard to children’s physical activity: (1) real-time monitoring and automated alert systems, (2) personalized evaluation frameworks accounting for differences in age and gender, (3) informing intervention strategies by identifying sedentary patterns and (4) detection of complex movement patterns that are otherwise likely to be missed by traditional methods of analysis.^22-24 However, previous systematic reviews have focused on early childhood²⁵ or specific clinical populations,²⁶ leaving a gap in understanding the full educational spectrum from preschool to secondary school.

Thus, this review aims to evaluate the effectiveness of ML applications on data registered by information technology during the physical activity of children from preschool to secondary education in physical education, the classroom and school travel. Furthermore, the review aims to explore how ML is used to process and interpret these data for outcomes such as motor competence, physical activity intensity, sedentary behavior and academic/developmental indicators, and to identify methodological inconsistencies in order to improve methodological practice in the future.

Materials and Methods

Experimental Approach to the Problem

This systematic review complied with existing standards for conducting systematic reviews in sport sciences²⁷ and the Preferred Reporting Items for Systematic Review and Meta-Analyses (PRISMA) guidelines.²⁸ The review methodology was designed to preserve methodological rigor and ensure thorough coverage of the pertinent literature. The systematic review was registered in PROSPERO (CRD420261285539).

Information Sources

Five databases (PubMed, Web of Science, SCOPUS, SPORTDiscus and ProQuest Central) were thoroughly searched. The search included all literature published before October 8, 2025.

Search Strategy

The PICO (Patient, Problem or Population – Intervention or Exposure – Comparison, Control or Comparator – Outcome[s]) framework was used to organize the search strategy and ensure systematic coverage of the pertinent literature. For transparency, the authors were not blinded to journal names or manuscript authors. The search terms were chosen to capture all pertinent material on information technology and machine learning in educational settings. The final search string was:

(preschool OR kindergarten OR school* OR schoolchildren OR “primary education” OR “elementary education” OR “secondary education” OR “high school”) AND (“machine learning” OR “deep learning”) AND (exercise OR “Physical activity” OR “physical education” OR sport OR fitness OR aerobic OR “motor skill*” OR “motor competence”) AND (“inertial measurement unit*” OR gyroscope OR pedometer OR barometer OR “smart band*” OR smartwatch* OR acceleromet* OR wearable OR sensor*).
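The search string above combines four groups of synonyms, joined by OR within a group and by AND between groups. As a minimal sketch of that structure (the variable and function names are illustrative and not part of the review, and straight quotes are used in place of the typographic ones), it could be assembled as follows:

```python
# Term groups copied from the search string; names are hypothetical.
POPULATION = ['preschool', 'kindergarten', 'school*', 'schoolchildren',
              '"primary education"', '"elementary education"',
              '"secondary education"', '"high school"']
METHOD = ['"machine learning"', '"deep learning"']
ACTIVITY = ['exercise', '"Physical activity"', '"physical education"', 'sport',
            'fitness', 'aerobic', '"motor skill*"', '"motor competence"']
SENSOR = ['"inertial measurement unit*"', 'gyroscope', 'pedometer', 'barometer',
          '"smart band*"', 'smartwatch*', 'acceleromet*', 'wearable', 'sensor*']


def or_block(terms):
    """Join the synonyms of one concept with OR and wrap them in parentheses."""
    return "(" + " OR ".join(terms) + ")"


def build_query(*groups):
    """Join the concept blocks with AND so that every concept must match."""
    return " AND ".join(or_block(group) for group in groups)


QUERY = build_query(POPULATION, METHOD, ACTIVITY, SENSOR)
print(QUERY)
```

Keeping each concept in its own list makes it easier to adapt the string to the syntax of an individual database without altering the overall logic.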

Eligibility Criteria

The authors entered the search string into each database and downloaded the title, authors, journal and date of every article retrieved. After the Excel file was organized, duplicate articles were eliminated and the remaining articles were assessed for eligibility. Relevant articles that did not appear in the search were added to the Excel spreadsheet and marked as “included from external sources” (Table 1).

Table 1.

Inclusion and Exclusion Criteria.

Item	Inclusion	Exclusion
Population	Children and adolescents up to 18 years old	People older than 18 years
Intervention or Exposure	Children performing physical activity during school hours (school travel included). Articles with data recorded both during school hours and at other times (24 h) were also included. Data extracted from inertial measurement units.	Articles that do not include any data from children’s school hours. Data not extracted from an inertial measurement unit. Study protocols.
Comparison	Not applicable	Not applicable
Outcome[s]	Any result obtained after applying machine learning or deep learning	Results unrelated to machine learning/deep learning applications
Other criteria	Peer-reviewed, full-text original journal articles	Non-peer-reviewed journal articles. Non-original full-text studies (conference papers, etc.). Systematic reviews, meta-analyses, book chapters, etc.

Data Extraction

An Excel spreadsheet created in compliance with the Cochrane Consumers and Communication Review Group’s data extraction template was used to conduct a consistent data extraction procedure.¹⁴ The spreadsheet made it easier to systematically evaluate the inclusion and exclusion criteria for each of the chosen studies. Two authors independently conducted the extraction process (including manual removal of duplicates), checking titles/abstracts and full texts, with any disagreements resolved through discussion until consensus was reached. Full documentation was maintained for excluded articles, including specific reasons for exclusion. All data were systematically recorded and stored in the spreadsheet.

Assessment of Study Methodology

Methodological quality was assessed using the methodological index for non-randomized studies (MINORS). The MINORS scale contains 8 essential items and is expanded to 12 items when the studies are comparative. In this review, 9 items were considered (maximum of 18 points) because 3 items were not applicable (NA). Each item is scored from 0 to 2, depending on the quality achieved on that item.
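The scoring rule described above (each applicable item scored 0 to 2, with NA items excluded from the maximum) can be illustrated with a short sketch. The function name and the item values below are hypothetical and do not correspond to any specific included study:

```python
def minors_score(item_scores):
    """Return (total, maximum) for a list of MINORS item scores.

    None marks a not-applicable (NA) item, which counts toward neither
    the total nor the maximum.
    """
    applicable = [score for score in item_scores if score is not None]
    if any(score not in (0, 1, 2) for score in applicable):
        raise ValueError("each applicable item must be scored 0, 1 or 2")
    return sum(applicable), 2 * len(applicable)


# Hypothetical study: 12 items, of which 3 are NA, leaving 9 applicable items.
example = [2, 1, 2, 2, 2, 0, 0, 0, None, None, None, 2]
total, maximum = minors_score(example)
print(f"{total}/{maximum}")  # 11/18
```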

Results

Identification and Selection of Studies

After searching all databases (PubMed: 23; Web of Science: 27; ProQuest Central: 13; SCOPUS: 49; SPORTDiscus: 5; External sources: 2), 119 articles were identified, of which 60 were detected as duplicates at the initial stage. The authors then assessed whether each of the remaining 59 articles met all inclusion criteria, resulting in the elimination of 46 articles by exclusion criterion number 1 (n = 10), exclusion criterion number 2 (n = 26), exclusion criterion number 4 (n = 2) and exclusion criterion number 6 (n = 8). The remaining 13 articles were included in the qualitative synthesis of the systematic review (Figure 1).
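The selection counts can be cross-checked with simple arithmetic. The sketch below uses the per-database and per-criterion figures from the paragraph above (the variable names are illustrative): the per-criterion exclusions sum to 46, which leaves the 13 included studies.

```python
# Records identified per source, as reported in the text.
identified = {
    "PubMed": 23,
    "Web of Science": 27,
    "ProQuest Central": 13,
    "SCOPUS": 49,
    "SPORTDiscus": 5,
    "External sources": 2,
}
duplicates = 60
# Articles removed per exclusion criterion number, as reported in the text.
excluded_by_criterion = {1: 10, 2: 26, 4: 2, 6: 8}

total_identified = sum(identified.values())
screened = total_identified - duplicates
excluded = sum(excluded_by_criterion.values())
included = screened - excluded

print(total_identified, screened, excluded, included)  # 119 59 46 13
```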

Figure 1.

Flow diagram of the study.

Quality Assessment

The methodological quality of the 13 included studies, assessed using the MINORS checklist, ranged from 11 to 17 out of 18 points, indicating generally moderate to high quality. Most studies clearly stated objectives, used appropriate designs and collected relevant data. However, several lacked prospective sample size estimation, adequate control groups or neutral evaluation procedures. Only a few studies achieved the highest methodological standards, while the majority demonstrated solid internal consistency but moderate external validity. Overall, the reviewed works show acceptable methodological rigor, supporting confidence in their reported findings (Table 2).

Table 2.

Methodological Assessment of the Included Studies.

Reference	1	2	3	4	5	6	7	9	10	11	12	Score
Brons et al¹⁵	2	1	2	2	2	0	0	NA	NA	NA	2	11/18
Sulla-Torres et al¹⁶	2	2	2	2	2	0	0	NA	NA	NA	2	12/18
Froud et al²⁹	2	2	2	2	2	2	0	NA	NA	NA	2	14/18
Lander et al³⁰	2	1	0	2	2	2	2	NA	NA	NA	2	11/18
Muñoz-Organero et al⁹	2	1	2	2	2	0	0	NA	NA	NA	2	11/18
Li et al¹⁰	2	1	2	2	2	0	0	NA	NA	NA	2	11/18
Joensuu et al¹¹	2	2	0	2	2	2	0	NA	NA	NA	2	12/18
Carlson et al³¹	2	1	2	2	2	0	0	NA	NA	NA	2	11/18
Christian et al³²	2	1	2	2	2	0	0	NA	NA	NA	2	13/18
Letts et al³³	2	1	2	2	2	0	0	NA	NA	NA	2	11/18
Kwon et al¹³	2	2	2	2	2	2	0	NA	NA	NA	2	14/18
Clark et al³⁴	2	1	2	2	2	0	0	NA	NA	NA	2	11/18
Mendoza et al³⁵	2	1	2	2	2	2	0	NA	NA	NA	2	17/18

Abbreviation: NA, not applicable. The MINORS checklist (2 = high quality; 1 = medium quality; 0 = low quality): Clearly defined objective (item 1); Inclusion of patients consecutively (item 2); Information collected retrospectively (item 3); Assessments adjusted to objective (item 4); Evaluations carried out in a neutral way (item 5); Follow-up phase consistent with the objective (item 6); Dropout rate during follow-up less than 5% (item 7); Prospective estimation of sample size (item 8); Adequate control group (item 9); Simultaneous groups (item 10); Homogeneous starting groups (item 11); and, appropriate statistical analysis (item 12).

Study Characteristics

Sample

The 13 included studies involved children and adolescents aged 3 to 18 years across diverse educational levels, from preschool to secondary school. Sample sizes varied widely, ranging from 14 participants in laboratory-based feasibility studies to over 1,700 schoolchildren in large-scale population analyses. Most samples were balanced by sex and recruited from schools in Europe, North America and Australia, ensuring a heterogeneous representation of educational and cultural contexts.

Data Collection Methods

Data were primarily gathered through inertial measurement units (IMUs) and related wearable sensors such as ActiGraph accelerometers, smart bands, gyroscopes and GPS loggers. Sampling frequencies ranged from 10 Hz to 110 Hz, with epochs typically set between 1 and 60 s depending on the target variable. Collected features included raw tri-axial acceleration, angular velocity, stride cadence, heart rate and derived metrics such as energy expenditure or motor competence indicators. Data were generally processed using standardized pipelines and exported for subsequent machine learning analysis.
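To make the relationship between sampling frequency, epoch length and derived features concrete, the following minimal sketch splits a tri-axial acceleration signal into non-overlapping epochs and summarises the vector magnitude of each one. It is an illustration only: the function names, the three summary features and the synthetic signal are assumptions, not the pipeline of any included study.

```python
import math
from statistics import mean, pstdev


def vector_magnitude(sample):
    """Euclidean norm of one (x, y, z) acceleration sample."""
    x, y, z = sample
    return math.sqrt(x * x + y * y + z * z)


def epoch_features(samples, sampling_hz, epoch_s):
    """Summarise each complete non-overlapping epoch of tri-axial samples."""
    window = int(sampling_hz * epoch_s)  # samples per epoch
    features = []
    for start in range(0, len(samples) - window + 1, window):
        vm = [vector_magnitude(s) for s in samples[start:start + window]]
        features.append({
            "mean_vm": mean(vm),
            "sd_vm": pstdev(vm),
            "max_vm": max(vm),
        })
    return features


# Synthetic 30 s signal at 30 Hz: 15 s at rest (1 g on the z axis),
# followed by 15 s of a constant, larger acceleration.
rest = [(0.0, 0.0, 1.0)] * (30 * 15)
active = [(3.0, 0.0, 4.0)] * (30 * 15)
feats = epoch_features(rest + active, sampling_hz=30, epoch_s=15)
print(len(feats), feats[0]["mean_vm"], feats[1]["mean_vm"])
```

With 30 Hz data and 15 s epochs, each window holds 450 samples, so the 900-sample synthetic signal yields two feature rows that a classifier could then consume.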

Study Settings and Research Focus

Most investigations were conducted in naturalistic school or preschool environments, with some including laboratory validation phases. Activities analyzed ranged from classroom movement and playground behavior to school travel and structured motor skill assessments. The studies aimed to address diverse educational and health objectives, including classification of motor competence, detection of sedentary behavior, prediction of academic performance and identification of developmental or neurobehavioral patterns such as ADHD-related movement profiles.

Machine Learning Implementation

A variety of supervised and unsupervised machine learning algorithms were applied, including Random Forest, Support Vector Machines, Gradient Boosting, Neural Networks, Convolutional Neural Networks, k-means clustering and Self-Organising Maps. Reported model accuracies ranged from 70% to 99%, depending on the complexity of the task and data quality. Most studies emphasized feature engineering and validation procedures, while a few integrated deep learning frameworks for automated feature extraction. Collectively, these implementations demonstrate the growing feasibility of using machine learning to analyze sensor-based data for monitoring and enhancing schoolchildren’s physical activity and motor development (Table 3).
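The included studies report different performance metrics (accuracy, sensitivity, specificity, balanced accuracy, F1-score), which are not directly interchangeable. The sketch below shows how these metrics relate to one binary confusion matrix; the counts are hypothetical. As a consistency check on reported values, the sensitivity (93.6%) and specificity (81.6%) given for the CHAP-child method in Table 3 average to its reported balanced accuracy of 87.6%.

```python
def binary_metrics(tp, fp, tn, fn):
    """Common classification metrics from binary confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # true positive rate (recall)
    specificity = tn / (tn + fp)  # true negative rate
    precision = tp / (tp + fp)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": (sensitivity + specificity) / 2,
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
    }


# Hypothetical counts, for illustration only.
m = binary_metrics(tp=80, fp=10, tn=90, fn=20)
print({name: round(value, 3) for name, value in m.items()})

# Balanced accuracy recomputed from the sensitivity and specificity
# reported for the CHAP-child method in Table 3.
chap_child_balanced = round((93.6 + 81.6) / 2, 1)
print(chap_child_balanced)  # 87.6
```

Balanced accuracy and F1 are less sensitive to class imbalance than plain accuracy, which matters when, for example, sedentary epochs greatly outnumber active ones.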

Table 3.

Main Characteristics and Findings of Machine Learning Applications Using Accelerometer Data in Schoolchildren.

Ref.	Participants	Activity registration: accelerometer	Activity registration: Hz and epoch	Activity registration: place	Activity registration: attributes/features/variables	Aim of prediction	ML/DL algorithm	ML/DL accuracy (%)	Conclusions	Practical application of the prediction
Data extracted during school hours
Brons et al¹⁵	N: 95 children (52 girls, 43 boys) Age: 7.8 ± 0.7 years Country: Netherlands	Custom “Futuro Cube” Sensors: Tri-axial accelerometer (±8G) and tri-axial gyroscope (2000 dps)	Hz: 110 Hz Epoch: Not specified; features calculated per game level (30-60 sec levels).	School setting/Laboratory conditions	Sensor Features (both games): Total acceleration, total angular velocity, jerk (smoothness of translation), derivative of angular velocity (smoothness of rotation). Game Features: Roadrunner: Cosine similarity (accuracy). Maze: Maze correctness (time on path). General Features: Age, gender.	To predict the outcome of the fine motor skill part of the Movement ABC-2 (fine MABC-2) and classify children as having/not having fine motor skill problems.	Decision Tree (DT) k-Nearest Neighbor (KNN) Logistic Regression (LR) Support Vector Machine (SVM)	Highest Accuracy: 0.76 (76%) Highest F1-score: 0.70 (70%) Achieved with DT classifier using sensor + game data from the Roadrunner game.	Sensor-augmented toys can efficiently predict fine MABC-2 scores. The game focusing on speed (Roadrunner) performed significantly better than the precision-based game (Maze). Using a combination of sensor and game features was most effective. The choice of classifier was less important than the selection of input features and game type.	Can be used as a less time-consuming, playful and motivating tool for the initial assessment and screening of fine motor skill problems in elementary school children within a natural classroom setting, potentially reducing the need for universal standardized testing.
Froud et al²⁹	N: 1,711 children Age: 11-12 years Country: Norway Setting: 9 schools	ActiGraph (model not specified)	Not specified	Free-living setting	Physical activity levels, aerobic fitness, muscular strength, diet, parental education, quality of life, academic performance (reading, mathematics, English tests)	Academic performance and quality of life	Linear Regression, Random Forest, Support Vector Machine, k-Nearest Neighbors, Neural Network	- Regression: R² = 22%-24% (academic performance), 15% (quality of life) - Machine learning: R² ≈ 0% (academic performance), 3%-8% (quality of life) in validation	Linear regression was less prone to overfitting and outperformed machine learning techniques for modeling continuous outcomes with missing data. Machine learning performed better only for simulated nonlinear/heteroscedastic relationships.	Machine learning approaches may be useful when dealing with complex nonlinear relationships and complete datasets, but traditional regression methods are more reliable for typical educational and health outcome prediction with real-world missing data.
Sulla-Torres et al¹⁶	764 schoolchildren (451 males, 313 females) aged 6-17 years from 5 state schools in Arequipa, Peru.	Huawei Band 7 smart band	Not explicitly stated for raw data; features extracted from tests.	Laboratory	Anthropometric: Age, weight, height, waist circumference, BMI. Smart Band Metrics: Cadence, number of steps, calories, speed, stride, heart rate, maximum heart rate.	To classify motor competence level (High, Normal, Low) based on percentile cut-offs (<P25, P25-P75, >P75).	11 algorithms tested (eg, Random Forest, SVM, Naive Bayes, Neural Network, Gradient Boosting, XGBoost, LightGBM, CatBoost).	Best Model (Gradient Boosting): Accuracy: 0.95 (Males), 0.89 (Females); F1-score: 0.92-0.93; ROC-AUC: 0.98.	Motor competence can be classified using smart band data and machine learning. Hyperparameter optimization significantly improved accuracy, with Gradient Boosting being the best-performing model. The proposed percentiles provide reference values for walking motor competence.	A mobile app was developed to allow educators to input student data and smart band metrics to automatically classify motor competence and identify children needing intervention, facilitating personalized physical activity recommendations in schools.
Lander et al³⁰	14 children (9 boys, 5 girls); Age: 7-12 years (M = 9.64); Country: Australia	XSENS AWINDA system (17 IMU sensors); Simplified: 4 sensors (wrists and ankles)	60 Hz; Not specified for epochs	School/university gymnasiums	Raw signals (angular velocity orientation); Time- and frequency-domain features; Kinematic body model data	To automatically classify performance of 7 TGMD-3 motor skills (hop, sidestep, skip, catch, kick, jump, throw) against skill criteria	Machine Learning (specific algorithm not named in paper; likely classification model)	Skip and sidestep: 100%; Kick and catch: ~95%; Throw: 80.5%; Jump and hop: ~80%	A 4-sensor IMU system is feasible for school use but lacks accuracy for some TGMD-3 criteria (eg, arm position, object interaction). Machine learning can automate scoring but may misclassify some skill components.	Enables automated, objective assessment of motor competence in schools, reducing teacher burden. Suitable for screening and group-level analysis, though not yet comprehensive for all skill criteria.
Muñoz-Organero et al⁹	N: 22 children (11 ADHD, 11 controls) Age: 6-15 years Country: UK Subgroups: 5 ADHD non-medicated, 6 ADHD medicated	Runscribe™ inertial sensors (Tri-axial accelerometer)	Hz: 10 Hz Epoch: 5-second sliding windows (50% overlap) for image creation	Setting: School hours (9:00 to 15:00) in a “free-living” environment	- Raw Data: Tri-axial acceleration signals. - Processed Data: 2D acceleration images (28x28 pixels) generated by projecting movement-related linear acceleration onto a geo-referenced coordinate system (vertical vs horizontal components).	To automatically classify children as having ADHD (non-medicated) or being a typically developing control based on movement patterns.	Convolutional Neural Network (CNN)	Leave-one-out validation: - Wrist: 87.5% Accuracy - Ankle: 93.75% Accuracy fourfold cross-validation: - Wrist and Ankle: 93.75% Accuracy	A CNN applied to geo-referenced acceleration images can effectively distinguish between non-medicated children with ADHD and controls in a free-living school environment. Medication alters movement patterns, making them more similar to controls. The ankle sensor provided slightly better performance.	The method can serve as a complementary, objective tool to aid in the diagnosis of ADHD and for non-intrusive monitoring during medication optimization.
Li et al¹⁰	N: 34 children Age: 3-5 years (3.97 ± 0.49) Country: USA Setting: Head Start program	ActiGraph model WGT3X-BT	Hz: 30 Epoch: 15-second converted to 60-second	Free-living	Vector Magnitude (VM) counts per minute (cpm). Reference: Hip-worn accelerometer cut points (Butte et al.) for sedentary, LPA, MPA, VPA.	To establish and validate cut-points for wrist-worn accelerometer data to classify physical activity intensity (Sedentary, LPA, MPA, VPA) in preschoolers.	ROC analysis Ordinal Logistic Regression (OLR) K-means cluster analysis	K-means performance (vs hip reference): • Overall correct classification: ~70% • Sedentary: Sensitivity 71.64%, Specificity 85.07% • LPA: Sensitivity 50.90%, Specificity 76.81% • MVPA: Sensitivity 68.37%, Specificity 86.58% • Kappa (overall): 0.40	K-means cluster analysis demonstrated the best performance among the 3 ML models for establishing wrist-worn accelerometer cut points. It showed acceptable agreement with the hip-reference criterion for predicting sedentary behavior, LPA and VPA. None of the wrist methods accurately assessed MPA.	This study demonstrates the potential of unsupervised ML (k-means) to calibrate wrist-worn accelerometers for assessing physical activity intensity in preschoolers, offering a method that may improve compliance over hip-worn devices while providing reasonable estimates for most activity levels except MPA.
Joensuu et al¹¹	N: 633 adolescents (50% girls) Age: 12.4 ± 1.3 years Country: Finland Setting: 9 public schools	ActiGraph GT3X+ and wGT3X+	Hz: 30 Epoch: 15 s	Non-freeliving setting	48 baseline variables: • Demographics: Age, sex. • Anthropometrics: Height, weight, BMI, waist circumference, body fat % (BIA). • Physical Fitness: 20MSRT, push-up, curl-up, 5-leaps test, throwing-catching test, flexibility. • Physical Activity: Accelerometry (sedentary, LPA, MVPA), self-reported PA, sport club participation. • Psychosocial: Life enjoyment, perceived fitness, social status, parental support. • Academic: Grade point average (GPA), PE grade. • Other: Pubertal status (Tanner), family social status.	To predict unfavorable future 20-m shuttle run test (20MSRT) status (lowest tertile after 2 years) and development using a holistic profile.	Random Forest (RF)	Task 1 (Future Status): • AUC: 0.83 (Girls), 0.76 (Boys) • Sensitivity: 80% (G), 60% (B) • Specificity: 78% (G), 79% (B) Task 2 (Development): • AUC: 0.68 (Girls), 0.40 (Boys)	The RF classifier successfully identified adolescents with unfavorable future cardiorespiratory fitness based on a holistic profile of 14-20 baseline characteristics. Baseline 20MSRT was the strongest predictor, but other factors like other fitness tests, adiposity, physical activity, academic scores and psychosocial factors added significant predictive value. Predicting future development was less accurate than predicting status.	Provides a method for large-scale fitness monitoring systems (eg, in schools) to identify adolescents at risk of low future cardiorespiratory fitness for targeted, early interventions. The holistic approach suggests interventions should consider physical, psychological and social factors, not just fitness scores. The attached MATLAB script facilitates use in future precision exercise medicine research.
Data extracted during school hours and after school (24 h)
Carlson et al³¹	N: 278 children Age: 8-11 years Country: Australia Setting: 9 primary schools	ActiGraph GT3X+	Hz: 30 (raw), processed at 10 Hz Epoch: 10 s	Free-living	Raw triaxial acceleration data. Ground truth: activPAL thigh-worn monitor (posture: sitting/lying vs standing/stepping). Comparison: Standard 100 counts per minute (cpm) cut-point method.	To classify each 10-second epoch as sitting vs non-sitting and detect sit-to-stand transitions from hip-worn accelerometer data, equivalent to thigh-worn activPAL output.	Convolutional Neural Network (CNN) + Bidirectional Long Short-Term Memory (LSTM) network (CHAP-child method)	Epoch-level vs activPAL: • Balanced Accuracy: 87.6% • Sensitivity (sitting): 93.6% • Specificity (non-sitting): 81.6% Sit-to-stand transitions (1-min window): • Sensitivity: 71.1%	The CHAP-child deep learning method showed strong concurrent validity for deriving posture-based sedentary measures from a hip-worn ActiGraph in children. It significantly outperformed the traditional 100 cpm cut-point method, which demonstrated poor validity for assessing sedentary patterns (eg, bouts, transitions).	Provides an open-source method to derive accurate, activPAL-equivalent measures of sedentary volume, sit-to-stand transitions and sedentary bout patterns from existing and future datasets using widely deployed hip-worn ActiGraph monitors. This can refine research on the health impacts of children’s sedentary behavior without requiring thigh-worn monitors.
Christian et al³²	N: 1,167 children (601 boys, 566 girls) from the PLAYCE cohort study. Age: 2 to 7 years (Wave 1: 2-5 years, Wave 2: 5-7 years). Country: Australia.	ActiGraph GT3X+ (ActiGraph Corporation, Pensacola, FL, USA). Placement: Right hip.	Hz: 30 Hz (raw data). Epoch: Processed using 15-second non-overlapping windows for the ML model. Analysis outcomes (eg, daily minutes) are derived from these classifications.	Free-living conditions (preschool and school settings).	Input to ML Model: Raw tri-axial accelerometer signal transformed into a vector magnitude; 25 time and frequency domain features. Predicted Activity Classes: Sedentary (SED), light-intensity activities and games (L_ACT_G), walking (WALK), running (RUN), moderate-to-vigorous activities and games (MV_ACT_G). Derived Outcomes for Analysis: Energetic play (MVPA = sum of MV_ACT_G, WALK, RUN), total physical activity (sum of LPA and MVPA), sedentary time. Meeting age-specific physical activity guidelines.	To classify free-living movement behaviors into activity intensity/type categories as a superior alternative to traditional cut-point methods, in order to analyze developmental trends.	Random Forest (500 trees). A pre-validated model was used, not developed anew in this study.	The cited validation study for the RF model reported an average F-Score of 86% across all activity classes. Specific class accuracy: SED (85.3%), L_ACT_G (92.3%), MV_ACT_G (72.0%), WALK (80.2%), RUN (85.1%).	Using a validated ML classifier revealed developmental trends where total physical activity peaks at age 5 then declines, while energetic play increases linearly but remains below guidelines for most children aged 3-7. This ML approach overcomes limitations of cut-point methods, providing more accurate population-level estimates of guideline adherence.	Demonstrates the utility of using a standardized, validated ML model (Random Forest) for processing accelerometer data in large-scale longitudinal studies. This provides a more accurate method for monitoring population-level adherence to movement guidelines over time and assessing the impact of developmental transitions, which can better inform public health intervention strategies.
Letts et al³³	N: 497 preschool children (4-5 years) Country: Canada Cohort: CATCH study	ActiGraph GT3X and GT3X-BT	Hz: 30 Epoch: 15-second non-overlapping windows	Free-living setting	Raw tri-axial accelerometer signal transformed into vector magnitude; 25 time and frequency domain features.	To classify accelerometer data into one of 5 physical activity classes: sedentary (SED), light activities and games (L_ACT_G), moderate-to-vigorous activities and games (MV_ACT_G), walking (WALK), running (RUN).	Random Forest (Ahmadi hip random forest model for preschool children)	Overall accuracy: >80% (Free-living evaluation vs direct observation). Class-specific accuracy: SED (82.9%), L_ACT_G (90.7%), MV_ACT_G (69.9%), WALK (78.4%), RUN (83.6%). Average F-score: 84%.	No differences in SED, LPA or MVPA time were found between typically developing children, those at risk for DCD (DCDr), and those with probable DCD (pDCD). However, children with motor impairments (DCDr/pDCD) spent significantly less time in walking and running activities. Machine learning methods can reveal differences in how physical activity is accumulated, even when overall intensity levels are similar.	The validated Random Forest model can be used to provide a more detailed analysis of physical activity types in preschool children, moving beyond intensity-based cut-points. This allows researchers and clinicians to identify specific activity deficits (eg, reduced walking/running) in children with motor difficulties like DCD, which could be targeted in early intervention programs to prevent the more pronounced physical activity gaps observed later in childhood.
Kwon et al¹³	N: 301 U.S. children (149 girls, 152 boys) Age: 3-5 years Country: USA Setting: National survey (NNYFS)	ActiGraph GT3X+	Hz: 80 Epoch: 15-s non-overlapping windows	Free-living setting	Tri-axial accelerometer signals transformed into vector magnitude. Time and frequency domain (base) features + temporal features.	To classify physical activity types and estimate daily time spent in Moderate-Vigorous Physical Activity (MVPA) and Total Physical Activity.	Random Forest (RF)	Model Performance: Weighted average F-score = 81%.	A machine learning RF classifier applied to free-living wrist accelerometer data provides a more accurate estimation of PA levels in preschoolers compared to traditional cut-point methods. The study found that U.S. preschoolers, on average, do not meet WHO MVPA recommendations (28 min/day vs 60 min/day recommended) but exceed total PA recommendations. MVPA was positively associated with gross motor skills.	The validated RF algorithm can be used in large-scale surveys and research to accurately monitor compliance with physical activity guidelines in preschool populations. It helps identify that interventions should specifically target increasing MVPA, not just total activity volume, to support gross motor development.
Clark et al³⁴	N: 125 children (80 boys) Age: 4.3 ± 0.5 years Country: UK	ActiGraph GT3X+	Hz: 100 Epoch: 1 s	Free-living, 24 h activity	Motor competence (fine, gross, overall), sedentary time, light PA, MVPA, zBMI, waist circumference.	To identify profiles of relative motor competence and physical activity compositions.	Self-Organized Map (SOM), k-means clustering	Non-specified (Profiling, not classification accuracy)	The SOM analysis identified 5 distinct movement behavior profiles. A key finding was that while differences in movement behaviors are already evident, resultant changes in adiposity are not clear.	The profiling approach can shift the focus from basic obesity monitoring to assessing “moving well.” It allows for the identification of children with low motor competence for early, nuanced interventions.
Data extracted during school travel
Mendoza et al³⁵	N: 54 children (35 girls, 19 boys) Age: 9.9 ± 0.7 years Country: USA Setting: 4 public schools serving low-income families in Seattle, WA	ActiGraph GT3X+ accelerometer and Qstarz BT-1000 XT GPS Logger	Hz: 30 (Accelerometer) Epoch: 15-second (processed)	Free-living	Accelerometer Features: 41 features (eg, counts, signal characteristics). GPS Features: 10 features (eg, location, time, velocity). Combined: Used to classify activity types, with a focus on identifying cycling.	To identify and classify cycling activity from other common physical activities (walking, running, motor vehicle, stationary) in order to accurately measure Moderate-to-Vigorous Physical Activity (MVPA) in a free-living intervention context.	Random Forest classifier with Hidden Markov Model smoothing	Accuracy: 99.9% (balanced accuracy from leave-one-out cross-validation on the protocol-based validation sample).	A machine learning algorithm using combined accelerometer and GPS data can accurately identify cycling activity in children. This method was successfully applied in an intervention trial, revealing that the bicycle train program significantly increased participants’ cycling to school and overall daily MVPA.	The validated algorithm allows for the objective measurement of cycling, an activity traditionally difficult to capture with accelerometry alone. This enables more accurate assessment of the impact of active transport interventions (like bicycle trains) on total daily physical activity in real-world settings.

Abbreviations: 20MSRT, 20 m shuttle run test; ADHD, attention deficit and hyperactivity disorder; BMI, body mass index; IMU, inertial measurement unit; LPA, light PA; ML, machine learning; MPA, moderate PA; MVPA, moderate-to-vigorous physical activity; PA, physical activity; SED, sedentary; SVM, support vector machine; TGMD-3, Test of Gross Motor Development 3; VPA, vigorous PA; zBMI, body-mass index expressed as a z-value.

Discussion

Traditional methods for physical activity assessment in educational settings are time-intensive, generalize poorly and produce substantial misclassification errors in children.^36-38 Machine learning algorithms integrated with inertial measurement units offer automated and objective monitoring solutions. However, previous systematic reviews have examined only early childhood²⁵ or specific clinical populations,³⁹ leaving a gap in understanding machine learning applications across the full educational spectrum from preschool to secondary education. Therefore, this systematic review examined machine learning applications with inertial measurement units for assessing physical activity during school hours across all educational levels.

Thirteen studies demonstrated accuracies of 70% to 99% across diverse applications including motor competence assessment, activity classification, sedentary behavior detection and clinical screening. Random Forest emerged as the predominant algorithm in 7 studies, while Convolutional Neural Networks achieved 87.6% balanced accuracy for sedentary behavior detection and 87.5% to 93.75% accuracy in differentiating children with ADHD from controls. Machine learning approaches offered substantial advantages over traditional methods, including marked time efficiency gains (assessment time reduced from 15 to 2 min per child) and detection of subtle movement pattern differences not captured by intensity-based classifications (eg, reduced walking/running time in children with developmental coordination disorder despite comparable overall activity levels). However, methodological quality varied (11-17/18 MINORS points) with considerable heterogeneity in sampling frequencies, epoch lengths, sensor placements and validation protocols.

Motor Competence Assessment

Motor competence assessment via machine learning addresses the time-intensive nature of traditional evaluations, which require trained assessors and standardized protocols.⁴⁰ Brons et al¹⁵ demonstrated 76% accuracy predicting fine motor skills using sensor-augmented toys, reducing assessment time from 15 to 2 min per child. Similarly, Lander et al³⁰ achieved 80% to 100% accuracy across TGMD-3 skills using simplified 4-sensor IMU systems positioned on wrists and ankles, though limitations existed for detecting certain skill criteria such as arm positioning and object interactions. This accuracy-feasibility trade-off remains critical for school implementation, where comprehensive sensor arrays (eg, 17 IMUs) offer precision but lack practical scalability due to setup time and technical expertise requirements.³⁰ The integration of machine learning with consumer-grade wearables presents additional opportunities for large-scale implementation. Sulla-Torres et al¹⁶ achieved 95% accuracy in males and 89% in females for motor competence classification using smart bands combined with Gradient Boosting algorithms. Similar consumer devices have demonstrated acceptable accuracy for step counting and distance measurement in adult populations.⁴¹

The instrumentation of standardized motor competence tests represents a growing research area,^25,42 in which IMU-based systems have the potential to provide objective assessments while reducing assessor burden. Traditional assessments such as the Movement Assessment Battery for Children (MABC-2) require extensive training,⁴³ creating barriers to widespread implementation. However, most studies employed structured protocols that may not fully represent the complexity of naturalistic classroom and playground activities.^30,34 Future development should prioritize algorithms that maintain acceptable accuracy with minimal sensor configurations while capturing ecologically valid movement patterns characteristic of children’s spontaneous play and structured physical education activities.^16,25

Activity Classification and Intensity Prediction

Activity type classification revealed advantages over traditional cut-point methods, particularly for detecting subtle movement patterns obscured by intensity-based approaches.^13,32,33 Christian et al³² and Letts et al³³ employed validated Random Forest models achieving F-scores exceeding 80% for classifying sedentary, light and moderate-to-vigorous activities in preschoolers. Critically, Letts et al³³ demonstrated that children with developmental coordination disorder showed comparable overall activity intensity but significantly reduced walking and running time. These findings highlight machine learning’s capacity to reveal qualitative differences in movement patterns beyond quantitative intensity metrics.⁴⁴ Similarly, Kwon et al¹³ demonstrated that Random Forest classification with wrist-worn accelerometers provided more accurate estimation of physical activity levels in preschoolers, revealing that U.S. preschoolers averaged only 28 min per day of MVPA versus the recommended 60 min.⁴⁵
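To make the feature-based pipeline described above more concrete, the following minimal sketch converts a raw tri-axial signal into vector magnitude and summarizes it over non-overlapping 15-second windows. The function names, the three features and the synthetic 30 Hz signal are assumptions chosen for illustration; the included studies used larger feature sets (for example, 25 time and frequency domain features) and fed them to a trained Random Forest, which is not reproduced here.

```python
import math
import statistics

def vector_magnitude(x, y, z):
    """Combine the three axes into a single orientation-independent signal."""
    return [math.sqrt(a * a + b * b + c * c) for a, b, c in zip(x, y, z)]

def window_features(signal, sample_rate_hz, window_s=15):
    """Summarize a signal over non-overlapping windows (illustrative features only)."""
    size = sample_rate_hz * window_s
    features = []
    for start in range(0, len(signal) - size + 1, size):
        window = signal[start:start + size]
        features.append({
            "mean": statistics.fmean(window),
            "sd": statistics.pstdev(window),
            "range": max(window) - min(window),
        })
    return features

# Synthetic 30 Hz example: 15 s of stillness followed by 15 s of rhythmic movement.
rate = 30
n = rate * 15
still = [(0.0, 0.0, 1.0)] * n
moving = [(0.5 * math.sin(2 * math.pi * 2 * i / rate), 0.0, 1.0) for i in range(n)]
x, y, z = zip(*(still + moving))
feats = window_features(vector_magnitude(x, y, z), rate)
```

In this toy signal the second window has a larger standard deviation and range than the first, which is the kind of contrast a classifier can use to separate movement from stillness even when mean intensity is similar.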

Despite these advances, Li et al¹⁰ reported only 70% overall accuracy using k-means clustering for wrist-worn accelerometer calibration in preschoolers, with challenges in moderate-to-vigorous activity classification. This reflects persistent difficulties with children’s naturally intermittent movement patterns. Also, the variability in cut-point estimates across different processing methodologies creates substantial challenges for cross-study comparisons and population-level surveillance.^38,46 Machine learning approaches offer potential solutions to these methodological challenges by learning activity-specific features directly from data rather than relying on fixed intensity thresholds.^13,32,33 Notably, Mendoza et al³⁵ achieved 99.9% accuracy in identifying cycling activity through combined accelerometer and GPS data, enabling precise measurement of active transport interventions.⁴⁷
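The performance figures quoted in this section mix overall accuracy, balanced accuracy and weighted F-scores, which are not interchangeable when activity classes are imbalanced. The sketch below computes all three from a confusion matrix. The function name and the counts are hypothetical and are not taken from any included study.

```python
def classification_metrics(confusion):
    """confusion[i][j] = number of windows with true class i predicted as class j."""
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(k))
    recalls, f_scores, supports = [], [], []
    for i in range(k):
        tp = confusion[i][i]
        support = sum(confusion[i])                         # true members of class i
        predicted = sum(confusion[r][i] for r in range(k))  # windows labeled as class i
        recall = tp / support if support else 0.0
        precision = tp / predicted if predicted else 0.0
        denom = precision + recall
        f_scores.append(2 * precision * recall / denom if denom else 0.0)
        recalls.append(recall)
        supports.append(support)
    return {
        "accuracy": correct / total,
        "balanced_accuracy": sum(recalls) / k,
        "weighted_f": sum(f * s for f, s in zip(f_scores, supports)) / total,
    }

# Hypothetical counts: a common sedentary class and a rare vigorous class.
confusion = [
    [90, 10],  # true sedentary
    [5, 5],    # true vigorous
]
metrics = classification_metrics(confusion)
```

With these made-up counts, overall accuracy is about 86% while balanced accuracy is 70%, because the rare class is recognized only half of the time. Comparisons across studies are therefore more informative when the reported metric is stated explicitly.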

Predictive and Clinical Applications

Beyond classification tasks, predictive applications demonstrated potential for early identification of children at risk for adverse developmental outcomes. Joensuu et al¹¹ predicted unfavorable future cardiorespiratory fitness in adolescents using Random Forest incorporating physical fitness, motor competence, adiposity, physical activity patterns, academic performance and psychosocial variables (AUC: 0.83 girls, 0.76 boys). While baseline cardiorespiratory fitness emerged as the strongest single predictor, the inclusion of multiple domains significantly enhanced predictive accuracy, supporting machine learning’s capacity to synthesize complex, multivariate data for risk stratification.¹¹ However, predicting future development proved substantially less accurate than classifying current status (AUC: 0.68 girls, 0.40 boys),¹¹ suggesting that longitudinal changes in physical fitness involve complex, potentially non-linear developmental processes that current machine learning approaches capture imperfectly.⁴⁸
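When reading the AUC values above, it helps to recall that an AUC of 0.5 corresponds to chance-level ranking, so a value of 0.40 indicates ranking slightly worse than chance. The following sketch computes AUC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The function name, labels and scores are hypothetical illustration values.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count as half)."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical risk scores; label 1 = unfavorable outcome.
labels = [1, 1, 1, 0, 0, 0]
informative = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
uninformative = [0.5] * 6
inverted = [1 - s for s in informative]

auc_informative = auc(labels, informative)
auc_uninformative = auc(labels, uninformative)
auc_inverted = auc(labels, inverted)
```

The pairwise form is quadratic in sample size and is meant only to make the definition visible; rank-based implementations are preferable for large datasets.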

Conversely, Froud et al²⁹ found that traditional linear regression outperformed machine learning methods (Random Forest, Support Vector Machines, k-Nearest Neighbors, Neural Networks) when predicting academic performance and quality of life from physical activity data, with machine learning models explaining virtually no variance in validation datasets (R² = 0%) compared to 22% to 24% for linear regression. This negative finding provides crucial evidence that machine learning does not universally surpass traditional approaches, particularly when relationships are approximately linear, sample sizes are modest, and missing data are prevalent.⁴⁹ Clinical screening applications were also innovative. Muñoz-Organero et al⁹ differentiated children with ADHD from controls through movement pattern analysis using Convolutional Neural Networks applied to acceleration images, achieving 87.5% accuracy with wrist sensors and 93.75% with ankle sensors. Importantly, medication altered movement patterns toward control-like profiles, suggesting potential applications in objective treatment monitoring beyond traditional behavioral rating scales.⁹ Clark et al³⁴ further demonstrated the utility of unsupervised learning approaches, employing Self-Organized Maps and k-means clustering to identify 5 distinct movement behavior profiles in preschoolers, highlighting the potential for profiling approaches to shift focus from basic obesity monitoring to comprehensive assessment of “moving well.”
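The comparison between linear regression and machine learning reported by Froud et al rests on variance explained in validation data. As a brief illustration of what R² = 0% means, the sketch below computes R² for two sets of predictions. The function name and all values are hypothetical and do not reproduce the original analysis.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_total = sum((o - mean_obs) ** 2 for o in observed)
    ss_residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return 1.0 - ss_residual / ss_total

# Hypothetical validation outcomes and two sets of predictions.
observed = [2.0, 4.0, 6.0, 8.0]
constant_guess = [5.0, 5.0, 5.0, 5.0]  # always predicts the observed mean
close_guess = [2.5, 3.5, 6.5, 7.5]     # tracks the outcome

r2_constant = r_squared(observed, constant_guess)
r2_close = r_squared(observed, close_guess)
```

A model that explains no variance in validation data is therefore no more useful than predicting the mean for every child, regardless of how well it fitted the training data.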

Methodological, Ethical Considerations and Limitations

ActiGraph accelerometers (predominantly GT3X+) dominated across studies,^{10,11,13,29,31-35} reflecting their established validity in pediatric research. However, considerable heterogeneity existed in sampling frequencies (10-110 Hz), epoch lengths (1-60 s) and wear locations (hip, wrist, ankle). Epoch length involves a critical trade-off: shorter windows (1-5 s) provide better temporal resolution for detecting brief activity bouts, while longer epochs (30-60 s) offer more stable classifications but risk missing short bursts of activity that contribute to daily energy expenditure.⁴⁰ This methodological variability, combined with the proliferation of device-specific and population-specific cut-points, creates substantial challenges for synthesizing evidence across studies.^10,32,38 The absence of standardized protocols for school-based sensing represents a significant barrier to clinical translation and cross-study comparison. Future implementation would benefit from consensus guidelines addressing: (a) optimal sensor placement considering both measurement validity and student comfort; (b) minimum sampling frequencies and epoch lengths for different assessment objectives; (c) standardized calibration procedures across device types; (d) data processing pipelines and feature extraction methods; and (e) minimum training dataset requirements for algorithm development. Such standardization would facilitate multi-site validation studies, enable direct comparison of algorithmic performance and support the development of generalizable models deployable across diverse educational contexts.^25,32
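The epoch length trade-off discussed above can be illustrated with a small numerical sketch. It classifies the same one-minute signal using 1-second and 60-second epochs and counts the seconds credited as active. The function names, signal values and threshold are arbitrary illustration values and are not validated cut-points.

```python
def epoch_means(signal, epoch_s):
    """Average a 1 Hz intensity signal over non-overlapping epochs of epoch_s seconds."""
    return [
        sum(signal[i:i + epoch_s]) / epoch_s
        for i in range(0, len(signal) - epoch_s + 1, epoch_s)
    ]

def active_seconds(signal, epoch_s, threshold):
    """Seconds credited as active when each epoch is classified by its mean."""
    return sum(epoch_s for m in epoch_means(signal, epoch_s) if m >= threshold)

# Hypothetical minute of play: a 6-second sprint inside otherwise quiet activity.
minute = [0.1] * 27 + [1.0] * 6 + [0.1] * 27
THRESHOLD = 0.5

short_epoch_result = active_seconds(minute, 1, THRESHOLD)
long_epoch_result = active_seconds(minute, 60, THRESHOLD)
```

With 1-second epochs the sprint is credited as 6 active seconds, whereas a single 60-second epoch averages it away and credits none. This is one reason why results obtained with different epoch lengths are difficult to compare directly.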

Consumer-grade wearables such as smart bands offer promising alternatives for large-scale implementation, with Sulla-Torres et al¹⁶ demonstrating high classification accuracy using the Huawei Band 7, though previous validation studies have shown variable performance of consumer devices depending on the specific metrics and populations assessed.⁵⁰ Deep learning approaches, particularly Convolutional Neural Networks, demonstrated the capacity to automatically learn relevant features from raw acceleration signals, potentially reducing researcher bias. They also improved classification accuracy for complex behaviors such as sedentary patterns and postural transitions.^9,31 However, these advantages must be balanced against increased computational requirements, larger training dataset needs and substantially reduced model interpretability.⁵¹

Finally, several limitations characterize the current evidence base, including: (a) small, homogeneous samples limiting generalizability^9,15,30; (b) predominance of cross-sectional designs precluding assessment of algorithm stability across child development^{9-11,13,15,16,29,31-35}; (c) incomplete free-living validation, with many studies relying on structured protocols³⁰; and (d) limited attention to model interpretability and identification of algorithmic biases.^25,31 In addition, the exclusion of conference papers may have limited coverage of recent algorithmic innovations, because cutting-edge algorithms are frequently published in conference proceedings. Ethically, continuous sensor-based monitoring in schools raises concerns regarding informed consent, surveillance bias (children altering natural behavior when monitored), potential stigmatization from algorithmic classifications and equity if access concentrates in well-resourced schools. Data governance must address ownership, retention and protection against unauthorized access, particularly regarding commercial interests and algorithmic bias in homogeneous training samples.^25,31,44,52

Practical Implementation and Future Directions

Practical implementation faces technical expertise requirements for sensor deployment, data management and algorithm implementation. Sulla-Torres et al¹⁶ demonstrated feasibility through user-friendly mobile applications enabling educators to input student data and automatically classify motor competence. Time efficiency represents a key advantage: Brons et al¹⁵ reduced assessment time from 15 to 2 min per child through automated scoring, enabling population-level screening previously prohibitive in resource-limited educational settings. Consumer-grade wearables such as smart bands offer cost advantages and improved user acceptance compared to research-grade accelerometers, though the latter provide superior data quality and have undergone more extensive validation procedures.^40,50 The optimal balance between cost, accuracy and feasibility likely varies across educational contexts, assessment objectives and available resources, requiring careful consideration of specific implementation goals and constraints.^25,30

For clinical practice, school nurses and primary care physicians could use brief sensor-based screenings to identify children requiring comprehensive evaluations and leverage qualitative movement pattern detection for earlier identification of developmental coordination disorder.^15,33 High accuracy in differentiating ADHD movement patterns suggests potential for objective treatment monitoring.⁹ Integration with consumer wearables could facilitate longitudinal monitoring between well-child visits, while predictive models enable proactive risk stratification.¹¹ However, clinicians should recognize these as screening rather than diagnostic tools.⁴³

Future research priorities include: (a) external validation across diverse populations and cultural contexts to assess generalizability²⁵; (b) development of standardized data collection and processing protocols to facilitate cross-study comparisons and model sharing³²; (c) longitudinal designs tracking algorithm performance stability as children develop^11,34; (d) investigation of hybrid approaches combining traditional methods’ interpretability with deep learning’s performance^29,31; and (e) integration of multiple sensor modalities (accelerometry, GPS and heart rate) through sensor fusion techniques.³⁵ Establishing data governance guidelines to ensure machine learning benefits children’s health rather than enabling surveillance represents a critical ethical responsibility.

Conclusions

This systematic review demonstrates that machine learning algorithms integrated with inertial measurement units successfully assess physical activity during school hours across preschool to secondary education. Thirteen studies were identified, demonstrating moderate to high methodological quality (11-17/18 MINORS points). Machine learning algorithms (predominantly Random Forest, Support Vector Machines, Gradient Boosting and Convolutional Neural Networks) achieved accuracies ranging from 70% to 99% across diverse applications. These applications included motor competence classification, physical activity intensity prediction, sedentary behavior detection and clinical screening for conditions such as ADHD. Machine learning approaches offered substantial advantages over traditional assessment methods, including time efficiency and capacity to detect subtle movement pattern differences. However, considerable methodological heterogeneity was observed across studies regarding sampling frequencies (10-110 Hz), epoch lengths (1-60 s), sensor placements and validation protocols. Overall, the evidence indicates strong potential for wearable sensor data combined with machine learning to objectively monitor and assess school-related physical activity and motor development.

Footnotes

ORCID iDs

Markel Rico-González

Eivind Holsbrekken

Carlos D. Gómez-Carmona

Luca Paolo Ardigò

Funding

The authors received no financial support for the research, authorship, and/or publication of this article.

Declaration of Conflicting Interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

References

1. Bull, Al-Ansari, Biddle, et al. World Health Organization 2020 guidelines on physical activity and sedentary behaviour. Br J Sports Med. 2020;54(24):1451-1462. doi:10.1136/bjsports-2020-102955

2. Bourke, Haddara, Loh, Carson, Breau, Tucker. Adherence to the World Health Organization’s physical activity recommendation in preschool-aged children: a systematic review and meta-analysis of accelerometer studies. Int J Behav Nutr Phys Act. 2023;20(1):52. doi:10.1186/s12966-023-01450-0

3. Guthold, Stevens, Riley, Bull FC. Global trends in insufficient physical activity among adolescents: a pooled analysis of 298 population-based surveys with 1·6 million participants. Lancet Child Adolesc Health. 2020;4(1):23-35. doi:10.1016/S2352-4642(19)30323-2

4. Strain, Flaxman, Guthold, et al. National, regional, and global trends in insufficient physical activity among adults from 2000 to 2022: a pooled analysis of 507 population-based surveys with 5·7 million participants. Lancet Glob Health. 2024;12(8):e1232-e1243. doi:10.1016/S2214-109X(24)00150-5

5. Martinez-Merino, Rico-González. Effects of physical education on preschool children’s physical activity levels and motor, cognitive, and social competences: a systematic review. J Teach Phys Educ. 2024;43:696-706.

6. Crumbley, Ledoux, Johnston CA. Physical activity during early childhood: the importance of parental modeling. Am J Lifestyle Med. 2020;14(1):32-35. doi:10.1177/1559827619880513

7. Baldini, Gnazzo, Maragno, et al. Suicidal risk among adolescent psychiatric inpatients: the role of insomnia, depression, and social-personal factors. Eur Psychiatry. 2025;68(1):e42. doi:10.1192/j.eurpsy.2025.29

8. Loprinzi, Cardinal BJ. Measuring children’s physical activity and sedentary behaviors. J Exerc Sci Fit. 2011;9(1):15-23. doi:10.1016/S1728-869X(11)60002-6

9. Muñoz-Organero, Powell, Heller, Harpin, Parker. Automatic extraction and detection of characteristic movement patterns in children with ADHD based on a Convolutional Neural Network (CNN) and acceleration images. Sensors. 2018;18(11):3924. doi:10.3390/s18113924

10. Li, Howard, Sosa, Cordova, Parra-Medina, Yin. Calibrating wrist-worn accelerometers for physical activity assessment in preschoolers: machine learning approaches. JMIR Form Res. 2020;4(8):e16727. doi:10.2196/16727

11. Joensuu, Rautiainen, Äyrämö, et al. Precision exercise medicine: predicting unfavourable status and development in the 20-m shuttle run test performance in adolescence with machine learning. BMJ Open Sport Exerc Med. 2021;7(2):e001053. doi:10.1136/bmjsem-2021-001053

12. Prieto-Botella, Valera-Gran, Santa-Marina, et al. Validation of a parent-reported physical activity questionnaire by accelerometry in European children aged from 6 to 12 years old. Int J Environ Res Public Health. 2022;19(15):9178. doi:10.3390/ijerph19159178

13. Kwon, O’Brien, Welch, Honegger. Physical activity among U.S. preschool-aged children: application of machine learning physical activity classification to the 2012 national health and nutrition examination survey national youth fitness survey. Children. 2022;9(10):1433. doi:10.3390/children9101433

14. Group CCCR. Data Extraction Template for Included Studies. Group CCCR.

15. Brons, De Schipper, Mironcika, et al. Assessing children’s fine motor skills with sensor-augmented toys: machine learning approach. J Med Internet Res. 2021;23(4):e24237. doi:10.2196/24237

16. Sulla-Torres, Calla Gamboa, Avendaño Llanque, Angulo Osorio, Zúñiga Carnero. Classification of motor competence in schoolchildren using wearable technology and machine learning with hyperparameter optimization. Appl Sci. 2024;14(2):707. doi:10.3390/app14020707

17. Poitras, Gray, Borghese, et al. Systematic review of the relationships between objectively measured physical activity and health indicators in school-aged children and youth. Appl Physiol Nutr Metab. 2016;41(6(3)):S197-S239. doi:10.1139/apnm-2015-0663

18. Sousa, Ferrinho, Travassos BF. The use of wearable technologies in the assessment of physical activity in preschool- and school-age youth: systematic review. Int J Environ Res Public Health. 2023;20(4):3402. doi:10.3390/ijerph20043402

19. Chong, Tjurin, Niemelä, Jämsä, Farrahi. Machine-learning models for activity class prediction: a comparative study of feature selection and classification algorithms. Gait Posture. 2021;89:45-53. doi:10.1016/j.gaitpost.2021.06.017

20. Lettink, Altenburg, Arts, Van Hees, Chinapaw MJM. Systematic review of accelerometer-based methods for 24-h physical behavior assessment in young children (0–5 years old). Int J Behav Nutr Phys Act. 2022;19(1):116. doi:10.1186/s12966-022-01296-y

21. Hendry, Rohl, Rasmussen, et al. Objective measurement of posture and movement in young children using wearable sensors and customised mathematical approaches: a systematic review. Sensors. 2023;23(24):9661. doi:10.3390/s23249661

22. Ahmadi, Pavey, Trost SG. Machine learning models for classifying physical activity in free-living preschool children. Sensors. 2020;20(16):4364. doi:10.3390/s20164364

23. Ahmadi, Trost SG. Device-based measurement of physical activity in pre-schoolers: comparison of machine learning and cut point methods. Bergman P, ed. PLoS ONE. 2022;17(4):e0266970. doi:10.1371/journal.pone.0266970

24. Thornton, Kolehmainen, Nazarpour. Using unsupervised machine learning to quantify physical activity from accelerometry in a diverse and rapidly changing population. McGinnis RS, ed. PLOS Digit Health. 2023;2(4):e0000220. doi:10.1371/journal.pdig.0000220

25. Rico-González, Gómez-Carmona CD. Machine learning applications for physical activity and behaviour in early childhood: a systematic review. Appl Sci. 2025;15(11):6296.

26. Petersen, Erickson, Kurowski, Boninger, Treble-Barna. Emerging methods for measuring physical activity using accelerometry in children and adolescents with neuromotor disorders: a narrative review. J NeuroEngineering Rehabil. 2024;21(1):31. doi:10.1186/s12984-024-01327-8

27. Rico-González, Pino-Ortega, Clemente, Los Arcos. Guidelines for performing systematic reviews in sports science. Biol Sport. 2022;39(2):463-471.

28. Page, McKenzie, Bossuyt, et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. BMJ. 2021;372:n71. doi:10.1136/bmj.n71

29. Froud, Hansen, Ruud, Foss, Ferguson, Fredriksen PM. Relative performance of machine learning and linear regression in predicting quality of life and academic performance of school children in Norway: data analysis of a quasi-experimental study. J Med Internet Res. 2021;23(7):e22021. doi:10.2196/22021

30. Lander, Nahavandi, Toomey, Barnett, Mohamed. Accuracy vs. practicality of inertial measurement unit sensors to evaluate motor competence in children. Front Sports Act Living. 2022;4:917340. doi:10.3389/fspor.2022.917340

31. Carlson, Ridgers, Nakandala, et al. CHAP-child: an open source method for estimating sit-to-stand transitions and sedentary bout patterns from hip accelerometers among children. Int J Behav Nutr Phys Act. 2022;19(1):109. doi:10.1186/s12966-022-01349-2

32. Christian, Adams, Moore, et al. Developmental trends in young children’s device-measured physical activity and sedentary behaviour. Int J Behav Nutr Phys Act. 2024;21(1):97. doi:10.1186/s12966-024-01645-z

33. Letts, King-Dowling, Kwan MYW, Obeid, Cairney, Trost SG. Machine learning derived physical activity in preschool children with developmental coordination disorder. Dev Med Child Neurol. 2025;67(7):910-917. doi:10.1111/dmcn.16186

34. Clark CCT, Duncan, Eyre ELJ, Stratton, García-Massó, Estevan. Profiling movement behaviours in pre-school children: a self-organised map approach. J Sports Sci. 2020;38(2):150-158. doi:10.1080/02640414.2019.1686942

35. Mendoza, Haaland, Jacobs, et al. Bicycle trains, cycling, and physical activity: a pilot cluster RCT. Am J Prev Med. 2017;53(4):481-489. doi:10.1016/j.amepre.2017.05.001

36. Cain, Sallis, Conway, Van Dyck, Calhoon. Using accelerometers in youth physical activity studies: a review of methods. J Phys Act Health. 2013;10(3):437-450. doi:10.1123/jpah.10.3.437

37. López-Pastor, Kirk, Lorente-Catalán, MacPhail, Macdonald. Alternative assessment in physical education: a review of international literature. Sport Educ Soc. 2013;18(1):57-76. doi:10.1080/13573322.2012.713860

38. Trost, Loprinzi, Moore, Pfeiffer KA. Comparison of accelerometer cut points for predicting activity intensity in youth. Med Sci Sports Exerc. 2011;43(7):1360-1368. doi:10.1249/MSS.0b013e318206476e

39. Kozan Cikirikci, Esin MN. The impact of machine learning on physical activity–related health outcomes: a systematic review and meta-analysis. Int Nurs Rev. 2025;72(2):e70019. doi:10.1111/inr.70019

40. Migueles, Cadenas-Sanchez, Ekelund, et al. Accelerometer data collection and processing criteria to assess physical activity and other outcomes: a systematic review and practical considerations. Sports Med Auckl NZ. 2017;47(9):1821-1845. doi:10.1007/s40279-017-0716-0

41. Pino-Ortega, Gómez-Carmona, Rico-González. Accuracy of Xiaomi Mi Band 2.0, 3.0 and 4.0 to measure step count and distance for physical activity and healthcare in adults over 65 years. Gait Posture. 2021;87:6-10. doi:10.1016/j.gaitpost.2021.04.015

42. Barnett, Lai, Veldman SLC, et al. Correlates of gross motor competence in children and adolescents: a systematic review and meta-analysis. Sports Med Auckl NZ. 2016;46(11):1663-1688. doi:10.1007/s40279-016-0495-z

43. Wagner, Kastner, Petermann, Bös. Factorial validity of the Movement Assessment Battery for Children-2 (age band 2). Res Dev Disabil. 2011;32(2):674-680. doi:10.1016/j.ridd.2010.11.016

44. Halilaj, Rajagopal, Fiterau, Hicks, Hastie, Delp SL. Machine learning in human movement biomechanics: best practices, common pitfalls, and new opportunities. J Biomech. 2018;81:1-11. doi:10.1016/j.jbiomech.2018.09.009

45. World Health Organization. Guidelines on Physical Activity, Sedentary Behaviour and Sleep for Children under 5 Years of Age. World Health Organization; 2019. Accessed October 22, 2025. http://www.ncbi.nlm.nih.gov/books/NBK541170/

46. Bammann, Thomson, Albrecht, Buchan, Easton. Generation and validation of ActiGraph GT3X+ accelerometer cut-points for assessing physical activity intensity in older adults. The OUTDOOR ACTIVE validation study. PLoS ONE. 2021;16(6):e0252615. doi:10.1371/journal.pone.0252615

47. Kerr, Duncan, Schipperijn. Using global positioning systems in health research: a practical approach to data collection and processing. Am J Prev Med. 2011;41(5):532-540. doi:10.1016/j.amepre.2011.07.017

48. Barnett, Verswijveren SJJM, Colvin, et al. Motor skill competence and moderate- and vigorous-intensity physical activity: a linear and non-linear cross-sectional analysis of eight pooled trials. Int J Behav Nutr Phys Act. 2024;21(1):14. doi:10.1186/s12966-023-01546-7

49. Rajula HSR, Verlato, Manchia, Antonucci, Fanos. Comparison of conventional statistical methods with machine learning in medicine: diagnosis, drug development, and treatment. Med Kaunas Lith. 2020;56(9):455. doi:10.3390/medicina56090455

50. Evenson, Goto, Furberg RD. Systematic review of the validity and reliability of consumer-wearable activity trackers. Int J Behav Nutr Phys Act. 2015;12:159. doi:10.1186/s12966-015-0314-1

51. Galán-Mercant, Ortiz, Herrera-Viedma, Tomás, Fernandes, Moral-Munoz JA. Assessing physical activity and functional fitness level using convolutional neural networks. Knowl-Based Syst. 2019;185:104939. doi:10.1016/j.knosys.2019.104939

52. Marín, Tur. Ethical issues in the use of technologies in education settings: a scoping review. Educ Knowl Soc EKS. 2024;25:e31301. doi:10.14201/eks.31301

Machine Learning Applications for In-School Physical Activity Data Using IMUs in Children and Adolescents: A Systematic Review for Health Promotion
