Abstract
Velocity-based training (VBT) has gained widespread adoption in resistance training for real-time assessment of barbell kinematics. Smartphone-based VBT applications have emerged as low-cost alternatives to gold-standard devices, offering portability, minimal setup, and accessible interfaces. Despite increasing adoption, there remains no synthesis of measurement performance. This systematic review assessed the reliability and validity of six commercially available smartphone-based VBT applications for measuring barbell velocity and displacement. Following PRISMA guidelines, systematic searches of PubMed, SCOPUS, and SPORTDiscus identified 194 articles, with 18 meeting inclusion criteria. Four applications (iLoad, Metric, My Jump Lab, WL Analysis) demonstrated acceptable validity (r ≥ 0.70; CV ≤ 10%; ES ≤ 0.60). Two applications (Metric, My Jump Lab) demonstrated acceptable reliability (ICC ≥0.900; CV ≤ 10%; ES ≤ 0.60), though performance fell below research-level thresholds (ICC: ≥0.997; CV: ≤3.5%). Measurement performance varied across applications, exercises, and loading conditions, with smartphone applications demonstrating lower validity and reliability than established VBT devices. Current smartphone-based VBT applications appear suitable for recreational and field-based applied settings, though not high-precision research contexts. Practitioners should evaluate measurement performance specific to their training context and account for inherent measurement error. Future research should assess continued application updates across different training contexts and hardware/software configurations.
Introduction
Velocity-based training (VBT) has been widely adopted by strength and conditioning practitioners as a method for prescribing resistance training intensities. 1 VBT measures barbell velocity in real time, enabling practitioners to guide load selection based on an athlete's maximal movement velocity.1–3 This dynamic approach accounts for daily fluctuations in neuromuscular performance, improving prescription precision whilst reducing the risk of under- or over-training. 4 Recent advances in field-based kinematic technologies have increased VBT accessibility; however, the measurement performance of these emerging tools remains unsynthesised, creating uncertainty for practitioners selecting devices.
Effective implementation of VBT requires valid and reliable measurement technologies. 5 Validity reflects a device's accuracy in measuring an intended variable against a ‘gold-standard’ criterion 6 ; more specifically, concurrent validity refers to the strength of agreement between two devices measuring the same variable simultaneously. Reliability, by contrast, indicates measurement consistency under unchanged conditions, either measured within the same device (intra-device) or between devices of the same model (inter-device). 7 Gold-standard VBT technologies such as three-dimensional motion capture (3DMoCap), linear position transducers (LPTs), and linear velocity transducers (LVTs) demonstrate excellent reliability and validity. 6 However, these technologies are prohibitively expensive and require technical expertise and laboratory infrastructure, limiting accessibility for many practitioners in applied settings.
Advances in field-based kinematic technologies have enabled smartphone-based VBT applications to emerge as accessible alternatives to traditional devices. Modern smartphones possess high-resolution cameras (≥240 fps on many devices) and image processing capabilities that enable markerless motion tracking. 8 Several commercially available applications claim to assess barbell kinematics with sufficient precision, typically at no cost or less than NZ$100 annually (Table 1). These applications offer practical advantages including portability, minimal setup requirements, and accessible interfaces for practitioners with limited technical expertise. Whilst their accessibility has driven widespread adoption, questions remain regarding the reliability and validity of these tools.
Current smartphone-based VBT applications. Costs reported annually in NZD.
Although individual validation studies have emerged,9–26 findings vary considerably. This heterogeneity creates practical challenges: practitioners cannot confidently determine which applications provide acceptable measurement performance for their training context, nor understand the technical limitations that influence accuracy. Without clear guidance, practitioners risk suboptimal implementation that could compromise training outcomes. A systematic review is therefore warranted to synthesise current evidence and provide practical recommendations for implementation across research and applied contexts.
This systematic review aimed to (1) assess the concurrent validity and reliability of commercially available smartphone-based VBT applications for measuring barbell velocity and displacement, and (2) provide practical implementation guidance and recommendations for future research and practice. The primary research question is: What is the concurrent validity and reliability of commercially available smartphone-based VBT applications for measuring barbell velocity and displacement, and what considerations guide their implementation in applied settings?
Methods
This systematic review was conducted in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. 27 Due to constraints related to the primary researcher's dissertation timeline, all screening, data extraction, and quality assessments were conducted by a single author (NP).
Search strategy
Literature searches were conducted across three electronic databases (PubMed, SCOPUS, and SPORTDiscus) from inception to May 2025, with all fields searched. Within each concept, terms were combined using the Boolean operator OR, and concepts were subsequently combined using AND. The complete search strategy is detailed in Table 2.
Complete search strategy for electronic databases.
Selection criteria
Studies were eligible for inclusion if they: (1) were original, peer-reviewed research published in English; and (2) assessed the validity and/or reliability of commercially available smartphone-based VBT applications for measuring barbell velocity and/or displacement. Studies were excluded if they: (1) lacked assessment against established criterion measures (i.e., LPT, LVT, or 3DMoCap); (2) provided no statistical reporting of validity and/or reliability outcomes; (3) assessed applications that were not commercially available; or (4) assessed applications that required supplementary hardware or software beyond standard tripods or mounts.
Selection process
Titles, abstracts, and full-text articles were screened against eligibility criteria. Reference lists of included studies were manually searched for additional eligible articles.
Data extraction
Following full-text retrieval, data pertaining to study characteristics, application details, criterion measures, exercise protocols, training intensities, velocity measures, and validity/reliability outcomes were extracted and collated in Microsoft Word (Microsoft Corporation, Redmond, WA, USA).
Quality assessment
Study quality was assessed via a modified version of the Downs and Black checklist 28 (supplementary materials). The original 27-item checklist was developed for randomised and non-randomised healthcare intervention studies, 28 but has since been used in systematic reviews within sport science.6,29 We selected 9 items (original items 1, 2, 3, 6, 7, 10, 16, 18, and 20) relevant to the assessment of reliability and validity studies. Items relating to intervention delivery, randomisation, blinding, and adverse events were excluded as they are not applicable to observational measurement studies. Scoring and interpretation followed the approach employed by Weakley and colleagues, 6 with each item scored as 0 (not reported/unable to determine) or 1 (clearly reported), totalling a maximum score of 9. Subsequent scores were classified as ‘good’ (≥ 7), ‘moderate’ (4–6), or ‘poor’ (≤ 3). 6
Formal reliability and validity appraisal
Applications were formally appraised against three criteria: one for validity (acceptable) and two for reliability (acceptable and research-level). These thresholds were established in alignment with previous research assessing velocity-based6,9,20,30–32 and other resistance training devices.33,34 Acceptable validity required: correlation coefficient (r ≥ 0.70), coefficient of variation (CV ≤ 10%), or trivial to small effect size (ES ≤ 0.60; based on a modified effect size scale 35 ). Acceptable reliability required: intraclass correlation coefficient (ICC ≥ 0.900),36,37 coefficient of variation (CV ≤ 10%), or trivial to small effect size (ES ≤ 0.60 35 ). However, because velocity trackers are used to assess neuromuscular performance, 38 guide load prescription, 4 and track strength adaptations, 3 higher reliability is essential for research contexts.9,20 Research-level reliability therefore required: ICC ≥ 0.997 and CV ≤ 3.5%. These criteria were included to contextualise measurement performance across use cases, not to imply that smartphone-based applications are designed to meet laboratory-grade standards.
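The appraisal thresholds above can be expressed as a short check. The following Python sketch is illustrative only: the function and dictionary names are hypothetical, each reported statistic is appraised separately rather than combined into a single verdict, and taking the absolute value of the effect size is an assumption made here so that the direction of bias does not affect the comparison.

```python
# Thresholds as stated in the text; names are hypothetical.
ACCEPTABLE_VALIDITY = {"r": (">=", 0.70), "cv": ("<=", 10.0), "es": ("<=", 0.60)}
ACCEPTABLE_RELIABILITY = {"icc": (">=", 0.900), "cv": ("<=", 10.0), "es": ("<=", 0.60)}
RESEARCH_RELIABILITY = {"icc": (">=", 0.997), "cv": ("<=", 3.5)}


def check(stats, thresholds):
    """Return {statistic: passed} for each statistic that was reported."""
    results = {}
    for name, (op, limit) in thresholds.items():
        value = stats.get(name)
        if value is None:
            continue  # statistic not reported, so it is not appraised
        # Assumption: effect sizes are compared by magnitude.
        magnitude = abs(value) if name == "es" else value
        results[name] = magnitude >= limit if op == ">=" else magnitude <= limit
    return results


def meets_research_level(stats):
    """Research-level reliability requires both ICC and CV to be reported and to pass."""
    results = check(stats, RESEARCH_RELIABILITY)
    return len(results) == len(RESEARCH_RELIABILITY) and all(results.values())


# Values reported for Metric v4.8.1 squat mean velocity (ICC = 0.994, CV = 3.8%).
squat = {"icc": 0.994, "cv": 3.8}
print(check(squat, ACCEPTABLE_RELIABILITY))
print(meets_research_level(squat))
```

With these values, both statistics satisfy the acceptable thresholds but not the research-level ones, which mirrors the distinction drawn between the two reliability tiers.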
Data synthesis
Extracted data were presented in tabular form and narratively synthesised. Studies that lacked compatible statistical reporting for validity (r, CV, or ES)12,13,15,17 or reliability (ICC, CV, or ES) 24 were excluded from the formal appraisal criteria only, but retained in both the tables and narrative synthesis to ensure comprehensive coverage of all commercially available smartphone-based VBT applications.
Due to substantial heterogeneity in outcome reporting, formal meta-analysis was inappropriate; available statistics were instead pooled descriptively by application. These estimates are exploratory and should not be used to rank or compare applications.
Results
Literature search
The study selection process is summarised in Figure 1. The systematic search yielded 194 records across all electronic databases. Following duplicate removal (57 excluded), 137 titles and abstracts underwent screening, with 117 excluded due to irrelevance. No language-based exclusions were applied. Full-text assessment of 20 articles resulted in 13 studies meeting inclusion criteria. Reference list screening identified 3 additional eligible articles, totalling 16 included studies. During peer review, 2 recently published studies20,24 identified by reviewers were assessed and met inclusion criteria, yielding a final sample of 18 studies.

Literature search: PRISMA flowchart.
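The record counts reported for the selection process can be verified with simple arithmetic. The sketch below uses only the counts given in the text; the variable names are hypothetical, and the number of articles excluded at full-text assessment is derived here rather than stated in the source.

```python
# Counts as reported in the text.
records_identified = 194
duplicates_removed = 57
excluded_on_title_abstract = 117
included_after_full_text = 13
added_from_reference_lists = 3
added_during_peer_review = 2

# Derived counts at each stage of the flow.
screened = records_identified - duplicates_removed
full_text_assessed = screened - excluded_on_title_abstract
excluded_at_full_text = full_text_assessed - included_after_full_text  # derived, not reported
final_sample = (
    included_after_full_text
    + added_from_reference_lists
    + added_during_peer_review
)

print(screened, full_text_assessed, excluded_at_full_text, final_sample)
```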
Quality assessment
Quality assessments are summarised in Table 3. All 18 studies scored 9 or above (mean = 9.61 ± 0.54; median = 10.0; IQR = 0.75), indicating strong methodological rigour across all included studies.
Quality assessments via the modified Downs and Black checklist.
Study characteristics
Study characteristics are summarised in Table 4. Validity was assessed in 18 studies, and reliability in 14. The following applications were examined: iLoad,10,11 Iron Path, 12 MetricVBT,13–15,20,25 MyLift,38–41 My Jump Lab,38,42–44 PowerLift,19,22,39 WL Analysis, 24 and QwikVBT. 13 Several of these applications, however, have since rebranded, and the versions examined in some studies are now outdated. Specifically, MetricVBT is now Metric, whilst MyLift and PowerLift are My Jump Lab. After accounting for these changes, 6 distinct commercially available applications were assessed across 18 studies (Table 1).
Characteristics of included studies.
Validity
A summary of studies assessing validity can be found in Table 5. Six VBT applications (iLoad,10,11 Iron Path, 12 Metric,13–15,20,25 My Jump Lab,13,16–19,21–23,39 WL Analysis, 24 and QwikVBT 13 ) were examined across all studies. Criterion measures included LPTs in eleven studies,9–12,16,18–22,25 and 3DMoCap in six.13–15,17,23,24,39 Mean velocity was the most assessed measure,9–15,17–23,25,39 followed by displacement,12,14,15,20,25 peak velocity,16,24,25 mean power, 18 and total workload. 10 The squat9–20 and bench press9,11,13–16,19–22,25,39 were the most assessed exercises, followed by deadlift,12–14,20 whilst snatch, 24 clean, 24 hip thrust, 19 and Bulgarian split squat 23 were examined only once. Free-weight (F/W) exercises were assessed in eleven studies,12–15,17,19–22,24,25 and Smith-machine (S/M) exercises in seven.9–11,16,18,23,39 Loading protocols ranged from 25% to 100% of one-repetition maximum (1RM).
Summary of studies assessing the validity of smartphone-based VBT applications.
LPT, linear position transducer. F/W, free-weight. r, Pearson correlation coefficient. SEE, standard error of the estimate. ES, Hedges' g effect size. LoA, limits of agreement. R2, coefficient of determination. 3DMoCap, three-dimensional motion capture. S/M, Smith machine. MAE, mean absolute error. LVT, linear velocity transducer. CCC, Lin's concordance correlation coefficient. CV, coefficient of variation. ICC, intraclass correlation coefficient. ME, maximum error. SDC, smallest detectable change. SEM, standard error of measurement. kj, kilojoules. w, watts. MAPE, mean absolute percentage error. RMSE, root mean square error.
Reliability
A summary of studies assessing reliability can be found in Table 6. Four VBT applications (iLoad,10,11 Metric,14,20,25 My Jump Lab,9,16–19,21,22,39 and WL Analysis 24 ) were examined across fourteen studies. Intra-device reliability was assessed in twelve studies,9,11,14,16–19,21,22,24,25,39 intra-rater in three (i.e., consistency of repeated measurements across manual-input applications),10,11,22 and inter-device in two.9,20 Mean velocity was the most assessed measure,9,10,14,17–22,25,26,39 followed by displacement,14,16,20,25 peak velocity,16,24,25 mean power, 18 and total workload. 10 The squat9–11,14,16–20 and bench press9,11,14,16,19–22,25,39 were the most assessed exercises, whilst snatch, 24 clean, 24 deadlift 14 and hip thrust 19 were examined only once. Free-weight exercises were assessed in eight studies,14,17,19–22,24,25 and Smith-machine exercises in six.9–11,16,18,39 Loading protocols ranged from 25% to 100% of 1RM.
Summary of studies assessing the reliability of smartphone-based VBT applications.
F/W, free-weight. ICC, intraclass correlation coefficient. α, Cronbach's alpha. R2, coefficient of determination. CV, coefficient of variation. S/M, Smith machine. CCC, Lin's concordance correlation coefficient. ME, maximum error. SDC, smallest detectable change. SEM, standard error of measurement. ES, Hedges' g effect size. SDD, smallest detectable difference. TE, typical error.
Descriptive pooling of validity and reliability statistics by VBT application is summarised in Tables 7 and 8. These estimates are exploratory and should not be used to rank or compare applications.
Pooled mean concurrent validity (r, CV, ES) estimates by VBT application.
Study-level means were calculated by averaging all values within a study before pooling across studies to derive application-level estimates. r, Pearson's correlation coefficient. CV, coefficient of variation. ES, effect size. n, number of studies. —, single study (range not applicable).
Pooled mean reliability (ICC, CV, ES) estimates by VBT application.
Study-level means were calculated by averaging all values within a study before pooling across studies to derive application-level estimates. ICC, intraclass correlation coefficient. CV, coefficient of variation. ES, effect size. n, number of studies. —, single study (range not applicable).
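The two-step pooling described in the table notes can be sketched as follows. This is a minimal illustration, not the procedure used to generate Tables 7 and 8: the function name and the example records are hypothetical, and the use of an unweighted mean across studies is an assumption, as the weighting is not specified in the text.

```python
from statistics import mean


def pool_by_application(records):
    """Pool (application, study_id, value) records into application-level estimates.

    Step 1: average all values within each study.
    Step 2: average the study-level means across studies (unweighted; an assumption).
    """
    grouped = {}
    for application, study_id, value in records:
        grouped.setdefault(application, {}).setdefault(study_id, []).append(value)

    pooled = {}
    for application, studies in grouped.items():
        study_means = [mean(values) for values in studies.values()]
        pooled[application] = {
            "mean": mean(study_means),
            "n": len(study_means),
            # A range is only meaningful when more than one study contributes.
            "range": (min(study_means), max(study_means)) if len(study_means) > 1 else None,
        }
    return pooled


# Hypothetical correlation coefficients, for illustration only.
example = [
    ("App A", "study 1", 0.90),
    ("App A", "study 1", 0.80),
    ("App A", "study 2", 0.95),
    ("App B", "study 3", 0.70),
]
print(pool_by_application(example))
```

Averaging within studies first prevents a study that reports many values from dominating the application-level estimate.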
Discussion
This review assessed the concurrent validity and reliability of six commercially available smartphone-based VBT applications for measuring barbell velocity and displacement. The primary findings were: (1) four applications (iLoad, Metric, My Jump Lab, and WL Analysis) demonstrated acceptable validity (r ≥ 0.70; CV ≤ 10%; ES ≤ 0.60); (2) two applications (Metric, My Jump Lab) demonstrated acceptable reliability (ICC ≥ 0.900; CV ≤ 10%; ES ≤ 0.60), though not research-level criteria (ICC ≥ 0.997; CV ≤ 3.5%); and (3) measurement performance varied across applications, exercises, and loading conditions, with smartphone applications demonstrating lower validity and reliability than established VBT devices.
Current smartphone-based VBT applications appear suitable for recreational and field-based applied settings, though not high-precision research contexts. Whilst ‘acceptable’ levels of validity and reliability are sufficient for general monitoring, 6 research applications demand substantially higher reliability, as even modest systematic or random error can meaningfully influence training decisions. 9 As these tools did not meet research-level reliability criteria, they are best suited to applied and field-based research contexts rather than high-precision laboratory settings.9,20 Nevertheless, these tools show promise as an emerging technology. Practitioners should evaluate measurement performance specific to their training context, and account for inherent measurement error.
Several technical limitations likely contribute to observed measurement variability. Hardware constraints, including frame rates and shutter speeds, limit temporal resolution during fast movements, whilst software limitations such as repetition-detection algorithms and pixel-scaling affect accuracy. Two-dimensional (2D) analysis introduces additional visual distortions, such as parallax error—though the extent to which this affects modern applications with compensatory algorithms remains unclear. These technical considerations are discussed in subsequent sections and should be considered when implementing smartphone-based VBT technologies.
Validity
Mean velocity
Mean velocity validity was assessed across: iLoad,10,11 Iron Path, 12 Metric,13–15,20,25 My Jump Lab,13,17–19,21–23,39 and QwikVBT. 13
Exercise-specific patterns were evident. Smartphone applications generally demonstrated highest validity in squat, followed by bench press, with lowest validity in deadlift.9,11–15,19,20 Hip thrust demonstrated slightly lower validity than squat and bench press, though assessed in only one study. 19 Between-application differences were also evident. When Metric (v2.3.1), My Jump Lab (v3.2.9), and QwikVBT (v0.94) were compared across squat, bench press, and deadlift, QwikVBT demonstrated near-perfect agreement with the criterion measure, whereas Metric and My Jump Lab showed substantially lower and divergent validity. 13
Smartphone applications demonstrated stronger concurrent validity than IMUs, though agreement with gold-standard devices was moderate. Across four studies,9,17,23,26 My Jump Lab (v4.0, v6.0.1, v10.0.6) was assessed against seven IMU devices (Bar Sensei, Beast Sensor, PUSHbar, PUSHband, PUSHbody), four LPTs (ChronoJump, GymAware, Vitruve), two LVTs (T-Force), and two camera-based systems (Velowin). My Jump Lab consistently demonstrated lower validity than LPTs, LVTs, and camera-based systems, but higher validity than IMUs.
Several technical factors may contribute to observed exercise-specific validity patterns. Pixel-scaling errors, often arising from inaccuracies in weight-plate diameter estimation, can distort displacement measurements, 21 whilst the commonly used 0.1 m/s vertical velocity threshold for repetition detection may misidentify repetition onset and termination points.15,20,25 Because mean velocity is calculated as displacement divided by time, either error type would directly affect measurement validity. 20
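The dependence of mean velocity on the repetition-detection threshold can be illustrated with a simplified example. The sketch below is not the algorithm used by any of the reviewed applications, whose implementations are not described in the source; the function name, the frame-to-frame differencing, and the sample data are all assumptions for illustration.

```python
def mean_concentric_velocity(positions_m, fps, threshold=0.1):
    """Mean velocity (m/s) over the first contiguous run of frames in which
    upward velocity exceeds `threshold` (m/s). Returns None if no run exists."""
    dt = 1.0 / fps
    # velocities[i] is the velocity between sample i and sample i + 1.
    velocities = [(b - a) / dt for a, b in zip(positions_m, positions_m[1:])]

    start = next((i for i, v in enumerate(velocities) if v > threshold), None)
    if start is None:
        return None

    end = start
    while end < len(velocities) and velocities[end] > threshold:
        end += 1

    displacement = positions_m[end] - positions_m[start]
    duration = (end - start) * dt
    return displacement / duration  # displacement divided by time


# Hypothetical vertical bar positions (m) sampled at 10 frames per second.
positions = [0.0, 0.0, 0.05, 0.15, 0.25, 0.30, 0.30]
print(mean_concentric_velocity(positions, fps=10, threshold=0.1))
print(mean_concentric_velocity(positions, fps=10, threshold=0.6))
```

In this example, raising the threshold trims the slower start and end of the lift from the detected repetition, which shortens both the displacement and the duration and changes the resulting mean velocity.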
Peak velocity
Peak velocity validity was assessed across: Metric, 25 My Jump Lab, 16 and WL Analysis. 24
Smartphone applications demonstrated lower validity than established technologies during traditional resistance exercises. Compared to 3DMoCap (STT) and an LPT (Vitruve), My Jump Lab (v8.1) produced less accurate peak velocity estimates in squat and bench press, with substantially larger measurement errors than the reference device (LVT; T-Force). 16 Load-dependent validity patterns were also observed when validating Metric (v4.5.0) against a Vitruve LPT during bench press: poor agreement at light loads (40% 1RM: r = 0.28) but improved validity at moderate and heavy loads (60% 1RM: r = 0.79; 75% 1RM: r = 0.81). 25
Exercise-specific validity patterns were evident in weightlifting movements. When compared against three LPTs (GymAware, Vitruve, Graxity Box) and a laser optic device (Kinetic Flex), WL Analysis demonstrated higher validity than Vitruve and Kinetic Flex in the snatch (bias = 2.76% at 40% 1RM; 3.47% at 70% 1RM). However, clean error was significantly higher (bias = 6.23% at 40% 1RM; 10.55% at 70% 1RM). 24
Systematic underestimation of peak velocity was reported across multiple studies.24,25 Hardware factors including sampling rate and shutter speed may contribute, as lower frame rates reduce temporal resolution and inadequate shutter speeds introduce motion blur, both of which can compromise measurement accuracy during ballistic movements. Recent updates to Metric have introduced advanced camera settings to address these limitations; however, independent validation is still required.
Displacement
Displacement validity was assessed across: Iron Path 12 and Metric.14,15,20,25
Exercise-specific validity patterns varied substantially between applications. Metric demonstrated highest validity in squat, followed by bench press and deadlift.14,15,20,25 Iron Path demonstrated a contrasting pattern, with highest validity in conventional and sumo deadlift, followed by front and back squat. 12
The magnitude of exercise-dependent variation for Metric was considerable. Validity ranged from squat (CV = 1.52%) to bench press (CV = 5.25%) to deadlift (CV = 25.8%). 20 Similar patterns were observed across other studies, with highest agreement in squat and significantly lower validity in bench press.14,15 In contrast, Iron Path demonstrated highest validity in deadlift (SEM = 0.020–0.042 m) compared to squat (SEM = 0.057–0.061 m), though assessed in only one study. 12
Displacement validity may be particularly sensitive to limitations inherent in 2D video analysis. Single-plane tracking cannot distinguish anterior-posterior barbell movement from vertical displacement, meaning horizontal bar path deviations or lifter rotation may be incorrectly registered as vertical displacement. Parallax error—wherein measurements taken at an angle differ from perpendicular measurements—distorts apparent barbell position. 40 This positional distortion may compromise identification of movement initiation and termination points, specifically when combined with velocity detection thresholds that rely on detecting when barbell velocity exceeds 0.1 m/s.15,25 Pixel-scaling errors further affect accuracy. 21 Because displacement is calculated from movement start to end points, errors in these measurements directly affect total displacement. These limitations may therefore have a more pronounced effect on displacement than velocity metrics, partially explaining the greater exercise-dependent variation observed across the same studies.14,15,20
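The effect of a pixel-scaling error on displacement can be shown with a short worked example. The sketch below assumes that the image scale is derived from a reference object of known size (here, the weight plate), which is a simplification rather than a description of any specific application; the 0.45 m plate diameter, the pixel values, and the function names are hypothetical.

```python
def metres_per_pixel(plate_diameter_m, plate_diameter_px):
    """Scale factor derived from a reference object of known size in the image."""
    if plate_diameter_px <= 0:
        raise ValueError("plate diameter in pixels must be positive")
    return plate_diameter_m / plate_diameter_px


def displacement_m(displacement_px, plate_diameter_m, plate_diameter_px):
    """Convert a displacement measured in pixels to metres."""
    return displacement_px * metres_per_pixel(plate_diameter_m, plate_diameter_px)


PLATE_DIAMETER_M = 0.45  # assumed plate diameter, for illustration

# Bar travels 400 px; the plate truly spans 300 px in the image.
true_displacement = displacement_m(400, PLATE_DIAMETER_M, 300)
# The same lift when the plate is overestimated by 2% (306 px instead of 300 px).
biased_displacement = displacement_m(400, PLATE_DIAMETER_M, 306)

relative_error = (biased_displacement - true_displacement) / true_displacement
print(true_displacement, biased_displacement, relative_error)
```

Under these assumptions, a 2% overestimate of the plate diameter in pixels produces an underestimate of roughly 2% in displacement, and the same proportional error would carry through to any velocity derived from it.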
Reliability
Mean velocity
Mean velocity reliability was assessed across: Metric14,20,25 and My Jump Lab.9,17–19,21,22,26
Metric demonstrated progressive improvements in mean velocity reliability across software versions. Early versions (v0.5.4) showed unacceptable reliability across squat, bench press, and deadlift, with slow bench press performing poorest (ICC = 0.79, CV = 24.4%). 14 Later versions showed clear improvements, with v4.5.0 achieving acceptable reliability in bench press across all loads (ICC = 0.929–0.952, CV = 3.91–7.09%), 25 and v4.8.1 achieving near-perfect reliability in squat (ICC = 0.994, CV = 3.8%) and acceptable reliability in bench press (ICC = 0.981, CV = 7.5%). 20 Deadlift reliability remained problematic across versions, with v4.8.1 still demonstrating unacceptable performance (ICC = 0.941, CV = 13.2%; SDC = 0.13 m/s). 20
My Jump Lab demonstrated acceptable but load-dependent reliability, consistently performing below LPTs. In bench press, reliability decreased as load increased, from 40% 1RM (CV = 8.17%) to 75% 1RM (CV = 13.55%). 18 Similar load-dependent patterns were observed in back squat, where reliability was comparable to PUSHbody. 17 My Jump Lab (v4.0) demonstrated acceptable reliability across back squat (ICC = 0.981), bench press (ICC = 0.974), and hip thrust (ICC = 0.961), outperforming Beast Sensor devices but remaining below GymAware. 19
Version-specific differences for Metric highlight the influence of software updates on measurement performance, with progressive improvements from v0.5.4 to v4.8.1 underscoring the importance of consulting version-specific validation research. Comparisons with IMU devices should be interpreted cautiously, as IMUs rely on similar velocity-based thresholds for repetition detection and face comparable challenges identifying movement initiation and termination from acceleration data.
Peak velocity
Peak velocity reliability was assessed across: Metric, 25 My Jump Lab, 16 and WL Analysis. 24
Peak velocity reliability was generally acceptable, but SDC values were a concern at higher velocities. My Jump Lab (v8.1) demonstrated acceptable reliability in back squat and bench press (ICC = 0.972 and 0.993; CV = 5.02% and 5.79%, respectively), though SDC values increased with velocity, ranging from 0.18–0.31 m/s in bench press and 0.25–0.35 m/s in back squat. 16 Metric (v4.5.0) showed a similar load-dependent pattern, with lower reliability at light loads (45% 1RM: ICC = 0.893) than heavy loads (60% 1RM: ICC = 0.966; 75% 1RM: ICC = 0.921). 25 WL Analysis was the exception, demonstrating highly consistent measurements across both the snatch and clean at 40% and 70% 1RM (SDC = 0.01 m/s). 24
Smartphone applications exhibit relatively large SDC values for peak velocity, suggesting individual measurements may be influenced by measurement error. Although reliability was high for the snatch and clean, it remains unclear whether this reflects genuinely consistent tracking or is partly a consequence of lower frame rates, which may limit sensitivity to rapid velocity fluctuations. 24
Displacement
Displacement reliability was assessed across: Metric.14,20,25
Exercise-specific reliability patterns were evident, though findings varied by assessment method and application version. Inter-device reliability between two Metric (v4.8.1) devices demonstrated acceptable reliability in back squat (ICC = 0.991; SDC = 0.019 m) and bench press (ICC = 0.974; SDC = 0.024 m), but unacceptable reliability in deadlift (ICC = 0.753; SDC = 0.098 m). 20 Test-retest reliability for Metric (v0.5.4) was unacceptable across all exercises (ICC < 0.900) for both fast and slow movement velocities, with the slow bench press condition demonstrating poorest reliability (ICC = 0.67, CV = 7.6%). 14 In contrast, Metric (v4.5.0) demonstrated acceptable test-retest reliability in bench press across all loads (ICC = 0.954–0.974; CV = 3.07–3.91%). 25
Displacement reliability demonstrated exercise-dependent patterns, with most consistent performance in squat, followed by bench press, and poorest reliability in deadlift. This likely reflects the technical challenges of tracking concentric-eccentric displacement with single-camera systems, compounded by parallax error and velocity detection thresholds. 20 Some manufacturers recommend positioning the camera closer to the ground for deadlift assessments to mitigate these effects 41 —though whether this improves measurement performance remains unexamined.
Guidance for research and practice
Smartphone-based VBT applications continue to evolve, with expanding capabilities and increasing practitioner adoption in applied settings. Despite ongoing technological improvements, acceptable validity and reliability depend on appropriate implementation. These tools remain constrained by single camera 2D methodologies that introduce visual distortions and contribute to measurement variability. Application-specific algorithms may mitigate these errors but cannot eliminate them. Practitioners must therefore understand both the technical limitations of these technologies and the implementation strategies that optimise measurement performance.
Implementation considerations
Camera positioning directly influences measurement performance and repetition detection capabilities.13,15,20 To minimise visual distortion, optimal placement generally involves aligning camera height with the midpoint of the barbell's range of motion and maintaining the barbell centred within the frame. Indeed, precise implementation may require familiarisation with individual lifter displacement characteristics. Correct setup should aim to minimise angular distortions—pitch (vertical tilt), yaw (horizontal rotation), and roll (lateral tilt)—with pitch and yaw particularly contributing to parallax error. Incorrect setup can exacerbate false-positive (phantom repetition) and false-negative (missed repetition) detections. Environmental factors, including lighting and background contrast, may further influence measurement reliability—though these effects remain unexamined.
Positional exceptions may apply for concentric-first exercises such as the deadlift, where slow repetition initiation may warrant prioritising accurate repetition detection over minimising visual distortion. Positioning the camera slightly below midpoint can improve detection in such cases, though this trade-off requires further investigation, particularly for weightlifting movements with larger displacement ranges. Where time or resource constraints exist, adherence to application-specific positioning guidelines remains the practical recommendation. 41
Smartphone-based VBT applications demonstrate measurement variability that practitioners should account for when interpreting data. Wide limits of agreement and relatively large SDC values indicate that whilst these tools generally demonstrate acceptable validity, reproducibility for individual repetitions is limited. Consequently, changes below the SDC may reflect measurement noise rather than true performance adaptations, and reliance on single measurements may lead to inappropriate training decisions.
Practitioners should consult validation research specific to their chosen application and training context, as measurement performance varies across different application versions, exercises, and loading conditions. Emphasising trends across multiple repetitions rather than single measurements can mitigate the impact of measurement variability. Consistent implementation protocols—including standardised camera positioning, environmental control, and adherence to manufacturer guidelines—improve the reliability of within-athlete monitoring and may support evidence-based training decisions.
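The interpretation of changes relative to the SDC can be sketched as follows. This example is illustrative only: the function name and the repetition velocities are hypothetical, and the SDC value is the one reported for Metric (v4.8.1) deadlift mean velocity (0.13 m/s). Applying an SDC derived from single measurements to session means is likely to be conservative if errors across repetitions are independent.

```python
from statistics import mean


def change_versus_sdc(baseline_reps, follow_up_reps, sdc):
    """Compare the change in mean velocity between two sessions with the SDC.

    Returns (change, exceeds_sdc). Changes at or below the SDC cannot be
    distinguished from measurement noise.
    """
    change = mean(follow_up_reps) - mean(baseline_reps)
    return change, abs(change) > sdc


SDC_DEADLIFT = 0.13  # m/s, as reported for Metric v4.8.1 deadlift mean velocity

# Hypothetical mean velocities (m/s) for three repetitions per session.
baseline = [0.50, 0.52, 0.48]
small_change = [0.58, 0.60, 0.56]
large_change = [0.70, 0.68, 0.72]

print(change_versus_sdc(baseline, small_change, SDC_DEADLIFT))
print(change_versus_sdc(baseline, large_change, SDC_DEADLIFT))
```

In this example, an apparent improvement of about 0.08 m/s falls below the SDC and should not be interpreted as a true change, whereas an improvement of about 0.20 m/s exceeds it.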
Recommendations for future research
Several research priorities emerge from the current evidence. Future validation studies should employ larger participant numbers and increased repetition counts to strengthen statistical power, as some included studies reported 90% confidence intervals below conventional thresholds.19,22 Reliability research should emphasise inter-device comparisons, since test–retest designs are inherently susceptible to biological variability that may obscure true measurement error. 20
Validation research should expand beyond traditional powerlifting movements to include weightlifting derivatives and exercises with higher velocities and complex bar paths. Coverage of the velocity spectrum should extend from maximal 1RM efforts to velocities exceeding 1.25 m/s to clarify whether measurement performance varies by barbell kinematics, velocity characteristics, or movement phase structure (concentric-eccentric versus eccentric-concentric). Studies employing multiple applications, exercises, and loading conditions would enable direct performance comparisons and strengthen the generalisability of findings.
Future studies should systematically address limitations inherent to 2D video analysis. Experimental investigation of parallax error, camera height, angle, and distance deviations, and environmental factors would quantify their individual and interactive influences on measurement accuracy and repetition identification. Additionally, reporting repetition misidentification events is essential to clarify whether reliability limitations arise from algorithmic, setup-related, or detection-specific factors—as highlighted in prior research.13,15,20 Such research would establish empirical thresholds for acceptable setup variation, enabling practitioners to distinguish unavoidable methodological constraints from modifiable factors.
Device and platform comparisons represent an important research gap. Direct evaluations of identical applications across iOS and Android platforms would determine whether hardware specifications or software implementation differences influence measurement accuracy. Given frequent software updates that modify tracking algorithms and processing methods, ongoing validation of new application versions is essential. Video-import capabilities offer methodological advantages here, enabling direct between-application comparisons using identical footage and isolating software-specific measurement differences.
Collectively, addressing these priorities would provide practitioners with robust, context-specific evidence to support informed measurement interpretation, optimise implementation protocols, and identify modifiable factors amenable to software or procedural refinement. Application developers are likewise encouraged to maintain detailed, publicly accessible software-update logs, as transparent documentation enables users and researchers to interpret version-specific performance changes and attribute improvements or discrepancies appropriately.
Limitations
This review acknowledges several methodological limitations. First, all screening, data extraction, and quality assessments were conducted by a single author. This introduces a potential risk of bias in study selection and data interpretation, as independent dual-screening is considered best practice in systematic reviewing. Whilst efforts were made to mitigate this through strict adherence to predetermined eligibility criteria and use of validated quality assessment tools, this remains an acknowledged limitation of the review. Second, potential publication bias warrants consideration, particularly as multiple included studies disclosed conflicts of interest involving application developers or affiliated authors,10,11,19,21,22 whilst others provided no conflict of interest declaration.9,14 Industry affiliations may have introduced bias within individual studies and, consequently, into this synthesis. Third, continuous software updates limit the generalisability of findings. Reported performance may not reflect current application versions, restricting applicability to contemporary practice. Fourth, the statistical methods employed across studies impose additional constraints. This review relied on correlation coefficients (r), coefficients of variation (CV), and effect sizes (ES) for validity appraisal, and intraclass correlation coefficients (ICC), CV, and ES for reliability appraisal. Although these represent traditional and widely accepted measures, several studies reported alternative or incompatible statistical approaches and were therefore excluded from formal recommendation. Practitioners should consult validation research specific to their chosen application and training context, and monitor software updates that may affect measurement performance.
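For readers unfamiliar with the validity statistics named above, the following sketch shows one common way each can be computed from paired application and criterion measurements. It is an illustration only: the velocity values are hypothetical, the function names are assumptions, and the formulations used here (Pearson's r, CV derived from the typical error of the paired differences, and a pooled-SD standardised mean difference) may differ from those applied in individual included studies. The thresholds are those adopted in this review for acceptable validity (r ≥ 0.70; CV ≤ 10%; ES ≤ 0.60).

```python
import math
import statistics


def pearson_r(x, y):
    """Pearson correlation coefficient between two paired samples."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def cv_percent(x, y):
    """CV (%) from the typical error: SD of paired differences / sqrt(2)."""
    diffs = [a - b for a, b in zip(x, y)]
    typical_error = statistics.stdev(diffs) / math.sqrt(2)
    grand_mean = statistics.fmean(x + y)
    return 100 * typical_error / grand_mean


def effect_size(x, y):
    """Absolute standardised mean difference using a pooled SD."""
    pooled_sd = math.sqrt((statistics.variance(x) + statistics.variance(y)) / 2)
    return abs(statistics.fmean(x) - statistics.fmean(y)) / pooled_sd


# Hypothetical mean velocities (m/s) for six repetitions; not real study data.
app = [0.42, 0.55, 0.68, 0.81, 0.97, 1.10]
criterion = [0.40, 0.57, 0.66, 0.84, 0.95, 1.13]

r = pearson_r(app, criterion)
cv = cv_percent(app, criterion)
es = effect_size(app, criterion)

# Apply the review's acceptable-validity thresholds.
acceptable = r >= 0.70 and cv <= 10 and es <= 0.60
print(f"r = {r:.3f}, CV = {cv:.2f}%, ES = {es:.3f}, acceptable = {acceptable}")
```

Because a high correlation can coexist with a consistent offset between devices, the three statistics are interpreted together rather than in isolation.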
Conclusion
This systematic review synthesised evidence on the concurrent validity and reliability of six commercially available smartphone-based VBT applications for measuring barbell velocity and displacement. Four applications (iLoad, Metric, My Jump Lab, and WL Analysis) demonstrated acceptable validity, whilst two (Metric, My Jump Lab) demonstrated acceptable reliability. No application met research-level reliability criteria, consistent with their design as field-based monitoring tools rather than laboratory-grade devices. Measurement performance varied across applications, exercises, and loading conditions. Current smartphone-based VBT applications appear suitable for recreational and field-based applied settings, though not high-precision research contexts. Practitioners should evaluate measurement performance specific to their training context and account for inherent measurement error. Future research should prioritise validation of continued application updates across varied training contexts and hardware and software configurations.
Supplemental Material
Supplemental material, sj-docx-1-spo-10.1177_17479541261447622 for The validity and reliability of commercially available smartphone-based velocity-based training applications: A systematic review with guidance for research and practice by Nathan Puppyn and Matt Brughelli in International Journal of Sports Science & Coaching
Footnotes
Acknowledgments
The authors gratefully acknowledge Auckland University of Technology (AUT) for their support during the preparation of this master's dissertation.
Ethical considerations
Ethical approval was not required for this study as it is a systematic review of publicly available published literature and did not involve human participants or primary data collection.
Author contributions
NP: Conceptualisation, methodology, data collection, analysis, and writing of the manuscript. MB: Supervision.
Consent to participate
Not applicable.
Consent for publication
Not applicable.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data availability
No primary data were collected or generated in this study. All sources analysed are cited within the review.
References
