Reliability of observed step count in the Timed Up and Go test: A cross-sectional study

Abstract

Background and aim

The Timed Up and Go (TUG) test is one of the most widely used assessments of functional mobility. This study investigated the psychometric properties of the step count observed during this test by experienced physiotherapists, using video recordings of healthy adults.

Methods

In this cross-sectional study, two physiotherapists independently observed video recordings of healthy participants completing the TUG twice and recorded the step count; the time taken to complete the TUG was also obtained from the recordings. For the intra-rater reliability, the step count observation was repeated after 10 months. Intraclass correlation coefficients (ICC) and Bland-Altman plots were used to analyse the intra- and inter-rater reliability. The Spearman correlation coefficient was used to investigate the correlation between step count and time taken.

Results

Thirty-six recordings from 31 healthy adults were analysed. The median step count across both raters and rating sessions was 12. The ICC for intra-rater reliability of the step count was 0.924 (95% CI: 0.874 to 0.954), whereas the ICC for inter-rater reliability was 0.857 (95% CI: 0.730 to 0.919). The minimal detectable change (MDC) was 0.51 steps. The correlation between the time taken and the average step count was rho = 0.71 (p < 0.001).

Conclusion

The results indicate that the observed step count during the TUG in healthy adults using video recordings is reliable and is positively correlated with the time to complete the test. With an MDC of 0.51 steps, a change of more than one step confidently reflects a real change in performance, beyond measurement error. Further research should explore this outcome parameter in individuals with irregular and unpredictable gait patterns.

Keywords

inter-rater reliability intra-rater reliability correlation observed step count gait analysis video analysis

Introduction

The Timed Up and Go (TUG) test is widely used as a performance-based measure of mobility in clinical practice, home settings and research.^1–4 It requires minimal equipment, is simple to conduct^1,2 and is, for example, recommended for neurological physiotherapy entry-level education in Austria.⁵ The TUG test assesses functional mobility by asking a person to stand up from a chair, walk three metres, turn around, walk back to the chair and sit down again,⁶ and is therefore a commonly used measurement tool in different disorders, for example in people with Parkinson’s Disease (PD).⁷ Although the test is nowadays often recorded with wearable sensors, it is still relevant to assess its validity with measures available in the clinic, such as the observed step count using video analysis. In this context, the observed step count during the TUG could provide a gold standard against which wearable sensors can be validated. The standard outcome parameter is the time it takes to finish the task.⁸ The inter-rater reliability for the TUG test duration is excellent, with an intraclass correlation coefficient (ICC) of 0.96 for people aged between 20 and 59 years, and normative values in these age groups range from 8.6 to 9.9 s. The TUG has been validated in people with multiple sclerosis⁹ and Alzheimer’s disease,¹⁰ as well as in typical adults and individuals with cerebral palsy, Huntington’s disease, stroke or spinal cord injury.⁶ It has also been shown to be reliable in intra- and inter-rater analyses, with ICC values in the range of 0.69–0.99 in people diagnosed with PD.¹¹

Focusing solely on the time taken during the TUG may lead to other aspects of movement quality being overlooked. It has been demonstrated during a modified TUG¹² and the 10-metre walking test¹³ that the step count can provide additional information about step length when the walking distance is known. The step rate can also be calculated if the number of steps taken and the time taken to complete the test are known,¹⁴ which could help to identify subtle mobility limitations. For example, the number of steps is an important gait parameter which changes before and during Freezing of Gait (FOG) in people with PD.¹⁵ Furthermore, it is currently impossible to specify which component of the TUG test improved when a better time is achieved: getting up from the chair, sitting down, walking or turning. Observed step count is established for other mobility assessments such as the step test, where the rater counts the number of steps taken up a step within 15 s.¹⁶ However, the psychometric properties of incorporating the observed step count into the standard TUG assessment have not yet been systematically investigated, highlighting the need for research in this area.

To our knowledge, no research has been conducted on the psychometric properties of the observed step count of the TUG. This study was conducted in a healthy population to obtain initial results, with the aim of repeating the study in the future in people with PD experiencing FOG.

Therefore, we aimed to evaluate the intra- and inter-rater reliability of the observed step count using video recordings of healthy adults performing the TUG. Two trained and experienced physiotherapists rated the videos and observed the step counts twice, with the ratings taken 10 months apart. A previous study¹⁷ showed that recall rates after 4 weeks may be sufficiently high to affect an intra-rater study design, and that longer intervals between ratings could reduce recall bias. Mischiati et al.¹⁸ also found that extended intervals of several months are feasible. Additionally, the standard error of measurement (SEM) and the minimal detectable change (MDC) were calculated to evaluate the accuracy and sensitivity of this method.¹⁹ The study also investigated the correlation between the mean step count and the time assessed by a third rater in a single rating session.

Methods

This study employed a cross-sectional design to evaluate the intra- and inter-rater reliability in accordance with the Guidelines for Reporting Reliability and Agreement Studies (GRRAS) framework to ensure transparent and standardized reporting.²⁰

Participants

To be included in the study, the participants had to be students or employees of the University of Applied Sciences, over 18 years of age, and able to walk 6 metres without the use of assistive devices. People experiencing acute pain while walking were not eligible to participate.

Sample size calculation

Key parameters for ICC hypothesis testing were established for the reliability analysis. The minimum acceptable reliability coefficient was set at 0.75, and a reliability of 0.90 was expected. The statistical parameters included a significance level (α) of 0.05 using a two-tailed test and a power of 80%. The study design incorporated two raters and repetitions per subject. Based on these specifications, the required sample size was calculated to be 33 subjects (Afrin, 2025).
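The calculation can be illustrated with a minimal sketch. The source does not state which formula or tool produced the figure of 33, so the approximation below (the commonly used approach of Walter, Eliasziw and Donner for ICC hypothesis testing) is an assumption, as is reading "two raters and repetitions" as k = 2 observations per subject. The function name and parameters are illustrative only.

```python
import math
from statistics import NormalDist


def icc_sample_size(rho0, rho1, k, alpha=0.05, power=0.80, two_tailed=True):
    """Approximate number of subjects needed to test H0: ICC = rho0
    against H1: ICC = rho1 with k observations per subject.

    Assumption: Walter-Eliasziw-Donner approximation; the paper does not
    name the method it used.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2) if two_tailed else z.inv_cdf(1 - alpha)
    z_beta = z.inv_cdf(power)

    # Variance ratios under the null and alternative hypotheses
    theta0 = rho0 / (1 - rho0)
    theta1 = rho1 / (1 - rho1)
    c0 = (1 + k * theta0) / (1 + k * theta1)

    n = 1 + (2 * (z_alpha + z_beta) ** 2 * k) / (math.log(c0) ** 2 * (k - 1))
    return math.ceil(n)


# Parameters reported in the text: minimum acceptable ICC 0.75, expected ICC 0.90,
# alpha 0.05 (two-tailed), power 80%, two observations per subject.
n_subjects = icc_sample_size(rho0=0.75, rho1=0.90, k=2)
print(n_subjects)  # 33
```

Under these assumptions the approximation yields 32.5, which rounds up to the 33 subjects reported above.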

Recruitment

This study was conducted as part of the elective ‘Research in Physiotherapy’ course in a Bachelor’s degree program in Physiotherapy at a local University of Applied Sciences in Austria. Students enrolled on the course acted as student researchers and were involved in the development and execution of the project, overseen by the course lecturer and lead researcher. Participants were recruited via email, and were provided with detailed information about the project, as well as the opportunity to actively opt-in via an online form. Information about the project was also disseminated in lecture rooms, and individuals could opt-in to participate during the 2-day data collection period on campus.

Ethics

Prior to the project’s commencement, the Ethics Committee of Lower Austria confirmed that a formal ethics vote was not required due to the study’s low risk and anonymous nature (GS1-EK-12/797-2024). The lead researcher provided all participants with detailed information about the study and gave them the opportunity to ask questions. The study was conducted in accordance with the principles of the Declaration of Helsinki. Written informed consent was obtained from all participants.

Data collection

All the data was collected over the course of two testing days in May 2024. After confirming the inclusion and exclusion criteria, information on age, self-reported gender and height was collected. Participants performed the TUG test twice, with each trial recorded on video. Two student researchers oversaw the TUG assessment. To avoid bias, the lead researcher did not actively participate in this process, as this person acted as one of the two raters in the video analysis.

The TUG was performed using the following procedure: Participants began seated in a standard chair with full back support, their feet flat on the floor and their arms resting on the armrests. When instructed to ‘Go’, they stood up from the chair, walked 3 metres at a comfortable and safe pace, turned around at a designated marker, returned to the chair and sat down again.²¹

In accordance with the General Data Protection Regulation,²² the decision was made to restrict the video recording to the participants’ legs alone, in order to avoid revealing any personal information and minimise the amount of data collected. The removal or covering of any objects that could indirectly identify individuals, including jewellery and tattoos, was mandatory.

Video analysis – First rating session

For the inter-rater reliability, the observational step count analysis was conducted by two physiotherapists who participated as raters. Neither rater actively participated in the video recordings, and both were physiotherapists with over 15 years of clinical experience and over 10 years of teaching experience in the bachelor’s degree program in physiotherapy.

Prior to analysing the video recordings, the two raters reached a consensus on the defining characteristics of a step for observing step count during walking and turning. The focus of the step count observation was placed on the feet and not the rest of the body. The following was agreed upon:

A step is counted when, with the body in an upright, vertical position (i.e. standing), a foot is either completely lifted off the ground (e.g. stepping while standing) or the entire foot changes position in any direction (forward, backward or sideways), even without being completely lifted off the floor. Shifting the weight from left to right or forwards and backwards while standing without moving or lifting the feet is not a step, nor is moving the foot while sitting.

The video recordings were obtained using the integrated video recording functionality of a gait analysis software (STAPPONE Research by StAPPtronics GmbH). An external camera was connected to the laptop and positioned on a tripod at a distance of 2 metres, from where the 3-metre walking distance of the TUG was recorded in a lateral view.

To minimise potential rater bias and to reduce the likelihood of pattern recognition, all video recordings were randomised and renamed after the data collection. Each participant completed two trials of the TUG, which were recorded as separate video files. The order of the files was randomised using an online randomiser (https://www.randomizer.org/), and the participant’s name and the trial number were removed from the filenames. After randomisation, each video was assigned a new, anonymised filename (e.g. Video_1, Video_2, etc). The randomised and renamed video files were provided to the raters in this new order. To minimise bias, the raters were blinded to each other’s assessments and to any identifying information about the participants. They independently reviewed the same set of video recordings in one session, which captured participants performing the TUG test under standardised conditions. Each rater recorded the number of steps taken during the test using a predefined case report form (see Table 1), which included the number of steps, their level of confidence (confident/not confident), and how many times the video had been watched (once/twice/several times). Qualitative observations were recorded in an open remarks section. The case report forms were subsequently collected and analysed.
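The randomisation and renaming step can be sketched as follows. This is an illustration only: the study used an online randomiser, whereas the sketch substitutes Python's `random` module, and the original filenames, the function name and the seed are hypothetical.

```python
import random


def anonymise_videos(original_names, seed=None):
    """Shuffle the recordings and map new anonymised names to the originals.

    Returns a dict such as {"Video_1": "<original file>", ...}; this key
    would be kept away from the raters.
    """
    rng = random.Random(seed)
    shuffled = list(original_names)
    rng.shuffle(shuffled)
    return {f"Video_{i}": name for i, name in enumerate(shuffled, start=1)}


# Hypothetical original filenames: participant ID plus trial number
originals = [f"P{p:02d}_trial{t}.mp4" for p in range(1, 4) for t in (1, 2)]
key = anonymise_videos(originals, seed=42)

for new_name, old_name in key.items():
    print(new_name, "<-", old_name)
```

Keeping the mapping in a separate key file allows the two trials of each participant to be re-linked after rating without exposing participant or trial information to the raters.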

Table 1.

Case report form for analysis of the video recordings.

Video-ID	Step count	Confidence	Number of times the video was watched	Open remarks
		◦ Confident ◦ Not confident	◦ Watched once ◦ Watched twice ◦ Watched several times

Furthermore, for the correlation analysis, an independent Rater 3, a physiotherapist with 4 years of clinical experience, documented the time in seconds required by the participants to complete the test using the video recordings. This was done with ELAN, an open-source software program that facilitates the annotation of audio and video recordings.²³

Video analysis – second rating session

To assess intra-rater reliability, a second rating session focusing solely on observing step count was conducted, using the same video recordings as in the first rating session. Raters 1 and 2 conducted their step count analyses independently, 10 months after the initial analysis. This interval was chosen because a previous study¹⁷ showed that recall rates after 4 weeks may be sufficiently high to affect an intra-rater study design, and that longer intervals between ratings could reduce recall bias. Mischiati et al.¹⁸ also found that extended intervals of several months are feasible.

Analysis

Following data collection, the files were checked to ensure that they could be viewed properly. After the first rating session, the raters’ open remarks were checked for reasons for further exclusion.

The quantitative analysis was performed with IBM SPSS Statistics, Version 29.0.0.0 (241). The step count data of the remaining videos were assessed for normal distribution with the Shapiro-Wilk test. Intra- and inter-rater reliability were evaluated by calculating the ICC using a two-way mixed effects model with single measures, in accordance with GRRAS.²⁰ ICC values were interpreted according to Koo,²⁴ with values between 0.75 and 0.90 indicating good reliability and values above 0.90 indicating excellent reliability. The results were visualised using Bland-Altman plots. These plots can be used even when the data are not normally distributed, provided that the 95% confidence intervals (CI) for the limits of agreement are also reported, thereby improving clarity and reliability.²⁵ The SEM and MDC were calculated based on the inter-rater reliability of the observed step count across both rating sessions. The SEM was derived from the standard deviation (SD) of the number of steps and the ICC, reflecting the variability between raters.¹⁹ The MDC, the smallest change that exceeds measurement error with 95% confidence, was calculated as the SEM multiplied by 1.96 and the square root of two.¹⁹ Because the data were asymmetrically distributed, Spearman’s rank correlation coefficient was calculated to assess the correlation between the time taken and the step count. Quantitative data on confidence and the number of times each video was watched were analysed descriptively.
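The SEM and MDC calculation described above can be sketched in a few lines. The text states only that the SEM was derived from the SD and the ICC; the specific form SEM = SD × √(1 − ICC) used below is the conventional definition and is an assumption here. The input values (SD 0.49, ICC 0.857) are those reported in Table 2; the function names are illustrative.

```python
import math


def sem_from_icc(sd, icc):
    """Standard error of measurement, assuming SEM = SD * sqrt(1 - ICC)."""
    return sd * math.sqrt(1 - icc)


def mdc95(sem):
    """Minimal detectable change at 95% confidence: SEM * 1.96 * sqrt(2)."""
    return sem * 1.96 * math.sqrt(2)


sd = 0.49    # SD of the observed step count (Table 2)
icc = 0.857  # inter-rater ICC (Table 2)

sem = sem_from_icc(sd, icc)
mdc = mdc95(sem)
print(f"SEM = {sem:.3f} steps, MDC95 = {mdc:.3f} steps")
# SEM = 0.185 steps, MDC95 = 0.514 steps
```

Under this assumed formula the results are in line with the reported SEM of 0.18 and MDC of 0.51 steps; small differences in the last digit would be expected if the analysis used unrounded inputs.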

Results

In this study, 31 participants (self-reported gender: 12 women/9 men) were included. The participants had a median age of 22 years (range: 18–43) and a median height of 174 cm (range: 158–190 cm). Each participant performed the TUG twice, resulting in 62 video recordings.

Of the initial 62 files, three (4.8%) were technically disrupted and had to be excluded from the subsequent analysis. During the first video analysis rating session, Rater 1 and Rater 2 analysed the remaining 59 video recordings. Rater 3 analysed the videos regarding the time taken to complete the TUG.

Out of 59 videos, 22 (37.3%) were excluded due to the subject ‘stepping outside of the frame’ documented as an open remark by one or both raters. One video had technical issues (1.7%) and received the remark ‘missing’. Thirty-six videos (61%) of the 59 remained for the statistical analysis.

The process of data collection, exclusion of data files, video analysis, and quantitative analysis is visualised in the flow chart (Figure 1).

Figure 1.

Flow Chart of data collection and analysis and process of exclusion.

On average, the raters were confident about their observed step count in 86.1% of ratings and watched the videos once in 55.6%, twice in 34.7% and more than two times in 6.9% of ratings. A detailed analysis showed that the raters differed in how often they watched the videos (twice or more than two times: Rater 1, 80.6%; Rater 2, 8.3%). Of the 27 videos watched more than once by Rater 1, 13 received the open remark ‘sliding with the heel’. Rater 2 watched three videos more than once, and two of these received the open remark ‘short step at the end’. The other video recordings watched more than once did not receive any open remarks.

The data on the observed step count from the remaining 36 video recordings were analysed descriptively across both raters and both rating sessions and are reported as median and interquartile range, as the data were not normally distributed. The observed step count during the Timed Up and Go test across Rater 1 and Rater 2 and the first and second rating session had a median of 12 steps (range 10–14; 25th percentile 11, 75th percentile 13). The time taken to complete the TUG, assessed by Rater 3 in the first rating session, had a median of 7.5 s (range 5.7–10.0; 25th percentile 7.0, 75th percentile 8.1) (see Table 2).

Table 2.

Intraclass correlation coefficient, two-way mixed effects model, for intra- and inter-rater reliability of the step count.

Observed step count during the Timed Up And Go Test
	Median (range)	Intra-rater reliability	Inter-rater reliability	SD	SEM	MDC
Observed step count	12 steps (10–14)	ICC 0.924 (CI 95% 0.874 to 0.954)	ICC 0.857 (CI 95% 0.730 to 0.919)	0.49	0.18	0.51
Time taken during the Timed Up And Go Test and Correlation with Observed Step Count
	Median (range)	Spearman rank correlation	p-value	Coefficient of determination
Time taken	7.5 s (5.7–10.0)	0.714 (rho)	<.001	rho² = 0.49

ICC = Intraclass Correlation Coefficient; CI = Confidence Interval; SD = Standard Deviation; SEM = Standard Error of Measurement; MDC = Minimal detectable change.

The ICC for intra-rater reliability was 0.924 (95% CI: 0.874 to 0.954), whereas the ICC for inter-rater reliability was 0.857 (95% CI: 0.730 to 0.919). Derived from the SD (0.49) and the inter-rater ICC (0.857), the SEM was 0.18 steps and the MDC was 0.51 steps. See Table 2 for details.
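For readers who wish to reproduce this type of coefficient outside SPSS, the following is a minimal sketch of a single-measures ICC from a two-way model. The source does not state whether the consistency or the absolute-agreement definition was used; the sketch assumes consistency, often written ICC(3,1). The ratings shown are hypothetical and are not the study data.

```python
def icc_3_1(ratings):
    """ICC(3,1): two-way mixed effects, single measures, consistency.

    ratings: one row per video, one column per rater (or rating session).
    """
    n = len(ratings)      # number of videos
    k = len(ratings[0])   # number of raters or sessions
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)


# Hypothetical step counts for five videos rated by two raters
example = [[12, 12], [11, 12], [13, 13], [10, 11], [14, 13]]
icc_value = icc_3_1(example)
print(f"ICC(3,1) = {icc_value:.3f}")  # ICC(3,1) = 0.781
```

The same function applies to the intra-rater case by passing the two rating sessions of one rater as the two columns.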

The Spearman’s rank correlation coefficient showed a statistically significant positive correlation between time and observed step count during the TUG with 0.714 (rho) and a p-value of <.001. The coefficient of determination was rho² = 0.49 (see Table 2).

The data for intra-rater (see Figure 2(a)) and inter-rater (see Figure 2(b)) reliability was visualised with a Bland-Altman plot, displaying the 95% CI.

Figure 2.

(a) and (b) Bland-Altman plots: intra- and inter-rater step count. (a) Assessment of agreement between the two rating sessions; (b) assessment of agreement between Rater 1 and Rater 2. Red continuous line = mean difference (a) between the two rating sessions, (b) between the two raters. Green dotted lines = 95% CI = limits of agreement. Blue dots = individual differences (a) between the two rating sessions, (b) between the two raters. Dots overlap where the differences were identical.
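The quantities drawn in such a plot can be computed as in the following minimal sketch, which returns the mean difference and the 95% limits of agreement (mean difference ± 1.96 × SD of the differences). It does not compute confidence intervals around the limits, and the step counts are hypothetical, not the study data.

```python
from statistics import mean, stdev


def bland_altman(a, b):
    """Return (mean difference, lower limit, upper limit) for paired ratings."""
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical step counts for six videos
rater1 = [12, 11, 13, 10, 14, 12]
rater2 = [12, 12, 13, 11, 13, 12]

bias, lower, upper = bland_altman(rater1, rater2)
print(f"mean difference = {bias:.2f}, limits of agreement = {lower:.2f} to {upper:.2f}")
```

Because step counts are integers, many differences coincide, which is why individual points overlap in the plots.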

Discussion

The main objective of this cross-sectional study was to investigate the intra- and inter-rater reliability of the observed step count using video recordings of the Timed Up and Go test performed by healthy adults, and to correlate the time taken with the step count during the TUG. This study showed excellent intra-rater and good inter-rater reliability for the step count during the TUG.²⁴ The ICC values for the observed step count are similar to the intra- and inter-rater reliability of the time needed to complete the TUG investigated by Kear et al.⁸ Therefore, it can be assumed that both parameters can be assessed during the TUG with comparable reliability. The Bland-Altman plots showed narrow limits of agreement, of fewer than two steps in either direction, for both intra- and inter-rater reliability. When differences in steps occurred between the raters, or between the two rating sessions, the difference was never greater than one step. This indicates a high degree of certainty that the correct number of steps lies within ±1.

The MDC, which represents the smallest change that can be interpreted as a real difference beyond measurement error rather than random variation,¹⁹ is particularly valuable in clinical practice for determining whether observed changes in a patient’s performance are meaningful. The MDC of 0.51 steps suggests that the step count during the TUG demonstrates high measurement precision. As there are no half steps, a change of more than one step would need to be interpreted as a difference beyond the measurement error. These findings support the use of step counting via video analysis during the TUG.

To the best of our knowledge, no studies have established normative data on the observed step count during the standard TUG. In this study, involving young healthy participants, the median number of steps was 12. The minimum number of steps was 10, and the maximum was 14. According to Tudor-Locke et al.,²⁶ the walking cadence in healthy young adults is 100 steps per minute, which corresponds to 1.67 steps per second. The median time taken to complete the TUG in this study’s population was 7.5 s. At this cadence, the estimated number of steps taken would be 12.5. This estimate is consistent with the median of 12 steps observed in this study. These data could be used as a first reference value for future studies looking at the observed step count during the TUG.
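The plausibility check above is simple arithmetic and can be written out as a short sketch. It treats the whole TUG duration as walking time, which is a simplifying assumption since the test also includes standing up, turning and sitting down; the function names are illustrative.

```python
def expected_steps(cadence_per_min, duration_s):
    """Steps expected when walking at a given cadence for a given duration."""
    return cadence_per_min / 60 * duration_s


def step_rate_per_min(step_count, duration_s):
    """Average step rate implied by an observed step count and duration."""
    return step_count / duration_s * 60


# Values from the text: cadence of 100 steps/min, median time 7.5 s, median 12 steps
print(f"expected steps: {expected_steps(100, 7.5):.1f}")            # expected steps: 12.5
print(f"observed step rate: {step_rate_per_min(12, 7.5):.0f}/min")  # observed step rate: 96/min
```

The second function illustrates how reporting step count alongside time yields an average step rate as an additional descriptor.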

In this study, the median time needed to complete the TUG was 7.5 s with a range of 5.7 to 10.0 s. This is comparable to the normative data for the respective age group (20–30 years), with a median of 8.4 s and a range of 6.1 to 12.5 s.⁸ This suggests that our sample was representative for its age, as the difference between the two medians (0.9 s) is smaller than the MDC for the TUG of 2.3 s reported in a population with total knee arthroplasty.

The statistically significant positive correlation (rho) between time and the observed step count during the TUG aligns well with existing literature.²⁷ This is not surprising, as walking speed is the product of step length and cadence.²⁸ The coefficient of determination (rho² = 0.49) indicates that the variability in time explains 49% of the variation in step count. It can therefore be concluded that measuring the step count provides additional information beyond measuring the time. Assessing and reporting both outcome parameters in combination may provide complementary insights into functional mobility evaluation and interpretation. This should be explored in different populations, such as people with PD who experience FOG, since their gait patterns are often irregular and unpredictable.

Before screening the data and observing the step count, both raters agreed on which key components to focus on when counting steps from video recordings. Accordingly, a step must be performed while standing upright and the entire foot must change position in any direction. This included shuffling, sliding the entire foot, taking side steps, and stepping backward. A step was also defined as lifting the entire foot off the ground, including stepping in place. This description agrees with the clinical definition of a step.^29–31 Because it relies on the feet alone, it also supports data minimisation and the avoidance of direct identifiers, such as a person’s face, as required by the General Data Protection Regulation.²² This definition could be used when assessing the observed step count in research and clinical practice, since a precise definition of a step is essential for ensuring consistency, accuracy and comparability in data collection and interpretation.

Limitations

This study was performed on young, healthy adults with no major gait deviations or walking disorders. Further studies are needed to evaluate intra- and inter-rater reliability, SEM and MDC in different populations with walking disorders, such as people with PD experiencing FOG.

The high dropout rate was a limitation, as it reduced statistical power. However, the narrow CIs for the intra- and inter-rater reliability suggest that the estimates were stable despite the reduced sample size. The dropout rate was mainly due to setup errors that made it impossible to count the steps accurately from the video recordings. This showed that specific task instructions and setups are required for the TUG. In future studies using two-dimensional lateral-view video gait analysis, more attention should be paid to the setup of the starting spot, the turning spot and the camera position, as recommended by Michelini et al.³²

A potential source of bias stemmed from the lead researcher’s involvement in randomizing the video recordings, observing step count, and analysing data, which could have influenced the study’s objectivity. This overlap in roles occurred due to limited time and staff resources. To reduce this risk, standardized procedures were implemented, and critical steps were documented to promote transparency and consistency.

Conclusion

The observed step count during the TUG using video recordings has excellent intra-rater and good inter-rater reliability, and could therefore be used as an additional, reliable outcome parameter in future research and clinical practice. With an MDC of 0.51 steps, a change of more than one step can be interpreted with high confidence as a change beyond the measurement error. The high correlation between the observed step count and the time taken to perform the TUG suggests concurrent validity. Reporting both parameters may, however, provide complementary insights into functional mobility evaluation and interpretation. Further research should explore the psychometric properties of the observed step count as an outcome parameter during the TUG in different populations.

Footnotes

Acknowledgements

We sincerely thank all the student researchers who participated in the elective course ‘Research in Physiotherapy’ within the Bachelor’s degree program in physiotherapy at the University of Applied Sciences in Krems, Austria. Their active involvement in conducting this study was essential to making this work possible. We also would like to thank all participants in this research study. Their contribution is greatly appreciated.

ORCID iD

Agnes Wilhelm

Ethical considerations

This study was conducted in accordance with the principles set out in the Declaration of Helsinki. Informed consent was obtained from all participants. Confirmation was received from the Lower Austrian ethics committee that no formal ethics application or vote was required due to the study’s low risk and the healthy volunteers involved (GS1-EK-12/797-2024).

Consent for publication

Not applicable. Participants were informed in the consent form that their data would be used anonymously for publication.

Author contributions

AW – The author made substantial contributions to the conception and design of the work; the acquisition, analysis, and interpretation of data; and drafted the manuscript.

TM – The author substantively revised the manuscript.

AN – The author made substantial contributions to the revision of the manuscript and contributed to the interpretation of data.

JJ – The author made substantial contributions to the conception and design of the work and substantively revised the manuscript.

PA – The author made substantial contributions to the conception of the project, study supervision and oversight, and substantively revised the manuscript.

All authors read and approved the final manuscript.

Funding

The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study is funded through the NÖ Gesundheits- und Sozialfonds (NÖGUS) via a Dissertation Call at the IMC Krems University of Applied Sciences. The funding recipient is the first author AW.

Declaration of conflicting interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Data Availability Statement

The data used for this study is available on github.com: .

Appendix

References

1. Herman, Giladi, Hausdorff. Properties of the ‘timed up and go’ test: more than meets the eye. Gerontology 2011; 57(3): 203–210. https://doi.org/10.1159/000314963

2. Hellmers, Izadpanah, Elgert, et al. Towards an automated unsupervised mobility assessment for older people based on inertial TUG measurements. Sensors 2018; 18: 17. https://doi.org/10.3390/s18103310

3. Denk, Herman, Zoetewei, et al. Daily-living freezing of gait as quantified using wearables in people with Parkinson disease: comparison with self-report and provocation tests. Phys Ther 2022; 102(12): pzac129. https://doi.org/10.1093/ptj/pzac129

4. Zoetewei, Ginis, Goris, et al. Which gait tasks produce reliable outcome measures of freezing of gait in Parkinson's disease? J Parkinsons Dis 2024; 14(6): 1163–1174. https://doi.org/10.3233/JPD-240134

5. Greisberger, Klinger, Dopona, et al. Recommendations for a core assessment set for neurological physiotherapy entry-level education in Austria - a multistage process including a Delphi study. BMC Med Educ 2025; 25(1): 1145. https://doi.org/10.1186/s12909-025-07704-8

6. Christopher, Kraft, Olenick, et al. The reliability and validity of the Timed Up and Go as a clinical tool in individuals with and without disabilities across a lifespan: a systematic review. Disabil Rehabil 2021; 43(13): 1799–1813. https://doi.org/10.1080/09638288.2019.1682066

7. Bouça-Machado, Pona-Ferreira, Gonçalves, et al. Outcome measures for evaluating the effect of a multidisciplinary intervention on axial symptoms of Parkinson's disease. Front Neurol 2020; 11: 328. https://doi.org/10.3389/fneur.2020.00328

8. Kear, Guck, McGaha. Timed Up and Go (TUG) test: normative reference values for ages 20 to 59 years and relationships with physical and mental health risk factors. J Prim Care Community Health 2017; 8(1): 9–13. https://doi.org/10.1177/2150131916659282

9. Sebastião, Sandroff, Learmonth, et al. Validity of the Timed Up and Go test as a measure of functional mobility in persons with multiple sclerosis. Arch Phys Med Rehabil 2016; 97(7): 1072–1077. https://doi.org/10.1016/j.apmr.2015.12.031

10. Ries, Echternach, Nof, et al. Test-retest reliability and minimal detectable change scores for the timed “up & go” test, the six-minute walk test, and gait speed in people with Alzheimer disease. Phys Ther 2009; 89(6): 569–579. https://doi.org/10.2522/ptj.20080258

11. Mollinedo, Ma Cancela. Evaluation of the psychometric properties and clinical applications of the Timed Up and Go test in Parkinson disease: a systematic review. J Exerc Rehabil 2020; 16(4): 302–312. https://doi.org/10.12965/jer.2040532.266

12. Bayle, Patel, Crisan, et al. Contribution of step length to increase walking and turning speed as a marker of Parkinson's disease progression. PLoS One 2016; 11(4): e0152469. https://doi.org/10.1371/journal.pone.0152469

13. Morio, Izawa, Omori, et al. The relationship between walking speed and step length in older aged patients. Diseases 2019; 7(1): 17. https://doi.org/10.3390/diseases7010017

14. Nagano, Hori, Muramatsu. A comparison of at-home walking and 10-meter walking test parameters of individuals with post-stroke hemiparesis. J Phys Ther Sci 2015; 27(2): 357–359. https://doi.org/10.1589/jpts.27.357

15. Ferster, Mazilu, Troester. Gait parameters change prior to freezing in Parkinson’s disease: a data-driven study with wearable inertial units. In BODYNETS. ICST 2015. https://doi.org/10.4108/eai.28-9-2015.2261411

16. Mercer, Freburger, Chang, et al. Step test scores are related to measures of activity and participation in the first 6 months after stroke. Phys Ther 2009; 89(10): 1061–1071. https://doi.org/10.2522/ptj.20080368

17. Campbell, Talmon, Foster, et al. Visual memory effects on intraoperator study design: determining a minimum time gap between case reviews to reduce recall bias. Am J Clin Pathol 2015; 143(3): 412–418. https://doi.org/10.1309/AJCPUC3TYMS3QOBM

18. Mischiati, Comerford, Gosford, et al. Intra and inter-rater reliability of screening for movement impairments: movement control tests from the foundation matrix. J Sports Sci Med 2015; 14(2): 427–440.

19. Furlan, Sterr. The applicability of standard error of measurement and minimal detectable change to motor learning research - a behavioral study. Front Hum Neurosci 2018; 12: 95. https://doi.org/10.3389/fnhum.2018.00095

20. Kottner, Audigé, Brorson, et al. Guidelines for reporting reliability and agreement studies (GRRAS) were proposed. J Clin Epidemiol 2011; 64(1): 96–106. https://doi.org/10.1016/j.jclinepi.2010.03.002

21. Bouça-Machado, Maetzler, Ferreira. What is functional mobility applied to Parkinson's disease? J Parkinsons Dis 2018; 8(1): 121–130. https://doi.org/10.3233/JPD-171233

22. Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC (General Data Protection Regulation) (text with EEA relevance). Luxembourg: Publications Office of the European Union, 2016.

23. Sloetjes, Wittenburg. Annotation by category: ELAN and ISO DCR. In Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08), Marrakech, Morocco. European Language Resources Association (ELRA). 2008. https://archive.mpi.nl/tla/elan/cite

24. Koo. A guideline of selecting and reporting intraclass correlation coefficients for reliability research. J Chiropr Med 2016; 15(2): 155–163. https://doi.org/10.1016/j.jcm.2016.02.012

25. Olofsen, Dahan, Borsboom, et al. Improvements in the application and reporting of advanced Bland–Altman methods of comparison. J Clin Monit Comput 2015; 29(1): 127–139. https://doi.org/10.1007/s10877-014-9577-3

26. Tudor-Locke, Han, Aguiar, et al. How fast is fast enough? Walking cadence (steps/min) as a practical estimate of intensity in adults: a narrative review. Br J Sports Med 2018; 52(12): 776–788. https://doi.org/10.1136/bjsports-2017-097628

27. Schober, Boer, Schwarte. Correlation coefficients: appropriate use and interpretation. Anesth Analg 2018; 126(5): 1763–1768. https://doi.org/10.1213/ANE.0000000000002864

28. Shiina, Nakagomi, Mori, et al. Characteristics of cadence during continuous walking in daily life. Heliyon 2024; 10(9): e29969. https://doi.org/10.1016/j.heliyon.2024.e29969

29. Birch, Vernon, Walker, et al. Terminology and forensic gait analysis. Sci Justice 2015; 55(4): 279–284. https://doi.org/10.1016/j.scijus.2015.03.002

30. di Biase, Di Santo, Caminiti, et al. Gait analysis in Parkinson's disease: an overview of the most accurate markers for diagnosis and symptoms monitoring. Sensors (Basel) 2020; 20(12). https://doi.org/10.3390/s20123529

31. Bassett Jr, Toth, LaMunion, et al. Step counting: a review of measurement considerations and health-related applications. Sports Med 2017; 47(7): 1307–1315. https://doi.org/10.1007/s40279-016-0663-1

32. Michelini, Eshraghi, Andrysek. Two-dimensional video gait analysis: a systematic review of reliability, validity, and best practice considerations. Prosthet Orthot Int 2020; 44(4): 245–262. https://doi.org/10.1177/0309364620921290