
Abstract

Background

Obesity is strongly associated with impaired physical function and increased health risks. Functional performance tests such as the Timed Up and Go (TUG), Five Times Sit-to-Stand (5xSTS), and 4-Meter Walk Test (4MWT) are essential for evaluating mobility, strength, and gait speed. Although widely used in clinical practice, the reliability of remote tele-assessment of these tests in adults with obesity remains unclear.

Objective

This study aimed to determine the inter- and intra-rater reliability of tele-assessment compared with face-to-face assessment for commonly used functional performance tests in adults with obesity.

Methods

A repeated-measures observational study was conducted with 82 adults with obesity. Participants performed TUG, 5xSTS, and 4MWT tests both in a clinical setting and remotely at home using video-based tele-assessment.

Results

Inter-rater reliability was good for TUG (ICC = 0.826) and 5xSTS (ICC = 0.880), and moderate for 4MWT (ICC = 0.743). Intra-rater reliability across two tele-assessments was excellent for TUG (ICC = 0.910) and 5xSTS (ICC = 0.902), and good for 4MWT (ICC = 0.858).

Conclusion

Tele-assessment provides a reliable alternative to face-to-face assessment of functional performance tests in adults with obesity. These findings support the integration of remote functional testing into digital health practice, expanding access to mobility and strength evaluations for populations with limited access to in-person care.

Clinical Trial Registration

Not applicable.

Keywords

Tele-assessment, obesity, functional performance tests, digital health, reliability

Introduction

Obesity, a growing global health epidemic, is associated with numerous metabolic, cardiovascular, and musculoskeletal complications that significantly impact physical function and quality of life.^1,2 These complications frequently lead to impairments in physical function such as reduced mobility, diminished lower limb strength, and slower gait speed.^3,4 As a result, individuals with obesity often experience limitations in daily activities, loss of independence, and an increased risk of disability.^5,6 Identifying and monitoring these functional limitations are essential for designing effective interventions and improving clinical outcomes.

Performance-based tests provide a simple and cost-effective way to assess mobility and strength in both clinical and research settings. Among the most commonly used are the Timed Up and Go (TUG),⁷ Five Times Sit-to-Stand (5xSTS),⁸ and 4-Meter Walk Test (4MWT).⁹ These tests are practical, require minimal equipment, and have been validated across diverse populations.^7,8,10 They are widely recommended in rehabilitation, geriatrics, and chronic disease management to detect functional decline and monitor progress.¹¹ They are also relevant for conditions such as sarcopenia.¹²

Traditionally, these assessments have been conducted in controlled clinical environments, ensuring standardized protocols and reliable measurements. However, the landscape of healthcare delivery is rapidly evolving. Advances in digital technology, combined with the increasing demand for remote healthcare solutions, have accelerated the adoption of telemedicine and tele-assessment approaches.¹³ Although the burden of obesity is well established, evidence on the effectiveness and reliability of remote functional performance assessment remains limited.¹⁴ Remote evaluations make it possible to assess patients at home, offering greater convenience while reducing travel and improving access for people unable to attend face-to-face visits.¹⁵

Therefore, the aim of this study was to examine the inter-rater and intra-rater reliability of tele-assessment of the TUG, 5×STS, and 4MWT in adults with obesity. We hypothesized that tele-assessment would demonstrate good to excellent agreement (ICC ≥ 0.75) with face-to-face assessment, which we considered an acceptable threshold for clinical use.

Materials and methods

Study design and participants

This study was designed as a repeated-measures observational study to examine the reliability of performance tests through tele-assessment in individuals with obesity. The evaluations were carried out in two different settings: (1) in-person assessments within a clinical setting and (2) remote evaluations utilizing tele-assessment methods.

Participants were eligible if they had a body mass index (BMI) greater than 30 kg/m², indicating obesity, and could walk independently without assistive devices such as canes or walkers. They also had to be willing to participate and have access to a mobile phone or similar device with video calling functionality for tele-assessment. Individuals with medical conditions affecting the neurological, cardiopulmonary, or vestibular systems that could impair physical performance or the reliability of the assessment results were excluded. This approach aimed to ensure a homogeneous sample and reduce the influence of confounding factors on the outcomes. A total of 90 individuals were screened for eligibility. Of these, 8 were excluded due to not meeting inclusion criteria, medical comorbidities that could interfere with safe test performance, declining to participate, or insufficient home space/technical conditions for tele-assessment. The final sample consisted of 82 adults with obesity. No participants were withdrawn or excluded due to difficulty performing the remote tests, safety concerns, or technical problems.

All processes involving human participants were conducted in strict accordance with the ethical standards established by both institutional and national research committees. These standards align with the principles set forth in the 1964 Helsinki Declaration and its subsequent amendments, as well as comparable ethical frameworks. The study protocol underwent thorough review and received formal approval from the Ethics Committee of Selçuk University Faculty of Health Sciences (Approval Report No: 2024/540). All participants provided written informed consent prior to their inclusion in the study. They were thoroughly informed about the purpose, procedures, potential risks, and benefits of the research, and their participation was entirely voluntary.

Outcome measures

Body weight and body composition were assessed using the TANITA BC 601 bioelectrical impedance analysis device (Tanita, Japan), a reliable tool for such evaluations. To ensure the precision and consistency of the measurements, participants were provided with comprehensive guidelines prior to the assessment. They were instructed to refrain from eating for at least four hours beforehand, avoid drinking any fluids (including water, tea, or coffee), urinate prior to the procedure, and abstain from engaging in intense physical activities for a minimum of 24 h leading up to the evaluation. Furthermore, participants were advised to remove any metallic items or accessories to prevent interference with the device's readings during the assessment process. These precautions were implemented to achieve the most accurate and reliable results possible.¹⁶ Appendicular Skeletal Muscle Mass (ASM) was calculated using this device, ensuring a precise and non-invasive evaluation of muscle mass distribution.¹²

Grip strength was measured using a digital hand dynamometer (CAMRY dynamometer) to evaluate upper limb muscle strength. Participants were seated in a chair with their back supported, feet flat on the floor, and their arm positioned at a 90-degree angle at the elbow without any external support. The dominant hand was tested. Participants were instructed to grip the dynamometer handle as firmly as possible for 3–5 s while avoiding any movement of the arm or shoulder during the test.¹⁷

The TUG test was administered to assess participants’ functional mobility and balance. Participants were instructed to sit on a standard chair with their back against the chair rest and their feet flat on the floor. At the examiner's signal, participants were required to stand up from the chair, walk a distance of 3 m at a comfortable and safe pace, turn around, walk back to the chair, and sit down. The total time taken to complete the task was recorded in seconds using a stopwatch.¹⁸

The 5xSTS was performed to evaluate lower limb strength and functional mobility. Participants were seated on a standard chair with a seat height of 43–45 cm, ensuring their back was against the backrest and their feet were flat on the floor, hip-width apart. Arms were crossed over the chest to prevent the use of hands for support during the test. At the examiner's signal, participants were instructed to stand up fully and sit back down as quickly and safely as possible five times consecutively. The total time taken to complete the five repetitions was recorded in seconds using a stopwatch. If participants failed to complete the task, required the use of their hands, or showed signs of instability, the test was discontinued. Each participant was given a practice trial to familiarize themselves with the procedure before the actual measurement.¹⁹

The 4MWT was conducted to evaluate gait speed and functional mobility. A 6-meter flat, hard, and obstacle-free walkway was identified for the test. The walkway included two 1-m acceleration and deceleration zones at the start and end. At the examiner's signal, participants were asked to walk the 4-meter central section of the path at their usual, comfortable walking pace. Timing began when the first foot crossed the starting line of the 4-meter segment and ended when the first foot crossed the finish line. The total time taken to complete the distance was recorded in seconds using a stopwatch.⁹
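The stopwatch time over the central 4-meter segment is reported later in this paper as a walking speed in m/s. The source does not spell out the conversion; the sketch below assumes the usual distance-over-time calculation. The function name and the example time are hypothetical and are not study data.

```python
def gait_speed(distance_m: float, time_s: float) -> float:
    """Return average walking speed in m/s over a timed walkway segment."""
    if time_s <= 0:
        raise ValueError("time_s must be positive")
    return distance_m / time_s


# Hypothetical stopwatch reading for the 4-meter segment (not from the study).
speed = gait_speed(distance_m=4.0, time_s=3.2)
print(f"{speed:.2f} m/s")  # prints "1.25 m/s"
```

Only the central 4 m enters the calculation; the two 1-m acceleration and deceleration zones are excluded from both distance and time.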

Evaluation process

The testing procedures for this study, including the sequence and methodology, are depicted in Figure 1. A uniform protocol was applied to all participants to ensure consistency in data collection and analysis. The evaluations commenced with face-to-face assessments conducted in a controlled clinical environment. During these initial assessments, participants’ clinical and demographic data were recorded, followed by a comprehensive body composition analysis using bioelectrical impedance analysis. This analysis provided accurate measurements of fat mass, lean body mass, and ASM. Subsequently, participants underwent grip strength testing to evaluate upper limb muscle strength.

Figure 1.

The testing process for measurements.

All participants underwent three consecutive assessment sessions following a standardized protocol:

(1) an initial face-to-face assessment,

(2) a first remote (tele-assessment) session, and

(3) a second remote (tele-assessment) session.

To ensure consistency, the timing between sessions was standardized so that the first tele-assessment took place approximately 24 h after the face-to-face evaluation, and the second tele-assessment was conducted approximately 24 h after the first tele-assessment. All assessments were conducted by the same researcher.

During the face-to-face session, the tests were administered in the following fixed sequence: TUG, 5xSTS, and 4MWT. The same test order was maintained in both tele-assessment sessions to ensure methodological consistency and participant safety. No walking aids were permitted during the tests, and participants were allowed to rest for at least 5 min between each test, or longer if needed.

The face-to-face assessments were conducted in the presence of the researcher, the participant, and a relative. Prior to testing, all procedures were explained in detail to both the participant and the relative. Fatigue and dyspnea were assessed before and after each test using the Borg scale, and hemodynamic parameters (heart rate and SpO₂) were measured with a portable pulse oximeter (Beurer pulse oximeter 40, Ulm, Germany). Each participant received a pulse oximeter for use during the remote sessions.

In the tele-assessment sessions, the participant and a relative performed the tests at home while one researcher monitored the entire procedure remotely via the WhatsApp application. The relative assisted with video recording, and the remotely monitoring researcher recorded all performance times simultaneously. Fatigue, dyspnea, and hemodynamic parameters were again collected before and after each test using the same procedures as in the face-to-face session.

For home-based testing, each participant was provided with a standard 5-meter measuring tape prior to the tele-assessment session. The required 4-meter distance for the 4MWT was measured by the participant or a relative under real-time remote supervision by the researcher. Standardization was ensured by confirming the full extension of the tape on a straight, unobstructed surface and verifying the visibility of both the start and end points. For the TUG test, participants were instructed to use a stable chair with armrests placed on a flat, non-slippery surface. The evaluator verified chair height, turning point location, and the availability of a straight walking path through the camera. For the 5×STS test, chair stability and height were confirmed, and camera positioning was adjusted to obtain a clear lateral view of the participant's movements. Tele-assessments were conducted using the WhatsApp video-calling platform. Participants used smartphones with a minimum video resolution of 720p and internet speeds of at least 10 Mbps. At the beginning of each session, the evaluator confirmed adequate lighting, image clarity, and full visibility of the participant's movement. Camera angles were standardized by instructing relatives to position the device perpendicular to the walking path for the TUG and 4MWT, and laterally at chair height for the 5×STS. No assessments were interrupted due to video quality or connectivity issues.

Statistical analysis

Statistical analyses were performed using IBM SPSS Statistics for Windows, Version 25.0 (IBM Corporation, Armonk, NY, USA). Figure 1 was visualized using the BioRender platform,²⁰ while Figures 2 and 3, featuring Bland–Altman plots,²¹ were generated using Microsoft Excel from the Microsoft 365 suite (Microsoft Corporation, Redmond, WA, USA). The normality of the data was assessed through the Shapiro–Wilk test, supplemented by visual inspections of histograms to verify distribution patterns. Inter-rater reliability between face-to-face and tele-assessment was calculated using a two-way random-effects model with absolute agreement (ICC (2,1)). Intra-rater reliability between the two tele-assessment sessions was evaluated using a two-way mixed-effects model with absolute agreement (ICC (3,1)). The ICC values were interpreted based on established thresholds, with scores between 0.50 and 0.74 indicating moderate reliability, scores between 0.75 and 0.89 suggesting good reliability, and scores of 0.90 or higher reflecting excellent reliability.²²
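The analyses in this study were run in SPSS. As an illustration only, a single-measure, absolute-agreement ICC can be sketched from the two-way ANOVA mean squares as below. The sketch relies on the standard result that the two-way random and two-way mixed models share the same point-estimate formula under absolute agreement and differ in interpretation. The function names are hypothetical, and the "poor" label for values below 0.50 is an assumption taken from common guidelines, since the text above states only the moderate, good, and excellent bands.

```python
import numpy as np


def icc_absolute_single(scores):
    """Single-measure, absolute-agreement ICC from two-way ANOVA mean squares.

    scores: array-like of shape (n_subjects, k_measurements), e.g. one column
    for the face-to-face session and one for the tele-assessment session.
    """
    x = np.asarray(scores, dtype=float)
    n, k = x.shape
    grand = x.mean()
    # Sums of squares for subjects (rows), sessions/raters (columns), residual.
    ss_rows = k * np.sum((x.mean(axis=1) - grand) ** 2)
    ss_cols = n * np.sum((x.mean(axis=0) - grand) ** 2)
    ss_err = np.sum((x - grand) ** 2) - ss_rows - ss_cols
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def interpret_icc(icc):
    """Map an ICC to the reliability bands used in this paper."""
    if icc >= 0.90:
        return "excellent"
    if icc >= 0.75:
        return "good"
    if icc >= 0.50:
        return "moderate"
    return "poor"  # assumption: this band is not stated in the text above


# Hypothetical times in seconds (not study data): rows are participants,
# columns are two assessment sessions.
example = [[7.1, 7.3], [8.4, 8.0], [6.5, 6.6], [9.2, 9.5], [7.8, 7.6]]
print(round(icc_absolute_single(example), 3))
print(interpret_icc(0.826), interpret_icc(0.743), interpret_icc(0.910))
```

Because absolute agreement penalizes systematic shifts between sessions through the column mean square, a consistent offset between face-to-face and remote timing would lower this estimate even if participants kept the same rank order.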

Figure 2.

Inter-rater reliability between face-to-face and tele-assessment.

Figure 3.

Intra-rater reliability between two tele-assessments.

Sample size

The required sample size for this reliability study was determined a priori using established methodological principles for ICC-based agreement analyses.²³ In reliability research, the expected ICC value is central to defining the number of participants needed to obtain stable and precise estimates. Interpretation guidelines commonly distinguish between levels of agreement, and the transition point at which reliability becomes clinically acceptable is generally considered to occur around an ICC of 0.75.²² For this reason, and to adopt a conservative yet meaningful expectation for the present study, the anticipated ICC was set at 0.75.

To translate this expected reliability level into a required sample size, methodological recommendations for ICC-based studies were followed. For designs involving two raters/measurements, with a Type I error (α) of 0.05 and Type II error (β) of 0.20 (power = 0.80), these recommendations indicate that approximately 75 participants are needed to estimate an ICC of 0.75 with adequate precision.

Results

Participant characteristics

The study included 82 participants with obesity with a mean age of 43.93 years. Of these, 54 participants (65.9%) were women. The mean BMI of the group was 35.93 kg/m². The mean ASM was 26.49 kg, and the ASM/height² was 9.69 kg/m². Further details about the participants’ characteristics are provided in Table 1.

Table 1.

Participants’ characteristics.

n = 82	Mean ± SD or n (%)
Age (years)	43.93 ± 13.28
Gender, female, n (%)	54 (65.9)
Height (cm)	164.54 ± 9.32
Weight (kg)	97.27 ± 16.02
Body mass index (kg/m²)	35.93 ± 5.43
Obesity Classification n (%)
Class I	44 (53.7)
Class II	23 (28.0)
Class III	15 (18.3)
ASM (kg)	26.49 ± 6.71
ASM/height² (kg/m²)	9.69 ± 1.83
Fat mass (%)	38.38 ± 7.63
Grip strength (kg)	32.74 ± 11.06
Comorbidities, n (%)
Diabetes mellitus	35 (42.7)
Hypertension	29 (35.4)
Thyroid disease	21 (25.6)
Chronic pulmonary diseases	16 (19.5)
Cardiac failure	12 (14.6)

Values are expressed as mean ± standard deviation for continuous variables and as n (%) for categorical variables.

ASM: appendicular skeletal muscle mass.
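For readers who want to reproduce the BMI and obesity class entries in Table 1, a minimal sketch follows. The class boundaries are an assumption based on the conventional WHO cut-offs (Class I: 30 to below 35, Class II: 35 to below 40, Class III: 40 or higher), which the source does not state explicitly. The function names are hypothetical.

```python
def bmi(weight_kg: float, height_cm: float) -> float:
    """Body mass index in kg/m^2 from weight in kg and height in cm."""
    height_m = height_cm / 100.0
    return weight_kg / height_m ** 2


def obesity_class(bmi_value: float) -> str:
    """Classify BMI using conventional WHO cut-offs (assumed, not from the source)."""
    if bmi_value >= 40:
        return "Class III"
    if bmi_value >= 35:
        return "Class II"
    if bmi_value >= 30:
        return "Class I"
    return "not obese"


# Illustration using the sample mean weight and height from Table 1.
value = bmi(weight_kg=97.27, height_cm=164.54)
print(round(value, 2), obesity_class(value))
```

The mean of individual BMI values need not equal the BMI computed from mean weight and mean height, so this illustration is not a recalculation of the reported group mean.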

Functional performance test results

The outcomes of the functional performance tests, including the TUG, 5xSTS, and 4MWT were assessed in both face-to-face and tele-assessment conditions, as well as a tele-assessment retest. The mean TUG times were 7.44 ± 1.13 s (face-to-face), 7.35 ± 1.04 s (tele-assessment), and 7.39 ± 1.18 s (retest). The 5xSTS test yielded results of 9.73 ± 1.65 s (face-to-face), 9.66 ± 1.61 s (tele-assessment), and 9.57 ± 1.76 s (retest). The 4MWT showed walking speeds of 1.27 ± 0.17 m/s (face-to-face), 1.32 ± 0.19 m/s (tele-assessment), and 1.32 ± 0.19 m/s (retest), as presented in Table 2.

Table 2.

Results of all the tests across assessment conditions.

	Face-to-face assessment	Tele-assessment	Retest tele-assessment
TUG (s)	7.44 ± 1.13	7.35 ± 1.04	7.39 ± 1.18
5xSTS (s)	9.73 ± 1.65	9.66 ± 1.61	9.57 ± 1.76
4MWT (m/s)	1.27 ± 0.17	1.32 ± 0.19	1.32 ± 0.19

Data are presented as mean ± standard deviation.

TUG: Timed Up and Go Test; 5xSTS: Five Times Sit-to-Stand Test; 4MWT: 4-Meter Walk Test.

Reliability

The reliability analysis demonstrated good inter-rater reliability between face-to-face and tele-assessment methods for the TUG (ICC = 0.826; 95% CI: 0.743–0.884) and the 5xSTS (ICC = 0.880; 95% CI: 0.820–0.921), while the 4MWT showed moderate inter-rater reliability (ICC = 0.743; 95% CI: 0.597–0.836) (Table 3, Figure 2). Intra-rater reliability between two tele-assessments was excellent for the TUG (ICC = 0.910; 95% CI: 0.863–0.941) and the 5xSTS (ICC = 0.902; 95% CI: 0.852–0.936), and good for the 4MWT (ICC = 0.858; 95% CI: 0.789–0.906) (Table 3, Figure 3).

Table 3.

Reliability results of the tests.

	Inter-rater reliability between face-to-face and tele-assessment
	ICC (95%CI)	SEM	SEM95%	SDC95%
TUG (s)	0.826 (0.743–0.884)	0.27	0.52	0.74
5xSTS (s)	0.880 (0.820–0.921)	0.28	0.55	0.77
4MWT (m/s)	0.743 (0.597–0.836)	0.06	0.12	0.17

	Intra-rater reliability between two tele-assessments
	ICC (95%CI)	SEM	SEM95%	SDC95%
TUG (s)	0.910 (0.863–0.941)	0.14	0.28	0.40
5xSTS (s)	0.902 (0.852–0.936)	0.23	0.46	0.65
4MWT (m/s)	0.858 (0.789–0.906)	0.04	0.08	0.11

ICC: intraclass correlation coefficient; CI: confidence interval; SEM: standard error of measurement; SDC: smallest detectable change; TUG: Timed Up and Go Test; 5xSTS: Five Times Sit-to-Stand Test; 4MWT: 4-Meter Walk Test.
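Table 3 does not state how the SEM95% and SDC95% columns were derived. The tabulated values are consistent, up to rounding, with the conventional relations SEM95% = 1.96 × SEM and SDC95% = 1.96 × √2 × SEM. The sketch below assumes those relations; the function names are hypothetical.

```python
import math


def sem95(sem: float) -> float:
    """95% confidence half-width around a single measurement (assumed relation)."""
    return 1.96 * sem


def sdc95(sem: float) -> float:
    """Smallest detectable change at 95% confidence (assumed relation).

    The sqrt(2) factor accounts for measurement error in both of the two
    measurements being compared.
    """
    return 1.96 * math.sqrt(2) * sem


# SEM values as rounded in Table 3 (inter-rater rows).
for test_name, sem in [("TUG", 0.27), ("5xSTS", 0.28), ("4MWT", 0.06)]:
    print(test_name, round(sem95(sem), 2), round(sdc95(sem), 2))
```

Small discrepancies against the table in the second decimal place are expected, because the inputs here are the rounded SEM values rather than the unrounded ones used in the original analysis.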

The Bland–Altman analyses demonstrated no systematic bias across the functional performance tests. Mean differences between face-to-face and tele-assessment values were close to zero for all three tests, and the distribution of points showed no directional trend across the measurement range. For inter-rater comparisons, more than 95% of the observations fell within the limits of agreement, indicating acceptable agreement between methods. Similarly, for intra-rater comparisons, the plots showed narrow limits of agreement with the vast majority of data points remaining within these boundaries, reflecting high consistency between the two tele-assessment sessions.
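The quantities behind a Bland–Altman plot can be sketched as follows, assuming the conventional definition of the limits of agreement as the mean difference ± 1.96 standard deviations of the paired differences. The plots in this study were produced in Microsoft Excel; the function name and the example values below are hypothetical.

```python
import numpy as np


def bland_altman(a, b):
    """Return bias, lower and upper limits of agreement, and share within limits.

    a, b: paired measurements of the same participants under two conditions,
    e.g. face-to-face and tele-assessment times.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    diff = a - b
    bias = diff.mean()                # systematic difference between conditions
    sd = diff.std(ddof=1)             # sample SD of the paired differences
    lower = bias - 1.96 * sd
    upper = bias + 1.96 * sd
    within = float(np.mean((diff >= lower) & (diff <= upper)))
    return bias, lower, upper, within


# Hypothetical TUG times in seconds (not study data).
face_to_face = [7.1, 8.4, 6.5, 9.2, 7.8, 6.9]
tele = [7.3, 8.0, 6.6, 9.5, 7.6, 7.0]
bias, lower, upper, within = bland_altman(face_to_face, tele)
print(f"bias={bias:.2f}, limits=({lower:.2f}, {upper:.2f}), within={within:.0%}")
```

A bias close to zero with no trend of the differences across the measurement range corresponds to the absence of systematic bias described above.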

Discussion

To our knowledge, no previous studies have evaluated the reliability of tele-assessment for the TUG, 5×STS, and 4MWT specifically in adults with obesity. The present study therefore addresses an important gap by examining the remote administration of these widely used functional performance tests in this population. The findings demonstrated good inter-rater reliability for TUG and 5xSTS, moderate inter-rater reliability for 4MWT, and good-to-excellent intra-rater reliability across repeated tele-assessments. These results suggest that remote administration of performance-based mobility and strength tests can yield reproducible data comparable to traditional face-to-face assessments.

A key strength of this study is its well-structured methodology, which ensured the application of identical protocols for both face-to-face and remote assessments, enabling robust and consistent comparative analyses. By investigating the feasibility of tele-assessment for functional performance tests in individuals with obesity, the study addresses a significant knowledge gap in the field. The use of well-established and validated tests, such as the TUG, 5xSTS, and 4MWT, enhances the clinical relevance and reliability of the findings. These tests are widely accepted in both geriatrics and rehabilitation for their practicality and validity. By emphasizing the accessibility and practicality of tele-assessment, this study highlights its transformative potential for improving healthcare delivery in both clinical and remote settings.²⁴

Functional performance tests such as the TUG, 5xSTS, and 4MWT are well-established tools for assessing mobility, lower extremity strength, and gait speed, which are critical for diagnosing and monitoring sarcopenia.²⁵ These tests have traditionally been conducted in clinical settings, where environmental controls and standardized procedures ensure reliable results.^9,18,26 However, the increasing prevalence of telemedicine highlights the need to validate these assessments for remote use, particularly for populations with limited access to healthcare facilities.¹⁴

This study demonstrated that tele-assessment methods for TUG, 5xSTS, and 4MWT are reliable and consistent with face-to-face evaluations in individuals with obesity. Similar findings have been reported in previous studies exploring telemedicine applications for functional assessments in older adults and individuals with mobility challenges. For example, studies have shown that remote evaluations of walking speed and balance are feasible and reliable when conducted via video conferencing platforms.^27,28 These findings align with the present study's results, reinforcing the potential of tele-assessment to provide accurate and clinically meaningful data.

When the present findings are compared with previous tele-assessment research, a clear pattern emerges across different clinical populations.^27–29 For inter-rater reliability, adults with obesity demonstrated good reliability for TUG (ICC = 0.826) and 5×STS (ICC = 0.880), which is slightly lower than the excellent inter-rater values reported in patients with non-specific chronic low back pain (ICC = 0.900–0.966),²⁹ but comparable to or higher than those observed in older adults,²⁷ who typically show moderate-to-good ICCs. Inter-rater reliability for the 4MWT was moderate (ICC = 0.743), consistent with previous reports that gait speed tends to show greater variability in remote environments.²⁷ In terms of intra-rater reliability, our tele-assessment results (TUG = 0.910; 5×STS = 0.902; 4MWT = 0.858) align closely with the excellent stability observed in COPD cohorts (ICC = 0.958–0.979 for TUG and 5×STS)²⁸ and are similar to values reported in low back pain populations (ICC = 0.958–0.979).²⁹ These comparisons indicate that remote performance on TUG, 5×STS, and 4MWT in adults with obesity is highly reproducible and falls within the reliability range reported across different musculoskeletal, respiratory and geriatric populations. Collectively, the evidence suggests that functional performance tests retain robust measurement properties when administered remotely, and that tele-assessment is a reliable option for evaluating mobility, lower-limb function, and gait speed in adults with obesity.

In our study, intra-rater reliability values were consistently higher than inter-rater reliability, which is an expected finding in tele-assessment methodology. Repeated tele-assessments conducted by the same evaluator benefit from highly similar testing conditions, including consistent camera positioning, environmental setup, verbal cueing, and scoring interpretation. These factors minimize measurement variability and naturally improve intra-rater agreement. In contrast, inter-rater reliability required comparing face-to-face and tele-assessment environments, which differ in spatial configuration, visual perspective, lighting, supervision, and cueing. Such contextual variability typically reduces agreement between modes. Despite these inherent differences, the inter-rater reliability in our study remained good for TUG and 5×STS, and moderate for 4MWT, indicating that tele-assessment provides clinically acceptable reliability even when compared directly with in-person evaluation.

When interpreting the absolute reliability indices in the context of prior tele-assessment research, the SEM and SDC95% values obtained in adults with obesity were generally within clinically acceptable ranges and showed meaningful differences compared with other populations. For inter-rater reliability (face-to-face vs. tele-assessment), the SDC95% values in our study were 0.74 s for TUG, 0.77 s for 5×STS, and 0.17 m/s for 4MWT. These values were substantially lower than those reported in older adults (SDC95% = 1.12 s for TUG, 1.97 s for 5×STS, and 0.19 m/s for 4MWT),²⁷ suggesting that adults with obesity may exhibit less measurement variability during tele-assessment than frail older individuals. In contrast, compared with individuals with COPD, our inter-rater values were higher, as previous COPD studies reported SDC95% values of 0.23 s for TUG and 0.31 s for 5×STS under more standardized assessment conditions.²⁸ Notably, the 4MWT SDC95% in our sample (0.17 m/s) was comparable to that observed in older adults (0.19 m/s), reinforcing the stability of gait speed measurements across different clinical groups.

For intra-rater reliability (tele-assessment repeated by the same evaluator), measurement error decreased further, with SDC95% values of 0.40 s for TUG, 0.65 s for 5×STS, and 0.11 m/s for 4MWT. Relative to the values reported in older adults (TUG = 0.43 s; 5×STS = 1.08 s; 4MWT = 0.06 m/s),²⁷ our values were similar for the TUG, higher for the 4MWT, and notably lower for the 5×STS, suggesting that younger adult populations may demonstrate more consistent sit-to-stand performance. In contrast, COPD cohorts demonstrated even lower SDC thresholds (TUG = 0.08–0.10 s; 5×STS = 0.18–0.22 s), likely reflecting the more controlled home environments and consistent motor patterns present in respiratory populations.²⁸ Overall, these comparisons indicate that tele-assessment provides sufficiently low measurement error to detect true changes in mobility, lower-limb functional performance, and gait speed in adults with obesity. These findings support the use of remote assessments to monitor changes in functional performance over time in adults with obesity. Although minimal clinically important difference (MCID) values for TUG, 5xSTS and gait speed have been reported in other clinical populations, obesity-specific MCID thresholds have not yet been established. Therefore, we did not apply existing MCID values directly to our sample, and clinical interpretation relied on SEM and SDC. The absence of validated MCID values for adults with obesity limits the ability to fully determine the clinical relevance of observed differences, and future research is needed to define obesity-specific MCID for these tests.

In addition to demonstrating strong inter- and intra-rater reliability, the findings of this study carry important practical implications for digital health applications. Tele-assessment can support the broader use of telemedicine and telerehabilitation by enabling remote functional evaluation without the need for in-person visits. This is particularly relevant for individuals living in rural or underserved regions, or for situations in which access to healthcare facilities is restricted, such as during pandemics or mobility limitations. Remote assessments may also reduce operational costs and increase organizational flexibility by minimizing travel, streamlining workflow, and facilitating follow-up monitoring. Within the wider e-health and m-health framework, the integration of tele-assessment has the potential to complement obesity management strategies by improving access to functional evaluation and supporting continuous care through digital platforms.

The high prevalence of comorbidities such as diabetes and hypertension in our sample reflects typical clinical profiles of adults with obesity, but may limit generalizability to individuals with fewer health conditions. Nonetheless, all participants were medically stable, and none of the comorbidities directly affected the ability to perform the functional tests.

This study has several limitations that should be considered when interpreting the findings. First, although the sample included only adults with obesity, the distribution across obesity classes (Class I: 53.7%, Class II: 28.0%, Class III: 18.3%) was not sufficiently balanced to permit meaningful subgroup comparisons. As a result, potential differences in tele-assessment reliability across obesity severity levels, age groups, or comorbidity profiles could not be examined, which limits the generalizability of the findings across the full spectrum of obesity.

Second, tele-assessments were conducted in participants’ homes to replicate real-world conditions; however, environmental factors such as room size, lighting, camera placement, and potential distractions could not be fully standardized. Although no major safety issues, connectivity problems, or assessment interruptions occurred, minor variations in home environments may have contributed to measurement variability.

Third, the study relied on the “WhatsApp” platform for remote assessments. While this application is widely available, easy to use, and highly accessible, it lacks the precision, standardized camera settings, and advanced measurement features offered by dedicated telemedicine platforms. These technological limitations may have influenced the accuracy of timing and visual observation.

Fourth, all assessments were performed by the same researcher. Although this approach reduced between-rater variability, it made complete blinding to previous results impossible during repeated tele-assessments. This may introduce a small risk of measurement bias; however, a standardized protocol, identical test sequence, and strict instructions were implemented to minimize this effect.

Fifth, although SEM and SDC values were calculated to support the interpretation of measurement precision, obesity-specific minimal clinically important difference (MCID) thresholds for the TUG, 5xSTS, and 4MWT tests have not yet been established. The absence of validated MCID values for this population limits the ability to determine the clinical significance of observed changes.

Finally, while both inter-rater (face-to-face vs. tele-assessment) and intra-rater (two tele-assessments) reliability were analysed, the potential impact of different raters, repeated instruction styles, or variations in participant compliance between assessment settings was not evaluated. Future studies should consider using standardized telehealth systems, including multiple raters, and stratifying participants by obesity class to better understand how individual, environmental, and technological factors influence tele-assessment reliability.

It is also important to consider how these limitations may have influenced the results. The lack of stratification by obesity class may have masked differences in reliability across obesity severity, leading to either under- or overestimation of agreement. Variability in home environments, such as differences in space, lighting, or camera positioning, may have introduced additional noise that slightly reduced reliability estimates. Similarly, the use of WhatsApp, while practical, may have contributed to minor timing or visibility inaccuracies, which would likely bias the ICC values downward rather than upward. Conversely, because all assessments were conducted by a single, non-blinded evaluator, intra-rater reliability may have been modestly inflated due to familiarity with previous measurements. Finally, the absence of obesity-specific MCID thresholds limits the precision with which clinical relevance can be interpreted. These factors should be considered when applying the findings to broader clinical or telehealth contexts.

Taken together, these findings suggest that tele-assessment can be integrated into clinical practice as a reliable alternative when in-person evaluations are not feasible. Remote administration of TUG, 5xSTS, and 4MWT can support ongoing monitoring of mobility, lower-limb strength, and gait speed in adults with obesity, particularly for individuals with limited access to healthcare facilities or those requiring frequent follow-up assessments. The low measurement error observed in this study indicates that clinicians can confidently interpret changes across repeated remote evaluations. Tele-assessment may also offer practical advantages related to cost and accessibility. Remote evaluations eliminate travel time, reduce clinic resource use, and may lower overall healthcare costs, although formal economic analyses are needed to confirm these benefits. While patient satisfaction and preference were not evaluated in this study, prior telehealth research suggests that remote assessments are generally well tolerated and may improve engagement among individuals facing mobility or transportation barriers. Future studies should directly examine patient experience and cost-effectiveness to better guide implementation.

Conclusion

This study demonstrates that functional performance tests can be reliably administered via tele-assessment in adults with obesity. Remote evaluations provided consistent and reproducible results in both inter-rater and intra-rater comparisons, yielding a level of reliability comparable to that of face-to-face assessments. These findings suggest that tele-assessment may serve as a practical option to support access to functional evaluation in appropriate clinical settings.

Footnotes

Acknowledgments

We would like to thank all the participants for their time and effort.

ORCID iD

Gülşah Özsoy

Ethics approval and consent to participate

This study was approved by the Ethics Committee of Selçuk University Faculty of Health Sciences (Approval Report No: 2024/540).

Consent to participate

Written informed consent was obtained from all participants prior to data collection.

Contributorship

Gülşah Özsoy: conceptualization, methodology, formal analysis, visualization, writing—original draft, supervision. Yasemin Gedikli: investigation, data curation, project administration. Mehmet Kaan Altunok: investigation, data curation, formal analysis, visualization. Beyza Arı Gedik: investigation, writing—review and editing, interpretation of results. Nurel Ertürk: methodology, writing—review and editing, validation. İsmail Özsoy: conceptualization, writing—review and editing, supervision. All authors contributed to the writing process, reviewed the final manuscript, and approved it for submission.

Funding

The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study was supported by the Selçuk University Scientific Research Projects Coordination Unit (Project No: 24401112).

Declaration of conflicting interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Data availability

The data that support the findings of this study are available from the corresponding authors with a signed data access agreement.

Availability of data and materials

The datasets generated and/or analysed during the current study are available from the corresponding author on reasonable request.

References

1. Safaei, Sundararajan, Driss, et al. A systematic literature review on obesity: understanding the causes & consequences of obesity and reviewing various machine learning approaches used to predict obesity. Comput Biol Med 2021; 136: 104754. 2021/08/25.

2. Bout-Tabaku, Michalsky, Jenkins, et al. Musculoskeletal pain, self-reported physical function, and quality of life in the teen-longitudinal assessment of bariatric surgery (teen-LABS) cohort. JAMA Pediatr 2015; 169: 552–559. 2015/04/29.

3. Menoth Mohan, Al Anouti, Kohli, et al. Association of obesity with musculoskeletal health and functional mobility in females-a systematic review. Int J Obes (Lond) 2025; 49: 2184–2205. 2025/09/19.

4. Stenholm, Alley, Bandinelli, et al. The effect of obesity combined with low muscle strength on decline in mobility in older persons: results from the InCHIANTI study. Int J Obes (Lond) 2009; 33: 635–644. 2009/04/22.

5. Anderson, Wiener, Khatutsky, et al. Obesity and people with disabilities: the implications for health care expenditures. Obesity (Silver Spring) 2013; 21: E798–E804. 2013/06/28.

6. Bell, Sabia, Singh-Manoux, et al. Healthy obesity and risk of accelerated functional decline and disability. Int J Obes (Lond) 2017; 41: 866–872. 2017/02/22.

7. Podsiadlo, Richardson. The timed “up & go”: a test of basic functional mobility for frail elderly persons. J Am Geriatr Soc 1991; 39: 142–148. 1991/02/01.

8. Muñoz-Bermejo, Adsuar, Mendoza-Muñoz, et al. Test-retest reliability of Five Times Sit to Stand Test (FTSST) in adults: a systematic review and meta-analysis. Biology (Basel) 2021; 10: 510. 2021/07/03.

9. Maggio, Ceda, Ticinesi, et al. Instrumental and non-instrumental evaluation of 4-meter walking speed in older individuals. PLoS One 2016; 11: e0153583. 2016/04/15.

10. Bohannon, Wang. Four-meter gait speed: normative values and reliability determined for adults participating in the NIH toolbox study. Arch Phys Med Rehabil 2019; 100: 509–513. 2018/08/10.

11. Jahan. Insight into functional decline assessment in older adults: a physiotherapist's perspective. Archives of Gerontology and Geriatrics Plus 2024; 1: 100048.

12. Cruz-Jentoft, Bahat, Bauer, et al. Sarcopenia: revised European consensus on definition and diagnosis. Age Ageing 2019; 48: 16–31. 2018/10/13.

13. Haleem, Javaid, Singh, et al. Telemedicine for healthcare: capabilities, features, barriers, and applications. Sens Int 2021; 2: 100117. 2021/11/23.

14. Kahan, Look, Fitch. The benefit of telemedicine in obesity care. Obesity (Silver Spring) 2022; 30: 577–586. 2022/02/24.

15. Ghazal, Singh Beniwal, Dhingra. Assessing telehealth in palliative care: a systematic review of the effectiveness and challenges in rural and underserved areas. Cureus 2024; 16: e68275. 2024/10/01.

16. Greco, Tarsitano, Cosco, et al. The effects of online home-based pilates combined with diet on body composition in women affected by obesity: a preliminary study. Nutrients 2024; 16: 902. 2024/03/28.

17. Roberts, Denison, Martin, et al. A review of the measurement of grip strength in clinical and epidemiological studies: towards a standardised approach. Age Ageing 2011; 40: 423–429. 2011/06/01.

18. Podsiadlo, Richardson. The timed “up & go”: a test of basic functional mobility for frail elderly persons. J Am Geriatr Soc 1991; 39: 142–148.

19. Kowall. Lower body muscle strength, dynapenic obesity and risk of type 2 diabetes–longitudinal results on the chair-stand test from the survey of health, ageing and retirement in Europe (SHARE). BMC Geriatr 2022; 22: 24.

20. Perkel. The software that powers scientific illustration. Nature 2020; 582: 137–138. 2020/05/10.

21. Giavarina. Understanding Bland Altman analysis. Biochem Med (Zagreb) 2015; 25: 141–151. 2015/06/26.

22. Koo. A guideline of selecting and reporting intraclass correlation coefficients for reliability research. J Chiropr Med 2016; 15: 155–163. 2016/06/23.

23. Temel, Erdogan. Determining the sample size in agreement studies. Marmara Medical Journal 2017; 30: 101–112.

24. Steindal, Nes AAG, Godskesen, et al. Advantages and challenges of using telehealth for home-based palliative care: systematic mixed studies review. J Med Internet Res 2023; 25: e43684. 2023/03/14.

25. Donini, Busetto, Bischoff, et al. Definition and diagnostic criteria for sarcopenic obesity: ESPEN and EASO consensus statement. Obes Facts 2022; 15: 321–335. 2022/02/24.

26. Alcazar, Losa-Reyna, Rodriguez-Lopez, et al. The sit-to-stand muscle power test: an easy, inexpensive and portable procedure to assess muscle power in older people. Exp Gerontol 2018; 112: 38–43. 2018/09/05.

27. Ozsoy, Aksoy. Intra- and inter-rater reliability of the face-to-face assessment and tele-assessment of performance-based tests in older adults. Eur Geriatr Med 2024; 15: 601–607. 2024/02/22.

28. Ozsoy, Kodak, Kararti, et al. Intra- and inter-rater reproducibility of the face-to-face and tele-assessment of timed-up and go and 5-times sit-to-stand tests in patients with chronic obstructive pulmonary disease. COPD 2022; 19: 125–132. 2022/04/07.

29. Ozsoy. Reliability of tele-assessment of five repetition sit to stand and timed up and go tests in patients with non-specific chronic low back pain. Discover Health Systems 2024; 3: 34.
