Abstract
Introduction: Muscular strength and physical function are commonly assessed in exercise oncology trials, yet muscular power is often overlooked despite being an important determinant of morbidity and mortality. The sit-to-stand power (STSp) test offers a rapid, low-cost method to evaluate lower-body power. The purpose of this study was to assess the test-retest reliability, measurement error, and minimal detectable change (MDC) of the STSp test assessed using a linear position transducer in individuals treated for cancer. Methods: Adults with a history of cancer completed the STSp test on 2 occasions (2-10 days apart) to evaluate test–retest reliability using intraclass correlation coefficients (ICC(3,1)). Standard error of measurement (SEM) and MDC were calculated. Results: One hundred and three individuals treated for cancer (88.3% female; mean age = 60.2 ± 10.4 years) completed the test–retest assessment. The sample was racially diverse, with 51% identifying as White and 41% as Black. Breast cancer was the most common diagnosis (68.9%). Disease stage was primarily early stage, with 45.6% classified as stage I and 22.3% as stage II. Participants had undergone surgery (95.2%), radiation (63.1%), and chemotherapy (60.2%). The STSp test demonstrated good reliability for peak power (ICC = 0.87; 95% CI: 0.80-0.91) and average power (ICC = 0.86; 95% CI: 0.79-0.90). The peak power SEM and MDC were 199.9 W and 554.10 W, respectively. The average power SEM and MDC were 179.05 W and 496.30 W, respectively. Conclusion: The STSp test demonstrated good test–retest reliability for assessing lower-body muscular power in individuals treated for cancer. Given its low cost and feasibility, the STSp test may be useful for clinicians seeking to monitor changes in lower extremity power in oncological settings. However, the relatively large SEM and MDC values indicate substantial within-subject variability, underscoring the need for further protocol standardization.
Clinical trial registration: ClinicalTrials.gov, NCT06039488
Introduction
It is estimated that approximately 2 million individuals were diagnosed with cancer in the United States in 2024. 1 Advances in early detection, diagnosis, and treatment have significantly improved survival rates, with over 21 million individuals projected to be living with a history of cancer by 2030. 2 Unfortunately, despite the effectiveness of cancer treatments, many are accompanied by burdensome off-target effects that reduce physical and psychological well-being.3-6 Specifically, cancer treatments can contribute to declines in muscular strength, muscle mass, and physical function.6-9 These off-target side effects, coupled with inactivity, accelerate the decline in physical health and increase the risks of morbidity and mortality.10,11 As the population of individuals living with or beyond cancer grows, it is essential to implement early screening and monitoring approaches to identify functional decline and assess risk of morbidity and mortality.12,13
Muscular power (the product of force and velocity) has become increasingly recognized as an important predictor of functional ability in older adults and other clinical populations, as well as trained athletes.14-17 Compared to muscular strength, muscular power has been shown to decline more rapidly with aging,18,19 and has been associated with lower quality of life, 20 reduced gait speed, 21 and increased fall risk.22,23 Moreover, muscle power has been demonstrated to be one of the strongest predictors of, and contributors to, mobility limitations with aging.24-26 Importantly, muscle power has been demonstrated to be a stronger predictor of mortality than muscle strength in older adults. 27 Accordingly, systematic evaluation of muscle power in aging and clinical populations may facilitate early detection of functional impairment and guide the development of targeted interventions.18,28 Given the accelerated decline in physical function that can occur with chemotherapy, radiotherapy, surgery, and endocrine therapy,29,30 individuals treated for cancer may experience reductions in muscular power that precede overt losses in strength or functional independence; however, this domain of physical function is not routinely assessed in clinical oncology settings.8,10,11,29,31,32 As a result, sensitive assessment of muscular power may be particularly relevant for the early identification of functional decline and for identifying individuals at heightened risk for adverse mobility-related outcomes during and following cancer treatment.33,34
Despite the recognized importance of muscular power for functional outcomes, routine assessment of lower-extremity power remains uncommon in clinical oncology practice. 35 In clinical oncology settings, where time and space are limited,36-39 assessments need to be quick, inexpensive, and require minimal equipment. 25 Current testing procedures for muscular power (isokinetic dynamometry 40 or force plates 41) require specialized expertise, are time-intensive, and require relatively large spaces for operation, which limit their feasibility in clinical settings. Field-based tests have been proposed to address these barriers, such as the 30-second sit-to-stand test, in which time to complete the test or total repetitions is entered into prediction equations to derive an index of lower-extremity power.28,42 Though informative, these protocols estimate average power across multiple repetitions and therefore do not provide a direct measure of peak power in a single effort. 27 More recently, mobile applications that estimate power from video recordings have been introduced, 17 though power is typically estimated using internal regression equations 43 that can introduce additional measurement error and may not be generalizable to other populations. 39
The sit-to-stand power test (STSp) utilizes a portable linear position transducer to directly estimate peak power output during a standard sit-to-stand movement, which may overcome the aforementioned limitations with power assessments.43,44 Specifically, the transducer records displacement and time, allowing calculation of power in watts for each repetition, from which peak power can be derived.43,44 Conceptually, the sit-to-stand power test differs from other lower-extremity power assessments by directly quantifying peak power during a single functional movement, rather than estimating average power across repeated efforts or relying on regression-based prediction models.43,44 This method is rapid, low cost, and practical in confined spaces, making it a potentially valuable tool for assessing lower-body muscular power in oncology populations and clinical settings.34,35,45 The STSp has demonstrated good reliability and construct validity in older adults and individuals with chronic conditions.44,46,47 However, its measurement properties have not yet been evaluated in individuals with a history of cancer. Therefore, the primary aim of this exploratory study was to evaluate the test-retest reliability and measurement error of the STSp, assessed using a linear position transducer, in a heterogeneous cancer population with mixed treatment status, with a secondary aim of determining the minimal detectable change. We hypothesized that the STSp would demonstrate acceptable test-retest reliability and measurement error, supporting its use for monitoring lower extremity power in this population.
Methods
Participants/Recruitment
Participants were recruited from the Columbia, South Carolina area through referrals from local physical therapists and oncologists, as well as through self-referral in response to study flyers distributed at local events and clinical sites. For referred individuals, contact information was shared with study staff following permission to be contacted. Study staff then contacted potentially eligible individuals to provide an overview of the study purpose and procedures, discuss potential risks and benefits, and conduct an initial eligibility screening. Individuals who met preliminary eligibility criteria were subsequently scheduled for an in-person study visit to complete informed consent and baseline assessments.
Eligibility criteria were: (1) past or current receipt of cancer treatment, including surgery, chemotherapy, radiotherapy, immunotherapy, and/or targeted therapy; (2) age 18 years or older, male or female; and (3) willingness to sign an informed consent form. Individuals were excluded if they (1) had any neuromuscular, cardiovascular, or psychological condition preventing safe testing or (2) were unable to read or understand English, because all consent materials, study instructions, and outcome measures were administered in English and validated translations or interpreter support were not available. This study protocol was approved by the local hospital’s Institutional Review Board (1852637-15, approved May 24, 2024).
Sample Size
There are currently no clearly defined guidelines regarding the optimal sample size for studies investigating the reliability of physical performance outcomes. While recommendations for reliability studies of questionnaires and patient-reported outcome measures typically suggest sample sizes between 30 and 50 participants, 48 some have proposed that larger samples may be necessary to ensure adequate precision and generalizability. 49 In a prior study assessing the test–retest reliability and measurement error of the STSp test, 50 a minimum sample size of 17 participants was estimated to be sufficient to detect a hypothesized intraclass correlation coefficient (ICC) of .90, assuming a minimally acceptable ICC of .70, an alpha level of .05, and 80% power (β = .20). In the present study, the final sample size was determined based on a combination of logistical and time constraints; however, we aimed to enroll approximately 100 participants to ensure sufficient power to determine reliability coefficients.
Testing Procedures
Lower-body muscular power was assessed using the STSp in conjunction with a linear position transducer (TendoUnit, Trencin, Slovak Republic). The Tendo Unit has been shown to be a valid and reliable device for assessing muscular power in older adults.47,51 The transducer was used to measure vertical peak power output (watts, W) during a single STS movement. Participants completed the STSp at baseline and returned 2 to 10 days later for a subsequent assessment. A belt was positioned around the participant’s waist, just superior to the iliac crest, and connected to the transducer via a Kevlar cord oriented perpendicularly to the floor. Participants began the movement from a seated position with their arms crossed over their chests. On the examiner’s cue, they were instructed to stand up as quickly and safely as possible, then return to a seated position.44,52 Individuals who were unable to perform the test unassisted were permitted to use an assistive device such as a cane or walker, or to place their hands on their thighs.
Power output (P) was calculated automatically by the Tendo software using the following equations:
F = m × (g + a)
P = F × v
where F is the vertical force (N), m is body mass (kg), v is vertical velocity (m·s−1), g is acceleration due to gravity (9.81 m·s−2), and a is the acceleration during the concentric (standing) phase of the movement. Participant body mass was entered into the software in pounds (lb) and internally converted by the Tendo software to kilograms prior to force and power calculations, ensuring consistency with SI units.
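As a rough illustration of the calculation described above, the following Python sketch assumes the conventional relations F = m × (g + a) and P = F × v. The function name, the example participant, and the input values are hypothetical and are not taken from the Tendo software, whose internal implementation is not described here.

```python
LB_TO_KG = 0.45359237  # international pound expressed in kilograms
G = 9.81  # acceleration due to gravity (m/s^2), as stated in the text


def sts_power_watts(body_mass_lb: float, velocity_ms: float, accel_ms2: float) -> float:
    """Power in watts, assuming F = m * (g + a) and P = F * v.

    This is an illustrative assumption, not the device's documented algorithm.
    """
    mass_kg = body_mass_lb * LB_TO_KG  # convert pounds to kilograms (SI units)
    force_n = mass_kg * (G + accel_ms2)  # vertical force in newtons
    return force_n * velocity_ms


# Hypothetical participant: 165 lb, rising at 0.8 m/s with 1.5 m/s^2 of upward acceleration
power = sts_power_watts(165.0, 0.8, 1.5)
print(f"{power:.1f} W")
```

Because power scales linearly with body mass in this formulation, an error in the entered mass propagates proportionally to the reported watts.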
Each participant completed 3 single-repetition trials of the STSp, with 1 minute of rest between attempts. The highest value (in watts) across the 3 trials was recorded as peak power (STSp Peak), and the mean value of the 3 trials was recorded as average power (STSp Average). Testing was completed at 2 timepoints: during the baseline session and during the first training session (occurring 2-10 days after baseline), as shown in Figure 1.
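The two session scores described above can be sketched in Python as follows. The function name and the trial values are hypothetical and serve only to illustrate how a best-of-3 score and a mean-of-3 score are derived from the same trials.

```python
from statistics import mean


def session_scores(trials_w: list[float]) -> dict[str, float]:
    """Return STSp Peak (best trial) and STSp Average (mean of trials) in watts."""
    if len(trials_w) != 3:
        raise ValueError("the protocol described above uses exactly 3 trials")
    return {"peak": max(trials_w), "average": mean(trials_w)}


# Hypothetical trial values (W) for one participant in one session
scores = session_scores([812.0, 905.0, 871.0])
print(scores)
```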

Figure 1. Overview of the timing of the Sit-to-Stand Power test.
Sit-to-stand power was assessed across 2 testing sessions separated by 2 to 10 days. During each session, participants performed 3 sit-to-stand repetitions to evaluate lower-limb power.
Statistical Analysis
Statistical analyses were conducted using R version 4.3.1. 53 Descriptive statistics were calculated for each test variable, including mean, standard deviation (SD), and range. Normality of the data was evaluated using the Shapiro-Wilk test and Q–Q plots. For all inferential statistical tests, the significance level was set at P < .05 (2-sided).
To evaluate test-retest reliability of the STS power outcomes, intraclass correlation coefficients (ICC(3,1)) were calculated using a 2-way mixed effects model with absolute agreement. 37 Reliability was estimated based on the session score used for decisions, defined as the single best repetition for STS Peak and the mean of 3 repetitions within a session for STS Average. 50 ICC values were interpreted as follows: values <0.5 indicated poor reliability, 0.5 to 0.75 moderate reliability, 0.75 to 0.9 good reliability, and >0.9 excellent reliability. 37 Measurement error was estimated using the standard error of measurement (SEM), calculated as SEM = SD × √(1 − ICC), and the minimal detectable change (MDC), computed as MDC = 1.96 × √2 × SEM, which represents the smallest change that can be distinguished from measurement error with 95% confidence. Additionally, Bland-Altman plots were generated to assess the agreement between the 2 testing sessions and to identify potential systematic biases in the measurements.
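The reliability and measurement error statistics described above can be sketched in Python as follows. The analyses in this study were run in R, so this is not the authors' code. Two single-measure coefficients are returned because, in commonly used conventions, the label ICC(3,1) (Shrout and Fleiss) refers to the consistency form, whereas absolute agreement corresponds to the McGraw and Wong ICC(A,1) form; the sketch does not assume which of the two was computed here. The function names, the boundary handling in the interpretation bands, and the example data are hypothetical.

```python
from math import sqrt
from statistics import mean


def icc_single(scores: list[list[float]]) -> dict[str, float]:
    """Single-measure ICCs from an n-participants x k-sessions table.

    Uses the mean squares of a two-way ANOVA without replication.
    """
    n, k = len(scores), len(scores[0])
    grand = mean(x for row in scores for x in row)
    row_means = [mean(row) for row in scores]
    col_means = [mean(row[j] for row in scores) for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)  # between participants
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)  # between sessions
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))
    consistency = (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err)
    agreement = (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )
    return {"consistency": consistency, "agreement": agreement}


def sem(sd: float, icc: float) -> float:
    """SEM = SD x sqrt(1 - ICC); which SD to use is left to the analyst."""
    return sd * sqrt(1.0 - icc)


def mdc95(sem_value: float) -> float:
    """MDC = 1.96 x sqrt(2) x SEM."""
    return 1.96 * sqrt(2.0) * sem_value


def interpret_icc(icc: float) -> str:
    """Bands from the text; assigning exact boundary values is an assumption."""
    if icc < 0.5:
        return "poor"
    if icc < 0.75:
        return "moderate"
    if icc <= 0.9:
        return "good"
    return "excellent"


# Hypothetical peak power values (W): one row per participant, one column per session
data = [[800.0, 850.0], [950.0, 900.0], [1100.0, 1200.0], [700.0, 780.0], [1250.0, 1300.0]]
print(icc_single(data))
print(round(mdc95(199.9), 2), interpret_icc(0.87))
```

A systematic shift between sessions lowers the absolute-agreement coefficient but leaves the consistency coefficient unchanged, which is why the choice of form matters for test-retest designs.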
Results
A total of 103 individuals (mean age = 60.18 ± 10.38 years; 88.35% female) completed the test–retest assessment. The cohort was composed of 51.46% White and 40.78% Black/African American individuals. Breast cancer was the most common diagnosis (68.93%), followed by colorectal cancer (15.53%). Most participants were diagnosed with Stage I cancer (45.63%) and had undergone surgery (95.15%), radiation therapy (63.11%), and chemotherapy (60.19%). Participant demographic and cancer-related characteristics are summarized in Table 1.
Table 1. Participant Demographics for Test-Retest and Pre-Post Changes.
Abbreviation: BMI, body mass index (calculated as weight in kilograms divided by height in meters squared).
Test-Retest Reliability and Measurement Error
Using a 2-way mixed effects model with absolute agreement, ICC(3,1) for STS Peak was 0.87 (95% CI: 0.80-0.91), indicating good test-retest reliability. The SEM was 200 W and the MDC was 554 W. The ICC(3,1) for STS Average was 0.86 (95% CI: 0.79-0.90), also indicating good test-retest reliability. The SEM for STS Average was 179 W and the MDC was 496 W (Table 2). The Bland-Altman plots showed small positive mean differences between sessions and no evidence of proportional bias for either outcome (Figure 2). For STS Peak, the mean difference was 70.70 W with limits of agreement (LoA) from −471 to 612 W. For STS Average, the mean difference was 68.60 W with LoA from −414 to 551 W. These LoA indicate that repeated measurements within an individual may vary by several hundred watts across sessions, suggesting substantial within-subject variability and reinforcing that only relatively large changes in STSp performance are likely to reflect true change beyond measurement error at the individual level.
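For readers less familiar with Bland-Altman statistics, the mean difference and 95% limits of agreement can be sketched in Python as follows. The sketch assumes differences are taken as session 2 minus session 1 and that the limits are the mean difference ± 1.96 × the SD of the differences; the direction of subtraction is not stated in the text, and the example data and function names are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(session1: list[float], session2: list[float]) -> dict[str, float]:
    """Mean difference (bias) and 95% limits of agreement for paired sessions."""
    diffs = [b - a for a, b in zip(session1, session2)]  # session 2 minus session 1 (assumed)
    bias = mean(diffs)
    sd_diff = stdev(diffs)  # sample SD of the paired differences
    return {
        "bias": bias,
        "loa_lower": bias - 1.96 * sd_diff,
        "loa_upper": bias + 1.96 * sd_diff,
    }


def sd_from_loa(loa_lower: float, loa_upper: float) -> float:
    """Back-calculate the SD of the differences from reported limits of agreement."""
    return (loa_upper - loa_lower) / (2 * 1.96)


# Hypothetical peak power values (W) for 5 participants
s1 = [800.0, 950.0, 1100.0, 700.0, 1250.0]
s2 = [850.0, 900.0, 1200.0, 780.0, 1300.0]
print(bland_altman(s1, s2))

# SD of the differences implied by the limits reported for STS Peak (-471 to 612 W)
print(round(sd_from_loa(-471.0, 612.0), 1))
```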
Table 2. Test-Retest Reliability and Measurement Error.

Figure 2. Bland-Altman plots for (A) STS Peak and (B) STS Average.
Discussion
The purpose of this study was to assess the test-retest reliability of a lower-body muscular power assessment using STSp in individuals treated for cancer. We observed good test-retest reliability for both peak (ICC = 0.87) and average (ICC = 0.86) power, with minimal detectable changes (MDC) of 554.10 and 496.30 W, respectively. These findings suggest that STSp may be a promising tool to evaluate lower-body muscular power in oncology settings. However, the relatively large MDC values and wide limits of agreement suggest limited sensitivity to detect small within-person changes at the individual level.
Our findings align with prior studies using similar technology, although our reliability estimates were modestly lower and error margins higher than those observed in healthy or non-cancer cohorts. For example, Balachandran et al 44 reported excellent test-retest reliability for peak STSp in community-dwelling older adults (ICC = 0.96), with lower SEM (70.4 W) and MDC (192.8 W) values. In contrast, our higher MDC values indicate greater within-subject variability, likely reflecting the broader heterogeneity of individuals treated for cancer. Compared with prior work in more homogeneous samples,37,43,51 our cohort demonstrated a wider age range and greater variability in peak power output. This broader distribution could be explained by the wide spectrum of physical capabilities in individuals with cancer, commonly driven by differences in cancer type, treatment history (eg, chemotherapy, radiation, surgery), comorbidities, fatigue levels, and pre-existing functional status.32,54,55 While such heterogeneity enhances generalizability of our findings to the cancer community, it also increases measurement variability, which may reduce the precision of reliability estimates and which supports cautious interpretation of small individual changes.
Methodological factors may have further contributed to the observed differences in reliability and measurement error. Although testing procedures were broadly consistent with prior studies, our protocol permitted the use of upper limb support or assistive devices to complete the task, which may have increased performance variability. 47 Unfortunately, we did not record the frequency or type of assistance, limiting our ability to explore its influence on measurement error. Further, the absence of a familiarization period may have meaningfully contributed to the higher SEM and MDC values observed in the present study. Learning effects are well documented in functional performance testing, particularly for tasks requiring coordination, timing, and rapid force production, and can inflate within-subject variability when participants are unfamiliar with the task.56-58 Prior work in community-dwelling older adults by Gray and Paulson 47 implemented 10 repetitions (with the final 3 used for analysis), allowing participants to adapt to the movement and reducing learning effects. Our protocol included only 3 trials, which may have introduced performance inconsistency due to unfamiliarity with the task. This learning-related variability may partially explain the elevated SEM and MDC values observed and suggests that small changes in STSp performance should be interpreted cautiously when a familiarization period is not included. Future research should consider standardizing familiarization procedures, quantifying assistance, and stratifying reliability analyses based on functional status or assistive device use to better understand sources of variability in power testing.
The MDC values observed in our study imply that interventions using the TENDO unit to assess lower-body muscular power may need to elicit changes beyond ~500 to 550 W to be considered detectable beyond measurement error.59-61 In practice, this may limit the ability to detect small improvements in lower-functioning individuals when relying on this assessment alone. 60 However, it is also plausible that individuals with lower baseline function may demonstrate larger absolute gains in response to training, thereby exceeding the MDC threshold despite greater variability.60,61 For higher-functioning participants, however, the ability to detect marginal changes may be limited, especially in shorter interventions. 62 Additionally, the measurement error may be influenced by the sample’s mean and variability, so investigators using this test in intervention research should consider conducting reliability analyses within their own study populations.17,48,59,60 Regardless, the relatively large MDC suggests that substantial changes in power output would be required to confidently conclude that an individual’s performance has truly changed beyond measurement error. Thus, while the test is well suited for ranking participants and for group-level comparisons, caution is warranted when interpreting small changes at the individual level.
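As a simple illustration of how the MDC might be applied to an individual change score, the Python sketch below compares an absolute change against the MDC values reported in this study (554.10 W for peak power and 496.30 W for average power). The function name and the example scores are hypothetical, and the thresholds may not transfer to other samples or protocols.

```python
# MDC values (W) reported in this study for single-session scores
MDC95_W = {"peak": 554.10, "average": 496.30}


def exceeds_mdc(baseline_w: float, follow_up_w: float, outcome: str = "peak") -> bool:
    """True if the absolute change is larger than the reported MDC for that outcome."""
    return abs(follow_up_w - baseline_w) > MDC95_W[outcome]


# Hypothetical participant: a 400 W gain in peak power does not exceed the MDC,
# whereas a 600 W gain does
print(exceeds_mdc(900.0, 1300.0, "peak"))
print(exceeds_mdc(900.0, 1500.0, "peak"))
```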
This study had several limitations. First, the sample primarily consisted of female participants previously treated for breast cancer, which limits the generalizability of the findings to more diverse cancer populations. Second, some participants required assistance to complete the STS movement using their hands, a cane, or a walker, which may have introduced variability into the measurement. While this introduces heterogeneity, it also reflects the diversity in physical function and mobility encountered in real-world survivorship care. Third, inter-rater reliability was not assessed. Although testing procedures were standardized, future studies should quantify rater effects to isolate sources of variability more precisely and compare the TENDO against a force plate to establish accuracy. Taken collectively, these limitations likely contributed to wider limits of agreement and larger MDC values and reinforce the need for cautious interpretation of small within-person differences. Given these sources of variability, one pragmatic strategy is to average 2 sessions to improve precision. In our data, the SEM and MDC95 for the average of 2 sessions were 146 and 406 W for peak power and 131 and 364 W for average power, respectively, which may aid individual-level decision making. Lastly, participation required English proficiency, which may limit the generalizability of these findings to non–English-speaking populations treated for cancer.
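The text does not state how the two-session SEM and MDC95 values were derived. One assumption that approximately reproduces them is the Spearman-Brown formula, which projects the reliability of the mean of k sessions from the single-session ICC. The Python sketch below applies that assumption to the single-session values reported in the Results; the function names are hypothetical and the approach may differ from the one actually used.

```python
from math import sqrt


def spearman_brown(icc_single: float, k: int) -> float:
    """Projected reliability of the mean of k sessions."""
    return k * icc_single / (1.0 + (k - 1) * icc_single)


def sem_for_mean_of_k(sem_single: float, icc_single: float, k: int) -> float:
    """SEM for the mean of k sessions, assuming SEM = SD x sqrt(1 - ICC) throughout."""
    sd = sem_single / sqrt(1.0 - icc_single)  # back out the SD from the single-session SEM
    return sd * sqrt(1.0 - spearman_brown(icc_single, k))


def mdc95(sem_value: float) -> float:
    """MDC = 1.96 x sqrt(2) x SEM."""
    return 1.96 * sqrt(2.0) * sem_value


# Single-session values reported in this study
peak_sem2 = sem_for_mean_of_k(199.9, 0.87, 2)
avg_sem2 = sem_for_mean_of_k(179.05, 0.86, 2)
print(round(peak_sem2), round(mdc95(peak_sem2)))
print(round(avg_sem2), round(mdc95(avg_sem2)))
```

Under this assumption the two-session SEM reduces to the single-session SEM divided by √(1 + ICC), so the gain in precision from a second session is smaller than the 1/√2 reduction expected if the sessions were independent measurements of a fixed true score.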
Conclusions
The sit-to-stand power test demonstrated good relative reliability in individuals with cancer. However, the relatively large minimal detectable change indicates that small individual-level changes should be interpreted with caution, as they may not exceed measurement error. Taken collectively, these findings suggest that the STSp may be useful for monitoring group-level responses or larger individual improvements, but that refinements to the protocol, including use of a brief familiarization period and standardized assistance procedures, are warranted to reduce variability. In clinical oncology settings, the STSp can be considered a rapid and low-cost option for assessing lower extremity power, though clinicians should interpret change scores in the context of the SEM and MDC values reported in this study.
Footnotes
Acknowledgements
The authors thank the study participants for their time and effort and acknowledge the assistance of clinical partners who supported participant recruitment. No third-party writing or editorial assistance was received.
Author Note
Kylah E. Jackson is now at the School of Kinesiology at the University of Michigan. Data collection was conducted while she was at the University of South Carolina.
Ethical Considerations
This study was approved by the Institutional Review Board at Prisma Health (approval number: 1852637-15, approved May 24, 2024). All procedures were conducted in accordance with the Declaration of Helsinki.
Consent to Participate
Written informed consent to participate was obtained from all participants prior to enrollment.
Consent for Publication
Written informed consent for publication of de-identified data was obtained from all participants.
Author Contributions
GLB, AMB, KEJ, SEH, and CMF conceived and designed the study. Data collection was performed by GLB, KSA, AMB, KEJ, and BMS. CMF, GLB, and BN performed the statistical analyses. All authors contributed to interpretation of the data, critically revised the manuscript for important intellectual content, and approved the final version of the manuscript.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by a private foundation grant from Prisma Health Hospital. The funding source had no role in the study design, data collection, analysis, interpretation of the data, or manuscript preparation.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data Availability Statement
The datasets generated and analyzed during the current study are not publicly available but are available from the corresponding author on reasonable request.
