Abstract
Background:
Functional capacity evaluation is a standardized tool that assesses work-related skills. Although there are different test batteries, the most frequently used one is Work Well Systems. This study aims to determine the validity and inter- and intra-rater reliability of remote implementation of functional capacity tests (repetitive reaching, lifting object overhead, and working overhead) in asymptomatic individuals.
Methods:
A total of 51 asymptomatic individuals were included in the study. Participants completed all tests both face-to-face and remotely. Remote assessment videos were rewatched by the same researcher and different researchers for intra- and inter-rater reliability. All processes were scored by two independent researchers.
Results:
Remotely performed repetitive reaching (intraclass correlation coefficient [ICC]: 0.85–0.92, p < .001), lifting object overhead (ICC: 0.98, p < .001), and working overhead (ICC: 0.88, p < .001) tests were valid and reliable.
Discussion:
Repetitive reaching, lifting an object overhead, and sustained overhead work tests in the Work Well Systems–Functional Capacity Evaluation test battery can be performed remotely through videoconferencing. Remote evaluation of these tests, which are especially important in work-related situations, may be valuable under pandemic conditions and in hybrid working conditions.
Introduction
Functional capacity evaluations (FCEs) have been developed to standardize and objectively assess required work-related functional abilities (Edelaar et al., 2018). These evaluation tools are used in many different areas, including assessment of work-related performance, return to work, work-related legal cases, sport science, rehabilitation, and determination of disability level (Reneman et al., 2017; Sengers et al., 2021; Vachalathiti et al., 2020). Insurers and healthcare professionals rely on the information obtained from FCE to determine workplace and work task adjustments (Bühne et al., 2020).
The Work Well Systems–Functional Capacity Evaluation (WWS-FCE), the most commonly used of the FCE tests, is a comprehensive test battery that requires low-cost equipment and also includes work-related tasks (Chang Chien et al., 2019; Mohammed et al., 2021). The psychometric properties of the WWS-FCE have been examined and shown to have a high degree of reliability and predictive validity (Brouwer et al., 2003; Gouttebarge et al., 2004; Reneman et al., 2005). In addition, the normative values of the WWS-FCE battery have been determined in healthy individuals.
The term “telerehabilitation” includes remote synchronous or asynchronous assessment, monitoring, intervention, and supervision services (Galea, 2019). Telerehabilitation studies have increased with the COVID-19 pandemic, developments in technology, and the difficulty of accessing healthcare services in rural areas (Cottrell et al., 2017; Turolla et al., 2020). These studies offer promising findings for the field of occupational therapy as well as physical therapy. For example, one study showed that, for people with wrist, hand, and/or finger injuries, telerehabilitation, instead of a traditional program, accelerated return to work and improved functional ability and pinch strength (Blanquero et al., 2020). In addition, telerehabilitation has been shown to be cost-effective in several patient populations, including patients with low back pain, knee pain, and shoulder pain. Increasing interest in reducing the financial burden on the healthcare system also motivates researchers to conduct telerehabilitation studies (Fatoye et al., 2020; Rennie et al., 2022). For workers experiencing loss of work productivity due to disability, tele-assessments are an important part of telerehabilitation for determining treatment effectiveness and return to work. Despite their advantages, tele-assessments have raised concerns about validity arising from the application environment, the assessment method used, and differences in technology (Marshall et al., 2022).
Remote assessment validity and reliability studies have been conducted in various groups for range of motion, muscle strength, balance, gait assessments, some special orthopedic tests, neurodynamic tests, and lumbar posture (Cottrell & Russell, 2020; Netz et al., 2021; Peterson et al., 2019). However, to our knowledge, there is no study that has examined the remote applicability of FCE subtests. This study aims to determine the validity and inter- and intra-rater reliability of remote implementation of WWS-FCE subtests. We hypothesize that a remotely performed FCE subtest is valid and reliable in asymptomatic individuals.
Methods
Study Design
This study used a repeated-measures design to establish the criterion validity and the intra-rater and inter-rater reliability of remote FCE subtests, administered through tele-assessment, in an asymptomatic population. It was conducted at Hacettepe University, Faculty of Physical Therapy and Rehabilitation, Spine Health Unit, from September 1, 2021, to April 30, 2022. Each evaluation step was implemented with two independent researchers. Each researcher had at least 7 years of professional experience. Ethics committee approval was obtained from the university’s clinical research ethics committee. The clinical trial was registered prospectively, that is, before participant recruitment (Clinical Trial No: NCT04704570). Signed consent was obtained from the participants. Standardized instructions and a standardized recording form were used for each test. Participants’ age, gender, body mass index, and employment status were recorded.
Participants
Asymptomatic volunteers were recruited from among the relatives of patients attending the clinic. The inclusion criteria were healthy or asymptomatic adults who were between 18 and 55 years of age, literate, and scored above 21 on the Montreal Cognitive Assessment (MoCA). As cognitive level may affect compliance with remote assessment, the MoCA, which evaluates cognitive level, was used to homogenize the group. Exclusion criteria were the presence of low back or neck pain lasting at least 3 months, any diagnosis of spinal pathology or surgery, cancer, the presence of systemic disease, and acute infection. Data were collected by physiotherapists who had at least 9 years of professional experience and had worked in spine health for 6 years. First, the study eligibility of the volunteer participants was assessed and the age, gender, body mass index, and employment status of the eligible participants were recorded. Then the study procedure was initiated.
Procedure
Participants completed all subtests twice, face-to-face and remotely (RA1). The order of remote and face-to-face evaluations was determined by the closed-envelope randomization method (Elkins, 2013). A 30-minute rest interval was given between the two evaluations. Remote assessments were recorded, and these videos were rewatched and analyzed 1 month later by the same investigator to assess intra-rater reliability (RA2). Videos were also viewed and analyzed by the other researcher to assess inter-rater reliability (RA3; Figure 1). The evaluators did not provide any feedback on the test results to the participants during the face-to-face or remote assessments.

Figure 1. Flow Diagram.
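The order randomization described in the procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the source does not say how the envelopes were prepared, so the near-balanced split, the function name `allocate_orders`, and the order labels are hypothetical.

```python
import random


def allocate_orders(n_participants, seed=None):
    """Return one assessment order per participant.

    Assumption (not stated in the source): the sealed envelopes hold a
    near-balanced mix of the two possible orders and are shuffled before
    being drawn in sequence.
    """
    rng = random.Random(seed)
    half = n_participants // 2
    envelopes = (["remote_first"] * half
                 + ["face_to_face_first"] * (n_participants - half))
    rng.shuffle(envelopes)  # shuffle once, then hand out in order
    return envelopes


if __name__ == "__main__":
    orders = allocate_orders(51, seed=1)
    print(orders.count("remote_first"), orders.count("face_to_face_first"))
```

With 51 participants, this balanced scheme gives 25 remote-first and 26 face-to-face-first allocations regardless of the seed.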
Face-to-Face Assessment
The face-to-face assessment was carried out with the physiotherapist and participant in the same room. The physiotherapist prepared the test materials and guided the participant.
Remote Assessment
The remote assessment was performed by the physiotherapist, synchronously guiding the participant through remote web-based teleconferencing (Zoom, Zoom Video Communications, Inc., San Jose, CA). The camera angle was adjusted to see how the participant performed the test. The participant prepared all the test materials and completed the tests with the remote guidance of the physiotherapist. Meanwhile, the physiotherapist synchronously recorded the time and weight data required for the test (Figure 2).

Figure 2. Tele-Assessment of Functional Capacity.
Functional Capacity Evaluation
To evaluate functional capacity, three subtests were selected from the WWS-FCE protocol, which had established validity and reliability in healthy individuals; these FCE subtests assessed hand coordination, weight handling and strength, posture and mobility (repetitive reaching, lifting object overhead, and working overhead; Bieniek & Bethge, 2014; Soer et al., 2009). In selecting the tests, consideration was given to the accessibility of the materials for remote use by individuals, as well as the safety and ease of administration of the tests. The same materials, which were determined as standard by Susan Isernhagen who developed WWS-FCE, were used in the remote and face-to-face applications of the tests (Bieniek & Bethge, 2014). The order of the tests was standard, and the repetitive reaching, lifting object overhead, and sustained overhead work tests were performed, respectively. At least one minute was left between the tests and it was stated that additional time would be given if desired. Individuals’ perceived efforts during the tests were evaluated with the Modified Borg Scale (Borg, 1990). Participants were asked to end the test when they felt fatigue at a severity of 10, according to the Modified Borg Scale during the subtests. However, they were informed that they could stop the test at any time in case of any pain or discomfort (Gross & Battié, 2002).
Repetitive Reaching Test: This test assesses the speed and coordination of the upper extremity in repetitive tasks. The time spent transferring 30 marbles (14 diameters) between two bowls placed at a distance determined by the arm opening is recorded in the sitting position. This test consists of four different subtasks, two-sided for the two upper extremities (right hand—left to the right, left hand—left to the right, right hand—right to the left, and left hand—right to the left). Test–retest reliability in healthy subjects is weak (intraclass correlation coefficient [ICC] = 0.54–0.72; Reneman et al., 2004).
Lifting Object Overhead Test: This test is used to evaluate the functional strength of the upper extremity muscles. During the test, the participant takes the weight from a table 80 cm high and is asked to lift it up to the head five times within 90 seconds and return to the original position. It starts with the lightest weight determined and progresses step-by-step toward the heaviest. Initial and maximum weight were 10 to 60 kg and 6 to 30 kg in men and women, respectively. Increases are made by 4 to 5 kg (Gross & Battié, 2002). The criteria for termination of the test are reaching maximum weight, pain, or the individual’s desire to terminate the test. Test–retest reliability in healthy subjects is high (ICC = 0.89; Reneman et al., 2004).
Sustained Overhead Work Test: This test assesses postural tolerance capacity. The test is performed in the standing position with 1 kg cuff weights tied to the participant’s arms. The participant is asked to manipulate nuts and bolts for as long as possible, with the arms at forehead height, and the time is recorded. Test–retest reliability in healthy subjects is high (ICC = 0.90; Soer et al., 2006).
Statistical Analyses
In the sample size analysis, 90% power and a Type 1 error of 0.05 were used. The total number of individuals to be included in the study was calculated to be 52. Calculations were performed using PASS 15 software (Iwane et al., 1997). Descriptive statistics are presented as mean and standard deviation for continuous variables and as frequency (%) for categorical variables. The level of statistical significance was set at a p value of .05. All reported p values were two-sided.
For the validity analysis, the ICC between the values obtained from the face-to-face and remote applications was examined for each measurement variable. For intra-rater reliability, the same researcher’s first and second measurements were compared using the ICC. Inter-rater reliability, that is, the relationship between the evaluations of two different researchers, was also evaluated with the ICC. The ICC validity–reliability classification for physiological data is >0.90 high, 0.80–0.89 moderate, and <0.79 weak. Data were analyzed using IBM SPSS Statistics for Windows V. 26.0.
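To make the ICC comparison concrete, the following is a minimal sketch of one common ICC form computed from a participants-by-conditions score matrix. It assumes a single-measurement, consistency-type coefficient under a two-way mixed-effects model, often written ICC(3,1); the source does not state which form or type was selected in SPSS, and the function name and demo values are hypothetical.

```python
def icc_consistency_single(scores):
    """ICC(3,1): two-way mixed effects, consistency, single measurement.

    `scores` is a list of rows, one per participant, each holding one
    score per measurement condition (for example face-to-face and remote).
    """
    n = len(scores)        # participants
    k = len(scores[0])     # conditions or raters
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares without replication
    ss_total = sum((v - grand) ** 2 for row in scores for v in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)


if __name__ == "__main__":
    # Hypothetical completion times in seconds: [face-to-face, remote]
    demo = [[41.0, 43.5], [38.2, 37.9], [45.1, 46.0], [50.3, 48.8]]
    print(round(icc_consistency_single(demo), 3))
```

A consistency-type coefficient ignores a constant offset between conditions; an absolute-agreement form would penalize it, so the choice matters when one assessment mode is systematically slower or faster.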
Results
Fifty-one healthy participants (33 women, 18 men), with a mean age of 29.96 years (SD = 7.58), were recruited for this study. Of the participants, 38 were full-time employees and 14 were not actively working anywhere. Twenty-five people were first evaluated with remote assessment; 26 people were evaluated with face-to-face assessment first (Figure 1). Table 1 shows the mean and standard deviation values of face-to-face and RA1, RA2, and RA3 evaluations.
Table 1. Descriptive Data of Study.
Note. BMI = body mass index; RA1 = tele-assessment; RA2 = rewatching the tele-assessment video by same researcher; RA3 = rewatching the tele-assessment video by different researcher.
Table 2 provides the ICC values for each outcome measure between face-to-face and remote assessments. A moderate to high degree of validity was found between the face-to-face and remote assessment for the repetitive reaching tests (p < .001). A high degree of validity was found between the face-to-face and remote assessment for the lifting object overhead test (p < .001). A moderate degree of validity was found between the face-to-face and remote assessment for the sustained overhead work test (p < .001).
Table 2. Results of the Validity Between Face-to-Face and Tele-Assessment.
Note. ICCs (ρ) were calculated using a two-way mixed-effects model, p < .05. ICC = intraclass correlation coefficient; CI = confidence interval.
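The classification used to label these validity results can be expressed as a small helper. This is a sketch only: the stated bands (>0.90 high, 0.80–0.89 moderate, <0.79 weak) leave the values 0.90 and 0.79–0.80 unassigned, so the boundary handling below is an assumption, as is the function name.

```python
def classify_icc(icc):
    """Map an ICC to the paper's three bands.

    Assumption: boundary values not covered by the stated bands are
    resolved as >= 0.90 high, >= 0.80 moderate, otherwise weak.
    """
    if icc >= 0.90:
        return "high"
    if icc >= 0.80:
        return "moderate"
    return "weak"


if __name__ == "__main__":
    # ICCs reported in the abstract for the three subtests
    reported = {
        "repetitive reaching (lower bound)": 0.85,
        "repetitive reaching (upper bound)": 0.92,
        "lifting object overhead": 0.98,
        "working overhead": 0.88,
    }
    for name, value in reported.items():
        print(f"{name}: {value} -> {classify_icc(value)}")
```

Applied to the reported values, this reproduces the labels in the text: moderate to high for repetitive reaching, high for lifting object overhead, and moderate for sustained overhead work.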
Intra- and inter-rater ICC values are given in Table 3. The RA1 and RA2 evaluation results showed very good intra-rater reliability (p < .001). The results of the RA1 and RA3 evaluations revealed very good inter-rater reliability (p < .001).
Table 3. Intra- and Inter-Rater Reliability of Remote Functional Performance Assessment.
Note. ICCs were calculated using a two-way mixed-effects model, p < .05. RA1 = Tele-assessment; RA2 = rewatching the tele-assessment video by same researcher; RA3 = rewatching the tele-assessment video by different researcher; ICC = intraclass correlation coefficient; CI = confidence interval.
Discussion
The results of this study show that the application of FCE subtests (repetitive reaching, lifting object overhead, and working overhead) through tele-assessment is valid and reliable in asymptomatic individuals, which supports our hypothesis. Based on our findings, remote evaluation of all tests and subtasks showed moderate to high validity and very good reliability in motivated asymptomatic volunteers.
The FCE tests are often used to determine return to work by assessing work capacity. For example, in countries such as the United States, Australia, the Netherlands, and Germany, FCE tests are used as an important tool in determining work capacity (Ansuategui Echeita et al., 2019; Bühne et al., 2020; Edelaar et al., 2020; Mohammed et al., 2021; Pavai, 2019). It is recommended that these tests include as many work-related activities as possible (Soo Hoo, 2019). In addition, the FCE test is important in determining the performance and injury risk of athletes with and without disabilities (Niederbracht et al., 2008; Ustunkaya et al., 2007). The WWS-FCE includes five performance categories (weight handling and strength, posture and mobility, locomotion, balance, and hand coordination; Soer et al., 2009). Some studies in the literature have selected some tests from the WWS-FCE battery (Reneman et al., 2004; Trippolini et al., 2013). In this study, we aimed to evaluate the feasibility of FCE subtests remotely, owing to pandemic conditions or inability to reach a health center or a specialist. We chose tests that were particularly relevant to upper extremity functions and whose materials were more accessible.
According to some studies on kinematic and neurophysiological assessments, tele-assessments are cost-effective, valid, and reliable for various methods in different populations (Rau et al., 2013; Zanin et al., 2022). Tele-assessment also provides flexibility regarding time and place. This flexibility may increase external validity, but it may also raise doubts about the standardization of tests (Ferrar et al., 2021). However, some key points may increase the accuracy of the tele-assessment (Ferrar et al., 2021). Tests with a short time to complete and a low number of repetitions increase test compliance, which can increase the accuracy of the test (Ackerman & Kanfer, 2009). The tests used in this study were of short duration, and participant compliance was observed to be high. In addition, the test materials were easy to access, such as weights, balls, nuts, and bolts. These factors are thought to have had a positive effect on the results of this study.
Compared with the normative value study by Soer et al., the average repetitive reaching test scores were better in this study, whereas the sustained overhead work and lifting object overhead scores were lower (Soer et al., 2009). The reason why repetitive reaching test results were better in our study may be that our mean age was lower than in Soer et al.’s study. The younger age group may have had an advantage, as this test assesses speed and coordination (Khanafer et al., 2019). However, the results for the sustained overhead work and lifting object overhead tests were surprising. Whereas the proportion of male participants in Soer et al.’s study was 64%, it was 35.29% in our study. Men have higher muscle strength and endurance than women (Leblanc et al., 2015). This may be the reason why the strength and endurance scores in our results were below the normative values. In addition, many factors, such as physical activity level, nutrition, cognitive level, education level, and race, are also important determinants of muscle strength (Bartels et al., 2018; Buckinx & Aubertin-Leheudre, 2019; Christensen et al., 2001; Maeda & Akagi, 2017). Although the normative studies in the literature provide valuable information to predict functional capacity, parameters such as an individual’s muscle strength, endurance, coordination, and balance can be affected by many factors. Given that these factors were not evaluated in our study, this may limit the interpretation of our findings. In addition, this study was conducted in motivated asymptomatic individuals, which may prevent generalization of the results to patients with musculoskeletal problems. Furthermore, it has been stated that biopsychosocial evaluation of the individual may be useful in cases such as determining athletic performance, work performance, or return to work (Soer et al., 2014). Therefore, while the FCE provides important and valuable information about functional performance and the return-to-work process, it is recommended that normative values be interpreted with care to avoid misuse (Soer et al., 2014).
Conclusion
The repetitive reaching, lifting an object overhead, and sustained overhead work tests of the WWS-FCE test battery can be performed remotely through videoconferencing in asymptomatic individuals. Our results can be used for asymptomatic individuals and may lead to studies on remote administration of these tests in athletes or patients with musculoskeletal problems. Given the requirements of the globalizing world, the importance of adapting to technology, the financial burden of having to go to a health center, cost-effectiveness analyses, and pandemic conditions, our findings may make an important contribution to the literature, both scientifically and clinically.
Applying Research to Occupational Health Practice
Functional capacity evaluations aim to assess work-related skills. Recently, there has been an increasing interest in tele-assessment owing to technological developments and pandemic conditions. The application of functional performance subtests of the Work Well Systems–Functional Capacity Evaluation (repetitive reaching, lifting an object overhead, and sustained overhead work tests) through tele-assessment is valid and reliable in asymptomatic individuals. Further research is needed before deploying these evaluations remotely for work-related assessments and return-to-work decisions.
Footnotes
Conflict of Interest
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Ethics Approval
This study was performed in line with the principles of the Declaration of Helsinki. Approval was granted by the clinical research ethics committee of Hacettepe University (December 29, 2020/No. 2020/20-14 [KA-20110]).
Clinical Trial Number
NCT04704570: Date of registration: January 08, 2021.
