Abstract
Background
There are very few standard instruments currently available for measuring upper extremity (UE) functions for patients with stroke in Thailand.
Objectives
This study aims to examine the concurrent validity, construct validity, and stability reliability of the Functional Test for Hemiplegic Upper Extremity (FTHUE)-Thai version for patients with stroke.
Methods
Thirty hemiplegic participants from five community rehabilitation centers in Chiang Mai province and 30 healthy subjects were recruited. The instruments used were the FTHUE-Thai version and the Fugl-Meyer Assessment for the Upper Extremity (FMA-UE). Concurrent validity was determined by investigating the relationship between the FTHUE-Thai version and the FMA-UE. Construct validity was investigated by comparing performance on the FTHUE-Thai version between stroke participants and healthy subjects. Stability reliability was investigated by measuring the UE function of stroke participants with the FTHUE-Thai version twice over a two-week period. The statistics used were Spearman’s correlation coefficient and the Mann-Whitney test.
Results
There were significant correlations between UE function, as measured by the FTHUE-Thai version, and the arm sub-score, hand sub-score, and total score of the FMA-UE (r = 0.93, r = 0.84, and r = 0.95, respectively), indicating good concurrent validity. Stability reliability was also good (r = 0.98, weighted kappa = 0.94). A known group technique test revealed significantly different scores between stroke patients and healthy subjects (p < .001), indicating good construct validity.
Conclusion
The FTHUE-Thai version could be a reliable measurement tool for the UE function in stroke patients in the Thai context.
Introduction
Stroke is the leading cause of death and causes long-term disability in those who survive it (Suwanwela, 2014; World Health Organization, 2021). In 2012, stroke mortality was 30.7 per 100,000 people in Thailand (Suwanwela, 2014), and this increased to 44.8 in 2014, 47.8 in 2017, and 52.8 in 2020 (Division of Non-Communicable Diseases, Ministry of Public Health, 2020; Division of Non-Communicable Diseases, Ministry of Public Health, 2022; Srivanichakorn, 2017). In 2016, the total recorded incidence rate of stroke in Thailand was 451.4 per 100,000 people, which increased to 467.5 in 2017, 506.2 in 2018, and 542.54 in 2019 (Division of Non-Communicable Diseases, Ministry of Public Health, 2022). Approximately 90% of stroke victims suffer from the sequelae of stroke, mainly muscle weakness and sensory deficits on the affected side (Bunyamark, 2017; Gillen, 2013, 2016), which lead to poor performance in both the upper and lower extremities. More than 70% of patients who have suffered a stroke have some degree of upper extremity (UE) dysfunction (Dutta et al., 2022) due to muscle weakness, spasticity, limited joint range of motion (ROM), pain, and other factors (Gillen, 2016, 2018). Poor UE function can limit the ability of stroke survivors to perform activities of daily living (ADL), work, and recreational activities.
There are a variety of standard tools that occupational therapists use for assessing the function of the arms and hands in people with hemiplegia. One example is the Fugl-Meyer Assessment for the Upper Extremity (FMA-UE). The FMA-UE is a highly recommended, reliable, valid, and responsive measure to assess upper limb function after stroke (Gladstone et al., 2002; Winstein et al., 2016). However, there are not many functional competency tests related to daily life in this instrument (Fong et al., 2004). The Action Research Arm Test (ARAT) is a standardized assessment instrument designed to evaluate hemiplegic individuals’ upper limb rehabilitation. It has been discovered that this instrument has excellent construct validity as well as intra- and inter-rater reliability (Chanubol et al., 2012). Nevertheless, it has been demonstrated to have low sensitivity when assessing stroke patients’ modest deficits (Carpinella et al., 2014).
The Wolf Motor Function Test (WMFT) was developed to measure upper limb motor activity following stroke and has good reliability, but it is recommended only for individuals with mild to moderate upper limb impairment (Taub et al., 2011). The graded Wolf Motor Function Test (gWMFT) was created to evaluate affected upper limbs with impairment ranging from mild to severe. However, scoring and administration practices were inconsistent among researchers, leading some authors to modify the gWMFT to fit the objectives of their studies (Bonifer et al., 2005; Iwamuro et al., 2011; Triandafilou & Kamper, 2014). To prevent resource duplication in clinical settings, Ng et al. (2008) studied the utility of the ARAT and WMFT for hemiplegic UE function after stroke. The results showed that the ARAT was better at differentiating the severity of impairment in people with a higher level of UE function, while the WMFT was better at differentiating the severity of impairment in people whose UE function was lower.
The Minnesota Manual Dexterity Test (MMDT) and Nine-Hole Peg Test (NPT) are used to test the functioning of the lower arms and hands without covering the upper arm (Fong et al., 2004). The Jebsen Hand Function Test is a standardized instrument for clinical assessment of hand function. However, research has shown that the Jebsen Hand Function Test is not completely stable over time and requires a short familiarization phase (5–10 trials) before participants reach a stable baseline (Hummel et al., 2005).
The FTHUE is a performance test to measure the functional limitations of both the arms (upper and lower) and hands on the weakened side of the body in stroke survivors (Rowe et al., 2017). It was first developed by Wilson et al. (1984) with the aim of measuring UE movement in patients with hemiplegia. The FTHUE has been used as a baseline and outcome measure for stroke rehabilitation studies (Winstein et al., 2004). However, there are too many test items in this instrument and it takes a long time to assess (approximately 30 minutes) (Fong et al., 2004).
Fong et al. (2004) developed the Hong Kong version of the FTHUE in 2004, updated from Wilson, Baker, and Craddock’s FTHUE (1984), for use with people with hemiplegia in Hong Kong. The testing items have been reduced to 14 activities and the activity movement patterns have been adjusted to be culturally appropriate for Asians. The FTHUE-Hong Kong version (FTHUE-HK) focused on overall movement of both arms and hands, and it takes less time than the original version to finish the whole assessment. The test equipment is inexpensive and can be used to test people with hemiplegia with varying degrees of severity. The FTHUE-HK had good psychometric properties in terms of concurrent and content validity, inter-rater agreement, and test-retest reliability (Fong et al., 2004). In addition, the FTHUE-HK has been demonstrated to be a very useful tool for measuring rehabilitation outcomes after task-specific training for hemiplegic UE, such as when studying the effect of modified constraint-induced movement therapy (mCIMT) on hemiplegic UE function (Leung et al., 2009), and the effects of sensory cueing on voluntary arm use in stroke patients (Fong et al., 2011). The FTHUE-HK is also used as a reliable screening tool for measuring UE impairment in stroke survivors prior to participating in specific program training, such as a promotion of UE recovery by using a wearable wrist device (Wei et al., 2019), and the timing-dependent interaction effects of dual transcranial direct current stimulation (tDCS) in mirror therapy (MT) for hemiplegic UE (Jin et al., 2019).
In Thailand, the authors developed the FTHUE-Thai version (Pingmuang et al., 2016) by adapting the Hong Kong version through a translation process (English to Thai) and then translating it back according to standard methods. Psychometric properties of internal consistency and inter-rater reliability were tested on 30 samples of people with hemiplegia. The results of the study showed that the FTHUE-Thai version had very good internal consistency and excellent inter-rater reliability.
However, the concurrent validity, construct validity, and stability reliability of the assessment tool have not yet been determined, and these are important properties for any tool used to evaluate or test treatment in patients with disabilities. Concurrent validity refers to the ability of an instrument to distinguish between individuals who differ based on some criteria in terms of their present condition (Polite & Hungler, 2004). For instance, a test that differentiates between patients in a rehabilitation institution who can and cannot be discharged could be correlated with their current functional performance. Therefore, an instrument that measures or reflects the functional ability of the patient at that time, close to their real performance, is considered to have good concurrent validity. Finding a correlation between the instrument and well-established tools is one way of determining whether it has good concurrent validity (Portney & Gross, 2020). Construct validity reflects the ability of an instrument to measure the theoretical dimensions of a construct (Portney & Gross, 2020). The most general type of evidence in support of construct validity is provided when a test can discriminate between individuals who are known to have the trait and those who do not; this method is called the known group technique (Portney & Gross, 2020). Stability reliability is defined as the degree to which test scores remain unchanged when measuring a stable individual characteristic on two or more occasions. It is demonstrated by finding the correlation of the test scores between trials.
The purpose of this study, therefore, was to examine the concurrent validity, construct validity, and stability reliability of the FTHUE-Thai version for patients with stroke in Thailand.
Methods
Study design
The present study was an investigation of the psychometric properties of the FTHUE-Thai version in areas of concurrent validity, construct validity by means of the known group technique, and stability reliability.
Participants
In this study, 30 stroke survivors and 30 healthy adults were recruited. Stroke participants were recruited by means of purposive sampling from patients undergoing treatment at five community rehabilitation centers in Chiang Mai province, Thailand.
The inclusion criteria for people with hemiplegia were as follows: (1) hemiplegia from a first stroke; (2) hemiplegia for more than 1 year; (3) age range, 20–80 years; (4) capable of understanding and following two- to three-part commands; (5) no visual or hearing impairments; and (6) willing and cooperative in testing. The exclusion criteria for people with hemiplegia were as follows: (1) stroke patients with global and/or receptive aphasia who were unable to understand commands; (2) patients paralyzed on both sides of the body from recurrent stroke; and (3) persons with musculoskeletal injuries or arthritis of the joints in the upper limbs.
The purposive sampling of healthy adults was conducted with people from the same community as the five disability rehabilitation centers. Six people with socio-demographic qualifications, similar to those with hemiplegia, were recruited in each community. The inclusion criteria for the healthy adult subjects were: (1) age range of 20 to 80 years; and (2) no pain in the arms and hands during instrument testing.
Instruments
The instruments used in this study include the following:
1. The Functional Test for Hemiplegic Upper Extremity-Thai version (FTHUE-Thai version): The FTHUE was first developed at Rancho Los Amigos Hospital, USA (Wilson et al., 1984). It aimed to measure the recovery of the hemiplegic UE from non-use to full-hand function (Rowe et al., 2017). The test consists of 18 activities arranged in a hierarchy of seven functional levels based on their degree of difficulty, with a pass-fail grading system within the allotted 3 minutes for each level’s activities (Fong et al., 2004). The difficulty of the activities is related to the motor recovery of the hemiplegic UE, according to Brunnstrom’s criteria (Trombly-Latham, 2008). It takes around 30–45 minutes to complete a single evaluation (Fong et al., 2004; Rowe et al., 2017). Fong et al. (2004) developed the FTHUE-HK to adapt the instrument to suit people in Asian cultures. There are 14 testing activities, which are sequenced into seven levels of difficulty. Scores range from 1 to 7 according to the level of difficulty, and the evaluation takes around 10 minutes to complete. The motor function level of the FTHUE-HK correlated highly and significantly with both the total UE score (r = 0.88, p < .01) and the hand sub-score (r = 0.88, p < .01) of the FMA-UE, indicating good concurrent validity of the FTHUE-HK against the well-established FMA-UE. The FTHUE-HK also demonstrated satisfactory inter-rater agreement on both the testing procedure and its functional levels (r = 0.93, p < .01) (Fong et al., 2004). The test-retest reliability of the instrument, as measured by Spearman’s correlation, was 0.90 (p < .01) (Fong et al., 2004).
In Thailand, researchers have adapted the FTHUE-HK, both the test items and its manual, into the Thai language (Pingmuang et al., 2016). The process involved back-translation, field testing of the pre-final version, and final adjustments. Because the cultural context is similar, the test items and scoring were kept the same. A study of its psychometric properties with 30 stroke participants in Thailand demonstrated that the FTHUE-Thai version has very good internal consistency (r = 0.83, p < .01) and excellent inter-rater reliability (r = 0.96, p < .01) (Pingmuang et al., 2016). Figure 1 shows the test items (a) and the equipment (b) in the FTHUE-Thai version prototype.
Figure 1. The prototype of the FTHUE-Thai version: (a) the test items; (b) the equipment.
2. The Fugl-Meyer Assessment for Upper Extremity (FMA-UE): This instrument is one of the most widely recognized measures of upper extremity motor impairment post-stroke (Gladstone et al., 2002; Woodbury et al., 2007). The FMA-UE has shown excellent inter-rater reliability (Lin et al., 2009; See et al., 2013), moderate to good responsiveness (Arya et al., 2011), and good concurrent validity when compared with similar tests of arm motor functioning (Filiatrault et al., 1991; Fong et al., 2004; Lin et al., 2009; Malouin et al., 1994). FMA-UE is widely used to determine the severity of stroke and quantify recovery (Crow et al., 2014). Both the intra- and inter-rater reliability of the FMA-UE, by means of the intraclass correlation coefficient (ICC), have been demonstrated to be excellent, with reported values above 0.90, both for the total and subscale levels in the chronic and subacute phases (Lin et al., 2009; See et al., 2013). The UE testing in the FMA consists of four parts: (1) arm (shoulder/elbow/forearm); (2) wrist; (3) hand; and (4) coordination/speed. The scorings are administered in ascending numerical order, an order that is believed to follow the sequence of recovery post-stroke. The 33 test items that constitute the motor domain of the FMA-UE are scored on an ordinal scale of 0 (cannot perform), 1 (perform partially), and 2 (perform fully), resulting in a range of possible scores from zero to 66. The FMA-UE was used as a well-established instrument to find its correlation with the FTHUE-Thai version in the present study because both aim to measure motor recovery in the UE of patients after stroke. Although the FMA-UE may focus on motor impairment, studies have shown that functional outcomes after a stroke can be predicted by increases in FMA-UE scores (Krakauer, 2005; Nijland et al., 2010), which is similar to the FTHUE. 
The FMA-UE is recommended for planning treatment and evaluating treatment outcomes in the UE in stroke patients (Lundquist & Maribo, 2017), which is the same as the FTHUE. In addition, the test items in both instruments are ranked in order from nonuse and reflex testing to motor function evaluation from proximal to distal parts in the UE.
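The score range described above (33 items, each scored 0, 1, or 2, for a maximum of 66) can be illustrated with a minimal sketch. It assumes only what the text states; the function name `fma_ue_total` and the example item scores are hypothetical and are not study data.

```python
# Illustrative helper for the FMA-UE motor-domain total.
# Assumption: the 33 item scores arrive as a flat list; subscale
# breakdowns are deliberately not modelled here.
N_ITEMS = 33              # number of motor-domain items stated in the text
VALID_SCORES = {0, 1, 2}  # cannot perform / performs partially / performs fully


def fma_ue_total(item_scores):
    """Sum 33 ordinal item scores into a total between 0 and 66."""
    if len(item_scores) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} items, got {len(item_scores)}")
    if any(score not in VALID_SCORES for score in item_scores):
        raise ValueError("each item must be scored 0, 1, or 2")
    return sum(item_scores)


# Hypothetical participants (invented values, not study data).
full_performance = fma_ue_total([2] * N_ITEMS)                  # ceiling of the scale
partial_recovery = fma_ue_total([2] * 10 + [1] * 13 + [0] * 10)
```

The validation step reflects the ordinal nature of the scale: a value outside 0, 1, or 2 signals a recording error rather than a valid score.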
Procedure
Two research assistants (RAs), both occupational therapists with more than 5 years of experience in stroke rehabilitation, were blinded to the study goals and trained in using the FTHUE-Thai version and the FMA-UE. They collected data from the stroke participants and the healthy adult subjects. The grading criteria of the FTHUE-Thai version and FMA-UE are objective, which limits tester bias when the instruments are administered on two occasions. The RAs collected data for concurrent validity by administering both the FTHUE-Thai version and the FMA-UE to the same participants (15 subjects for each RA). Concurrent validity is the extent to which the target test correlates with a reference standard taken at about the same time (Portney & Gross, 2020). The present study compares a new assessment, the FTHUE-Thai version, with a validated standard tool, the FMA-UE. To examine construct validity, each of the two RAs collected data from 15 stroke participants and 15 healthy subjects. Therefore, a total of 30 stroke survivors and 30 healthy individuals participated in the comparison test of the FTHUE-Thai version using the known group method. The known group approach tests whether the instrument discriminates between subjects who have the trait, here the stroke survivors, and those who do not, here the healthy adult participants. For stability reliability, each RA collected data from 15 stroke participants twice over a 2-week period, and the correlation between the two trials was then calculated. The retest interval followed the recommendation of Tirakanant (2014), who noted that an instrument should be retested within a reasonable interval of 1–2 weeks so that the symptoms or development of the tested person change little as a result of treatment or other factors.
All the participants signed written consent forms prior to participating in the study. This research project was approved by the Human Research Ethics Committee, Faculty of Associated Medical Sciences, Chiang Mai University, Thailand. The project code is AMSEC-64EX-119, and the approval number is 5/2565.
Statistical analysis
Descriptive statistics were used to describe the socio-demographic data of the sample. Spearman’s correlation coefficient was used to analyze the correlation between the FTHUE-Thai version and the motor function domains of the FMA-UE. It was also used to investigate the correlation between the first and second FTHUE-Thai version tests with stroke participants. The scoring agreement between the first and second trials of the FTHUE-Thai version was conducted using a weighted kappa analysis. The Mann-Whitney test was used to compare FTHUE-Thai version scores between stroke patients and healthy adult subjects (construct validity by means of the known group technique). Statistical significance was set at p < .05 for all tests.
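As an illustration of the rank-based correlation described above, the following minimal sketch computes Spearman’s coefficient as the Pearson correlation of average ranks, which accommodates the tied scores that a seven-level ordinal scale produces. The function names and the scores are hypothetical and are not study data.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in rx))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in ry))
    return cov / (sd_x * sd_y)


# Hypothetical paired scores for eight participants (invented values).
fthue_level = [1, 2, 2, 3, 4, 5, 6, 7]          # functional level, 1-7
fma_total = [4, 10, 15, 22, 30, 41, 50, 60]     # FMA-UE total, 0-66
rho = spearman_rho(fthue_level, fma_total)
```

In practice a statistics library would normally be used instead, for example `scipy.stats.spearmanr`, which also returns a p-value.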
Results
Table 1. Sociodemographic Characteristics of Participants at Baseline.
*p < .05.
ᵃ The highest score of the FTHUE-Thai version is 7.
ᵇ The highest score of the FMA-UE is 66.
Table 1 shows that there were no significant differences between the healthy adult subjects and stroke participants in terms of sex (χ² = 0.61, p = .604), age (χ² = 0.13, p = .988), education (χ² = 0.28, p = .871), dominant hand (χ² = 0.10, p = .754), or marital status (χ² = 2.46, p = .293). In contrast, UE motor function at baseline differed significantly between stroke participants and healthy adult subjects, as measured by the FTHUE-Thai version (χ² = 32.85, p < .001) and the FMA-UE (χ² = 60.00, p < .001).
Concurrent validity
Table 2. Correlation Between the FTHUE-Thai Version and the FMA-UE Scores in Stroke Participants (n = 30).
*p < .001.
Table 2 shows that there were significant correlations between the functional levels of the FTHUE-Thai version and the arm sub-score (r = 0.93, p < .001), the hand sub-score (r = 0.84, p < .001), and the total score (r = 0.95, p < .001) of the FMA-UE. These values suggest that the concurrent validity of the FTHUE-Thai version is good.
Construct validity
Table 3. Construct Validity by the Known Group Technique of the FTHUE-Thai Version.
*p < .001.
ᵃ IQR = Interquartile Range.
The FTHUE-Thai version scores were compared between stroke patients and healthy individuals using the Mann-Whitney test. The findings, presented in Table 3, indicated statistically significant differences (p < .001) between the two groups, with a z value of -5.70. The results demonstrate the strong construct validity of the instrument.
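To show where a z value of this kind comes from, the sketch below computes the Mann-Whitney U statistic and its normal-approximation z score with a correction for ties. It is an illustration under stated assumptions: no continuity correction is applied, the function name is hypothetical, and the scores are invented, so the output does not reproduce the z value reported above.

```python
from collections import Counter
from math import erfc, sqrt


def mann_whitney_z(group_a, group_b):
    """Return (U for group_a, z, two-sided p) via the normal approximation.

    Assumptions: average ranks for ties, tie-corrected variance,
    no continuity correction. Both groups must be lists.
    """
    counts = Counter(group_a + group_b)
    rank_of, position = {}, 0
    for value in sorted(counts):
        ties = counts[value]
        # Mean of the ranks position+1 .. position+ties.
        rank_of[value] = position + (ties + 1) / 2
        position += ties

    n1, n2 = len(group_a), len(group_b)
    n = n1 + n2
    u_a = sum(rank_of[v] for v in group_a) - n1 * (n1 + 1) / 2
    tie_term = sum(t ** 3 - t for t in counts.values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (u_a - n1 * n2 / 2) / sqrt(variance)
    p_two_sided = erfc(abs(z) / sqrt(2))
    return u_a, z, p_two_sided


# Hypothetical FTHUE levels (invented values, not study data).
stroke = [1, 2, 2, 3, 3, 4, 5, 5, 6, 7]
healthy = [7] * 10   # healthy adults assumed to reach the top level
u, z, p = mann_whitney_z(stroke, healthy)
```

A negative z here means the first group ranks lower than the second. The tie correction matters for this kind of data because a healthy group clustered at the ceiling creates one very large block of tied ranks.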
Stability reliability
Table 4. Correlation Between the First and Second Trials of the FTHUE-Thai Version (n = 30).
*p < .001.

Figure 2. Scoring agreement between the first and second trials of the FTHUE-Thai version.
The data in Table 4 indicate a significant correlation between the first and second trials of the FTHUE-Thai version, as determined by Spearman’s correlation coefficient (r = 0.98, p < .001). To gather further evidence about the stability reliability of the FTHUE-Thai version, we examined a scatter plot (Figure 2) and conducted a weighted kappa analysis to assess the level of scoring agreement between the two trials.
Figure 2 shows good scoring agreement for the FTHUE-Thai version between the first and second trials. The weighted kappa analysis confirmed this agreement, with a value of 0.94.
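For readers unfamiliar with weighted kappa, the following minimal sketch shows how the statistic penalizes disagreements in proportion to their distance on an ordinal scale. The text does not state whether linear or quadratic weights were used, so the linear default below is an assumption; the function name and the scores are hypothetical and are not study data.

```python
def weighted_kappa(first, second, categories, weight="linear"):
    """Weighted kappa for two sets of ordinal ratings of the same subjects.

    `categories` lists every possible level in ascending order.
    `weight` is "linear" (assumed default) or "quadratic".
    """
    k = len(categories)
    index = {category: i for i, category in enumerate(categories)}
    n = len(first)

    # Observed joint proportions of (first trial, second trial) levels.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(first, second):
        observed[index[a]][index[b]] += 1 / n

    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def disagreement(i, j):
        distance = abs(i - j) / (k - 1)
        return distance if weight == "linear" else distance ** 2

    cells = [(i, j) for i in range(k) for j in range(k)]
    observed_disagreement = sum(disagreement(i, j) * observed[i][j] for i, j in cells)
    expected_disagreement = sum(disagreement(i, j) * row[i] * col[j] for i, j in cells)
    return 1 - observed_disagreement / expected_disagreement


# Hypothetical test and retest levels for eight participants (invented values).
levels = [1, 2, 3, 4, 5, 6, 7]
trial_1 = [1, 2, 3, 4, 5, 6, 7, 3]
trial_2 = [1, 2, 3, 4, 5, 6, 7, 4]   # one participant shifts by one level
kappa = weighted_kappa(trial_1, trial_2, levels)
```

Unlike a correlation coefficient, kappa drops whenever the two trials assign different levels, even if the ordering of participants is preserved, which is why the two statistics are reported together.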
Discussion
The present study examined the psychometric properties of the FTHUE-Thai version in the areas of concurrent validity, construct validity, and stability reliability. Results indicated that the FTHUE-Thai version has good concurrent validity when investigating its correlation with sub-scores and the total score of the standard FMA-UE. In addition, a known group technique test of the FTHUE-Thai version revealed significantly different performance between stroke patients and healthy subjects, indicating good construct validity. The FTHUE-Thai version also demonstrated good stability reliability when calculating its correlation and scoring agreement between the first and second trials in stroke survivors.
According to the study results, there was a significant correlation between functional levels of the FTHUE-Thai version and the arm sub-score, the hand sub-score, and the total score of the FMA-UE (r = 0.93, r = 0.84, and r = 0.95, respectively; p < .001), which indicated good concurrent validity of the FTHUE-Thai version compared with a well-established motor function assessment instrument (Table 2). We selected the FMA-UE as the gold standard for comparison with the FTHUE-Thai version due to its widespread use, validity, and reliability in measuring UE motor function, and the results demonstrated a strong correlation between the two. This suggests that the FTHUE-Thai version offers nearly the same measurement quality as the FMA-UE in Thai clinical settings. The present study is consistent with a study by Fong et al. (2004), which investigated the concurrent validity of the FTHUE-HK with the FMA-UE and found a very good correlation between the two (r = 0.88). In addition, Filiatrault et al. (1991) studied relationships among three UE tests, including the Barthel Index, the FMA-UE, and the FTHUE original version, in 18 hemiplegic patients, which revealed that there was a high correlation between the scores on the FMA-UE and the FTHUE.
Results of the present study also revealed that there was a statistically significant difference (p < .001) in scores of the FTHUE-Thai version test between stroke survivors and healthy subjects (Table 3). This demonstrated that the tool could distinguish between people with disabilities and those who do not have impairments in their arms or hands, which indicates good construct validity of the FTHUE-Thai version. This result is consistent with the results of a study by Rowe et al. (2017), who investigated the FTHUE’s measurement properties and found that scores on this instrument accurately showed how well hemiplegic patients could do simple and complex motor movements for functional tasks. People with lower ability did worse on the items than people with higher ability. However, construct validation should not rest on a single investigation because constructs are not “real”; that is, they are not directly observable and exist only as concepts that are constructed to represent an abstract trait (Portney & Gross, 2020). Therefore, the construct validity of the FTHUE-Thai version should be investigated further by other techniques. One is “convergent validity,” in which two measures believed to reflect the same underlying phenomenon are expected to yield similar results or to be highly correlated (Portney & Gross, 2020). Another is “discriminant validity,” in which measurement tools that assess distinct or contrasting characteristics are expected to yield different results or a low correlation (Portney & Gross, 2020). Such investigations could provide further evidence regarding the credibility of the FTHUE-Thai version in terms of construct validity.
The findings of the present study revealed that the instrument had a high correlation between the first and second trials (r = 0.98, p < .001) (Table 4). In addition, the scatter plot (Figure 2) and the weighted kappa value (0.94) demonstrated very good scoring agreement between the two trials. All these results indicated good stability reliability of the FTHUE-Thai version. Stability reliability determines the ability of an instrument to measure subject performance consistently. It reflects how stable the scores for a stable construct are when obtained from the same person on two or more separate occasions (Portney, 2020). This is very important for measurement tools used in rehabilitation, such as the FTHUE-Thai version, which must detect the real condition of the hemiplegic UE at a designated time so that clinicians can set appropriate goals and plans for treatment.
There are some limitations in the present study. First, we used only the FMA-UE as a standard instrument to determine its correlation with the FTHUE-Thai version. The FMA-UE aims to measure motor recovery in the UE of patients after stroke, similar to the FTHUE. However, the FMA-UE focuses more on assessing body impairment than the FTHUE-Thai version does. A study of concurrent validity with other similar well-established instruments, such as the WMFT and gWMFT, may help further confirm the validity of the FTHUE-Thai version. Another limitation is that the scoring of the FTHUE-Thai version with stroke survivors depended solely on the RAs’ judgment in both the first and second trials, even though they had been trained in using the instrument. Video recording for visual analysis may help improve the test-retest reliability.
Conclusion
Results obtained from the present study substantiated the good psychometric properties of the FTHUE-Thai version in the areas of concurrent validity, construct validity, and stability reliability. An instrument with good concurrent validity is useful when it is being proposed as an alternative to the current gold standard methods. This means that rehabilitation practitioners and occupational therapists in Thailand can use the FTHUE-Thai version as an alternative tool to perform quick screening of UE function in stroke patients in their clinical practices. In addition, construct validity is a very important psychometric property because it reflects the ability of the instrument to measure what it is intended to assess in terms of the theoretical dimensions. The FTHUE-Thai version shows good potential to measure functional motion of the hemiplegic UE. The FTHUE-Thai version also demonstrated good stability reliability, indicating the instrument’s ability to obtain consistent outcomes across administrations of the test. These findings support the appropriateness of the FTHUE-Thai version as a reliable functional assessment tool for clinicians working with hemiplegic UE in stroke survivors in the Thai context.
Footnotes
Acknowledgements
The authors would like to thank all stroke participants of the 5 community rehabilitation centers in Chiang Mai province, Thailand for their involvement in the study.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study was funded by the Faculty of Associated Medical Sciences, Chiang Mai University, Thailand.
