The Protocol for Active Movement Extent Discrimination Assessment (AMEDA) is Reliable When Shortened From 50 to 25 Stimuli to Reduce Testing Fatigue

Abstract

Active movement extent discrimination assessment (AMEDA) is a psychophysical task that evaluates proprioception and tactile acuity of the lower limbs, and it is a method of determining sensorimotor ability. Sensorimotor ability is the ability to judge small differences in movement tasks through the process of receiving sensory messages (sensory input) and producing a response (motor output). Participant attention lapses in prior psychophysical studies have been implicated as a cause for increased measurement variance thresholds in these types of assessments. Since minimizing the time needed for the AMEDA may help to reduce attention lapses, we compared the reliability of the 50-repetition AMEDA protocol with that of a 25-repetition protocol in two athlete groups (Group 1: 19 female football athletes; Group 2: 26 male basketball athletes). We assessed the split-half reliability of these two approaches using the Spearman-Brown Adjusted Pearson correlation (r_full). For each group, we constructed Bland-Altman plots and calculated intraclass correlation coefficients (ICCs) to compare the reliability of the two data sets and determine the 95% confidence intervals. The split-half Spearman-Brown Adjusted Pearson r (r_full) was 0.83 for Group 1 and 0.85 for Group 2. The Bland-Altman plots indicated only a small degree of bias from the zero-difference line, with 95% of the difference points lying within the limits of agreement. For Group 1, the two-way agreement ICC was 0.83 (95% CI 0.54–0.93) and for Group 2, it was 0.85 (95% CI 0.66–0.93). The MDC90 for Group 1 was 0.082 AUC units and for Group 2, it was 0.086 AUC units. For the combined data from Groups 1 and 2, the Bland-Altman plot indicated only a small degree of bias from the zero-difference line, with 95% of the difference points lying within the limits of agreement. The MDC90 for the combined groups was 0.08 AUC units.
The multiple methods from previous research assessing test re-test reliability that we applied to our two data sets indicate that the 25-response AMEDA was a reliable system for evaluating sensorimotor function in the lower limbs and may be an alternative to the more traditional 50-response protocol in which lapses in participant attention from fatigue or other biases may be a concern. A shorter administration of this assessment also has practical advantages in time-restricted athletic screenings.

Keywords

proprioception kinesthesia AMEDA sensorimotor ankle

Introduction

Sensorimotor ability is the ability to judge small differences in movement tasks through the process of receiving sensory messages (sensory input) and producing a response (motor output). Active movement extent discrimination assessment (AMEDA) is a psychophysical task that evaluates proprioception and tactile acuity of the lower limbs, and it is a method of determining sensorimotor ability. The AMEDA device is used for assessing sensorimotor sensitivity to degrees of ankle inversion in the lower limbs (Han et al., 2016). This tool provides ecologically valid measures of lower limb sensorimotor ability, and its outputs have been sensitive to measures of athletic talent in cross-sectional studies (Han et al., 2015a, 2015b). Lower AMEDA scores have been associated with increased probability of a future injury in many populations (Antcliff et al., 2023; Steinberg et al., 2019; Svorai Band et al., 2021). The ankle AMEDA has been shown to have good test-retest and intra-rater reliability in healthy adults (ICC: 0.80) (Witchalls et al., 2014). However, reliability can be affected by factors such as ankle dysfunction; for example, chronic ankle instability reduced the reliability to moderate (ICC: 0.60) (Shi et al., 2023).

In psychophysical research, participant attention lapses have been implicated as a cause for increased measurement variance thresholds (Waskom et al., 2019; Witton et al., 2017). A lapse in attention, or disengagement from the task, may mean that the participant failed to register the extent of stimulus movement and made a random guess that does not reflect their subjective judgement. In these assessments, it is necessary to ensure that a stimulus sequence is sufficiently long to obtain useful data but not so long as to encourage disengagement or inattention (Waskom et al., 2019; Witton et al., 2017). Morgan et al. (2000) indicated that 10–20 trials were sufficient to enable participants to achieve a reliable estimate of an implicit standard in a psychophysical judgement task. The AMEDA test is a psychophysical cognitive-motor assessment that satisfies Pronk et al.’s (2022) review criteria for determination of split-half reliability using the Spearman-Brown Adjusted Pearson r_full (Borg et al., 2022). The AMEDA test requires participants to make judgements about five discrete ankle movement extents presented 10 times each (i.e., 50 responses). Thus, the AMEDA test results can be split to compare two sets of 25 responses each (i.e., half of the original 50-response test), and even 25 trials exceed the 20 trials suggested by Morgan et al. (2000).

The split-half reliability method measures the internal consistency of a test when the two halves of the test can be assumed to measure the same thing (Pronk et al., 2022). Since both halves of the test should measure the same construct at a similar level of precision and difficulty, scores on one half should correlate significantly with scores on the other half. The required sample size for adequate power in the two groups was determined as described by Borg et al. (2022), and previous investigators have demonstrated intraclass correlation coefficients (ICC) for the complete AMEDA to be between 0.8 and 0.89 (Waddington & Adams, 2004; Witchalls et al., 2014). In the current study, we anticipated similar ICC values on test-retest when using each of the two sets of 25 positions in the full test of 50 positions. Using G*Power 3.1 software (RRID:SCR_013726), we conducted an a priori power analysis, assuming the null (0.85) versus the anticipated (0.70) effect size values from Borg et al.’s (2022) study, 80% statistical power, and .05 statistical significance. We obtained an estimated required sample of 38 participants.

We calculated the Spearman-Brown Adjusted Pearson r_full, from the Pearson r derived from each half of the original data sets using the following equation.

$$r_{\mathrm{full}} = \frac{2\, r_{\mathrm{half}}}{1 + r_{\mathrm{half}}}$$

Here, r_full is the Spearman-Brown Adjusted Pearson correlation coefficient for the full 50-response test, and r_half is the split-half Pearson r between the two 25-response halves (Pronk et al., 2022).
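
As a minimal sketch of the Spearman-Brown adjustment described above, the following Python function applies the equation to a split-half Pearson r. The function name `spearman_brown` and the input value are illustrative assumptions and are not taken from the study's analysis.

```python
def spearman_brown(r_half: float) -> float:
    """Step a split-half Pearson r up to the full-length reliability r_full."""
    if r_half <= -1.0:
        raise ValueError("r_half must be greater than -1")
    return (2.0 * r_half) / (1.0 + r_half)


# Hypothetical split-half correlation between first-25 and second-25 AUC scores.
r_half = 0.72
r_full = spearman_brown(r_half)
print(f"r_half = {r_half:.2f} -> r_full = {r_full:.2f}")
```

Because the adjustment always moves a positive r_half toward 1, a split-half correlation of about 0.7 corresponds to an r_full of roughly 0.82.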

The MDC_90, indicating the change in scores needed for 90% confidence that the change in the measurement was not the result of inter-trial variability or measurement error as described by Steffen and Seney (2008), was calculated as follows.

$$\mathrm{MDC}_{90} = \mathrm{SEM} \times \sqrt{2} \times 1.65$$

$$\mathrm{SEM} = \mathrm{sd} \times \sqrt{1 - \mathrm{ICC}}$$

Here, SEM is the standard error of the measurement and sd is the standard deviation of the first measurement (Steffen & Seney, 2008).
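
The two equations above can be sketched in Python as follows. The function names are illustrative assumptions. The inputs are the rounded combined-group values reported later in this article (a standard deviation of 0.09 AUC units for the first 25 responses and an ICC of 0.84), so the output is only an approximate check and not a reproduction of the authors' calculation.

```python
import math


def sem(sd: float, icc: float) -> float:
    """Standard error of the measurement: sd * sqrt(1 - ICC)."""
    return sd * math.sqrt(1.0 - icc)


def mdc90(sd: float, icc: float) -> float:
    """Minimal detectable change at 90% confidence: SEM * sqrt(2) * 1.65."""
    return sem(sd, icc) * math.sqrt(2.0) * 1.65


# Rounded values taken from the article's combined-group results (assumed inputs).
sd_first = 0.09
icc = 0.84
print(f"SEM   = {sem(sd_first, icc):.3f} AUC units")
print(f"MDC90 = {mdc90(sd_first, icc):.3f} AUC units")
```

With these rounded inputs, the SEM is 0.036 and the MDC90 is about 0.084 AUC units, in line with the combined-group value of 0.08 reported in the Results.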

Method

Participants and Ethical Considerations

In this study, we applied test re-test reliability analyses to two de-identified AMEDA data sets previously collected from participants in another validation study, for which participant inclusion had undergone separate ethics approval review (UCHREC Project Number 17–211). We included no individually identifiable data from these prior research participants; all participants were over the age of 18 and had previously provided informed consent for the subsequent use of their data. One of these data sets included 19 female professional football (soccer) athletes (Group 1) and the other included 26 male basketball players (Group 2) competing on a national-level development squad. All athletes were participating in high performance sport at the time of the original collection of the data used for this research.

Procedure

All participants in the earlier research underwent an AMEDA assessment that required them to stand on the device, with the non-tested foot on a fixed plate and the tested foot placed on a movable plate. The participant was asked to make an active movement into mid-range ankle inversion, which rotated the plate and the lateral side of the foot down until the plate movement was arrested at an adjustable metal stop. The device provided five movement stop positions, numbered from 1 to 5 in order of movement displacement from the shallowest rotation position (position 1) to the deepest angle position (position 5). Each participant underwent three familiarization trials, repeating all five available movement extents in sequence. During testing, the participant moved the plate as described above, then returned the plate to the horizontal position and judged the extent of the inversion rotation (i.e., position 1, 2, 3, 4, or 5). The test consisted of a total of 50 ankle movements, with all five movement extents presented to the participant in a randomized sequence (i.e., five positions presented 10 times each in a random order) (Han et al., 2016). The AMEDA was scored by casting the participant’s responses into a five-by-five response matrix and determining an Area Under the Curve (AUC), in which a score of 0.5 was equivalent to chance responding and 1.0 was a perfect score.
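
The text does not spell out how the AUC is derived from the five-by-five response matrix. One plausible reading, assumed here purely for illustration, is a nonparametric pairwise AUC between adjacent stop positions, averaged over the four adjacent pairs. The function names and example matrices below are hypothetical; the sketch only demonstrates the stated anchors of 0.5 for chance responding and 1.0 for a perfect score.

```python
def pairwise_auc(counts_low, counts_high):
    """Nonparametric AUC for two stimuli from their response-category counts.

    Returns P(response to higher stimulus > response to lower stimulus)
    plus half the probability of a tied response.
    """
    n_low, n_high = sum(counts_low), sum(counts_high)
    wins = 0.0
    below = 0  # low-stimulus responses falling in strictly lower categories
    for c_low, c_high in zip(counts_low, counts_high):
        wins += c_high * (below + 0.5 * c_low)
        below += c_low
    return wins / (n_low * n_high)


def mean_adjacent_auc(matrix):
    """Average the pairwise AUC over adjacent stimulus rows (1-2, 2-3, ...)."""
    pairs = zip(matrix[:-1], matrix[1:])
    aucs = [pairwise_auc(low, high) for low, high in pairs]
    return sum(aucs) / len(aucs)


# Rows are the five stop positions; columns are the responses 1 to 5.
perfect = [[10 if r == c else 0 for c in range(5)] for r in range(5)]
guessing = [[2] * 5 for _ in range(5)]  # every response equally likely

print(mean_adjacent_auc(perfect))
print(mean_adjacent_auc(guessing))
```

Under this assumed scheme, a participant who always names the correct position scores 1.0 and a participant who responds uniformly at random scores 0.5.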

Statistical Analyses

For this study, we calculated split-half test-retest reliabilities on the first and second halves of the 50-stimulus test, dividing test scores into scores on the first 25 stimuli and the last 25 stimuli for both participant groups. Histograms were plotted from the data to visually determine the normality of the distributions. To determine any differences in performance on the tests between the first and second half test scores for each group (i.e., learning effects), we computed an analysis of variance (ANOVA). We also assessed split-half reliability using the Spearman-Brown Adjusted Pearson r_full (Pronk et al., 2022). We used Bland-Altman plots to determine relative agreement between each of the two data sets (Giavarina, 2015). The Bland-Altman plot enables evaluation of any bias between the mean differences of two quantitative methods of measurement and provides an estimate of an agreement interval, within which 95% of the differences of the second method, compared to the first one, occur (Giavarina, 2015).
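
The bias and 95% agreement interval that underlie a Bland-Altman plot can be sketched as follows. This is a minimal illustration that assumes the conventional limits of agreement (mean difference ± 1.96 standard deviations of the differences); the function name and the four pairs of AUC scores are hypothetical and are not study data.

```python
import statistics


def bland_altman(first, second):
    """Return (bias, lower limit, upper limit) for paired measurements.

    bias is the mean of the paired differences; the limits of agreement
    are bias -/+ 1.96 times the sample standard deviation of the differences.
    """
    diffs = [a - b for a, b in zip(first, second)]
    bias = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)
    return bias, bias - 1.96 * sd_diff, bias + 1.96 * sd_diff


# Hypothetical AUC scores on the first 25 and second 25 responses.
first_half = [0.70, 0.65, 0.80, 0.72]
second_half = [0.68, 0.67, 0.78, 0.74]

bias, lower, upper = bland_altman(first_half, second_half)
print(f"bias = {bias:.3f}, limits of agreement = [{lower:.3f}, {upper:.3f}]")
```

In a full plot, each participant's difference would be drawn against the mean of their two scores, with horizontal lines at the bias and at the two limits.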

Bland-Altman plots for the split half model provide more information on the differences between the measures, allowing a visual examination of the data and enabling an evaluation of the global agreement between the two measurements. We used an additional check of reliability, intraclass correlation coefficients two way, agreement (ICCs) to assess the reliabilities of the two data sets and determine the 95% confidence intervals of the reliability measure (Koo & Li, 2016). The ICC is sensitive to the extent to which participants keep their ranking order in repeated measurements and may indicate the degree of systematic difference between participants (Liljequist et al., 2019). The ICC is a reliability measure that reflects both the degree of correlation between two measures and the degree of agreement between measurements. Portney (2020) suggested that “Values less than 0.5 are indicative of poor reliability, values between 0.5 and 0.75 indicate moderate reliability, values between 0.75 and 0.9 indicate good reliability, and values greater than 0.90 indicate excellent reliability.” We also determined the Minimal Detectable Change 90 (MDC90) to provide a measure of the degree of change in the AMEDA measure needed to ensure, with 90% certainty, that the change in the AMEDA measurement was greater than the result of inter-trial variability or measurement error (Haley & Fragala-Pinkham, 2006).
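
A two-way agreement ICC can be computed from the mean squares of a participants-by-measurements table. The sketch below assumes the single-measure absolute-agreement form, often written ICC(A,1); the text does not state whether the single-measure or average-measure form was used, and the function name and data are hypothetical.

```python
def icc_agreement(scores):
    """Two-way, absolute-agreement, single-measure ICC.

    scores: one row per participant, each row holding k repeated measurements.
    """
    n, k = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)              # between-participant mean square
    msc = ss_cols / (k - 1)              # between-measurement mean square
    mse = ss_err / ((n - 1) * (k - 1))   # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical first-half and second-half AUC scores for three participants,
# with the second half shifted upward by a constant 0.1.
shifted = [[0.6, 0.7], [0.7, 0.8], [0.8, 0.9]]
print(round(icc_agreement(shifted), 3))
```

In this example the two halves are perfectly correlated, yet the agreement ICC falls well below 1 because the constant shift counts against absolute agreement. This is the sense in which the ICC reflects both correlation and agreement.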

Results

Participants’ demographic descriptors and AMEDA scores are provided in Table 1. For the split-half test re-test method, data from the two separate athlete groups were considered separately in calculations of the Spearman-Brown Adjusted Pearson r (r_full). For Group 1 (19 female professional football athletes) the Spearman-Brown Adjusted Pearson r_full = 0.83 and for Group 2 (26 male basketball national development squad) the Spearman-Brown Adjusted Pearson r_full = 0.85.

Table 1.

Participants’ Demographic Descriptors and AMEDA Scores.

Group	AMEDA first 25, mean (SD) AUC units	AMEDA second 25, mean (SD) AUC units	AMEDA 50, mean (SD) AUC units
Female football athletes, n = 19 (age 18–29)	0.70 (0.09)	0.70 (0.07)	0.71 (0.06)
Male basketball athletes, n = 26 (age 18–22)	0.68 (0.09)	0.69 (0.10)	0.70 (0.06)
Female and male combined group	0.69 (0.09)	0.69 (0.09)	0.71 (0.06)

Using Student’s t-tests, the means of the 25-stimulus protocol for Group 1 (n = 19) and Group 2 (n = 26) were compared with the mean of the 50-stimulus combined group (n = 45) and found to be not significantly different (t(18) = 0.77, p = .87 and t(25) = 0.61, p = .53, respectively). There were no statistically significant differences between the mean AMEDA scores of the female and male athletes in Groups 1 and 2 (Kruskal-Wallis One Way ANOVA, p = .686), and the data sets were combined for further analysis. The combined groups r_full was 0.84, with scores above 0.5 indicating a high degree of test re-test reliability (Cohen, 1992).

The Bland-Altman plots for the split-half model are illustrated in Figures 1–3. The plots for Groups 1 and 2 (Figures 1 and 2) indicate only a small degree of bias from the zero-difference line, with 95% of the difference points lying within the limits of agreement. The Bland-Altman plot for the two groups combined (Figure 3) likewise indicated only a small degree of bias from the zero-difference line, with 95% of the difference points lying within the limits of agreement.

Figure 1.

Group 1 Bland-Altman Plot.

Figure 2.

Group 2 Bland-Altman Plot.

Figure 3.

Combined groups Bland-Altman Plot.

The two-way agreement ICC was 0.83 (95% CI 0.54–0.93) for Group 1 and 0.85 (95% CI 0.66–0.93) for Group 2, indicating good to excellent reliability for both groups. The two-way agreement ICC for the combined groups was 0.84 (95% CI 0.73–0.91), with the 95% CI again indicating a range from good to excellent reliability.

The MDC₉₀ for Group 1 was 0.082 AUC units and for Group 2, 0.086 AUC units. When the groups were combined the MDC₉₀ = 0.08 AUC units, indicating that where there was a change of greater than or equal to 0.08 AUC units on an individual’s AMEDA score we can be 90% certain that the change in the measurement was not the result of inter-trial variability or measurement error.

Discussion

In this study, we sought to determine whether the use of a 25-response sequence for the AMEDA protocol would result in reduced reliability of the outcome score for assessing sensorimotor acuity when compared with the full 50-response sequence AMEDA. Using a 50-response protocol, previous investigators have reported that the ICCs for the AMEDA AUCs across two repetition trials reflected “good to high” Test 1 to Test 2 reliability (ICC = .89); across three repetition trials, reliability was “good” (ICC = .80; 95% CI 0.69 to 0.87, p < .001) with a maximum Standard Error of 0.014 (0.012–0.014) (Waddington & Adams, 2004; Witchalls et al., 2014). When we separated the 50-response AMEDA test into a first half of 25 and a second half of 25 responses, we found good to excellent reliability between the first and second half measures for both groups, indicating that the 25-response protocol could be used in place of the 50-response AMEDA test.

Based on these results, the 50-response sequence AMEDA protocol can be replaced by a 25-response sequence protocol. The 50-response sequence takes approximately 5 minutes of test administration time, and the 25-response sequence halves this. In time-poor environments, such as athletic health assessments or busy clinical settings, this finding enables reliable, faster assessment of sensorimotor ability. Previous studies using the AMEDA have shown that physical fatigue reduced participants’ scores (Steinberg et al., 2023a, 2023b), and that scores were linked to higher cognitive function (Antcliff et al., 2021). We undertook the present study because shortening the protocol has the potential to reduce physical fatigue and cognitive load, mitigating these fatigue influences and enabling testing of proprioception in a broader range of testing contexts.

Our finding of no differences in mean AMEDA performance scores between the male and female athlete groups suggests that the proprioceptive and tactile acuity measured with the AMEDA assessment task is a fundamental aspect of sensorimotor performance and is essentially the same in males and females. This makes sense if we consider that sensorimotor ability is less likely to be affected by factors such as muscle mass-related strength, power, or body size that usually separate males and females on conventional motor skill tasks (Drinkwater et al., 2008; Yanovich et al., 2008).

In terms of the MDC₉₀, where there is a change of greater than or equal to 0.08 AUC units on an individual’s AMEDA score, we can be 90% certain that the change in the measurement was not the result of inter-trial variability or measurement error. This represented approximately 12% of the raw AMEDA scores, and this finding is compatible with the values seen for the y-balance test and the balance-error scoring system (Amin et al., 2014; Foldager et al., 2023). As an example of the value of this level of sensitivity, all three tools have demonstrated discriminative ability between healthy participants and those with chronic ankle instability (Hertel et al., 2006; Linens et al., 2014; Steinberg et al., 2023a, 2023b).

Limitations and Directions for Further Research

This study was based on samples of young adult male and female athletes. It remains to be seen whether these data would generalize to other populations, such as non-athletes, adolescents, older adults, or those with ankle instability or other disabilities.

Conclusion

The multiple methods of assessing test re-test reliability applied to our two data sets as obtained from previous research indicate that the 25-response AMEDA was a reliable system for evaluating sensorimotor function in the lower limbs and may be an alternative to the more traditional 50-response protocol where lapses in participant attention from fatigue or other biases may be a concern. There are practical advantages to a shorter administration of this assessment for environments where time available for assessment is limited.

Footnotes

Declaration of Conflicting Interests

Professor Gordon Waddington is a founding shareholder in Prism Neuro Pty Ltd

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

ORCID iDs

Gordon Waddington

Jeremy Witchalls

Author Biographies

Professor Gordon Waddington is UC AIS Professor of Sports Medicine in the UC Research Institute for Sport and Exercise and Professor of Physiotherapy in the Faculty of Health. He gained his PhD from the University of Sydney in 2000. He has published more than 100 research papers in peer reviewed journals and is currently Editor in Chief of the Journal of Science and Medicine in Sport and on the review boards of the Australian Journal of Physiotherapy, Experimental Brain Research, British Journal of Sports Medicine, Perceptual and Motor Skills, Physical Therapy in Sport and Physiotherapy and the Grants Review Panel of the Physiotherapy Research Foundation and is a NHMRC Project Grant reviewer. His major research interest is in the assessment of human somatosensory function and its relationship to movement performance and the impact of injury and rehabilitation.

Associate Professor Jeremy Witchalls has been a clinical physiotherapist since 1989, with particular experience in musculoskeletal injuries. He first qualified in England and worked in the NHS before moving to Australia, after which he worked in private practice and for the Australian Defence Force for 10 years. He teaches physiotherapy. His research interests include the role of somatosensory perception (proprioception) in injury and performance (the subject of his PhD and ongoing research), clinical interventions for musculoskeletal performance and injury, injury surveillance and prevention, and military human performance.

References

Amin, D. J., Coleman, & Herrington, L. C. (2014). The test-retest reliability and minimal detectable change of the balance error scoring system. Journal of Sports Sciences, 2(2014), 200–207. https://doi.org/10.17265/2332-7839/2014.04.005

Antcliff, Welvaert, Witchalls, Wallwork, S. B., & Waddington (2021). Assessing proprioception in an older population: Reliability of a protocol based on active movement extent discrimination. Perceptual and Motor Skills, 128(5), 2075–2096. https://doi.org/10.1177/00315125211029906

Antcliff, S. R., Witchalls, J. B., Wallwork, S. B., Welvaert, & Waddington, G. S. (2023). Developing a multivariate prediction model of falls among older community-dwelling adults using measures of neuromuscular control and proprioceptive acuity: A pilot study. Australasian Journal on Ageing, 42(3), 463–471. https://doi.org/10.1111/ajag.13191

Borg, Bach, O'Brien, & Sainani (2022). Calculating sample size for reliability studies. Physical Medicine and Rehabilitation, 14(8), 1018–1025. https://doi.org/10.1002/pmrj.12850

Cohen (1992). A power primer. Psychological Bulletin, 112(1), 155–159. https://doi.org/10.1037//0033-2909.112.1.155

Drinkwater, E. J., Pyne, D. B., & McKenna, M. J. (2008). Design and interpretation of anthropometric and fitness testing of basketball players. Sports Medicine, 38(7), 565–578. https://doi.org/10.2165/00007256-200838070-00004

Foldager, F. N., Aslerin, Bækdahl, Tønning, L. U., & Mechlenburg (2023). Interrater, test-retest reliability of the Y balance test: A reliability study including 51 healthy participants. International Journal of Exercise Science, 16(4), 182–192.

Giavarina (2015). Understanding Bland Altman analysis. Biochemia Medica, 25(2), 141–151. https://doi.org/10.11613/BM.2015.015

Haley & Fragala-Pinkham (2006). Interpreting change scores of tests and measures used in physical therapy. Physical Therapy, 86(5), 735–743.

Han, Anson, Waddington, Adams, & Liu (2015b). The role of ankle proprioception for balance control in relation to sports performance and injury. BioMed Research International, 2015, 842804. https://doi.org/10.1155/2015/842804

Han, Waddington, Adams, Anson, & Liu (2016). Assessing proprioception: A critical review of methods. Journal of Sport and Health Science, 5(1), 80–90. https://doi.org/10.1016/j.jshs.2014.10.004

Han, Waddington, Anson, & Adams (2015a). Level of competitive success achieved by elite athletes and multi-joint proprioceptive ability. Journal of Science and Medicine in Sport, 18(1), 77–81. https://doi.org/10.1016/j.jsams.2013.11.013

Hertel, Braham, R. A., Hale, S. A., & Olmsted-Kramer, L. C. (2006). Simplifying the star excursion balance test: Analyses of subjects with and without chronic ankle instability. Journal of Orthopedic & Sports Physical Therapy, 36(3), 131–137. https://doi.org/10.2519/jospt.2006.36.3.131

Koo & Li (2016). A guideline of selecting and reporting intraclass correlation coefficients for reliability research. Journal of Chiropractic Medicine, 15(2), 155–163. https://doi.org/10.1016/j.jcm.2016.02.012

Liljequist, Elfving, & Skavberg Roaldsen (2019). Intraclass correlation - a discussion and demonstration of basic features. PLoS One, 14(7), Article e0219854. https://doi.org/10.1371/journal.pone.0219854

Linens, S. W., Ross, S. E., Arnold, B. L., Gayle, & Pidcoe (2014). Postural-stability tests that identify individuals with chronic ankle instability. Journal of Athletic Training, 49(1), 15–23. https://doi.org/10.4085/1062-6050-48.6.09

Morgan, M. J., Watamaniuk, S. N., & McKee, S. P. (2000). The use of an implicit standard for measuring discrimination thresholds. Vision Research, 40(17), 2341–2349. https://doi.org/10.1016/s0042-6989(00)00093-6

Portney, L. G. (2020). Foundations of clinical research: Applications to evidence-based practice (4th ed.). F. A. Davis.

Pronk, Molenaar, Wiers, & Murre (2022). Methods to split cognitive task data for estimating split-half reliability: A comprehensive review and systematic assessment. Psychonomic Bulletin & Review, 29(1), 44–54. https://doi.org/10.3758/s13423-021-01948-3

Shi, Ganderton, Tirosh, Adams, El-Ansary, & Han (2023). Test-retest reliability of ankle range of motion, proprioception, and balance for symptom and gender effects in individuals with chronic ankle instability. Musculoskeletal Science and Practice, 66, 102809. https://doi.org/10.1016/j.msksp.2023.102809

Steffen & Seney (2008). Test-retest reliability and minimal detectable change on balance and ambulation tests, the 36-item short-form health survey, and the unified Parkinson disease rating scale in people with parkinsonism. Physical Therapy, 88(6), 733–746. https://doi.org/10.2522/ptj.20070214

Steinberg, Adams, Ayalon, Dotan, Bretter, & Waddington (2019). Recent ankle injury, sport participation level, and tests of proprioception. Journal of Sport Rehabilitation, 28(8), 824–830. https://doi.org/10.1123/jsr.2018-0164

Steinberg, Elias, Zeev, Witchalls, & Waddington (2023a). The function of the proprioceptive, vestibular and visual systems following fatigue in individuals with and without chronic ankle instability. Perceptual and Motor Skills, 130(1), 239–259. https://doi.org/10.1177/00315125221128634

Steinberg, Elias, Zeev, Witchalls, & Waddington (2023b). Another look at fatigued individuals with and without chronic ankle instability: Posturography and proprioception. Perceptual and Motor Skills, 130(1), 260–282. https://doi.org/10.1177/00315125221134153

Svorai Band, Pantanowitz, Funk, Waddington, & Steinberg (2021). Factors associated with musculoskeletal injuries in an infantry commander’s course. The Physician and Sports Medicine, 49(1), 81–91. https://doi.org/10.1080/00913847.2020.1780098

Waddington, G. S., & Adams, R. D. (2004). The effect of a 5-week wobble-board exercise intervention on ability to discriminate different degrees of ankle inversion, barefoot and wearing shoes: A study in healthy elderly. Journal of the American Geriatrics Society, 52(4), 573–576. https://doi.org/10.1111/j.1532-5415.2004.52164.x

Waskom, M. L., Okazawa, & Kiani (2019). Designing and interpreting psychophysical investigations of cognition. Neuron, 104(1), 100–112. https://doi.org/10.1016/j.neuron.2019.09.016

Witchalls, Waddington, Adams, & Blanch (2014). Chronic ankle instability affects learning rate during repeated proprioception testing. Physical Therapy in Sport, 15(2), 106–111. https://doi.org/10.1016/j.ptsp.2013.04.002

Witton, Talcott, J. B., & Henning, G. B. (2017). Psychophysical measurements in children: Challenges, pitfalls, and considerations. PeerJ, 5, Article e3231. https://doi.org/10.7717/peerj.3231

Yanovich, Evans, Israeli, Constantini, Sharvit, Merkel, Epstein, & Moran, D. S. (2008). Differences in physical fitness of male and female recruits in gender-integrated army basic training. Medicine & Science in Sports & Exercise, 40(11 Suppl), S654–S659. https://doi.org/10.1249/MSS.0b013e3181893f30