Abstract
Introduction
Given the lack of haptic feedback inherent in prosthetic devices, a natural and adaptable feedback scheme must be implemented. While multimodal feedback has proven successful in aiding dexterous performance, it can be mentally taxing on the individual. Conversely, cross-modal schemes relying on sensory substitution have proven to be equally effective in aiding task performance without cognitively burdening the user to the same degree.
Objectives
This experiment investigated the effectiveness of a cross-modal feedback scheme that uses audio feedback to represent prosthetic grasping strength during dynamic control of a prosthetic hand.
Methods
A total of five individuals participated in two sets of experiments (four subjects in the first, one subject in the second). Participants were asked to control the grasping strength exerted by a prosthetic hand while using real-time audio feedback in order to reach up to three different levels of force within a trial set.
Results
The cross-modal feedback scheme successfully provided users with a robust ability to modulate grasping strength in real time using only audio feedback.
Conclusion
Audio feedback effectively conveys haptic information to the user of a prosthetic hand. The training knowledge is retained and can be generalized to perform new (i.e. untrained) tasks.
Introduction
As the usage of advanced upper limb prostheses increases, so do the sophistication and complexity of the devices. An effective and standardized method of providing amputees with real-time, closed-loop control of the device, however, has not yet been commercially implemented. Unlike physiological limbs, prosthetic limbs cannot inherently provide the user with force feedback. The amputee must be actively involved in controlling and manipulating the limb, relying on vision, to accomplish a given task.1–3 Although visual feedback is effective in guiding operation of a prosthetic limb (specifically a hand), it alone cannot provide cutaneous information such as grasp strength. This limitation hinders the ability to fluidly and naturally perform delicate tasks, such as holding a glass while preventing slippage. Neural interfaces for prosthetics are therefore heavily researched: electrode implants and reinnervation,3–6 vibrotactile feedback,7–9 and audio feedback2,10–12 have all been used in place of cutaneous and proprioceptive feedback in order to reduce the limitations of and reliance on visual feedback. Additionally, multimodal schemes have demonstrated success with respect to ease of use, learnability, and performance13 of both simple and complex motor tasks.
However, multimodal feedback is often criticized because individuals may become overburdened by the increased quantity of information that must be processed.14–16 When forced to process too much information, the individual's performance on a given task degrades, reducing the utility of the system and ultimately of the prosthetic device. Cross-modal associations between the visual, haptic,17 and auditory18 senses have been shown to be present in the somatosensory cortex. Given the plasticity of the human brain and its capacity for sensory substitution,19 these cross-modal associations can be exploited for real-time control of prosthetic devices. This paper proposes the implementation of a cross-modal feedback methodology, whereby audio feedback substitutes for force feedback during dynamic force control of a prosthetic hand. Short-term use of the feedback architecture conveys essential information about the prosthesis to the user, promoting efficient and robust control. Additionally, the audio feedback architecture used in this study has previously been shown to help users discriminate between different types of objects, illustrating the adaptability and feasibility of the system.2
Other investigations into audio feedback mechanisms indicate that it is not only an effective means of conveying force feedback to the user of a prosthetic hand, but also provides the user with easily learned and highly adaptable knowledge of the feedback,10–12 augmenting task performance.20
The current study shows that the feedback system is highly effective in assisting grasping tasks, provides users with more stable control, and supports previous research on the feedback method.2 It is hypothesized that the audio feedback architecture is a highly learnable, flexible, and extensible mechanism for conveying the force exerted by a prosthetic hand. Learnability refers to the user's ability to acquire practical knowledge and understanding of the feedback architecture and to improve in performance efficiency during usage; it is assessed by evaluating the completion times of grasping tasks with a prosthetic hand. Flexibility and extensibility refer to the users' ability to extend their knowledge of the feedback architecture to perform unfamiliar tasks; they are assessed by evaluating the users' ability to apply different levels of grasping force on an object while relying solely on the feedback architecture to guide completion of the task.
Methods
Subjects
This study tested five healthy subjects, four of whom were right-handed and one left-handed. Four subjects participated in Experiment 1, and one subject participated in Experiment 2. All subjects were informed of the experimental protocol and gave their informed consent according to the procedures approved by the Arizona State University (ASU) Institutional Review Board (IRB) (protocol: #1201007252).
Feedback architecture
The feedback architecture used in this study is the same as that in Gibson and Artemiadis.2 A Touch Bionics Inc. i-Limb Ultra prosthetic right hand is equipped with a glove containing 20 force-sensing resistors (FSRs) (Flexiforce A301), each 14 mm wide. The sensors are systematically divided into three regions, mapped to audio frequencies of 200, 300, and 400 Hz, respectively, a frequency mapping that has proved successful in previous experiments.2 This multi-frequency mapping was chosen over a single-frequency mapping because it provides an experience closer to that of the human hand: a single-frequency mapping is useful for determining the state of the hand (open, closed, or grasping), whereas the multi-frequency mapping used in these experiments conveys additional information to the user, such as the position of the applied force. The electrical circuit built for the system consists of the force sensors wired in parallel with each other and in series with a terminal resistor. A 5 V supply from an Arduino Mega 2560 microcontroller powers the circuit, and the terminal-resistor voltage is read through the board's analog input port.
Without any force applied, an FSR presents an effectively infinite resistance, creating an open circuit in its branch and causing the voltage drop across the terminal resistor to be zero. Applying force to a sensor decreases the resistance of the FSR linearly with the magnitude of the force; as a result, the voltage across the sensor decreases as the applied force increases. Correspondingly, the voltage drop across the terminal resistor increases, providing a voltage input to the microcontroller that grows with the applied force.
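The sensing chain described above can be sketched in a few lines. This is a minimal illustration only: the terminal-resistor value (`R_TERMINAL`) and the tone-synthesis scheme (amplitude modulation of the region tones) are assumptions, as the paper reports neither; only the 5 V supply, the 10-bit Arduino ADC, and the 200/300/400 Hz region mapping come from the text.

```python
import numpy as np

V_SUPPLY = 5.0      # supply voltage reported in the paper
ADC_MAX = 1023      # 10-bit ADC on the Arduino Mega 2560
R_TERMINAL = 10e3   # assumed terminal resistor value (ohms); not given in the paper

# Region-to-pitch assignment described in the paper.
REGION_FREQS_HZ = {"region_1": 200.0, "region_2": 300.0, "region_3": 400.0}

def fsr_resistance(adc_count):
    """Effective resistance of the FSR network, from the divider relation
    V_term = V_supply * R_term / (R_fsr + R_term).
    Returns infinity when no force is applied (open circuit)."""
    v_term = V_SUPPLY * adc_count / ADC_MAX
    if v_term <= 0.0:
        return float("inf")
    return R_TERMINAL * (V_SUPPLY - v_term) / v_term

def synthesize_feedback(region_levels, duration_s=0.05, sample_rate=8000):
    """Mix one tone per sensor region, with amplitude scaled by the
    normalized force level (0..1) in that region. How the study actually
    modulated the tones is not specified; amplitude modulation is an
    assumption made here for illustration."""
    t = np.arange(int(duration_s * sample_rate)) / sample_rate
    signal = np.zeros_like(t)
    for region, freq in REGION_FREQS_HZ.items():
        signal += region_levels.get(region, 0.0) * np.sin(2 * np.pi * freq * t)
    peak = np.max(np.abs(signal))
    return signal / peak if peak > 0 else signal  # normalize to avoid clipping
```

With no force applied the divider reads zero and `fsr_resistance` reports an open circuit; as force increases, the ADC count rises and the inferred FSR resistance falls, matching the behavior described above.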
Experimental protocol
This study consisted of two primary experiments. The first, Experiment 1, assessed the potential and feasibility of audio feedback as a substitute for haptic feedback. Its purpose was to show that an individual can not only learn to rely on the cross-modal feedback, but can also generalize the acquired knowledge to new and unfamiliar situations, so as to eventually provide the user with the same dynamic control over the prosthesis that they would have over their own hand. Subjects performed the experiment using three different finger combinations: index and thumb (Combination 1); index, middle, and thumb (Combination 2); and all five fingers (Combination 3). These combinations were chosen because they simulate grasping actions the subject is likely to perform daily, such as pinching a pin, grasping a pen, or grasping and holding a bottle. Experiment 2 assessed the ability of the system to be learned over a longer period of time and evaluated the degree to which the subject retained that learning. Throughout both experiments, subjects were asked to achieve three levels of prosthetic grasp strength – low, medium, and high – which translate to readings of approximately 8–18 N, 40–50 N, and 62–66 N, respectively, on the FSR sensors.
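The three target bands above can be captured in a small trial-classification sketch. The paper does not spell out the exact success criterion; the choice here of averaging the final 15% of the trace (mirroring the post-trial prompt described later for the Evaluation Phase) is an assumption made for illustration.

```python
# Target force bands (newtons) reported in the paper.
TARGET_BANDS_N = {"low": (8.0, 18.0), "medium": (40.0, 50.0), "high": (62.0, 66.0)}

def trial_succeeded(force_trace_n, level, final_fraction=0.15):
    """Judge a trial by whether the mean force over the final portion of
    the trace falls inside the prompted target band.

    force_trace_n: sequence of force samples (N) recorded over the trial.
    level: one of "low", "medium", "high".
    final_fraction: portion of the trace averaged; 0.15 is an assumption.
    """
    lo, hi = TARGET_BANDS_N[level]
    n_tail = max(1, int(len(force_trace_n) * final_fraction))
    tail = force_trace_n[-n_tail:]
    mean_force = sum(tail) / len(tail)
    return lo <= mean_force <= hi
```

For example, a trial that settles at roughly 45 N would count as a success for the "medium" band but a failure for "high".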
For this study, the audio feedback architecture (glove) was placed on the i-Limb Ultra prosthetic hand, which was held in a constant position and orientation in space. Thus, the hand could only perform the function of grasping, i.e. opening and closing of the fingers. Throughout both experiments, the subjects were asked to grasp a rigid plastic cylinder mounted to a table within grasping distance of the prosthetic hand. Subjects controlled the opening and closing of the prosthetic hand using a keyboard: one key controlled the opening velocity and another the closing velocity, incrementally increasing or decreasing the velocity with each keystroke. As a result, the force applied during the grasping task increased or decreased proportionally with each keystroke. Either the dominant or the non-dominant hand could be used to operate the keyboard, according to the subject's preference. Subjects interacted with the system through a Graphical User Interface (GUI) developed in the Matlab environment and displayed on a 27-inch monitor (Figure 1). Audio feedback was delivered to the subjects through a pair of Audio-Technica ATH-ANC9 headphones. Figure 1 illustrates this process.
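The keystroke-to-velocity scheme can be sketched as a tiny stateful controller. The step size and velocity limits below are illustrative placeholders; the paper does not report their values.

```python
class GraspController:
    """Keyboard-style grasp control as described in the protocol: one key
    steps the closing velocity up, another steps it down, and the grasp
    force follows the commanded velocity. Step size and limits are
    assumptions, not values from the study."""

    def __init__(self, step=1, v_min=-5, v_max=5):
        self.velocity = 0      # signed velocity: positive = closing
        self.step = step
        self.v_min = v_min
        self.v_max = v_max

    def press_close(self):
        # Each keystroke increments the closing velocity by one step.
        self.velocity = min(self.velocity + self.step, self.v_max)

    def press_open(self):
        # Each keystroke increments the opening velocity by one step.
        self.velocity = max(self.velocity - self.step, self.v_min)
```

Repeated presses of the close key thus ramp the grasp tighter, and repeated presses of the open key ramp it looser, saturating at the limits.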
(a) This schematic represents the different components of the experiment. The subject first sends a command to the PC by pressing a keystroke on the keyboard. The PC interprets the command as either opening or closing the i-Limb, and sends the appropriate command to the i-Limb. The force sensors on the i-Limb capture the applied force (measured in mV), and send both the force reading to the PC and the audio-feedback signal to the subject. The user has visual feedback of the force reading through a graphical user interface (GUI) displayed on the PC monitor. (b) The experimental setup showing the user (top right) controlling the grasping force of the i-Limb robot hand (top left) using the force feedback GUI (bottom left).
Experiment 1
Training Phase: Day 1. The first phase of the experiment was the Training Phase. Subjects performed a total of 540 training trials (60 per force/combination pair). For each set of 60 trials, the subjects were guaranteed to receive both audio and visual feedback of the grasping strength of the i-Limb Ultra for 25% of the trials; the remaining 75% contained either audio-only or the same audio-visual feedback, yielding a total of 12 trials with audio feedback only. The order of these trials was randomly predetermined and identical for all subjects. The duration of each trial was fixed at approximately 14.3 seconds. During training, subjects were prompted with a target force range on the Matlab GUI (Figure 1). Using the real-time visual feedback of the grasping strength from the GUI as a guide, subjects were instructed to control the grasping strength of the i-Limb to reach the target. The training served to introduce the audio feedback to the subjects and to create a cognitive association between the auditory sensations experienced by the subject and the corresponding levels of force exerted by the prosthetic hand. This process is illustrated in Figure 1.

Evaluation Phase: Day 2. The second phase of Experiment 1 evaluated the subjects' learning of the audio feedback system acquired in the Training Phase. Subjects performed a total of 270 trials (90 per combination). Each set of 90 trials contained 30 trials for each level of force, presented in a randomly predetermined order. The levels of force (low, medium, and high) and grasping combinations were the same as those used in the Training Phase. Subjects were prompted with the target force range in the same manner as in the Training Phase but received no visual feedback during the trial. However, after each trial the subjects received a visual prompt of the average of the last 15% (approximately the last 2 s) of their force data. This prompt served to promote knowledge retention.

Generalization Phase: Day 3. The third phase of Experiment 1 required subjects to generalize their knowledge of the audio-feedback mechanism to new levels of force for which they had not previously been trained. Two new force thresholds were selected for this phase. The first, new-low (35–45 mV, translating to approximately 24–34 N), lay between the previous low and medium levels; the second, new-medium (55–65 mV, translating to approximately 53–62 N), lay between the previous medium and high levels. Subjects performed a total of 180 trials (60 per grasping combination). In Phase 3, forces were presented in the same way as in Phase 2, namely in a randomly predetermined order.
Experiment 2
Experiment 2 consisted of two phases: training and evaluation. The experiment took place over six days: Day 1 for the training phase and Days 2–6 for the evaluation phase. Only grasping Combination 3 was used, with all three levels of force (low, medium, high). The duration of each trial was fixed at approximately 14.3 s. The phases were designed and executed in the same manner as the corresponding phases of Experiment 1, with one notable difference: the Training Phase consisted of a single session with a total of 180 trials (60 trials for each of the three levels of force), whereas the Evaluation Phase consisted of 450 total trials divided between five sessions (90 trials per day on five non-consecutive days). The three levels of grasping strength were evenly represented within each session, in a randomized order.
Performance metrics
To statistically evaluate performance, the following criteria were used. All data analysis was performed after applying a moving-average low-pass filter to the force data.
Once each trial was sorted as a success or failure, the response characteristics of the data were calculated, namely the rise and settling times. Here, the rise time is defined as the time from the start of the trial until the applied force reaches 90% of the final recorded force value of that trial (percentage of trial completed = 100) (see Figure 2); in effect, it is the time needed from the start of the trial to reach 90% of the final data point. This interval was chosen to compensate for any adjustments the subjects made during initial grasping that could cause the data to display false positives. The settling time, on the other hand, is the portion of the trial needed for the applied force to reach and stay within a 5% threshold of the final recorded applied force (see Figure 2).
Graphical user interface for the subject. The user is prompted with the force threshold (two horizontal reference lines) and in real time receives feedback of the forces exerted by the prosthetic hand (red solid line). Instances showing the definition of the rise and settling times are marked with dashed vertical lines. The rise time is the time required for the subject to apply force equivalent to 90% of the final recorded applied force (percent of trial completed = 100). The settling time is the time required for the subject to reach and stay within a 5% threshold of the final recorded applied force.
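The rise- and settling-time definitions above can be computed directly from a force trace. This is a sketch: the moving-average span is an arbitrary placeholder, since the span used in the study is not reported, and times are returned as sample indices rather than seconds.

```python
def moving_average(trace, span=5):
    """Simple moving-average low-pass filter. The span of 5 samples is a
    placeholder; the study's actual span is not reported."""
    half = span // 2
    out = []
    for i in range(len(trace)):
        window = trace[max(0, i - half): i + half + 1]
        out.append(sum(window) / len(window))
    return out

def rise_time_index(trace):
    """Index of the first sample reaching 90% of the final recorded value."""
    target = 0.9 * trace[-1]
    for i, value in enumerate(trace):
        if value >= target:
            return i
    return len(trace) - 1

def settling_time_index(trace, threshold=0.05):
    """First index after which the trace stays within +/-5% of its final
    recorded value for the remainder of the trial."""
    final = trace[-1]
    band = threshold * abs(final)
    for i in range(len(trace)):
        if all(abs(v - final) <= band for v in trace[i:]):
            return i
    return len(trace) - 1
```

For a trace that ramps from 0 to 10 and then holds, the rise index falls where the ramp crosses 9.0, and the settling index falls where the trace last enters the 9.5–10.5 band.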
Statistical analysis
Rise and settling times were determined first for all successful trials. These times were pooled together for each day, combination, and subject pairing, respectively. Rise and settling times were analyzed for statistical significance using a two-sample Student's t-test.
For Experiment 1, the baseline dataset was composed of the Day 1 trials with multimodal feedback (audio and visual). The statistical test therefore compared the means of the Day 1 audio-only, Day 2, and Day 3 trials, respectively, against the mean of the Day 1 audio-visual sample to determine whether they are equal.
This analysis provides an assessment of the cross-modal audio feedback.
In Experiment 2, rise and settling times were likewise analyzed using a two-sample Student's t-test, with the Day 1 audio-visual trials serving as the baseline.
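The comparison against the baseline can be sketched with a pooled-variance two-sample t statistic. This is a minimal, dependency-free illustration of the test named above (equal variances assumed); in practice one would use a statistics package and also compute the p-value.

```python
from math import sqrt

def two_sample_t(sample_a, sample_b):
    """Pooled-variance two-sample Student's t statistic, e.g. for
    comparing an evaluation day's rise times (sample_a) against the
    Day 1 audio-visual baseline (sample_b).

    Returns (t statistic, degrees of freedom)."""
    na, nb = len(sample_a), len(sample_b)
    mean_a = sum(sample_a) / na
    mean_b = sum(sample_b) / nb
    # Unbiased sample variances.
    var_a = sum((x - mean_a) ** 2 for x in sample_a) / (na - 1)
    var_b = sum((x - mean_b) ** 2 for x in sample_b) / (nb - 1)
    # Pool the variances, weighted by degrees of freedom.
    pooled = ((na - 1) * var_a + (nb - 1) * var_b) / (na + nb - 2)
    t = (mean_a - mean_b) / sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2
```

The resulting t statistic and degrees of freedom are then checked against the Student's t distribution for significance.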
Results
The experimental data of all subjects were analyzed using the Matlab software package. The performance metrics for the dataset were the rise and settling times, which were compared using the two-sample Student's t-test.

Figure 3. Rise and settling times of Experiment 1 for Subject 1, a representative subject, broken down by the three combinations. L, M, H, NL, and NM denote "low", "medium", "high", "new-low", and "new-medium", respectively. The rise time analysis demonstrates an improvement in task performance over time, indicating that with audio feedback alone individuals can effectively perform the grasping task. (a) Rise times for Combination 1: the averages for the low and new-low force levels lie within ±5% of 12%; for the medium and new-medium levels, within approximately ±2% of 14%; and for the high level, within approximately ±8% of 29%. (b) Settling times for Combination 1: the averages for the low and new-low levels lie within ±20% of 45%; for the medium and new-medium levels, within ±5% of 25%; and for the high level, within ±15% of 55%. (c) Rise times for Combination 2: the averages for the low and new-low levels lie within ±4% of 15%; for the medium and new-medium levels, within ±3% of 12%; and for the high level, within ±5% of 25%. (d) Settling times for Combination 2: the averages for the low and new-low levels lie within ±40% of 60%; for the medium and new-medium levels, within ±15% of 45%; and for the high level, within ±15% of 50%. (e) Rise times for Combination 3: the averages for the low and new-low levels lie within ±3% of 15%; for the medium and new-medium levels, within ±10% of 25%; and for the high level, within ±30% of 70%. (f) Settling times for Combination 3: the averages for the low and new-low levels lie within ±12% of 72%; for the medium and new-medium levels, within ±10% of 25%; and for the high level, within ±8% of 60%.

Table 1. Statistical significance of the trials performed in Experiment 1 for Subject 1, determined using the two-sample Student's t-test.
In Experiment 2, the variability within the sample for each of the rise times decreases significantly from the Day 1 audio-only trials to Day 3 (Figure 4(a)). However, the average rise times do not change significantly across this span. Nevertheless, the Day 3 data resemble the baseline dataset very closely: the average rise time for each of the three force levels lies within ±5% of 20%. Analysis of the settling times on a force-by-force basis yields an overall decreasing trend from the Day 1 audio-only dataset to the Day 6 dataset. This finding supports that of Experiment 1. The Day 2, 3, and 6 settling times of Experiment 2 are statistically significant relative to those of the Day 1 audio-visual trials (Table 2).

Figure 4. (a) Rise times for Experiment 2; (b) settling times for Experiment 2. The average rise times for the low and medium forces across all six days lie within ±5% of 20%; the average rise time for the high force lies within ±8% of 22%. The average settling time for the low force lies within ±10% of 88%, for the medium force within ±20% of 64%, and for the high force within ±10% of 65%.

Table 2. Statistical significance of the trials performed in Experiment 2, determined using the two-sample Student's t-test.
Discussion
This study investigated the use of cross-modal feedback as an efficient and effective mechanism for controlling the forces exerted by a prosthetic hand in real time. The decrease in average rise time throughout the experiment and the statistical significance between the Day 1 audio-visual trials and the Day 3 trials of Experiment 1 support the hypothesis that subjects are able not only to learn the audio-feedback mechanism, using it to perform a basic grasping task for which they have been trained, but also to generalize their knowledge and understanding of the audio feedback to unfamiliar tasks, without any visual cues. This result highlights the extensibility of the feedback method (i.e. subjects are able to adapt and successfully perform a task in real time).
The statistical significance between the rise times of the Day 1 audio-visual and Day 3 trials, together with the low average rise time for Day 3, illustrates the users' confidence in the audio feedback when generalizing their knowledge to untrained tasks. Thus, audio feedback alone is considered to perform comparably to the visual/haptic feedback combination. In Sigrist et al.,21 audio feedback is shown to improve task performance by augmenting other sensory input, such as visual feedback; the results of Experiment 1, however, affirm that audio feedback is also an effective substitute for visual/haptic feedback. Furthermore, the decrease in the variability of the rise times across the Experiment 2 trials confirms the Experiment 1 findings regarding the learnability and extensibility of the feedback architecture, as well as the hypothesis that performance of the grasping task improves over time. The increase in average rise times for Days 4 and 6 is attributed to the seven- and five-day resting periods between sessions, respectively. While subjects reported feeling confident prior to the start of the sessions on Days 4 and 6, they also reported some difficulty in repeating the task. As such, the increase in average rise time is not considered to contradict the hypothesis that task performance improves over time. Experiment 2 also exhibits behavior similar to Experiment 1 with respect to settling time (Figures 3 and 4(b)). Subjects did report difficulty in maintaining the force applied by the i-Limb Ultra prosthetic hand throughout each session, noting the need for adjustments in order to stay within the given level of force for the trial. Although the average settling times are relatively high, Experiment 2 (Figure 4(b)) demonstrates that task performance with respect to settling time is stable. Both experiments do exhibit high variance in rise and settling times.
Subjects reported fatigue over the course of the experiment due to the large number of trials. Consequently, the high variability in response times is attributed to subject fatigue, in addition to the small sample size of the experimental group.
The results of these experiments agree with previous studies on the control of prosthetics using sensory feedback. In An et al.,22 a continued improvement in object-manipulation performance was observed with low standard error, using vibrotactile feedback in place of visual feedback. However, the method proposed here is more practical than vibrotactile feedback, as well as less invasive and obtrusive to the user. Of the combinations used in the study, the third combination, in which all five fingers performed the grasping task, showed reliable results; the other two combinations did not show as strong a learning trend. The phenomena observed in this study support the initial hypothesis that cross-modal feedback is a highly learnable, extensible, and flexible mechanism for conveying grasping force from the prosthetic device to the user. It draws on the neurological principles of sensory substitution and plasticity, providing the individual with highly adaptable knowledge of and familiarity with the feedback system that can be used for dynamic control of a prosthetic device. Further experiments are needed to mitigate the factors, such as subject fatigue, that might lead to high variability in the results. Nevertheless, this paper serves as a proof of concept for the proposed interface, and the experiments conducted and analyzed here support its efficacy.
While the multi-frequency mapping was used to provide a more realistic experience, the subjects did not report noticing or paying significant attention to the different pitches of the feedback while performing the grasping tasks, nor did they report any significant impact of the different frequencies on their task performance. Future studies comparing the performance of multiple subjects using a single-frequency mapping against a multi-frequency mapping, while controlling the force exerted during grasping tasks, would therefore be beneficial.
Conclusion
This paper proposes the implementation of a cross-modal audio-feedback architecture in hand prostheses for dynamic control of grasp strength, highlighting the learnability and adaptability of the feedback mechanism. Experimental results indicate that users are able not only to modulate the grasp strength of the device to achieve familiar contact force levels (i.e. those learned during the training sessions), but also to reach unfamiliar levels. Additionally, the knowledge acquired through an initial training session persists in the individual, although it shows slight deterioration after extended periods of non-use. These findings suggest that such a mechanism can assist practitioners in developing cross-modal rehabilitation techniques for upper-limb control and coordination, informing the design of the next generation of advanced prosthetic devices.
Footnotes
Declaration of conflicting interests
None declared.
Funding
This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
