Abstract
Voice-enabled technologies such as VoiceOver (screen reader) and the Seeing AI app (image recognition) have revolutionized daily tasks for people with visual disabilities, fostering greater independence and information access. However, a gap remains in understanding the user experience (UX) of these technologies. This study investigated how people with visual disabilities interacted with VoiceOver and the Seeing AI app. A convenience sample of eight participants with visual disabilities engaged in direct observations while using these technologies. The study used the System Usability Scale (SUS) to assess perceived usability and analyzed findings using descriptive statistics. Results indicated a poorer UX with VoiceOver than with the Seeing AI app, with challenges identified in graphical user interfaces (GUIs) and in voice and gesture commands. Relevant recommendations were made to enhance usability. The study emphasizes the need for more intuitive GUIs and optimized voice/gesture interactions for users with visual disabilities.
Introduction
Advancements in technology have greatly improved accessibility for people with visual disabilities across various domains, such as healthcare, education, rehabilitation training, and Internet use (Kim, 2018). Smartphones and voice assistants empower them to communicate, access information, and navigate their surroundings (WebAIM, 2018). AI-powered voice-enabled technologies, such as Microsoft’s Seeing AI, further enhance independence by enabling object and person identification. Despite these advancements, users with visual disabilities still face challenges due to app incompatibility, complex gestures, and poor user interface designs (Celusnak, 2016; Leporini et al., 2012; Wong & Tan, 2012). However, there remains a gap in our understanding of the user experience of voice-enabled technologies among people with visual disabilities. This study aims to address this knowledge gap by directly observing how people with visual disabilities interact with voice-enabled applications (i.e., VoiceOver and the Seeing AI app). Through direct observations, the study can contribute to a deeper understanding of user experiences and the identification of user interface designs in need of improvement.
Method
Participants
The study included a convenience sample of eight participants with visual disabilities who met the inclusion criteria: aged 18 or older and visual acuity worse than 20/70 (World Health Organization, 2019). To minimize potential bias in the data, the study excluded individuals with prior experience using VoiceOver and the Seeing AI app, as their familiarity could influence their interactions with the applications.
Materials
The study used an iPhone 12 mini equipped with the Seeing AI app and VoiceOver. Participants’ interactions with these technologies were recorded using a video camera during instructional sessions. Instructional materials were sourced from the official User Guides for VoiceOver and the Seeing AI app. User experience was assessed using the System Usability Scale (SUS).
Procedure
Participants were introduced to VoiceOver and the Seeing AI app, received instruction on their functionalities, and verbally completed the SUS. Descriptive statistics were applied to analyze their responses. Video recordings of their interactions with the technologies were reviewed to identify specific challenges and successes participants encountered.
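The SUS scores reported below follow the scale’s standard scoring procedure: each of the ten 1–5 Likert responses is converted to a 0–4 contribution (response − 1 for the positively worded odd items, 5 − response for the negatively worded even items), and the summed contributions are multiplied by 2.5 to yield a 0–100 score (Brooke, 1996). A minimal sketch of this computation, with illustrative function and variable names, is:

```python
def sus_score(responses):
    """Compute a System Usability Scale score from ten 1-5 Likert responses.

    Odd-numbered items are positively worded (contribution = response - 1);
    even-numbered items are negatively worded (contribution = 5 - response).
    The summed contributions are multiplied by 2.5 to give a 0-100 score.
    """
    if len(responses) != 10:
        raise ValueError("SUS requires exactly 10 responses")
    total = 0
    for item_number, response in enumerate(responses, start=1):
        if not 1 <= response <= 5:
            raise ValueError("Responses must be on a 1-5 scale")
        if item_number % 2 == 1:  # positively worded item
            total += response - 1
        else:                     # negatively worded item
            total += 5 - response
    return total * 2.5

# Example: a uniformly neutral respondent (all 3s) scores 50.0
print(sus_score([3] * 10))  # → 50.0
```

The per-participant scores produced this way were then summarized with descriptive statistics (means and standard deviations), as reported in the Results.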
Results
SUS
VoiceOver
The mean score of participants’ summed responses was 52.5 ± 16.11, indicating a poor user experience according to Bangor et al. (2009). The mean score for positive items was 3.52 ± 0.52, while for negative items, it was 3.32 ± 1.06. Participants appreciated various features but expressed low confidence due to the perceived complexity of VoiceOver, leading them to believe that they would likely require technical assistance.
Seeing AI App
The mean score of participants’ summed responses was 74.50 ± 12.17, indicating a good user experience (Bangor et al., 2009). The mean score for positive items was 4.24 ± 0.46, indicating participants were mostly satisfied with the app, while for negative items, it was 2.28 ± 1.18, indicating participants were less frustrated.
Observation of User Interactions
VoiceOver
Participants encountered various challenges with hand gesture commands, including scrolling in the wrong direction and difficulty selecting items due to a lack of mental models for spatial layouts. The study also identified issues with graphical user interfaces (GUIs), including irremovable notifications, inconsistent slider directions, and difficulties in distinguishing images/texts from clickable buttons. Challenges related to voice user interfaces (VUIs) were also identified, including unclear messages from the voice assistant, unused wake phrases (e.g., “Hey, Siri”), and unlisted voice commands.
Seeing AI App
In comparison to VoiceOver, the Seeing AI app produced fewer instances of poor user experience; those observed included fingers covering the camera and the camera failing to capture an entire paragraph of text.
Discussion
Our observation of user interactions yielded valuable insights into the challenges participants faced when using VoiceOver and the Seeing AI app. Based on these findings, several recommendations were proposed to enhance the usability of both assistive technologies. To enhance the user experience associated with hand gesture commands, recommendations included providing customizable options for scrolling direction and offering audio feedback for touch interactions. For a better user experience with GUI designs, recommendations included providing alternative options for closing notifications, applying Fitts’ Law principles to reposition GUI components, ensuring consistency in slider gestures, and enhancing visual cues (e.g., color contrast) and button designs. To improve the user experience with VUI designs, recommendations included allowing immediate reiteration of voice commands without repeating the wake phrase and enabling the voice assistant to anticipate or learn new voice commands not initially programmed. Recommendations for the Seeing AI app included implementing alerts, such as beep sounds, to notify users when their fingers are covering the camera, and providing audible notifications when an entire paragraph is not captured.
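For context on the repositioning recommendation above: Fitts’ Law, in its commonly used Shannon formulation (MacKenzie, 1992), predicts the time to acquire a target as

```latex
T = a + b \log_2\!\left(\frac{D}{W} + 1\right)
```

where $T$ is the movement time, $a$ and $b$ are empirically fitted constants, $D$ is the distance to the target, and $W$ is the target’s width. Because larger, closer targets take less time to acquire, placing frequently used GUI components nearer to the user’s resting touch position and enlarging them should reduce selection effort, which is the rationale behind the recommendation.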
The research findings also shed light on limitations of the User Guides. Because the guides consist solely of text and images, they should be offered in alternative formats accessible to users with visual disabilities. Regardless of product quality, inadequate instructions can lead to user dissatisfaction due to difficulties in comprehensively understanding and using the product’s features and functionalities.
Limitations should be noted in interpreting the results. The small sample size restricts generalizability. Additionally, its controlled lab setting may differ from real-world use of voice-enabled technologies. Long-term experiences and learning curves were not explored, potentially affecting usability perceptions. Future research could overcome these limitations by employing larger, diverse samples and investigating user experiences in ecologically relevant settings over extended periods.
Conclusion
The study assessed the usability of voice-enabled technologies for people with visual disabilities. Compared to the Seeing AI app, VoiceOver presented a greater number of challenges. Accordingly, recommendations for improvement were suggested. These design recommendations can lead to more user-friendly and accessible voice-enabled technologies, empowering people with visual disabilities for independent living.
Footnotes
Declaration of Conflicting Interests
The author declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This material is based upon work partially supported by the National Science Foundation under Grant No. 1831969 and 2315735.
