🤖 AI Summary
This study examines the challenges blind adults face when entering text non-visually on touchscreen-only smartphones. Through 12 in-depth, semi-structured interviews, it identifies recurring problems: poor dictation accuracy, difficulty entering text in noisy environments, and cumbersome error correction. Participants were also shown an experimental non-visual text input method to probe their willingness to adopt a novel technique; the time required to learn a new method emerged as the largest concern, and most participants reported not using word predictions, finding it faster to finish typing words manually. From these findings, the study distills five directions for future non-visual text input: (1) improved dictation; (2) reduced reliance on, or improved, audio feedback; (3) better error correction; (4) a lower barrier to entry for new input methods; and (5) more fluid non-visual word predictions. The results offer both empirical insight and practical guidance for accessible mobile text entry design.
📝 Abstract
Text input on mobile devices without physical keys can be challenging for people who are blind or have low vision. We interviewed 12 blind adults about their experiences with current mobile text input to gain insight into which interface improvements may be most beneficial. We identify three primary themes in the experiences and opinions shared by participants: the poor accuracy of dictation, difficulty entering text in noisy environments, and difficulty correcting errors in entered text. We also discussed an experimental non-visual text input method with each participant to solicit opinions on the method and probe their willingness to learn a novel technique. We found that the largest concern was the time required to learn a new technique. The majority of our participants did not use word predictions while typing, instead finding it faster to finish typing words manually. Finally, we distill five future directions for non-visual text input: improved dictation, less reliance on or improved audio feedback, improved error correction, a reduced barrier to entry for new methods, and more fluid non-visual word predictions.