Identifying the Desired Word Suggestion in Simultaneous Audio

📅 2025-01-17
🤖 AI Summary
This study addresses how to present spoken word suggestions for non-visual text input quickly without hurting recognition. It evaluates playing the suggestions as simultaneous voices rather than one after another. Across two perceptual studies, accuracy drops significantly with each added suggestion when the voices overlap. Staggering the word onsets by 150 ms, however, lets two overlapping suggestions be identified with 84% accuracy, not significantly different from sequential playback (86%), while presenting the two suggestions 32% faster. The word sets simulated the suggestions a predictive keyboard would offer during real-world text input. The results indicate that a slight onset asynchrony shortens presentation time without a significant loss of accuracy, which is relevant to accessible, audio-assisted input systems.

📝 Abstract
We explore a method for presenting word suggestions for non-visual text input using simultaneous voices. We conduct two perceptual studies and investigate the impact of different presentations of voices on a user's ability to detect which voice, if any, spoke their desired word. Our sets of words simulated the word suggestions of a predictive keyboard during real-world text input. We find that when voices are simultaneous, user accuracy decreases significantly with each added word suggestion. However, adding a slight 0.15 s delay between the start of each subsequent word allows two simultaneous words to be presented with no significant decrease in accuracy compared to presenting two words sequentially (84% simultaneous versus 86% sequential). This allows two word suggestions to be presented to the user 32% faster than sequential playback without decreasing accuracy.
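The timing argument in the abstract can be made concrete with a small sketch. It is an illustration only, not the authors' implementation: the function names, the 0.5 s word durations, and the zero gap between sequential words are all assumptions, so the saving it computes will not necessarily match the 32% reported for the study's actual stimuli. It compares total presentation time for sequential playback against playback where each subsequent word starts a fixed delay after the previous one, and shows how two mono signals could be summed with that staggered onset.

```python
def sequential_duration(durations, gap=0.0):
    """Total time when words play one after another (gap is a hypothetical pause)."""
    return sum(durations) + gap * max(len(durations) - 1, 0)


def staggered_duration(durations, onset_delay=0.15):
    """Total time when word i starts i * onset_delay seconds after the first word."""
    return max((i * onset_delay + d for i, d in enumerate(durations)), default=0.0)


def time_saved_fraction(durations, onset_delay=0.15, gap=0.0):
    """Fraction of the sequential presentation time saved by staggered overlap."""
    seq = sequential_duration(durations, gap)
    stag = staggered_duration(durations, onset_delay)
    return (seq - stag) / seq


def mix_staggered(signals, sample_rate, onset_delay=0.15):
    """Sum mono signals (lists of floats), offsetting each by onset_delay seconds."""
    offset = round(onset_delay * sample_rate)  # onset delay in samples
    length = max((i * offset + len(s) for i, s in enumerate(signals)), default=0)
    mixed = [0.0] * length
    for i, signal in enumerate(signals):
        start = i * offset
        for j, sample in enumerate(signal):
            mixed[start + j] += sample
    return mixed


# Hypothetical word lengths in seconds; real spoken words vary in duration.
word_durations = [0.5, 0.5]
saved = time_saved_fraction(word_durations, onset_delay=0.15)
print(f"sequential: {sequential_duration(word_durations):.2f} s")
print(f"staggered:  {staggered_duration(word_durations):.2f} s")
print(f"time saved: {saved:.0%}")
```

With more than two words the same delay applies between each consecutive pair of onsets, which matches the abstract's description of a delay "between the start of each subsequent word"; the abstract reports the no-significant-loss result only for two words.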

Problem

Research questions and friction points this paper is trying to address.

Multi-audio Environment
Auditory Feedback
Typing Accuracy and Efficiency

Innovation

Methods, ideas, or system contributions that make the work stand out.

Simultaneous Multi-voice Presentation
Staggered Onset Delay (0.15 s)
Faster Presentation of Word Suggestions