AI Summary
This work proposes a finite state machine-based, voice-guided framework for orchestrating human-robot collaborative workflows. It addresses two challenges in industrial settings: expert knowledge is difficult to scale, and operational quality degrades when procedures vary across operators and sessions. By interpreting spoken intent under explicit state constraints and packaging procedures as modular workflows, the approach establishes an interpretable, reproducible, and cognitively lightweight collaboration paradigm. It also provides unified coordination of heterogeneous resources, including GUI-based software and a collaborative robot. Evaluated in an industrial pilot on impeller blade inspection and repair preparation for directed energy deposition (DED), the system substantially reduces end-to-end process time while preserving repeatability and operational consistency.
Abstract
This paper presents EBuddy, a voice-guided workflow orchestrator for natural human-machine collaboration in industrial environments. EBuddy targets a recurrent bottleneck in tool-intensive workflows: expert know-how is effective but difficult to scale, and execution quality degrades when procedures are reconstructed ad hoc across operators and sessions. EBuddy operationalizes expert practice as a finite state machine (FSM)-driven application that provides an interpretable decision frame at runtime (current state and admissible actions), so that spoken requests are interpreted within state-grounded constraints, while the system executes and monitors the corresponding tool interactions. Through modular workflow artifacts, EBuddy coordinates heterogeneous resources, including GUI-driven software and a collaborative robot, leveraging fully voice-based interaction through automatic speech recognition and intent understanding. An industrial pilot on impeller blade inspection and repair preparation for directed energy deposition (DED), realized by human-robot collaboration, shows substantial reductions in end-to-end process duration across onboarding, 3D scanning and processing, and repair program generation, while preserving repeatability and low operator burden.
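The state-grounded interpretation described in the abstract can be illustrated with a minimal sketch. It is not EBuddy's implementation: the state names, intent names, and the `Workflow` class are hypothetical, and simple keyword matching stands in for the speech recognition and intent understanding components. The point it shows is that an utterance is matched only against the actions admissible in the current state, so an out-of-sequence request is rejected instead of executed.

```python
from dataclasses import dataclass, field

# Hypothetical workflow table: state -> {intent: next_state}.
# Stage names loosely follow the pilot stages named in the abstract.
TRANSITIONS = {
    "onboarding": {"start_scan": "scanning"},
    "scanning": {"process_scan": "processing", "repeat_scan": "scanning"},
    "processing": {"generate_program": "program_generation"},
    "program_generation": {},
}

# Placeholder for intent understanding: keyword cues per intent (assumption).
INTENT_KEYWORDS = {
    "start_scan": ("start", "begin"),
    "process_scan": ("process",),
    "repeat_scan": ("repeat", "again"),
    "generate_program": ("generate", "program"),
}


@dataclass
class Workflow:
    state: str = "onboarding"
    history: list = field(default_factory=list)

    def admissible(self):
        """Return the intents allowed in the current state, sorted by name."""
        return sorted(TRANSITIONS[self.state])

    def handle(self, utterance):
        """Match a transcribed utterance against admissible intents only.

        Returns the matched intent and advances the state, or returns None
        (leaving the state unchanged) when nothing admissible matches.
        """
        words = utterance.lower().split()
        for intent in self.admissible():
            if any(cue in words for cue in INTENT_KEYWORDS[intent]):
                self.history.append((self.state, intent))
                self.state = TRANSITIONS[self.state][intent]
                return intent
        return None
```

In this sketch, a request such as "generate the program" issued during onboarding returns `None` because `generate_program` is not admissible there, while "start the scan" advances the workflow to the scanning state. The `history` list records each accepted transition, which is one simple way to keep executions traceable across operators and sessions.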