🤖 AI Summary
On mobile devices, editing text with large language models (LLMs) disrupts writing continuity due to frequent context switching between the editing interface and a separate chat interface.
Method: This paper proposes an *in-situ semantic-aware touch interaction paradigm* that lets users invoke LLM-based text generation or compression directly within the text area via intuitive gestures (spread-to-generate, pinch-to-shorten), eliminating context switches. The core contribution is a design space mapping touch gestures to LLM-driven text transformations, coupled with three tiers of visual feedback: no visualisation, a text length indicator, and a length-plus-word indicator. The system combines lightweight LLM invocation with real-time gesture recognition.
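The gesture-to-transformation mapping could be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names (`classify_gesture`, `target_length`), the ratio threshold, and the linear length scaling are all assumptions made for the example.

```python
# Hypothetical sketch of a spread/pinch gesture classifier and a rule
# that scales the selected text's target length by gesture magnitude.
# The threshold and scaling are illustrative assumptions.

def classify_gesture(start_span: float, end_span: float,
                     threshold: float = 0.15) -> str:
    """Classify a two-finger gesture by the relative change in finger distance."""
    ratio = end_span / start_span
    if ratio > 1 + threshold:
        return "spread"   # fingers moved apart -> generate/expand text
    if ratio < 1 - threshold:
        return "pinch"    # fingers moved together -> shorten text
    return "none"         # below threshold -> no transformation


def target_length(current_words: int, gesture: str, ratio: float) -> int:
    """Map gesture magnitude to a target word count for the LLM request."""
    if gesture == "spread":
        return round(current_words * max(ratio, 1.0))
    if gesture == "pinch":
        return max(1, round(current_words * min(ratio, 1.0)))
    return current_words
```

The resulting target length could then be shown in the length indicator while the user adjusts the gesture, before any LLM call is issued.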
Results: A user study (N=14) confirms feasibility and usability; the length-plus-word feedback significantly improves generation accuracy and perceived user control, reducing task completion time by 23% and erroneous interactions by 37%.
📝 Abstract
Interacting with Large Language Models (LLMs) for text editing on mobile devices currently requires users to break out of their writing environment and switch to a conversational AI interface. In this paper, we propose to control the LLM via touch gestures performed directly on the text. We first chart a design space that covers fundamental touch input and text transformations. In this space, we then concretely explore two control mappings: spread-to-generate and pinch-to-shorten, with visual feedback loops. We evaluate this concept in a user study (N=14) that compares three feedback designs: no visualisation, text length indicator, and length + word indicator. The results demonstrate that touch-based control of LLMs is both feasible and user-friendly, with the length + word indicator proving most effective for managing text generation. This work lays the foundation for further research into gesture-based interaction with LLMs on touch devices.