🤖 AI Summary
This work addresses the inefficiency of annotating information retrieval datasets with generic, off-the-shelf tools, which struggle to meet the growing demand for high-quality question-answering data driven by large language models (LLMs) and retrieval-augmented generation (RAG). The authors propose AIANO, a human-AI collaborative annotation tool that combines LLM suggestions, an interactive interface, and a RAG-oriented workflow while leaving annotators in full control of every annotation decision. In a within-subject user study (n = 15), AIANO nearly doubled annotation speed compared to a baseline tool, was easier to use, and improved retrieval accuracy.
📝 Abstract
The rise of Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG) has rapidly increased the need for high-quality, curated information retrieval datasets. These datasets, however, are currently created with off-the-shelf annotation tools that make the annotation process complex and inefficient. To streamline this process, we developed AIANO, a specialized annotation tool. By adopting an AI-augmented annotation workflow that tightly integrates human expertise with LLM assistance, AIANO enables annotators to leverage AI suggestions while retaining full control over annotation decisions. In a within-subject user study ($n = 15$), participants created question-answering datasets using both a baseline tool and AIANO. AIANO nearly doubled annotation speed compared to the baseline while being easier to use and improving retrieval accuracy. These results demonstrate that AIANO's AI-augmented approach accelerates and enhances dataset creation for information retrieval tasks, advancing annotation capabilities in retrieval-intensive domains.