Question Answering for Multi-Release Systems: A Case Study at Ciena

📅 2026-01-05
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge of reduced question-answering accuracy in multi-version software systems, where documentation across versions is highly similar yet contains subtle differences that confuse existing QA systems. To tackle this issue, the authors propose QAMR, a novel chatbot that introduces a retrieval-augmented generation (RAG) framework specifically tailored for multi-version documentation. The framework incorporates a dual-chunking strategy—optimizing chunks separately for retrieval and generation—along with query rewriting and context selection mechanisms. Evaluated on both real-world industrial data and public benchmarks, QAMR achieves a question-answering accuracy of 88.5% and a retrieval accuracy of 90%, representing improvements of 16.5% and 12% over baseline methods, respectively, while also reducing response time by 8%.

📝 Abstract
Companies regularly have to contend with multi-release systems, where several versions of the same software are in operation simultaneously. Question answering over documents from multi-release systems poses challenges because different releases have distinct yet overlapping documentation. Motivated by the observed inaccuracy of state-of-the-art question-answering techniques on multi-release system documents, we propose QAMR, a chatbot designed to answer questions across multi-release system documentation. QAMR enhances traditional retrieval-augmented generation (RAG) to ensure accuracy in the face of highly similar yet distinct documentation for different releases. It achieves this through a novel combination of pre-processing, query rewriting, and context selection. In addition, QAMR employs a dual-chunking strategy to enable separately tuned chunk sizes for retrieval and answer generation, improving overall question-answering accuracy. We evaluate QAMR using a public software-engineering benchmark as well as a collection of real-world, multi-release system documents from our industry partner, Ciena. Our evaluation yields five main findings: (1) QAMR outperforms a baseline RAG-based chatbot, achieving an average answer correctness of 88.5% and an average retrieval accuracy of 90%, which correspond to improvements of 16.5% and 12%, respectively. (2) An ablation study shows that QAMR's mechanisms for handling multi-release documents directly improve answer accuracy. (3) Compared to its component-ablated variants, QAMR achieves a 19.6% average gain in answer correctness and a 14.0% average gain in retrieval accuracy over the best ablation. (4) QAMR reduces response time by 8% on average relative to the baseline. (5) The automatically computed accuracy metrics used in our evaluation strongly correlate with expert human assessments, validating the reliability of our methodology.
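The dual-chunking idea from the abstract (separately tuned chunk sizes for retrieval and for answer generation) can be illustrated with a minimal sketch: score small chunks against the query, but hand the larger parent chunk they came from to the generator. All names, chunk sizes, and the word-overlap scorer below are illustrative assumptions, not QAMR's actual implementation.

```python
from dataclasses import dataclass

@dataclass
class Chunk:
    text: str    # small chunk, tuned for retrieval precision
    parent: str  # larger enclosing chunk, tuned for generation context

def make_chunks(doc: str, gen_size: int = 80, ret_size: int = 20) -> list[Chunk]:
    """Split doc into large generation chunks, then each into small retrieval chunks."""
    words = doc.split()
    chunks = []
    for i in range(0, len(words), gen_size):
        parent = " ".join(words[i:i + gen_size])
        pw = parent.split()
        for j in range(0, len(pw), ret_size):
            chunks.append(Chunk(" ".join(pw[j:j + ret_size]), parent))
    return chunks

def retrieve_context(query: str, chunks: list[Chunk], k: int = 1) -> list[str]:
    """Score small chunks by word overlap with the query; return their parent chunks."""
    q = set(query.lower().split())
    scored = sorted(chunks,
                    key=lambda c: len(q & set(c.text.lower().split())),
                    reverse=True)
    parents: list[str] = []
    for c in scored:
        if c.parent not in parents:
            parents.append(c.parent)
        if len(parents) == k:
            break
    return parents
```

A production system would replace the overlap score with dense-vector similarity and add the query rewriting and context selection steps the paper describes; the point here is only the retrieval-chunk-to-generation-chunk mapping.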
Problem

Research questions and friction points this paper is trying to address.

multi-release systems
question answering
software documentation
versioned documentation
retrieval-augmented generation

Innovation

Methods, ideas, or system contributions that make the work stand out.

multi-release systems
retrieval-augmented generation
dual-chunking
query rewriting
context selection
Parham Khamsepour
University of Ottawa, 800 King Edward Avenue, Ottawa ON K1N 6N5, Canada
Mark Cole
Ciena Corp, 7035 Ridge Road, Hanover, MD 21076, USA
Ish Ashraf
Ciena Corp, 7035 Ridge Road, Hanover, MD 21076, USA
Sandeep Puri
Ciena Corp, 7035 Ridge Road, Hanover, MD 21076, USA
M. Sabetzadeh
University of Ottawa, 800 King Edward Avenue, Ottawa ON K1N 6N5, Canada
Shiva Nejati
EECS / University of Ottawa
Software Engineering · Software Testing · SBSE · AI4SE · model-driven engineering