🤖 AI Summary
Large language models (LLMs) hallucinate severely and lack domain-specific knowledge when generating code or answering questions about reservoir computing (RC).
Method: This paper proposes an interactive programming assistant tailored for the ReservoirPy library, integrating retrieval-augmented generation (RAG) with a domain-specific knowledge graph and performing lightweight fine-tuning on Codestral-22B to enable precise knowledge retrieval and context-aware code generation.
Contribution/Results: To our knowledge, this is the first work to embed a knowledge graph in a RAG framework for RC, significantly mitigating hallucination. Empirical evaluation on ReservoirPy coding and question-answering tasks shows that the assistant outperforms state-of-the-art models such as ChatGPT-4o and NotebookLM on coding tasks, demonstrating both its effectiveness and its potential to generalize to other specialized, niche domains.
📝 Abstract
We introduce a tool designed to improve the capabilities of Large Language Models (LLMs) in assisting with code development using the ReservoirPy library, as well as in answering complex questions in the field of Reservoir Computing. By incorporating external knowledge through Retrieval-Augmented Generation (RAG) and knowledge graphs, our approach aims to reduce hallucinations and increase the factual accuracy of generated responses. The system provides an interactive experience similar to ChatGPT, tailored specifically for ReservoirPy, enabling users to write, debug, and understand Python code while accessing reliable domain-specific insights. In our evaluation, while proprietary models such as ChatGPT-4o and NotebookLM performed slightly better on general knowledge questions, our model outperformed them on coding tasks and showed a significant improvement over its base model, Codestral-22B.
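The core RAG idea described above can be sketched in a heavily simplified form: retrieve the documentation snippets most relevant to a user query and prepend them to the prompt, so the LLM answers from retrieved facts rather than parametric memory. This is only an illustrative toy, not the paper's system: keyword-overlap scoring stands in for the knowledge-graph retrieval, and the documentation snippets and helper names below are invented for the example.

```python
# Toy sketch of retrieval-augmented prompting (illustrative only; the
# paper's system uses a knowledge graph and a fine-tuned Codestral-22B).

def tokenize(text):
    """Lowercase and split text into a set of word tokens."""
    return set(text.lower().split())

def retrieve(query, docs, k=2):
    """Return the k docs sharing the most tokens with the query."""
    q = tokenize(query)
    return sorted(docs, key=lambda d: len(q & tokenize(d)), reverse=True)[:k]

def build_prompt(query, docs):
    """Prepend retrieved context so the model answers from docs, not memory."""
    context = "\n".join(retrieve(query, docs))
    return f"Context:\n{context}\n\nQuestion: {query}"

# Invented snippets standing in for indexed ReservoirPy documentation.
docs = [
    "Reservoir(units) creates a pool of recurrent neurons.",
    "Ridge() is a readout trained with ridge regression.",
    "Connect nodes with >> to build an echo state network.",
]

print(build_prompt("How do I create a reservoir of neurons?", docs))
```

A production version would replace the overlap score with graph-based retrieval over entities and relations (nodes, parameters, usage patterns), which is what lets the system ground answers about a niche library where web-scale pretraining data is sparse.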