🤖 AI Summary
In STEM+C education, pedagogical agents built on large language models (LLMs) suffer from retrieval failures and hallucinations in RAG pipelines because student utterances are semantically sparse. To address this, we propose Log-Contextualized RAG (LC-RAG), the first approach to jointly embed multimodal interaction logs from the learning environment (collaborative modeling actions, tool usage, and dialogue history) into the RAG retrieval process, enabling dynamic semantic alignment between student discourse and domain knowledge. LC-RAG combines dialogue modeling and log representation learning within a collaborative computational modeling environment (XYZ) to generate context-aware, personalized feedback. Experiments on collaborative modeling tasks show that LC-RAG improves retrieval accuracy by 32% over a dialogue-only RAG baseline. It also improves the relevance, credibility, and cognitive support of the collaborative peer agent Copa, demonstrating robust performance in authentic pedagogical settings.
📝 Abstract
Collaborative dialogue offers rich insights into students' learning and critical thinking, and these insights are essential for adapting pedagogical agents to students' learning and problem-solving skills in STEM+C settings. While large language models (LLMs) facilitate dynamic pedagogical interactions, potential hallucinations can undermine confidence, trust, and instructional value. Retrieval-augmented generation (RAG) grounds LLM outputs in curated knowledge, but its effectiveness depends on clear semantic links between user input and a knowledge base, and these links are often weak in student dialogue. We propose log-contextualized RAG (LC-RAG), which enhances RAG retrieval by incorporating environment logs to contextualize collaborative discourse. Our findings show that LC-RAG improves retrieval over a discourse-only baseline and allows our collaborative peer agent, Copa, to deliver relevant, personalized guidance that supports students' critical thinking and epistemic decision-making in a collaborative computational modeling environment, XYZ.
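To make the core idea concrete, the following is a minimal sketch of log-contextualized retrieval, not the authors' implementation. It assumes a toy knowledge base, a hypothetical log schema with `action` and `target` fields, and a bag-of-words cosine similarity standing in for whatever learned embeddings LC-RAG actually uses. All names (`KNOWLEDGE_BASE`, `contextualize`, `retrieve`) are invented for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical knowledge base; entries are invented for illustration only.
KNOWLEDGE_BASE = {
    "loop_structure": "use a repeat loop to run the simulation step many times",
    "position_update": "update position each step by adding velocity times delta time",
    "velocity_update": "update velocity each step by adding acceleration times delta time",
}


def bag_of_words(text):
    """Lowercase the text and count its alphabetic tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, kb=KNOWLEDGE_BASE):
    """Return the id of the best-matching entry, or None if nothing overlaps."""
    q = bag_of_words(query)
    scores = {doc_id: cosine(q, bag_of_words(text)) for doc_id, text in kb.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def contextualize(utterance, log_events):
    """Append a plain-text rendering of recent log events to the utterance.

    The event schema (action/target) is an assumption, not the paper's format.
    """
    log_text = " ".join(f"{e['action']} {e['target']}" for e in log_events)
    return f"{utterance} {log_text}"


# A semantically sparse student utterance and the (hypothetical) recent logs.
utterance = "why is it not working"
logs = [
    {"action": "edit block", "target": "velocity"},
    {"action": "set", "target": "acceleration"},
]

discourse_only = retrieve(utterance)
log_contextualized = retrieve(contextualize(utterance, logs))

print(discourse_only)       # None
print(log_contextualized)   # velocity_update
```

In this toy setting the utterance alone shares no terms with the knowledge base, so discourse-only retrieval finds nothing, while the same utterance augmented with log context retrieves the entry about velocity updates. A real system would replace the token-overlap score with dense embeddings and would need to decide how many recent log events to include.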