Rate-Distortion Guided Knowledge Graph Construction from Lecture Notes Using Gromov-Wasserstein Optimal Transport

📅 2025-11-18
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the challenge of generating high-quality AI-assisted assessment items from unstructured lecture notes in educational settings, this paper proposes a knowledge graph (KG) optimization framework grounded in rate-distortion theory. Methodologically, we introduce Fused Gromov–Wasserstein optimal transport into KG modeling to quantify semantic structural similarity within measure spaces, and jointly optimize semantic embeddings with five refinement operations—addition, merging, splitting, deletion, and relinking—to achieve Pareto-optimal trade-offs between KG size (rate) and semantic fidelity (distortion). The resulting compact KG exhibits an interpretable rate-distortion curve. Empirical evaluation on data science lecture notes demonstrates significant improvements: multiple-choice questions generated from the refined KG outperform those derived directly from raw notes across all 15 quality metrics. This work provides a scalable, interpretable, and structurally grounded foundation for AI-driven educational content understanding and assessment.

📝 Abstract
Task-oriented knowledge graphs (KGs) enable AI-powered learning assistant systems to automatically generate high-quality multiple-choice questions (MCQs). Yet converting unstructured educational materials, such as lecture notes and slides, into KGs that capture key pedagogical content remains difficult. We propose a framework for knowledge graph construction and refinement grounded in rate-distortion (RD) theory and optimal transport geometry. In this framework, lecture content is modeled as a metric-measure space capturing semantic and relational structure, while candidate KGs are aligned to it using Fused Gromov-Wasserstein (FGW) couplings to quantify semantic distortion. The rate term, expressed via the size of the KG, reflects complexity and compactness. Refinement operators (add, merge, split, remove, rewire) minimize the rate-distortion Lagrangian, yielding compact, information-preserving KGs. Our prototype applied to data science lectures yields interpretable RD curves and shows that MCQs generated from refined KGs consistently surpass those from raw notes on fifteen quality criteria. This study establishes a principled foundation for information-theoretic KG optimization in personalized and AI-assisted education.
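As a rough illustration of the distortion term, the FGW objective for a fixed coupling T blends a feature (Wasserstein) cost with a structural (Gromov-Wasserstein) cost. The sketch below evaluates that objective for toy inputs; the matrices, the identity coupling, and the function name are illustrative assumptions, not the paper's actual solver, which would optimize over couplings (e.g. with an FGW solver such as the one in the POT library).

```python
def fgw_objective(M, C1, C2, T, alpha=0.5):
    """Evaluate the Fused Gromov-Wasserstein objective for a fixed coupling T.

    M  : n x m feature-distance matrix between lecture chunks and KG nodes
    C1 : n x n intra-space distances of the lecture metric-measure space
    C2 : m x m intra-space distances of the candidate KG
    T  : n x m coupling (joint probability matrix)
    alpha trades off the structural cost against the feature cost, as in FGW.
    """
    n, m = len(C1), len(C2)
    # Wasserstein (feature) part: sum_ij M[i][j] * T[i][j]
    feature = sum(M[i][j] * T[i][j] for i in range(n) for j in range(m))
    # Gromov-Wasserstein (structure) part:
    # sum_ijkl (C1[i][k] - C2[j][l])^2 * T[i][j] * T[k][l]
    structure = sum(
        (C1[i][k] - C2[j][l]) ** 2 * T[i][j] * T[k][l]
        for i in range(n) for j in range(m)
        for k in range(n) for l in range(m)
    )
    return (1 - alpha) * feature + alpha * structure

# Identical spaces matched by the identity coupling incur zero distortion.
C = [[0.0, 1.0, 2.0], [1.0, 0.0, 1.0], [2.0, 1.0, 0.0]]
T_id = [[1 / 3 if i == j else 0.0 for j in range(3)] for i in range(3)]
M_zero = [[0.0] * 3 for _ in range(3)]
print(fgw_objective(M_zero, C, C, T_id))  # 0.0
```

A mismatched coupling (e.g. uniform, T[i][j] = 1/9) leaves the structural term strictly positive for this C, which is what lets distortion discriminate between candidate KGs.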
Problem

Research questions and friction points this paper is trying to address.

Converting unstructured lecture materials into structured knowledge graphs
Optimizing knowledge graph compactness while preserving semantic information
Improving educational content quality for AI-powered learning systems
Innovation

Methods, ideas, or system contributions that make the work stand out.

Using Gromov-Wasserstein Optimal Transport for KG alignment
Applying rate-distortion theory to guide KG construction
Employing refinement operators to minimize rate-distortion Lagrangian
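The refinement idea above can be pictured as a greedy search over KG edits that lowers the Lagrangian L(G) = D(G) + λ·R(G). The sketch below is a hypothetical minimal version under stated assumptions: `rate` counts nodes plus edges, `distortion` is a caller-supplied stand-in for the FGW distortion, and only the merge operator is shown; the paper's full operator set (add, merge, split, remove, rewire) and its actual scoring are not reproduced here.

```python
from itertools import combinations

def rate(graph):
    """Rate term: KG size measured as node count plus edge count."""
    nodes, edges = graph
    return len(nodes) + len(edges)

def merge(graph, a, b):
    """Merge node b into node a, redirecting b's edges and dropping self-loops."""
    nodes, edges = graph
    new_nodes = frozenset(nodes - {b})
    new_edges = frozenset(
        (a if u == b else u, a if v == b else v) for (u, v) in edges
    )
    new_edges = frozenset((u, v) for (u, v) in new_edges if u != v)
    return (new_nodes, new_edges)

def refine(graph, distortion, lam=1.0):
    """Greedily apply merges while any merge lowers L(G) = D(G) + lam * R(G)."""
    best = distortion(graph) + lam * rate(graph)
    improved = True
    while improved:
        improved = False
        nodes, _ = graph
        for a, b in combinations(sorted(nodes), 2):
            candidate = merge(graph, a, b)
            score = distortion(candidate) + lam * rate(candidate)
            if score < best:
                graph, best, improved = candidate, score, True
                break
    return graph

# Toy run: with zero distortion everywhere, rate dominates and the
# chain A-B-C collapses to a single node.
g = (frozenset({"A", "B", "C"}), frozenset({("A", "B"), ("B", "C")}))
refined = refine(g, distortion=lambda G: 0.0, lam=1.0)
```

A real distortion function would penalize merges that conflate semantically distinct concepts, so the loop stops at the rate-distortion trade-off rather than collapsing everything.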
Yuan An
College of Computing and Informatics, Drexel University
Data Integration · Knowledge Graph · Ontology · Data Mining · Machine Learning
Ruhma Hashmi
College of Computing and Informatics, Drexel University, Philadelphia, PA 19104, USA
Michelle Rogers
College of Computing and Informatics, Drexel University, Philadelphia, PA 19104, USA
Jane Greenberg
College of Computing and Informatics, Drexel University, Philadelphia, PA 19104, USA
Brian K. Smith
Mississippi State University