🤖 AI Summary
Problem: Large language models (LLMs) exhibit low accuracy, weak curriculum alignment, and poor pedagogical relevance when applied to India's NCERT educational context, particularly for Grades 6–8 English and Science.
Method: We introduce NCERT-QA, the first structured bilingual (English–Hindi) QA dataset explicitly aligned with NCERT curricula, covering factual, inferential, and evaluative reasoning questions. We systematically evaluate meta-prompting, few-shot prompting, and chain-of-thought prompting across open-source (Gemma, Llama) and commercial LLMs.
Contribution/Results: Curriculum alignment significantly improves answer accuracy and instructional utility; specific prompt-model combinations effectively mitigate hallucination and enhance reasoning consistency. This work establishes a reproducible data benchmark, an empirical evaluation framework, and actionable optimization strategies for LLM adaptation in education—addressing a critical gap in regionally grounded, curriculum-driven AI education research.
📝 Abstract
Large Language Models (LLMs) have demonstrated remarkable capabilities in understanding and generating human-like content, revolutionizing sectors such as healthcare, software development, and education. In education, LLMs offer the potential for personalized and interactive learning experiences, especially in regions with limited teaching resources. However, adapting these models effectively to curriculum-specific content, such as the National Council of Educational Research and Training (NCERT) syllabus in India, presents unique challenges in terms of accuracy, alignment, and pedagogical relevance. In this paper, we present the framework "PustakAI" (pustak means 'book' in many Indian languages) for the design and evaluation of a novel question-answering dataset, "NCERT-QA", aligned with the NCERT curriculum for English and Science subjects of grades 6 to 8. We classify the curated QA pairs as Factoid, Inferential, and Others (evaluative and reasoning). We evaluate the dataset with various prompting techniques, such as meta-prompt, few-shot, and CoT-style prompting, using diverse evaluation metrics to understand which approach aligns more efficiently with the structure and demands of the curriculum. Beyond the usability of the dataset, we analyze the strengths and limitations of current open-source LLMs (Gemma3:1b, Llama3.2:3b, and Nemotron-mini:4b) and high-end LLMs (Llama-4-Scout-17B and Deepseek-r1-70B) as AI-based learning tools in formal education systems.
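To make the three prompting styles named above concrete, here is a minimal sketch of how meta-prompt, few-shot, and CoT-style prompts might be constructed for a curriculum QA item. The function names, instruction wording, and the example QA pair are illustrative assumptions, not taken from the NCERT-QA release or the paper's actual prompts.

```python
# Illustrative sketch of the three prompting styles evaluated in the paper.
# All prompt wording below is hypothetical, not the authors' actual templates.

def meta_prompt(question: str) -> str:
    """Meta-prompt: prefix the question with a role/task description."""
    return (
        "You are a tutor answering questions from the NCERT Grade 6-8 "
        "Science curriculum. Answer concisely and factually.\n\n"
        f"Question: {question}\nAnswer:"
    )

def few_shot_prompt(question: str, examples: list[tuple[str, str]]) -> str:
    """Few-shot: prepend worked QA examples from the same curriculum."""
    shots = "\n\n".join(f"Question: {q}\nAnswer: {a}" for q, a in examples)
    return f"{shots}\n\nQuestion: {question}\nAnswer:"

def cot_prompt(question: str) -> str:
    """CoT-style: ask the model to reason step by step before answering."""
    return (
        f"Question: {question}\n"
        "Let's think step by step, then state the final answer."
    )

# Hypothetical in-context example and target question.
examples = [("What is photosynthesis?",
             "The process by which green plants make food using sunlight.")]
print(few_shot_prompt("Why do leaves appear green?", examples))
```

Each constructed prompt would then be sent to the models under evaluation (e.g. Gemma3:1b or Llama-4-Scout-17B) and the responses scored with the paper's evaluation metrics.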