Large Language Models in K-12 Education: Alignment with State Curriculum Standards and Student Personas

📅 2026-06-03

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses the gap in aligning large language models (LLMs) with K-12 educational standards and examines potential biases in their responses to student identity attributes. It presents a systematic evaluation of how well LLMs adhere to U.S. state-level history curriculum standards and how they respond to variations in student characteristics (grade level, geographic location, race, and gender). Through curriculum document analysis, controlled prompting experiments, and simulated user personas, the research finds that while LLMs appropriately adjust content difficulty by grade, their outputs often reflect perceived state political leanings more than actual curricular mandates. Responses show minimal sensitivity to race and gender, suggesting that models can adapt usefully to student personas with limited demographic bias. This work underscores the dual challenges of policy alignment and equity in deploying AI for education.

📝 Abstract

As Large Language Models (LLMs) become increasingly popular in educational settings, they raise important questions about the ethical implications of their use. Publicly available online chatbots are quickly improving in capability and accuracy, leading to more widespread use, including among students looking for help with their homework. This makes it crucial to consider whether these models are aligned with educational standards. Because curriculum standards in the United States are set at the state level, they differ significantly in required content, emphasis, and narrative focus. In this work, we develop an LLM-based pipeline to identify variations in U.S. History curricula across states and evaluate the extent to which different LLMs reflect these state-specific curricular differences. In addition, we conduct controlled experiments that vary user personas by stating user attributes such as geographic location, grade level, gender, and race to evaluate the sensitivity of LLM responses to user characteristics. We find that while models are able to adjust their presentation of historical topics, these shifts may come from the perceived political leanings of states and do not necessarily reflect actual curriculum content. Additionally, models successfully adapt to a student's grade level while showing minimal sensitivity to race or gender, suggesting they are capable of useful adaptation to student personas with limited demographic bias. Together, these findings highlight the risk that open access to LLM chatbots may harm student learning outcomes through misalignment with state curriculum standards, and they underscore the need for more robust alignment techniques.

Problem

Research questions and friction points this paper is trying to address.

Large Language Models

Curriculum Alignment

K-12 Education

Student Personas

State Standards

Innovation

Methods, ideas, or system contributions that make the work stand out.

curriculum alignment

student personas

large language models