Exploring LLM-Generated Feedback for Economics Essays: How Teaching Assistants Evaluate and Envision Its Use

📅 2025-05-21
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study investigates how structured feedback generated by large language models (LLMs) can improve the efficiency and quality of teaching assistants' (TAs') grading of undergraduate economics essays. We developed a rubric-guided LLM feedback engine and evaluated its real-world pedagogical utility through Word-integrated annotations and think-aloud qualitative protocols. Our method introduces a “stepwise feedback generation with intermediate result visualization” framework to improve human-in-the-loop controllability and interpretability. Results show that fine-grained, pedagogically grounded rubrics substantially improve AI feedback quality; all five participating TAs reported gains in grading speed, feedback consistency, and analytical depth; and over 80% of AI-generated suggestions were adopted directly or with only minor editing. This work offers a systematic empirical validation of AI as an editable, human-augmenting feedback tool for knowledge-intensive writing assessment in the humanities and social sciences.

📝 Abstract
This project examines the prospect of using AI-generated feedback as suggestions to expedite and enhance human instructors' feedback provision. In particular, we focus on understanding teaching assistants' (TAs') perspectives on the quality of AI-generated feedback and how they may or may not incorporate it into their own workflows. We situate our work in a foundational college economics class with frequent short essay assignments. We developed an LLM-powered feedback engine that generates feedback on students' essays based on the grading rubrics used by the TAs. To ensure that TAs could meaningfully critique and engage with the AI feedback, we first had them complete their regular grading tasks. For a randomly selected set of essays they had already graded, we used our feedback engine to generate feedback and displayed it as in-text comments in a Word document. We then conducted think-aloud studies with 5 TAs over 20 one-hour sessions, in which they evaluated the AI feedback, contrasted it with their own handwritten feedback, and shared how they envision using the AI feedback if it were offered as suggestions. The study highlights the importance of providing detailed rubrics for AI to generate high-quality feedback on knowledge-intensive essays. TAs felt that using AI feedback as suggestions during grading could expedite their work, enhance consistency, and improve overall feedback quality. We discuss the importance of decomposing the feedback generation task into steps and presenting intermediate results so that TAs can effectively use the AI feedback.
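
To make the workflow above concrete, here is a minimal Python sketch of how rubric-guided feedback generation might be decomposed into per-criterion steps whose intermediate results can be surfaced to TAs. Everything in it is illustrative rather than the authors' actual implementation: `call_llm` is a placeholder for whatever chat-completion client is used, and the `CriterionFeedback` structure and prompt format are assumptions.

```python
from dataclasses import dataclass

@dataclass
class CriterionFeedback:
    criterion: str  # rubric item being assessed (hypothetical structure)
    evidence: str   # essay excerpt the comment should anchor to
    comment: str    # suggested in-text comment for the TA to edit

def call_llm(prompt: str) -> str:
    # Placeholder: substitute a real chat-completion call here.
    return "EVIDENCE: ...\nCOMMENT: ..."

def generate_feedback(essay: str, rubric: list[str]) -> list[CriterionFeedback]:
    """Assess the essay one rubric criterion at a time, so each
    intermediate result can be shown to the TA before final comments."""
    results = []
    for criterion in rubric:
        prompt = (
            f"Rubric criterion: {criterion}\n\n"
            f"Essay:\n{essay}\n\n"
            "Quote the most relevant passage, then write one concise, "
            "actionable comment a TA could attach to that passage.\n"
            "Reply as:\nEVIDENCE: <quote>\nCOMMENT: <comment>"
        )
        raw = call_llm(prompt)
        evidence, _, comment = raw.partition("COMMENT:")
        results.append(CriterionFeedback(
            criterion=criterion,
            evidence=evidence.removeprefix("EVIDENCE:").strip(),
            comment=comment.strip(),
        ))
    return results
```

Calling `generate_feedback(essay_text, ["Applies the supply-and-demand model correctly", "Supports claims with evidence"])` would then yield one editable suggestion per rubric criterion, mirroring the in-text comments the TAs reviewed in Word.
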
Problem

Research questions and friction points this paper is trying to address.

Evaluating AI-generated feedback quality for economics essays
Assessing teaching assistants' adoption of AI feedback workflows
Improving feedback consistency and efficiency using LLM suggestions
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-powered feedback engine for essay grading
Think-aloud studies with teaching assistants
Decomposing feedback generation into steps (see the sketch below)
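
Continuing the sketch after the abstract (and reusing its hypothetical `CriterionFeedback` type), the snippet below suggests one way the intermediate, per-criterion results could be put in front of a TA for acceptance, editing, or rejection before any comment reaches a student. The console prompts are a stand-in for the Word-integrated interface the study actually used.

```python
def review_intermediate(results: list[CriterionFeedback]) -> list[CriterionFeedback]:
    """Show each per-criterion suggestion so the TA can accept, edit,
    or discard it before any comment reaches a student."""
    kept = []
    for fb in results:
        print(f"[{fb.criterion}]")
        print(f"  anchor text: {fb.evidence!r}")
        print(f"  suggested comment: {fb.comment}")
        choice = input("  accept / edit / skip? ").strip().lower()
        if choice == "edit":
            fb.comment = input("  revised comment: ").strip()
        if choice != "skip":
            kept.append(fb)
    return kept
```

Gating the pipeline on this review step is what keeps the human in the loop: the engine proposes, and the TA decides what students actually see.
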
Xinyi Lu
University of Michigan
Human-Computer Interaction, Human-AI Collaboration
Aditya Mahesh
University of Michigan, Ann Arbor MI 48109, USA
Zejia Shen
University of Michigan, Ann Arbor MI 48109, USA
Mitchell Dudley
University of Michigan, Ann Arbor MI 48109, USA
Larissa Sano
University of Michigan, Ann Arbor MI 48109, USA
Xu Wang
University of Michigan, Ann Arbor MI 48109, USA