🤖 AI Summary
K–12 teachers rarely develop scoring rubrics from scratch because the task is time-intensive and performance levels are hard to distinguish clearly, which points to a need for efficient support tools. This study investigates how teachers used the MagicSchool.ai platform, combining prompting practice with an AI chatbot, to generate and customize rubrics during a summer professional development workshop. Data came from surveys, discussions, and exit tickets, and the qualitative responses were analyzed thematically. Teachers described AI-generated rubrics as strong initial drafts that improved structure and clarified vague criteria. Their feedback also revealed a trade-off between strictness and level of detail: AI feedback was seen as harsher but more detailed. Participants generally affirmed the clarity and ethical acceptability of AI-assisted rubrics but identified limitations in grade-level appropriateness, alignment with instructional priorities, and editing flexibility. The findings support adopting this technology under conditions that preserve teacher control and allow easy customization.
📝 Abstract
This study investigates K–12 teachers' perceptions of and experiences with AI-supported rubric generation during a summer professional development workshop ($n = 25$). Teachers used MagicSchool.ai to generate rubrics and practiced prompting to tailor criteria and performance levels. They then applied these rubrics to give feedback on a sample block-based programming activity and afterward used a chatbot to deliver rubric-based feedback on the same work.
Data were collected through pre- and post-workshop surveys, open discussions, and exit tickets. We used thematic analysis to analyze the qualitative data. Teachers reported that they rarely create rubrics from scratch because the process is time-consuming and defining clear distinctions between performance levels is challenging.
After hands-on use, teachers described AI-generated rubrics as strong starting drafts that improved structure and clarified vague criteria. However, they emphasized that teacher oversight remains necessary because of generic or grade-misaligned language, occasional misalignment with instructional priorities, and the substantial editing the drafts required.
Survey results indicated high perceived clarity and ethical acceptability and moderate alignment with assignments, with usability as the primary weakness, particularly the ability to add, remove, or revise criteria. Open-ended responses highlighted a "strictness-versus-detail" trade-off: AI feedback was often perceived as harsher but more detailed and scalable. Overall, teachers expressed conditional willingness to adopt AI rubric tools when workflows support easy customization and preserve teacher control.