🤖 AI Summary
This study addresses the high false positive rate (FPR) and demographic bias of AI-misuse detection in educational settings, proposing the first adaptive AI-use detection framework tailored to classroom environments. Methodologically, it integrates lightweight watermarking with post-hoc calibration, enabling robust detection across languages, AI models, and multiple levels of AI editing (e.g., grammatical refinement, content expansion) via statistical *p*-value testing, while dynamically calibrating the FPR across students with diverse native-language backgrounds. Key contributions: (1) mitigating the systematic misclassification of non-native speakers and other underrepresented groups; (2) precisely identifying policy violations while permitting compliant AI-assisted learning; and (3) delivering a quantitative, production-ready tool for academic-integrity monitoring. Experiments demonstrate a stable FPR ≤ 5% across multilingual and multi-model configurations, significantly outperforming baseline methods.
📝 Abstract
As artificial intelligence tools become ubiquitous in education, maintaining academic integrity while accommodating pedagogically beneficial AI assistance presents unprecedented challenges. Current AI detection systems fail to control false positive rates (FPR) and exhibit bias against minority student groups, prompting institutional suspensions of these technologies. Watermarking techniques offer statistical rigor through precise $p$-values but remain untested in educational contexts where students may use varying levels of permitted AI edits. We present the first adaptation of watermarking-based detection methods for classroom settings, introducing conformal methods that effectively control FPR across diverse student populations. Using essays from native and non-native English speakers, we simulate seven levels of AI editing intervention, from grammar correction to content expansion, across multiple language models and watermarking schemes, and evaluate our proposal under these setups. Our findings provide educators with quantitative frameworks to enforce academic integrity standards while embracing AI integration in the classroom.
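The abstract's conformal FPR-control idea can be illustrated with a minimal sketch. The paper's actual detector and calibration procedure are not specified here, so the code below assumes generic per-essay detection scores (higher = more AI-like) and applies the standard split-conformal quantile rule: pick a threshold from scores on known human-written calibration essays so that, in finite samples, the false positive rate stays at or below a target level α. All names (`conformal_threshold`, the simulated scores) are illustrative, not from the paper.

```python
import numpy as np

def conformal_threshold(cal_scores: np.ndarray, alpha: float = 0.05) -> float:
    """Split-conformal threshold for FPR control.

    cal_scores: detection scores on calibration essays known to be
    human-written (the null class). Flagging a new essay when its score
    exceeds the returned threshold keeps the false positive rate <= alpha
    (finite-sample guarantee, assuming exchangeability with calibration data).
    """
    n = len(cal_scores)
    # Conservative finite-sample quantile index: ceil((n+1)(1-alpha))
    k = int(np.ceil((n + 1) * (1 - alpha)))
    return float(np.sort(cal_scores)[min(k, n) - 1])

# Demo with synthetic scores standing in for real watermark statistics.
rng = np.random.default_rng(0)
cal = rng.normal(0.0, 1.0, size=500)          # human-written calibration essays
tau = conformal_threshold(cal, alpha=0.05)

new_human = rng.normal(0.0, 1.0, size=10_000)  # unseen human-written essays
fpr = float(np.mean(new_human > tau))          # empirical FPR, ~<= 0.05
```

In a classroom deployment, the same rule could be applied per subgroup (e.g., per native-language background) by computing a separate threshold from each group's calibration scores, which is one simple way to equalize FPR across diverse student populations.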