🤖 AI Summary
This study addresses the high false positive rate (FPR) and demographic bias of AI-misuse detection in educational settings, proposing the first adaptive AI-use detection framework tailored to classroom environments. Methodologically, it integrates lightweight watermarking with post-hoc calibration, enabling robust detection across languages, AI models, and multiple levels of AI editing (e.g., grammatical refinement, content expansion) via statistical *p*-value testing, while dynamically calibrating the FPR across students with diverse native-language backgrounds. Key contributions: (1) mitigating the systematic misclassification of non-native speakers and other underrepresented groups; (2) precisely identifying policy violations while permitting compliant AI-assisted learning; and (3) delivering a quantitative, production-ready tool for academic-integrity monitoring. Experiments demonstrate a stable FPR ≤ 5% across multilingual and multi-model configurations, significantly outperforming baseline methods.
📝 Abstract
As artificial intelligence tools become ubiquitous in education, maintaining academic integrity while accommodating pedagogically beneficial AI assistance presents unprecedented challenges. Current AI detection systems fail to control false positive rates (FPR) and exhibit bias against minority student groups, prompting institutional suspensions of these technologies. Watermarking techniques offer statistical rigor through precise $p$-values but remain untested in educational contexts where students may use varying levels of permitted AI edits. We present the first adaptation of watermarking-based detection methods for classroom settings, introducing conformal methods that effectively control FPR across diverse student populations. Using essays from native and non-native English speakers, we simulate seven levels of AI editing intervention, from grammar correction to content expansion, across multiple language models and watermarking schemes, and evaluate our proposal under these setups. Our findings provide educators with quantitative frameworks to enforce academic integrity standards while embracing AI integration in the classroom.
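The abstract's conformal FPR-control idea can be illustrated with a minimal sketch. The paper's actual detector and calibration procedure are not specified here, so the code below assumes generic per-essay detection scores (higher = more AI-like) and applies the standard split-conformal quantile rule: pick a threshold from scores on known human-written calibration essays so that, in finite samples, the false positive rate stays at or below a target level α. All names (`conformal_threshold`, the simulated scores) are illustrative, not from the paper.

```python
import numpy as np

def conformal_threshold(cal_scores: np.ndarray, alpha: float = 0.05) -> float:
    """Split-conformal threshold for FPR control.

    cal_scores: detection scores on calibration essays known to be
    human-written (the null class). Flagging a new essay when its score
    exceeds the returned threshold keeps the false positive rate <= alpha
    (finite-sample guarantee, assuming exchangeability with calibration data).
    """
    n = len(cal_scores)
    # Conservative finite-sample quantile index: ceil((n+1)(1-alpha))
    k = int(np.ceil((n + 1) * (1 - alpha)))
    return float(np.sort(cal_scores)[min(k, n) - 1])

# Demo with synthetic scores standing in for real watermark statistics.
rng = np.random.default_rng(0)
cal = rng.normal(0.0, 1.0, size=500)          # human-written calibration essays
tau = conformal_threshold(cal, alpha=0.05)

new_human = rng.normal(0.0, 1.0, size=10_000)  # unseen human-written essays
fpr = float(np.mean(new_human > tau))          # empirical FPR, ~<= 0.05
```

In a classroom deployment, the same rule could be applied per subgroup (e.g., per native-language background) by computing a separate threshold from each group's calibration scores, which is one simple way to equalize FPR across diverse student populations.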