🤖 AI Summary
The widespread adoption of AI tools has undermined traditional outcome-oriented assessment paradigms. This paper identifies a "Great Misalignment" shared by natural language generation (NLG) evaluation and university student grading: both overemphasize final outputs while neglecting the underlying cognitive processes. To address this, we propose the Pedagogical Multi-Factor Assessment (P-MFA) model, which integrates educational measurement theory, human-AI collaborative analysis, and formative assessment principles. P-MFA employs multi-source evidence triangulation and cognitive trajectory tracking to enable process-oriented, evidence-based, dynamic assessment. Empirically validated in Finnish higher education, the model significantly enhances assessment validity and robustness against AI-enabled academic misconduct. Its core innovation lies in adapting multi-factor authentication logic to educational assessment, establishing the first transferable, scalable, integrated pedagogy-assessment framework for the AI-augmented era.
📝 Abstract
This paper explores the growing epistemic parallel between NLG evaluation and the grading of students at a Finnish university. We argue that both domains are experiencing a Great Misalignment Problem: as students increasingly use tools like ChatGPT to produce sophisticated outputs, traditional assessment methods that focus on final products rather than learning processes have lost their validity. To address this, we introduce the Pedagogical Multi-Factor Assessment (P-MFA) model, a process-based, multi-evidence framework inspired by the logic of multi-factor authentication.