🤖 AI Summary
Large language model (LLM) chat interfaces often encourage superficial learning and hinder the development of critical thinking. To address this, we propose an argumentation-aware critical question generation method built on a two-stage lightweight model collaboration framework: a Questioner module first generates multiple candidate questions, and a Judge module then re-ranks them by relevance to select the best one, actively challenging vague or unsupported claims and fostering deep reasoning. This work presents the first end-to-end open-source small-model solution for the CQs-Gen 2025 shared task, relying solely on lightweight open models and prompt engineering, without any fine-tuning. It achieved first place in the associated evaluation at the ACL 2025 workshop, demonstrating the effectiveness, practicality, and scalability of small-scale LLMs for critical question generation.
📝 Abstract
The widespread adoption of chat interfaces based on Large Language Models (LLMs) raises concerns about promoting superficial learning and undermining the development of critical thinking skills. Instead of relying on LLMs purely for retrieving factual information, this work explores their potential to foster deeper reasoning by generating critical questions that challenge unsupported or vague claims in debate interventions. This study is part of a shared task of the 12th Workshop on Argument Mining, co-located with ACL 2025, focused on automatic critical question generation. We propose a two-step framework involving two small-scale open-source language models: a Questioner that generates multiple candidate questions and a Judge that selects the most relevant ones. Our system ranked first in the shared task competition, demonstrating the potential of the proposed LLM-based approach to encourage critical engagement with argumentative texts.
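The two-step framework can be pictured as a generate-then-rank pipeline. The sketch below is a minimal illustration of that control flow only: the function names are hypothetical, and both model calls are stubbed (the real system prompts two small open LLMs and uses an LLM-based relevance judgment, not the toy scoring shown here).

```python
# Hypothetical sketch of the Questioner -> Judge pipeline.
# Both "model" calls are stubs; in the described system each step
# is a prompted small-scale open-source LLM.

def questioner(intervention: str, n_candidates: int = 5) -> list[str]:
    """Step 1: generate multiple candidate critical questions.

    Stubbed with templated questions; the real Questioner prompts an
    LLM with the debate intervention.
    """
    return [
        f"What evidence supports claim {i} in: {intervention!r}?"
        for i in range(1, n_candidates + 1)
    ]

def judge(intervention: str, candidates: list[str]) -> list[str]:
    """Step 2: re-rank candidates by relevance to the intervention.

    Stubbed with a toy deterministic score; the real Judge prompts a
    second LLM to assess each candidate's relevance.
    """
    scored = [(len(q), q) for q in candidates]  # toy "relevance" score
    scored.sort(reverse=True)                   # most relevant first
    return [q for _, q in scored]

def generate_critical_questions(intervention: str, top_k: int = 3) -> list[str]:
    """Full pipeline: generate candidates, then keep the top-ranked ones."""
    candidates = questioner(intervention)
    return judge(intervention, candidates)[:top_k]

if __name__ == "__main__":
    for q in generate_critical_questions("Taxes should be lowered."):
        print(q)
```

Separating generation from judgment lets each small model do one focused job, which is the core idea the abstract credits for the system's performance.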