🤖 AI Summary
This work addresses the tendency of users to uncritically accept recommendations from AI decision-support systems, which can lead to erroneous judgments. To mitigate this, the authors propose encouraging reflection during machine-assisted decision-making by presenting decision-makers with data-driven questions automatically generated by large language models (LLMs). Their contributions comprise a structured question taxonomy, an LLM-based method for question generation, a proposed scale for measuring cognitive engagement, and a prototype in the medical domain, for which feedback was gathered from clinicians. The work contributes to the design, development, and evaluation of "tools for thought": AI systems that provoke critical thinking and enable novel ways of sense-making, rather than serving as mere decision aids.
📝 Abstract
Many generative AI systems, as well as decision-support systems (DSSs), provide operators with predictions or recommendations. Various studies show, however, that people can mistakenly adopt erroneous results presented by such systems. It is therefore crucial to promote critical thinking and reflection during interaction. The approach we focus on encourages reflection during machine-assisted decision-making by presenting decision-makers with data-driven questions. In this short paper, we give a brief overview of our work in this regard, namely: 1) the development of a question taxonomy, 2) the development of a prototype in the medical domain and the feedback received from clinicians, 3) a method for generating questions using a large language model, and 4) a proposed scale for measuring cognitive engagement in human-AI decision-making. In doing so, we contribute to the discussion about the design, development, and evaluation of tools for thought, i.e., AI systems that provoke critical thinking and enable novel ways of sense-making.
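To make the idea of data-driven reflective questions concrete, the following is a minimal sketch of how such questions might be produced from a decision case. The taxonomy categories, templates, and field names below are illustrative assumptions for this sketch, not the paper's actual taxonomy or implementation; in the described system, a large language model would generate the questions rather than fixed templates.

```python
# Hypothetical sketch: generating data-driven reflective questions for a
# decision-support case. Categories and templates are illustrative only.

def generate_reflective_questions(case: dict) -> list[str]:
    """Fill per-category question templates with case-specific data."""
    templates = {
        # Evidence: probe the data behind the recommendation
        "evidence": (
            "Which findings in the record support the suggested "
            "{recommendation}, and which contradict it?"
        ),
        # Alternatives: consider competing hypotheses
        "alternatives": (
            "What alternative to {recommendation} would you consider "
            "if {key_feature} were absent?"
        ),
        # Confidence: question model certainty
        "confidence": (
            "The system reports {confidence:.0%} confidence. What "
            "additional information would change your trust in this output?"
        ),
    }
    return [t.format(**case) for t in templates.values()]

# Example case (invented values for illustration)
case = {
    "recommendation": "antibiotic therapy",
    "key_feature": "elevated CRP",
    "confidence": 0.87,
}

for question in generate_reflective_questions(case):
    print("-", question)
```

In practice, the templates would be replaced by an LLM prompt that conditions on the taxonomy category and the case data; the structure above only illustrates how case-specific values anchor each question in the data at hand.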