🤖 AI Summary
Automated coding of conversational data faces challenges due to high contextual complexity and large language models' (LLMs) difficulty in accurately modeling the interplay between communicative acts and their underlying events. To address this, we propose an LLM-assisted deductive coding framework featuring: (1) a dual-path prediction mechanism that codes each utterance separately at the act level and the event level, using role-specific prompting and chain-of-thought reasoning; (2) context representation explicitly grounded in joint modeling of communicative acts and events; and (3) cross-model collaborative coding (leveraging GPT-4-turbo, GPT-4o, and DeepSeek) coupled with consistency verification, performed by GPT-4o, driven by the association between events and acts. Experiments demonstrate substantial improvements in coding accuracy: act-level predictions consistently outperform event-level ones, and contextual consistency verification yields significant performance gains. This work establishes a scalable, high-fidelity automated coding paradigm for qualitative conversational analysis.
📝 Abstract
Dialogue data has been a key source for understanding learning processes, offering critical insights into how students engage in collaborative discussions and how these interactions shape their knowledge construction. The advent of Large Language Models (LLMs) has introduced promising opportunities for advancing qualitative research, particularly in the automated coding of dialogue data. However, the inherent contextual complexity of dialogue presents unique challenges for these models, especially in understanding and interpreting complex contextual information. This study addresses these challenges by developing a novel LLM-assisted automated coding approach for dialogue data. The novelty of our proposed framework is threefold: 1) We predict the code for an utterance based on dialogue-specific characteristics -- communicative acts and communicative events -- using separate prompts that follow the role-prompting and chain-of-thought methods; 2) We engage multiple LLMs, including GPT-4-turbo, GPT-4o, and DeepSeek, in collaborative code prediction; 3) We leverage the interrelation between events and acts to implement consistency checking using GPT-4o. In particular, our contextual consistency checking provided a substantial accuracy improvement. We also found that the accuracy of act predictions was consistently higher than that of event predictions. This study contributes a new methodological framework for enhancing the precision of automated coding of dialogue data and offers a scalable solution for addressing the contextual challenges inherent in dialogue analysis.
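The three-part pipeline described above can be sketched in skeleton form. This is a minimal, hypothetical illustration only: the placeholder `predict_act`/`predict_event` functions, the model names `m1`–`m3`, the label values, and the `CONSISTENT` act-event table are all assumptions standing in for the paper's actual prompts, models, and coding scheme, which are not specified here.

```python
# Hypothetical sketch of the three-stage pipeline: dual-path prediction
# (acts and events coded separately), cross-model collaboration, and
# act-event consistency checking. All names and label sets below are
# illustrative assumptions, not the authors' actual prompts or APIs.
from collections import Counter

def predict_act(model, utterance, context):
    """Placeholder: a real system would send a role + chain-of-thought
    prompt asking the model for the communicative act of the utterance."""
    return {"m1": "question", "m2": "question", "m3": "statement"}[model]

def predict_event(model, utterance, context):
    """Placeholder: same idea, but for the communicative event."""
    return {"m1": "negotiation", "m2": "negotiation", "m3": "negotiation"}[model]

# Assumed act-event compatibility table used for consistency checking.
CONSISTENT = {
    ("question", "negotiation"),
    ("statement", "explanation"),
}

def code_utterance(utterance, context, models=("m1", "m2", "m3")):
    # 1) Dual-path prediction: each model codes act and event separately.
    acts = [predict_act(m, utterance, context) for m in models]
    events = [predict_event(m, utterance, context) for m in models]
    # 2) Cross-model collaboration, here reduced to a majority vote.
    act = Counter(acts).most_common(1)[0][0]
    event = Counter(events).most_common(1)[0][0]
    # 3) Consistency check: flag act-event pairs that violate the assumed
    # association table for re-verification (done by GPT-4o in the paper).
    needs_review = (act, event) not in CONSISTENT
    return act, event, needs_review
```

With the stubbed predictions, `code_utterance("...", "...")` returns `("question", "negotiation", False)`: two of three models agree on "question", all agree on "negotiation", and the pair is in the compatibility table, so no re-verification is triggered.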