🤖 AI Summary
This paper addresses the low accuracy and lack of transparency in generative AI–driven thematic analysis. To tackle these issues, we propose a reproducible AI-assisted coding framework. Methodologically, it integrates the GPT-4 Turbo API, structured stepwise prompt engineering, and Python-based automation to implement a systematic coding pipeline that preserves verbatim quotations and precise page-level citations. Our contributions are threefold: (1) First demonstration of AI coding inter-rater reliability matching human-level agreement; (2) Thematic classification accuracy statistically equivalent to expert human coders; and (3) Superior thematic interpretation—producing more abstract, conceptual, and contextually grounded explanations than human analysts. Collectively, the framework significantly enhances verification, traceability, and reproducibility, establishing a methodological paradigm for AI-augmented qualitative coding in social science research.
📝 Abstract
This study highlights the transparency and accuracy of GenAI's inductive thematic analysis, particularly using GPT-4 Turbo API integrated within a stepwise prompt-based Python script. This approach ensured a traceable and systematic coding process, generating codes with supporting statements and page references, which enhanced validation and reproducibility. The results indicate that GenAI performs inductive coding in a manner closely resembling human coders, effectively categorizing themes at a level like the average human coder. However, in interpretation, GenAI extends beyond human coders by situating themes within a broader conceptual context, providing a more generalized and abstract perspective.