🤖 AI Summary
Prior research lacks a unified framework for evaluating how NLP tools affect qualitative analysis quality across distinct collaborative modalities. Method: This paper introduces the first evaluation framework specifically designed for collaborative qualitative research and conducts controlled experiments, grounded in authentic qualitative workflows, to systematically compare two NLP-assisted topic discovery tools under synchronous versus asynchronous collaboration. We propose a novel hybrid evaluation methodology, integrating quantitative metrics (e.g., topic overlap ratio, clustering cohesion) with expert-derived qualitative criteria, tailored to diverse collaboration paradigms. Results: Synchronous collaboration significantly improves topic consistency and correctness, whereas asynchronous collaboration enhances topic diversity; moreover, tool performance is markedly sensitive to the collaboration mode. These findings uncover critical interaction mechanisms between collaborative strategies and NLP tools in shaping qualitative analysis quality, providing empirical grounding for human-AI co-design in qualitative research contexts.
📝 Abstract
NLP-assisted solutions have gained considerable traction for supporting qualitative data analysis. However, no unified evaluation framework exists that accounts for the many different settings in which qualitative researchers may employ them. In this paper, we take a first step in this direction by proposing an evaluation framework for studying how different tools may produce different outcomes depending on the collaboration strategy employed. Specifically, we study the impact of synchronous vs. asynchronous collaboration using two different NLP-assisted qualitative research tools and present a comprehensive analysis of significant differences in the consistency, cohesiveness, and correctness of their outputs.