LLMCode: Evaluating and Enhancing Researcher-AI Alignment in Qualitative Analysis

📅 2025-04-23
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the challenge of aligning large language models (LLMs) with human interpretive reasoning in qualitative analysis for research for design (RfD), particularly the credibility limits of LLM-generated design insights. To evaluate qualitative coding performance, we propose a dual-metric framework combining Intersection over Union (IoU) and a Modified Hausdorff Distance, enabling quantitative assessment of how well LLMs reproduce designers' deep, subjective interpretations. Across two human-AI collaborative studies involving 26 professional designers, we find that while LLMs perform well at deductive coding, they struggle with context-dependent interpretive judgments; bidirectional adaptation, in which users refine LLM outputs and revise their own perspectives in response, improves insight quality. We open-source LLMCode, underscoring the irreplaceable depth of human interpretation and providing a methodological foundation and empirical evidence for trustworthy human-AI collaboration in RfD.

📝 Abstract
The use of large language models (LLMs) in qualitative analysis offers enhanced efficiency but raises questions about their alignment with the contextual nature of research for design (RfD). This research examines the trustworthiness of LLM-driven design insights, using qualitative coding as a case study to explore the interpretive processes central to RfD. We introduce LLMCode, an open-source tool integrating two metrics, namely Intersection over Union (IoU) and Modified Hausdorff Distance, to assess the alignment between human and LLM-generated insights. Across two studies involving 26 designers, we find that while the model performs well with deductive coding, its ability to emulate a designer's deeper interpretive lens over the data is limited, emphasising the importance of human-AI collaboration. Our results highlight a reciprocal dynamic where users refine LLM outputs and adapt their own perspectives based on the model's suggestions. These findings underscore the importance of fostering appropriate reliance on LLMs by designing tools that preserve interpretive depth while facilitating intuitive collaboration between designers and AI.
Problem

Research questions and friction points this paper is trying to address.

Assessing alignment between human and LLM-generated qualitative insights
Evaluating LLMs' interpretive depth in design research coding
Enhancing human-AI collaboration for trustworthy design analysis
Innovation

Methods, ideas, or system contributions that make the work stand out.

Open-source tool LLMCode assesses human-AI alignment
Uses IoU and Modified Hausdorff Distance metrics
Promotes human-AI collaboration in qualitative analysis
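The two alignment metrics named above can be illustrated concretely. LLMCode's exact definitions may differ; the following is a minimal sketch assuming coded passages are represented as character-index spans `(start, end)` within a transcript, with IoU computed over covered characters and the Modified Hausdorff Distance (Dubuisson & Jain, 1994) computed between span midpoints:

```python
def span_iou(spans_a, spans_b):
    """IoU between two codings: overlap of covered characters
    divided by the union of covered characters."""
    chars_a = {c for s, e in spans_a for c in range(s, e)}
    chars_b = {c for s, e in spans_b for c in range(s, e)}
    if not chars_a and not chars_b:
        return 1.0  # both codings empty: treat as perfect agreement
    return len(chars_a & chars_b) / len(chars_a | chars_b)

def modified_hausdorff(spans_a, spans_b):
    """Modified Hausdorff Distance between the midpoints of two
    span sets: for each direction, average the distance from each
    point to its nearest counterpart, then take the larger average."""
    pts_a = [(s + e) / 2 for s, e in spans_a]
    pts_b = [(s + e) / 2 for s, e in spans_b]
    def directed(p, q):
        return sum(min(abs(x - y) for y in q) for x in p) / len(p)
    return max(directed(pts_a, pts_b), directed(pts_b, pts_a))
```

Intuitively, IoU rewards exact overlap between human- and LLM-highlighted passages, while the Modified Hausdorff Distance stays informative when spans are close but not overlapping, e.g. when the model highlights an adjacent sentence.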