🤖 AI Summary
This paper addresses the insufficient modeling of developers' psychological and affective states in software engineering (SE). Through a systematic literature review, it synthesizes the current applications and potential of psycholinguistic tools, particularly the Linguistic Inquiry and Word Count (LIWC), in SE research. It analyzes 43 SE studies that apply LIWC, mostly to textual data from platforms such as GitHub and Stack Overflow, and identifies three core thematic areas: communication, organizational climate, and positive psychology. The reviewed studies use LIWC for tasks such as emotion detection and personality inference, and the paper points to emerging use cases such as bias analysis in human-LLM conversations. However, critical limitations persist, including poor handling of SE-specific vocabulary and the absence of a formal evaluation of LIWC in 26 of the 43 studies. To address these gaps, the paper proposes an extensible evaluation framework and concrete methodological improvements, thereby advancing human-centered SE research toward greater rigor, transparency, and reproducibility.
📝 Abstract
Context: A deeper understanding of human factors in software engineering (SE) is essential for improving team collaboration, decision-making, and productivity. Communication channels such as code reviews and chats provide insights into developers' psychological and emotional states. While large language models excel at text analysis, they often lack transparency and precision. Psycholinguistic tools such as Linguistic Inquiry and Word Count (LIWC) offer clearer, more interpretable insights into the cognitive and emotional processes exhibited in text. Despite LIWC's wide use in SE research, no comprehensive review of that use has been conducted. Objective: We examine the importance of psycholinguistic tools, particularly LIWC, and provide a thorough analysis of its current and potential future applications in SE research. Methods: We conducted a systematic review of six prominent databases, identifying 43 SE-related papers that use LIWC. Our analysis focuses on five research questions. Results: Our findings reveal a wide range of applications, including analyzing team communication to detect developer emotions and personality, developing ML models to predict deleted Stack Overflow posts, and, more recently, comparing AI-generated and human-written text. LIWC has been used primarily with data from project management platforms (e.g., GitHub) and Q&A forums (e.g., Stack Overflow). Key behavioral software engineering (BSE) concepts include Communication, Organizational Climate, and Positive Psychology. Of the 43 papers, 26 did not formally evaluate LIWC. The papers also raised concerns about its limitations, including difficulty handling SE-specific vocabulary. Conclusion: We highlight the potential of psycholinguistic tools and their limitations, and present new use cases for advancing research on human factors in SE (e.g., bias in human-LLM conversations).