🤖 AI Summary
This study shows that debiasing large language models (LLMs) can substantially impair their cultural commonsense reasoning. To address a limitation of existing evaluation frameworks, which treat social bias detection and cultural understanding as disjoint tasks, we introduce SOBACO, the first unified Japanese benchmark that jointly assesses social bias detection and cultural commonsense reasoning. Through systematic experiments across multiple LLMs and debiasing methods, we find that mainstream debiasing techniques significantly degrade cultural commonsense performance, with accuracy dropping by up to 75%. This provides the first empirical evidence of a substantial trade-off between fairness optimization and cultural understanding. Our core contributions are: (1) SOBACO, the first benchmark to evaluate social bias and cultural commonsense in a unified format; and (2) a call for context-aware debiasing that explicitly preserves cultural knowledge, laying a foundation for culturally grounded debiasing strategies.
📝 Abstract
Large language models (LLMs) exhibit social biases, prompting the development of various debiasing methods. However, debiasing methods may degrade the capabilities of LLMs. Previous research has evaluated the impact of bias mitigation primarily through tasks measuring general language understanding, which are often unrelated to social biases. In contrast, cultural commonsense is closely related to social biases, as both are rooted in social norms and values. The impact of bias mitigation on cultural commonsense in LLMs has not been well investigated. To address this gap, we propose SOBACO (SOcial BiAs and Cultural cOmmonsense benchmark), a Japanese benchmark designed to evaluate social biases and cultural commonsense in LLMs in a unified format. We evaluate several LLMs on SOBACO to examine how debiasing methods affect their cultural commonsense. Our results reveal that debiasing methods degrade the performance of LLMs on the cultural commonsense task (up to 75% accuracy deterioration). These results highlight the importance of developing debiasing methods that account for this trade-off to improve both the fairness and utility of LLMs.
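To make the reported metric concrete, the sketch below shows how a relative accuracy deterioration like the abstract's "up to 75%" could be computed for a multiple-choice benchmark. The answer format, toy labels, and model outputs are illustrative assumptions, not SOBACO's actual data or evaluation code.

```python
# Hypothetical sketch of comparing a base model and its debiased variant
# on a multiple-choice cultural-commonsense subset. All data below is
# invented for illustration.

def accuracy(model_answers, gold_answers):
    """Fraction of questions answered correctly."""
    correct = sum(a == g for a, g in zip(model_answers, gold_answers))
    return correct / len(gold_answers)

# Toy gold labels for a 4-item subset (assumed A/B/C option format).
gold = ["A", "B", "A", "C"]

# Assumed outputs from a base model and its debiased variant.
base_answers = ["A", "B", "A", "C"]      # 4/4 correct
debiased_answers = ["B", "B", "C", "B"]  # 1/4 correct

base_acc = accuracy(base_answers, gold)
debiased_acc = accuracy(debiased_answers, gold)

# Relative accuracy deterioration after debiasing.
deterioration = (base_acc - debiased_acc) / base_acc
print(f"accuracy deterioration: {deterioration:.0%}")  # prints "accuracy deterioration: 75%"
```

Framing the degradation as a relative drop, rather than an absolute difference, lets models with different baseline accuracies be compared on the same scale.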