🤖 AI Summary
This study addresses the lack of a systematic criterion for choosing the number of principal component analysis (PCA) dimensions in the Supervised Semantic Differential (SSD). The authors propose a PCA sweep procedure that treats dimension selection as a joint criterion over representational capacity, gradient interpretability, and stability across nearby values of K. SSD is an interpretable semantic gradient analysis framework that combines semantic embeddings, regression modeling, clustering, and text retrieval; the sweep is intended to make its analyses more robust and transparent. In a case study on a corpus of short posts about AI, the procedure yields a stable, interpretable semantic gradient associated with the narcissism “Admiration” dimension, contrasting two discursive styles (optimistic, collaborative framing versus distrustful, derisive discourse), while the “Rivalry” dimension shows no robust association.
📝 Abstract
Supervised Semantic Differential (SSD) is a mixed quantitative-interpretive method that models how text meaning varies with continuous individual-difference variables by estimating a semantic gradient in an embedding space and interpreting its poles through clustering and text retrieval. SSD applies PCA before regression, but no systematic method currently exists for choosing the number of retained components, which introduces avoidable researcher degrees of freedom into the analysis pipeline. We propose a PCA sweep procedure that treats dimensionality selection as a joint criterion over representation capacity, gradient interpretability, and stability across nearby values of K. We illustrate the method on a corpus of short posts about artificial intelligence written by Prolific participants who also completed Admiration and Rivalry narcissism scales. The sweep yields a stable, interpretable Admiration-related gradient contrasting optimistic, collaborative framings of AI with distrustful and derisive discourse, while no robust alignment emerges for Rivalry. We also show that a counterfactual analysis using a heuristic high-dimensional PCA solution instead produces diffuse, weakly structured clusters, reinforcing the value of the sweep-based choice of K. The case study shows how the PCA sweep constrains researcher degrees of freedom while preserving SSD's interpretive aims, supporting transparent and psychologically meaningful analyses of connotative meaning.
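To make the idea of a sweep over K concrete, the following is a minimal sketch in Python (NumPy only), not the authors' implementation. The abstract does not specify how the three criteria are computed, so everything here is an assumption for illustration: cumulative explained variance stands in for representation capacity, the cosine similarity between gradients at consecutive K values stands in for stability, and interpretability (which relies on clustering and text retrieval) is left out entirely. The function names `pca_sweep` and `neighbor_stability` and the synthetic data are hypothetical.

```python
import numpy as np


def pca_sweep(X, y, ks):
    """For each K, regress y on the top-K PCA scores of X and map the
    coefficients back to the embedding space as a candidate gradient.

    Illustrative assumption: plain OLS on PCA scores; the paper's exact
    regression setup is not given in the abstract.
    """
    Xc = X - X.mean(axis=0)          # center embeddings
    yc = y - y.mean()                # center the outcome
    # Rows of Vt are the principal axes, ordered by singular value.
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    var_ratio = S**2 / np.sum(S**2)

    results = []
    for k in ks:
        Z = Xc @ Vt[:k].T                           # (n, k) PCA scores
        beta, *_ = np.linalg.lstsq(Z, yc, rcond=None)
        gradient = Vt[:k].T @ beta                  # (d,) direction in embedding space
        resid = yc - Z @ beta
        results.append({
            "k": k,
            "explained_variance": float(var_ratio[:k].sum()),
            "r2": float(1.0 - (resid @ resid) / (yc @ yc)),
            "gradient": gradient,
        })
    return results


def neighbor_stability(results):
    """Cosine similarity between gradients at consecutive K values
    (a hypothetical proxy for 'stability across nearby values of K')."""
    sims = []
    for a, b in zip(results, results[1:]):
        ga, gb = a["gradient"], b["gradient"]
        sims.append(float(ga @ gb / (np.linalg.norm(ga) * np.linalg.norm(gb))))
    return sims


# Synthetic stand-in data: 200 "texts" with 50-dimensional embeddings and a
# continuous trait score that depends on the first embedding coordinate.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))
y = 0.8 * X[:, 0] + rng.normal(scale=0.5, size=200)

ks = [2, 5, 10, 20]
results = pca_sweep(X, y, ks)
sims = neighbor_stability(results)

for r in results:
    print(r["k"], round(r["explained_variance"], 3), round(r["r2"], 3))
print("cosine similarity between neighboring K:", [round(s, 3) for s in sims])
```

In a sweep of this kind, one would look for a range of K where the gradient direction changes little between neighboring values while capacity is adequate, and then check whether the poles of that gradient are interpretable. How those signals are weighed against each other is a design choice that this sketch does not settle.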