🤖 AI Summary
Problem: Existing text-based personality recognition methods predominantly use hard-label classification, which neglects the graded, prototype-like nature of personality judgments. Method: This paper proposes ProtoMBTI, a framework that integrates psychological prototype theory into MBTI text inference, reportedly for the first time. It constructs a balanced corpus through LLM-guided multi-dimensional augmentation (semantic, linguistic, sentiment); fine-tunes a lightweight encoder (≤2B parameters) via LoRA; and introduces a “retrieve–reuse–revise–retain” reasoning cycle that combines prompt-driven prototype voting with adaptive updating of the prototype library. Contribution/Results: ProtoMBTI improves over baselines on both the four MBTI dichotomies and the full 16-type task on the Kaggle and Pandora benchmarks. The authors also report gains in interpretability and cross-dataset generalization.
📝 Abstract
Personality recognition from text is typically cast as hard-label classification, which obscures the graded, prototype-like nature of human personality judgments. We present ProtoMBTI, a cognitively aligned framework for MBTI inference that operationalizes prototype theory within an LLM-based pipeline. First, we construct a balanced, quality-controlled corpus via LLM-guided multi-dimensional augmentation (semantic, linguistic, sentiment). Next, we LoRA-fine-tune a lightweight (≤2B) encoder to learn discriminative embeddings and to standardize a bank of personality prototypes. At inference, we retrieve top-k prototypes for a query post and perform a retrieve–reuse–revise–retain cycle: the model aggregates prototype evidence via prompt-based voting, revises when inconsistencies arise, and, upon correct prediction, retains the sample to continually enrich the prototype library. Across Kaggle and Pandora benchmarks, ProtoMBTI improves over baselines on both the four MBTI dichotomies and the full 16-type task, and exhibits robust cross-dataset generalization. Our results indicate that aligning the inference process with psychological prototype reasoning yields gains in accuracy, interpretability, and transfer for text-based personality modeling.
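
The retrieve–reuse–revise–retain cycle described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class `PrototypeLibrary`, its method names, the `margin` threshold, and the toy 2-D embeddings are all hypothetical. In particular, a similarity-weighted vote stands in for the paper's prompt-based LLM voting, and "revise" is approximated by re-voting over a larger neighbourhood when the top two candidates are close.

```python
import math
from collections import defaultdict
from dataclasses import dataclass, field


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


@dataclass
class PrototypeLibrary:
    # Each item is (embedding, mbti_type). In the paper the embeddings come
    # from a LoRA-fine-tuned encoder; here they are supplied directly.
    items: list = field(default_factory=list)

    def retrieve(self, query, k=3):
        """Retrieve: the k prototypes most similar to the query."""
        scored = [(cosine(query, emb), label) for emb, label in self.items]
        return sorted(scored, key=lambda s: s[0], reverse=True)[:k]

    def vote(self, neighbours):
        """Reuse: aggregate prototype evidence (assumed stand-in for LLM voting)."""
        scores = defaultdict(float)
        for sim, label in neighbours:
            scores[label] += max(sim, 0.0)
        return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

    def predict(self, query, k=3, margin=0.1):
        ranked = self.vote(self.retrieve(query, k))
        # Revise: if the evidence is inconsistent (top two scores are close),
        # re-vote with a wider neighbourhood. The threshold is an assumption.
        if len(ranked) > 1 and ranked[0][1] - ranked[1][1] < margin:
            ranked = self.vote(self.retrieve(query, 2 * k))
        return ranked[0][0] if ranked else None

    def retain(self, query, predicted, gold):
        """Retain: add the sample to the library only if the prediction was correct."""
        if predicted == gold:
            self.items.append((query, gold))
            return True
        return False


# Toy usage with made-up 2-D embeddings.
library = PrototypeLibrary([
    ([1.0, 0.0], "INTJ"),
    ([0.9, 0.1], "INTJ"),
    ([0.0, 1.0], "ENFP"),
])
query = [0.8, 0.2]
prediction = library.predict(query)
library.retain(query, prediction, gold="INTJ")
```

Retaining only correctly predicted samples keeps mislabeled evidence out of the library, at the cost of requiring a gold label at retention time.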