🤖 AI Summary
This work formally defines humblebragging—implicit self-promotion masked as complaint or modesty—as a novel pragmatic task in natural language understanding. To address it, the authors propose a quadruple-based formalization and introduce HB24, the first dataset for this task, comprising 3,340 samples generated by GPT-4o and validated by human annotators. Empirical analysis confirms that the task is cognitively challenging for both humans and models. The authors conduct a systematic evaluation across three methodological paradigms—traditional machine learning, fine-tuned BERT/RoBERTa models, and prompt-engineered large language models (LLMs)—and establish a human-annotated benchmark. The best-performing model achieves an F1 score of 0.88, substantially outperforming baselines. The results demonstrate that humblebragging detection goes beyond conventional sentiment or intent classification, demanding fine-grained pragmatic reasoning. This work establishes a foundational resource and a new paradigm for modeling implicit rhetoric and pragmatic inference in NLP.
📝 Abstract
Humblebragging is a phenomenon where individuals present self-promotional statements under the guise of modesty or complaints. For example, a statement like, "Ugh, I can't believe I got promoted to lead the entire team. So stressful!", subtly highlights an achievement while pretending to be complaining. Detecting humblebragging is important for machines to better understand the nuances of human language, especially in tasks like sentiment analysis and intent recognition. However, this topic has not yet been studied in computational linguistics. For the first time, we introduce the task of automatically detecting humblebragging in text. We formalize the task by proposing a 4-tuple definition of humblebragging and evaluate machine learning, deep learning, and large language models (LLMs) on this task, comparing their performance with humans. We also create and release a dataset called HB24, containing 3,340 humblebrags generated using GPT-4o. Our experiments show that detecting humblebragging is non-trivial, even for humans. Our best model achieves an F1-score of 0.88. This work lays the foundation for further exploration of this nuanced linguistic phenomenon and its integration into broader natural language understanding systems.
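To make the task concrete, here is a minimal sketch in Python of what a tuple-based humblebrag representation and a detector interface might look like. The abstract does not enumerate the 4-tuple's actual components, so the field names below (`text`, `brag`, `mask`, `label`) are purely illustrative assumptions, and the keyword heuristic is a deliberately naive stand-in, not any of the paper's actual models:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Humblebrag:
    """Hypothetical 4-tuple representation of a humblebrag instance.

    The fields are illustrative guesses, NOT the paper's definition:
    the utterance, its self-promotional core, the complaint/modesty
    framing that masks it, and the binary task label.
    """
    text: str
    brag: str
    mask: str
    label: bool

def naive_detect(text: str) -> bool:
    """Toy keyword heuristic (not the paper's method): flag utterances
    where a complaint marker co-occurs with an achievement marker."""
    complaint_markers = ("ugh", "so stressful", "exhausted", "can't believe")
    brag_markers = ("promoted", "award", "won", "lead the entire team")
    t = text.lower()
    return any(c in t for c in complaint_markers) and any(b in t for b in brag_markers)

example = "Ugh, I can't believe I got promoted to lead the entire team. So stressful!"
print(naive_detect(example))  # → True
print(naive_detect("Nice weather today."))  # → False
```

A heuristic like this illustrates why the task is non-trivial: surface cues alone cannot distinguish a genuine complaint that happens to mention an achievement from a disguised brag, which is why the paper evaluates pragmatic reasoning in ML, fine-tuned, and LLM-based models.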