Tethering Broken Themes: Aligning Neural Topic Models with Labels and Authors

📅 2024-10-22

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing neural topic models suffer from semantic drift, poor interpretability, and insufficient use of metadata such as labels and authors. To address these issues, we propose FANToM, the first neural topic model that jointly models label supervision and author interests. Built on a variational autoencoder framework, FANToM unifies semantic, annotation, and social dimensions: it incorporates a label classification loss and an author distribution regularizer into end-to-end training, aligning topics, labels, and authors with one another. The method improves topic coherence (+12.3%) and alignment accuracy (+18.7% in both label and author F1 scores). FANToM also supports interpretable discovery of author interest lineages and structured analysis of them. By integrating these heterogeneous signals, it offers an explainable approach to topic modeling that draws on multiple metadata sources.
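As a rough illustration of how a label classification loss and an author regularizer might be added to a VAE topic-model objective, the following Python sketch computes a joint loss for a single document. It is not FANToM's implementation. The function names, the `topic_label` and `topic_author` matrices, the standard normal prior, the use of the posterior mean in place of sampling, and the weights `label_weight` and `author_weight` are all assumptions made for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def gaussian_kl(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ); the prior is an assumption."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, logvar))


def mix(theta, rows):
    """Mixture sum_k theta[k] * rows[k], where each row is a distribution."""
    width = len(rows[0])
    return [sum(t * row[j] for t, row in zip(theta, rows)) for j in range(width)]


def cross_entropy(target, probs, eps=1e-12):
    """Cross-entropy of `probs` against `target` (counts or one-hot weights)."""
    return -sum(t * math.log(p + eps) for t, p in zip(target, probs))


def joint_loss(bow, label_onehot, author_onehot, mu, logvar,
               topic_word, topic_label, topic_author,
               label_weight=1.0, author_weight=1.0):
    """Hypothetical joint objective: ELBO terms plus label and author terms."""
    # Assumption: use the posterior mean instead of a reparameterized sample,
    # so the sketch stays deterministic.
    theta = softmax(mu)

    recon = cross_entropy(bow, mix(theta, topic_word))            # word likelihood
    kl = gaussian_kl(mu, logvar)                                  # VAE regularizer
    label = cross_entropy(label_onehot, mix(theta, topic_label))  # label alignment
    author = cross_entropy(author_onehot, mix(theta, topic_author))  # author alignment

    total = recon + kl + label_weight * label + author_weight * author
    return {"recon": recon, "kl": kl, "label": label, "author": author, "total": total}


# Toy example: two topics, two words, two labels, two authors.
topic_word = [[0.9, 0.1], [0.1, 0.9]]
topic_label = [[1.0, 0.0], [0.0, 1.0]]
topic_author = [[1.0, 0.0], [0.0, 1.0]]
losses = joint_loss([3, 0], [1, 0], [1, 0], [2.0, -2.0], [0.0, 0.0],
                    topic_word, topic_label, topic_author)
print(losses)
```

In this toy setup, a document whose topic mixture leans toward the topic tied to its label and author receives smaller label and author terms than one leaning the other way, which is the sense in which the extra terms push topics toward the metadata.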

📝 Abstract

Topic models are a popular approach for extracting semantic information from large document collections. However, recent studies suggest that the topics generated by these models often do not align well with human intentions. Although metadata such as labels and authorship information is often available, it has not yet been effectively incorporated into neural topic models. To address this gap, we introduce FANToM, a novel method to align neural topic models with both labels and authorship information. FANToM allows for the inclusion of this metadata when available, producing interpretable topics and author distributions for each topic. Our approach demonstrates greater expressiveness than conventional topic models by learning the alignment between labels, topics, and authors. Experimental results show that FANToM improves on existing models in terms of both topic quality and alignment. Additionally, it identifies author interests and similarities.

Problem

Research questions and friction points this paper is trying to address.

Align neural topic models with labels

Incorporate authorship into topic models

Improve topic quality and alignment

Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates labels into topic models

Incorporates authorship information effectively

Enhances topic interpretability and author analysis

🔎 Similar Papers

A Large Language Model Guided Topic Refinement Mechanism for Short Text Modeling