🤖 AI Summary
This work addresses the limited interpretability of graph clustering driven by text embeddings. We propose a general interpretability framework for several types of static word embeddings, GloVe in particular. Methodologically, we model document similarity as cosine similarity in the word vector space, construct a semantic graph, perform graph-based clustering, and combine visualization with feature attribution techniques to yield semantic-level explanations of the clustering outcomes. Our key contribution is the first systematic extension of graph clustering interpretability to non-contextual, static embeddings, which removes the prior reliance on contextualized models such as BERT. Experimental results demonstrate that the framework consistently enhances clustering transparency and generalizes across diverse semantic spaces, significantly improving the understandability and trustworthiness of model decisions.
📝 Abstract
In a previous paper, we introduced an approach to explaining the results of Graph Spectral Clustering for textual documents, under the assumption that document similarity is computed as cosine similarity in the term vector space.
In this paper, we generalize this idea to other document embeddings, in particular those based on the GloVe embedding approach.
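To make the pipeline concrete, the following is a minimal sketch of the steps named above: embedding documents, computing cosine similarities, and splitting the resulting similarity graph spectrally. It is an illustration under stated assumptions and not the authors' implementation. The word vectors in `toy_vectors` are hypothetical stand-ins for pretrained GloVe vectors, mean pooling of word vectors is an assumed way of forming a document embedding, and the function names `embed`, `cosine_similarity_matrix`, and `spectral_bipartition` are invented for this example.

```python
import numpy as np

# Hypothetical 3-dimensional vectors standing in for pretrained GloVe vectors.
toy_vectors = {
    "cat": np.array([0.90, 0.10, 0.00]),
    "dog": np.array([0.80, 0.20, 0.10]),
    "pet": np.array([0.85, 0.15, 0.05]),
    "stock": np.array([0.10, 0.90, 0.20]),
    "market": np.array([0.00, 0.80, 0.30]),
    "trade": np.array([0.05, 0.85, 0.25]),
}


def embed(doc):
    """Document embedding as the mean of its known word vectors (an assumption)."""
    words = [w for w in doc.lower().split() if w in toy_vectors]
    if not words:
        raise ValueError(f"no known words in document: {doc!r}")
    return np.mean([toy_vectors[w] for w in words], axis=0)


def cosine_similarity_matrix(X):
    """Pairwise cosine similarity between the rows of X."""
    unit = X / np.linalg.norm(X, axis=1, keepdims=True)
    return unit @ unit.T


def spectral_bipartition(S):
    """Split a similarity graph in two using the normalized graph Laplacian."""
    W = np.clip(S, 0.0, None)       # keep only non-negative edge weights
    np.fill_diagonal(W, 0.0)        # no self-loops
    d = W.sum(axis=1)               # node degrees
    d_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    L = np.eye(len(W)) - d_inv_sqrt @ W @ d_inv_sqrt
    _, eigvecs = np.linalg.eigh(L)  # eigenvalues returned in ascending order
    fiedler = eigvecs[:, 1]         # eigenvector of the second-smallest eigenvalue
    return (fiedler > 0).astype(int)


docs = ["cat dog", "dog pet", "stock market", "market trade"]
X = np.vstack([embed(d) for d in docs])
S = cosine_similarity_matrix(X)
labels = spectral_bipartition(S)
print(labels)
```

The sign of an eigenvector is arbitrary, so which group receives label 0 or 1 can differ between runs or platforms; only the grouping itself is meaningful. For more than two clusters, a common approach is to run k-means on several of the lowest eigenvectors instead of thresholding a single one.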