🤖 AI Summary
Existing cascade models typically treat network structure and user identity as isolated or monolithic factors, and so fail to capture how multiple social factors interact in the diffusion of hashtags on social media. To address this, we propose the first decoupled evaluation framework, comprising ten interpretable factors that separately quantify network- and identity-driven contributions, and develop a joint computational model integrating both. Using a novel dataset of 1,337 cultural innovation hashtags, we conduct multi-factor cascade simulation, counterfactual analysis, and classification prediction. Our analysis reveals that distinct diffusion dimensions (popularity, growth rate, and adopter composition) are governed by different factor combinations, which allows the best-performing modeling configuration to be selected adaptively for each hashtag. Experiments demonstrate that the joint model significantly outperforms single-factor baselines, achieving an average 23% improvement in simulation accuracy for race/region- and sports/news-related hashtags.
📝 Abstract
The diffusion of culture online is theorized to be influenced by many interacting social factors (e.g., network and identity). However, most existing computational cascade models consider just a single factor (e.g., network or identity). This work offers a new framework for teasing apart the mechanisms underlying hashtag cascades. We curate a new dataset of 1,337 hashtags representing cultural innovation online, develop a 10-factor evaluation framework for comparing empirical and simulated cascades, and show that a combined network+identity model better simulates hashtag cascades than network-only or identity-only counterfactuals. We also explore heterogeneity in performance: while a combined network+identity model best predicts the popularity of cascades, a network-only model best predicts cascade growth and an identity-only model best predicts adopter composition. The network+identity model has the highest comparative advantage among hashtags used for expressing racial or regional identity and for talking about sports or news. In fact, we are able to predict which combination of network and/or identity best models each hashtag, and we use this prediction to further improve performance. Our results show the utility of models incorporating the interactions of network, identity, and other social factors in the diffusion of hashtags on social media.
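The idea of choosing a model variant per hashtag can be illustrated with a minimal sketch. It is not the authors' implementation: the error values are made up, the names `errors`, `best_variant`, `mean_error_fixed`, and `mean_error_adaptive` are hypothetical, and the selection shown is an oracle (it picks the variant with the lowest known error), whereas the abstract describes predicting that choice. The oracle therefore gives only an upper bound on what a predicted selection could achieve.

```python
from statistics import mean

# Hypothetical per-hashtag simulation errors (lower is better).
# These numbers are invented for illustration, not results from the paper.
errors = {
    "#tagA": {"network": 0.40, "identity": 0.55, "network+identity": 0.30},
    "#tagB": {"network": 0.20, "identity": 0.35, "network+identity": 0.25},
    "#tagC": {"network": 0.50, "identity": 0.30, "network+identity": 0.45},
}


def best_variant(per_model_error):
    """Return the model variant with the lowest error for one hashtag."""
    return min(per_model_error, key=per_model_error.get)


def mean_error_fixed(errors, variant):
    """Mean error when the same variant is used for every hashtag."""
    return mean(e[variant] for e in errors.values())


def mean_error_adaptive(errors):
    """Mean error when each hashtag uses its own best variant (oracle choice)."""
    return mean(e[best_variant(e)] for e in errors.values())


choices = {tag: best_variant(e) for tag, e in errors.items()}
fixed = mean_error_fixed(errors, "network+identity")
adaptive = mean_error_adaptive(errors)

print(choices)
print(f"fixed network+identity: {fixed:.3f}, adaptive: {adaptive:.3f}")
```

By construction, per-hashtag selection with an oracle can never do worse than any single fixed variant; how much of that gap a learned predictor recovers depends on how accurately it guesses the best variant.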