🤖 AI Summary
To address the limited interpretability of event classification at the Large Hadron Collider (LHC), this paper proposes an interpretable graph neural network that combines a Graph Transformer with Mixture-of-Experts (MoE) layers. It is presented as the first work to incorporate expert specialization into Graph Transformers: the MoE layers enable physics-informed feature specialization, for example modeling distinct particle production mechanisms or decay topologies. Coupled with differentiable attention visualization, the model relates its attention heatmaps to domain-specific physical quantities such as transverse momentum and track topology. Evaluated on graph-structured simulated ATLAS events for classifying supersymmetric signal against Standard Model background, the method achieves competitive accuracy while producing physically meaningful attention evidence. Quantitative and qualitative analyses indicate that the model's decision rationale is consistent with established physical principles, supporting both predictive performance and scientific interpretability.
📝 Abstract
The Large Hadron Collider at CERN produces immense volumes of complex data from high-energy particle collisions, demanding sophisticated analytical techniques for effective interpretation. Neural networks, including Graph Neural Networks, have shown promise in tasks such as event classification and object identification by representing collisions as graphs. However, while Graph Neural Networks excel in predictive accuracy, their "black box" nature often limits their interpretability, making it difficult to trust their decision-making processes. In this paper, we propose a novel approach that combines a Graph Transformer model with Mixture-of-Experts layers to achieve high predictive performance while embedding interpretability into the architecture. By leveraging attention maps and expert specialization, the model offers insights into its internal decision-making, linking predictions to physics-informed features. We evaluate the model on simulated events from the ATLAS experiment, focusing on distinguishing rare supersymmetric signal events from Standard Model background. Our results show that the model achieves competitive classification accuracy while providing interpretable outputs that align with known physics, demonstrating its potential as a robust and transparent tool for high-energy physics data analysis. This approach underscores the importance of explainability in machine learning methods applied to high-energy physics, offering a path toward greater trust in AI-driven discoveries.
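To make the idea of expert specialization more concrete, the following is a minimal sketch of a softly gated Mixture-of-Experts layer applied to per-particle feature vectors. It is not the authors' implementation: the abstract does not specify the gating scheme, so dense softmax gating is assumed here, and the class name `MoELayer`, the feature layout, and the random weights are all hypothetical. The per-node gate weights it returns are the kind of quantity one could inspect to see which expert handles which particles.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a row-major matrix by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def random_matrix(rows, cols, rng):
    """Small random weights; stands in for trained parameters."""
    return [[rng.gauss(0.0, 0.5) for _ in range(cols)] for _ in range(rows)]


class MoELayer:
    """Hypothetical dense MoE layer: every expert runs, outputs are gate-weighted."""

    def __init__(self, dim, num_experts, seed=0):
        rng = random.Random(seed)
        self.gate = random_matrix(num_experts, dim, rng)
        self.experts = [random_matrix(dim, dim, rng) for _ in range(num_experts)]

    def forward(self, nodes):
        outputs, gate_weights = [], []
        for x in nodes:
            # One gate weight per expert, summing to 1 for this node.
            g = softmax(matvec(self.gate, x))
            expert_outs = [matvec(w, x) for w in self.experts]
            # Gate-weighted sum of the expert outputs, per feature dimension.
            y = [
                sum(g[k] * expert_outs[k][i] for k in range(len(g)))
                for i in range(len(x))
            ]
            outputs.append(y)
            gate_weights.append(g)
        return outputs, gate_weights


# Toy event: three particles with assumed features (pT, eta, phi, energy).
event = [
    [1.2, 0.3, -0.5, 1.5],
    [0.4, -1.1, 2.0, 0.9],
    [2.5, 0.0, 0.7, 2.6],
]
layer = MoELayer(dim=4, num_experts=3)
outputs, gate_weights = layer.forward(event)
```

In a trained model, comparing `gate_weights` across particles or event types would be one way to check whether individual experts have specialized, alongside the attention maps from the Transformer layers.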