AI Summary
To address the lack of interpretability, low efficiency of manual reverse engineering, and weak resilience against obfuscation in Android malware analysis, this paper proposes XAIDroid, the first explainable malicious code localization framework integrating graph attention mechanisms with API call graphs. XAIDroid constructs a semantically enriched API call graph and jointly employs a Graph Attention Model (GAM) and a Graph Attention Network (GAT) to collaboratively learn node importance, enabling fine-grained localization of malicious code segments and visualizing decision rationales. Evaluated on both synthetic and real-world datasets, XAIDroid significantly improves recall and F1-score while demonstrating high robustness, strong scalability, and superior resistance to obfuscation techniques. By unifying automated detection with transparent, human-interpretable reasoning, XAIDroid establishes a novel paradigm for trustworthy, explainable mobile malware analysis.
Abstract
With the escalating threat of malware, particularly on mobile devices, the demand for effective analysis methods has never been higher. While existing security solutions, including AI-based approaches, offer promise, their lack of transparency constrains the understanding of detected threats, and manual analysis remains time-consuming and reliant on scarce expertise. To address these challenges, we propose XAIDroid, a novel approach that leverages graph neural networks (GNNs) and graph attention mechanisms to automatically locate malicious code snippets within malware. By representing code as API call graphs, XAIDroid captures semantic context and enhances resilience against obfuscation. Using the Graph Attention Model (GAM) and the Graph Attention Network (GAT), we assign importance scores to API nodes, focusing attention on the information most critical for malicious code localization. Evaluation on synthetic and real-world malware datasets demonstrates the efficacy of our approach, which achieves high recall and F1-score for malicious code localization. Automatic malicious code localization in turn enhances the scalability, interpretability, and reliability of malware analysis.
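To make the idea of attention-based importance scores on API nodes concrete, the following is a minimal sketch and not the authors' implementation. It applies the standard GAT attention formula, a softmax over each node's neighbors of LeakyReLU(a · [h_i || h_j]), to a toy API call graph. The API names, the two-dimensional feature vectors, the attention vector `attn_vec`, the omission of the learned projection matrix, and the choice to score a node by summing its incoming attention are all assumptions made for illustration; in a real model the features and attention parameters would be learned.

```python
import math

def leaky_relu(x, slope=0.2):
    """LeakyReLU with the negative slope commonly used in GAT."""
    return x if x > 0 else slope * x

def attention_coefficients(features, edges, attn_vec):
    """GAT-style attention over a directed call graph.

    For each caller i, computes softmax over its callees j of
    LeakyReLU(a . [h_i || h_j]). The learned projection W is
    omitted (treated as identity) to keep the sketch short.
    """
    neighbors = {}
    for src, dst in edges:
        neighbors.setdefault(src, []).append(dst)

    alpha = {}
    for src, dsts in neighbors.items():
        logits = []
        for dst in dsts:
            concat = features[src] + features[dst]  # [h_i || h_j]
            logits.append(leaky_relu(sum(a * x for a, x in zip(attn_vec, concat))))
        peak = max(logits)  # subtract the max for numerical stability
        exps = [math.exp(v - peak) for v in logits]
        total = sum(exps)
        for dst, e in zip(dsts, exps):
            alpha[(src, dst)] = e / total
    return alpha

def node_importance(alpha):
    """Assumed aggregation: a node's score is its total incoming attention."""
    scores = {}
    for (_, dst), weight in alpha.items():
        scores[dst] = scores.get(dst, 0.0) + weight
    return scores

# Hypothetical toy graph: features are [touches_sensitive_data, sends_data_out].
features = {
    "Activity.onCreate": [0.0, 0.0],
    "TelephonyManager.getDeviceId": [1.0, 0.0],
    "SmsManager.sendTextMessage": [1.0, 1.0],
    "Log.d": [0.0, 0.0],
}
edges = [
    ("Activity.onCreate", "TelephonyManager.getDeviceId"),
    ("Activity.onCreate", "Log.d"),
    ("TelephonyManager.getDeviceId", "SmsManager.sendTextMessage"),
    ("TelephonyManager.getDeviceId", "Log.d"),
]
attn_vec = [0.5, 0.5, 1.0, 1.0]  # hand-picked here; learned in a real GAT

alpha = attention_coefficients(features, edges, attn_vec)
scores = node_importance(alpha)
ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

for api, score in ranked:
    print(f"{api}: {score:.3f}")
```

Ranking the nodes by such scores and mapping the highest-ranked APIs back to the methods that call them is one plausible way to surface candidate malicious code regions for an analyst to inspect.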