Agree to Disagree: Exploring Consensus of XAI Methods for ML-based NIDS

Today our paper “Agree to Disagree: Exploring Consensus of XAI Methods for ML-based NIDS” was presented at the 1st Workshop on Network Security Operations (NecSecOr). The paper examines how consistently various explainable AI (XAI) methods explain the decisions of machine learning-based Network Intrusion Detection Systems (ML-NIDS). We find that while some methods align closely, others diverge significantly, underscoring the need for careful method selection to build trust in real-world cybersecurity applications.

Abstract:

The increasing complexity and frequency of cyber attacks require Network Intrusion Detection Systems (NIDS) that can adapt to evolving threats. Artificial intelligence (AI), particularly machine learning (ML), has gained increasing popularity for detecting sophisticated attacks. However, the potential lack of interpretability of ML models remains a significant barrier to their widespread adoption in practice, especially in security-sensitive areas. In response, various explainable AI (XAI) methods have been proposed to provide insights into the decision-making process. This paper investigates whether these XAI methods, including SHAP, LIME, Tree Interpreter, Saliency, Integrated Gradients, and DeepLIFT, produce similar explanations when applied to ML-NIDS. By analyzing consensus among these methods across different datasets and ML models, we explore whether an agreement exists that could simplify the practical adoption of XAI in cybersecurity, as similar explanations would eliminate the need for rigorous selection processes. Our findings reveal varying degrees of consensus among the methods, suggesting that while some align closely, others diverge significantly, highlighting the need for careful selection and combination of XAI tools to enhance trustworthiness in real-world applications.
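To illustrate the kind of consensus analysis the abstract describes, the following is a minimal sketch (not the paper's actual pipeline) of how agreement between two attribution methods could be measured. It assumes the `shap`, `lime`, `scipy`, and `scikit-learn` packages are available, uses synthetic data as a stand-in for a NIDS dataset, and compares SHAP and LIME feature rankings for a single sample with Kendall's tau; the paper itself covers additional methods, datasets, and models.

```python
# Hypothetical consensus check between SHAP and LIME attributions.
import numpy as np
import shap
from lime.lime_tabular import LimeTabularExplainer
from scipy.stats import kendalltau
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for a NIDS dataset (flow features, benign/attack labels).
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
sample = X[0]

# SHAP attributions for the positive (attack) class.
shap_explainer = shap.TreeExplainer(model)
sv = shap_explainer.shap_values(sample.reshape(1, -1))
if isinstance(sv, list):        # older SHAP versions: list of per-class arrays
    shap_attr = sv[1][0]
elif np.ndim(sv) == 3:          # newer versions: (samples, features, classes)
    shap_attr = sv[0, :, 1]
else:
    shap_attr = sv[0]

# LIME attributions for the same sample.
lime_explainer = LimeTabularExplainer(X, mode="classification")
lime_exp = lime_explainer.explain_instance(
    sample, model.predict_proba, num_features=X.shape[1]
)
lime_attr = np.zeros(X.shape[1])
for feature_idx, weight in lime_exp.as_map()[1]:
    lime_attr[feature_idx] = weight

# Consensus proxy: rank correlation of absolute feature importances.
tau, _ = kendalltau(np.abs(shap_attr), np.abs(lime_attr))
print(f"Kendall's tau between SHAP and LIME rankings: {tau:.3f}")
```

A tau close to 1 would indicate that the two methods rank the same features as most influential, while values near 0 or below would signal the kind of divergence the paper highlights.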
