Survey on Explainability-Weaponising Adversarial Attack Vectors against Deep Neural Networks and Artificial Intelligence
Adversarial machine learning has exposed the fragility of deep neural networks, while explainable artificial intelligence (xAI) has been introduced to make AI systems more transparent and trustworthy. It has recently been demonstrated, however, that xAI can be weaponised, enabling adversaries to amplify the effectiveness and efficiency of adversarial attacks. This paper presents the first systematic survey dedicated to xAI-weaponising adversarial attacks. The literature is synthesised across four adversarial goals: evasion, poisoning/backdoors, privacy/inference, and model extraction. A unified taxonomy is proposed that organises attack vectors according to adversarial goal, the operational role of xAI, and attacker capabilities. The bibliographic methodology follows PRISMA guidelines, with structured queries applied to IEEE Xplore, ACM Digital Library, SpringerLink, ScienceDirect, and Google Scholar, complemented by snowballing; the date range was restricted to 2020–2025. The findings indicate that evasion attacks dominate the current literature, while poisoning and extraction attacks remain comparatively underexplored. Open challenges and research directions are identified. This survey reframes xAI from a purely diagnostic tool into a security-critical interface and provides a foundation for principled defences.