Propagating Transparency: A Deep Dive into the Interpretability of Neural Networks
Permanent link
https://hdl.handle.net/10037/35446
Date
2024-08-19
Type
Journal article
Peer reviewed
Abstract
In the rapidly evolving landscape of deep learning (DL), understanding the inner workings of neural networks remains a significant challenge. The need for transparency and accountability in DL models grows as they become more prevalent in decision-making processes, and interpreting these models is key to addressing this challenge. This paper offers a comprehensive overview of interpretability methods for neural networks, particularly convolutional neural networks (CNNs). The focus is on gradient-based propagation techniques that provide insight into the mechanisms behind neural network predictions. Using a systematic review, we classify gradient-based interpretability approaches, examine the theory behind notable methods, and compare their strengths and weaknesses. Furthermore, we investigate evaluation metrics for interpretable systems, often generalized under the term eXplainable Artificial Intelligence (XAI). We highlight the importance of these metrics for evaluating XAI methods in terms of faithfulness, robustness, localization, complexity, randomization, and adherence to axiomatic principles. Our objective is to help researchers and practitioners advance towards artificial intelligence whose workings are more deeply understood, thereby providing the desired transparency and accuracy. To this end, we offer a comprehensive summary of the latest advances in the field.
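As a minimal illustration of the class of methods the survey covers (this sketch is not code from the paper), vanilla gradient saliency uses the gradient of the predicted class score with respect to the input pixels as an attribution map. The choice of PyTorch, the pretrained ResNet-18 model, and the random stand-in input are assumptions made purely for illustration.

# Illustrative sketch of vanilla gradient saliency (assumed setup, not from the paper).
import torch
import torchvision.models as models

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.eval()

# Random stand-in image; in practice this would be a real, normalized input.
x = torch.randn(1, 3, 224, 224, requires_grad=True)

# Forward pass, then gradient of the top-class score w.r.t. the input pixels.
scores = model(x)
top_class = scores.argmax(dim=1).item()
scores[0, top_class].backward()

# Saliency map: per-pixel maximum absolute gradient across colour channels.
saliency = x.grad.abs().max(dim=1).values.squeeze(0)  # shape (224, 224)

More elaborate gradient-based propagation methods discussed in the paper refine this basic recipe, for example by averaging gradients over interpolated or perturbed inputs.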
Publisher
Universitetet i Oslo
Citation
Somani A, Horsch A, Bopardikar, Prasad DK. Propagating Transparency: A Deep Dive into the Interpretability of Neural Networks. Nordic Machine Intelligence (NMI). 2024;4(2):1-18
Copyright 2024 The Author(s)