DIB-X: Formulating Explainability Principles for a Self-Explainable Model Through Information Theoretic Learning
Permanent link
https://hdl.handle.net/10037/36693
Date
2024-03-18
Type
Journal article
Peer reviewed
Author
Choi, Changkyu; Yu, Shujian; Kampffmeyer, Michael Christian; Salberg, Arnt-Børre; Handegard, Nils Olav; Jenssen, Robert
Abstract
Recent work on self-explainable deep learning has focused on integrating well-defined explainability principles into the learning process, with the goal of achieving these principles through optimization. In this work, we propose DIB-X, a self-explainable deep learning approach for image data that adheres to the principles of minimal, sufficient, and interactive explanations. The minimality and sufficiency principles are rooted in the trade-off relationship within the information bottleneck framework. Distinctly, DIB-X directly quantifies the minimality principle using the recently proposed matrix-based Rényi's α-order entropy functional, circumventing the need for variational approximation and distributional assumptions. The interactivity principle is realized by incorporating existing domain knowledge as prior explanations, fostering explanations that align with established domain understanding. Empirical results on MNIST and two marine environment monitoring datasets with different modalities show that our approach primarily provides improved explainability, with the added advantage of enhanced classification performance.
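The minimality term above relies on the matrix-based Rényi's α-order entropy functional, which is computed from the eigenvalues of a trace-normalized Gram matrix rather than from an estimated density. The following is a minimal NumPy sketch of that quantity under common assumptions (an RBF kernel, and illustrative choices of bandwidth and α); it is not the paper's implementation, and the function name and defaults are hypothetical:

```python
import numpy as np

def matrix_renyi_entropy(X, sigma=1.0, alpha=1.01):
    """Matrix-based Renyi alpha-order entropy:
        S_alpha(A) = 1 / (1 - alpha) * log2(sum_i lambda_i(A)**alpha),
    where A is a Gram matrix normalized to unit trace and lambda_i
    are its eigenvalues. Kernel choice and parameters are illustrative.
    """
    # Pairwise squared Euclidean distances for an RBF (Gaussian) kernel.
    sq = np.sum(X**2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    K = np.exp(-d2 / (2.0 * sigma**2))
    # Normalize so the Gram matrix has unit trace (eigenvalues sum to 1).
    A = K / np.trace(K)
    # Eigenvalues of the symmetric PSD matrix; drop numerical noise.
    lam = np.linalg.eigvalsh(A)
    lam = lam[lam > 1e-12]
    return (1.0 / (1.0 - alpha)) * np.log2(np.sum(lam**alpha))
```

For n identical samples the normalized Gram matrix has a single nonzero eigenvalue, so the entropy is 0; for n well-separated samples the eigenvalues approach 1/n each and the entropy approaches log2(n). No density model or variational bound enters the computation, which is the property the abstract highlights.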
Publisher
IEEE
Citation
Choi, Yu, Kampffmeyer, Salberg, Handegard, Jenssen. DIB-X: Formulating Explainability Principles for a Self-Explainable Model Through Information Theoretic Learning. Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing. 2024:7170-7174
Copyright 2024 The Author(s)