DIB-X: Formulating Explainability Principles for a Self-Explainable Model Through Information Theoretic Learning
Permanent link
https://hdl.handle.net/10037/36693
Date
2024-03-18
Type
Journal article
Peer reviewed
Author
Choi, Changkyu; Yu, Shujian; Kampffmeyer, Michael Christian; Salberg, Arnt-Børre; Handegard, Nils Olav; Jenssen, Robert
Abstract
Recent work on self-explainable deep learning has focused on integrating well-defined explainability principles into the learning process, with the goal of achieving these principles through optimization. In this work, we propose DIB-X, a self-explainable deep learning approach for image data that adheres to the principles of minimal, sufficient, and interactive explanations. The minimality and sufficiency principles are rooted in the trade-off relationship within the information bottleneck framework. Distinctly, DIB-X directly quantifies the minimality principle using the recently proposed matrix-based Rényi's α-order entropy functional, circumventing the need for variational approximation and distributional assumptions. The interactivity principle is realized by incorporating existing domain knowledge as prior explanations, fostering explanations that align with established domain understanding. Empirical results on MNIST and two marine environment monitoring datasets with different modalities show that our approach primarily provides improved explainability, with the added advantage of enhanced classification performance.
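The minimality term above relies on the matrix-based Rényi's α-order entropy functional, which is computed from the eigenvalues of a trace-normalized Gram matrix rather than from an estimated density. The following is a minimal NumPy sketch of that quantity under common assumptions (an RBF kernel, and illustrative choices of bandwidth and α); it is not the paper's implementation, and the function name and defaults are hypothetical:

```python
import numpy as np

def matrix_renyi_entropy(X, sigma=1.0, alpha=1.01):
    """Matrix-based Renyi alpha-order entropy:
        S_alpha(A) = 1 / (1 - alpha) * log2(sum_i lambda_i(A)**alpha),
    where A is a Gram matrix normalized to unit trace and lambda_i
    are its eigenvalues. Kernel choice and parameters are illustrative.
    """
    # Pairwise squared Euclidean distances for an RBF (Gaussian) kernel.
    sq = np.sum(X**2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    K = np.exp(-d2 / (2.0 * sigma**2))
    # Normalize so the Gram matrix has unit trace (eigenvalues sum to 1).
    A = K / np.trace(K)
    # Eigenvalues of the symmetric PSD matrix; drop numerical noise.
    lam = np.linalg.eigvalsh(A)
    lam = lam[lam > 1e-12]
    return (1.0 / (1.0 - alpha)) * np.log2(np.sum(lam**alpha))
```

For n identical samples the normalized Gram matrix has a single nonzero eigenvalue, so the entropy is 0; for n well-separated samples the eigenvalues approach 1/n each and the entropy approaches log2(n). No density model or variational bound enters the computation, which is the property the abstract highlights.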
Publisher
IEEE
Citation
Choi, Yu, Kampffmeyer, Salberg, Handegard, Jenssen. DIB-X: Formulating Explainability Principles for a Self-Explainable Model Through Information Theoretic Learning. Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing. 2024:7170-7174
Copyright 2024 The Author(s)