RELAX: Representation Learning Explainability
Permanent link
https://hdl.handle.net/10037/30298
Date
2023-03-11
Type
Journal article
Peer reviewed
Authors
Wickstrøm, Kristoffer; Trosten, Daniel Johansen; Løkse, Sigurd Eivindson; Boubekki, Ahcene; Mikalsen, Karl Øyvind; Kampffmeyer, Michael; Jenssen, Robert
Abstract
Despite the significant improvements that self-supervised representation learning has led to when learning from unlabeled data, no methods have been developed that explain what influences the learned representation. We address this need through our proposed approach, RELAX, which is the first approach for attribution-based explanations of representations. Our approach can also model the uncertainty in its explanations, which is essential to produce trustworthy explanations. RELAX explains representations by measuring similarities in the representation space between an input and masked-out versions of itself, providing intuitive explanations that significantly outperform the gradient-based baselines. We provide theoretical interpretations of RELAX and conduct a novel analysis of feature extractors trained using supervised and unsupervised learning, providing insights into different learning strategies. Moreover, we conduct a user study to assess how well the proposed approach aligns with human intuition and show that the proposed method outperforms the baselines in both the quantitative and human evaluation studies. Finally, we illustrate the usability of RELAX in several use cases and highlight that incorporating uncertainty can be essential for providing faithful explanations, taking a crucial step towards explaining representations.
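The abstract describes the core idea: occlude parts of the input, re-encode the occluded versions, and score each pixel by how much the representation of views that keep that pixel resembles the representation of the full input. The sketch below illustrates that masking-and-similarity scheme in broad strokes; it is not the authors' reference implementation, and the encoder, mask sampling scheme (coarse Bernoulli masks upsampled to image size), and cosine similarity are assumptions made for illustration.

import torch
import torch.nn.functional as F

def masked_similarity_importance(encoder, image, n_masks=100, mask_prob=0.5, cell=7):
    """Illustrative masking-based attribution for a representation.

    encoder: callable mapping a (1, C, H, W) tensor to a (1, D) representation (assumed).
    image:   a (C, H, W) tensor.
    Returns a (H, W) importance map: pixels kept in masked views whose
    representations stay similar to the full image's representation score high.
    """
    _, H, W = image.shape
    h_ref = F.normalize(encoder(image.unsqueeze(0)), dim=1)  # representation of the unmasked input
    importance = torch.zeros(H, W)
    for _ in range(n_masks):
        # Sample a coarse binary mask and upsample it to image resolution.
        coarse = (torch.rand(1, 1, cell, cell) < mask_prob).float()
        mask = F.interpolate(coarse, size=(H, W), mode="bilinear", align_corners=False)[0, 0]
        h_masked = F.normalize(encoder((image * mask).unsqueeze(0)), dim=1)
        sim = (h_ref * h_masked).sum()  # cosine similarity between representations
        importance += sim * mask        # credit the pixels that were kept in this view
    return importance / n_masks

Tracking the spread of these similarity-weighted scores across masks (e.g., their variance) would give a rough per-pixel uncertainty estimate in the same spirit as the uncertainty modeling the abstract mentions.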
Publisher
Springer Nature
Citation
Wickstrøm, Trosten, Løkse, Boubekki, Mikalsen, Kampffmeyer, Jenssen. RELAX: Representation Learning Explainability. International Journal of Computer Vision. 2023;131(6):1584-1610.
Copyright 2023 The Author(s)