Improving Representation Learning for Deep Clustering and Few-shot Learning
Permanent link
https://hdl.handle.net/10037/29847
Date
2023-08-23
Type
Doctoral thesis
Author
Trosten, Daniel Johansen
Abstract
The amount of data in the world has increased dramatically in recent years, and it is quickly becoming infeasible for humans to label all of it. It is therefore crucial that modern machine learning systems can operate with few or no labels. The introduction of deep learning and deep neural networks has led to impressive advancements in several areas of machine learning. These advancements are largely due to the unprecedented ability of deep neural networks to learn powerful representations from a wide range of complex input signals. This ability is especially important when labeled data is limited, as the absence of a strong supervisory signal forces models to rely more on intrinsic properties of the data and its representations.
This thesis focuses on two key concepts in deep learning with few or no labels. First, we aim to improve representation quality in deep clustering - both for single-view and multi-view data. Current models for deep clustering face challenges related to properly representing semantic similarities, which is crucial for the models to discover meaningful clusterings. This is especially challenging with multi-view data, since the information required for successful clustering might be scattered across many views. Second, we focus on few-shot learning, and how geometrical properties of representations influence few-shot classification performance. We find that a large number of recent methods for few-shot learning embed representations on the hypersphere. Hence, we seek to understand what makes the hypersphere a particularly suitable embedding space for few-shot learning.
Our work on single-view deep clustering addresses the tendency of deep clustering models to converge to trivial solutions with non-meaningful representations. To address this issue, we present a new auxiliary objective that - when compared to the popular autoencoder-based approach - better aligns with the main clustering objective, resulting in improved clustering performance. Similarly, our work on multi-view clustering focuses on how representations can be learned from multi-view data so that they are suitable for the clustering objective. Where recent methods for deep multi-view clustering have focused on aligning view-specific representations, we find that this alignment procedure might actually be detrimental to representation quality. We investigate the effects of representation alignment, and provide novel insights into when alignment is beneficial, and when it is not. Based on our findings, we present several new methods for deep multi-view clustering - both alignment-based and non-alignment-based - that outperform current state-of-the-art methods.
Our first work on few-shot learning aims to tackle the hubness problem, which has been shown to have negative effects on few-shot classification performance. To this end, we present two new methods to embed representations on the hypersphere for few-shot learning. Further, we provide both theoretical and experimental evidence indicating that embedding representations as uniformly as possible on the hypersphere reduces hubness and improves classification accuracy. Building on our findings on hyperspherical embeddings for few-shot learning, we then seek to improve the understanding of representation norms. In particular, we ask what type of information the norm carries, and why it is often beneficial to discard the norm in classification models. We answer this question by presenting a novel hypothesis on the relationship between the representation norm and the number of objects of a certain class in the image. We then analyze our hypothesis both theoretically and experimentally, presenting promising results that corroborate the hypothesis.
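As a rough illustration of the terms used above, hubness is commonly quantified through the skewness of the k-occurrence distribution, where the k-occurrence of a point is the number of times it appears among the k nearest neighbours of other points. The sketch below is not the method proposed in the thesis: it only computes this standard statistic and applies plain L2 normalization, which places representations on the hypersphere but does not by itself make them uniform. The function names, the random data, and the values of `k`, `n`, and `d` are assumptions chosen for illustration.

```python
import math
import random


def l2_normalize(points):
    """Project each vector onto the unit hypersphere (zero vectors are left unchanged)."""
    out = []
    for p in points:
        norm = math.sqrt(sum(x * x for x in p))
        out.append([x / norm for x in p] if norm > 0 else list(p))
    return out


def k_occurrence(points, k):
    """Count how often each point appears among the k nearest neighbours of other points."""
    n = len(points)
    counts = [0] * n
    for i in range(n):
        dists = []
        for j in range(n):
            if j != i:
                # Squared Euclidean distance preserves the neighbour ordering.
                d = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                dists.append((d, j))
        dists.sort()
        for _, j in dists[:k]:
            counts[j] += 1
    return counts


def skewness(values):
    """Population skewness; larger positive values indicate stronger hubness."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5 if m2 > 0 else 0.0


if __name__ == "__main__":
    # Hypothetical stand-in for learned representations: random Gaussian vectors.
    rng = random.Random(0)
    n, d, k = 100, 20, 5
    reps = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
    print("hubness (raw):       ", skewness(k_occurrence(reps, k)))
    print("hubness (normalized):", skewness(k_occurrence(l2_normalize(reps), k)))
```

Whether normalization changes the measured hubness depends on the data; the two printed values are meant for comparison only and are not a reproduction of the results reported in the thesis.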
Has part(s)
Paper I: Trosten, D.J., Løkse, S., Jenssen, R. & Kampffmeyer, M. (2023). Leveraging Tensor Kernels to Reduce Objective Function Mismatch in Deep Clustering. (Submitted manuscript).
Paper II: Trosten, D.J., Løkse, S., Jenssen, R. & Kampffmeyer, M. (2021). Reconsidering Representation Alignment for Multi-view Clustering. 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Published version available at https://doi.org/10.1109/CVPR46437.2021.00131. Accepted manuscript version available in Munin at https://hdl.handle.net/10037/24371.
Paper III: Trosten, D.J., Løkse, S., Jenssen, R. & Kampffmeyer, M. (2023). On the Effects of Self-supervision and Contrastive Alignment in Deep Multi-view Clustering. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023, 23976-23985. Also available at CVPR 2023 open access.
Paper IV: Trosten, D.J., Chakraborty, R., Løkse, S., Wickstrøm, K., Jenssen, R. & Kampffmeyer, M. (2023). Hubs and Hyperspheres: Reducing Hubness and Improving Transductive Few-shot Learning with Hyperspherical Embeddings. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023, 7527-7536. Also available at CVPR 2023 open access.
Paper V: Trosten, D.J., Løkse, S., Jenssen, R. & Kampffmeyer, M. Norm-Count Hypothesis: On the Relationship Between Norm and Object Count in Visual Representations. (Submitted manuscript).
Publisher
UiT The Arctic University of Norway