Improving Representation Learning for Deep Clustering and Few-shot Learning

Trosten, Daniel Johansen

dc.contributor.advisor	Kampffmeyer, Michael
dc.contributor.author	Trosten, Daniel Johansen
dc.date.accessioned	2023-08-10T10:39:08Z
dc.date.available	2023-08-10T10:39:08Z
dc.date.issued	2023-08-23
dc.description.abstract	<p>The amounts of data in the world have increased dramatically in recent years, and it is quickly becoming infeasible for humans to label all these data. It is therefore crucial that modern machine learning systems can operate with few or no labels. The introduction of deep learning and deep neural networks has led to impressive advancements in several areas of machine learning. These advancements are largely due to the unprecedented ability of deep neural networks to learn powerful representations from a wide range of complex input signals. This ability is especially important when labeled data is limited, as the absence of a strong supervisory signal forces models to rely more on intrinsic properties of the data and its representations. <p>This thesis focuses on two key concepts in deep learning with few or no labels. First, we aim to improve representation quality in deep clustering - both for single-view and multi-view data. Current models for deep clustering face challenges related to properly representing semantic similarities, which is crucial for the models to discover meaningful clusterings. This is especially challenging with multi-view data, since the information required for successful clustering might be scattered across many views. Second, we focus on few-shot learning, and how geometrical properties of representations influence few-shot classification performance. We find that a large number of recent methods for few-shot learning embed representations on the hypersphere. Hence, we seek to understand what makes the hypersphere a particularly suitable embedding space for few-shot learning. <p>Our work on single-view deep clustering addresses the susceptibility of deep clustering models to trivial solutions with non-meaningful representations. To address this issue, we present a new auxiliary objective that - when compared to the popular autoencoder-based approach - better aligns with the main clustering objective, resulting in improved clustering performance. Similarly, our work on multi-view clustering focuses on how representations can be learned from multi-view data, in order to make the representations suitable for the clustering objective. Where recent methods for deep multi-view clustering have focused on aligning view-specific representations, we find that this alignment procedure might actually be detrimental to representation quality. We investigate the effects of representation alignment, and provide novel insights on when alignment is beneficial, and when it is not. Based on our findings, we present several new methods for deep multi-view clustering - both alignment-based and non-alignment-based - that outperform current state-of-the-art methods. <p>Our first work on few-shot learning aims to tackle the hubness problem, which has been shown to have negative effects on few-shot classification performance. To this end, we present two new methods to embed representations on the hypersphere for few-shot learning. Further, we provide both theoretical and experimental evidence indicating that embedding representations as uniformly as possible on the hypersphere reduces hubness, and improves classification accuracy. Building on our findings on hyperspherical embeddings for few-shot learning, we then seek to improve the understanding of representation norms. In particular, we ask what type of information the norm carries, and why it is often beneficial to discard the norm in classification models. We answer this question by presenting a novel hypothesis on the relationship between representation norm and the number of a certain class of objects in the image. We then analyze our hypothesis both theoretically and experimentally, presenting promising results that corroborate the hypothesis.	en_US
dc.description.doctoraltype	ph.d.	en_US
dc.description.popularabstract	The amounts of data in the world have increased dramatically in recent years, and it is quickly becoming infeasible for humans to label all these data. It is therefore crucial that modern deep learning systems can operate with few or no labels. This thesis focuses on two key concepts in deep learning with limited labels: clustering, which refers to discovering groups in unlabeled datasets, and few-shot learning, where the objective is to classify new data based on a small number of labeled examples. Across five papers, we conduct extensive theoretical and experimental analyses, resulting in novel insights and new state-of-the-art methodology. Our contributions constitute significant advancements in deep clustering and few-shot learning. Since these fields will likely play a key role in the future of deep learning and artificial intelligence, our work has the potential to contribute to developing the next generation of intelligent systems.	en_US
dc.description.sponsorship	Research Council of Norway, grant numbers: 303514, 309439, and 315029.	en_US
dc.identifier.isbn	978-82-8236-530-7 (printed version)
dc.identifier.isbn	978-82-8236-531-4 (electronic/pdf version)
dc.identifier.uri	https://hdl.handle.net/10037/29847
dc.language.iso	eng	en_US
dc.publisher	UiT Norges arktiske universitet	en_US
dc.publisher	UiT The Arctic University of Norway	en_US
dc.relation.haspart	<p>Paper I: Trosten, D.J., Løkse, S., Jenssen, R. & Kampffmeyer, M. (2023). Leveraging Tensor Kernels to Reduce Objective Function Mismatch in Deep Clustering. (Submitted manuscript). <p>Paper II: Trosten, D.J., Løkse, S., Jenssen, R. & Kampffmeyer, M. (2021). Reconsidering Representation Alignment for Multi-view Clustering. <i>2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition</i>. Published version available at <a href=https://doi.org/10.1109/CVPR46437.2021.00131>https://doi.org/10.1109/CVPR46437.2021.00131</a>. Accepted manuscript version available in Munin at <a href=https://hdl.handle.net/10037/24371>https://hdl.handle.net/10037/24371</a>. <p>Paper III: Trosten, D.J., Løkse, S., Jenssen, R. & Kampffmeyer, M. (2023). On the Effects of Self-supervision and Contrastive Alignment in Deep Multi-view Clustering. <i>Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023</i>, 23976-23985. Also available at <a href=https://openaccess.thecvf.com/content/CVPR2023/html/Trosten_On_the_Effects_of_Self-Supervision_and_Contrastive_Alignment_in_Deep_CVPR_2023_paper.html>CVPR 2023 open access</a>. <p>Paper IV: Trosten, D.J., Chakraborty, R., Løkse, S., Wickstrøm, K., Jenssen, R. & Kampffmeyer, M. (2023). Hubs and Hyperspheres: Reducing Hubness and Improving Transductive Few-shot Learning with Hyperspherical Embeddings. <i>Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023</i>, 7527-7536. Also available at <a href= https://openaccess.thecvf.com/content/CVPR2023/html/Trosten_Hubs_and_Hyperspheres_Reducing_Hubness_and_Improving_Transductive_Few-Shot_Learning_CVPR_2023_paper.html>CVPR 2023 open access</a>. <p>Paper V: Trosten, D.J., Løkse, S., Jenssen, R. & Kampffmeyer, M. Norm-Count Hypothesis: On the Relationship Between Norm and Object Count in Visual Representations. (Submitted manuscript).	en_US
dc.rights.accessRights	openAccess	en_US
dc.rights.holder	Copyright 2023 The Author(s)
dc.rights.uri	https://creativecommons.org/licenses/by-nc-sa/4.0	en_US
dc.rights	Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)	en_US
dc.subject	VDP::Mathematics and natural science: 400::Mathematics: 410::Statistics: 412	en_US
dc.subject	VDP::Matematikk og Naturvitenskap: 400::Matematikk: 410::Statistikk: 412	en_US
dc.subject	VDP::Mathematics and natural science: 400::Information and communication science: 420::Knowledge based systems: 425	en_US
dc.subject	VDP::Matematikk og Naturvitenskap: 400::Informasjons- og kommunikasjonsvitenskap: 420::Kunnskapsbaserte systemer: 425	en_US
dc.subject	VDP::Mathematics and natural science: 400::Information and communication science: 420::Simulation, visualization, signal processing, image processing: 429	en_US
dc.subject	VDP::Matematikk og Naturvitenskap: 400::Informasjons- og kommunikasjonsvitenskap: 420::Simulering, visualisering, signalbehandling, bildeanalyse: 429	en_US
dc.subject	Machine Learning	en_US
dc.subject	Maskinlæring	en_US
dc.subject	Deep Learning	en_US
dc.subject	Dyplæring	en_US
dc.title	Improving Representation Learning for Deep Clustering and Few-shot Learning	en_US
dc.type	Doctoral thesis	en_US
dc.type	Doktorgradsavhandling	en_US

Associated file(s)

Name: thesis.pdf
Size: 23.05Mb
Format: PDF
Description: Thesis

Name: license.txt
Size: 1.093Kb
Format: Text file

This item appears in the following collection(s)

Doctoral theses (NT-fak) [322]

Unless otherwise stated, this item's license is described as Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)
