
dc.contributor.advisor: Kampffmeyer, Michael
dc.contributor.author: Trosten, Daniel Johansen
dc.date.accessioned: 2023-08-10T10:39:08Z
dc.date.available: 2023-08-10T10:39:08Z
dc.date.issued: 2023-08-23
dc.description.abstract [en_US]: The amounts of data in the world have increased dramatically in recent years, and it is quickly becoming infeasible for humans to label all these data. It is therefore crucial that modern machine learning systems can operate with few or no labels. The introduction of deep learning and deep neural networks has led to impressive advancements in several areas of machine learning. These advancements are largely due to the unprecedented ability of deep neural networks to learn powerful representations from a wide range of complex input signals. This ability is especially important when labeled data is limited, as the absence of a strong supervisory signal forces models to rely more on intrinsic properties of the data and its representations.

This thesis focuses on two key concepts in deep learning with few or no labels. First, we aim to improve representation quality in deep clustering - both for single-view and multi-view data. Current models for deep clustering face challenges related to properly representing semantic similarities, which is crucial for the models to discover meaningful clusterings. This is especially challenging with multi-view data, since the information required for successful clustering might be scattered across many views. Second, we focus on few-shot learning, and how geometrical properties of representations influence few-shot classification performance. We find that a large number of recent methods for few-shot learning embed representations on the hypersphere. Hence, we seek to understand what makes the hypersphere a particularly suitable embedding space for few-shot learning.

Our work on single-view deep clustering addresses the susceptibility of deep clustering models to find trivial solutions with non-meaningful representations. To address this issue, we present a new auxiliary objective that - when compared to the popular autoencoder-based approach - better aligns with the main clustering objective, resulting in improved clustering performance. Similarly, our work on multi-view clustering focuses on how representations can be learned from multi-view data, in order to make the representations suitable for the clustering objective. Where recent methods for deep multi-view clustering have focused on aligning view-specific representations, we find that this alignment procedure might actually be detrimental to representation quality. We investigate the effects of representation alignment, and provide novel insights on when alignment is beneficial, and when it is not. Based on our findings, we present several new methods for deep multi-view clustering - both alignment and non-alignment-based - that out-perform current state-of-the-art methods.

Our first work on few-shot learning aims to tackle the hubness problem, which has been shown to have negative effects on few-shot classification performance. To this end, we present two new methods to embed representations on the hypersphere for few-shot learning. Further, we provide both theoretical and experimental evidence indicating that embedding representations as uniformly as possible on the hypersphere reduces hubness, and improves classification accuracy. Furthermore, based on our findings on hyperspherical embeddings for few-shot learning, we seek to improve the understanding of representation norms. In particular, we ask what type of information the norm carries, and why it is often beneficial to discard the norm in classification models. We answer this question by presenting a novel hypothesis on the relationship between representation norm and the number of a certain class of objects in the image. We then analyze our hypothesis both theoretically and experimentally, presenting promising results that corroborate the hypothesis.
dc.description.doctoraltype [en_US]: ph.d.
dc.description.popularabstract [en_US]: The amounts of data in the world have increased dramatically in recent years, and it is quickly becoming infeasible for humans to label all these data. It is therefore crucial that modern deep learning systems can operate with few or no labels. This thesis focuses on two key concepts in deep learning with limited labels: clustering, which refers to discovering groups in unlabeled datasets, and few-shot learning, where the objective is to classify new data based on a small number of labeled examples. Across five papers, we conduct extensive theoretical and experimental analyses, resulting in novel insights and new state-of-the-art methodology. Our contributions constitute significant advancements in deep clustering and few-shot learning. Since these fields will likely play a key role in the future of deep learning and artificial intelligence, our work has the potential to contribute to developing the next generation of intelligent systems.
dc.description.sponsorship [en_US]: Research Council of Norway, grant numbers: 303514, 309439, and 315029.
dc.identifier.isbn: 978-82-8236-530-7 (printed version)
dc.identifier.isbn: 978-82-8236-531-4 (electronic/pdf version)
dc.identifier.uri: https://hdl.handle.net/10037/29847
dc.language.iso [en_US]: eng
dc.publisher [en_US]: UiT Norges arktiske universitet
dc.publisher [en_US]: UiT The Arctic University of Norway
dc.relation.haspart [en_US]:
Paper I: Trosten, D.J., Løkse, S., Jenssen, R. & Kampffmeyer, M. (2023). Leveraging Tensor Kernels to Reduce Objective Function Mismatch in Deep Clustering. (Submitted manuscript).
Paper II: Trosten, D.J., Løkse, S., Jenssen, R. & Kampffmeyer, M. (2021). Reconsidering Representation Alignment for Multi-view Clustering. 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Published version available at https://doi.org/10.1109/CVPR46437.2021.00131. Accepted manuscript version available in Munin at https://hdl.handle.net/10037/24371.
Paper III: Trosten, D.J., Løkse, S., Jenssen, R. & Kampffmeyer, M. (2023). On the Effects of Self-supervision and Contrastive Alignment in Deep Multi-view Clustering. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023, 23976-23985. Also available via CVPR 2023 open access at https://openaccess.thecvf.com/content/CVPR2023/html/Trosten_On_the_Effects_of_Self-Supervision_and_Contrastive_Alignment_in_Deep_CVPR_2023_paper.html
Paper IV: Trosten, D.J., Chakraborty, R., Løkse, S., Wickstrøm, K., Jenssen, R. & Kampffmeyer, M. (2023). Hubs and Hyperspheres: Reducing Hubness and Improving Transductive Few-shot Learning with Hyperspherical Embeddings. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023, 7527-7536. Also available via CVPR 2023 open access at https://openaccess.thecvf.com/content/CVPR2023/html/Trosten_Hubs_and_Hyperspheres_Reducing_Hubness_and_Improving_Transductive_Few-Shot_Learning_CVPR_2023_paper.html
Paper V: Trosten, D.J., Løkse, S., Jenssen, R. & Kampffmeyer, M. Norm-Count Hypothesis: On the Relationship Between Norm and Object Count in Visual Representations. (Submitted manuscript).
dc.rights.accessRights [en_US]: openAccess
dc.rights.holder: Copyright 2023 The Author(s)
dc.rights.uri [en_US]: https://creativecommons.org/licenses/by-nc-sa/4.0
dc.rights [en_US]: Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)
dc.subject [en_US]: VDP::Mathematics and natural science: 400::Mathematics: 410::Statistics: 412
dc.subject [en_US]: VDP::Matematikk og Naturvitenskap: 400::Matematikk: 410::Statistikk: 412
dc.subject [en_US]: VDP::Mathematics and natural science: 400::Information and communication science: 420::Knowledge based systems: 425
dc.subject [en_US]: VDP::Matematikk og Naturvitenskap: 400::Informasjons- og kommunikasjonsvitenskap: 420::Kunnskapsbaserte systemer: 425
dc.subject [en_US]: VDP::Mathematics and natural science: 400::Information and communication science: 420::Simulation, visualization, signal processing, image processing: 429
dc.subject [en_US]: VDP::Matematikk og Naturvitenskap: 400::Informasjons- og kommunikasjonsvitenskap: 420::Simulering, visualisering, signalbehandling, bildeanalyse: 429
dc.subject [en_US]: Machine Learning
dc.subject [en_US]: Maskinlæring
dc.subject [en_US]: Deep Learning
dc.subject [en_US]: Dyplæring
dc.title [en_US]: Improving Representation Learning for Deep Clustering and Few-shot Learning
dc.type [en_US]: Doctoral thesis
dc.type [en_US]: Doktorgradsavhandling


Unless otherwise stated, this item is licensed under Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).