Show simple item record

dc.contributor.advisor: Jenssen, Robert
dc.contributor.author: Myhre, Jonas Nordhaug
dc.date.accessioned: 2012-03-19T12:45:48Z
dc.date.available: 2012-03-19T12:45:48Z
dc.date.issued: 2011-12-15
dc.description.abstract: In this thesis we present a new semi-supervised classification technique based on the Kernel Entropy Component Analysis (KECA) transformation and the least absolute shrinkage and selection operator (LASSO), a constrained version of the least squares classifier. Traditional supervised classification techniques use only a limited set of labeled data to train the classifier, leaving a large part of the data practically unused. With very little training data available, the classifier will have trouble generalizing well, as too few points cannot fully represent the classes it is trained to separate. Creating semi-supervised classifiers that include information from unlabeled data is therefore a natural extension. This is further motivated by the fact that labeling data is often a tedious and time-consuming task that can only be done by a few experts in the field in question rather than by general pattern recognition experts, while unlabeled data are often abundant and require no experts at all. One way of taking advantage of unlabeled data, and the one we use in this thesis, is to first transform the data to a new space using all data, both labeled and unlabeled, and then use the labeled points in this new space to create a classifier. The idea is that when all the unlabeled data are included in the transformation, the often scarcely populated labeled data set will represent the data better than it would without the unlabeled points. Previous work has shown very good results using the Laplacian eigenmaps transformation with ordinary least squares, and Data Spectroscopy with the LASSO. We transform the data with a new method developed at the University of Tromsø, Kernel Entropy Component Analysis, and combine it with LASSO classification; previous work combining KECA with ordinary least squares has also shown good results.
This transformation preserves entropy components from the input data, which has been shown to give a much better representation of the data than the otherwise almost exclusively used variance measure. Through different experiments we show that this semi-supervised classifier in most cases performs comparably to or better than previous results using other data transformations. We also show that the LASSO has an almost exclusively positive effect on classification after data transformation. A deeper analysis of how the LASSO classifier works together with the Kernel Entropy Component Analysis transformation is included, and we compare the results to the closely related Kernel Principal Component Analysis transformation. Finally, we test the new classifier on a data set of facial expressions, showing that including unlabeled data leads to much better classification results than straightforward least squares or LASSO classification.
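The pipeline the abstract describes — embed all points (labeled and unlabeled) with KECA, then fit a LASSO regression on the labeled subset only — can be sketched as below. This is a minimal illustration, not the thesis's implementation: the RBF kernel width, number of components, regularization strength, and the toy two-class data are all assumptions chosen for the example.

```python
import numpy as np
from scipy.spatial.distance import cdist
from sklearn.linear_model import Lasso

def keca_embed(X, n_components, sigma):
    """Kernel Entropy Component Analysis embedding (sketch).

    Unlike kernel PCA, which keeps the top eigenvalues, KECA keeps the
    kernel eigendirections contributing most to the Renyi entropy
    estimate, i.e. the largest lambda_i * (1^T e_i)^2 terms.
    """
    # RBF kernel over ALL points, labeled and unlabeled
    K = np.exp(-cdist(X, X, "sqeuclidean") / (2.0 * sigma**2))
    vals, vecs = np.linalg.eigh(K)              # ascending eigenvalues
    contrib = vals * vecs.sum(axis=0) ** 2      # entropy contribution per axis
    top = np.argsort(contrib)[::-1][:n_components]
    # per-sample embedding: selected eigenvectors scaled by sqrt(eigenvalue)
    return vecs[:, top] * np.sqrt(np.clip(vals[top], 0.0, None))

rng = np.random.default_rng(0)
# toy data: two well-separated Gaussian classes, 100 points in total
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(8, 1, (50, 2))])
y = np.hstack([-np.ones(50), np.ones(50)])
labeled = np.array([0, 1, 2, 50, 51, 52])       # only 3 labels per class

Z = keca_embed(X, n_components=2, sigma=2.0)    # transform uses all points
clf = Lasso(alpha=0.01).fit(Z[labeled], y[labeled])  # train on labels only
pred = np.sign(clf.predict(Z))                  # classify by sign
accuracy = np.mean(pred == y)
```

On data this cleanly clustered, the embedding tends to separate the two classes even though only six points carry labels, which is the semi-supervised effect the abstract argues for.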
dc.identifier.uri: https://hdl.handle.net/10037/3992
dc.identifier.urn: URN:NBN:no-uit_munin_3714
dc.language.iso: eng
dc.publisher: Universitetet i Tromsø
dc.publisher: University of Tromsø
dc.rights.accessRights: openAccess
dc.rights.holder: Copyright 2011 The Author(s)
dc.subject.courseID: FYS-3921
dc.subject: Kernel Entropy Component Analysis, information theoretic learning, semi-supervised learning, LASSO, linear classification, data transformation, machine learning, pattern recognition
dc.subject: VDP::Mathematics and natural science: 400::Information and communication science: 420::Simulation, visualization, signal processing, image processing: 429
dc.title: Semi-supervised Classification using Kernel Entropy Component Analysis and the LASSO
dc.type: Master thesis
dc.type: Mastergradsoppgave


File(s) in this item


This item appears in the following collection(s)
