dc.contributor.advisor: Anfinsen, Stian Normann
dc.contributor.author: Dretvik, Vilde Fonn
dc.date.accessioned: 2021-08-04T06:29:27Z
dc.date.available: 2021-08-04T06:29:27Z
dc.date.issued: 2021-06-21 (en)
dc.description.abstract: This work concerns the classification of time series with missing data, using imputation together with selected machine learning algorithms and methods. Imputation was used to replace missing values in two data sets: one containing surgical site infection (SSI) data, consisting of 11 types of blood samples from patients over 20 days, and one called uwave, which contains 3D accelerometer data of several patterns made by a subset of people, from which two patterns were selected. The SSI data set is known to possess informative missingness. For the uwave data, missing data was simulated by removing data points in an informative (not random) way. The Dynamic Time Warping (DTW) and Euclidean distances were computed for each imputed data set to make distance grid matrices, which were used to perform classification with the K Nearest Neighbour (KNN) classifier and the Support Vector Machine (SVM) classifier. Furthermore, the data set features were augmented by adding masks that indicate the presence of missing data and counters of consecutive spells of missing data, to help exploit informative missingness. The augmented data set was classified using the same classifiers and distance methods, and also with a newer classifier called the Temporal Convolution Network (TCN), which used the augmented data in combination with imputation of the original data. It was found that applying DTW was unnecessary for the KNN classifier, and that Euclidean distance was sufficient. Augmenting the data was found to improve the overall results for the SVM and KNN classifiers. The TCN was found to need more work, as it gave unstable test results with much lower values than the validation results would imply. (en_US)
dc.identifier.uri: https://hdl.handle.net/10037/21916
dc.language.iso: eng (en_US)
dc.publisher: UiT Norges arktiske universitet (no)
dc.publisher: UiT The Arctic University of Norway (en)
dc.rights.holder: Copyright 2021 The Author(s)
dc.rights.uri: https://creativecommons.org/licenses/by-nc-sa/4.0 (en_US)
dc.rights: Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) (en_US)
dc.subject.courseID: FYS-3941
dc.subject: VDP::Mathematics and natural science: 400::Mathematics: 410::Applied mathematics: 413 (en_US)
dc.subject: VDP::Matematikk og Naturvitenskap: 400::Matematikk: 410::Anvendt matematikk: 413 (en_US)
dc.subject: VDP::Mathematics and natural science: 400::Mathematics: 410::Statistics: 412 (en_US)
dc.subject: VDP::Matematikk og Naturvitenskap: 400::Matematikk: 410::Statistikk: 412 (en_US)
dc.subject: VDP::Mathematics and natural science: 400::Information and communication science: 420::Knowledge based systems: 425 (en_US)
dc.subject: VDP::Matematikk og Naturvitenskap: 400::Informasjons- og kommunikasjonsvitenskap: 420::Kunnskapsbaserte systemer: 425 (en_US)
dc.subject: VDP::Medical disciplines: 700::Health sciences: 800::Other health science disciplines: 829 (en_US)
dc.subject: VDP::Medisinske Fag: 700::Helsefag: 800::Andre helsefag: 829 (en_US)
dc.title: Imputation and classification of time series with missing data using machine learning (en_US)
dc.type: Mastergradsoppgave (nor)
dc.type: Master thesis (eng)
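
The abstract describes augmenting each time series with a missingness mask and a counter of consecutive missing values, and comparing Euclidean and DTW distances on the imputed series. The following is a minimal sketch of those ideas, not the thesis's implementation. The function names are hypothetical, missing values are assumed to be encoded as NaN, and forward-fill imputation (with the series mean for leading gaps) is an assumption chosen for illustration, since the abstract does not state which imputation methods were used.

```python
import numpy as np


def augment_with_missingness(x):
    """Impute a 1-D series and build mask/counter features (illustrative).

    Returns (imputed, mask, counter) where mask[t] is 1 if x[t] was missing
    and counter[t] is the length of the current run of missing values.
    """
    x = np.asarray(x, dtype=float)
    mask = np.isnan(x).astype(int)
    counter = np.zeros(len(x), dtype=int)
    imputed = x.copy()
    # Assumption: leading gaps are filled with the mean of observed values.
    last = 0.0 if mask.all() else float(np.nanmean(x))
    run = 0
    for t in range(len(x)):
        if mask[t]:
            run += 1
            imputed[t] = last  # forward fill from the last observed value
        else:
            run = 0
            last = x[t]
        counter[t] = run
    return imputed, mask, counter


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length series."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.sqrt(np.sum((a - b) ** 2)))


def dtw_distance(a, b):
    """Classic dynamic time warping with absolute-difference local cost."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return float(D[n, m])


imputed, mask, counter = augment_with_missingness([1.0, np.nan, np.nan, 4.0, np.nan])
```

The mask and counter arrays could then be stacked with the imputed values as extra feature channels before computing pairwise distances for a KNN or SVM classifier.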

Unless otherwise stated, this item's license is described as Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)