dc.contributor.advisor: Jenssen, Robert
dc.contributor.author: Mikalsen, Karl Øyvind
dc.date.accessioned: 2019-02-08T12:42:57Z
dc.date.available: 2019-02-08T12:42:57Z
dc.date.issued: 2019-02-15
dc.description.abstract: In healthcare, vast amounts of data are stored digitally in electronic health records (EHRs). EHRs represent a largely untapped source of clinically relevant information which, combined with advances in machine learning, has the potential to move healthcare in a more data-driven direction. However, due to the complexity and poor quality of EHRs, data-driven healthcare faces many challenges. In this thesis, we address the challenge posed by the lack of ground-truth labels and provide methodological solutions to challenges related to missing data, temporality, and high dimensionality. To that end, we present four lines of work in which we develop novel unsupervised and weakly supervised learning methodology. The first work presents a kernel for multivariate time series with missing values, which occur frequently in EHRs. Key components of the method are clustering and ensemble learning. Experiments on benchmark datasets demonstrate that the proposed kernel is robust to hyper-parameter choices and performs well in the presence of missing data. Next, we present a dimensionality reduction method designed to account for many of the challenges that data-driven healthcare faces. One of them is high dimensionality; in addition, the method is capable of exploiting noisy and partially labeled multi-label data. We provide a case study of patients suffering from chronic diseases. In the third work, we present a kernel capable of exploiting informative missingness in multivariate time series, as well as a novel semi-supervised kernel. The effectiveness of the proposed methods is demonstrated via experiments on benchmark data and a case study of patients suffering from infectious postoperative complications.
In the last work, we perform phenotyping of patients with postoperative delirium using a weakly supervised learning framework, in which clinical knowledge is used to generate a noisily labeled training set, which in turn is used to train classifiers. Experiments on a dataset collected from a Norwegian university hospital demonstrate the efficiency of the framework. [en_US]
dc.description.doctoraltype: ph.d. [en_US]
dc.description.popularabstract: In healthcare, vast amounts of data are stored digitally in electronic health records (EHRs). EHRs represent a largely untapped source of clinically relevant information which, combined with advances in data science and machine learning, has the potential to substantially improve the quality of care at the individual patient level. To this end, we address challenges specific to healthcare data and develop novel machine learning methodology. Studies of anonymized patient data demonstrate how the proposed methods can ultimately lead to, for example, a reduction in the number of postoperative complications and a better quality of life for patients with chronic diseases. This thesis brings us one step closer to realizing data-driven healthcare, in which healthcare as we know it today is assisted by autonomous monitoring systems as well as diagnosis and decision support tools based on data-driven approaches and machine learning. [en_US]
dc.identifier.isbn: 978-82-8236-332-7 (print) and 978-82-8236-333-4 (pdf)
dc.identifier.uri: https://hdl.handle.net/10037/14659
dc.language.iso: eng [en_US]
dc.publisher: UiT Norges arktiske universitet [en_US]
dc.publisher: UiT The Arctic University of Norway [en_US]
dc.relation.haspart:
Paper I: Mikalsen, K.Ø., Bianchi, F.M., Soguero-Ruiz, C. & Jenssen, R. (2018). Time series cluster kernel for learning similarities between multivariate time series with missing data. Pattern Recognition, 76, 569-581. The article is available in the thesis introduction. Also available at https://doi.org/10.1016/j.patcog.2017.11.030; accepted manuscript available at http://hdl.handle.net/10037/13578
Paper II: Mikalsen, K.Ø., Soguero-Ruiz, C., Bianchi, F.M. & Jenssen, R. Noisy multi-label semi-supervised dimensionality reduction (Submitted manuscript). Published version in Pattern Recognition, 90, 257-270, available at https://doi.org/10.1016/j.patcog.2019.01.033
Paper III: Mikalsen, K.Ø., Soguero-Ruiz, C., Bianchi, F.M., Revhaug, A. & Jenssen, R. Time series cluster kernels to exploit informative missingness and incomplete label information (Submitted manuscript).
Paper IV: Mikalsen, K.Ø., Soguero-Ruiz, C., Jensen, K., Hindberg, K., Gran, M., Revhaug, A. … Jenssen, R. (2017). Using anchors from free text in electronic health records to diagnose postoperative delirium. Computer Methods and Programs in Biomedicine, 152, 105–114. The article is available in the thesis introduction. Also available at https://doi.org/10.1016/j.cmpb.2017.09.014
[en_US]
dc.rights.accessRights: openAccess [en_US]
dc.rights.holder: Copyright 2019 The Author(s)
dc.rights.uri: https://creativecommons.org/licenses/by-nc-sa/3.0 [en_US]
dc.rights: Attribution-NonCommercial-ShareAlike 3.0 Unported (CC BY-NC-SA 3.0) [en_US]
dc.subject: Machine Learning [en_US]
dc.subject: VDP::Technology: 500::Medical technology: 620 [en_US]
dc.subject: VDP::Teknologi: 500::Medisinsk teknologi: 620 [en_US]
dc.title: Advancing Unsupervised and Weakly Supervised Learning with Emphasis on Data-Driven Healthcare [en_US]
dc.type: Doctoral thesis [en_US]
dc.type: Doktorgradsavhandling [en_US]



Except where otherwise noted, this item's license is described as Attribution-NonCommercial-ShareAlike 3.0 Unported (CC BY-NC-SA 3.0)