
dc.contributor.advisor: Bongo, Lars Ailo
dc.contributor.author: Voets, Mike
dc.date.accessioned: 2018-05-31T07:44:18Z
dc.date.available: 2018-05-31T07:44:18Z
dc.date.issued: 2018-05-15
dc.description.abstract: We aim to give insight into aspects of developing and deploying a deep learning algorithm to automate biomedical image analyses. We anonymize sensitive data from a medical archive system, attempt to replicate and further improve published methods, and scale out our algorithm to support large-scale analyses. Specifically, our contributions are as follows. First, to anonymize and extract mammograms for the development of a breast cancer detection algorithm, we wrote a script for mammograms that reside in a data-locking, sensitive, and proprietary PACS. The script will be used in a larger project to extract mammograms from all screening points in Norway. Second, because this script is currently being authorized by Helsenord IKT, we instead developed an algorithm for a similar screening problem in the biomedical field. In order not to reinvent the wheel, we investigated earlier work. The high-impact article JAMA 2016; 316(22) describes a high-performance deep learning algorithm that detects diabetic retinopathy, reporting an area under the receiver operating characteristic curve (AUC) of 0.99. We attempted to replicate the method. Our AUCs of 0.74 and 0.59, however, did not reach the reported results, possibly because of differences in data or missing details in the methodology. Third, by slightly modifying the data preprocessing methods in the diabetic retinopathy algorithm, we increased the AUCs to 0.94 and 0.82. These findings emphasize the challenges of replicating deep learning methods whose source code is not published and that do not use publicly available data. Fourth, we ran benchmarks to assess the resources needed to run algorithm development and automated analyses on a national (Norwegian) scale. We estimate that a breast cancer detection algorithm can be trained on 4 GPUs in less than 17 hours, with a sublinear speed-up of 3.36 times compared to 1 GPU. Evaluation on inexpensive GPUs was shown to run instantly.
Lastly, with our experiences and lessons learned in mind, we conclude with literature suggestions and recommendations for developing and deploying an algorithm for breast cancer detection in a large-scale screening program. (en_US)
dc.identifier.uri: https://hdl.handle.net/10037/12808
dc.language.iso: eng (en_US)
dc.publisher: UiT Norges arktiske universitet (en_US)
dc.publisher: UiT The Arctic University of Norway (en_US)
dc.rights.accessRights: openAccess (en_US)
dc.rights.holder: Copyright 2018 The Author(s)
dc.rights.uri: https://creativecommons.org/licenses/by-nc-sa/3.0 (en_US)
dc.rights: Attribution-NonCommercial-ShareAlike 3.0 Unported (CC BY-NC-SA 3.0) (en_US)
dc.subject.courseID: INF-3990
dc.subject: VDP::Technology: 500::Information and communication technology: 550::Computer technology: 551 (en_US)
dc.subject: VDP::Teknologi: 500::Informasjons- og kommunikasjonsteknologi: 550::Datateknologi: 551 (en_US)
dc.title: Deep Learning: From Data Extraction to Large-Scale Analysis (en_US)
dc.type: Master thesis (en_US)
dc.type: Mastergradsoppgave (en_US)


Files in this item


This item appears in the following collection(s)


Except where otherwise noted, this item's license is described as Attribution-NonCommercial-ShareAlike 3.0 Unported (CC BY-NC-SA 3.0)