dc.contributor.advisor: Bellika, Johan Gustav
dc.contributor.advisor: Ruiz, Luis Marco
dc.contributor.advisor: Skrøvseth, Stein Olav
dc.contributor.author: Hailemichael, Meskerem Asfaw
dc.description.abstract:
Motivation: Despite its enormous benefits, reuse of EHR data is limited by multi-dimensional challenges, with privacy at the forefront. Various privacy-preserving statistical computation tools have emerged recently. However, they offer limited privacy guarantees and rely on ad hoc techniques for privacy-preserving computation of statistical functions.
Purpose: The purpose of this thesis is to develop a system that enables the computation of a wide variety of statistical functions on distributed EHRs while preserving the privacy of patients and health institutions.
Materials and Methods: A systematic literature review of privacy-preserving techniques for health data reuse was performed to understand the state of the art. The results of the review and meetings with users served as the sources of requirements. An agile methodology was used to implement a prototype system called Emnet. Emnet uses openEHR-based EHRs as a common data model to achieve interoperability among health institutions. For testing, we prepared openEHR test data sets and a virtual environment that simulates the real working environment.
Result: We have developed and tested privacy-preserving techniques for research data set preparation and statistical computation. The research eligibility criteria and required attributes are expressed as a computable query in the Archetype Query Language (AQL); each health institution executes the query and stores the resulting data set locally. The data sets are physically distributed across the health institutions, yet together they make up the research data set, which we call a Virtual Dataset. Statistical computations on the Virtual Dataset rely on two main techniques: (1) decomposition of statistical functions into summation forms, described as a computation graph; and (2) secure summation protocols.
Conclusion: The developed techniques enable statistical computation on distributed health data while preserving the privacy of patients and health institutions. Currently, mean, variance, standard deviation, covariance and Pearson’s r are implemented in Emnet. However, the techniques are general enough to support additional statistical functions, as long as those functions can be decomposed into summation forms. The work presented in this thesis contributes to the advancement of privacy-preserving health data reuse. It is also relevant to other domains with requirements similar to those of health care. [en_US]
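The two techniques named in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify which secure summation protocol Emnet uses, so the sketch assumes additive secret sharing, simulated inside a single process; the names `secure_sum`, `local_terms`, `pearson_r`, `MODULUS` and `SCALE` are hypothetical and do not come from the thesis. Each site reduces its own records to a few local sums (the summation form), and only the securely aggregated totals are combined into Pearson's r.

```python
import math
import secrets

MODULUS = 2**61 - 1  # assumed modulus; all true totals must stay below MODULUS / 2
SCALE = 1000         # assumed fixed-point scale (three decimal places)


def secure_sum(local_values):
    """Add one integer per site without exposing any single site's value.

    Assumption: additive secret sharing. Every site splits its value into
    random shares that sum to the value modulo MODULUS and hands one share
    to each site; only the per-site share totals are combined.
    """
    n = len(local_values)
    received = [0] * n  # running total of the shares held by each site
    for value in local_values:
        shares = [secrets.randbelow(MODULUS) for _ in range(n - 1)]
        shares.append((value - sum(shares)) % MODULUS)
        for holder, share in enumerate(shares):
            received[holder] = (received[holder] + share) % MODULUS
    total = sum(received) % MODULUS
    # Map the residue back to a signed integer.
    return total - MODULUS if total > MODULUS // 2 else total


def local_terms(xs, ys):
    """Summation-form terms that a site computes on its own records."""
    fx = [round(x * SCALE) for x in xs]
    fy = [round(y * SCALE) for y in ys]
    return [
        len(fx),
        sum(fx),
        sum(fy),
        sum(a * a for a in fx),
        sum(b * b for b in fy),
        sum(a * b for a, b in zip(fx, fy)),
    ]


def pearson_r(sites):
    """Pearson's r over all sites, built only from securely summed terms."""
    terms = [local_terms(xs, ys) for xs, ys in sites]
    n, sx, sy, sxx, syy, sxy = (secure_sum(list(col)) for col in zip(*terms))
    # The fixed-point scale cancels between numerator and denominator.
    cov = n * sxy - sx * sy
    var_x = n * sxx - sx * sx
    var_y = n * syy - sy * sy
    return cov / math.sqrt(var_x * var_y)


# Three hypothetical institutions, each holding its own (x, y) records.
sites = [([1.0, 2.0], [2.0, 4.1]), ([3.0], [6.2]), ([4.0, 5.0], [7.9, 10.1])]
r = pearson_r(sites)
```

Mean, variance, standard deviation and covariance follow from the same six totals, which is why a single summation primitive is enough for all of the functions listed in the conclusion. A real deployment would exchange the shares over the network between institutions rather than within one process.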
dc.publisher: UiT Norges arktiske universitet [en_US]
dc.publisher: UiT The Arctic University of Norway [en_US]
dc.rights.holder: Copyright 2015 The Author(s)
dc.rights: Attribution-NonCommercial-ShareAlike 3.0 Unported (CC BY-NC-SA 3.0) [en_US]
dc.subject: VDP::Medical disciplines: 700::Health sciences: 800 [en_US]
dc.subject: VDP::Medisinske Fag: 700::Helsefag: 800 [en_US]
dc.subject: Computation Graph [en_US]
dc.subject: Data reuse [en_US]
dc.subject: Health Information System [en_US]
dc.subject: Health Research [en_US]
dc.subject: Statistical Computing [en_US]
dc.subject: Secure Multi-party Computation [en_US]
dc.subject: Secure Summation [en_US]
dc.subject: Virtual Dataset [en_US]
dc.title: Emnet: A System for Privacy-preserving Statistical Computation on Distributed Health Data [en_US]
dc.type: Master thesis [en_US]

Except where otherwise noted, this item's license is described as Attribution-NonCommercial-ShareAlike 3.0 Unported (CC BY-NC-SA 3.0)