| dc.contributor.advisor | Bellika, Johan Gustav |  | 
| dc.contributor.advisor | Ruiz, Luis Marco |  | 
| dc.contributor.advisor | Skrøvseth, Stein Olav |  | 
| dc.contributor.author | Hailemichael, Meskerem Asfaw |  | 
| dc.date.accessioned | 2015-06-23T05:07:28Z |  | 
| dc.date.accessioned | 2016-05-18T13:48:55Z |  | 
| dc.date.available | 2016-05-18T13:48:55Z |  | 
| dc.date.issued | 2015-05-18 |  | 
| dc.description.abstract | Motivation: Despite its enormous benefits, EHR data reuse is limited by multi-dimensional challenges, with privacy at the forefront. Various privacy-preserving statistical computation tools have recently emerged. However, they offer limited privacy guarantees and use ad hoc techniques for privacy-preserving computation of statistical functions. 
Purpose: The purpose of this thesis is to develop a system that enables the computation of a wide 
variety of statistical functions on distributed EHRs while preserving the privacy of patients 
and health institutions. 
Materials and Methods: A systematic literature review of privacy-preserving techniques for 
health data reuse was performed to understand the state of the art. The results of the review 
and meetings with users were used as sources of requirements. An agile methodology was used 
to implement a prototype system called Emnet.
Emnet uses openEHR-based EHRs as a common data model to achieve interoperability among 
health institutions. We prepared test openEHR data sets and a virtual environment that 
simulates the real working environment for testing. 
Result: We have developed and tested privacy-preserving techniques for research data set 
preparation and statistical computation. The research eligibility criteria and required 
attributes are expressed as a computable query using Archetype Query Language (AQL), and 
each health institution executes the query and stores the resulting data set locally. The data 
sets are physically distributed across the health institutions, yet together they form the 
research data set, which we call the Virtual Dataset.
Statistical computations on the Virtual Dataset are performed using two main techniques: (1) 
decomposition of statistical functions into summation forms, described as a computation 
graph; and (2) secure summation protocols.
Conclusion: The developed techniques enable statistical computation on distributed health 
data while preserving the privacy of patients and health institutions. Currently, mean, 
variance, standard deviation, covariance, and Pearson’s r are implemented in Emnet. 
However, the techniques are generic enough to support more statistical functions, as long as they 
can be decomposed into summation forms. The work presented in this thesis contributes to the 
advancement of privacy-preserving health data reuse. It is also relevant to other domains 
that have requirements similar to those of health care. | en_US | 
| dc.identifier.uri | https://hdl.handle.net/10037/9154 |  | 
| dc.identifier.urn | URN:NBN:no-uit_munin_8724 |  | 
| dc.language.iso | eng | en_US | 
| dc.publisher | UiT Norges arktiske universitet | en_US | 
| dc.publisher | UiT The Arctic University of Norway | en_US | 
| dc.rights.accessRights | openAccess |  | 
| dc.rights.holder | Copyright 2015 The Author(s) |  | 
| dc.rights.uri | https://creativecommons.org/licenses/by-nc-sa/3.0 | en_US | 
| dc.rights | Attribution-NonCommercial-ShareAlike 3.0 Unported (CC BY-NC-SA 3.0) | en_US | 
| dc.subject.courseID | INF-3997 | en_US | 
| dc.subject | VDP::Medical disciplines: 700::Health sciences: 800 | en_US | 
| dc.subject | VDP::Medisinske Fag: 700::Helsefag: 800 | en_US | 
| dc.subject | Computation Graph | en_US | 
| dc.subject | Data reuse | en_US | 
| dc.subject | EHR | en_US | 
| dc.subject | Health Information System | en_US | 
| dc.subject | Health Research | en_US | 
| dc.subject | Privacy | en_US | 
| dc.subject | Statistical Computing | en_US | 
| dc.subject | Secure Multi-party Computation | en_US | 
| dc.subject | Secure Summation | en_US | 
| dc.subject | Virtual Dataset | en_US | 
| dc.title | Emnet: A System for Privacy-preserving Statistical Computation on Distributed Health Data | en_US | 
| dc.type | Master thesis | en_US | 
| dc.type | Mastergradsoppgave | en_US |
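
The abstract describes two techniques: decomposing statistical functions into summation forms, and secure summation. The following is a minimal sketch of how the two might fit together, not Emnet's actual implementation. The function names, the modulus, the single-mask summation scheme, and the restriction to non-negative integer data are all assumptions made for illustration.

```python
import math
import secrets

# Hypothetical modulus; it must exceed every true total being summed.
MODULUS = 2**61 - 1


def secure_sum(local_values, modulus=MODULUS):
    """Toy masked summation (an assumed scheme, not Emnet's protocol).

    The initiator starts from a random mask, each site adds its value
    modulo `modulus`, and the initiator removes the mask at the end.
    Intermediate running totals therefore reveal no single site's value.
    Values are assumed to be non-negative integers.
    """
    mask = secrets.randbelow(modulus)
    running = mask
    for value in local_values:
        running = (running + value) % modulus
    return (running - mask) % modulus


def local_summaries(records):
    """Summation-form terms computed locally from one site's (x, y) pairs."""
    return {
        "n": len(records),
        "sx": sum(x for x, _ in records),
        "sy": sum(y for _, y in records),
        "sxx": sum(x * x for x, _ in records),
        "syy": sum(y * y for _, y in records),
        "sxy": sum(x * y for x, y in records),
    }


def pooled_statistics(sites):
    """Combine per-site sums into the statistics named in the abstract."""
    summaries = [local_summaries(records) for records in sites]
    total = {
        key: secure_sum([s[key] for s in summaries]) for key in summaries[0]
    }
    n = total["n"]
    mean_x = total["sx"] / n
    mean_y = total["sy"] / n
    # Sample variance and covariance expressed through the pooled sums.
    var_x = (total["sxx"] - n * mean_x**2) / (n - 1)
    var_y = (total["syy"] - n * mean_y**2) / (n - 1)
    cov_xy = (total["sxy"] - n * mean_x * mean_y) / (n - 1)
    return {
        "mean_x": mean_x,
        "var_x": var_x,
        "std_x": math.sqrt(var_x),
        "cov_xy": cov_xy,
        "pearson_r": cov_xy / math.sqrt(var_x * var_y),
    }


# Three hypothetical sites, each holding a few (x, y) records.
sites = [[(1, 2), (2, 4)], [(3, 6)], [(4, 8), (5, 10)]]
stats = pooled_statistics(sites)
```

Only the six per-site sums leave a site in this sketch, and only their masked totals are combined, which is why any statistic expressible through such sums can be computed without pooling the raw records.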