How will anonymization of simulated clinical data affect the data utility of pharmacoepidemiological studies?

Cheang, Chi Kei

dc.contributor.advisor	Svendsen, Kristian
dc.contributor.author	Cheang, Chi Kei
dc.date.accessioned	2021-05-14T05:45:38Z
dc.date.available	2021-05-14T05:45:38Z
dc.date.issued	2019-05-14	en
dc.description.abstract	Background: Pressure to share more clinical data and to make clinical study reports more transparent has grown in recent years. Before clinical data and clinical results can be shared, they must be anonymized. How anonymization of clinical data affects data utility is poorly studied, especially in pharmacoepidemiology. Objective: The aim of this study was to describe and evaluate how anonymization of simulated clinical data affects the data utility of pharmacoepidemiological analyses of those data. Method: We simulated five clinical datasets with different characteristics, associations, outcome types and study populations. Suppression, generalization, randomization and k-anonymity were used as anonymization approaches. The methods were evaluated by the change in the data and in the statistical results before and after anonymization. Result: K-anonymity and suppression affected the simulated clinical data the most, while generalization and randomization affected the data the least. With k-anonymity and suppression there is a risk of overestimating the clinical results because unique records are eliminated. Generalization and randomization, on the other hand, preserved the most data utility but were less effective at anonymizing the data. Conclusion: Our study showed that different anonymization approaches can affect clinical results differently. The more a record or attribute is anonymized, the less utility it provides. It is therefore important to balance data utility against the effectiveness of anonymization before clinical data are published. More investigation of how anonymization of clinical data affects data utility is needed in order to maximize the benefit of using anonymized clinical data to improve public health.	en_US
dc.identifier.uri	https://hdl.handle.net/10037/21177
dc.language.iso	eng	en_US
dc.publisher	UiT Norges arktiske universitet	no
dc.publisher	UiT The Arctic University of Norway	en
dc.rights.holder	Copyright 2019 The Author(s)
dc.rights.uri	https://creativecommons.org/licenses/by-nc-sa/4.0	en_US
dc.rights	Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)	en_US
dc.subject.courseID	FAR-3911
dc.subject	VDP::Medisinske Fag: 700::Helsefag: 800::Samfunnsfarmasi: 812	en_US
dc.subject	VDP::Medical disciplines: 700::Health sciences: 800::Community pharmacy: 812	en_US
dc.subject	VDP::Medisinske Fag: 700::Helsefag: 800::Andre helsefag: 829	en_US
dc.subject	VDP::Medical disciplines: 700::Health sciences: 800::Other health science disciplines: 829	en_US
dc.title	How will anonymization of simulated clinical data affect the data utility of pharmacoepidemiological studies?	en_US
dc.type	Mastergradsoppgave	no
dc.type	Master thesis	en

Associated file(s)

Name: license.txt
Size: 1.093Kb
Format: Text file

Name: thesis.pdf
Size: 2.143Mb
Format: PDF

This item appears in the following collection(s)

Master's theses in pharmacy [283]

Unless otherwise stated, this item's license is described as Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)