dc.contributor.advisor | Bongo, Lars Ailo | |
dc.contributor.advisor | Sommerseth, Hilde Leikny | |
dc.contributor.author | Park, Narae | |
dc.date.accessioned | 2023-01-27T06:32:39Z | |
dc.date.available | 2023-01-27T06:32:39Z | |
dc.date.issued | 2022-08-02 | en |
dc.description.abstract | The Historical Population Register (HPR) is a project to build longitudinal life histories of individuals by integrating historical records of people in Norway since the 19th century. This study attempted to improve the linking rate between the 1875-1900 censuses in HPR, which is currently low, using machine learning approaches. To this end, I developed a machine learning model for linking that is suitable for the Norwegian census and tested various algorithms, feature sets, and match selection options. I compared the results in terms of performance and match size, and also examined how well they represent the entire population. The results showed that the linking rate of HPR can be significantly improved by machine learning approaches while maintaining high accuracy. In addition, this study provides a reference for future use by demonstrating how performance varies depending on the feature set and match selection. On the other hand, this study also revealed that linked data generally do not represent the population of the census, and that the characteristics and degree of bias vary depending on the linking algorithm, suggesting that caution is needed when using linked data for research. | en_US |
dc.description | For errata and source code: https://github.com/uit-hdl/rhd-linking | |
dc.identifier.uri | https://hdl.handle.net/10037/28399 | |
dc.language.iso | eng | en_US |
dc.publisher | UiT Norges arktiske universitet | no |
dc.publisher | UiT The Arctic University of Norway | en |
dc.rights.holder | Copyright 2022 The Author(s) | |
dc.rights.uri | https://creativecommons.org/licenses/by-nc-sa/4.0 | en_US |
dc.rights | Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) | en_US |
dc.subject.courseID | INF-3990 | |
dc.subject | Historical record linkage | en_US |
dc.subject | Norwegian census | en_US |
dc.subject | Historical Population Register | en_US |
dc.subject | Machine learning | en_US |
dc.title | Record linkage of Norwegian historical census data using machine learning | en_US |
dc.type | Mastergradsoppgave | no |
dc.type | Master thesis | en |