
dc.contributor.author: Thorvaldsen, Gunnar
dc.date.accessioned: 2022-02-16T13:10:19Z
dc.date.available: 2022-02-16T13:10:19Z
dc.date.issued: 2021-03-31
dc.description.abstract: Transcribing the 1950 Norwegian census with 3.3 million person records and linking it to the Central Population Register (CPR) provides longitudinal information about significant population groups during the understudied period of the mid-20th century. Since this source is closed to the public, we receive no help from genealogists and instead use machine learning techniques to semi-automate the transcription. First, the scanned manuscripts are split into individual cells and multiple names are divided. Once the birthdates have been transcribed manually in India, a lookup routine searches for families with matching sets of birthdates in the 1960 census and the CPR. After manual checks with GUI routines, the names are copied to the text version of the 1950 census, and the links to the CPR are stored as well. Other fields, such as occupation or gender, contain numeric or letter codes and are transcribed wholesale with routines that interpret the layout of the graphical images. Work employing these methods has also started on the 1930 census, which is the last of the Norwegian censuses to be transcribed. [en_US]
dc.identifier.citation: Thorvaldsen G. Automating Historical Source Transcription. Historical Life Course Studies. 2021;10(3):59-63 [en_US]
dc.identifier.cristinID: FRIDAID 1976905
dc.identifier.doi: 10.51964/hlcs9568
dc.identifier.issn: 2352-6343
dc.identifier.uri: https://hdl.handle.net/10037/24070
dc.language.iso: eng [en_US]
dc.publisher: Openjournals [en_US]
dc.relation.journal: Historical Life Course Studies
dc.rights.accessRights: openAccess [en_US]
dc.rights.holder: Copyright 2021 The Author(s) [en_US]
dc.title: Automating Historical Source Transcription [en_US]
dc.type.version: publishedVersion [en_US]
dc.type: Journal article [en_US]
dc.type: Tidsskriftartikkel [en_US]
dc.type: Peer reviewed [en_US]

