Coding Historical Causes of Death Data with Large Language Models

Pedersen, Bjørn-Richard; Islam, Maisha; Kristoffersen, Doris Tove; Bongo, Lars Ailo Aslaksen; Garrett, Eilidh; Reid, Alice; Sommerseth, Hilde Leikny

dc.contributor.author	Pedersen, Bjørn-Richard
dc.contributor.author	Islam, Maisha
dc.contributor.author	Kristoffersen, Doris Tove
dc.contributor.author	Bongo, Lars Ailo Aslaksen
dc.contributor.author	Garrett, Eilidh
dc.contributor.author	Reid, Alice
dc.contributor.author	Sommerseth, Hilde Leikny
dc.date.accessioned	2025-01-20T12:48:02Z
dc.date.available	2025-01-20T12:48:02Z
dc.date.issued	2024-10-31
dc.description.abstract	This paper investigates the feasibility of using pre-trained generative Large Language Models (LLMs) to automate the assignment of ICD-10 codes to historical causes of death. Due to the complex narratives often found in historical causes of death, this task has traditionally been performed manually by coding experts. We evaluate the ability of the GPT-3.5, GPT-4, and Llama 2 LLMs to accurately assign ICD-10 codes on the HiCaD dataset, which contains causes of death recorded in the civil death register entries of 19,361 individuals from Ipswich, Kilmarnock, and the Isle of Skye in the UK between 1861 and 1901. Our findings show that GPT-3.5, GPT-4, and Llama 2 assign the correct code for 69%, 83%, and 40% of causes, respectively. However, we achieve a maximum accuracy of 89% with standard machine learning techniques. All LLMs performed better for causes of death that contained terms still in use today than for causes containing archaic terms. They also performed better for short causes (1–2 words) than for longer causes. LLMs therefore do not currently perform well enough for historical ICD-10 code assignment tasks. We suggest further fine-tuning or alternative frameworks to achieve adequate performance.	en_US
dc.identifier.citation	Pedersen, Islam, Kristoffersen, Bongo, Garrett, Reid, Sommerseth. Coding Historical Causes of Death Data with Large Language Models. Lecture Notes in Computer Science (LNCS). 2024;Bridging the Gap Between AI and Reality	en_US
dc.identifier.cristinID	FRIDAID 2342336
dc.identifier.doi	10.1007/978-3-031-73741-1_3
dc.identifier.issn	0302-9743
dc.identifier.issn	1611-3349
dc.identifier.uri	https://hdl.handle.net/10037/36234
dc.language.iso	eng	en_US
dc.publisher	Springer Nature	en_US
dc.relation.journal	Lecture Notes in Computer Science (LNCS)
dc.rights.accessRights	openAccess	en_US
dc.rights.holder	Copyright 2025 The Author(s)	en_US
dc.rights.uri	https://creativecommons.org/licenses/by/4.0	en_US
dc.rights	Attribution 4.0 International (CC BY 4.0)	en_US
dc.title	Coding Historical Causes of Death Data with Large Language Models	en_US
dc.type.version	publishedVersion	en_US
dc.type	Journal article	en_US
dc.type	Tidsskriftartikkel	en_US
dc.type	Peer reviewed	en_US

File(s) in this item

Name: article.pdf
Size: 1.078Mb
Format: PDF

This item appears in the following collection(s)

Articles, reports and other (archaeology, history, religious studies and theology) [301]

Except where otherwise noted, this item's license is described as Attribution 4.0 International (CC BY 4.0)