
dc.contributor.advisor: Dalmo, Rune
dc.contributor.advisor: Pedersen, Bjørn-Richard
dc.contributor.author: Wilhelmsen, Kristoffer Berg
dc.date.accessioned: 2024-07-18T06:24:31Z
dc.date.available: 2024-07-18T06:24:31Z
dc.date.issued: 2024-05-15 [en]
dc.description.abstract: This thesis assesses the impact of fine-tuning and retrieval-augmented generation (RAG) on the ability of large language models (LLMs) to accurately assign ICD-10 codes to historical causes of death. Using funeral records from Trondheim, Norway (1830-1920), we fine-tuned Llama 3 and Mistral on 2000 records. Twelve experiments were conducted on 2000 additional records to evaluate the accuracy of each knowledge-injection technique, as well as a combination of the two. The results indicate that fine-tuning as a standalone knowledge-injection technique achieved the highest accuracy, generating 88% full matches and 2% partial matches for ICD-10 codes, up from 58% full matches and 25% partial matches in previous research. However, concerns remain about memorization of training data, given the lack of diversity in the available dataset. Moreover, combining RAG with fine-tuning led to a decrease in accuracy, and using RAG alone reduced accuracy even further. These findings serve as a proof of concept for the automatic assignment of ICD-10 codes to historical causes of death, paving the way for future research. [en_US]
dc.identifier.uri: https://hdl.handle.net/10037/34160
dc.language.iso: eng [en_US]
dc.publisher: UiT Norges arktiske universitet [no]
dc.publisher: UiT The Arctic University of Norway [en]
dc.rights.holder: Copyright 2024 The Author(s)
dc.rights.uri: https://creativecommons.org/licenses/by-nc-sa/4.0 [en_US]
dc.rights: Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) [en_US]
dc.subject.courseID: DTE-3900
dc.subject: fine-tuning [en_US]
dc.subject: large language models [en_US]
dc.subject: retrieval-augmented generation [en_US]
dc.subject: ICD-10 [en_US]
dc.subject: quantization [en_US]
dc.subject: low-rank adaptation [en_US]
dc.title: Fine-tuning Large Language Models on historical causes of death data [en_US]
dc.type: Master thesis [en]
dc.type: Mastergradsoppgave [no]


Except where otherwise noted, this item's license is described as Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)