Abstract
This thesis focuses on developing a Machine Learning (ML) model to predict the first hospital admission in older Norwegian patients.
Addressing the complexity of predicting all-cause hospitalizations required a comprehensive approach, beginning with a systematic review of existing work. The review highlighted important gaps in the use of ML for predicting all-cause hospitalization, such as challenges in representing high-dimensional Health Code Systems (HCSs), the quality of reporting, clinical interpretation at the individual patient level, and model deployment. These insights were the starting point for the methodological and practical framework presented in this thesis.
To tackle the representation of HCSs while balancing model performance against meaningful clinical interpretation, we proposed a methodology based on Network Analysis (NA) modularity detection. The idea was to group these codes based on their prevalence in the population. HCSs such as the International Classification of Diseases (ICD) were modeled as a network in which nodes represent the codes and edges quantify co-occurrence among patients. The methodology demonstrated good predictive performance and several advantages over traditional grouping approaches. Building on that, and to validate the clinical relevance of this methodology, we demonstrated a framework for detecting and interpreting Multimorbidity Patterns (MPs) using data from the hospitalized older patient population in Norway.
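The grouping idea described above can be illustrated with a minimal sketch. It assumes a toy set of patients and ICD-10 codes (all hypothetical), builds the co-occurrence network, and groups codes with a naive greedy agglomerative modularity heuristic. The abstract does not state which modularity algorithm the thesis used, so this heuristic is a stand-in for illustration only, not the thesis implementation.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each patient maps to the set of codes recorded for them.
patients = {
    "p1": {"I10", "E11", "E78"},
    "p2": {"I10", "E11"},
    "p3": {"E11", "E78"},
    "p4": {"J44", "J18"},
    "p5": {"J44", "J18", "F17"},
    "p6": {"J44", "F17"},
    "p7": {"I10", "J44"},
}


def cooccurrence_edges(records):
    """Edge weight = number of patients in whom both codes occur."""
    edges = Counter()
    for codes in records.values():
        for a, b in combinations(sorted(codes), 2):
            edges[(a, b)] += 1
    return edges


def modularity(edges, community_of):
    """Weighted Newman modularity Q for a partition given as node -> label."""
    total = sum(edges.values())  # m, the total edge weight
    internal = Counter()         # edge weight inside each community
    comm_strength = Counter()    # summed weighted degree of each community
    for (a, b), w in edges.items():
        comm_strength[community_of[a]] += w
        comm_strength[community_of[b]] += w
        if community_of[a] == community_of[b]:
            internal[community_of[a]] += w
    return sum(
        internal[c] / total - (comm_strength[c] / (2 * total)) ** 2
        for c in comm_strength
    )


def greedy_modularity_groups(edges):
    """Start with one community per code; repeatedly apply the merge that
    raises Q the most, and stop when no merge improves Q."""
    nodes = sorted({n for pair in edges for n in pair})
    community_of = {n: n for n in nodes}
    best_q = modularity(edges, community_of)
    while True:
        best_merge = None
        labels = sorted(set(community_of.values()))
        for c1, c2 in combinations(labels, 2):
            trial = {n: (c1 if c == c2 else c) for n, c in community_of.items()}
            q = modularity(edges, trial)
            if q > best_q + 1e-12:  # small tolerance against float noise
                best_q, best_merge = q, trial
        if best_merge is None:
            break
        community_of = best_merge
    groups = {}
    for node, label in community_of.items():
        groups.setdefault(label, set()).add(node)
    return sorted(groups.values(), key=sorted), best_q


groups, q = greedy_modularity_groups(cooccurrence_edges(patients))
# One binary feature per code group instead of one per individual code.
features = {
    pid: [int(bool(codes & group)) for group in groups]
    for pid, codes in patients.items()
}
```

In this toy example the six codes collapse into two groups, so each patient is described by two binary features rather than six. A production pipeline would more likely rely on an established community-detection library than on this quadratic-time heuristic.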
Finally, we developed an ML model to predict all-cause somatic hospitalizations. We applied a pipeline to achieve good model performance and to identify the most influential features for predicting hospitalizations. We also aimed to address some of the gaps identified in the literature and to integrate the proposed methodology for representing HCSs. The model pipeline incorporated diverse data samples for model training, feature selection techniques, and algorithm groups. The model was deployed as a web application to demonstrate the potential use of this work in practice. The thesis provides a clinically relevant framework for healthcare systems investigating similar outcomes and lays the foundation for future research at the Norwegian national level to refine predictive models, expand multimorbidity analyses, and address challenges in clinical deployment.
This thesis focuses on developing a machine learning (ML) model to predict the first hospital admission in older Norwegian patients.
Handling the complexity of predicting all types of hospital admissions required a comprehensive approach. We began with a systematic review of existing work. The review revealed several important weaknesses in the use of ML to predict hospital admissions, including challenges with the representation of Health Code Systems (HCSs), the quality of reporting, and individual clinical interpretations. These findings formed the starting point for the methodological and practical frameworks presented in this thesis.
To tackle the representation of HCSs and balance model performance against meaningful clinical interpretations, we proposed a methodology based on modularity detection in network analysis (NA). The aim was to group the codes according to how common they are in the population. HCSs, such as the International Classification of Diseases (ICD), were modeled as a network in which nodes represent the codes and edges show co-occurrence among patients. The methodology gave good model performance and showed several advantages compared with traditional grouping methods. To confirm the clinical relevance of this approach, we created a framework for detecting and interpreting multimorbidity patterns (MPs) using data from older hospital patients in Norway.
Finally, we developed an ML model to predict all-cause somatic hospital admissions. The purpose was to fill some of the knowledge gaps in the literature and to include the proposed methodology for representing HCSs. The model was built using various data handling techniques, feature selection methods, and several algorithm groups. It was then made available as a web application to illustrate how it can be put to use in practice. The thesis presents a clinically relevant framework for health sectors that wish to investigate similar questions, and lays the foundation for further research at the national level in Norway to improve prediction models, expand multimorbidity analyses, and address challenges in clinical implementation.
Has part(s)
Paper I: Askar, M., Tafavvoghi, M., Småbrekke, L., Bongo, L.A. & Svendsen, K. (2024). Using machine learning methods to predict all-cause somatic hospitalizations in adults: A systematic review. PLoS One, 19(8), e0309175. Also available in Munin at https://hdl.handle.net/10037/37006.
Paper II: Askar, M., Småbrekke, L., Holsbø, E., Bongo, L.A. & Svendsen, K. (2024). Using network analysis modularity to group health code systems and decrease dimensionality in machine learning models. Exploratory Research in Clinical and Social Pharmacy, 14, 100463. Also available in Munin at https://hdl.handle.net/10037/34833.
Paper III: Askar, M., Garcia, B.H. & Svendsen, K. Exploring multimorbidity patterns in older hospitalized Norwegian patients using Network Analysis modularity. (Submitted manuscript). Now published in International Journal of Medical Informatics, 201, 2025, 105954, available in Munin at https://hdl.handle.net/10037/37005.
Paper IV: Askar, M., Småbrekke, L., Holsbø, E., Bongo, L.A. & Svendsen, K. Machine Learning-Based Prediction of Non-elective Hospitalizations in older Norwegians patients: A Multi-Register Study Using Ensemble Methods. (Manuscript).