
Preserving Privacy in Interactions with Large Language Models

Permanent link
https://hdl.handle.net/10037/37874
Open
no.uit:wiseflow:7267640:62187160.pdf (1.064Mb)
(PDF)
Date
2025
Type
Master thesis

Author
Lorentzen, Nikolai
Abstract
In this thesis we investigate the preservation of privacy in user interactions with Large Language Models (LLMs), focusing on transforming user queries to enhance privacy while maintaining the usability of answers. The research is contextualized within the FysBot mobile health application, which aims to motivate physical activity. The core problem addressed is the potential leakage of sensitive user information through prompts sent to LLM-based chatbots, stemming from risks such as data memorization, re-identification, and logging. This thesis proposes a privacy-preserving system designed to mitigate these risks by modifying queries before they reach the external LLM. The developed system employs several techniques: numerical data (e.g., steps, geolocation, heart rate, time) is perturbed with randomized noise using methods such as General Additive Data Perturbation (GADP) and Multiplicative Data Perturbation (MDP), tailored to the specific data type to maintain utility. Sensitive textual information is identified and substituted with semantic labels chosen via cosine similarity on text embeddings. The system was implemented in Python, utilizing models such as ChatGPT 3.5 and text-embedding-3-small. Evaluation of the system involved performance benchmarking and a user survey. Benchmarking revealed a significant overhead: an approximate 2.3-fold increase in data sent, a 3.7-fold increase in data received, and a 3-fold increase in execution time when the privacy-preserving system was used. The user survey, conducted with participants from health research and the general public, indicated that, while a vast majority preferred answers generated from the original, sensitive prompts, 50% of participants were willing to accept a reduction in the usability of answers in exchange for enhanced privacy. Hesitancy was often linked to the criticality of the sensitive information (e.g., diagnoses), where accuracy was deemed paramount. This thesis concludes that it is feasible to develop a system that enhances end-user privacy in LLM interactions with a manageable loss in usability. However, the introduced overhead suggests that a backend implementation is more viable for mobile applications. Future work could focus on real-time sensitive data detection or on optimizing the system's performance.
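The abstract only outlines the two transformation steps: noise perturbation of numerical data and embedding-based substitution of sensitive text. The sketch below is a minimal, illustrative Python rendering of those ideas, not the thesis's actual implementation; the function names, noise parameters, and the toy stand-in embeddings (which in the thesis would come from text-embedding-3-small) are all assumptions.

import numpy as np

def gadp_perturb(values, sigma, rng=None):
    """Additive perturbation (GADP-style): add zero-mean Gaussian noise."""
    rng = rng or np.random.default_rng()
    values = np.asarray(values, dtype=float)
    return values + rng.normal(0.0, sigma, size=values.shape)

def mdp_perturb(values, scale, rng=None):
    """Multiplicative perturbation (MDP-style): scale each value by noise centred on 1."""
    rng = rng or np.random.default_rng()
    values = np.asarray(values, dtype=float)
    return values * rng.normal(1.0, scale, size=values.shape)

def nearest_label(term_embedding, label_embeddings):
    """Return the semantic label whose embedding is most cosine-similar
    to the embedding of the sensitive term."""
    def cosine(a, b):
        a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    return max(label_embeddings,
               key=lambda lbl: cosine(term_embedding, label_embeddings[lbl]))

# Illustrative use: perturb step counts and heart rates before they enter a
# prompt, and replace a sensitive term with a generic semantic label.
# The 2-dimensional vectors below are hypothetical stand-in embeddings.
steps = gadp_perturb([10432, 8721, 12005], sigma=300)
heart_rate = mdp_perturb([72, 95, 110], scale=0.05)
label = nearest_label([0.9, 0.1], {"chronic condition": [0.8, 0.2],
                                   "medication": [0.1, 0.9]})
print(steps, heart_rate, label)

Which perturbation to apply would depend on the data type, as the abstract notes: additive noise suits values where absolute error matters (e.g., step counts), while multiplicative noise preserves relative magnitudes (e.g., heart rate).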
 
 
 
Publisher
UiT The Arctic University of Norway
Metadata
View full record
Collections
  • Mastergradsoppgaver i informatikk [129]
Copyright 2025 The Author(s)
