• De-identifying Norwegian Clinical Text using Resources from Swedish and Danish 

      Lamproudis, Anastasios; Mora, Sara; Olsen Svenning, Therese; Torsvik, Torbjørn; Chomutare, Taridzo Fred; Ngo, Phuong Dinh; Dalianis, Hercules (Journal article, 2023)
      The lack of relevant annotated datasets is a key limitation in applying Natural Language Processing techniques to a wide range of tasks, among them Protected Health Information (PHI) identification in Norwegian clinical text. This work explores the possibility of exploiting resources from Swedish, a language very closely related to Norwegian. The Swedish dataset is ...
    • Deidentifying a Norwegian clinical corpus - An effort to create a privacy-preserving Norwegian large clinical language model 

      Ngo, Phuong Dinh; Tejedor Hernandez, Miguel Angel; Olsen Svenning, Therese; Chomutare, Taridzo Fred; Budrionis, Andrius; Dalianis, Hercules (Journal article; Peer reviewed, 2024)
      This study discusses the methods and challenges of deidentifying and pseudonymizing Norwegian clinical text for research purposes. The results of the NorDeid tool for deidentification and pseudonymization were evaluated and discussed for different types of protected health information, as was the extension of its functionality with regular expressions to identify specific types of sensitive ...