Retrospective evaluation of a CE-marked AI system, including 1,017,208 mammography screening examinations
Permanent link
https://hdl.handle.net/10037/37956
Date
2025-03-26
Type
Journal article
Peer reviewed
Author
Hovda, Tone; Larsen, Marthe; Bergan, Marie Burns; Gjesvik, Jonas; Akslen, Lars Andreas; Hofvind, Solveig Sand-Hanssen
Abstract
Materials and methods - We used data from screening examinations performed from 2004 to 2021 at ten breast centers in BreastScreen Norway. In the standard independent double reading setting, each radiologist scored each breast from 1 (negative) to 5 (high probability of cancer). The AI system assigned each examination an NT score and an SN score: the NT score aimed to classify examinations as negative with minimal misclassification, while the SN score aimed to classify examinations as positive with high confidence. N70 was defined as the 70% of examinations with the lowest NT scores, and P3 as the 3% of examinations with the highest SN scores.
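The N70 and P3 definitions above are rank-based cut-offs, which can be illustrated with a minimal sketch. The scores below are synthetic, and the function names, the score scale, and the tie handling (ties broken by input order) are assumptions made for illustration; the abstract does not describe how the study computed the cut-offs.

```python
def lowest_percent_flags(scores, percent):
    """Flag examinations whose score is among the lowest `percent` % of all scores."""
    n_flag = len(scores) * percent // 100  # integer arithmetic avoids float rounding
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    flagged = set(order[:n_flag])
    return [i in flagged for i in range(len(scores))]


def highest_percent_flags(scores, percent):
    """Flag examinations whose score is among the highest `percent` % of all scores."""
    n_flag = len(scores) * percent // 100
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    flagged = set(order[:n_flag])
    return [i in flagged for i in range(len(scores))]


# Synthetic scores for 100 hypothetical examinations (not study data).
nt_scores = [i / 100 for i in range(100)]
sn_scores = [(i * 37 % 100) / 100 for i in range(100)]

n70 = lowest_percent_flags(nt_scores, 70)   # "negative" group under N70
p3 = highest_percent_flags(sn_scores, 3)    # "positive" group under P3

print(sum(n70), "examinations in N70;", sum(p3), "examinations in P3")
```

Because the two groups are defined on different scores, an examination could in principle fall into both or neither; the sketch does not attempt to resolve that.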
Results - A total of 1,017,208 screening examinations were included in the study sample. At N70, 1.8% (107/5977) of the screen-detected and 34.5% (625/1812) of the interval cancers were defined as negative. Using P3 to define cases as positive, 81.5% (4871/5977) of the screen-detected and 19.0% (344/1812) of the interval cancers were defined as positive. Among the screen-detected cancers in N70, 11.2% (12/107) had an interpretation score > 2 by both radiologists.
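The percentages reported above follow directly from the stated counts. A short sketch that recomputes them is given here as an arithmetic check; the dictionary labels are descriptive names chosen for illustration, while the counts are taken from the Results.

```python
# (numerator, denominator) pairs as reported in the Results.
reported_counts = {
    "screen-detected cancers negative at N70": (107, 5977),
    "interval cancers negative at N70": (625, 1812),
    "screen-detected cancers positive at P3": (4871, 5977),
    "interval cancers positive at P3": (344, 1812),
    "N70 screen-detected cancers scored > 2 by both radiologists": (12, 107),
}


def percent(numerator, denominator):
    """Return the proportion as a percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


percentages = {label: percent(n, d) for label, (n, d) in reported_counts.items()}

for label, value in percentages.items():
    print(f"{label}: {value}%")
```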
Conclusion - The AI system performed well in identifying both negative cases and cancer cases. Thus, the AI system can be used to reduce the workload of radiologists and potentially increase the sensitivity of mammography screening.