• Metrics reloaded: recommendations for image analysis validation 

      Maier-Hein, Lena; Reinke, Annika; Godau, Patrick; Tizabi, Minu D.; Buettner, Florian; Christodoulou, Evangelia; Glocker, Ben; Isensee, Fabian; Kleesiek, Jens; Kozubek, Michal; Reyes, Mauricio; Riegler, Michael; Wiesenfarth, Manuel; Kavur, A. Emre; Sudre, Carole H.; Baumgartner, Michael; Eisenmann, Matthias; Heckmann-Nötzel, Doreen; Rädsch, Tim; Acion, Laura; Antonelli, Michela; Arbel, Tal; Bakas, Spyridon; Benis, Arriel; Blaschko, Matthew B.; Cardoso, M. Jorge; Cheplygina, Veronika; Cimini, Beth A.; Collins, Gary S.; Farahani, Keyvan; Ferrer, Luciana; Galdran, Adrian; van Ginneken, Bram; Haase, Robert; Hashimoto, Daniel A.; Hoffman, Michael M.; Huisman, Merel; Jannin, Pierre; Kahn, Charles E.; Kainmueller, Dagmar; Kainz, Bernhard; Karargyris, Alexandros; Karthikesalingam, Alan; Kofler, Florian; Kopp-Schneider, Annette; Kreshuk, Anna; Kurc, Tahsin; Landman, Bennett A.; Litjens, Geert; Madani, Amin; Maier-Hein, Klaus; Martel, Anne L.; Mattson, Peter; Meijering, Erik; Menze, Bjoern; Moons, Karel G. M.; Müller, Henning; Nichyporuk, Brennan; Nickel, Felix; Petersen, Jens; Rajpoot, Nasir; Rieke, Nicola; Saez-Rodriguez, Julio; Sánchez, Clara I.; Shetty, Shravya; van Smeden, Maarten; Summers, Ronald M.; Taha, Abdel A.; Tiulpin, Aleksei; Tsaftaris, Sotirios A.; Van Calster, Ben; Varoquaux, Gaël; Jäger, Paul F. (Journal article; Tidsskriftartikkel; Peer reviewed, 2024-02-12)
      Increasing evidence shows that flaws in machine learning (ML) algorithm validation are an underestimated global problem. In biomedical image analysis, chosen performance metrics often do not reflect the domain interest, and thus fail to adequately measure scientific progress and hinder translation of ML techniques into practice. To overcome this, we created Metrics Reloaded, a comprehensive framework ...
    • Understanding metric-related pitfalls in image analysis validation 

      Reinke, Annika; Tizabi, Minu D.; Baumgartner, Michael; Eisenmann, Matthias; Heckmann-Nötzel, Doreen; Kavur, A. Emre; Rädsch, Tim; Sudre, Carole H.; Acion, Laura; Antonelli, Michela; Arbel, Tal; Bakas, Spyridon; Benis, Arriel; Buettner, Florian; Cardoso, M. Jorge; Cheplygina, Veronika; Chen, Jianxu; Christodoulou, Evangelia; Cimini, Beth A.; Farahani, Keyvan; Ferrer, Luciana; Galdran, Adrian; van Ginneken, Bram; Glocker, Ben; Godau, Patrick; Hashimoto, Daniel A.; Hoffman, Michael M.; Huisman, Merel; Isensee, Fabian; Jannin, Pierre; Kahn, Charles E.; Kainmueller, Dagmar; Kainz, Bernhard; Karargyris, Alexandros; Kleesiek, Jens; Kofler, Florian; Kooi, Thijs; Kopp-Schneider, Annette; Kozubek, Michal; Kreshuk, Anna; Kurc, Tahsin; Landman, Bennett A.; Litjens, Geert; Madani, Amin; Maier-Hein, Klaus; Martel, Anne L.; Meijering, Erik; Menze, Bjoern; Moons, Karel G. M.; Müller, Henning; Nichyporuk, Brennan; Nickel, Felix; Petersen, Jens; Rafelski, Susanne M.; Rajpoot, Nasir; Reyes, Mauricio; Riegler, Michael; Rieke, Nicola; Saez-Rodriguez, Julio; Sánchez, Clara I.; Shetty, Shravya; Summers, Ronald M.; Taha, Abdel A.; Tiulpin, Aleksei; Tsaftaris, Sotirios A.; Van Calster, Ben; Varoquaux, Gaël; Yaniv, Ziv R.; Jäger, Paul F.; Maier-Hein, Lena (Journal article; Tidsskriftartikkel; Peer reviewed, 2024-02-12)
      Validation metrics are key for tracking scientific progress and bridging the current chasm between artificial intelligence research and its translation into practice. However, increasing evidence shows that, particularly in image analysis, metrics are often chosen inadequately. Although taking into account the individual strengths, weaknesses and limitations of validation metrics is a critical ...