dc.contributor.author | Martinsen, Iver | |
dc.contributor.author | Sørensen, Steffen Aagaard | |
dc.contributor.author | Ortega, Samuel | |
dc.contributor.author | Godtliebsen, Fred | |
dc.contributor.author | Tejedor, Miguel | |
dc.contributor.author | Myrvoll-Nilsen, Eirik | |
dc.date.accessioned | 2025-08-04T21:03:17Z | |
dc.date.available | 2025-08-04T21:03:17Z | |
dc.date.issued | 2025-07-24 | |
dc.description.abstract | Foraminifera are shell-bearing microorganisms that are commonly found in marine deposits on the seabed. They are important indicators in many analyses and are used in climate change research, marine environmental monitoring, evolutionary studies, and the oil and gas industry. Although some research has focused on automating the classification of foraminifera images, few studies have addressed the uncertainty in these classifications. Although foraminifera classification is not a safety-critical task, estimating uncertainty is crucial to avoid misclassifications that could overlook rare and ecologically significant species that are informative indicators of the environment in which they lived. Uncertainty estimation in deep learning has gained significant attention and many methods have been developed. However, evaluating the performance of these methods in practical settings remains a challenge. To create a benchmark for uncertainty estimation in the classification of foraminifera, we administered a multiple-choice questionnaire containing classification tasks to four senior geologists. By analyzing their responses, we generated human-derived uncertainty estimates for a test set of 260 images of foraminifera and sediment grains. These uncertainty estimates served as a baseline for comparison when training neural networks for classification. We then trained multiple deep neural networks using a range of uncertainty quantification methods to classify the images and state the uncertainty of the classifications. The results of the deep learning uncertainty quantification methods were then analyzed and compared with the human benchmark to see how the methods performed individually and how well they aligned with the human experts. Our results show that human-level performance can be achieved with deep learning and that test-time data augmentation and ensembling can help improve both uncertainty estimation and classification performance. Our results also show that human uncertainty estimates are helpful indicators for detecting classification errors and that deep learning-based uncertainty estimates can improve calibration and classification accuracy. | en_US |
dc.identifier.citation | Martinsen I, Sørensen SA, Ortega S, Godtliebsen F, Tejedor M, Myrvoll-Nilsen E. Quantifying uncertainty in foraminifera classification: How deep learning methods compare to human experts. Artificial Intelligence in Geosciences. 2025;6(2) | en_US |
dc.identifier.cristinID | FRIDAID 2393417 | |
dc.identifier.doi | 10.1016/j.aiig.2025.100145 | |
dc.identifier.issn | 2666-5441 | |
dc.identifier.uri | https://hdl.handle.net/10037/37900 | |
dc.language.iso | eng | en_US |
dc.publisher | Elsevier | en_US |
dc.relation.journal | Artificial Intelligence in Geosciences | |
dc.relation.projectID | Norges forskningsråd: 332901 | en_US |
dc.relation.projectID | Norges forskningsråd: 309439 | en_US |
dc.rights.accessRights | openAccess | en_US |
dc.rights.holder | Copyright 2025 The Author(s) | en_US |
dc.rights.uri | https://creativecommons.org/licenses/by/4.0 | en_US |
dc.rights | Attribution 4.0 International (CC BY 4.0) | en_US |
dc.title | Quantifying uncertainty in foraminifera classification: How deep learning methods compare to human experts | en_US |
dc.type.version | publishedVersion | en_US |
dc.type | Journal article | en_US |