Estimating the model uncertainty of artificial intelligence (AI)-based breast cancer detection algorithms could help guide the reading strategy in breast cancer screening. For example, the recall decision could be made solely by AI when it exhibits high certainty, while cases where the certainty is low would be read by radiologists. This study evaluates two metrics for estimating the model uncertainty of a lesion characterization network: 1) the variance of a set of outputs generated with stochastic layer depth, and 2) the entropy of the average output. To test these approaches, 367 mammography exams with cancer (333 screen-detected and 34 interval) and 367 cancer-negative exams from the Dutch Breast Cancer Screening Program were included. Using a commercial lesion detection algorithm operating at high sensitivity, 6,477 suspicious regions were included (14.1% labeled malignant). By varying the uncertainty threshold, a specified proportion of the predictions was classified as certain and the remainder as uncertain. Double reading by radiologists had a sensitivity of 90.9% (95% CI 89.0% – 92.7%) and a specificity of 93.8% (95% CI 93.2% – 96.2%) for all regions. At equal specificity, the network had a sensitivity of 92.1% (95% CI 89.9% – 94.0%) for all regions. The sensitivity of the network was higher for regions with low uncertainty under both approaches; for the 50% most certain regions, the sensitivity was 96.9% (95% CI 94.7% – 98.4%) and 97.1% (95% CI 94.9% – 98.8%), respectively, at equal specificity to radiologists. In conclusion, the uncertainty of AI-based lesion classification of breast regions can be estimated by applying stochastic layer depth during prediction.
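The two uncertainty metrics and the proportion-based split described above can be illustrated with a minimal sketch. It assumes the network's stochastic forward passes have already been collected into an array of malignancy probabilities of shape (T, N), for T passes over N regions. The function names `uncertainty_metrics` and `most_certain_mask`, the binary-entropy form, and the use of base-2 logarithms are illustrative assumptions, not details taken from the study.

```python
import numpy as np


def uncertainty_metrics(probs):
    """Return (mean, variance, entropy) per region.

    probs: array-like of shape (T, N) holding malignancy probabilities from
    T stochastic forward passes over N regions (assumed input layout).
    """
    probs = np.asarray(probs, dtype=float)
    mean = probs.mean(axis=0)          # average output per region
    variance = probs.var(axis=0)       # metric 1: spread across passes
    # Metric 2: binary entropy of the average output, in bits.
    # Clipping avoids log(0); this is an implementation assumption.
    eps = 1e-12
    p = np.clip(mean, eps, 1.0 - eps)
    entropy = -(p * np.log2(p) + (1.0 - p) * np.log2(1.0 - p))
    return mean, variance, entropy


def most_certain_mask(uncertainty, proportion):
    """Boolean mask marking the given proportion of regions with the
    lowest uncertainty as 'certain' (hypothetical helper)."""
    uncertainty = np.asarray(uncertainty, dtype=float)
    n_keep = int(round(proportion * uncertainty.size))
    order = np.argsort(uncertainty, kind="stable")  # most certain first
    mask = np.zeros(uncertainty.size, dtype=bool)
    mask[order[:n_keep]] = True
    return mask


if __name__ == "__main__":
    # Toy data: 3 stochastic passes over 4 regions.
    passes = [
        [0.95, 0.40, 0.10, 0.70],
        [0.90, 0.60, 0.05, 0.30],
        [0.92, 0.50, 0.08, 0.55],
    ]
    mean, variance, entropy = uncertainty_metrics(passes)
    certain = most_certain_mask(variance, proportion=0.5)
    print("certain regions (variance-based):", certain)
```

Sensitivity and specificity would then be computed separately on the regions where the mask is true, which corresponds to the subset analysis reported for the 50% most certain regions.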
KEYWORDS: Image segmentation, Breast, Digital breast tomosynthesis, Computer aided diagnosis and therapy, Mammography, Digital mammography, Tomography, Neural networks
Semantic segmentation of breast images is typically performed as a preprocessing step for breast cancer detection by computer-aided diagnosis (CAD) systems. While most literature on region segmentation is based on conventional techniques such as line estimation, thresholding and atlas-based approaches, such methods may generalise poorly. This paper investigates a robust multi-vendor breast region segmentation system for full-field digital mammograms (FFDM) and digital breast tomosynthesis (DBT) using a U-Net neural network. Additionally, the effect of adding attention gates to the U-Net architecture was analysed. The proposed networks were trained and tested in a cross-validation setting on in-house FFDM/DBT data and the public INbreast dataset, comprising over 10,000 FFDM and 3,500 DBT images from five different vendors. Dice scores in the range 0.978 to 0.985 were obtained, with slightly higher scores for the architecture that includes attention gates.
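For reference, the Dice score used to report segmentation quality can be computed for a pair of binary masks as in the sketch below. The function name `dice_score` is hypothetical, and returning 1.0 when both masks are empty is a convention assumed here, not something stated in the abstract.

```python
import numpy as np


def dice_score(pred, target):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    denom = pred.sum() + target.sum()
    if denom == 0:
        # Both masks empty: treated as perfect agreement (assumed convention).
        return 1.0
    return float(2.0 * np.logical_and(pred, target).sum() / denom)


if __name__ == "__main__":
    predicted = [[1, 1, 0], [0, 1, 0]]
    reference = [[1, 1, 0], [0, 0, 0]]
    print(dice_score(predicted, reference))
```

A score of 1 indicates identical masks and 0 indicates no overlap, so values in the reported range correspond to near-complete agreement with the reference breast region.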