29 May 2024 Assessing the impact of counterfactuals for textural changes in mammogram classification
Ridhi Arora, Juhun Lee
Proceedings Volume 13174, 17th International Workshop on Breast Imaging (IWBI 2024); 131741M (2024) https://doi.org/10.1117/12.3027048
Event: 17th International Workshop on Breast Imaging (IWBI 2024), 2024, Chicago, IL, United States
Abstract
This study used counterfactuals to understand the decision-making process of AI-based computer-aided diagnosis (CADx) algorithms on mammogram images. Counterfactual analysis allowed us to dissect the causal relationships in these algorithms by asking questions such as “what effect would be seen on the classifier’s prediction if there were no texture inside the lesion region (i.e., if it were grayed out)?” Our aim was not to classify lesions accurately; instead, we focused on providing a deeper understanding of the “why” and “how” of classifier decisions, paving the way for more transparent and interpretable AI in medical imaging. We used the CBIS-DDSM dataset, which contains 1,318 images (681 benign and 637 malignant) for training and 378 images (231 benign and 147 malignant) for testing. We constructed four counterfactual cases: 1) replacing the benign foreground (B FG: Benign Foreground Grayed-out) with the original image’s mean intensity (MI) vs. original malignant (M), 2) replacing the benign background (B BG: Benign Background Grayed-out) with the original image’s MI vs. original malignant (M), 3) replacing the malignant foreground (M FG: Malignant Foreground Grayed-out) with the original image’s MI vs. original benign (B), and 4) replacing the malignant background (M BG: Malignant Background Grayed-out) with the original image’s MI vs. original benign (B). We trained three convolutional neural networks (CNNs), MobileNet, ResNet50, and ResNet50v2, to classify benign and malignant cases, with the non-counterfactual images serving as the baseline. We found that each classifier tended to be more sensitive (i.e., it reacted negatively, with degraded performance) to changes in the background of benign cases (B BG) than to changes in the foreground of malignant cases (M FG). Furthermore, ResNet50 demonstrated robustness (correct classification) to counterfactual modifications, achieving a higher AUC for B BG (AUC = 0.83) than the other networks. ResNet50v2, in turn, was robust to foreground changes in the benign images (B FG), with an AUC of 0.82.
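The graying-out operation behind the four counterfactual cases can be illustrated with a minimal NumPy sketch. It is not the authors' code. It assumes that a binary lesion mask is available for each image and that the mean intensity is computed over the whole original image; the function name `gray_out` and its parameters are hypothetical.

```python
import numpy as np


def gray_out(image, lesion_mask, region):
    """Return a copy of `image` with one region set to the image's mean intensity.

    Hypothetical helper, not taken from the paper.
    image       : 2-D array of pixel intensities.
    lesion_mask : 2-D array of the same shape, truthy inside the lesion.
    region      : "foreground" (inside the lesion) or "background" (outside it).
    """
    image = np.asarray(image, dtype=np.float64)
    mask = np.asarray(lesion_mask, dtype=bool)
    if image.shape != mask.shape:
        raise ValueError("image and lesion_mask must have the same shape")

    if region == "foreground":
        target = mask
    elif region == "background":
        target = ~mask
    else:
        raise ValueError("region must be 'foreground' or 'background'")

    # Assumption: MI is the mean over the entire original image.
    mean_intensity = image.mean()

    counterfactual = image.copy()
    counterfactual[target] = mean_intensity
    return counterfactual
```

Under these assumptions, applying `gray_out(img, mask, "foreground")` to a benign image would give a B FG sample, and `gray_out(img, mask, "background")` applied to a malignant image would give an M BG sample; the other two cases follow the same pattern.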
(2024) Published by SPIE. Downloading of the abstract is permitted for personal use only.
Ridhi Arora and Juhun Lee "Assessing the impact of counterfactuals for textural changes in mammogram classification", Proc. SPIE 13174, 17th International Workshop on Breast Imaging (IWBI 2024), 131741M (29 May 2024); https://doi.org/10.1117/12.3027048
KEYWORDS
Mammography

Image classification

Computer aided detection

Education and training

Image processing

Breast cancer

Evolutionary algorithms
