We are investigating the potential for differences in study conclusions when assessing the estimated impact of a
computer-aided detection (CAD) system on readers' performance. The data utilized in this investigation were derived
from a multi-reader multi-case observer study involving one hundred mammographic background images to which
fixed-size and fixed-intensity Gaussian signals were added, generating low- and high-intensity signal sets. The study
setting allowed CAD assessment in two situations: when CAD sensitivity was 1) superior to or 2) lower than that of the
average reader. Seven readers were asked to review each set in the unaided and CAD-aided reading modes and to mark and
rate their findings. Using these data, we studied the effect on study conclusions of three clinically-based receiver operating
characteristic (ROC) scoring definitions. These scoring definitions included both location-specific and non-location-specific
rules. The results showed agreement in the estimated impact of CAD on the overall reader performance. In
the study setting where CAD sensitivity was superior to that of the average reader, the mean difference in AUC between the
CAD-aided read and unaided read was 0.049 (95% CI: -0.027, 0.130) for the image scoring definition that is based on
non-location-specific rules, and 0.104 (95% CI: 0.036, 0.174) and 0.090 (95% CI: 0.031, 0.155) for image scoring
definitions that are based on location-specific rules. The increases in AUC were statistically significant for the location-specific
scoring definitions. It was further observed that the variance of these estimates was reduced when using the
location-specific scoring definitions compared with the non-location-specific scoring definition. In the study
setting where CAD sensitivity was equivalent to or lower than that of the average reader, the mean differences in AUC were
slightly above 0.01 for all image scoring definitions. These increases in AUC were not statistically significant for any of the
image scoring definitions. The results of the variance analysis differed from those observed in the other study setting.
This investigation furthers our understanding of the relationships between non-localization-specific and localization-specific
ROC assessment methodologies and their relevance to clinical practice.
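
As a rough illustration of the quantities reported above, the sketch below shows one way an image-level ROC rating could be derived from a reader's marks under a location-specific versus a non-location-specific rule, and how a mean AUC difference between reading modes could be computed. The three scoring definitions used in the study are not specified here, so the rule in `image_score`, the function names, and the data layout are all hypothetical; the AUC is the standard empirical (Mann-Whitney) estimate, and no confidence interval is computed.

```python
from typing import List, Tuple

# A mark is (rating, hits_true_signal_location). This layout is an assumption.
Mark = Tuple[float, bool]


def image_score(marks: List[Mark], has_signal: bool, location_specific: bool) -> float:
    """Collapse one reader's marks on one image into a single ROC rating.

    Hypothetical rule, not the study's definition:
    - non-location-specific: the highest rating of any mark on the image;
    - location-specific: on signal-present images only marks that hit the
      true signal count; on signal-absent images the highest rating is used.
    An image with no qualifying mark receives the lowest rating, 0.0.
    """
    if location_specific and has_signal:
        ratings = [r for r, hit in marks if hit]
    else:
        ratings = [r for r, _ in marks]
    return max(ratings) if ratings else 0.0


def empirical_auc(pos: List[float], neg: List[float]) -> float:
    """Empirical AUC: P(pos > neg) + 0.5 * P(pos == neg) over all pairs."""
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mean_auc_difference(aided: List[float], unaided: List[float]) -> float:
    """Mean over readers of (CAD-aided AUC minus unaided AUC)."""
    diffs = [a - u for a, u in zip(aided, unaided)]
    return sum(diffs) / len(diffs)
```

Under this illustrative rule, a high-rated mark that misses the signal raises the image score only in the non-location-specific case, which is one way the two kinds of definition can lead to different AUC estimates for the same reading data.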