No data exist on how disease prevalence during training affects the performance of residents and fellows interpreting digital breast tomosynthesis (DBT) screening examinations. We assessed the performance of six residents (four after one breast imaging rotation and two after two rotations) and two breast imaging fellows interpreting DBT screening examinations in a multi-case, mode-balanced, test-train-test retrospective reader study (127 training and 160 testing cases). Half of the readers were trained on a low-prevalence set (13/127) and half on a high-prevalence set (52/128), with feedback of verified truth after each case was reviewed. The same dataset was used for pre- and post-training testing. Sensitivity, specificity, and AUC were compared. Readers trained with the low-prevalence set decreased their overall recall rate of non-cancer cases (FPF from 0.21 to 0.13, p<0.001) and of cases with known malignancies (TPF from 0.70 to 0.61, p=0.004, due primarily to one clear outlier reader). Readers trained with the high-prevalence set showed a statistically non-significant increase in the overall recall rate of non-cancer cases (FPF from 0.16 to 0.18, p=0.07) and a borderline significant increase in the recall rate of cancer cases (TPF from 0.61 to 0.66, p=0.04). The fellows in each group, who had completed six months of specialty training, showed no significant changes in sensitivity, specificity, or AUC after training (smallest p>0.07). Both residents with two rotations of experience showed significant changes in sensitivity and specificity (highest p<0.028) but not in AUC. Training should include early low-prevalence sessions on “what not to recall.”
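For readers less familiar with the measures reported above, the following is a minimal sketch of how the training-set prevalence, the true-positive fraction (TPF, the recall rate of cancer cases) and the false-positive fraction (FPF, the recall rate of non-cancer cases) can be computed. The function names and the five-case toy reading are hypothetical and purely illustrative; only the counts 13/127 and 52/128 come from the study.

```python
def prevalence(n_cancer: int, n_total: int) -> float:
    """Fraction of cases in a set that have verified cancer."""
    return n_cancer / n_total


def recall_fractions(recalled, has_cancer):
    """Return (TPF, FPF) for one reader.

    recalled[i]   -- True if the reader recalled case i
    has_cancer[i] -- True if case i has verified malignancy
    """
    n_pos = sum(has_cancer)
    n_neg = len(has_cancer) - n_pos
    tp = sum(1 for r, c in zip(recalled, has_cancer) if r and c)
    fp = sum(1 for r, c in zip(recalled, has_cancer) if r and not c)
    return tp / n_pos, fp / n_neg


# Training-set prevalence, using the counts reported in the abstract.
low = prevalence(13, 127)    # about 0.10
high = prevalence(52, 128)   # about 0.41

# Hypothetical toy reading (NOT study data): 2 cancers, 3 non-cancers.
has_cancer = [True, True, False, False, False]
recalled = [True, False, True, False, False]
tpf, fpf = recall_fractions(recalled, has_cancer)
print(f"low={low:.3f} high={high:.3f} TPF={tpf:.2f} FPF={fpf:.2f}")
```

In this framing, sensitivity equals TPF and specificity equals 1 - FPF, so a drop in FPF after training corresponds to a gain in specificity.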