Lack of agreement between radiologists: implications for image-based model observers
3 May 2017
We tested whether radiologists agree in how they rank different reconstructions of breast computed tomography images, based both on their diagnostic (classification) performance and on their subjective assessments of image quality. We used 102 pathology-proven cases (62 malignant, 40 benign) and an iterative image reconstruction (IIR) algorithm to obtain 24 reconstructions per case with different image appearances. Using image feature analysis, we selected three IIRs, one clinical reconstruction, and 50 lesions. The reconstructions spanned a range of image quality from smooth/low-noise to sharp/high-noise, with classifier performance ranging from an area under the ROC curve (AUC) of 0.62 to 0.96. Six experienced Mammography Quality Standards Act (MQSA) radiologists rated the likelihood of malignancy for each lesion. We then conducted an additional reader study with the same radiologists and a subset of 30 lesions, in which the radiologists ranked the reconstructions according to their preference. The six radiologists disagreed on which reconstruction produced images with the highest diagnostic content, but they preferred the image appearance with intermediate sharpness and noise over the others. However, the reconstruction they preferred most did not match the one with which they performed best. Because of these disagreements, it may be difficult to develop a single image-based model observer that is representative of a population of radiologists for this particular imaging task.
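As an illustration of the kind of analysis described above, the following is a minimal sketch of how per-reader AUCs could be turned into rankings of the reconstructions and how agreement between those rankings could be summarized. It is not the authors' method: the use of Kendall's coefficient of concordance (W), the function names, and the AUC values in the example are all assumptions made for illustration.

```python
def auc(scores, labels):
    """Empirical AUC (Mann-Whitney form): the fraction of malignant/benign
    pairs in which the malignant case received the higher rating.
    Tied ratings count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one malignant and one benign case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def rank_descending(values):
    """Rank 1 goes to the largest value; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def kendalls_w(rank_rows):
    """Kendall's W for m raters each ranking the same n items.
    1.0 means identical rankings; values near 0 mean no agreement.
    No tie correction is applied in this sketch."""
    m = len(rank_rows)
    n = len(rank_rows[0])
    totals = [sum(row[j] for row in rank_rows) for j in range(n)]
    mean_total = m * (n + 1) / 2
    s = sum((t - mean_total) ** 2 for t in totals)
    return 12 * s / (m ** 2 * (n ** 3 - n))


if __name__ == "__main__":
    # Hypothetical AUCs (NOT data from the study): one row per reader,
    # one column per reconstruction.
    reader_aucs = [
        [0.70, 0.85, 0.80, 0.62],
        [0.88, 0.75, 0.96, 0.70],
        [0.66, 0.90, 0.72, 0.81],
    ]
    rankings = [rank_descending(row) for row in reader_aucs]
    print("rankings:", rankings)
    print("Kendall's W:", kendalls_w(rankings))
```

A low W across readers would correspond to the situation reported here, in which readers do not agree on which reconstruction carries the most diagnostic content. The same function could be applied to the subjective preference rankings to compare preference agreement with performance agreement.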
© 2017 Society of Photo-Optical Instrumentation Engineers (SPIE) 2329-4302/2017/$25.00 © 2017 SPIE
Juhun Lee, Robert M. Nishikawa, Ingrid S. Reiser, Margarita L. Zuley, and John M. Boone "Lack of agreement between radiologists: implications for image-based model observers," Journal of Medical Imaging 4(2), 025502 (3 May 2017).
Received: 3 November 2016; Accepted: 17 April 2017; Published: 3 May 2017
