6 April 2005 Influence of panel size and expert skill on truth panel performance when combining expert ratings
The focus of this manuscript is to investigate the statistical properties of expert panels used as a substitute for clinical truth through a simplistic Monte Carlo simulation model. We use Gaussian models to simulate both normal and abnormal distributions of ideal-observer test statistics. These distributions are designed to produce an ideal observer area under the ROC curve (AUC) of 0.85. Expert observers are modeled as an ideal observer test statistic degraded by a zero-mean Gaussian random variable. Different expert skill levels are achieved by changing the added variance. The experts' skill ranges between 0.6 and 0.8 in AUC. We combine decisions from 2-10 experts into a panel score by taking the median of all expert ratings as the panel test statistic. In experiment 1, truth panels made up of 2, 4, 8 and 10 experts who had the same skill level (AUC=0.8) achieved mean AUCs of 0.82, 0.83, 0.84, and 0.84, respectively. For experiment 2, the experts' skill level was varied uniformly between 0.6 and 0.8 in AUC. Panel performance decreased in experiment 2 compared to the fixed skill level panels in experiment 1. However, panels composed of 8 and 10 experts still achieved an AUC greater than 0.80, the maximum of any individual expert. These simulation experiments, while idealized and simplistic, are a starting point for understanding the implications of using a panel of experts as surrogate truth in ROC studies when a gold standard is not available.
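The simulation described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' code. It assumes an equal-variance, unit-variance Gaussian model for the ideal-observer test statistic, so that AUC = Φ(d'/√2), and it assumes independent expert noise. The function names, the number of cases per class, the number of trials, and the seed are all hypothetical choices that do not come from the paper.

```python
import numpy as np
from statistics import NormalDist

_PHI_INV = NormalDist().inv_cdf  # inverse standard normal CDF


def noise_sd_for_auc(expert_auc, ideal_auc=0.85):
    """Std. dev. of the added zero-mean noise that degrades an ideal
    observer (unit-variance classes, assumed) to the target expert AUC."""
    ratio = _PHI_INV(ideal_auc) / _PHI_INV(expert_auc)
    return float(np.sqrt(ratio ** 2 - 1.0))


def empirical_auc(neg, pos):
    """Mann-Whitney estimate of AUC (scores are continuous, so no ties)."""
    scores = np.concatenate([neg, pos])
    ranks = np.empty(scores.size)
    ranks[np.argsort(scores)] = np.arange(1, scores.size + 1)
    u = ranks[neg.size:].sum() - pos.size * (pos.size + 1) / 2.0
    return u / (neg.size * pos.size)


def simulate_panel_auc(expert_aucs, n_cases=2000, n_trials=200,
                       ideal_auc=0.85, seed=0):
    """Mean AUC of a panel whose score is the median of its experts' ratings.

    n_cases (per class), n_trials, and seed are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    d_prime = np.sqrt(2.0) * _PHI_INV(ideal_auc)  # class separation
    sds = np.array([noise_sd_for_auc(a, ideal_auc) for a in expert_aucs])
    aucs = []
    for _ in range(n_trials):
        # Ideal-observer test statistics for normal and abnormal cases.
        neg = rng.normal(0.0, 1.0, n_cases)
        pos = rng.normal(d_prime, 1.0, n_cases)
        # Each expert sees the ideal statistic plus independent noise.
        neg_r = neg[:, None] + rng.normal(0.0, 1.0, (n_cases, sds.size)) * sds
        pos_r = pos[:, None] + rng.normal(0.0, 1.0, (n_cases, sds.size)) * sds
        # Panel test statistic: median across experts for each case.
        aucs.append(empirical_auc(np.median(neg_r, axis=1),
                                  np.median(pos_r, axis=1)))
    return float(np.mean(aucs))
```

A fixed-skill panel in the style of experiment 1 could then be run as `simulate_panel_auc([0.8] * 8)`. For experiment 2, one possible reading of "varied uniformly" is to draw each expert's AUC from a uniform distribution, for example `simulate_panel_auc(np.random.default_rng(1).uniform(0.6, 0.8, size=8))`. Because the settings here are assumptions, the resulting values need not match the figures reported in the abstract exactly.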
© (2005) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Nicholas Petrick, Brandon D. Gallas, Frank W. Samuelson, Robert F. Wagner, and Kyle J. Myers, "Influence of panel size and expert skill on truth panel performance when combining expert ratings," Proc. SPIE 5749, Medical Imaging 2005: Image Perception, Observer Performance, and Technology Assessment (6 April 2005).
