We previously conducted an observer study evaluating radiologists' performance in characterizing mammographic masses on serial mammograms with and without CAD. 253 temporal image pairs (138 malignant and 115 benign) from 96 patients containing masses on serial mammograms were used. The interval change characteristics of the masses on each temporal pair were analyzed by our CAD program to differentiate malignant from benign masses. The classifier achieved a test Az value of 0.87 for the data set. Eight MQSA radiologists and two fellows assessed the temporal masses and provided estimates of the likelihood of malignancy (LM) and BI-RADS assessments, first without and then with CAD. The LM estimates were provided on a quasi-continuous confidence-rating scale (CRS) of 1 to 100. In the current study, we investigated the effects of using discrete CRSs with fewer categories on ROC analysis. We simulated three discrete CRSs containing 5, 10, and 20 categories by binning the radiologists' quasi-continuous LM ratings. For the ten radiologists, without CAD, the average Az values in estimating the LM for the 5-, 10-, 20-, and 100-category CRSs were 0.788, 0.786, 0.785, and 0.787, respectively. With CAD, the observers' average Az improved to 0.845, 0.843, 0.844, and 0.843, respectively. The improvement was statistically significant (p<0.011) for each CRS. The partial area indices for the four CRSs without CAD were 0.198, 0.204, 0.200, and 0.206, respectively. With CAD, the partial area index also improved significantly, to 0.369, 0.365, 0.369, and 0.366, respectively (p<0.006 for all CRSs). The use of continuous and discrete confidence-rating scales in this study had minimal effect on the analysis of observer performance.
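The binning step described above is straightforward to sketch. In the example below the LM ratings are synthetic (drawn from arbitrary normal distributions, not the study data), the function names are ours, and the AUC is the empirical Wilcoxon estimate rather than a fitted Az:

```python
import numpy as np

def wilcoxon_auc(pos, neg):
    """Empirical AUC via the Wilcoxon/Mann-Whitney statistic (ties count 1/2)."""
    pos = np.asarray(pos, dtype=float)[:, None]
    neg = np.asarray(neg, dtype=float)[None, :]
    return (pos > neg).mean() + 0.5 * (pos == neg).mean()

def bin_ratings(ratings, n_categories, lo=1, hi=100):
    """Bin quasi-continuous ratings on [lo, hi] into n_categories ordinal levels."""
    edges = np.linspace(lo, hi, n_categories + 1)
    return np.clip(np.digitize(ratings, edges[1:-1], right=True) + 1, 1, n_categories)

rng = np.random.default_rng(0)
malignant = np.clip(rng.normal(70, 18, 138), 1, 100)   # hypothetical LM ratings
benign    = np.clip(rng.normal(45, 18, 115), 1, 100)

for k in (5, 10, 20, 100):
    auc_k = wilcoxon_auc(bin_ratings(malignant, k), bin_ratings(benign, k))
    print(f"{k:3d} categories: AUC = {auc_k:.3f}")
```

As in the study, the empirical AUC barely moves as the scale is coarsened, because binning is a monotonic transformation that only introduces ties.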
Access to the requested content is limited to institutions that have purchased or subscribe to SPIE eBooks.
You are receiving this notice because your organization may not have SPIE eBooks access.*
*Shibboleth/Open Athens users: please sign in to access your institution's subscriptions.
To obtain this item, you may purchase the complete book in print or electronic format on SPIE.org.
We introduce an interesting interpretation of the ROC Curve that opens a new research paradigm. We define the "Diagnostician Operating Choice" (DOC) Curve to be the set of all (true positive probability, true negative probability) options, or, viewed from the diagnostician's perspective, ("skill in the diseased population", "skill in the non-diseased population") options, made available to a particular radiologist when interpreting a particular diagnostic technology. The DOC Curve is thus the choice set presented to the diagnostician by their interaction with the technology. This new paradigm calls for tools that can measure the particular choice set of any individual radiologist interpreting a particular technology in a particular clinical setting. Fundamental requirements for this paradigm are that the DOC Curve be unique to individuals and constant across similar experimental conditions. To investigate constancy, we analyzed data from a reading study of 10 radiologists. Each radiologist interpreted the same set of 148 screening mammograms twice using a modified version of BI-RADS. ROC Curves for each radiologist were computed and compared between the two reading occasions with the CORROC2 program. None of the area differences was statistically significant at the 0.05 level, providing confirmation (but not proof) of constancy across the two reading conditions. The DOC Curve paradigm suggests new areas of research focusing on the behavior of individuals interacting with technology. A clear need is more efficient estimation of individual DOC Curves from limited case sets. Paradoxically, the answer to this last problem might lie in using large population-based ("MRMC") studies to develop highly efficient and externally validated standardized testing tools for assessment of the individual.
The multiple-reader, multiple-case (MRMC) paradigm of Swets and Pickett (1982) for ROC analysis was expressed as a components-of-variance model by Dorfman, Berbaum, and Metz (1992) and validated by Roe and Metz (1997) for Type I error rates. Our group proposed an analysis of the MRMC components-of-variance model using bootstrap experiments (Beiden, Wagner, and Campbell, 2000) instead of jackknife pseudovalues. These approaches have been challenged by some contemporary authors (e.g., Zhou, Obuchowski, and McClish, 2002). The purpose of the present paper is to formally compare the models and to carry out validation tests of their performance. We investigate different approaches to statistical inference, including several types of nonparametric bootstrap confidence intervals, and report on validation and simulation experiments of Type I error rates.
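One nonparametric bootstrap scheme in the spirit of the approach above can be sketched as follows. The rating matrix is simulated, readers and cases are resampled jointly, and all function names and parameter values are our own illustrative choices, not the paper's implementation:

```python
import numpy as np

def auc(pos, neg):
    """Empirical (Wilcoxon) AUC with ties counted as 1/2."""
    return ((pos[:, None] > neg[None, :]).mean()
            + 0.5 * (pos[:, None] == neg[None, :]).mean())

def mrmc_bootstrap_var(scores, truth, n_boot=500, seed=0):
    """Variance of the reader-averaged AUC, resampling readers and cases jointly.
    scores: (n_readers, n_cases) rating matrix; truth: (n_cases,) 0/1 labels."""
    rng = np.random.default_rng(seed)
    n_readers = scores.shape[0]
    pos_idx = np.flatnonzero(truth == 1)
    neg_idx = np.flatnonzero(truth == 0)
    stats = []
    for _ in range(n_boot):
        r = rng.integers(0, n_readers, n_readers)   # bootstrap readers
        p = rng.choice(pos_idx, pos_idx.size)       # bootstrap diseased cases
        n = rng.choice(neg_idx, neg_idx.size)       # bootstrap normal cases
        stats.append(np.mean([auc(scores[j, p], scores[j, n]) for j in r]))
    return np.var(stats, ddof=1)

# Toy data: 5 readers, 60 cases, reader noise on top of a common case effect
rng = np.random.default_rng(1)
truth = np.repeat([0, 1], 30)
case_effect = rng.normal(truth * 1.2, 1.0)
scores = case_effect + rng.normal(0, 0.4, (5, 60))
print(f"bootstrap variance of mean AUC: {mrmc_bootstrap_var(scores, truth):.5f}")
```

The joint resampling captures reader, case, and reader-by-case contributions in a single variance estimate; resampling only readers or only cases would isolate the individual components.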
Current approaches to ROC analysis use the MRMC (multiple-reader, multiple-case) paradigm in which several readers read each case and their ratings are used to construct an estimate of the area under the ROC curve or some other ROC-related parameter. Standard practice is to decompose the parameter of interest according to a linear model into terms that depend in various ways on the readers, cases and modalities. It is assumed that the terms are statistically independent (or at least uncorrelated). Bootstrap methods are then used to estimate the variance of the estimate and the contributions from the individual terms in the assumed expansion. Though the methodological aspects of MRMC analysis have been studied in detail, the literature on the probabilistic basis of the individual terms is sparse. In particular, few papers state what probability law applies to each term and what underlying assumptions are needed for the assumed independence. This paper approaches the MRMC problem from a mechanistic perspective. For a single modality, three sources of randomness are included: the images, the reader skill and the reader uncertainty. The probability law on the parameter estimate is written in terms of three nested conditional probabilities, and random variables associated with this probability are referred to as triply stochastic.
The triply stochastic probability is used to define the overall average of any ROC parameter as well as certain partial averages that are useful in MRMC analysis. When this theory is applied to estimates of an ROC parameter for a single modality, it is shown that the variance of the estimate can be written as a sum of three terms, rather than the four that would be expected in MRMC analysis. The usual terms in MRMC expansions do not appear naturally in the multiply stochastic theory.
A rigorous MRMC expansion can be constructed by adding and subtracting partial averages to the parameter of interest in a tautological manner. In this approach the parameter is decomposed into a sum of four uncorrelated, zero-mean random variables, with each term clearly defined in terms of conditional probabilities.
When the variance of the expansion is computed, however, numerous subtractions occur, and there is no apparent advantage to computing the variance term by term; the final result is the same as one gets from the triply stochastic decomposition, at least for the Wilcoxon estimator. No other nontrivial MRMC expansion appears to be possible.
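The three-term structure claimed above is the law of total variance applied twice to the nested conditioning. A toy numerical check, with a simple additive model standing in for the case, reader-skill, and within-reader sources (our own simplification, not the paper's estimator):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200_000
g = rng.normal(0, 1.0, n)    # case (image) randomness
t = rng.normal(0, 0.5, n)    # reader-skill randomness
e = rng.normal(0, 0.3, n)    # within-reader (uncertainty) randomness
a = g + t + e                # toy "parameter estimate", triply stochastic

# For this additive model the nested conditional expectations are explicit:
# E[a|g] = g and E[a|g,t] = g + t, so the three variance terms are:
term_cases  = g.var()        # Var_g( E[a|g] )
term_skill  = t.var()        # E_g( Var_t( E[a|g,t] ) )
term_within = e.var()        # E_{g,t}( Var(a|g,t) )
print(a.var(), term_cases + term_skill + term_within)
```

The two printed numbers agree up to Monte Carlo error, illustrating how a triply stochastic quantity yields exactly three variance terms rather than four.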
Most prior two-alternative forced-choice (2AFC) experiments have been designed using a small number of signal strengths with many scenes for each strength. Percent correct is then computed for each level and fit to the assumed psychometric function. However, this introduces error because the signal strengths of individual responses are shifted. An alternative approach is to compute the statistical likelihood as a function of the threshold and width of the psychometric response curve. The best fit is then determined by finding the threshold and width that maximize the likelihood. In this paper, we discuss a method for analyzing 2AFC observer responses using maximum likelihood estimation (MLE) techniques. The logit model is used to represent the psychometric function and to derive the likelihood. A conjugate gradient search algorithm is then used to find the maximum likelihood. The method is illustrated using human observer results from a previous study, while the statistical characteristics of the method are examined using simulated response data. The human observer results show that the psychometric function varies between observers and from test to test. The simulations show that the variances of the threshold and width exhibit a 1/Nobs relationship (σ = 1.5201·Nobs^(−0.5236)), where Nobs is the number of observations made in a 2AFC test, ranging from 10 to 30000. The variance of the human observer data was in close agreement with the simulations. These results indicate that the method is robust over a wide range of observations and can be used to predict human responses. The results of the simulations also suggest how to minimize error in future studies.
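A minimal version of this fit can be sketched with a grid search standing in for the conjugate-gradient maximization. The trial data are simulated, and the function names, parameter ranges, and true values (threshold 5, width 1) are our own choices:

```python
import numpy as np

def p_correct(x, thr, width):
    """Logit psychometric function for 2AFC: 0.5 at weak signals, toward 1 at strong."""
    return 0.5 + 0.5 / (1.0 + np.exp(-(x - thr) / width))

def fit_mle(x, correct, thrs, widths):
    """Maximize the Bernoulli log-likelihood over a (threshold, width) grid."""
    best, best_ll = None, -np.inf
    for t in thrs:
        for w in widths:
            p = np.clip(p_correct(x, t, w), 1e-9, 1 - 1e-9)
            ll = np.sum(correct * np.log(p) + (1 - correct) * np.log(1 - p))
            if ll > best_ll:
                best, best_ll = (t, w), ll
    return best

rng = np.random.default_rng(2)
x = rng.uniform(0, 10, 3000)                          # per-trial signal strengths
correct = rng.random(3000) < p_correct(x, 5.0, 1.0)   # simulated 2AFC outcomes
thr, width = fit_mle(x, correct.astype(float),
                     np.linspace(3, 7, 41), np.linspace(0.4, 2.0, 33))
print(f"fitted threshold = {thr:.2f}, width = {width:.2f}")
```

Because each trial keeps its own signal strength, no binning into a few levels is needed, which is exactly the advantage the abstract describes.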
Acquiring multiple images of the same patient (e.g., mediolateral oblique and craniocaudal view mammograms) can, in principle, help improve diagnostic accuracy. We investigated theoretically, in the context of computer-aided diagnosis (CAD), four methods of combining multiple computer outputs obtained from multiple images of the same patient: taking the average, the median, the maximum, or the minimum of the individual assessments. We assumed that the multiple computer outputs for each patient are equally accurate and that they can be transformed monotonically to the same pair of truth-conditional normal distributions. We found that both the average and the median always produce an improved area under the ROC curve (AUC) compared to single-view images, and that the average always performs better than the median. Furthermore, the maximum and the minimum can also produce improved AUCs and can outperform the average in certain situations, but in other situations they can produce worse results than single-view images. Moreover, except for the median, each method can be the best-performing method under specific conditions. Finally, as the strength of correlation between image pairs increases, the maximum and the minimum tend to perform best more often, whereas the average is less often the best performer.
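The four combination rules are easy to compare empirically. The sketch below uses three equally accurate, equicorrelated Gaussian outputs per patient; the paper treats view pairs analytically, so the three-output setup, the correlation of 0.4, and the class separation of 1.5 are our own toy choices (with only two outputs the median would coincide with the average):

```python
import numpy as np

def auc(pos, neg):
    """Empirical (Wilcoxon) AUC with ties counted as 1/2."""
    return ((pos[:, None] > neg[None, :]).mean()
            + 0.5 * (pos[:, None] == neg[None, :]).mean())

rng = np.random.default_rng(4)
n_views, rho, n = 3, 0.4, 3000
cov = (1 - rho) * np.eye(n_views) + rho                  # equicorrelated views
sig = rng.multivariate_normal([1.5] * n_views, cov, n)   # diseased patients
bkg = rng.multivariate_normal([0.0] * n_views, cov, n)   # normal patients

results = {}
for name, f in [("single view", lambda z: z[:, 0]),
                ("average", lambda z: z.mean(axis=1)),
                ("median",  lambda z: np.median(z, axis=1)),
                ("maximum", lambda z: z.max(axis=1)),
                ("minimum", lambda z: z.min(axis=1))]:
    results[name] = auc(f(sig), f(bkg))
    print(f"{name:11s} AUC = {results[name]:.3f}")
```

Rerunning with rho close to 1 shrinks the benefit of averaging, consistent with the correlation effect described above.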
The focus of this manuscript is to investigate the statistical properties of expert panels used as a substitute for clinical truth, using a simple Monte Carlo simulation model. We use Gaussian models to simulate both normal and abnormal distributions of ideal-observer test statistics. These distributions are designed to produce an ideal-observer area under the ROC curve (AUC) of 0.85. Expert observers are modeled as an ideal-observer test statistic degraded by a zero-mean Gaussian random variable; different expert skill levels are achieved by changing the added variance. The experts' skill ranges between 0.6 and 0.8 in AUC. We combine decisions from 2 to 10 experts into a panel score by taking the median of all expert ratings as the panel test statistic. In experiment 1, truth panels made up of 2, 4, 8, and 10 experts who had the same skill level (AUC = 0.8) achieved mean AUCs of 0.82, 0.83, 0.84, and 0.84, respectively. In experiment 2, the experts' skill level was varied uniformly between 0.6 and 0.8 in AUC. Panel performance decreased in experiment 2 compared to the fixed-skill panels of experiment 1. However, panels composed of 8 and 10 experts still achieved an AUC greater than 0.80, the maximum of any individual expert. These simulation experiments, while idealized, are a starting point for understanding the implications of using a panel of experts as surrogate truth in ROC studies when a gold standard is not available.
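Experiment 1 (all experts at AUC 0.8) is compact enough to reproduce in miniature. The case counts and seeds below are our own; the class separation and the added-noise variance follow directly from the binormal relation AUC = Φ(sep/√(2(1+v))):

```python
import numpy as np
from statistics import NormalDist

z = NormalDist().inv_cdf
sep = np.sqrt(2) * z(0.85)            # class separation giving ideal AUC = 0.85

def noise_var(target_auc):
    """Variance of added Gaussian noise that degrades an expert to target_auc."""
    return (sep / z(target_auc)) ** 2 / 2 - 1

def auc(pos, neg):
    """Empirical (Wilcoxon) AUC with ties counted as 1/2."""
    return ((pos[:, None] > neg[None, :]).mean()
            + 0.5 * (pos[:, None] == neg[None, :]).mean())

rng = np.random.default_rng(5)
n = 3000
t_norm = rng.normal(0.0, 1.0, n)      # ideal statistic, normal cases
t_abn  = rng.normal(sep, 1.0, n)      # ideal statistic, abnormal cases
sd = np.sqrt(noise_var(0.80))         # every expert degraded to AUC 0.80

panel_aucs = {}
for m in (2, 4, 8, 10):
    p_norm = np.median(t_norm + rng.normal(0, sd, (m, n)), axis=0)
    p_abn  = np.median(t_abn  + rng.normal(0, sd, (m, n)), axis=0)
    panel_aucs[m] = auc(p_abn, p_norm)
    print(f"{m:2d} experts: panel AUC = {panel_aucs[m]:.3f}")
```

The median of the experts' independent errors partially cancels, so panel AUC rises above 0.80 toward, but never past, the ideal observer's 0.85.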
In clinical interpretations radiologists do not know the locations of lesions that may be present in the images, and they differ in their abilities to search images and find lesions. The receiver operating characteristic (ROC) model does not directly model the search ability of the observer. The aims of this work were (a) to present a simplified model that includes a search parameter, and (b) to describe an algorithm for estimating the parameters of the model from free-response receiver operating characteristic (FROC) data. The model consists of two unit-variance normal distributions, noise and signal, separated by a distance μ. The lesion decision-variable (DV) samples are generated by sampling the signal distribution s times, where s is the known number of lesions in the image. The noise DV samples are generated by sampling the noise distribution n times, where n is the unknown number of noise sites in the image. The model regards n as an integer random variable that is realized by sampling a Poisson distribution with intensity parameter λ. Under the assumption that all DV samples are independent, a maximum-likelihood method succeeded in estimating the population values of the parameters from simulated FROC data. The ROC curves predicted by the model are "proper", i.e., they do not cross the chance diagonal. The model also predicts the widely observed result in ROC studies that the noise distribution is narrower than the signal distribution, corresponding to b < 1 in the familiar ROC model.
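The sampling model is compact enough to simulate directly. The sketch below generates FROC marks per image and reduces them to an image-level ROC via the highest rating; the parameter values are arbitrary illustrative choices, and the paper's maximum-likelihood estimation step is omitted:

```python
import numpy as np

rng = np.random.default_rng(6)
mu, lam, s, n_img = 2.0, 1.5, 1, 1000   # arbitrary illustrative values

def image_max(has_lesion):
    """Highest DV sample in one image; -inf if the image produced no marks."""
    dv = list(rng.normal(0.0, 1.0, rng.poisson(lam)))   # noise-site DVs, n ~ Poisson(lam)
    if has_lesion:
        dv += list(rng.normal(mu, 1.0, s))              # s lesion DVs from N(mu, 1)
    return max(dv) if dv else -np.inf

normal   = np.array([image_max(False) for _ in range(n_img)])
diseased = np.array([image_max(True)  for _ in range(n_img)])

# Wilcoxon AUC of the highest-rating reduction (ties, e.g. two markless images, count 1/2)
auc_hr = ((diseased[:, None] > normal[None, :]).mean()
          + 0.5 * (diseased[:, None] == normal[None, :]).mean())
print(f"highest-rating AUC = {auc_hr:.3f}")
```

Note that a fraction exp(-λ) of normal images produce no marks at all, which is the mechanism behind the "proper" ROC behavior the model predicts.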
A connected network of blood vessels surrounds and permeates almost every organ of the human body. The ability to define detailed blood vessel trees enables a variety of clinical applications. This paper discusses four such applications and some of the visualization challenges inherent to each.
Guidance of endovascular surgery: 3D vessel trees offer important information unavailable from traditional x-ray projection views. How best to combine the 2D and 3D image information is unknown.
Planning/guidance of tumor surgery: During tumor resection it is critical to know which blood vessels can be interrupted safely and which cannot. Providing efficient, clear information to the surgeon together with measures of uncertainty in both segmentation and registration can be a complex problem.
Vessel-based registration: Vessel-based registration allows pre- and intraoperative images to be registered rapidly. The approach both provides a potential solution to a difficult clinical dilemma and offers a variety of visualization opportunities.
Diagnosis/staging of disease: Almost every disease affects blood vessel morphology. The statistical analysis of vessel shape may thus prove to be an important tool in the noninvasive analysis of disease. A plethora of information is available that must be presented meaningfully to the clinician.
As medical image analysis methods increase in sophistication, an increasing amount of useful information of varying types will become available to the clinician. New methods must be developed to present a potentially bewildering amount of complex data to individuals who are often accustomed to viewing only tissue slices or flat projection views.
Four observer groups with different levels of expertise were tested to determine the effect of feedback on eye movements and accuracy whilst performing a simple radiological task. The observer groups were 8 experts, 9 year-1 radiography students, 9 year-3 radiography students, and 10 naive observers (psychology students). The task was fracture detection in the wrist. A test bank of 32 films was compiled with 14 normals, 6 grade-1 fractures (subtle appearance), 6 grade-2 fractures, and 6 grade-3 fractures (obvious appearance). Eye tracking was carried out on all observers to demonstrate differences in visual activity. Observers were asked to rate their confidence in their decision on a ten-point scale. Feedback was presented to the observers in the form of circles displayed on the film where fixations had occurred, the size of each circle being proportional to the length of fixation. Observers were then asked to repeat their decision rating. Accuracy was determined by ROC analysis and the area under the curve (AUC). In two groups, the naive observers and the year-1 radiography students, feedback resulted in no significant difference in the AUC. In the other two groups, the experts (p = 0.002) and the year-3 radiography students (p = 0.031), feedback had a negative effect on performance. The eye-tracking parameters were measured for all subjects and compared. This is work in progress, but initial analysis of the data suggests that in a simple radiological task such as fracture detection, where search is very limited, feedback that encourages observers to look harder at the image can have a negative effect on image interpretation performance. For the naive observers, however, feedback was beneficial: their post-feedback eye-tracking parameters more closely matched those of the experts.
In mammography, gaze duration at given locations has been shown to correlate positively with decision outcome at those locations. Furthermore, most locations that contain an unreported malignant lesion attract the eye of experienced radiologists for almost as long as locations that contain correctly reported cancers. This suggests that faulty detection is not the main reason why cancers are missed; rather, failures in the perceptual and decision-making processes at the locations of these findings may be significant as well. Models of medical image perception advocate that the decision to report or to dismiss a perceived finding depends not only on the finding itself but also on the background areas the observer selects to compare the finding with, in order to determine its uniqueness. In this paper we studied the visual search strategy of experienced mammographers as they examined a case set containing cancer cases and lesion-free cases. For the cancer cases, two sets of mammograms were used: the ones in which the lesion was reported in clinical practice, and the most recent prior mammograms. We determined how changes in lesion conspicuity from the prior mammogram to the most recent mammogram affected the visual search strategy of the observers. We represented the changes in visual search using spatial frequency analysis, and determined whether there were any significant differences between the prior and the most recent mammograms.
In mammography, computer-aided diagnosis (CAD) techniques for mass detection and classification mainly use local image information to determine whether a region is abnormal. There is considerable interest in developing CAD methods that use context, asymmetry, and multiple-view information, but it is not clear to what extent this may improve CAD results. In this study, we used human observers to investigate the potential benefit of using context information for CAD. We investigated to what extent human readers make use of context information derived from the whole breast area and from asymmetry for the tasks of mass detection and classification. Results showed that context information can be used to improve CAD programs for mass detection. However, much remains to be gained from improved local feature extraction and classification: the observers did much better than the CAD program in classifying true positive (TP) and false positive (FP) regions. For classification of benign and malignant masses, context seems to be less important.
We derive a general random-effects model to study the differences in spatial frequency features across class type (true positive (TP), true negative (TN), or false positive (FP)), a sample of 40 mammogram cases, and 9 readers. We derive a measure of feature conspicuity, or salience, using visually inspired spatial frequency filters and mammogram regions of interest derived from eye-position data. Repeated-measures ANOVA is performed on the salient features obtained from all cases. We hypothesize that statistically significant differences in the average salience measure (D-score) are seen across both class types and cases. We believe this to be useful for determining the similarity between images in the training and testing sets used in CADx algorithm development, or for a priori determination of test-set difficulty. Further, we hypothesize that our salience measure is useful for distinguishing the spatial frequency bands most relied upon to distinguish true negative and true positive responses. This is useful in discerning the "bottom-up" cues used to direct the point of gaze during mammogram inspection. These results indicate that our salience measure is useful as an indicator of image similarity and for separating TP and TN regions of interest.
ROC experiments generally assume observers to be fully attentive to the task of the experiment and, therefore, that the experiment measures observers' innate ability in performing the task. In a detection task, inattention can cause an observer to overlook signals or signal-like potential false positives in an image. In an ROC experiment that involves a detection task, inattention can either cause an observer to report a strong confidence for signal absent, or cause the observer to report on some other finding less salient than the finding that the observer fails to notice. The purpose of this study was to determine the effect of observer inattention on empirical ROC estimates and the appropriateness of the conventional binormal model for fitting such ROC curves. An experiment was designed in which observers were asked to detect a signal of a simple geometric shape and varying contrast in a background of Gaussian noise. The images sometimes also contained other objects of different geometric shapes in locations that did not overlap the signal. Observer inattention was simulated by blocking a quarter of each image from observer view. Results showed that observer inattention caused the empirical ROC curve to decrease, but curve fitting with the binormal model appeared to be equally accurate with and without observer inattention. However, a "proper" ROC model might be more appropriate than the binormal model because the binormal model invariably produced a "hook" near the upper-right corner of the ROC unit square that was not indicated by the empirical operating points. We conclude that observer inattention in a detection task can be evaluated as part of a conventional ROC experiment.
The Center for Gamma-Ray Imaging is developing a number of small-animal SPECT imaging systems. These systems consist of multiple stationary detectors, each of which has its own multiple-pinhole collimator. The location of the pinhole plates (i.e., the magnification), the number of pinholes within each plate, and the pinhole locations are all adjustable. The performance of the Bayesian ideal observer sets the upper limit on task performance and can be used to optimize imaging hardware, such as pinhole configurations. Markov-chain Monte Carlo techniques have been developed to compute the ideal observer but require complete knowledge of the statistics of both the imaging system (such as the noise) and the class of random objects being imaged, in addition to an accurate forward model connecting the object to the image. Ideal-observer computations using Monte Carlo techniques are burdensome because the forward model must be simulated millions of times for each imaging system. We present an efficient technique for computing the Bayesian ideal observer for multiple-pinhole, small-animal SPECT systems that accounts for both the finite size of the pinholes and the stochastic nature of the objects being imaged. This technique relies on an efficient, radiometrically correct forward model that maps an object to an image in less than 20 milliseconds. An analysis of the error of the forward model is presented, along with the results of an ROC study using the ideal-observer test statistic.
We analyzed a variety of recently proposed decision rules for three-class classification from the point of view of ideal observer decision theory. We considered three recently proposed three-class decision rules: one by Scurfield, one by Chan et al., and one by Mossman. Scurfield's decision rule can be shown to be a special case of the three-class ideal observer decision rule in two different situations: when the pair of decision variables is the pair of likelihood ratios used by the ideal observer, and when the pair of decision variables is the pair of logarithms of the likelihood ratios. The decision rule of Chan et al. corresponds to an ideal observer model in which two of the decision lines used by the ideal observer overlap and the third line becomes undefined. Finally, we showed that the Mossman decision rule (in which a single decision line separates one class from the other two, while a second line separates those two classes) cannot be a special case of the ideal observer decision rule. Despite the considerable difficulties presented by the three-class classification task compared with two-class classification, we found that the three-class ideal observer provides a useful framework for analyzing a wide variety of three-class decision strategies.
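For concreteness, a scalar-data sketch of the three-class ideal observer can be written down directly. With equal priors and 0-1 costs (our own simplifying choices, as are the Gaussian class densities), maximizing the posterior is equivalent to partitioning the plane of the two likelihood ratios with straight decision lines:

```python
import numpy as np

means, sigma = np.array([-2.0, 0.0, 2.0]), 1.0   # illustrative class densities
priors = np.array([1/3, 1/3, 1/3])

def likelihoods(x):
    """Unnormalized Gaussian class likelihoods; common factors cancel in ratios."""
    return np.exp(-0.5 * ((x[None, :] - means[:, None]) / sigma) ** 2)

def ideal_decision(x):
    """0-1-cost ideal observer: argmax posterior, equivalently a partition of
    the (p1/p3, p2/p3) likelihood-ratio plane by decision lines."""
    like = likelihoods(x)                # shape (3, n)
    lr_pair = like[:2] / like[2]         # the pair of likelihood ratios
    return np.argmax(priors[:, None] * like, axis=0), lr_pair

rng = np.random.default_rng(7)
x = np.concatenate([rng.normal(m, sigma, 2000) for m in means])
truth = np.repeat([0, 1, 2], 2000)
decision, lr_pair = ideal_decision(x)
acc = (decision == truth).mean()
print(f"ideal-observer accuracy: {acc:.3f}")
```

Plugging other decision rules into the same simulation (e.g., thresholding only one likelihood ratio, as in a Mossman-style rule) lets one compare them against this Bayes-optimal baseline.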
Efficiencies of the human observer and channelized-Hotelling observers (CHOs) relative to the ideal observer are discussed for signal-detection tasks. We consider a CHO using Laguerre-Gauss channels, which we call an efficient CHO (eCHO), and a CHO that adds a scanning scheme to the eCHO to handle signal-location uncertainty, which we call a scanning eCHO (seCHO). Both signal-known-exactly (SKE) and signal-known-statistically (SKS) tasks are considered. Signal location is uncertain in the SKS tasks, and lumpy backgrounds model background uncertainty in both tasks. Markov-chain Monte Carlo methods are employed to determine ideal-observer performance on the detection tasks, and psychophysical studies are conducted to measure human-observer performance on the same tasks. A maximum-likelihood estimation method is employed to fit smooth psychometric curves to the observer performance measurements. Efficiency is computed as the squared ratio of the detectability of the observer of interest to that of a standard observer; depending on the image statistics, either the ideal observer or the Hotelling observer serves as the standard. The results show that the eCHO performs poorly in detecting signals with location uncertainty and that the seCHO performs only slightly better, while the ideal observer outperforms the human observer and the CHOs on both tasks. Human efficiencies are less than approximately 2.5% and 41%, respectively, for the SKE and SKS tasks in which the gray levels of the lumpy background are non-Gaussian distributed. These results also imply that human observers are not affected by signal-location uncertainty as much as the ideal observer is. For the SKE tasks using Gaussian-distributed lumpy backgrounds, however, human efficiency ranges between 28% and 42%. Three simplified pinhole imaging systems are simulated, and the human and model observers rank the systems in the same order for both tasks.
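The efficiency figure of merit described above reduces to a one-line computation; the sketch below uses illustrative values only, not data from this study:

```python
def efficiency(d_prime_observer: float, d_prime_standard: float) -> float:
    """Efficiency as the squared ratio of the detectability of the
    observer of interest to that of a standard observer
    (the ideal or Hotelling observer)."""
    return (d_prime_observer / d_prime_standard) ** 2

# Illustrative: a human d' of 0.5 against a standard-observer d' of 1.0
# gives an efficiency of 0.25, i.e. 25%.
print(efficiency(0.5, 1.0))
```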
For the two-class detection problem (signal absent/present), the likelihood ratio is an ideal observer in that it minimizes Bayes risk for arbitrary costs and maximizes AUC, the area under the ROC curve. The AUC-optimizing property makes it a valuable tool in imaging-system optimization. For a different task, namely joint detection and localization of the signal, it would be similarly valuable to have a decision strategy that optimizes a relevant scalar figure of merit. We are interested in quantifying performance on decision tasks involving location uncertainty using the LROC methodology. We derive decision strategies that maximize the area under the LROC curve, ALROC, and show that these strategies minimize Bayes risk under certain reasonable cost constraints. We model the detection-localization task as a decision problem in three increasingly realistic ways. In the first two models, we treat location as a discrete parameter with finitely many values, resulting in an (L+1)-class classification problem; the first, simple model omits search-tolerance effects, while the second, more general model includes them. In the third and most general model, we treat location as a continuous parameter and also include search-tolerance effects. In all cases, the essential proof that the observer maximizes ALROC is obtained with a modified version of the Neyman-Pearson lemma using Lagrange-multiplier methods. A separate proof shows that, in all three cases, the decision strategy minimizes Bayes risk under certain reasonable cost constraints.
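A common nonparametric estimate of ALROC (a Wilcoxon-style statistic, our illustration rather than the optimal decision strategies derived in the paper) counts the fraction of signal/noise case pairs in which the signal-present case outscores the signal-absent case and the signal is correctly localized:

```python
import numpy as np

def alroc_estimate(signal_ratings, correct_loc, noise_ratings):
    """Wilcoxon-style nonparametric estimate of the area under the LROC
    curve: the probability that a signal-present case receives a higher
    rating than a signal-absent case AND is correctly localized
    (rating ties count half)."""
    s = np.asarray(signal_ratings, float)
    c = np.asarray(correct_loc, bool)
    n = np.asarray(noise_ratings, float)
    wins = (s[:, None] > n[None, :]) + 0.5 * (s[:, None] == n[None, :])
    return float((wins * c[:, None]).mean())

# Toy data: three signal-present cases (one mislocalized), two noise cases.
print(alroc_estimate([3, 2, 1], [True, True, False], [1, 2]))  # ≈ 0.583
```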
Adding internal noise to computer model observers to degrade their performance to human levels is a common way to allow quantitative comparison of human and model performance. In this paper, we studied two different types of methods for injecting internal noise into Hotelling model observers. The first method adds internal noise to the outputs of the individual channels: a) independent non-uniform channel noise, b) independent uniform channel noise. The second method adds internal noise to the decision variable arising from the combination of channel responses: a) internal-noise standard deviation proportional to the standard deviation of the decision variable due to the external noise, b) internal-noise standard deviation proportional to the variance of the decision variable caused by the external noise. We tested the square-window Hotelling observer (HO), the channelized Hotelling observer (CHO), and the Laguerre-Gauss Hotelling observer (LGHO). The task studied was the detection of a filling defect of varying size/shape in one of four simulated arterial-segment locations with real x-ray angiography backgrounds. Results show that the internal-noise method that best predicts human performance differs across the studied model observers: the CHO best predicts human-observer performance with channel internal noise, while the HO and LGHO best predict it with decision-variable internal noise. These results might help explain why previous studies have reached different conclusions about the ability of each Hotelling model to predict human performance, and might guide researchers in choosing a method for including internal noise in their Hotelling models.
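The decision-variable injection methods can be sketched in a few lines; the scaling parameter `alpha` and the Gaussian internal-noise source are illustrative assumptions, not the paper's exact parameters:

```python
import numpy as np

rng = np.random.default_rng(0)

def add_decision_noise(t, alpha, mode="std"):
    """Inject internal noise into a model observer's decision variable t.
    mode="std": internal-noise sigma proportional to the standard deviation
                of t induced by external noise (method 2a above).
    mode="var": internal-noise sigma proportional to the variance of t
                (method 2b above).
    alpha is a free scaling parameter, tuned so that the degraded model
    matches human performance."""
    ext = np.std(t) if mode == "std" else np.var(t)
    return t + rng.normal(0.0, alpha * ext, size=len(t))

# With unit-variance external noise and alpha = 1, the total variance of
# the noisy decision variable is approximately 1 + 1 = 2.
t = rng.normal(0.0, 1.0, 10_000)
t_noisy = add_decision_noise(t, alpha=1.0, mode="std")
print(np.var(t_noisy))
```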
Bayesian artificial neural networks (BANNs) have proven useful in two-class classification tasks, and are claimed to provide good estimates of ideal-observer-related decision variables (the a posteriori class membership probabilities). We wish to apply the BANN methodology to three-class classification tasks for computer-aided diagnosis, but we currently lack a fully general extension of two-class receiver operating characteristic (ROC) analysis to objectively evaluate three-class BANN performance. It is well known that "the likelihood ratio of the likelihood ratio is the likelihood ratio." Based on this, we found that the decision variable which is the a posteriori class membership probability of an observational data vector is in fact equal to the a posteriori class membership probability of that decision variable. Under the assumption that a BANN can provide good estimates of these a posteriori probabilities, a second BANN trained on the output of such a BANN should perform very similarly to an identity function. We performed a two-class and a three-class simulation study to test this hypothesis. The mean squared error (deviation from an identity function) of a two-class BANN was found to be 2.5x10^-4. The mean squared error of the first component of the output of a three-class BANN was found to be 2.8x10^-4, and that of its second component was found to be 3.8x10^-4. Although we currently lack a fully general method to objectively evaluate performance in a three-class classification task, circumstantial evidence suggests that two- and three-class BANNs can provide good estimates of ideal-observer-related decision variables.
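The "posterior of the posterior" identity can be checked numerically without any neural network: for two equiprobable 1-D Gaussian classes the a posteriori probability is known in closed form, and binning it against the true labels should reproduce the identity function. A sketch (our own toy setup, not the paper's simulation):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Two equiprobable 1-D Gaussian classes: N(0,1) and N(1,1).
y = rng.integers(0, 2, n)
x = rng.normal(y.astype(float), 1.0)

# Exact a posteriori probability of class 1 given x:
# the log-likelihood ratio is x - 0.5, so p = sigmoid(x - 0.5).
p = 1.0 / (1.0 + np.exp(-(x - 0.5)))

# Estimate P(class 1 | p) by binning p; it should match the identity.
bins = np.linspace(0.0, 1.0, 21)
idx = np.digitize(p, bins) - 1
centers, fractions = [], []
for b in range(20):
    m = idx == b
    if m.sum() > 100:
        centers.append(p[m].mean())     # mean decision variable in bin
        fractions.append(y[m].mean())   # empirical class-1 fraction in bin
mse = float(np.mean((np.array(fractions) - np.array(centers)) ** 2))
print(mse)  # small: sampling error only
```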
In medicine, images are taken so that specific tasks can be performed. Thus, any measure of image quality must account for the task the images are to be used for and the observer performing the task. Performing task-based optimizations using human observers is generally difficult, time consuming, expensive and, in the case of hardware optimizations, not necessarily ideal. Model observers have been successfully used in place of human observers. The channelized Hotelling observer is one such model observer. Depending on the choice of channels, the channelized Hotelling observer can be used either to predict human-observer performance or as an ideal observer. This paper focuses on the use of the channelized Hotelling observer as an approximation of the ideal linear observer. Laguerre-Gauss channels have proven useful for ideal-observer computations, but these channels are somewhat limited because they require the signal to be known exactly, both in terms of location and shape; in fact, they require the signal to be radially symmetric. We have devised a new method of determining "efficient" channels that does not require the signal to be symmetric and can even account for signal variability. The method can also be used for linear estimation tasks. We compared the performance of the channelized Hotelling observer using this new set of channels with that using the Laguerre-Gauss channels for a signal-known-exactly detection task, and found that the two correlate well.
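For the special case of white Gaussian noise and a radially symmetric (Gaussian) signal, a channelized Hotelling SNR with Laguerre-Gauss channels can be computed in a few lines. The grid size, channel scale `a`, channel count, and signal below are illustrative assumptions, not parameters from this paper:

```python
import numpy as np
from numpy.polynomial.laguerre import lagval

def lg_channels(dim, n_channels, a):
    """Rotationally symmetric Laguerre-Gauss channels on a dim x dim grid:
    u_n(r) ∝ exp(-pi r^2 / a^2) L_n(2 pi r^2 / a^2), normalized."""
    y, x = np.mgrid[:dim, :dim] - (dim - 1) / 2.0
    r2 = 2.0 * np.pi * (x**2 + y**2) / a**2
    U = np.empty((n_channels, dim * dim))
    for n in range(n_channels):
        c = np.zeros(n + 1)
        c[n] = 1.0                      # select the n-th Laguerre polynomial
        u = np.exp(-r2 / 2.0) * lagval(r2, c)
        U[n] = (u / np.linalg.norm(u)).ravel()
    return U

# SKE detection in white Gaussian noise, where the full Hotelling SNR has
# the closed form ||s|| / sigma, so the CHO SNR can be checked against it.
dim, sigma = 32, 1.0
y, x = np.mgrid[:dim, :dim] - (dim - 1) / 2.0
signal = np.exp(-(x**2 + y**2) / (2 * 3.0**2)).ravel()  # Gaussian signal

U = lg_channels(dim, n_channels=5, a=10.0)
s_ch = U @ signal                       # channelized signal
K_ch = sigma**2 * (U @ U.T)             # channelized noise covariance
snr_cho = float(np.sqrt(s_ch @ np.linalg.solve(K_ch, s_ch)))
snr_ideal = float(np.linalg.norm(signal) / sigma)
print(snr_cho / snr_ideal)              # close to 1 for this symmetric signal
```

Because channelization is a linear dimensionality reduction, the CHO SNR can never exceed the full Hotelling SNR; for a radially symmetric signal a handful of Laguerre-Gauss channels is nearly lossless.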
In the visual discrimination model (VDM) approach to measuring image quality, two input images are analyzed by an algorithm that calculates a just-noticeable-difference (JND) index. It has been claimed that the JND index can be used to predict target detectability in the medical-imaging detection task and that it could "eventually replace the time-intensive and complicated ROC studies". In earlier work we have suggested that this claim may be incorrect: we showed that the JND index and observer performance did not always correlate and that there were sometimes striking disagreements between the two. The purpose of this work is to present a modified method of using the VDM, termed the "channelized VDM", that correlates better with observer performance. A second purpose is to demonstrate another problem, namely predicting the optimal window-level setting of an image, where conventional VDM usage makes incorrect predictions; we show that the channelized VDM makes better predictions in this case too. Based on our studies, we caution against conventional VDM usage for image-quality optimization in the medical-imaging detection task. Additional material is available on the author's website (http://jafroc.radiology.pitt.edu).
Previous studies in which the JNDmetrix visual discrimination model (VDM) was applied to predict effects of image display and processing factors on lesion detectability have shown promising results for mammographic images with microcalcification clusters. In those studies, just-noticeable-difference (JND) metrics were computed for signal-present and signal-absent image pairs with the same background. When this "paired discriminability" method was applied to Gaussian signals in 1/f^3 filtered noise, however, it was unable to predict detection thresholds measured in 2AFC trials for different backgrounds. We suggested previously (SPIE 2002) that a statistical model observer using channel responses from "single-ended" VDM simulations could predict detection performance with different backgrounds. The implementation and evaluation of that VDM-channelized model observer is described in this paper. Model performance was computed for sets of signal and noise images from two observer performance studies involving the detection of simulated or real breast masses. For the first study, the VDM-channelized model observer was able to predict the dependence of detection thresholds on signal size (contrast-detail slope) for 2AFC detection of Gaussian signals on different 1/f^3 noise backgrounds. Variations in the detectability of masses in mammograms from the second study correlated well with model performance as a function of display type (LCD vs. CRT) and viewing angle (on-axis vs. 45° off-axis). The performance of the VDM-channelized model observer was superior to results obtained using either the VDM paired discriminability method or a conventional nonprewhitening model observer.
This paper studies the principle of transform coding and identifies quantization noise as the sole source of distortion. It shows that compression noise is a linear transform of the quantization noise generated when transform coefficients are quantized with uniform scalar quantizers. The quantization noise need not be uniformly distributed, as the distributions and quantization step sizes vary among transform coefficients. The paper derives the marginal, pairwise, and joint probability density functions (pdfs) of multi-dimensional quantization noise, and gives the mean vector and covariance matrix of quantization noise in closed form. Based on these results, it derives closed-form compression-noise statistics, including the marginal, pairwise, and joint pdfs and the mean vector and covariance matrix of compression noise. Compression noise is shown to have a jointly normal distribution, which keeps the computational complexity of its calculation reasonable. The derived statistics of quantization and compression noise are verified using the JPEG compression algorithm and lumpy-background images; the derived statistics closely predict the estimated ones. This work provides a theoretical foundation for deriving closed-form model observers and defining closed-form quality measures for compressed medical images.
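The basic building block here, the uniform scalar quantizer, has familiar high-resolution noise statistics: when the coefficient pdf is smooth over a quantization bin, the quantization noise is approximately uniform with mean 0 and variance step^2/12. A quick empirical check:

```python
import numpy as np

rng = np.random.default_rng(2)

def quantize(c, step):
    """Uniform scalar quantizer (rounding), as used for transform
    coefficients in JPEG-style coding."""
    return step * np.round(c / step)

# A coefficient pdf that is wide and smooth relative to the step size,
# so the high-resolution approximation applies.
step = 0.5
c = rng.normal(0.0, 10.0, 1_000_000)
noise = quantize(c, step) - c
print(noise.mean(), noise.var(), step**2 / 12)
```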
We conducted experiments to determine human performance in detecting
and discriminating microcalcification-like objects in mammographic
backgrounds. This study extends our previous work, in which we investigated detection and discrimination of known objects in white-noise backgrounds (SKE/BKE tasks). In the present experiments, we used hybrid images consisting of computer-generated signals of three shapes added to mammographic backgrounds extracted from digitized normal mammograms.
Human performance was measured by determining percentage correct (PC)
in 2-AFC experiments for the tasks of detecting a signal or discriminating between two signal shapes. PC was converted into a detection or discrimination index d', and psychometric functions were created by plotting d' as a function of the square root of signal energy. Human performance was compared with the predictions of an NPWE model observer. We found that the slope of the linear portion of the psychometric function for detection was smaller than that for discrimination, the opposite of what we observed for white-noise backgrounds, where the psychometric function for detection was significantly steeper than that for discrimination. Model observer predictions qualitatively reproduced the human performance.
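The PC-to-d' conversion for 2-AFC data is standard: d' = sqrt(2) * z(PC), where z is the inverse standard-normal CDF. A minimal sketch using only the Python standard library:

```python
from math import sqrt
from statistics import NormalDist

def two_afc_dprime(pc: float) -> float:
    """Convert proportion correct in a 2-AFC experiment to a detection
    (or discrimination) index: d' = sqrt(2) * Phi^-1(PC)."""
    return sqrt(2.0) * NormalDist().inv_cdf(pc)

# PC = 0.5 is chance (d' = 0); PC = 0.76 corresponds to d' of about 1.0.
print(two_afc_dprime(0.76))
```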
The aim of this work was to investigate and quantify the effects of system noise, nodule location, anatomical noise, and anatomical background on the detection of lung nodules in different regions of the chest radiograph. Simulated lung nodules of 10 mm diameter but varying detail contrast were randomly positioned in four kinds of images: 1) clinical images collected with a 200-speed CR system, 2) images containing only system noise (including quantum noise) at the same level as the clinical images, 3) clinical images with the anatomical noise removed, and 4) artificial images with a power spectrum similar to that of the clinical images but a random phase spectrum. An ROC study was conducted with 5 observers. The detail contrast needed to obtain an Az of 0.80, C0.8, was used as the measure of detectability. Five regions of the chest radiograph were investigated separately. The C0.8 of the system-noise images ranged from only 2% (the hilar regions) to 20% (the lateral pulmonary regions) of that of the clinical images. Compared with the original clinical images, and averaged over all five regions, C0.8 was 16% lower for the de-noised clinical images and 71% higher for the random-phase images. In conclusion, for the detection of lung nodules 10 mm in diameter, system noise is of minor importance at clinically relevant dose levels. Removing anatomical noise and other noise sources uncorrelated from image to image leads to somewhat better detection, but the major component disturbing detection is the overlap of recognizable structures, which are, however, the main content of a radiograph.
For diagnosis of breast cancer by mammography, the mammograms must be viewed by a radiologist. The purpose of this study was to determine the effect of display resolution on the specific clinical task of detection of breast lesions by a human observer. Using simulation techniques, this study proceeded through four stages. First, we inserted simulated masses and calcifications into raw digital mammograms. The resulting images were processed according to standard image-processing techniques and appropriately windowed and leveled. The processed images were then blurred according to MTFs measured from a clinical cathode-ray-tube (CRT) display. Finally, JNDMetrix, a visual discrimination model, examined the images to estimate human detection. The model results suggested that detection of masses and calcifications decreased under standard CRT resolution. Future work will confirm these results with human-observer studies. (This work was supported by grants NIH R21-CA95308 and USAMRMC W81XWH-04-1-0323.)
Under certain assumptions, the detectability of the ideal observer can be defined as the integral of the system Noise Equivalent Quanta (NEQ) multiplied by the squared object spatial-frequency distribution. Using the detector NEQ (NEQD) alone in this calculation inadequately describes the performance of an x-ray imaging system because it does not take into account the effects of patient scatter and geometric unsharpness. As a result, the ideal detectability index is overestimated, and hence the efficiency of the human observer in detecting objects is underestimated. We define a Generalized NEQ (GNEQ) for an x-ray system, referenced at the object plane, that incorporates the scatter fraction, the spatial distributions of scatter and focal spot, the detector MTF (MTFD), and the detector normalized noise power spectrum (NNPSD). This GNEQ was used in the definition of the ideal detectability for the evaluation of human-observer efficiency in a two-alternative forced-choice (2-AFC) experiment, and was compared with the case where only the NEQD was used in the detectability calculations. The 2-AFC experiment involved the detection of images of polyethylene tubes (diameters between 100 and 300 μm) filled with iodine contrast (concentrations between 0 and 120 mg/cm^3), placed on a uniform head-equivalent phantom near the surface of a microangiographic detector (43 μm pixel size). The efficiency of the human observer was 30% when the effects of scatter and geometric unsharpness were disregarded, and increased to 70% when these effects were considered. The ideal observer with the GNEQ can thus provide a simple method for optimizing a complete imaging system.
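The detectability integral in the first sentence can be evaluated numerically once the NEQ and object spectrum are tabulated; for a radially symmetric object spectrum it reduces to d'^2 = 2*pi * ∫ NEQ(f) |S(f)|^2 f df. The NEQ and object curves below are placeholder assumptions, not the measured GNEQ of this system:

```python
import numpy as np

def ideal_detectability(f, neq, obj_ft):
    """d' from d'^2 = 2*pi * integral of NEQ(f) * |S(f)|^2 * f df,
    for a radially symmetric object spectrum S(f) sampled on the
    frequency axis f (trapezoidal rule)."""
    integrand = neq * obj_ft**2 * f
    integral = np.sum((integrand[1:] + integrand[:-1]) / 2.0 * np.diff(f))
    return float(np.sqrt(2.0 * np.pi * integral))

# Assumed, illustrative curves: a Gaussian-rolloff NEQ and the Fourier
# transform of a small Gaussian object.
f = np.linspace(0.0, 10.0, 1000)              # cycles/mm
neq = 1e4 * np.exp(-(f / 3.0) ** 2)           # placeholder system GNEQ
obj = 0.05 * np.exp(-(np.pi * f * 0.3) ** 2)  # placeholder object spectrum
d = ideal_detectability(f, neq, obj)
print(d)  # ≈ 6.45 with these placeholder curves
```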
We investigated how lesion size and location affect the detection of simulated mass lesions in chest radiography. Simulated lesions were added to the center of 10 cm x 10 cm regions of digital chest radiographs and used in four-alternative forced-choice (4-AFC) experiments. We determined the lesion contrast required to achieve a 92% correct detection rate, I(92%). The mass size ranged from 1 to 10 mm, and we investigated lesion detection in the lung apex, the hilar region, and the sub-diaphragmatic region. In these experiments, the observer's I(92%) was obtained from randomized repeats at each of seven lesion sizes, and the results were plotted as I(92%) versus lesion size. In addition, we investigated the effect of using the same background in the four alternatives of each 4-AFC trial (twinned) versus random backgrounds from the same anatomical region taken from 20 different radiographs. In all three anatomical regions, the slopes of the contrast-detail curves for the random-background experiments were negative for lesion sizes less than 2.5, 3.5, and 5.5 mm in the hilar (slope of -0.26), apex (slope of -0.54), and sub-diaphragmatic (slope of -0.53) regions, respectively. For larger lesion sizes, the slopes were 0.34, 0.23, and 0.40 in the hilar, apex, and sub-diaphragmatic regions, respectively. The positive slopes over portions of the contrast-detail curves in chest radiography are a result of the anatomical background, and show that larger lesions can require more contrast for visualization.
As new imaging technologies such as digital radiography (DR) advance, radiologists are now able to detect smaller nodules than before. However, inter-observer variation in diagnosis remains a critical challenge that needs to be studied and addressed. In this research, inter-observer variation in marking and characterizing pulmonary nodules on DR images was studied in two phases: the first phase focused on the analysis of inter-observer variation, and the second on its reduction by using a computer system (IQQA(R)-Chest) that provides intelligent qualitative and quantitative analysis to help radiologists in the softcopy reading of DR chest images. Large inter-observer variations in pulmonary nodule identification and characterization on DR chest images were observed, even between expert radiologists. Experimental results also showed that less experienced radiologists could benefit greatly from the computer assistance, including a substantial decrease in inter-observer variation and an improvement in nodule detection rates. Moreover, radiologists with different levels of experience achieved similarly high performance after using the computer system. The computer system thus shows high potential for assisting the examination of DR chest images, especially as DR is adopted to screen large populations for lung cancer.
Optimizing Imaging Systems and Diagnostic Decisions
CRT displays are generally used for softcopy display in the digital reading room, but LCDs are being used more frequently. LCDs have many useful properties but can suffer from significant degradation when viewed off-axis. We compared observer performance and human-visual-system model performance for on-axis and off-axis CRT and LCD viewing. 400 mammographic regions of interest with different lesion contrasts were shown on-axis and off-axis to radiologists on a CRT and an LCD. Receiver operating characteristic (ROC) techniques were used to analyze observer performance, and the results were correlated with the predictions of the human-vision model (JNDmetrix model). Both sets of performance metrics showed that on-axis viewing was better with the LCD than with the CRT, while off-axis viewing was significantly better with the CRT. Off-axis LCD viewing of radiographs can therefore degrade observer performance compared with a CRT.
The purpose of this work was to study how the pixel size of digital detectors can affect shape determination of microcalcifications in mammography. Screen-film mammograms containing microcalcifications clinically proven to be indicative of malignancy were digitised at 100 lines/mm using a high-resolution Tango drum scanner. Forty microcalcifications were selected to cover an appropriate range of the sizes, shapes and contrasts typically found in malignant cases. Based on the measured MTF and NPS of the combined screen-film and scanner system, these digitised images were filtered to simulate images acquired with a square sampling pixel size of 10 μm x 10 μm and a fill factor of one. To simulate images acquired with larger pixel sizes, these finely sampled images were re-binned to yield a range of effective pixel sizes from 20 μm up to 140 μm. An alternative forced-choice (AFC) observer experiment was conducted with eleven observers for this set of digitised microcalcifications to determine how pixel size affects the ability to discriminate shape. Observer score was found to increase with decreasing pixel size down to 60 μm (p<0.01), below which no significant advantage was obtained from smaller pixel sizes, owing to the excessive relative noise per pixel. The relative gain in shape-discrimination ability at smaller pixel sizes was larger for microcalcifications that were smaller than 500 μm and circular.
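The re-binning step, simulating larger effective pixels from finely sampled images, amounts to block averaging; a minimal sketch:

```python
import numpy as np

def rebin(img, factor):
    """Simulate a larger effective pixel size by block-averaging:
    e.g. a 10 um image rebinned by factor 6 yields 60 um pixels."""
    h, w = img.shape
    h2, w2 = h - h % factor, w - w % factor   # crop to a multiple of factor
    blocks = img[:h2, :w2].reshape(h2 // factor, factor, w2 // factor, factor)
    return blocks.mean(axis=(1, 3))

img = np.arange(36, dtype=float).reshape(6, 6)
print(rebin(img, 3).shape)  # (2, 2)
print(rebin(img, 3))        # [[ 7. 10.] [25. 28.]]
```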
We have previously utilized lumpy object models and simulated imaging systems in conjunction with the ideal observer to compute figures of merit for hardware optimization. In this paper, we describe the development of the methods and phantoms necessary to validate or experimentally carry out these optimizations. Our study was conducted on a four-camera small-animal SPECT system that employs interchangeable pinhole plates to operate under a variety of pinhole configurations and magnifications (representing optimizable system parameters). We developed a small-animal phantom capable of producing random backgrounds for each image sequence. The task chosen for the study was the detection of a 2 mm diameter sphere within the phantom-generated random background. A total of 138 projection images were used, half of which included the signal. As our observer, we employed the channelized Hotelling observer (CHO) with Laguerre-Gauss channels. The signal-to-noise ratio (SNR) of this observer was used to compare different system configurations. Results indicate agreement between experimental and simulated data, with higher detectability found for multiple-camera, multiple-pinhole, and high-magnification systems, although mixtures of magnifications often outperformed systems employing a single magnification. This work will serve as a basis for future studies of system hardware optimization.
The objective of this study was to evaluate a new Cardiac Enhancement Filter (CEF) for noise reduction and edge enhancement in computed tomography cardiac angiography examinations. The CEF is an adaptive noise-reduction filter designed for near-real-time operation. Using a CT performance phantom, standard measurements of image quality, including spatial resolution, low-contrast resolution, and image noise, were assessed with and without the CEF. Quantitative assessment showed slightly improved spatial resolution at 50% and 10% modulation, similar low-contrast resolution, and significantly lower image noise (by up to 38%). Two patient datasets were used for the clinical evaluation of the filter, in which it reduced image noise by 13 to 22% without loss of vessel sharpness or introduction of new image artifacts. The results of this initial testing are encouraging, but additional investigations are required to further assess the filter's clinical utility.
Lower x-ray exposures are commonly used in radiographic exams to reduce the patient radiation dose. An unwanted side effect is that the noise level increases as the exposure level is reduced. Image-enhancement techniques that increase image contrast, such as sharpening and dynamic-range compression, tend to increase the appearance of noise. A Gaussian-filter-based noise-suppression algorithm using an adaptive soft threshold has been designed to reduce the noise appearance in low-exposure images. The advantage of this technique is that the algorithm is signal-dependent and therefore only affects image areas with a low signal-to-noise ratio. Computed radiography images captured at lower exposure levels were collected from clinical sites and used as controls in an observer study. The noise-suppression algorithm was applied to each control image to generate the test images. Hardcopy printed-film versions of the control and test images were presented side by side on a film alternator to six radiologists, who rated them using a 9-point diagnostic-quality rating scale and a 7-point delta-preference rating scale. The results showed that the algorithm reduced noise appearance, which was preferred, while preserving diagnostic image quality. This paper describes the noise-suppression algorithm and reports the results of the observer study.
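The general idea of soft-threshold noise suppression can be sketched as a base/residual decomposition with a noise-scaled threshold. The frequency-domain Gaussian low-pass, the MAD noise estimate, and the parameters `sigma_psf` and `k` below are our illustrative assumptions, not the actual design of the algorithm described above:

```python
import numpy as np

def soft_threshold(x, t):
    """Soft thresholding: shrink magnitudes toward zero by t."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def suppress_noise(img, sigma_psf=1.5, k=1.0):
    """Sketch of signal-dependent noise suppression: split the image into
    a Gaussian low-pass base and a high-pass residual, soft-threshold the
    residual at a multiple of the estimated noise level, and recombine."""
    h, w = img.shape
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    lp = np.exp(-2.0 * (np.pi * sigma_psf) ** 2 * (fx**2 + fy**2))
    base = np.real(np.fft.ifft2(np.fft.fft2(img) * lp))
    resid = img - base
    # robust noise estimate from the residual (median absolute deviation)
    noise = 1.4826 * np.median(np.abs(resid - np.median(resid)))
    return base + soft_threshold(resid, k * noise)

rng = np.random.default_rng(3)
clean = np.outer(np.hanning(64), np.hanning(64)) * 100.0  # smooth test image
noisy = clean + rng.normal(0.0, 5.0, clean.shape)
out = suppress_noise(noisy)
print(np.std(noisy - clean), np.std(out - clean))  # error is reduced
```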
Flat panel detectors have a large number of parameters that affect fluoroscopy image quality. Scintillator thickness is very important and can be changed in fabrication. In general, increasing thickness degrades the MTF through spatial blurring but improves conversion efficiency. This design trade-off should be optimized for visualization. Using quantitative experimental and simulation techniques, we simulated three detector models, including a direct detector and two indirect detectors with different scintillator thicknesses (160 and 210 mg/cm2), and displayed each "acquired" pixel directly on the screen without further processing in a sequence of fluoroscopy images. To measure image quality, we investigated the detection of two interventional devices: stents and guide wires. Human observers and a channelized human observer model both demonstrated that detection depended on detector scintillator thickness and the type of interventional device. Detection performance improved with the thicker scintillator, especially at low exposure. A simulated direct detector gave less blurring and even better detection performance for stent detection. The thick indirect detector gave contrast sensitivities equal to those of the direct detector for guide-wire detection. An ideal observer model gave trends similar to those for human observers, even though it does not account for many features of human viewing of image sequences and gives extraordinarily high detection SNR values because it uses all images in a sequence.
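The channelized observer figure of merit can be sketched as a channelized Hotelling detection SNR under a white-noise assumption. The difference-of-Gaussian channels, blob-shaped device signals, and noise levels below are illustrative assumptions, not the authors' model:

```python
import numpy as np

def gaussian_blob(n, sigma):
    """Centered isotropic Gaussian blob on an n x n grid."""
    yy, xx = np.mgrid[:n, :n]
    r2 = (yy - n // 2) ** 2 + (xx - n // 2) ** 2
    return np.exp(-r2 / (2.0 * sigma**2))

def channel_snr(signal, noise_sigma, channels):
    """Channelized Hotelling SNR in white noise:
    SNR^2 = v^T Kv^{-1} v with v = U^T s and Kv = sigma^2 U^T U."""
    v = channels.T @ signal.ravel()
    Kv = noise_sigma**2 * (channels.T @ channels)
    return float(np.sqrt(v @ np.linalg.solve(Kv, v)))

n = 32
# difference-of-Gaussian channel set (an illustrative, commonly used choice)
channels = np.stack(
    [(gaussian_blob(n, s) - gaussian_blob(n, 0.66 * s)).ravel() for s in (1, 2, 4, 8)],
    axis=1,
)

device = gaussian_blob(n, 1.5)  # sharp device signal (thin scintillator / direct)
snr_thin = channel_snr(device, noise_sigma=1.0, channels=channels)

# thicker scintillator: same total signal but more blur, and lower noise
blurred = gaussian_blob(n, 2.5)
blurred *= device.sum() / blurred.sum()
snr_thick = channel_snr(blurred, noise_sigma=0.6, channels=channels)
```

The trade-off the abstract describes appears directly in this figure of merit: blur spreads the signal across channels (lowering `v`), while the improved conversion efficiency lowers `noise_sigma`.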
PURPOSE: To propose a method for Parametric Statistical Weights (PSW) estimations and analyze its statistical impact in Computer-Aided Diagnosis Imaging Systems based on a Relative Similarity (CADIRS) classification approach.
MATERIALS AND METHODS: A multifactor statistical method was developed and applied for Parametric Statistical Weights calculations in CADIRS. The implemented PSW method was used to estimate the statistical impact of PSW when applied to a clinically validated breast ultrasound digital database of 332 patient cases with biopsy-proven findings. The method is based on the assumption that each parameter used in the Relative Similarity (RS) classifier contributes to the deviation of the diagnostic prediction proportionally to the normalized value of its coefficient of multiple regression. The Relative Similarity values calculated by CADIRS with and without PSW were statistically estimated, compared, and analyzed (on a subset of cases) using classic Receiver Operating Characteristic (ROC) analysis methods.
RESULTS: When the CADIRS classification scheme was augmented with PSW, the calculated Relative Similarity values were on average 2-5% higher. Numeric estimation of the PSW allowed decomposition of the statistical significance of each component (factor) and of its impact on similarity to the (biopsy-proven) diagnostic results.
CONCLUSION: Parametric Statistical Weights in Computer-Aided Diagnosis Imaging Systems based on a Relative Similarity classification approach can be successfully applied to enhance overall classification (including scoring) outcomes. For the analyzed cohort of 332 cases, the application of PSW increased Relative Similarity to the retrieved templates with known findings by 2-5% on average.
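The core assumption, weights proportional to the normalized magnitudes of multiple-regression coefficients, can be sketched as follows. The weighted inverse-distance similarity is an illustrative stand-in, not the CADIRS Relative Similarity formula:

```python
import numpy as np

def parametric_weights(X, y):
    """Weights proportional to the normalized magnitudes of the coefficients
    from an ordinary least-squares fit of the outcome on the features."""
    design = np.column_stack([np.ones(len(X)), X])  # intercept + features
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    w = np.abs(beta[1:])  # drop the intercept
    return w / w.sum()

def weighted_similarity(a, b, w):
    """Similarity of two feature vectors as a weighted inverse distance
    (an illustrative definition)."""
    return 1.0 / (1.0 + np.sqrt(np.sum(w * (a - b) ** 2)))

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
y = 2.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(0, 0.1, 200)  # feature 3 irrelevant
w = parametric_weights(X, y)
sim_same = weighted_similarity(X[0], X[0], w)  # identical vectors
```

Features with little predictive value receive near-zero weight, so they stop diluting the similarity score, which is the mechanism by which such weighting can raise similarity to correctly matched templates.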
Fluoroscopically guided interventional procedures require adequate image quality for medical decision making and eye-hand coordination. Other activities requiring light must be performed while the operator views the images. The lighting environment of a typical modern interventional cardiology laboratory was photographically documented during the performance of several interventional cardiology procedures. This laboratory was originally constructed in 1990, and reequipped several times. No one focused substantial attention on lighting design at any time. Key elements of the room design were simulated using a commercial 3-D rendering program. Matching photographs of the actual room with the simulation provides a semi-quantitative estimate of the properties of the real room and its equipment. The clinical images on the viewing monitors are overlaid by a substantial degree of diffuse reflections as well as a number of direct and indirect specular reflections from other light sources in the laboratory. Their intensity was greater on those monitors that incorporated a glass protective plate in front of the LCD displays. The effect of reflections on performance is quantitatively unknown. Extinguishing offending lights cannot be done without interfering with other critical aspects of the procedure. However, simple changes in room architecture and equipment should substantially reduce reflections.
A visually lossless compression algorithm for digitized radiographs, which predicts the maximum contrast that wavelet subband quantization distortions can exhibit in the reconstructed image such that the distortions are visually undetectable, is presented. Via a psychophysical experiment, contrast thresholds were measured for the detection of 1.15-18.4 cycles/degree wavelet subband quantization distortions in five digitized radiographs; results indicate that digitized radiographs impose image- and frequency-selective effects on detection. A quantization algorithm is presented which predicts the thresholds for individual images based on a model of visual masking. When incorporated into JPEG-2000 and applied to a suite of images, results indicate that digitized radiographs can be compressed in a visually lossless manner at an average compression ratio of 6.25:1, with some images requiring visually lossless ratios as low as 4:1 and as great as 9:1. The proposed algorithm thus yields images that require the minimum bit-rate such that the reconstructed images are visually indistinguishable from the original images. The primary utility of the proposed algorithm is its ability to provide image-adaptive visually lossless compression, thereby avoiding overly conservative or overly aggressive compression.
Image quality assessment plays a crucial role in many applications. Since the ultimate receivers in most image processing environments are humans, objective measures of quality that correlate with subjective perception are actively sought. Limited success has been achieved in deriving robust quantitative measures that can automatically and efficiently predict perceived image quality. The majority of structural similarity techniques are based on aggregation of local statistics within a local window. Choosing window sizes that produce results compatible with visual perception is a challenging task with these methods. This paper introduces an intuitive metric that exploits the dominance of Fourier phase over magnitude in images. The metric is based on cross-correlation of phase images to assess image quality. Since the phase captures structural information, a phase-based similarity metric should closely mimic visual perception. With the availability of multi-dimensional Fourier and wavelet transforms, this metric can be directly used to assess the quality of multi-dimensional images.
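A minimal sketch of a phase-only similarity measure of the kind described, using the peak of the phase-correlation surface; this concrete form is an assumption, since the abstract does not give the exact formula:

```python
import numpy as np

def phase_similarity(a, b, eps=1e-8):
    """Peak of the phase-only (phase-correlation) surface between two images.
    Near 1.0 for structurally identical images (even if circularly shifted),
    near 0 for unrelated content."""
    A, B = np.fft.fft2(a), np.fft.fft2(b)
    # whiten both spectra to unit magnitude so only phase is compared
    cross = A * np.conj(B) / (np.abs(A) * np.abs(B) + eps)
    return float(np.real(np.fft.ifft2(cross)).max())

rng = np.random.default_rng(2)
img = rng.random((64, 64))
same = phase_similarity(img, img)
shifted = phase_similarity(img, np.roll(img, 5, axis=1))
unrelated = phase_similarity(img, rng.random((64, 64)))
```

Because the magnitude is discarded, the score responds to structural agreement rather than contrast or brightness, and the shift invariance illustrates why phase carries the structural information the abstract emphasizes.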
Using data from a clinical trial of a commercial CAD system for lung cancer detection, we separately analyzed the location, if any, selected on each film by 15 radiologists as they interpreted chest radiographs, 160 of which did not contain cancers. On the cancer-free cases, the radiologists showed a statistically significant difference in decisions when using CAD (p-value 0.002). Average specificity was 78% without computer assistance and 73% with computer assistance. In a clinical trial with CAD for lung cancer detection there are multiple machine false positives, and on chest radiographs of older current or former smokers there are many scars that can appear like cancer to the interpreting radiologist. We report on the radiologists' false positives and on the effect of machine false-positive detections on observer performance in cancer-free cases. The only significant difference between radiologists occurred when they changed an initial true-negative decision to a false positive (p-value less than 0.0001): the average confidence level increased, on a scale from 0.0 to 100.0, from 16.9 (high confidence of non-cancer) to 53.5 (moderate confidence that cancer was present). We also report on the consistency of misinterpretation by multiple radiologists when they interpret cancer-free radiographs of smokers in the absence of CAD prompts. When multiple radiologists selected the same false-positive location, there was usually a definite abnormality that triggered this response. The CAD identifies areas that are of sufficient concern for cancer that a radiologist will switch from a correct decision of no cancer to marking a false positive: a previously overlooked but suspicious-appearing cancer-free area, often one that has been marked by another radiologist without the use of the CAD prompt. This work has implications for what should be accepted as ground truth in ROC studies: one might ask what a false-positive response means when the finding clinically looks like cancer but simply is not cancer, based on long-term follow-up or histology.
Although full-field digital mammography (FFDM) systems are currently being used to acquire mammograms in digital format, the digital displays used to present these images are less than ideal compared to traditional film-screen display. The resolution of softcopy displays is lower than that of film, and certain properties of the softcopy displays themselves (e.g., MTF) are less than optimal compared to film. We developed methods to compensate for some of these softcopy display deficiencies, based on careful physical characterization of the displays and image processing software to correct the deficiencies. An observer study was done to test the effectiveness of these techniques. A series of FFDM images acquired on devices from different manufacturers was shown to six radiologists. Images were displayed at reduced resolution but could be magnified to full size. A window could be activated that 1) brought the image detail within the window to full resolution, 2) autoranged the displayed gray levels, and 3) corrected for the non-isotropic MTF of the CRT display. This study examined the way the observers utilized the viewing tools and their overall performance.
Joseph Ken Leader, Denise Chough M.D., Ronald J. Clearfield M.D., Marie A. Ganott M.D., Christiane Hakim M.D., Lara Hardesty M.D., Betty Shindel M.D., Jules H. Sumkin M.D., John M. Drescher, et al.
Proceedings Volume Medical Imaging 2005: Image Perception, Observer Performance, and Technology Assessment, (2005) https://doi.org/10.1117/12.592823
Radiologists' performance reviewing and rating breast cancer screening mammography exams using a telemammography system was evaluated and compared with the actual clinical interpretations of the same exams. Mammography technologists from three remote imaging sites transmitted 245 exams, which they (the technologists) believed needed additional procedures (termed "recall"), to a central site staffed by radiologists. Current exam image data and non-image data (i.e., the technologist's text message, the technologist's graphic marks, the patient's prior report, and Computer-Aided Detection (CAD) results) were transmitted to the central site and displayed on three high-resolution portrait monitors. Seven radiologists interpreted ("recall" or "no recall") the exams using the telemammography workstation in three separate multi-mode studies. The mean telemammography recall rates ranged from 72.3% to 82.5%, while the actual clinical recall rates ranged from 38.4% to 42.3% across the three studies. Mean kappa of agreement ranged from 0.102 to 0.213, and mean percent agreement ranged from 48.7% to 57.4% across the three studies. Eighty-seven percent of the disagreements occurred when the telemammography interpretation resulted in a recommendation to recall and the clinical interpretation resulted in a recommendation not to recall. The poor agreement between the telemammography and clinical interpretations may indicate a critical dependence on images from prior screening exams rather than on any text-based information. The technologists were sensitive, if not specific, to the mammographic features and changes that may lead to recall. Using the telemammography system, the radiologists were able to reduce the recalls recommended by the technologists by approximately 25 percent.
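The agreement statistics reported above are standard Cohen's kappa and percent agreement for binary recall decisions. A minimal sketch with toy ratings (the values below are illustrative, not the study data):

```python
import numpy as np

def cohens_kappa(r1, r2):
    """Cohen's kappa for two binary raters (1 = recall, 0 = no recall)."""
    r1, r2 = np.asarray(r1), np.asarray(r2)
    po = np.mean(r1 == r2)              # observed percent agreement
    p1, p2 = r1.mean(), r2.mean()
    pe = p1 * p2 + (1 - p1) * (1 - p2)  # agreement expected by chance
    return (po - pe) / (1 - pe)

tele = np.array([1, 1, 1, 1, 1, 0, 1, 1, 0, 1])    # telemammography reads
clinic = np.array([1, 0, 0, 1, 0, 0, 1, 0, 0, 1])  # clinical reads
kappa = cohens_kappa(tele, clinic)
```

Note that in this toy example every disagreement is a telemammography "recall" paired with a clinical "no recall", mirroring the asymmetric disagreement pattern (87%) reported in the abstract; kappa stays low even when raw agreement looks moderate, because chance agreement is subtracted out.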
This paper presents the outcome of an initial study comparing softcopy displays (liquid crystal displays) with 8-bit contrast resolution to those with 11-bit contrast resolution. Of particular interest were decreased detection of objects and the appearance of artifacts such as false contours as a result of quantization. The study was based on simulation of objects such as squares, discs, and Gaussian nodules at different amplitudes on uniform backgrounds. We also placed simple objects such as disks and Gaussian objects into a clinical image. These objects were displayed on the two different types of displays to untrained observers. The objects were also analyzed quantitatively in their digital form with a computer program made for display and analysis. The study found that subtle objects can be missed and artifacts such as false contours can occur, depending on signal amplitude and noise. A comprehensive observer study is necessary to confirm and refine these results.
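The false-contour effect of reduced contrast resolution can be sketched by quantizing a shallow gray-level ramp to 8 and 11 bits; the ramp amplitude here is an arbitrary illustrative choice:

```python
import numpy as np

def quantize(img, bits):
    """Quantize a [0, 1) image to 2**bits uniformly spaced gray levels."""
    levels = 2 ** bits
    return np.floor(img * levels) / levels

# shallow linear ramp: a low-amplitude gradient prone to false contouring
ramp = np.tile(np.linspace(0.40, 0.44, 512), (64, 1))

err8 = np.abs(quantize(ramp, 8) - ramp).max()
err11 = np.abs(quantize(ramp, 11) - ramp).max()

# number of distinct displayed gray levels across the ramp:
# each level boundary is a potential visible contour
steps8 = len(np.unique(quantize(ramp, 8)))
steps11 = len(np.unique(quantize(ramp, 11)))
```

At 8 bits the ramp collapses to only about a dozen discrete bands, each band edge a candidate false contour, while at 11 bits the steps are roughly eight times finer and correspondingly harder to see; whether they are visible on a given display is exactly the perceptual question the study addresses.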
Lossy-compressed movie files with small file sizes can be useful when network distribution of movie files is considered. We chose three kinds of coronary stenosis movies with different motion speeds as test objects: movies at slow, normal, and fast heart rates. MPEG-1, DivX5.11, WMV9 (Windows Media Video 9), and WMV9-VCM (Windows Media Video 9-Video Compression Manager) movies were made from the three kinds of AVI-format movies with different motion speeds. Five kinds of movies, the four compressed versions and the uncompressed AVI (used in place of the DICOM format), were evaluated by Thurstone's method. The evaluation factors were sharpness, granularity, contrast, and a comprehensive evaluation. In the virtual bradycardia movie, AVI received the best evaluation on all factors except granularity. In the virtual normal movie, a different compression technique excelled on each evaluation factor. In the virtual tachycardia movie, MPEG-1 received the best evaluation on all factors except contrast. The best compression format thus depends on the speed of the movie, because the compression algorithms differ; we believe this reflects differences in inter-frame compression. Movie compression algorithms combine inter-frame compression with intra-frame compression. Because each compression method affects the image differently, the relation between the compression algorithms and our results needs further examination.
Summation and axial slab reformation (ASR) of thin-section CT datasets are increasingly used to improve productivity in the face of the data explosion and to improve image quality. We hypothesized that summation or ASR can substitute for primary reconstruction (PR) performed directly from raw projection data. PR datasets (5-mm section thickness, 20% overlap) were reconstructed in 150 abdominal studies. Summation and ASR datasets of the same image positions and nominal section thickness were calculated from thin-section reconstruction images (2-mm section thickness, 50% overlap). The median root-mean-square error between PR and summation (9.55; 95% CI: 9.51, 9.59) was significantly greater than that between PR and ASR (7.12; 95% CI: 7.08, 7.17) (p < 0.0001). Three radiologists independently analyzed 2,000 pairs of PR and test images (PR [as control], summation, or ASR) to determine whether summation or ASR could be distinguished from PR. Multireader-multicase ROC analysis showed that the Az value was 0.597 (95% CI: 0.552, 0.642) for the discrimination between PR and summation, and 0.574 (95% CI: 0.529, 0.619) for the discrimination between PR and ASR. The difference between these two Az values was not significant (p = 0.41). Radiologists can distinguish between the PR image and the summation or ASR image in abdominal studies, although this discrimination performance is only slightly better than random guessing. Image fidelity of ASR is higher than that of summation if PR is regarded as the reference standard.
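The fidelity metric used above is the root-mean-square error. A toy sketch of RMSE and of why averaging overlapping thin sections into a slab lowers noise; the synthetic stack below is an assumption, not the study's images:

```python
import numpy as np

def rmse(a, b):
    """Root-mean-square error between two images."""
    return float(np.sqrt(np.mean((a - b) ** 2)))

rng = np.random.default_rng(7)
truth = np.full((32, 32), 50.0)
# ten overlapping thin sections of the same anatomy, each with independent noise
thin = truth + rng.normal(0, 10, size=(10, 32, 32))

single = rmse(thin[0], truth)                    # one thin section
slab = rmse(thin[:5].mean(axis=0), truth)        # plain 5-section average ("summation")
```

Averaging n sections reduces uncorrelated noise roughly by 1/sqrt(n); ASR differs from plain summation in how the sections are weighted across the slab, which is one plausible reason its RMSE against PR is lower in the study.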
We developed an image-based computer-assisted diagnosis system for benign paroxysmal positional vertigo (BPPV) that consists of a balance control system simulator, a 3D eye movement simulator, and a method for extracting the nystagmus response directly from an eye movement image sequence. In the system, the causes and conditions of BPPV are estimated by searching a database for records matching the nystagmus response extracted from the observed eye image sequence of the patient with BPPV. The database includes the nystagmus responses for simulated eye movement sequences. The eye movement velocity is obtained by using the balance control system simulator, which allows us to simulate BPPV under various conditions such as canalithiasis, cupulolithiasis, number of otoconia, otoconium size, and so on. The eye movement image sequence is then displayed on the CRT by the 3D eye movement simulator. The nystagmus responses are extracted from the image sequence by the proposed method and stored in the database. To enhance diagnosis accuracy, the nystagmus response for a newly simulated sequence is matched with that for the observed sequence, and from the matched simulation conditions the causes and conditions of BPPV are estimated. We apply our image-based computer-assisted diagnosis system to two real eye movement image sequences from patients with BPPV to show its validity.
In the field of oriental traditional medicine, precise analysis of tongue and skin conditions such as color, moisture, and bulging or swelling, and of pulse conditions such as palpation amplitude and stiffness, is very important for reaching a reliable diagnosis. Such "live" inspection in the presence of the patient is most desirable. However, if the same information can be obtained "virtually" using information technologies, physicians may diagnose their patients at a distance.
In twenty-five patients of oriental traditional medicine, specific examinations from oriental traditional medicine were performed in addition to the usual physical examination of modern medicine. Patients were instructed to photograph their tongue and/or skin with a high-resolution digital camera or a CCD camera attached to a cell phone and to send the images to the clinic from their homes through the internet. We then personally analyzed both the "virtual" and "live" images.
Regarding color and moisture information from images of the tongue and skin, there was little difference between the "live" and "virtual" images. Regarding three-dimensional properties such as bulging or swelling of the tongue and/or skin surface, it was too hard to judge precisely from two-dimensional information.
The images from "virtual" inspection were considered sufficient for diagnosing the patient's condition, even for oriental traditional medicine. This suggests that tele-diagnosis can be reliably applied even in the field of complementary and alternative medicine. For further study, we plan to evaluate three-dimensional information of the tongue and/or skin surface and haptic information from pulse palpation.
In our previous method, thresholding for feature extraction was one of the biggest problems. We had developed two methods for obtaining the optimal threshold, but they are not sufficient to apply to all the cancers in CT. In this paper, we prepared about nine thresholds and extracted features using each of these thresholds. We then applied a discriminant method using a subspace derived from the extracted feature matrices.
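The multi-threshold idea, extracting features at a fixed set of thresholds rather than committing to one "optimal" threshold, can be sketched as follows. The specific features (above-threshold area and mean intensity) and the threshold range are illustrative assumptions:

```python
import numpy as np

def multithreshold_features(roi, thresholds):
    """Extract a simple (area, mean intensity) feature pair of the
    above-threshold region at each of several fixed thresholds."""
    feats = []
    for t in thresholds:
        mask = roi >= t
        area = int(mask.sum())
        mean = float(roi[mask].mean()) if area else 0.0
        feats.extend([area, mean])
    return np.array(feats)

rng = np.random.default_rng(3)
roi = rng.normal(100, 15, size=(16, 16))       # toy candidate region in CT
thresholds = np.linspace(80, 160, 9)           # nine thresholds, as in the paper
features = multithreshold_features(roi, thresholds)
```

Stacking such vectors over training cases yields the feature matrices from which a discriminant subspace can then be derived (e.g., by principal component analysis per class).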
Objective: To assess the performance of a computer-aided diagnosis (CAD) system for automatic detection of pulmonary nodules on CT scans compared to single and double reading by radiologists. Materials and methods: A nodule detection CAD system (Siemens LungCare NEV VB10) was applied to low-dose CT (LDCT) scans of nine patients with pulmonary metastases and compared to the findings of three radiologists; standard-dose CT (SDCT) was acquired simultaneously to establish ground truth. The study design was approved by the Institutional Review Board and the appropriate German authorities. Ground truth was established by fusing the sets of nodules detected by three radiologists reading the LDCT and SDCT scans independently with the CAD results. Special attention was paid to the size of nodules detected only by CAD compared to the size of all detected nodules. Results: An average sensitivity of 54% (range 51-55%) was observed for single reading by one radiologist. The CAD system demonstrated a similar sensitivity of 55%. Double reading by two radiologists increased sensitivity to an average of 67% (range 67-68%); the difference from single reading was significant (p < 0.001). Use of CAD as a second opinion after single reading increased sensitivity to 79% (range 77-81%), which proved to be significantly better than double reading (p < 0.001). Eleven percent of nodules larger than 4 mm were detected only by CAD. Conclusion: CAD as a second reader offered a significant increase in sensitivity compared to conventional double reading. Therefore, CAD is a valuable second opinion for the detection of pulmonary nodules.
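Double reading and reader-plus-CAD sensitivities combine per-nodule detections by union. A toy sketch with statistically independent readers; note that real readers' misses are correlated, which is why the observed double-reading sensitivity (67%) is lower than this independence model predicts, while CAD errors are less correlated with the readers':

```python
import numpy as np

def sensitivity(detected, truth):
    """Fraction of true nodules that were detected (boolean per-nodule arrays)."""
    return detected[truth].mean()

rng = np.random.default_rng(5)
n = 200
truth = np.ones(n, dtype=bool)         # all items are true nodules
reader1 = rng.random(n) < 0.54         # single-reader sensitivity ~54%
reader2 = rng.random(n) < 0.54
cad = rng.random(n) < 0.55             # CAD alone ~55%

single = sensitivity(reader1, truth)
double = sensitivity(reader1 | reader2, truth)   # union of two readers
with_cad = sensitivity(reader1 | cad, truth)     # reader + CAD second opinion
```

Because the combined read is a superset of the single read, its sensitivity can never be lower; how much it gains depends entirely on how decorrelated the second opinion's misses are from the first reader's.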
Parallel magnetic resonance imaging through sensitivity encoding using multiple receiver coils has emerged as an effective tool to reduce imaging time or improve image quality. Reconstructed image quality is limited by the noise in the acquired k-space data, inaccurate estimation of the sensitivity map, and the ill-conditioned nature of the coefficient matrix. Tikhonov regularization is currently the most popular method to address the ill-conditioning, and the selection of the regularization map and the regularization parameter is very important. The Perceptual Difference Model (PDM) is a quantitative image quality evaluation tool that has been successfully applied to a variety of MR applications. High correlation between human ratings and the PDM score shows that PDM is suitable for evaluating image quality in parallel MR imaging. Applying PDM, we compared four methods of selecting the regularization map and four methods of selecting the regularization parameter. We found that the generalized series (GS) method for selecting the regularization map, together with a spatially adaptive method for selecting the regularization parameter, gives the best image reconstruction. PDM also works as a quantitative image quality index to optimize two important free parameters in the spatially adaptive method. We conclude that PDM is an effective tool for designing and optimizing reconstruction methods in parallel MR imaging.
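Tikhonov regularization with a regularization map `x0` (prior image) and parameter `lam` can be sketched on a toy ill-conditioned system; the matrix below is a stand-in for the coil-sensitivity encoding matrix, not an MR model:

```python
import numpy as np

def tikhonov_solve(E, y, lam, x0=None):
    """Tikhonov-regularized least squares:
    x = argmin ||E x - y||^2 + lam * ||x - x0||^2,
    where x0 is the regularization map (zero image if omitted)."""
    n = E.shape[1]
    x0 = np.zeros(n) if x0 is None else x0
    A = E.conj().T @ E + lam * np.eye(n)
    b = E.conj().T @ y + lam * x0
    return np.linalg.solve(A, b)

rng = np.random.default_rng(11)
x_true = rng.normal(size=50)
E = rng.normal(size=(60, 50)) * np.geomspace(1.0, 1e-5, 50)  # ill-conditioned columns
y = E @ x_true + rng.normal(0, 0.05, 60)

x_plain = np.linalg.lstsq(E, y, rcond=None)[0]  # unregularized solve
x_reg = tikhonov_solve(E, y, lam=1e-2)

err_plain = np.linalg.norm(x_plain - x_true)
err_reg = np.linalg.norm(x_reg - x_true)
```

The unregularized solve amplifies noise along the small singular directions, while the penalty suppresses them at the cost of a bias toward `x0`; choosing `lam` (globally or, as in the spatially adaptive method, per location) trades those two errors, which is exactly the choice the PDM is used to evaluate.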
Meshes are currently used to model objects such as human organs and other structures. However, if they have a large number of triangles, their rendering times may not be adequate for interactive visualization, a highly desirable feature in some diagnosis (or, more generally, decision) scenarios where the choice of adequate views is important. In this case, a possible solution consists in showing a simplified version while the user interactively chooses the viewpoint, and then a fully detailed version of the model to support its analysis. To tackle this problem, simplification methods can be used to generate less complex versions of meshes. While several simplification methods have been developed and reported in the literature, only a few studies compare them concerning the perceived quality of the resulting simplified meshes.
This work describes an experiment conducted with human observers to compare three different simplification methods used to simplify mesh models of the lungs. We intended to study whether any of these methods allows better perceived quality for the same simplification rate.
A protocol was developed in order to measure these aspects. The results presented were obtained from 32 human observers. The comparison between the three mesh simplification methods was first performed through an Exploratory Data Analysis and the significance of this comparison was then established using other statistical methods. Moreover, the influence on the observers' performances of some other factors was also investigated.
The JNDmetrix human visual system model developed by the Sarnoff Corporation is used to predict observer performance on visual discrimination tasks. It begins with two paired images as the initial input and ends with a JND map that shows the magnitude and spatial location of visible differences between the two input images. The goal of this experiment was to determine if the location and magnitude of JNDs identified by the model corresponded to visual search parameters of the human observer. Radiologists searched 20 mammograms with multiple masses and microcalcifications of different subtleties as their eye-position was recorded. The JNDmetrix model analyzed the same images and identified, with JNDs, discriminable areas on the images. Lesions with lower subtlety ratings were detected later in search than more obvious ones (FNs later than TPs). When the subtler lesions were detected (TP) dwell time was longer than more obvious lesions, but the FNs received shorter total dwell. The subtler lesions when detected (TP) received more total fixation clusters than more obvious ones, but the FNs received fewer. The correlation between the model JNDs and the eye-position parameters was high. Understanding the influence of lesion subtlety on search may help us better model and predict human observer performance.
We carried out an observer performance study evaluating 16 radiologists without and with a computer-aided diagnostic (CAD) scheme for determining the likelihood of malignancy of lung nodules on HRCT, using a database of 28 primary lung cancers and 28 benign nodules. The results showed that radiologists' performance was improved with the CAD scheme, and that their performance with the CAD scheme was better than that of either the radiologists alone or the computer alone. Our purpose in this study was to analyze radiologists' responses with the CAD scheme in the task of differentiating between malignant and benign nodules on HRCT. Our results indicated that the average change in radiologists' ratings (the difference between their ratings with the CAD scheme and their initial ratings) was strongly related to (A) the likelihood of malignancy (the computer output) and (B) the difference of their initial ratings from the computer outputs, with correlation coefficients of 0.93 and 0.90, respectively. Detailed analysis showed that radiologists changed the majority of their ratings in agreement with the computer results, and the majority of these changes contributed to the improvement in their performance. They were able to maintain some of their correct ratings despite incorrect computer results, and for some cases they increased their confidence in their judgments above the computer output. Thus, the improvement in radiologists' performance above the computer performance was produced by the synergistic effect of the radiologists' decision making and the computer outputs.
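Correlations of the kind reported above (between rating changes and computer outputs) are ordinary Pearson correlation coefficients. A minimal sketch, using illustrative values rather than the study's actual data:

```python
def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length
    sequences, computed from centered sums of products."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Illustrative values only (not study data): computer likelihood-of-
# malignancy outputs and the corresponding changes in reader ratings.
computer_output = [0.9, 0.2, 0.7, 0.4, 0.8]
rating_change = [15, -20, 10, -5, 12]
r = pearson_r(computer_output, rating_change)
```

A coefficient near 1, as in the study's values of 0.93 and 0.90, indicates that readers moved their ratings largely in the direction suggested by the computer.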
This work presents a system for realistic computer simulation of quantification methods based on three-dimensional (3D) medical images. The system is built on accurate computer models of anatomical parts derived from 3D medical images. The models can be realistically manipulated in the virtual domain, reflecting actual scan-rescan situations, and the manipulated object can be used to reconstruct 3D images. Furthermore, the model can be used to simulate observer segmentations of the reconstructed anatomy. Because segmentations are fundamental to comprehensive quantification of anatomical structures, quantification performance can be derived from the simulated segmentations. The proposed simulation system has been used to predict joint space width variations between two different imaging protocols without the need to analyze scans of several volunteers. The system has also been used to predict intra-observer performance, helping in the selection of the best imaging protocol. Performance results and a comparison with actual scan-rescan performance evaluations are presented.
Thirty images with added simulated pathological lesions at two different dose levels (100% and 10% dose) were evaluated with the free-response forced error experiment by nine experienced radiologists. The simulated lesions were classified according to four parameters: position within the lumbar spine, the possibility of performing a symmetrical (left-right) comparison, lesion contrast, and the complexity of the background surrounding the lesion. The detectability of each lesion was calculated as the fraction of radiologists who successfully detected the lesion before making a false-positive error. The influence of each of the four parameters on lesion detectability was investigated. The results show that lesion contrast is the most important factor for detectability. Since the dose level had a limited effect on detectability, large dose savings can be made without reducing the detectability of pathological lesions in lumbar spine radiography.
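The detectability measure described (the fraction of readers who find a lesion before their first false positive) can be sketched as follows. The mark lists, lesion identifiers, and the "FP" sentinel are hypothetical, not the study's scoring software:

```python
def lesion_detectability(reader_marks, lesion_ids):
    """Free-response forced-error scoring sketch. Each reader marks
    locations in decreasing order of confidence until the first false
    positive ("FP"). A lesion counts as detected for a reader if its
    mark precedes that reader's first FP. Returns, per lesion, the
    fraction of readers who detected it."""
    counts = {lid: 0 for lid in lesion_ids}
    for marks in reader_marks:
        for m in marks:
            if m == "FP":
                break  # marks after the first false positive are ignored
            counts[m] += 1
    return {lid: counts[lid] / len(reader_marks) for lid in lesion_ids}

# Three hypothetical readers; reader 3 makes no false positive.
readers = [["L1", "L2", "FP", "L3"], ["L1", "FP"], ["L2", "L1", "L3"]]
scores = lesion_detectability(readers, ["L1", "L2", "L3"])
```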
A procedure is developed that enables comprehensive and automatic image quality evaluation of computed tomography (CT) systems. This procedure includes custom-designed software and an image quality phantom composed of subsections with regional test objects. The phantom is designed so that the maximum amount of information concerning image quality and system performance can be obtained in a single scan. The software automatically analyzes phantom images and generates measurements of image quality that are both quantitative and objective. The image quality parameters obtained from a single scan of the phantom include: spatial resolution, contrast, contrast signal-to-noise ratio, linearity, uniformity, slice thickness, temporal resolution, and dose. This evaluation procedure provides a simple, automated method of quality control. The phantom and procedure can also be used as a research tool for studying modifications of CT system components.
In this study, we present results from a mathematical model of the phantom. We discuss the design and validation of the phantom and accompanying software.
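As an illustration of one of the listed parameters, a contrast signal-to-noise ratio can be computed from two regions of interest in a phantom image. This is a generic sketch with illustrative pixel values; the phantom's actual ROI definitions are not specified here:

```python
import statistics

def cnr(roi_pixels, background_pixels):
    """Contrast signal-to-noise ratio from two regions of interest:
    (mean ROI - mean background) / background standard deviation."""
    contrast = statistics.mean(roi_pixels) - statistics.mean(background_pixels)
    return contrast / statistics.stdev(background_pixels)

# Illustrative CT numbers for a test insert and its local background.
value = cnr([110, 112, 111, 113], [100, 101, 99, 100])
```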
This paper presents results of physical and psychophysical evaluations of an LCD with respect to its spatial noise. Spatial noise is quantified using a high-resolution CCD camera, and a method is developed to compensate for it. The compensation method is based on a spatial noise map derived from the CCD camera images and on the application of an error diffusion algorithm; it reduces the spatial noise by about a factor of 2. A psychophysical evaluation is performed to explore the dependence of human contrast sensitivity on display spatial noise, using the two-alternative forced choice (2-AFC) method. Aperiodic Gaussian-shaped objects, which simulate lung nodules, serve as stimuli. The calculated detectability index, d', indicates that spatial noise compensation leads to a lower contrast threshold.
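In a 2-AFC task, the detectability index follows from the proportion of correct responses via the standard relation d' = √2 · z(Pc), where z is the inverse standard-normal CDF. A minimal sketch using only Python's standard library:

```python
from statistics import NormalDist

def dprime_2afc(prop_correct):
    """Detectability index from 2-AFC proportion correct, using the
    standard relation d' = sqrt(2) * z(Pc)."""
    return 2 ** 0.5 * NormalDist().inv_cdf(prop_correct)

# Chance performance (Pc = 0.5) gives d' = 0; a higher proportion
# correct at the same stimulus contrast implies a higher d', i.e.
# a lower contrast threshold.
chance = dprime_2afc(0.5)
```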
Evaluation of images generated by new MR pulse sequences or reconstruction methods is traditionally done using subjective measures of image quality in a clinical study or by using quantitative measures such as local SNR or CNR to support the subjective findings. In order to accelerate evaluation of new candidate MR techniques, objective measures related to human perception and performance are desirable. Therefore, the goal of this study was to determine if the effects of parallel-imaging artifacts on subjective image quality could be predicted using the JNDmetrix vision model as a first step in developing objective measures to guide MR development. Single-shot fast spin echo images (HASTE) were acquired with increasing acceleration factors (0, 2, 3, and 4) and reconstructed with two algorithms, mSENSE and GRAPPA. Subjective quality ratings (0-10 scale) for these images were compared to spatial-frequency channel responses of the JNDmetrix model and to PSNR. Our results confirmed the anticipated degradation in quality for GRAPPA and mSENSE images with increasing acceleration factor. The mSENSE method yielded significantly lower quality ratings than GRAPPA for the higher acceleration factors (3 and 4). Full matrix images with no partial parallel acquisition (noppa) showed blurring due to longer shot time and T2 decay and were rated most comparable to the GRAPPA acceleration factor 4 images. There was a strong linear relationship between just-noticeable difference (JND) changes and observer ratings, while PSNR showed no correlation with observer ratings. The JNDmetrix results better reflected image degradation due to both blurring and noise. These results give confidence that the JNDmetrix approach may become a useful tool for the design and evaluation of MR pulse sequences and reconstruction methods.
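PSNR, used above as the comparison metric, is defined from the mean squared error against a reference image. A minimal sketch on flattened 8-bit pixel values:

```python
import math

def psnr(reference, test_img, max_val=255.0):
    """Peak signal-to-noise ratio in dB between two images given as
    equal-length flattened pixel sequences; higher means closer."""
    n = len(reference)
    mse = sum((r - t) ** 2 for r, t in zip(reference, test_img)) / n
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * math.log10(max_val ** 2 / mse)
```

Because PSNR depends only on pixel-wise error, it cannot distinguish blurring from noise, which is consistent with its poor correlation with observer ratings reported above.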
Each year almost all film readers in the UK Breast Screening Programme voluntarily read a set of difficult mammographic cases as a means of self-assessing their film-reading skills. We set out to investigate what case characteristics, if any, actually constitute a 'difficult' or 'easy' case in the opinion of radiological experts. We also examined how UK Breast Screening personnel performed on the cases the experts deemed difficult, in order to build up a profile of the types of cases that give film readers the most problems. We examined two main elements of case diagnosis, case classification and case features, and investigated whether there were any group differences in terms of case difficulty and the percentage of incorrectly reported cases. Data from over 15 radiological experts and approximately 400 film readers were compared on 180 cases. Significant differences were found between the expert and screening populations (p < .05) in terms of these case characteristics. These data contribute to the understanding of what constitutes a difficult case as judged by experts and other film readers, with a view to elucidating the types of cases most appropriate for advanced mammographic training.
A discrepancy exists between two studies that investigated psychophysical detection of simulated lesions (e.g., Gaussians or designer nodules) embedded in filtered noise images (Johnson et al., 2002; Burgess et al., 2003). Johnson et al. (2002) identified a significant difference in the slope of the contrast-detail (CD) plots as the presentation methodology in a 2AFC task was changed from unlike backgrounds (unpaired) to identical backgrounds (paired). In comparable experiments, Burgess et al. (2003) challenged those results, finding no difference between the slopes (both positive) of the CD plots for paired and unpaired backgrounds. We found that a methodological difference between the two studies, namely the presence of a circular fixation cue, was responsible for the discrepancy. The detection noise due to positional uncertainty was sufficient to reduce subjects' thresholds for small target diameters. This effect was amplified in the paired background, switching the CD plot from a negative slope (without fixation) to a positive slope (with fixation). The effect was less dramatic with the unpaired backgrounds, although intra-observer variability appeared to be reduced with fixation cues. These results significantly reduce the discrepancies in CD characteristics between the two studies.
This paper describes a software framework and analysis tool to support the collection and analysis of eye movement and perceptual feedback data for a variety of diagnostic imaging modalities. The framework allows the rapid creation of experiment software that can display a collection of medical images of a particular modality, capture eye trace data, and record marks added to an image by the observer, together with their final decision. There are also a number of visualisation techniques for the display of eye trace information. The analysis tool supports the comparison of individual eye traces for a particular observer or traces from multiple observers for a particular image. Saccade and fixation data can be visualised, with user control of fixation identification functions and properties. Observer markings are displayed, and predefined regions of interest are supported. The software also supports some interactive and multi-image modalities. The analysis tool includes a novel visualisation of scan paths across multi-image modalities. Using an exploded 3D view of a stack of MRI scan sections, an observer's scan path can be shown traversing between images, in addition to inspecting them.
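Fixation identification from raw gaze samples is commonly performed with a dispersion-threshold (I-DT) algorithm. The tool above leaves the fixation function and its properties user-configurable, so the following is only a generic sketch with illustrative, uncalibrated thresholds:

```python
def _dispersion(window):
    """Bounding-box dispersion of a gaze window:
    (max x - min x) + (max y - min y)."""
    xs, ys = zip(*window)
    return (max(xs) - min(xs)) + (max(ys) - min(ys))

def idt_fixations(gaze, dispersion_thresh=25.0, min_samples=5):
    """Dispersion-threshold (I-DT) fixation identification sketch.
    gaze: list of (x, y) samples at a fixed sampling rate. A fixation
    is a run of at least min_samples points whose dispersion stays
    under the threshold; returns the centroid of each such run."""
    fixations = []
    start = 0
    while start + min_samples <= len(gaze):
        end = start + min_samples
        if _dispersion(gaze[start:end]) <= dispersion_thresh:
            # Grow the window while the dispersion stays under threshold.
            while end < len(gaze) and _dispersion(gaze[start:end + 1]) <= dispersion_thresh:
                end += 1
            xs, ys = zip(*gaze[start:end])
            fixations.append((sum(xs) / len(xs), sum(ys) / len(ys)))
            start = end
        else:
            start += 1  # no fixation starts here; slide the window
    return fixations

# Two stable gaze clusters separated by a saccade sample.
trace = ([(100, 100), (102, 101), (101, 99), (100, 102), (103, 100), (101, 101)]
         + [(200, 200)]
         + [(300, 300), (302, 301), (301, 299), (300, 302), (303, 300), (301, 301)])
fixes = idt_fixations(trace)
```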
Receiver operating characteristic (ROC) analysis is a widely used method for analyzing the performance of two-class classifiers. Its advantages include explicitly considering the tradeoff between sensitivity and specificity, offering visualization methods, and providing clearly interpretable summary metrics. There is currently no widely accepted counterpart to ROC analysis for an N-class classifier (N > 2). The purpose of this study was to empirically compare methods that have been proposed to evaluate the performance of N-class classifiers (N > 2). These methods are, in one way or another, extensions of ROC analysis. This report focuses on three-class classification performance metrics, but most of the methods can easily be extended to more than three classes. The methods studied were pairwise ROC analysis, the Hand and Till M function (HTM), one-versus-all ROC analysis, a modified HTM, and Mossman's "three-way ROC" method. A three-class classification task from breast cancer computer-aided diagnosis (CADx) is taken as an example to illustrate the advantages and disadvantages of the alternative performance metrics.
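The Hand and Till M function averages pairwise two-class AUCs over all class pairs. A minimal sketch, assuming the classifier outputs per-class probability estimates; the class names and data are illustrative, not the CADx study's:

```python
from itertools import combinations

def auc(pos_scores, neg_scores):
    """Two-class AUC via the Mann-Whitney statistic: the fraction of
    (positive, negative) pairs ranked correctly (ties count 0.5)."""
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos_scores for n in neg_scores)
    return wins / (len(pos_scores) * len(neg_scores))

def hand_till_m(scores, labels, classes):
    """Hand and Till M: the mean over class pairs (i, j) of the average
    of A(i|j) and A(j|i), each computed from one class's probability
    column restricted to samples of classes i and j."""
    total = 0.0
    for i, j in combinations(classes, 2):
        a_ij = auc([s[i] for s, l in zip(scores, labels) if l == i],
                   [s[i] for s, l in zip(scores, labels) if l == j])
        a_ji = auc([s[j] for s, l in zip(scores, labels) if l == j],
                   [s[j] for s, l in zip(scores, labels) if l == i])
        total += (a_ij + a_ji) / 2
    return total / (len(classes) * (len(classes) - 1) / 2)

# Illustrative three-class example with perfect separation, so M = 1.
probs = [{"malignant": 0.8, "benign": 0.1, "normal": 0.1},
         {"malignant": 0.1, "benign": 0.7, "normal": 0.2},
         {"malignant": 0.1, "benign": 0.2, "normal": 0.7}]
truth = ["malignant", "benign", "normal"]
m = hand_till_m(probs, truth, ["malignant", "benign", "normal"])
```

Unlike one-versus-all ROC analysis, M never pools the "rest" classes together, which is one reason the two approaches can rank classifiers differently.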
The growth of mass CT screening requires radiologists to interpret a huge number of CT images, and screening capacity has therefore been limited by the capacity to process images. To remedy this situation, we considered paramedical staff, especially radiological technologists, as "potential screeners," and investigated their capacity to detect abnormalities in lung cancer screening CT images with and without the assistance of a computer-aided diagnosis (CAD) system, comparing their performance with that of physicians. A set of 100 thoracic CT slices from 100 cases (73 abnormal and 27 normal), one slice per case, was interpreted by 43 paramedical college students. A second interpretation was performed after the students had been instructed on how to interpret CT images, and a third interpretation was assisted by a virtual CAD system. We calculated the areas under the ROC curve (Az values) for both students and physicians. In the first set of interpretations, 40% of the students achieved Az values within the range of the physicians' Az values, which varied from 0.870 to 0.964. After instruction on CT image interpretation, this proportion rose to 86%, and with virtual CAD assistance it reached 95%. The performance of paramedical college students in detecting abnormalities in thoracic CT images proved sufficient to qualify them as "potential screeners."