Measurement of total kidney volume (TKV) plays an important role in the early therapeutic stage of autosomal dominant polycystic kidney disease (ADPKD). As a crucial biomarker, an accurate TKV can sensitively reflect disease progression and serve as an indicator of drug efficacy. However, manual contouring of kidneys in magnetic resonance (MR) images is time-consuming (about 40 minutes per case), which greatly hinders the wide adoption of TKV in the clinic. In this paper, we propose a multi-resolution 3D convolutional neural network to automatically segment kidneys of ADPKD patients from MR images. We adopt two resolutions and use a customized V-Net model at each resolution. The V-Net model is able to integrate high-level contextual information with detailed local information for accurate organ segmentation. The V-Net model at the coarse resolution robustly localizes the kidneys, while the V-Net model at the fine resolution accurately refines the kidney boundaries. Validated on 305 subjects with different loss functions and network architectures, our method achieves over 95% Dice similarity coefficient against ground truth labeled by a senior physician. Moreover, the proposed method dramatically reduces the kidney volume measurement time from 40 minutes to about 1 second, which can greatly accelerate disease staging of ADPKD patients for large clinical trials, promote the development of related drugs, and reduce the burden on physicians.
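As an illustration of the coarse-to-fine design described above, here is a minimal sketch of a two-stage inference pipeline, assuming pre-trained models `coarse_vnet` and `fine_vnet` (hypothetical names) that return voxel-wise logits; the downsampling factor and crop margin are illustrative, not the paper's settings:

```python
# Minimal sketch of a coarse-to-fine segmentation pipeline (assumed structure;
# `coarse_vnet` and `fine_vnet` are hypothetical pre-trained 3D models).
import numpy as np
import torch

def segment_kidneys(volume, coarse_vnet, fine_vnet, coarse_factor=4, margin=8):
    """Two-stage segmentation: localize at coarse resolution, refine at fine resolution."""
    # Stage 1: downsample and run the coarse model to roughly localize the kidneys.
    vol = torch.from_numpy(volume).float()[None, None]          # (1, 1, D, H, W)
    coarse_in = torch.nn.functional.interpolate(
        vol, scale_factor=1.0 / coarse_factor, mode="trilinear", align_corners=False)
    with torch.no_grad():
        coarse_mask = (torch.sigmoid(coarse_vnet(coarse_in)) > 0.5)[0, 0].numpy()

    # Map the coarse mask back to a bounding box in the original resolution.
    idx = np.argwhere(coarse_mask)
    lo = np.maximum(idx.min(axis=0) * coarse_factor - margin, 0)
    hi = np.minimum((idx.max(axis=0) + 1) * coarse_factor + margin, volume.shape)

    # Stage 2: run the fine model only on the cropped region of interest.
    crop = vol[:, :, lo[0]:hi[0], lo[1]:hi[1], lo[2]:hi[2]]
    with torch.no_grad():
        fine_mask = (torch.sigmoid(fine_vnet(crop)) > 0.5)[0, 0].numpy()

    # Paste the refined mask back into a full-size output volume.
    out = np.zeros(volume.shape, dtype=np.uint8)
    out[lo[0]:hi[0], lo[1]:hi[1], lo[2]:hi[2]] = fine_mask
    return out
```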
Objective and efficient diagnosis of Alzheimer's disease (AD) has been a major research topic in recent years, and promising results have been shown for imaging markers derived from magnetic resonance imaging (MRI) data. Besides conventional machine learning methods, deep learning based methods have been developed in several studies, where layer-by-layer neural networks were designed to extract features for disease classification from patches or whole images. However, as the disease develops from subcortical nuclei to cortical regions, specific brain regions with morphological changes might contribute most to the diagnosis of disease progression. Therefore, we propose a novel spatially and depth-weighted neural network structure to extract effective features and further improve the performance of AD diagnosis. Specifically, we first use group comparison to detect the most distinctive AD-related landmarks, and then sample landmark-based image patches as our training data. In the model structure, with a 15-layer DenseNet as the backbone, we introduce an attention bypass to estimate spatial weights in the image space that guide the network to focus on specific regions. A squeeze-and-excitation (SE) mechanism is also adopted to further weight the feature-map channels. We used 2335 subjects from public datasets (i.e., ADNI-1, ADNI-2 and ADNI-GO) in our experiments, and the results show that our framework achieves 90.02% accuracy, 81.25% sensitivity, and 96.33% specificity in distinguishing AD patients from normal controls.
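The SE mechanism mentioned above is a standard channel re-weighting block; below is a minimal PyTorch sketch (shown in 2D for brevity, with an illustrative reduction ratio; the paper's exact layer sizes are not specified here):

```python
# Minimal sketch of a squeeze-and-excitation (SE) block used to re-weight
# feature-map channels.
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)          # squeeze: global spatial average
        self.fc = nn.Sequential(                     # excitation: channel-wise gates
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w                                  # re-weight each channel
```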
Accurate segmentation of organs at risk (OARs) is a key step in image guided radiation therapy. In recent years, deep learning based methods have been widely used in medical image segmentation; among them, U-Net and V-Net are the most popular. In this paper, we evaluate a customized V-Net on 16 OARs throughout the body using a large CT dataset. Specifically, two customizations are used to reduce the GPU memory cost of V-Net: 1) multi-resolution V-Nets, where the coarse-resolution V-Net aims to localize the OAR in the entire image space, while the fine-resolution V-Net focuses on refining the detailed boundaries of the OAR; 2) a modified V-Net architecture, specifically designed for segmenting large organs, e.g., the liver. Validated on 3483 CT scans of various imaging and disease conditions, we show that, compared with traditional methods, the customized V-Net wins in speed (0.7 seconds vs. 20 seconds per organ), accuracy (average Dice score 96.6% vs. 84.3%), and robustness (98.6% success rate vs. 83.3% success rate). Moreover, the customized V-Net is very robust against various image artifacts, diseases and slice thicknesses, and performs much better than traditional methods even on organs with large shape variations (e.g., the bladder).
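For reference, the Dice score used to report accuracy above is computed as follows; this is the standard definition, with a small epsilon added for numerical safety:

```python
# Minimal sketch of the Dice similarity coefficient for binary masks.
import numpy as np

def dice_score(pred, gt, eps=1e-7):
    """Dice = 2|A n B| / (|A| + |B|) for binary masks `pred` and `gt`."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    return 2.0 * inter / (pred.sum() + gt.sum() + eps)
```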
Accurate lumbar spine measurement in CT images provides an essential way to quantitatively analyze spinal diseases such as spondylolisthesis and scoliosis. In today's clinical workflow, the measurements are manually performed by radiologists and surgeons, which is time-consuming and irreproducible. Therefore, an automatic and accurate lumbar spine measurement algorithm is highly desirable. In this study, we propose a method to automatically calculate five different lumbar spine measurements in CT images. The proposed method has three main stages: First, a learning based spine labeling method, which integrates both image appearance and spine geometry information, is used to detect the lumbar and sacrum vertebrae in CT images. Then, a multi-atlas-based image segmentation method is used to segment each lumbar vertebra and the sacrum based on the detection result. Finally, measurements are derived from the segmentation result of each vertebra. Our method has been evaluated on 138 spinal CT scans to automatically calculate five widely used clinical spine measurements. Experimental results show that our method achieves success rates of more than 90% across all the measurements, and it significantly improves measurement efficiency compared to manual measurement. Besides benefiting the routine clinical diagnosis of spinal diseases, our method also enables large-scale data analytics for scientific and clinical research.
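The second stage relies on multi-atlas segmentation; a minimal sketch of the simplest label-fusion step (majority voting) is shown below, assuming the atlas masks have already been registered to the target image elsewhere:

```python
# Minimal sketch of multi-atlas label fusion by majority voting
# (`warped_labels` are atlas masks already aligned to the target image).
import numpy as np

def majority_vote_fusion(warped_labels):
    """Fuse N aligned binary atlas masks into a single consensus mask."""
    stack = np.stack(warped_labels, axis=0)           # (N, D, H, W)
    votes = stack.sum(axis=0)                         # per-voxel vote count
    return (votes * 2 > stack.shape[0]).astype(np.uint8)
```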
Automatically detecting anatomy orientation is an important task in medical image analysis. Specifically, the ability to automatically detect the coarse orientation of structures is useful to minimize the effort of fine/accurate orientation detection algorithms, to initialize non-rigid deformable registration algorithms, or to align models to target structures in model-based segmentation algorithms. In this work, we present a deep convolutional neural network (DCNN)-based method for fast and robust detection of coarse structure orientation, i.e., the hemisphere where the principal axis of a structure lies. That is, our algorithm predicts whether the principal orientation of a structure is in the northern or southern hemisphere, which we refer to as UP and DOWN, respectively, in the remainder of this manuscript. The only assumption of our method is that the entire structure is located within the scan's field-of-view (FOV). To efficiently solve the problem in 3D space, we formulate it as a multi-planar 2D deep learning problem. In the training stage, a large number of coronal-sagittal slice pairs are constructed as 2-channel images to train a DCNN to classify whether a scan is UP or DOWN. During testing, we randomly sample a small number of coronal-sagittal 2-channel images and pass them through our trained network; the coarse structure orientation is then determined by majority voting. We tested our method on 114 elbow MR scans. Experimental results suggest that only five 2-channel images are sufficient to achieve a high success rate of 97.39%. Our method is also extremely fast, taking approximately 50 milliseconds per 3D MR scan, and is insensitive to the location of the structure in the FOV.
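A minimal sketch of the majority-voting test procedure follows; `classify_up_down` is a hypothetical stand-in for the trained 2D network, returning the string "UP" or "DOWN" for one coronal-sagittal pair:

```python
# Minimal sketch of the majority-voting test stage.
import random
from collections import Counter

def predict_orientation(coronal_slices, sagittal_slices, classify_up_down, n_samples=5):
    """Classify a few random coronal-sagittal pairs and take the majority label."""
    votes = []
    for _ in range(n_samples):
        cor = random.choice(coronal_slices)
        sag = random.choice(sagittal_slices)
        votes.append(classify_up_down(cor, sag))      # one 2-channel image per pair
    return Counter(votes).most_common(1)[0][0]
```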
Automatic and precise segmentation of hand bones is important for many medical imaging applications. Although several previous studies address bone segmentation, automatically segmenting articulated hand bones remains a challenging task. The highly articulated nature of hand bones limits the effectiveness of atlas-based segmentation methods, and low-level information derived from the image of interest alone is insufficient for detecting bones and distinguishing the boundaries of different bones in close proximity to each other. In this study, we propose a method that combines an articulated statistical shape model and a local exemplar-based appearance model for automatically segmenting hand bones in CT. Our approach performs a hierarchical articulated shape deformation driven by a set of local exemplar-based appearance models. Specifically, for each point in the shape model, the local appearance model is described by a set of profiles of low-level image features along the normal of the shape. During segmentation, each point in the shape model is deformed to a new point whose image features are closest to the appearance model. The shape model is also constrained by an articulation model described by a set of pre-determined landmarks on the finger joints. In this way, the deformation is robust to sporadic false bony edges and is able to fit fingers with large articulations. We validated our method on 23 CT scans and achieved a segmentation success rate of approximately 89.70%. This result indicates that our method is viable for automatic segmentation of articulated hand bones in conventional CT.
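The per-point deformation step can be sketched as a profile search along each point's normal; `sample_profile` (hypothetical) extracts the low-level feature profile at a candidate position, and `appearance_profiles` holds the learned exemplar profiles:

```python
# Minimal sketch of one profile-matching deformation step of the shape model
# (feature extraction and learned appearance profiles are assumed inputs).
import numpy as np

def deform_points(points, normals, appearance_profiles, sample_profile, search_range=5):
    """Move each shape point along its normal to the best-matching profile position."""
    new_points = points.copy()
    for i, (p, n) in enumerate(zip(points, normals)):
        best_d, best_q = np.inf, p
        for t in range(-search_range, search_range + 1):
            q = p + t * n                              # candidate along the normal
            prof = sample_profile(q, n)                # low-level features at q
            d = np.linalg.norm(prof - appearance_profiles[i])
            if d < best_d:
                best_d, best_q = d, q
        new_points[i] = best_q
    return new_points
```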
In X-ray examinations, it is essential that radiographers carefully collimate to the appropriate anatomy of interest to minimize the overall integral dose to the patient. The shadow regions are not diagnostically meaningful and can impair overall image quality, so it is desirable to detect the collimation and exclude the shadow regions to optimize image display. However, due to the large variability of collimated images, collimation detection remains a challenging task. In this paper, we observe that a region of interest (ROI) in an image, such as the collimation, can be described from two distinct views: a cluster of pixels within the ROI, and the corners of the ROI. Based on this observation, we propose a robust multi-view learning based strategy for collimation detection in digital radiography. Specifically, one view comes from a random-forest-based region detector, which provides pixel-wise image classification in which each pixel is labeled as either in-collimation or out-of-collimation. The other view comes from a discriminative, learning-based landmark detector, which detects the corners and localizes the collimation within the image. Nevertheless, given the huge variability of collimated images, the detection from either view alone may not be perfect. Therefore, we adopt an adaptive view-fusing step to obtain the final detection by combining region and corner detection. We evaluate our algorithm on a database of 665 X-ray images of a wide variety of types and dosages and obtain high detection accuracy (95%), compared with the region detector alone (87%) and the landmark detector alone (83%).
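One simple way to fuse the two views is to keep the corner-based quadrilateral when it agrees with the pixel-wise region mask and fall back to the mask otherwise; the sketch below is purely illustrative of this idea (the agreement threshold and fallback rule are assumptions, not the paper's exact adaptive scheme):

```python
# Minimal sketch of a view-fusion rule for collimation detection.
import numpy as np

def fuse_views(region_mask, corner_quad_mask, agreement_thresh=0.8):
    """Fuse pixel-wise region detection with corner-based collimation detection."""
    inter = np.logical_and(region_mask, corner_quad_mask).sum()
    union = np.logical_or(region_mask, corner_quad_mask).sum()
    iou = inter / max(union, 1)                        # agreement between the two views
    return corner_quad_mask if iou >= agreement_thresh else region_mask
```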
KEYWORDS: Detection and tracking algorithms, Medical imaging, Brain, Spine, Magnetic resonance imaging, 3D applications, Neuroimaging, Data analysis, 3D image processing, Imaging systems
One of the primary challenges in medical image data analysis is the ability to handle abnormal, irregular, and/or partial cases. In this paper, we present two robust algorithms for automatic planar primitive detection in 3D volumes. The overall algorithm is a bottom-up approach that starts with the detection of anatomic point primitives (landmarks). Robustness in computing the planar primitives is built in through both a novel consensus-based voting approach and a random-sampling-based weighted least squares regression method. Both approaches remove inconsistent landmarks and outliers produced by the landmark detection step. Unlike earlier approaches tailored to a particular plane, the presented approach is generic and can be easily adapted to computing more complex primitives such as ROIs or surfaces. To demonstrate the robustness and accuracy of our approach, we present extensive results for automatic plane detection (Mid-Sagittal and Optical Triangle planes) in brain MR images. Compared to ground truth, our approach shows only marginal errors on about 90 patients. The algorithm also performs well under adverse conditions of arbitrary rotation and cropping of the 3D volume. To demonstrate the generalization of the approach, we also present preliminary results on intervertebral-plane detection for a 3D spine MR application.
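A minimal sketch of robust plane fitting from detected landmarks, in the spirit of the random-sampling regression described above (a RANSAC-style scheme with a least-squares refinement; the iteration count and inlier tolerance are illustrative):

```python
# Minimal sketch of robust plane fitting from 3D landmark points.
import numpy as np

def fit_plane_ransac(points, n_iter=200, inlier_tol=2.0):
    """Fit a plane to 3D landmarks (an (N, 3) array) while rejecting outliers."""
    best_inliers = None
    rng = np.random.default_rng(0)
    for _ in range(n_iter):
        sample = points[rng.choice(len(points), 3, replace=False)]
        normal = np.cross(sample[1] - sample[0], sample[2] - sample[0])
        norm = np.linalg.norm(normal)
        if norm < 1e-9:
            continue                                   # degenerate (collinear) sample
        normal /= norm
        d = -normal @ sample[0]
        dist = np.abs(points @ normal + d)             # point-to-plane distances
        inliers = dist < inlier_tol
        if best_inliers is None or inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # Refine with least squares on the inliers: plane through the centroid,
    # normal given by the smallest principal component.
    P = points[best_inliers]
    c = P.mean(axis=0)
    _, _, vt = np.linalg.svd(P - c)
    n = vt[-1]
    return n, -n @ c                                   # plane: n . x + d = 0
```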
We present an automatic method to quickly and accurately detect multiple anatomical regions of interest (ROIs) in CT topogram images. Our method first detects a redundant and potentially erroneous set of local features, whose spatial configurations are captured by a set of local voting functions. Unlike existing methods that try to "hit" the correct/best constellation of local features, we take the opposite approach: we peel away bad features until a safe (i.e., conservatively small) number of features remains. The method is deterministic in nature and guarantees success even in extremely noisy cases; its advantages are robustness and computational efficiency. Our method also addresses the potential scenario in which outliers (i.e., false landmark detections) form plausible configurations: as long as such outliers are a minority, the method can successfully remove them. The final ROI of the anatomy is computed from the best subset of the remaining local features. Experimental validation was carried out for multiple-organ detection on a large collection of CT topogram images, and fast, highly robust performance was observed. On the test data sets, the detection rate varies from 98.2% to 100% and the false detection rate from 0.0% to 0.5% across different ROIs. The method is fast and accurate enough to be seamlessly integrated into a real-time workflow on the CT machine to improve efficiency, consistency, and repeatability.
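The "peel away bad features" idea can be sketched as iteratively dropping the landmark least consistent with the learned spatial layout; the pairwise consistency score below is an illustrative stand-in for the paper's local voting functions:

```python
# Minimal sketch of iterative feature peeling based on spatial consistency.
import numpy as np

def peel_features(positions, expected_offsets, keep=4):
    """Iteratively drop the landmark least consistent with the learned layout.

    `positions` is a list of 2D landmark coordinates; `expected_offsets[j][i]`
    is the learned expected offset from landmark j to landmark i (assumed input).
    """
    active = list(range(len(positions)))
    while len(active) > keep:
        scores = []
        for i in active:
            # Consistency of landmark i: how well the other landmarks "vote" for it.
            err = [np.linalg.norm(positions[i] - (positions[j] + expected_offsets[j][i]))
                   for j in active if j != i]
            scores.append(np.median(err))
        active.pop(int(np.argmax(scores)))             # peel the worst landmark
    return active
```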
Characterization and quantification of the severity of diffuse parenchymal lung diseases (DPLDs) using computed tomography (CT) is an important issue in clinical research. Recently, several classification-based computer-aided diagnosis (CAD) systems [1-3] for DPLD have been proposed. For some of these systems, degraded performance [2] was reported on unseen data because of considerable inter-patient variance of parenchymal tissue patterns. We believe that a CAD system of real clinical value should be robust to inter-patient variance and able to classify unseen cases online more effectively. In this work, we have developed a novel adaptive knowledge-driven CT image search engine that combines the offline learning aspects of classification-based CAD systems with the online learning aspects of content-based image retrieval (CBIR) systems. Our system can seamlessly and adaptively fuse offline accumulated knowledge with online feedback, leading to improved online performance in detecting DPLD in terms of both accuracy and speed. Our contribution lies in: (1) newly developed 3D texture-based and morphology-based features; (2) a multi-class offline feature selection method; and (3) a novel image search engine framework for detecting DPLD. Very promising results have been obtained on a small test set.
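One simple way to fuse offline knowledge with online retrieval, in the spirit of the system described above, is a weighted blend of the offline classifier score and the online content similarity; the blending weight `alpha` is an illustrative mechanism, not the paper's exact formulation:

```python
# Minimal sketch of blending offline classifier knowledge with online similarity.
import numpy as np

def ranked_retrieval(query_feat, db_feats, offline_scores, alpha=0.5):
    """Rank database images by a blend of offline class score and online similarity."""
    dists = np.linalg.norm(db_feats - query_feat, axis=1)
    sims = 1.0 / (1.0 + dists)                         # online content similarity
    combined = alpha * offline_scores + (1 - alpha) * sims
    return np.argsort(-combined)                       # best matches first
```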
Emerging whole-body imaging technologies push computer-aided detection/diagnosis (CAD) to scale up to the whole-body level, which involves multiple organs or anatomical structures. This paper exploits the fact that the various tasks in whole-body CAD are often highly dependent (e.g., the localization of the femur heads strongly predicts the position of the iliac bifurcation of the aorta). One way to effectively employ task dependency is to schedule the tasks such that the outputs of some tasks are used to guide the others. In this sense, optimal task scheduling is key to improving the overall performance of a whole-body CAD system. In this paper, we propose a method for task scheduling that is optimal in an information-theoretic sense. The central idea is to schedule tasks in such an order that each operation achieves the maximum expected information gain over all the tasks. The formulation embeds two intuitive principles: (1) a task with higher confidence tends to be scheduled earlier; (2) a task with higher predictive power for other tasks tends to be scheduled earlier. More specifically, task dependency is modeled by conditional probability; the outcome of each task is assumed to be probabilistic as well; and the objective function is based on the reduction of the summed conditional entropy over all tasks. Validation is carried out on a challenging CAD problem, multi-organ localization in whole-body CT. Compared to unscheduled and ad hoc scheduled organ detection/localization, our scheduled execution achieves higher accuracy with much less computation time.
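A minimal sketch of greedy scheduling under this idea follows; the gain function below (own confidence plus predicted entropy reduction for the other tasks, read from an assumed precomputed table `entropy_reduction[i][j]`) is a simplified stand-in for the paper's expected-information-gain objective:

```python
# Minimal sketch of greedy information-theoretic task scheduling.
def schedule_tasks(task_confidence, entropy_reduction):
    """Greedily order tasks by expected information gain over the remaining tasks."""
    remaining = set(range(len(task_confidence)))
    order = []
    while remaining:
        def gain(i):
            # Own confidence plus predicted entropy reduction for the other tasks.
            return task_confidence[i] + sum(entropy_reduction[i][j]
                                            for j in remaining if j != i)
        best = max(remaining, key=gain)
        order.append(best)
        remaining.remove(best)
    return order
```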
Reliable landmark detection in medical images provides the essential groundwork for successful automation of various open problems such as localization, segmentation, and registration of anatomical structures. In this paper, we present a learning-based system to jointly detect (is it there?) and localize (where?) multiple anatomical landmarks in medical images. The contributions of this work are twofold. First, the method leverages learning to automatically extract the most distinctive features for multi-landmark detection; it is therefore easily adaptable to detecting arbitrary landmarks in various imaging modalities, e.g., CT, MRI and PET. Second, the use of a multi-class/cascaded classifier architecture in different phases of the detection stage, combined with robust features that are highly efficient to compute, enables near-real-time performance with very high localization accuracy.
The method is validated on CT scans of different body sections, e.g., whole-body scans, chest scans and abdominal scans. Aside from improved robustness (due to the exploitation of spatial correlations), it gains run-time efficiency in landmark detection, and it scales well as the number of landmarks increases.
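The cascade idea can be sketched as a sequence of increasingly expensive classifiers, where cheap stages reject most candidate positions early; the classifiers here are hypothetical callables returning a score in [0, 1]:

```python
# Minimal sketch of a cascaded landmark detector.
def detect_landmark(candidates, cascade, thresholds):
    """Pass candidate positions through a classifier cascade; survivors are detections."""
    survivors = candidates
    for clf, thr in zip(cascade, thresholds):
        survivors = [c for c in survivors if clf(c) >= thr]
        if not survivors:
            return []                                  # landmark absent ("is it there?")
    return survivors                                   # remaining positions ("where?")
```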
KEYWORDS: Anisotropy, Monte Carlo methods, Error analysis, Image fusion, Data fusion, Statistical analysis, Magnetic resonance imaging, Imaging systems, Medical imaging, Image resolution
Recent advances in imaging technology have brought new challenges and opportunities for automatic and quantitative analysis of medical images. With broader accessibility of more imaging modalities for more patients, fusion of modalities/scans from one time point and longitudinal analysis of changes across time points have become the two most critical differentiators in supporting more informed, more reliable and more reproducible diagnosis and therapy decisions. Unfortunately, scan fusion and longitudinal analysis are both inherently plagued by increased levels of statistical error. A lack of comprehensive analysis by imaging scientists and a lack of full awareness by physicians pose potential risks in clinical practice.
In this paper, we discuss several key error factors affecting imaging quantification, study their interactions, and introduce a simulation strategy to establish general error bounds for change quantification across time. We quantitatively show that image resolution, voxel anisotropy, lesion size, eccentricity, and orientation all contribute to quantification error, and that there is an intricate relationship between voxel anisotropy and lesion shape in affecting quantification error. Specifically, when two or more scans are to be fused at the feature level, optimal linear fusion analysis reveals that scans whose voxel anisotropy is aligned with the lesion elongation should receive a higher weight than other scans; such optimal linear fusion achieves a lower variance than naive averaging. Simulated experiments are used to validate the theoretical predictions. Future work based on the proposed simulation methods may lead to general guidelines and lower error bounds for quantitative image analysis and change detection.
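The standard result behind "lower-variance scans get higher weight" is inverse-variance weighting; the sketch below shows the minimum-variance linear fusion of independent measurements, with per-scan variances assumed known or estimated elsewhere:

```python
# Minimal sketch of optimal linear fusion by inverse-variance weighting.
import numpy as np

def fuse_measurements(values, variances):
    """Minimum-variance unbiased linear fusion of independent measurements."""
    w = 1.0 / np.asarray(variances)
    w /= w.sum()                                       # weights sum to 1
    fused = np.dot(w, values)
    fused_var = 1.0 / np.sum(1.0 / np.asarray(variances))
    return fused, fused_var                            # fused_var <= min(variances)

# Example: two volume estimates 10.2 and 10.8 mL with variances 0.4 and 0.1;
# the lower-variance scan dominates, and the fused variance (0.08) beats both.
print(fuse_measurements([10.2, 10.8], [0.4, 0.1]))
```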
In this paper, we propose a new method for automated delineation of tumor boundaries in whole-body PET/CT by jointly using information from both PET and diagnostic CT images. Our method takes advantage of initial robust hot spot detection and segmentation performed in PET to provide a conservative tumor structure delineation. Using this estimate as initialization, a model of tumor appearance and shape in the corresponding CT structures is learned, and this model provides the basis for classifying each voxel as either lesion or background. The CT classification is then probabilistically integrated with the PET classification using a joint likelihood ratio test to derive the final delineation. More accurate and reproducible tumor delineation is achieved as a result of such multi-modal delineation, without additional user intervention. The method is particularly useful for improving the PET delineation when there are clear contrast edges in CT between tumor and healthy tissue, and for enabling PET-guided CT segmentation when such contrast is absent in CT.
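A minimal sketch of a joint likelihood ratio test at the voxel level follows; the Gaussian class likelihoods and the conditional independence of PET and CT intensities given the class are illustrative assumptions, and all (mean, std) parameters are made up for the example:

```python
# Minimal sketch of per-voxel classification by a joint likelihood ratio test.
from scipy.stats import norm

def classify_voxel(pet_val, ct_val, params, threshold=1.0):
    """Label a voxel as lesion if the joint PET/CT likelihood ratio exceeds a threshold."""
    lr_pet = norm.pdf(pet_val, *params["pet_lesion"]) / norm.pdf(pet_val, *params["pet_bg"])
    lr_ct = norm.pdf(ct_val, *params["ct_lesion"]) / norm.pdf(ct_val, *params["ct_bg"])
    return (lr_pet * lr_ct) > threshold    # joint ratio = product under independence

# Example with illustrative (mean, std) parameters for each class:
params = {"pet_lesion": (8.0, 2.0), "pet_bg": (2.0, 1.0),
          "ct_lesion": (60.0, 15.0), "ct_bg": (20.0, 15.0)}
print(classify_voxel(7.5, 55.0, params))   # -> True (lesion)
```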
In this paper, we present a histogram-based approach to the problem of contour tracking in echocardiographic sequences. We improve the classic single-histogram approach by using a dual-histogram technique. Then, we compare different dissimilarity measures between histograms in the specific context of echocardiographic sequences. Finally, we demonstrate the advantage of the Earth Mover's Distance, particularly when intensity shifts occur.
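To illustrate why the Earth Mover's Distance helps under intensity shifts: for 1D intensity histograms it equals the Wasserstein-1 distance, a cross-bin measure that grows smoothly with the shift, whereas bin-to-bin measures like L1 saturate. The histograms below are synthetic examples:

```python
# Minimal sketch comparing a bin-to-bin and a cross-bin histogram dissimilarity.
import numpy as np
from scipy.stats import wasserstein_distance

bins = np.arange(256)
h1 = np.exp(-0.5 * ((bins - 100) / 10.0) ** 2); h1 /= h1.sum()
h2 = np.exp(-0.5 * ((bins - 110) / 10.0) ** 2); h2 /= h2.sum()   # shifted by 10

l1 = np.abs(h1 - h2).sum()                              # bin-to-bin measure
emd = wasserstein_distance(bins, bins, h1, h2)          # cross-bin measure (EMD)
print(l1, emd)   # L1 saturates for large shifts; EMD tracks the shift (~10 here)
```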
Various relevance feedback techniques have been applied in content-based image retrieval. However, many are heuristics-based, computationally too expensive to run in real time, or limited to positive examples only. We propose a fast and optimal linear relevance feedback scheme that takes both positive and negative examples from the user. This scheme can be regarded as a generalization of discriminant analysis on one hand, and on the other hand as a generalization of an existing optimal scheme that takes only positive examples. We first define the biased classification problem for the case where data samples are labeled as positive or negative according to whether or not they belong to the target class (the biased class); biased discriminant analysis (BDA) is then proposed as an optimal linear solution for dimensionality reduction. We also propose a biased whitening transformation for data to which Euclidean distance is subsequently applied. Toy problems are designed to show the theoretical advantages of the proposed scheme over traditional discriminant analysis. The scheme is implemented in real-time image retrieval on large databases, and experimental results are presented to show the improvement it achieves.
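A minimal sketch of BDA follows: find projections that keep positive examples tight around their centroid while pushing negatives away from that same centroid, via a generalized eigenproblem. The regularization term is an illustrative addition for numerical stability, not part of the original formulation:

```python
# Minimal sketch of biased discriminant analysis (BDA).
import numpy as np
from scipy.linalg import eigh

def bda(pos, neg, n_dims=2, reg=1e-3):
    """Return a projection maximizing negative scatter over positive scatter."""
    mu_pos = pos.mean(axis=0)
    Sp = (pos - mu_pos).T @ (pos - mu_pos)             # positives around their own mean
    Sn = (neg - mu_pos).T @ (neg - mu_pos)             # negatives around the positive mean
    Sp += reg * np.trace(Sp) / len(mu_pos) * np.eye(len(mu_pos))  # stabilize
    vals, vecs = eigh(Sn, Sp)                          # generalized eigenproblem
    return vecs[:, np.argsort(vals)[::-1][:n_dims]]    # top discriminant directions
```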
Image classification into meaningful classes, such as indoor, outdoor, landscape, urban, or faces, is essentially a supervised pattern recognition problem. The recognition problem necessitates a large set of labeled examples for training the classifier, so any strategy that reduces the labeling burden is very important to the deployment of such classifiers in practical applications. In this paper we show that the labeled training set can be augmented by a set of unlabeled examples to boost the performance of the classifier. In general, a set of unlabeled examples is not guaranteed to improve classifier performance; we show that if the examples to be labeled are automatically selected through an unsupervised clustering step, performance is more likely to improve with the unlabeled set. We first present a modified EM algorithm that combines labeled and unlabeled sets for training, and then apply this algorithm to image classification. Using mutually exclusive classes, we show that the clustering step is crucial to the improvement in classifier performance.
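A minimal sketch of EM over mixed labeled and unlabeled data follows: labeled posteriors are clamped to their known class, unlabeled posteriors are re-estimated each iteration. A single Gaussian per class is an illustrative modeling assumption, not necessarily the paper's model:

```python
# Minimal sketch of semi-supervised EM with one Gaussian per class.
import numpy as np
from scipy.stats import multivariate_normal as mvn

def semi_supervised_em(X_lab, y_lab, X_unl, n_classes, n_iter=20):
    X = np.vstack([X_lab, X_unl])
    resp = np.zeros((len(X), n_classes))
    resp[np.arange(len(y_lab)), y_lab] = 1.0           # clamp labeled responsibilities
    resp[len(y_lab):] = 1.0 / n_classes                # uniform init for unlabeled
    for _ in range(n_iter):
        # M-step: class priors, means, covariances from current responsibilities.
        pri = resp.sum(0) / len(X)
        mus = [(resp[:, k:k+1] * X).sum(0) / resp[:, k].sum() for k in range(n_classes)]
        covs = [((resp[:, k:k+1] * (X - mus[k])).T @ (X - mus[k])) / resp[:, k].sum()
                + 1e-6 * np.eye(X.shape[1]) for k in range(n_classes)]
        # E-step: update only the unlabeled responsibilities.
        lik = np.column_stack([pri[k] * mvn.pdf(X_unl, mus[k], covs[k])
                               for k in range(n_classes)])
        resp[len(y_lab):] = lik / lik.sum(1, keepdims=True)
    return mus, covs, pri
```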
The performance of a content-based image retrieval (CBIR) system is inherently constrained by the features adopted to represent the images in the database. Low-level features alone cannot give satisfactory retrieval results in many cases, especially when the high-level concepts in the user's mind are not easily expressible in terms of low-level features. Therefore, whenever possible, textual annotations should be added, extracted, and/or processed to improve retrieval performance. In this paper, a hybrid image retrieval system is presented that gives the user the flexibility of using both high-level semantic concepts/keywords and low-level feature content in the retrieval process. The emphasis is on a statistical algorithm for semantic grouping in the concept space through relevance feedback in the image space. Under this framework, the system can also incrementally learn the user's search habits and preferences in terms of semantic relations among concepts, and use this information to improve the performance of subsequent retrieval tasks. This algorithm can eliminate the need for a stand-alone thesaurus, which may be too large and contain too much redundant information to be of practical use. Simulated experiments are designed to test the effectiveness of the algorithm. An intelligent dialogue system, in which this algorithm can serve as part of the knowledge acquisition module, is also described as a front end for the CBIR system.
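One simple realization of semantic grouping through feedback is to treat keywords attached to images marked relevant in the same session as related; the co-occurrence counter below is illustrative of the idea, not the paper's exact statistical algorithm:

```python
# Minimal sketch of learning semantic relations among keywords from feedback.
from collections import defaultdict
from itertools import combinations

class SemanticGrouper:
    def __init__(self):
        self.cooc = defaultdict(float)

    def update(self, relevant_image_keywords):
        """One feedback round: strengthen links among co-occurring keywords."""
        session_kw = set().union(*relevant_image_keywords)
        for a, b in combinations(sorted(session_kw), 2):
            self.cooc[(a, b)] += 1.0

    def related(self, keyword, top_k=5):
        """Expand a query keyword with its most strongly related concepts."""
        scores = {(a if b == keyword else b): s
                  for (a, b), s in self.cooc.items() if keyword in (a, b)}
        return sorted(scores, key=scores.get, reverse=True)[:top_k]
```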
KEYWORDS: Cameras, Calibration, Imaging systems, 3D image processing, Data processing, Ranging, Error analysis, Data acquisition, 3D metrology, Sensing systems
In this paper, several adaptive order statistic filters (OSFs) are developed and compared for channel characterization and noise suppression in images and 3D CT data. Emphasis is put on the situation where a noise-free reference image is not available, but a sequence of two noisy versions of the same image is; one of the noisy images is then used as the reference in the OSF. It is shown theoretically that if the noises are uncorrelated, the expected values of the derived filter coefficients equal the coefficients derived using a noise-free reference. Experiments using noisy reference images yield results comparable to those of methods using a noise-free reference image, and better results than median, Gaussian, averaging and Wiener filters.
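An adaptive OSF learns weights over the sorted values of each pixel neighborhood by regression against a reference image; the least-squares fit below is a minimal sketch of this idea (window size and edge padding are illustrative choices), consistent with the observation that an uncorrelated noisy reference leaves the expected coefficients unchanged:

```python
# Minimal sketch of fitting and applying order-statistic-filter (OSF) weights.
import numpy as np

def fit_osf_coefficients(noisy, reference, window=3):
    """Learn weights over the sorted values of each pixel neighborhood."""
    pad = window // 2
    padded = np.pad(noisy, pad, mode="edge")
    rows = []
    for i in range(noisy.shape[0]):
        for j in range(noisy.shape[1]):
            patch = padded[i:i + window, j:j + window].ravel()
            rows.append(np.sort(patch))                # order statistics of the window
    A = np.asarray(rows)
    coeffs, *_ = np.linalg.lstsq(A, reference.ravel(), rcond=None)
    return coeffs                                      # least-squares OSF weights

def apply_osf(noisy, coeffs, window=3):
    """Filter an image with the learned order-statistic weights."""
    pad = window // 2
    padded = np.pad(noisy, pad, mode="edge")
    out = np.empty_like(noisy, dtype=float)
    for i in range(noisy.shape[0]):
        for j in range(noisy.shape[1]):
            out[i, j] = np.sort(padded[i:i + window, j:j + window].ravel()) @ coeffs
    return out
```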