Purpose: Deep convolutional neural network (CNN)-based methods are increasingly used for reducing image noise in computed tomography (CT). Current attempts at CNN denoising are based on 2D or 3D CNN models with a single- or multiple-slice input. Our work aims to investigate whether the multiple-slice input improves the denoising performance compared with the single-slice input and whether a 3D network architecture is better than a 2D version at utilizing the multislice input.
Approach: Two categories of network architectures can be used for the multislice input. First, multislice images can be stacked channel-wise as the multichannel input to a 2D CNN model. Second, multislice images can be employed as the 3D volumetric input to a 3D CNN model, in which 3D convolution layers are adopted. We compare the performance of 2D CNN models with one, three, and seven input slices and of two versions of 3D CNN models with seven input slices and one or three output slices. Evaluation was performed on liver CT images using three quantitative metrics with full-dose images as the reference. Visual assessment was made by an experienced radiologist.
Results: As the number of input channels of the 2D CNN model increased from one to three to seven, a trend of improved performance was observed. Among the three models with the seven-slice input, the 3D CNN model with a one-slice output outperformed the other models in terms of noise texture and homogeneity in liver parenchyma as well as subjective visualization of vessels.
Conclusions: We conclude that the multislice input is an effective strategy for improving the performance of 2D deep CNN denoising models. The pure 3D CNN model tends to have better performance than the other models in terms of continuity across axial slices, but the difference was not significant compared with the 2D CNN model with the same number of input slices.
Deep convolutional neural network (CNN)-based methods have become popular choices for reducing image noise in CT. Some of these methods have shown promising results, especially in terms of preserving natural CT noise texture. Early attempts at CNN denoising were based on 2D CNN models with either a single-slice or a 3-slice input. The 3-slice input was used mainly to fit existing network architectures that were proposed for natural images with 3 input channels. Multi-slice input has the potential to incorporate spatial information from adjacent slices. However, it remains unknown whether this strategy indeed improves the denoising performance compared with a 2D model with a single-slice input, and which network architecture best utilizes the multi-slice input. Two categories of network architectures can be used for multi-slice input. First, multi-slice low-dose images can be stacked channel-wise as the multi-channel input to a 2D CNN model. Second, multi-slice images can be employed as the 3D volumetric input to a 3D CNN model, in which 3D convolution layers are adopted. In this study, we compare the performance of multiple CNN models with 1, 3, and 7 input slices. For the 7-slice input, we also include a comparison between 2D and 3D CNN models. As the number of input channels of the 2D CNN model increased from 1 to 3 to 7, a trend of improved performance was observed. Comparing the two models with the 7-slice input, the 3D model slightly outperformed the 2D model in terms of noise texture and homogeneity in liver parenchyma, as well as subjective visualization of vessels such as the intrahepatic portal vein and jejunal artery.
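As a hedged illustration of the two input strategies discussed above (not the authors' exact architectures), the sketch below contrasts stacking seven adjacent slices channel-wise for a 2D CNN against treating them as a depth dimension for a 3D CNN. Layer counts, filter widths, and the single-slice output head are illustrative assumptions.

```python
# Illustrative sketch (PyTorch) of the two multi-slice input strategies.
# Layer counts and filter widths are assumptions for illustration only.
import torch
import torch.nn as nn

class Slice2DCNN(nn.Module):
    """2D CNN: 7 adjacent slices stacked channel-wise, denoised center slice out."""
    def __init__(self, n_slices=7):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(n_slices, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 1, kernel_size=3, padding=1),  # single-slice output
        )

    def forward(self, x):          # x: (batch, 7, H, W)
        return self.net(x)         # -> (batch, 1, H, W)

class Slice3DCNN(nn.Module):
    """3D CNN: the same 7 slices treated as a depth (slice) dimension."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(1, 64, kernel_size=3, padding=(0, 1, 1)), nn.ReLU(),
            nn.Conv3d(64, 64, kernel_size=3, padding=(0, 1, 1)), nn.ReLU(),
            nn.Conv3d(64, 1, kernel_size=3, padding=(0, 1, 1)),
        )

    def forward(self, x):          # x: (batch, 1, 7, H, W)
        return self.net(x)         # -> (batch, 1, 1, H, W)

x2d = torch.randn(1, 7, 256, 256)      # 7 adjacent slices as channels
x3d = torch.randn(1, 1, 7, 256, 256)   # same slices as a depth dimension
print(Slice2DCNN()(x2d).shape, Slice3DCNN()(x3d).shape)
```

With no padding along the depth axis, each 3D convolution shrinks the slice dimension by two, so three layers reduce the seven input slices to a single output slice, mirroring the 3D model with a one-slice output described above.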
This study introduces a framework to approximate the bias introduced by CNN noise reduction in CT exams. First, CNN noise reduction was used to approximate the noise-free and noise-only images of a CT scan. The noise and signal were then recombined with spatial decoupling to simulate an ensemble of 100 images. CNN noise reduction was applied to the simulated ensemble, and the pixel-wise bias was calculated. This bias approximation technique was validated on natural images and phantoms. The technique was then tested on ten whole-body low-dose CT (WBLD-CT) patient exams. Bias correction led to improved contrast of lung and bone structures.
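A minimal sketch of the ensemble idea described above, assuming a generic `denoise` callable and a random circular shift as the spatial-decoupling step; both are illustrative assumptions rather than the authors' exact implementation.

```python
# Hedged sketch of ensemble-based bias approximation for a CNN denoiser.
# The helper names and the shift-based decoupling scheme are assumptions.
import numpy as np

def approximate_bias(ct_image, denoise, n_realizations=100, seed=None):
    """Estimate pixel-wise bias of a CNN denoiser on a single CT image."""
    rng = np.random.default_rng(seed)
    signal_hat = denoise(ct_image)        # CNN output ~ noise-free estimate
    noise_hat = ct_image - signal_hat     # residual ~ noise-only image

    denoised_stack = []
    for _ in range(n_realizations):
        # Spatially decouple noise from signal (here: random circular shift),
        # so each realization pairs the same signal with displaced noise.
        dy = rng.integers(0, ct_image.shape[0])
        dx = rng.integers(0, ct_image.shape[1])
        shifted_noise = np.roll(noise_hat, shift=(dy, dx), axis=(0, 1))
        denoised_stack.append(denoise(signal_hat + shifted_noise))

    # Pixel-wise bias: mean CNN output over the ensemble minus the signal estimate.
    return np.mean(denoised_stack, axis=0) - signal_hat
```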
We estimate the minimum SNR necessary for object detection in the projection domain. We assume there is a set of objects O and study an ideal observer that sequentially compares each member of O to the null hypothesis. This reduces to one-dimensional signal detection between two Gaussians. We find that, for a search task involving a circular 6-mm lesion in a region of interest of 60 mm by 60 mm by 10 slices, and for a required sensitivity of 80% and specificity of 80%, the minimum required projection SNR is 5.1, a finding reminiscent of the Rose criterion.
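One way to see how a number of this magnitude arises is to treat detection at each candidate location as a two-Gaussian test and budget the allowed overall false-positive rate across the independent search locations. The sketch below uses a simple Bonferroni-style split; the count of independent locations and the correction scheme are assumptions, not necessarily the paper's exact derivation.

```python
# Hedged sketch: required SNR for a search task, assuming independent locations
# and a Bonferroni-style split of the allowed overall false-positive rate.
from scipy.stats import norm

def required_snr(sensitivity, specificity, n_locations):
    # Per-location false-positive rate so the whole region stays "clean"
    # with probability >= specificity (union-bound approximation).
    per_location_fpr = (1.0 - specificity) / n_locations
    # Two-Gaussian detection: d' = z(sensitivity) + z(1 - per-location FPR).
    return norm.ppf(sensitivity) + norm.ppf(1.0 - per_location_fpr)

# Example: 6 mm lesion searched over a 60 mm x 60 mm x 10-slice region.
# Treating non-overlapping 6 mm cells as independent gives 10 x 10 x 10 locations.
print(required_snr(0.80, 0.80, n_locations=10 * 10 * 10))  # ~4.4
```

A denser grid of candidate lesion positions (or a different accounting of overlapping locations) raises the per-location threshold and pushes the required SNR toward the 5.1 reported above.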
Purpose: We developed a deep learning method to reduce noise and beam-hardening artifacts in virtual monoenergetic images (VMI) at low x-ray energy levels.
Approach: An encoder–decoder type convolutional neural network with customized inception modules and an in-house-designed training loss (denoted Incept-net) was implemented to directly estimate VMI from multi-energy CT images. Images of an abdomen-sized water phantom with varying insert materials were acquired on a research photon-counting-detector CT. Incept-net was trained with image patches (64 × 64 pixels) extracted from the phantom data, as well as synthesized, random-shaped numerical insert materials. Whole CT images (512 × 512 pixels) containing the remaining real insert materials, which were unseen during network training, were used for testing. Seven contrast-enhanced abdominal CT exams were used for a preliminary evaluation of Incept-net's generalizability to anatomical background. Mean absolute percentage error (MAPE) was used to evaluate CT number accuracy.
Results: Compared with commercial VMI software, Incept-net largely suppressed beam-hardening artifacts and reduced noise (53%) in the phantom study. Incept-net provided comparable CT number accuracy at higher-density inserts (P-value [0.0625, 0.999]) and improved accuracy at lower-density inserts (P-value = 0.0313), with overall MAPE of Incept-net [2.9%, 4.6%] versus commercial VMI [6.7%, 10.9%]. In patient images, Incept-net suppressed beam-hardening artifacts and reduced noise (up to 50%, P-value = 0.0156).
Conclusion: In this preliminary study, Incept-net presented the potential to improve low-energy VMI quality.
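As a hedged illustration of the Approach above, the sketch below shows one way an inception-style block inside an encoder–decoder can combine parallel convolution branches with different receptive fields; the branch widths and kernel sizes are assumptions, not the published Incept-net configuration.

```python
# Illustrative inception-style block (PyTorch). Branch widths and kernel sizes
# are assumptions for illustration, not the published Incept-net configuration.
import torch
import torch.nn as nn

class InceptionBlock(nn.Module):
    """Parallel 1x1 / 3x3 / 5x5 branches concatenated along the channel axis."""
    def __init__(self, in_ch, branch_ch=16):
        super().__init__()
        self.b1 = nn.Conv2d(in_ch, branch_ch, kernel_size=1)
        self.b3 = nn.Conv2d(in_ch, branch_ch, kernel_size=3, padding=1)
        self.b5 = nn.Conv2d(in_ch, branch_ch, kernel_size=5, padding=2)
        self.act = nn.ReLU()

    def forward(self, x):
        return self.act(torch.cat([self.b1(x), self.b3(x), self.b5(x)], dim=1))

# Example: map a hypothetical 2-channel multi-energy CT patch to 48 feature maps.
block = InceptionBlock(in_ch=2)
features = block(torch.randn(1, 2, 64, 64))   # 64 x 64 training patches
print(features.shape)                          # torch.Size([1, 48, 64, 64])
```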
In this study, we describe a systematic approach to optimizing deep-learning-based image processing algorithms using random search. The optimization technique is demonstrated on a phantom-based noise reduction training framework; however, the techniques described can be applied generally to other deep-learning image processing applications. The parameter space explored included the number of convolutional layers, number of filters, kernel size, loss function, and network architecture (either U-Net or ResNet). A total of 100 network models were examined (50 random search, 50 ablation experiments). Following the random search, the ablation experiments yielded only a minor performance improvement, indicating that near-optimal settings were found during the random search. The top-performing network architecture was a U-Net with 4 pooling layers, 64 filters, a 3 × 3 kernel size, ELU activation, and a weighted feature reconstruction loss (0.2 × VGG + 0.8 × MSE). Relative to the low-dose input image, the CNN reduced noise by 90%, reduced RMSE by 34%, and increased SSIM by 76% on six patient exams reserved for testing. The visualization of hepatic and bone lesions was greatly improved following noise reduction.
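A minimal sketch of a random-search loop over a hyperparameter space of the kind described above; the parameter ranges and the train_and_evaluate callable are illustrative assumptions, not the study's exact search space or training code.

```python
# Hedged sketch of random search over network hyperparameters.
# Ranges and the train_and_evaluate helper are illustrative assumptions.
import random

SEARCH_SPACE = {
    "architecture": ["unet", "resnet"],
    "n_layers": [2, 3, 4, 5],                 # pooling / residual depth
    "n_filters": [32, 64, 128],
    "kernel_size": [3, 5, 7],
    "loss": ["mse", "vgg", "0.2*vgg + 0.8*mse"],
}

def random_search(train_and_evaluate, n_trials=50, seed=0):
    """Return the best (score, config) found over n_trials random draws."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_trials):
        cfg = {k: rng.choice(v) for k, v in SEARCH_SPACE.items()}
        score = train_and_evaluate(cfg)   # e.g. validation SSIM (higher is better)
        results.append((score, cfg))
    return max(results, key=lambda r: r[0])
```

Ablation experiments can then perturb one parameter at a time around the best configuration returned by the search, as the study does with its 50 ablation runs.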