Attention mechanisms have been used more and more for medical image segmentation in recent works; however, they are not good at distinguishing categories in multi-category medical image segmentation tasks. In this paper, we propose a category feature reconstruction module (CFRM) for multi-category pathological image segmentation of pancreatic adenosquamous carcinoma. Whereas attention mechanisms enhance the features of the region of interest, the proposed CFRM focuses on the reconstruction of category features. The CFRM enables the network not only to highlight the features of the region of interest, but also to increase the discriminability between features of different categories. Based on U-Net, the proposed CFRM is added at the top of the encoding path. Compared with other state-of-the-art methods, both the Dice and IoU coefficients of the proposed method reach the best level on our pancreatic adenosquamous carcinoma segmentation dataset.
Recent studies have achieved great success in medical image segmentation, but they do not perform well in pathological image segmentation. In traditional segmentation networks, some important features may be lost during the encoding process. In this paper, an Enhanced Pooling-Convolution (EPC) module is proposed to weight the spatial and channel dimensions of features in the encoding process. EPC exploits the differences and complementarities among max pooling, average pooling, and convolution in the pooling process. Channel-based attention is further used to weight different channels. VGG16 is used as the backbone in the U-shaped network, and the number of channels for upsampling is reduced in the decoding process. We show that the pooling-and-convolution block with three consecutive convolution layers can be replaced with the EPC module. Experimental results show that the average Dice coefficient of our method is 2.55% higher than that of U-Net.
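The abstract does not spell out how the three parallel branches are weighted, so the following is a minimal PyTorch sketch of an EPC-style downsampling block under stated assumptions: max pooling, average pooling and a strided convolution computed in parallel, fused with learnable branch weights, then re-weighted per channel. The fusion scheme and the module name `EPCDownsample` are illustrative assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn

class EPCDownsample(nn.Module):
    """Sketch of an Enhanced Pooling-Convolution style downsampling block:
    max pool, average pool and a strided conv fused with learnable weights,
    followed by SE-style channel attention (assumed design)."""
    def __init__(self, channels, reduction=8):
        super().__init__()
        self.max_pool = nn.MaxPool2d(2)
        self.avg_pool = nn.AvgPool2d(2)
        self.conv = nn.Conv2d(channels, channels, kernel_size=3, stride=2, padding=1)
        # Learnable scalar weights for the three parallel branches.
        self.branch_weights = nn.Parameter(torch.ones(3))
        # Channel attention: squeeze (global pooling) then excite (two FC layers).
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):
        w = torch.softmax(self.branch_weights, dim=0)
        fused = w[0] * self.max_pool(x) + w[1] * self.avg_pool(x) + w[2] * self.conv(x)
        b, c, _, _ = fused.shape
        scale = self.fc(fused.mean(dim=(2, 3))).view(b, c, 1, 1)
        return fused * scale
```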
Meibomian gland dysfunction (MGD) is the main cause of dry eye. The degree of meibomian gland atrophy plays an important role in the clinical diagnosis of MGD. The automatic quantification of the meibomian gland area (MGA) and meibomian gland atrophy area (MGAA) is challenging due to blurred boundaries and various shapes. A U-shaped information fusion network (IF-Net) is proposed for the segmentation of MGA and MGAA in this paper. The contributions of this paper are as follows: (1) An information fusion (IF) module is designed to fuse context information from the spatial and channel dimensions respectively, which effectively reduces the loss of information caused by continuous downsampling. (2) A parallel path connection (PPC) is proposed and inserted into the skip connections. On one hand, it can suppress the noise in different levels of information; on the other hand, it can make up for the information that the original simple skip connection of U-Net lacks. Our proposed IF-Net has been evaluated on 505 infrared MG images from 300 subjects and achieves an average Dice similarity coefficient (DSC) of 84.81% and an average intersection over union (IoU) of 74.44% on MGAA segmentation, which indicates the primary effectiveness of the proposed method.
We propose to apply model-agnostic meta-learning (MAML) and MAML++ to pathology classification from optical coherence tomography (OCT) images. These meta-learning methods train a set of initialization parameters on training tasks, with which the model achieves fast convergence on new tasks with only a small amount of data. Our model is pretrained on an OCT dataset with seven types of retinal pathologies, and then refined and tested on another dataset with three types of pathologies. The classification accuracies of MAML and MAML++ reached 90.60% and 95.60% respectively, both higher than a traditional deep learning method with pretraining.
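For readers unfamiliar with MAML, here is a minimal sketch of one meta-training step: adapt a copy of the parameters on a task's support set, then evaluate the adapted parameters on its query set. It assumes PyTorch >= 2.0 for `torch.func.functional_call`, uses a single inner gradient step, and omits the per-layer learning rates and multi-step loss weighting that MAML++ adds; it is not the paper's exact training code.

```python
import torch
from torch import nn
from torch.func import functional_call  # requires PyTorch >= 2.0

def maml_step(model, loss_fn, support, query, inner_lr=0.01):
    """One MAML meta-step on a single task (sketch): one inner-loop
    adaptation step, second-order gradients via create_graph=True."""
    params = dict(model.named_parameters())
    xs, ys = support
    inner_loss = loss_fn(functional_call(model, params, (xs,)), ys)
    grads = torch.autograd.grad(inner_loss, tuple(params.values()), create_graph=True)
    # Gradient-descent update on a *copy* of the parameters.
    adapted = {name: p - inner_lr * g
               for (name, p), g in zip(params.items(), grads)}
    xq, yq = query
    return loss_fn(functional_call(model, adapted, (xq,)), yq)

# Outer loop (sketch): sum maml_step over a batch of tasks, then
# meta_optimizer.zero_grad(); total.backward(); meta_optimizer.step().
```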
Pancreatic Ductal Adenocarcinoma (PDAC) is one of the most common types of pancreatic cancer and one of the most malignant cancers, with an overall five-year survival rate of 5%. CT is the most important imaging examination method for pancreatic diseases, with high resolution. Due to the subtle texture changes of PDAC, single-phase pancreatic imaging is not sufficient to assist doctors in diagnosis, so dual-phase pancreatic imaging is recommended for better diagnosis of pancreatic disease. However, since manual labeling requires a lot of time and effort from experienced physicians, and dual-phase images are often not aligned and differ largely in texture, it is difficult to combine cross-phase images. Therefore, in this study, we aim to enhance automatic PDAC segmentation by integrating multi-phase images (i.e., arterial and venous phases) through transfer learning. We first transform images in the source domain into images in the target domain through CycleGAN. Second, we propose an uncertainty loss to aid training on pseudo target-domain images, using pseudo images of different qualities generated during CycleGAN training. Finally, a feature fusion block is designed to compensate for the loss of details caused by downsampling. Experimental results show that the proposed method obtains more accurate segmentation results than existing methods.
Deep convolutional neural networks (CNNs) have achieved great success in the segmentation of retinal optical coherence tomography (OCT) images. However, images acquired by different devices or imaging protocols have relatively large differences in noise level, contrast and resolution. As a result, the performance of CNNs tends to drop dramatically when tested on data with domain shifts. Unsupervised domain adaptation solves this problem by transferring knowledge from a domain with labels (source domain) to a domain without labels (target domain). This paper therefore proposes a two-stage domain adaptation algorithm for the segmentation of retinal OCT images. First, after image-level domain shift reduction, the segmenter is trained with a supervised loss on the source domain, together with an adversarial loss given by the discriminator to minimize the domain gap. Then, the target domain data with satisfactory pseudo labels, measured by entropy, are used to fine-tune the segmenter, which further improves the generalization ability of the model. Comprehensive experimental results on cross-domain choroid and retinoschisis segmentation demonstrate the effectiveness of this method. With domain adaptation, the Intersection over Union (IoU) is improved by 8.34% and 3.54% for the two tasks respectively.
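The entropy-based pseudo-label selection in the second stage can be sketched in a few lines. The threshold value and per-pixel (rather than per-image) granularity below are assumptions for illustration; the paper's exact selection criterion may differ.

```python
import torch
import torch.nn.functional as F

def select_confident_pseudo_labels(logits, entropy_threshold=0.5):
    """Keep pseudo labels only where predictive entropy is low (sketch).
    High-entropy pixels are marked -1 so the fine-tuning loss ignores them."""
    probs = F.softmax(logits, dim=1)                          # (B, C, H, W)
    entropy = -(probs * torch.log(probs + 1e-8)).sum(dim=1)   # (B, H, W)
    pseudo = probs.argmax(dim=1)                              # (B, H, W)
    pseudo[entropy > entropy_threshold] = -1                  # masked out
    return pseudo

# Fine-tuning then uses F.cross_entropy(logits, pseudo, ignore_index=-1).
```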
Age-related macular degeneration (AMD) is a common ophthalmic disease, mainly occurring in the elderly. The occurrence of pigment epithelial detachment (PED) further causes neuroepithelial detachment and subretinal fluid (SRF), and patients need follow-up treatment. Quantitative analysis of these two symptoms is very important for clinical diagnosis. Therefore, we propose a new joint segmentation network to accurately segment PED and SRF in this paper. Our main contributions are: (1) A new multi-scale information selection module is proposed. (2) Based on the U-shape network, a novel decoder branch is proposed to obtain boundary information, which is critical to segmentation. The experimental results show that our method achieves 72.97% for the average Dice similarity coefficient (DSC), 79.92% for the average recall, and 67.11% for the average intersection over union (IoU).
Branch retinal artery occlusion (BRAO) is an ophthalmic emergency, and acute BRAO is one of its clinical manifestations. Due to its various shapes and locations and its blurred boundary, the automatic segmentation of acute BRAO is very challenging. To tackle these problems, we propose a novel deep learning based method for automatic acute BRAO segmentation in optical coherence tomography (OCT) images. In this method, a novel Bayes posterior attention network, named BPANet, is proposed for precise segmentation of the lesion. Our major contributions include: (1) A novel Bayes posterior probability based spatial attention module is used to enhance the information of the lesion region. (2) An effective max-pooling and average-pooling channel attention module is embedded into BPANet to improve the effectiveness of feature extraction. The proposed method is evaluated on 472 OCT B-scan images with a 4-fold cross validation strategy. The mean and standard deviation of the Dice similarity coefficient, true positive rate, accuracy and intersection over union are 85.48±1.75%, 88.84±1.19%, 98.63±0.48% and 76.88±2.92%, respectively. The primary results show the effectiveness of the proposed method.
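A channel attention module driven jointly by max pooling and average pooling can be sketched as below, in the spirit of CBAM; the exact layout inside BPANet is not specified in the abstract, so the shared-MLP design here is an assumption for illustration.

```python
import torch
import torch.nn as nn

class MaxAvgChannelAttention(nn.Module):
    """CBAM-style channel attention (sketch): global max- and average-pooled
    descriptors pass through a shared MLP and are summed before the sigmoid."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels))

    def forward(self, x):
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))   # global average pooling branch
        mx = self.mlp(x.amax(dim=(2, 3)))    # global max pooling branch
        scale = torch.sigmoid(avg + mx).view(b, c, 1, 1)
        return x * scale
```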
Optical coherence tomography (OCT) is widely used in the diagnosis of retinal diseases. Reading OCT images and summarizing their insights is a routine yet time-consuming task, and automatic report generation can alleviate this issue. There are two major challenges in this task: (1) An OCT image may contain several fundus abnormalities, and it is difficult to detect them all simultaneously. (2) The diagnostic reports are complex and need to describe multiple lesions. In this paper, we propose a deep learning based model, named the VSTA model (Visual and Semantic Topic Attention model), which is able to generate a report from the input OCT image. Our major contributions include: (1) Semantic attention and visual attention are jointly embedded in the model to generate diagnosis reports with complex content. (2) Semantic tags based on image similarity are employed to initialize the semantic attention weights, which increases the prediction accuracy of the model. With the proposed VSTA model, the BLEU-4, CIDEr and ROUGE-L metrics reach 31.16, 264.22 and 52.58 respectively, better than some existing advanced methods.
Optical coherence tomography (OCT), a non-invasive high-resolution imaging technology for retinal tissues, has been widely used in the diagnosis of retinal diseases. However, the shortage of ophthalmologists and their heavy workloads cause great difficulties in screening for retinal diseases. Therefore, developing an accurate automatic diagnosis system for screening retinal diseases in OCT images is essential for their prevention and treatment. To this end, we propose a novel multi-view automatic aided diagnosis method for simultaneously screening multiple diseases in retinal OCT images. First, we collected 11,211 cases of 11 common retinal diseases from the ophthalmology clinic, each case including two OCT images acquired from different views. Then, to automatically and accurately screen diseases in retinal OCT images, a novel multi-view attention network is proposed based on the collected data. Finally, we conduct experiments on the collected clinical data to evaluate the performance of the proposed method. The proposed method achieves an AUC of 0.9023, which indicates its effectiveness.
High myopia has become a worldwide concern because of its increasing prevalence. Linear lesions are an important clinical sign in the pathological changes of high myopia. Indocyanine green angiography (ICGA) is considered the “ground truth” for the diagnosis of linear lesions, but it is invasive and may cause adverse reactions such as allergy, dizziness, and even shock in some patients. Therefore, it is urgent to find a non-invasive imaging modality to replace ICGA for the diagnosis of linear lesions. Multi-color scanning laser (MCSL) imaging is a non-invasive imaging technique that can reveal linear lesions more richly than other non-invasive imaging techniques, such as color fundus imaging and red-free fundus imaging, and some invasive ones, such as fundus fluorescein angiography (FFA). To the best of our knowledge, there are no studies focusing on linear lesion segmentation in MCSL images. In this paper, we propose a new U-shape based segmentation network with a multi-scale and global context fusion (SGCF) block, named SGCNet, to segment linear lesions in MCSL images. The multi-scale features and global context information extracted by the SGCF block are fused by learnable parameters to obtain richer high-level features. Four-fold cross validation was adopted to evaluate the performance of the proposed method on 86 MCSL images from 57 high myopia patients. The IoU, Dice, Sensitivity and Specificity coefficients are 0.494±0.109, 0.654±0.104, 0.676±0.131 and 0.998±0.002, respectively. The experimental results indicate the effectiveness of the proposed network.
In this paper, we propose a new module called cascaded multi-scale feature interaction (CMSI) for choroidal atrophy segmentation in fundus images. The proposed CMSI module makes full use of multi-scale features, using cascaded pooling and convolution to complete feature interactions at different scales and strip pooling to capture long-distance features. Based on the U-shape network, we use ResNet as the backbone to extract hierarchical feature representations, and the CMSI module is added at the top of the encoder path. Our main contributions are twofold: (1) The CMSI module is proposed for multi-scale feature ensembling by cascading multi-scale pooling and strip pooling. (2) The Dice coefficient of our proposed network on choroidal atrophy segmentation increases by 4.15% compared to U-Net.
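The strip-pooling idea used for long-distance features can be sketched as follows, following the general strip-pooling formulation (pool along one full axis, expand back, fuse); the layer sizes and the fusion by sigmoid gating are assumptions, not the exact CMSI design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class StripPooling(nn.Module):
    """Sketch of strip pooling for long-range context: H x 1 and 1 x W
    average pooling, 1-D convolutions along each strip, broadcast and fuse."""
    def __init__(self, channels):
        super().__init__()
        self.conv_h = nn.Conv2d(channels, channels, (3, 1), padding=(1, 0))
        self.conv_w = nn.Conv2d(channels, channels, (1, 3), padding=(0, 1))
        self.fuse = nn.Conv2d(channels, channels, 1)

    def forward(self, x):
        _, _, h, w = x.shape
        # Horizontal strip: average over width -> (B, C, H, 1), broadcast back.
        sh = self.conv_h(F.adaptive_avg_pool2d(x, (h, 1))).expand(-1, -1, h, w)
        # Vertical strip: average over height -> (B, C, 1, W), broadcast back.
        sw = self.conv_w(F.adaptive_avg_pool2d(x, (1, w))).expand(-1, -1, h, w)
        return x * torch.sigmoid(self.fuse(F.relu(sh + sw)))
```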
Diabetic retinopathy (DR) is the most common chronic complication of diabetes and the leading blinding eye disease in the working-age population. Hard exudates (HE) are an obvious symptom of DR; they have high reflectivity to light and appear as hyperreflective foci (HRF) in optical coherence tomography (OCT) images. Based on research on and improvement of U-Net, this paper proposes a self-adaptive network (SANet) for HRF segmentation. There are two main improvements in the proposed SANet: (1) To simplify the learning process and enhance gradient propagation, the ordinary convolution block in the encoder is replaced by a dual residual module (DRM). (2) A novel self-adaptive module (SAM) is embedded in the deep layers of the model, which enables the network to integrate local features and global dependencies adaptively and makes it adapt to the irregular shape of HRF. The dataset consists of 112 2D OCT B-scan images, verified by four-fold cross validation. The mean and standard deviation of the Dice similarity coefficient, Jaccard index, Sensitivity and Precision are 73.69±0.72%, 59.17±1.00%, 74.57±1.16% and 75.54±1.35%, respectively. The experimental results show that the proposed method can segment HRF successfully, with performance better than that of the original U-Net.
Retinal capillary non-perfusion (CNP) is one of the diabetic retinal vascular diseases. As the capillaries are occluded, blood stops flowing to certain regions of the retina, resulting in the formation of non-perfused regions. Accurate determination of the area and change of CNP is of great significance in the clinical judgment of the extent of vascular obstruction and the selection of treatment methods. This paper proposes a novel generative adversarial framework to segment non-perfusion regions in fundus fluorescein angiography images. The generator G is trained to produce “real” images, while an adversarially trained discriminator D is trained to do as well as possible at detecting “fake” images from the generator. In this paper, a U-shape network is used as the discriminator. Our method is validated on 138 clinical fundus fluorescein angiography images. Experimental results show that our method achieves more accurate segmentation results than state-of-the-art approaches.
Optical coherence tomography (OCT) is an imaging modality that is extensively used for ophthalmic diagnosis and treatment. OCT can help reveal disease-related alterations below the surface of the retina, such as retinal fluid, which can cause vision impairment. In this paper, we propose a novel context attention-and-fusion network (named CAF-Net) for multiclass retinal fluid segmentation, including intraretinal fluid (IRF), subretinal fluid (SRF) and pigment epithelial detachment (PED). To deal with the seriously uneven sizes and irregular distributions of different types of fluid, CAF-Net includes a context shrinkage encode (CSE) module and a context pyramid guide (CPG) module to extract and fuse global context information. The CSE module, embedded in the encoder path, can ignore redundant information and focus on useful information through a shrinkage function. The CPG module is inserted between the encoder and decoder and can dynamically fuse multi-scale information in high-level features. The proposed CAF-Net was evaluated on a public dataset from the RETOUCH Challenge in MICCAI 2017, which consists of 70 OCT volumes with three types of retinal fluid from three different types of devices. The average Dice similarity coefficient (DSC) and Intersection over Union (IoU) are 74.64% and 62.08%, respectively.
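The shrinkage function in the CSE module can be illustrated with a learned soft threshold that suppresses low-magnitude responses. The sketch below follows the deep residual shrinkage network formulation (threshold predicted from global statistics); this is an assumption for illustration, not the paper's exact module.

```python
import torch
import torch.nn as nn

class ContextShrinkage(nn.Module):
    """Sketch of a shrinkage operation: a learned channel-wise soft threshold
    zeroes out weak (redundant) responses while keeping salient ones."""
    def __init__(self, channels):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels), nn.ReLU(inplace=True),
            nn.Linear(channels, channels), nn.Sigmoid())

    def forward(self, x):
        b, c, _, _ = x.shape
        avg = x.abs().mean(dim=(2, 3))               # (B, C) global magnitude
        tau = (avg * self.fc(avg)).view(b, c, 1, 1)  # per-channel threshold
        # Soft thresholding: shrink toward zero, preserving the sign.
        return torch.sign(x) * torch.relu(x.abs() - tau)
```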
Glaucoma is a leading cause of irreversible blindness. Accurate optic disc (OD) and optic cup (OC) segmentation in fundus images is beneficial to glaucoma screening and diagnosis. Recently, convolutional neural networks have demonstrated promising progress in OD and OC joint segmentation in fundus images. However, the segmentation of the OC remains a challenge due to low contrast and blurred boundaries. In this paper, we propose an improved U-shape based network to jointly segment OD and OC. There are three main contributions: (1) Efficient channel attention (ECA) blocks are embedded into our proposed network to avoid dimensionality reduction and capture cross-channel interaction in an efficient way. (2) A multiplexed dilation convolution (MDC) module is proposed to extract more target features of various sizes and preserve more spatial information. (3) Three global context extraction (GCE) modules are used in our network. By introducing multiple GCE modules between the encoder and decoder, global semantic information from high-level stages can be gradually guided to different stages. The proposed method was tested on 240 fundus images. Compared with U-Net, Attention U-Net, Seg-Net and FCNs, the mean Dice similarity coefficients of the proposed method for OD and OC reach 96.20% and 90.00% respectively, better than the above networks.
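The ECA block referenced here is a published module (ECA-Net): global average pooling followed by a 1-D convolution across channels, avoiding the dimensionality reduction of SE blocks. A minimal sketch, with the kernel size fixed to 3 for simplicity rather than derived from the channel count as in the original paper:

```python
import torch
import torch.nn as nn

class ECABlock(nn.Module):
    """Efficient channel attention (sketch): squeeze with global average
    pooling, then a 1-D conv across the channel dimension, then sigmoid."""
    def __init__(self, k=3):
        super().__init__()
        self.conv = nn.Conv1d(1, 1, kernel_size=k, padding=k // 2, bias=False)

    def forward(self, x):
        b, c, _, _ = x.shape
        y = x.mean(dim=(2, 3)).view(b, 1, c)            # squeeze: (B, 1, C)
        y = torch.sigmoid(self.conv(y)).view(b, c, 1, 1)  # excite per channel
        return x * y
```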
Pathologic myopia (PM) is a major cause of legal blindness in the world. Linear lesions are closely related to PM and include two types of lesions in the posterior fundus of pathologic eyes in optical coherence tomography (OCT) images: retinal pigment epithelium-Bruch's membrane-choriocapillaris complex (RBCC) disruption and myopic stretch line (MSL). In this paper, a fully automated method based on a U-shape network is proposed to segment RBCC disruption and MSL in retinal OCT images. Compared with the original U-Net, there are two main improvements in the proposed network: (1) We propose a new downsampling module named the feature aggregation pooling module (FAPM), which aggregates context information and local information. (2) A deep supervision module (DSM) is adopted to help the network converge faster and improve the segmentation performance. The proposed method was evaluated via a 3-fold cross-validation strategy on a dataset composed of 667 2D OCT B-scan images. The mean Dice similarity coefficient, Sensitivity and Jaccard of RBCC disruption and MSL are 0.626, 0.665, 0.491 and 0.739, 0.814, 0.626, respectively. The primary experimental results show the effectiveness of our proposed method.
Retinopathy of prematurity (ROP) is the main cause of blindness in children worldwide. The severity of ROP can be reflected by staging, zoning and plus disease, and some studies have shown that zone recognition is more important than staging. However, due to subjective factors, ophthalmologists are often inconsistent in their recognition of zones from fundus images. Therefore, automated zone recognition of ROP is particularly important. In this paper, we propose a new ROP zone recognition network, in which a pre-trained DenseNet121 is taken as the backbone, and a proposed attention block named the Spatial and Channel Attention Block (SACAB) and a deep supervision strategy are introduced. Our main contributions are: (1) Demonstrating that a 2D convolutional neural network model pre-trained on natural images can be fine-tuned for automated zone recognition of ROP. (2) Based on the pre-trained DenseNet121, we propose two improved schemes that effectively integrate the attention mechanism and deep supervision learning for ROP zoning. The proposed method was evaluated on 662 retinal fundus images (82 zone I, 299 zone II, 281 zone III) from 148 examinations with a 5-fold cross validation strategy. The results show that the proposed ROP zone recognition network achieves 0.8852 for accuracy (ACC), 0.8850 for weighted F1 score (W_F1) and 0.8699 for kappa. The preliminary experimental results show the effectiveness of the proposed method.
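The backbone setup for this kind of fine-tuning is standard and can be sketched as follows, assuming a recent torchvision (>= 0.13 for the `weights` API). The SACAB attention block and deep supervision heads described in the paper are omitted; only the pre-trained DenseNet121 with a 3-way zone classifier is shown.

```python
import torch.nn as nn
from torchvision import models

# DenseNet121 pre-trained on ImageNet, classifier replaced by a 3-way head
# for zones I/II/III (sketch of the fine-tuning setup, not the full network).
backbone = models.densenet121(weights=models.DenseNet121_Weights.IMAGENET1K_V1)
backbone.classifier = nn.Linear(backbone.classifier.in_features, 3)
# Fine-tune end-to-end, typically with a smaller learning rate for the
# pre-trained layers; schedules and the 5-fold protocol follow the paper.
```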
Retinal detachment (RD) refers to the separation of the retinal neuroepithelium layer (RNE) and retinal pigment epithelium (RPE), while retinoschisis (RS) is characterized by the RNE splitting into multiple layers. Retinal detachment and retinoschisis are the main complications leading to vision loss in high myopia, and optical coherence tomography (OCT) is the main imaging method for observing them. This paper proposes a U-shaped convolutional neural network with a cross-fusion global feature module (CFCNN) to achieve automatic segmentation of retinal detachment and retinoschisis. Main contributions include: (1) A new cross-fusion global feature module (CFGF) is proposed. (2) The residual block is integrated into the encoder of the U-Net network to enhance the extraction of semantic information. The method was tested on a dataset consisting of 540 OCT B-scans. With the proposed CFCNN method, the mean Dice similarity coefficients of retinal detachment and retinoschisis segmentation reached 94.33% and 90.29% respectively, better than some existing advanced segmentation networks.
The assistance of deep learning techniques for clinical doctors in disease analysis, diagnosis and treatment is becoming more and more popular. In this paper, we propose a U-shape architecture based Group Attention network (named GANet) for symptom segmentation in fundus images with diabetic retinopathy, in which a Channel Group Attention (CGA) module and a Spatial Group Attention Upsampling (SGAU) module are designed. The CGA module can adaptively allocate resources based on the importance of the feature channels, which enhances the flexibility of the network in handling different types of information. The original U-Net directly merges high-level and low-level features in the decoder stage for semantic segmentation and achieves good results; to increase the nonlinearity of the U-shape network and pay more attention to the lesion area, we propose the SGAU module. In summary, our main contributions include two aspects: (1) Based on the U-shape network, the CGA and SGAU modules are designed and applied, which adaptively allocate channel weights and pay more attention to the lesion area, respectively. (2) Compared with the original U-Net, the Dice coefficients of the proposed network improve by nearly 2.96% for hard exudate segmentation and 2.89% for hemorrhage segmentation, respectively.
Retinopathy of prematurity (ROP) is an ocular disease occurring in premature babies and is considered one of the largest preventable causes of childhood blindness. However, insufficient ophthalmologists are qualified for ROP screening, especially in developing countries, so automated screening of ROP is particularly important. In this paper, we propose a new ROP screening network, in which a pre-trained ResNet18 is taken as the backbone, and a proposed attention block named the Complementary Residual Attention Block (CRAB) and a Squeeze-and-Excitation (SE) block as the channel attention module are introduced. Our main contributions are: (1) Demonstrating that a 2D convolutional neural network model pre-trained on natural images can be fine-tuned for ROP screening. (2) Based on the pre-trained ResNet18, we propose an improved scheme that effectively integrates the attention mechanism for ROP screening. The proposed classification network was evaluated on 9794 fundus images from 650 subjects, of which 8351 were randomly selected as the training set according to subjects and the others as the testing set. The results showed that the proposed ROP screening network achieved 99.17% for accuracy, 98.65% for precision, 98.31% for recall, 98.48% for F1 score and 99.84% for AUC. The preliminary experimental results show the effectiveness of the proposed method.
Accurate lung tumor delineation plays an important role in radiotherapy treatment planning. Since the lung tumor has poor boundaries in positron emission tomography (PET) images, it is a challenging task to segment it accurately. In addition, the heart, liver, bones and other tissues generally have gray values similar to the lung tumor, so the segmentation results usually have high false positives. In this paper, we propose a novel and efficient fully convolutional network with a trainable compressed sensing module and a deep supervision mechanism with sparse constraints to comprehensively address these challenges; we call it the fully convolutional network with sparse feature-map composition (SFC-FCN). Our SFC-FCN is able to conduct end-to-end learning and inference, compress redundant features within channels and extract key uncorrelated features. In addition, we use a deep supervision mechanism with sparse constraints to guide the feature extraction of the compressed sensing module. The mechanism is developed by deriving an objective function that directly guides the training of both lower and upper layers in the network. We have achieved more accurate segmentation results than state-of-the-art approaches, with a much faster speed and far fewer parameters.
Automatic segmentation of lungs with severe pathology plays a significant role in clinical applications and can save physicians' efforts in annotating lung anatomy. Since the lung has fuzzy boundaries in low-dose computed tomography (CT) images, and the tracheas and other tissues generally have gray values similar to the lung, it is a challenging task to segment the lung accurately. How to extract key features and remove background features is a core problem for lung segmentation. This paper introduces a novel approach for automatic segmentation of lungs in low-dose CT images. First, we propose a contrastive attention module, which generates a pair of foreground and background attention maps to guide feature learning of the lung and background separately. Second, a triplet loss is used on three feature vectors from different regions to pull the features from the full image and the lung region closer while pushing the features from the background away. Our method was validated on a clinical data set of 78 CT scans using a four-fold cross validation strategy. Experimental results showed that our method achieved more accurate segmentation results than state-of-the-art approaches.
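The triplet objective described here maps directly onto PyTorch's built-in triplet margin loss. A minimal sketch follows; which vector serves as anchor versus positive is an assumption, as the abstract only states which pairs are pulled together and pushed apart.

```python
import torch.nn.functional as F

def region_triplet_loss(f_full, f_lung, f_background, margin=1.0):
    """Sketch: pull full-image and lung-region features together, push
    background features away (assumed anchor/positive assignment)."""
    return F.triplet_margin_loss(
        anchor=f_full,          # features pooled over the full image
        positive=f_lung,        # features pooled over the lung region
        negative=f_background,  # features pooled over the background
        margin=margin)
```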
Choroidal neovascularization (CNV) refers to abnormal choroidal vessels that grow through Bruch's membrane to the bottom of the retinal pigment epithelium (RPE) or retinal neuroepithelium (RNE) layer, which is the pathological characterization of age-related macular degeneration (AMD) and pathological myopia (PM). Nowadays, optical coherence tomography (OCT) is an important imaging modality for observing CNV. This paper proposes a convolutional neural network with differential amplification blocks (DACNN) to segment CNV in OCT images. There are two main contributions: (1) A differential amplification block (DAB) is proposed to extract the contrast information of foreground and background. (2) The DAB is integrated into a U-shaped convolutional neural network for CNV segmentation. The proposed method was verified on a dataset composed of 886 OCT B-scans. Compared with manual segmentation, the mean Dice similarity coefficient reaches 86.40%, outperforming some existing deep segmentation networks.
Accurate segmentation of pigment epithelial detachment (PED) in retinal optical coherence tomography (OCT) images can help doctors comprehensively analyze and diagnose chorioretinal diseases, such as age-related macular degeneration (AMD), central serous chorioretinopathy and polypoidal choroidal vasculopathy. Due to the severely uneven sizes of PED, some traditional algorithms and common deep networks do not perform well in PED segmentation. In this paper, we propose a novel attention multi-scale network (named AM-Net) based on a U-shape network to segment PED in OCT images. Compared with the original U-Net, there are two main improvements in the proposed method: (1) A channel multi-scale module (CMM) is designed to replace the skip-connection layer of U-Net, which uses a channel attention mechanism to obtain multi-scale information. (2) A spatial multi-scale module (SMM) based on dilated convolution is designed and inserted in the decoder path to make the network pay more attention to multi-scale spatial information. We evaluated the proposed AM-Net on 240 clinically obtained OCT B-scans with 4-fold cross validation. The mean and standard deviation of Intersection over Union (IoU), Dice Similarity Coefficient (DSC), Sensitivity (Sen) and Specificity (Spe) are 72.12±9.60%, 79.17±8.25%, 93.05±1.72% and 79.93±5.77%, respectively.
The prevalence of myopia is rapidly increasing worldwide. As myopia deepens, various pathological changes of the retina can occur, such as choroidal atrophy and choroidal neovascularization. In this paper, a U-Net based deep network is proposed to automatically segment choroidal atrophy in fundus images. We use U-Net as the main structure, which can learn rich hierarchical feature representations. In the decoder path, a Squeeze-and-Excitation (SE) block is employed before each deconvolution to adaptively recalibrate channel feature responses. We introduce a deep supervision mechanism and merge all the early prediction maps to obtain the final prediction map. To ensure the smoothness of segmentation results, we propose a new loss function, termed EDT-auxiliary-loss (Euclidean distance transformation auxiliary loss), which consists of a Dice loss on the ground truth and a mean square error (MSE) loss on the distance map. Another strategy for performance improvement is utilizing the information of the optic disc (OD), which is usually adjacent to the atrophy. The proposed method was evaluated on the ISBI 2019 Pathologic Myopia Challenge dataset, which consists of 400 fundus images from 161 normal eyes, 26 high myopia eyes and 213 pathologic myopia eyes. The proposed network was validated with four-fold cross validation. The experimental results show that the proposed method can successfully segment choroidal atrophy and achieves better performance than the traditional U-Net.
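The EDT-auxiliary-loss combines the two terms named in the abstract; a minimal sketch is below. The two-head design (a mask prediction and a distance-map prediction) and the equal weighting of the terms are assumptions for illustration.

```python
import torch
import torch.nn.functional as F
from scipy.ndimage import distance_transform_edt

def edt_auxiliary_loss(pred_mask, pred_dist, gt_mask, eps=1e-6):
    """Sketch of an EDT auxiliary loss: Dice loss on the predicted mask plus
    MSE between a predicted distance map and the Euclidean distance
    transform of the ground-truth mask (assumed two-head, equal weights)."""
    # Dice loss against the ground-truth mask (pred_mask: probabilities).
    inter = (pred_mask * gt_mask).sum()
    dice = 1 - (2 * inter + eps) / (pred_mask.sum() + gt_mask.sum() + eps)
    # MSE against the distance transform of the ground truth.
    gt_dist = distance_transform_edt(gt_mask.cpu().numpy())
    gt_dist = torch.from_numpy(gt_dist).float().to(pred_dist.device)
    return dice + F.mse_loss(pred_dist, gt_dist)
```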
Diabetic retinopathy (DR), a highly specific vascular complication caused by diabetes, has been found to be a major cause of blindness in the world. Early screening of DR is crucial for the prevention of vision loss. Hard exudates (HEs) are one of the main manifestations of DR, characterized by hyper-reflective foci (HF) in retinal optical coherence tomography (OCT) images. In this paper, a fully automated method based on a U-shape network is proposed to segment HF in retinal OCT images. Compared with the original U-Net, there are two main improvements in the proposed network: (1) The ordinary 3×3 convolution is replaced by multi-scale convolution based on dilated convolution, which can achieve adaptive receptive fields of the images. (2) To ignore irrelevant information and focus on key information in the channels, a channel attention module is embedded in the model. A dataset consisting of 112 2D OCT B-scan images was used to evaluate the proposed U-shape network for HF segmentation with 4-fold cross validation. The mean and standard deviation of the Dice similarity coefficient, recall and precision are 73.26±2.03%, 75.71±1.98% and 74.28±2.67%, respectively. The experimental results show the effectiveness of the proposed method.
The choroid is an important structure of the eye, and the choroid thickness distribution estimated from optical coherence tomography (OCT) images plays a vital role in the analysis of many retinal diseases. This paper proposes a novel group-wise attention fusion network (referred to as GAF-Net) to segment the choroid layer, which works effectively for both normal and pathological myopia retinas. Currently, most networks perform unified processing of all feature maps in the same layer, which leads to unsatisfactory choroid segmentation results. To improve this, GAF-Net proposes a group-wise channel module (GCM) and a group-wise spatial module (GSM) to fuse group-wise information. The GCM uses channel information to guide the fusion of group-wise context information, while the GSM uses spatial information to guide the fusion of group-wise context information. Furthermore, we adopt a joint loss to address data imbalance and the uneven choroid target area. Experimental evaluations on a dataset composed of 1650 clinically obtained B-scans show that the proposed GAF-Net can achieve a Dice similarity coefficient of 95.21±0.73%.
Dermoscopy is a non-invasive dermatological imaging technique widely used in dermatology clinics. To screen and detect melanoma automatically, skin lesion segmentation in dermoscopy images is of great significance. In this paper, we propose an adaptive scale network (ASNet) for skin lesion segmentation in dermoscopy images. A ResNet34 with pretrained weights is applied as the encoder to extract more representative features. A novel adaptive scale module, which can self-learn based on a spatial attention mechanism, is designed and inserted at the top of the encoder path to dynamically fuse multi-scale information. Our proposed method is 5-fold cross-validated on a public dataset from the Lesion Boundary Segmentation Challenge in ISIC-2018, which includes 2594 images of different types of skin lesions with different resolutions. The Jaccard coefficient, Dice coefficient and Accuracy are 82.15±0.328%, 88.88±0.390% and 96.00±0.228%, respectively. Experimental results show the effectiveness of the proposed ASNet.
To enable further and more accurate automatic analysis and processing of optical coherence tomography (OCT) images, such as layer segmentation, disease region segmentation and registration, it is necessary to screen OCT images first. In this paper, we propose an efficient multi-class 3D retinal OCT image classification network named VinceptionC3D. VinceptionC3D is a 3D convolutional neural network improved from the basic C3D by adding improved 3D inception modules. Our main contributions are: (1) Demonstrating that a C3D pretrained on natural action video datasets can be fine-tuned for the classification of 3D retinal OCT images; (2) Improving the network with 3D inception modules, which capture multi-scale features. The proposed method is trained and tested on 873 3D OCT images of 6 classes. The average accuracies of the C3D with randomly initialized weights, the C3D with pre-trained weights, and the proposed VinceptionC3D with pre-trained weights are 89.35%, 92.09% and 94.04%, respectively. The results show that the proposed VinceptionC3D is effective for 6-class 3D retinal OCT image classification.
Changes in the thickness and volume of the choroid, which can be observed and quantified from optical coherence tomography (OCT) images, are a feature of many retinal diseases, such as age-related macular degeneration and myopic maculopathy. In this paper, we make purposeful improvements on the U-Net for segmenting the choroid of either normal or pathological myopia retinas, obtaining the Bruch's membrane (BM) and the choroidal-scleral interface (CSI). There are two main improvements to the U-Net framework: (1) A refinement residual block (RRB) is added to the back of each encoder, which strengthens the recognition ability of each stage; (2) The channel attention block (CAB) is integrated with the U-Net, which enables high-level semantic information to guide the underlying details and handles the intra-class inconsistency problem. We validated our improved network on a dataset consisting of 952 OCT B-scans obtained from 95 eyes of both normal subjects and patients suffering from pathological myopia. Compared with manual segmentation, the mean choroid thickness difference is 8 μm, and the mean Dice similarity coefficient is 85.0%.
Accurate lung segmentation is of great significance in clinical applications. However, it is still a challenging task due to complex structures, pathological changes, individual differences and low image quality. In this paper, a novel shape dictionary-based approach, named the active shape dictionary, is introduced to automatically delineate pathological lungs from clinical 3D CT images. The active shape dictionary improves sparse shape composition in eigenvector space to effectively reduce local shape reconstruction error. The proposed framework iteratively deforms the shape model toward the target boundary, with discriminative appearance dictionary learning and gradient vector flow driving the landmarks. The proposed algorithm was tested on 40 3D low-dose CT images with lung tumors. Compared to state-of-the-art methods, the proposed approach can robustly and accurately detect pathological lung surfaces.
Data imbalance is a classic problem in image classification, especially for medical images, where normal data far outnumber diseased data. To make up for the absence of disease images, we investigate methods that can generate retinal OCT images with diseases from normal retinal images. Conditional GANs (cGANs) have shown significant success in natural image generation, but their applications to medical images are limited. In this work, we propose an end-to-end framework for OCT image generation based on the cGAN. A structural similarity index (SSIM) loss is introduced so that the model takes structure-related details into consideration. In experiments, three kinds of retinal disease images are generated. The generated images preserve the natural structure of the retina and are thus visually appealing. The method is further validated by testing the performance of a classifier trained on the generated images.
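An SSIM loss term can be added to the usual cGAN generator objective as sketched below. This is a simplified version that uses uniform windows via average pooling instead of the Gaussian windows of the original SSIM definition, and it assumes inputs scaled to [0, 1]; the paper's exact formulation may differ.

```python
import torch.nn.functional as F

def ssim_loss(x, y, c1=0.01 ** 2, c2=0.03 ** 2, window=11):
    """Simplified SSIM loss (1 - mean local SSIM) with uniform windows,
    computed from local means, variances and covariance via avg pooling."""
    pad = window // 2
    mu_x = F.avg_pool2d(x, window, stride=1, padding=pad)
    mu_y = F.avg_pool2d(y, window, stride=1, padding=pad)
    var_x = F.avg_pool2d(x * x, window, 1, pad) - mu_x ** 2
    var_y = F.avg_pool2d(y * y, window, 1, pad) - mu_y ** 2
    cov = F.avg_pool2d(x * y, window, 1, pad) - mu_x * mu_y
    ssim = ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / \
           ((mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))
    return 1 - ssim.mean()

# Generator objective (sketch): adversarial term + lambda * ssim_loss(fake, real).
```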
The recent introduction of next-generation spectral optical coherence tomography (OCT) has become increasingly important in the detection and investigation of retinal diseases. However, the patient's unstable eye position makes tracking disease progression over a short period difficult. This paper proposes a method to remove the eye position difference in longitudinal retinal OCT data. In the proposed method, pre-processing is first applied to obtain the projection image. Then, a vessel enhancement filter is applied to detect vessel shadows. Third, the SURF algorithm is used to extract feature points and the RANSAC algorithm is used to remove outliers. Finally, the transform parameters are estimated and the longitudinal OCT data are registered. Simulation results show that the proposed method is accurate.
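The SURF-plus-RANSAC stage maps onto standard OpenCV calls; a sketch under stated assumptions follows. `proj_fixed` and `proj_moving` are hypothetical variables holding the 8-bit projection images from the pre-processing step; SURF lives in the opencv-contrib package (`cv2.xfeatures2d`), with ORB as a free drop-in alternative, and the paper's exact transform model is not specified here.

```python
import cv2
import numpy as np

# Detect and describe SURF keypoints on both projection images.
surf = cv2.xfeatures2d.SURF_create(hessianThreshold=400)  # needs opencv-contrib
kp1, des1 = surf.detectAndCompute(proj_fixed, None)   # proj_fixed, proj_moving:
kp2, des2 = surf.detectAndCompute(proj_moving, None)  # assumed 8-bit projections

# Brute-force matching with cross-checking.
matches = cv2.BFMatcher(cv2.NORM_L2, crossCheck=True).match(des1, des2)
pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])

# RANSAC rejects outlier matches while estimating a similarity transform.
M, inliers = cv2.estimateAffinePartial2D(pts2, pts1, method=cv2.RANSAC,
                                         ransacReprojThreshold=3.0)
registered = cv2.warpAffine(proj_moving, M, proj_fixed.shape[::-1])
```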
The processing and analysis of retinal fundus images is widely studied because many ocular fundus diseases, such as diabetic retinopathy and hypertensive retinopathy, can be diagnosed and treated based on the corresponding analysis results. The shape, border, size and pathological depression of the optic disc (OD), the main anatomical structure of the ocular fundus, are very important auxiliary parameters for the diagnosis of fundus diseases, so precise localization and segmentation of the OD is important. Considering the excellent performance of deep learning in object detection and localization, an automatic OD localization and segmentation algorithm based on Faster R-CNN and a shape-constrained level set is presented in this paper. First, a Faster R-CNN+ZF model is used to locate the OD via a bounding box (B-box). Second, the main blood vessels in the B-box are removed using the Hessian matrix if necessary. Finally, a shape-constrained level set algorithm is used to segment the boundary of the OD. The localization algorithm was trained on 4000 images selected from Kaggle and tested on the MESSIDOR database. For OD localization, a mean average precision (mAP) of 99.9% was achieved, with an average time of 0.21 s per image. The segmentation algorithm was tested on 120 images randomly selected from the MESSIDOR database, achieving an average matching score of 85.4%.
Cystoid macular edema (CME) and macular hole (MH) are leading causes of visual loss in retinal diseases. The volume of CMEs can be an accurate predictor of visual prognosis. This paper presents an automatic method to segment CMEs from the abnormal retina with coexisting MH in three-dimensional optical coherence tomography images. The proposed framework consists of preprocessing and CME segmentation. The preprocessing part includes denoising, intraretinal layer segmentation and flattening, and MH and vessel silhouette exclusion. In the CME segmentation, a three-step strategy is applied. First, an AdaBoost classifier trained with 57 features is employed to generate the initialization results. Second, an automated shape-constrained graph cut algorithm is applied to obtain the refined results. Finally, cyst area information is used to remove false positives (FPs). The method was evaluated on 19 eyes with coexisting CMEs and MH from 18 subjects. The true positive volume fraction, FP volume fraction, Dice similarity coefficient, and accuracy rate for CME segmentation were 81.0%±7.8%, 0.80%±0.63%, 80.9%±5.7%, and 99.7%±0.1%, respectively.
Optical coherence tomography (OCT) has been widely applied in the examination and diagnosis of corneal diseases, but the information directly obtained from OCT images by manual inspection is limited. We propose an automatic processing method to assist ophthalmologists in locating the boundaries in corneal OCT images and analyzing the recovery of corneal wounds after treatment from longitudinal OCT images. It includes the following steps: preprocessing, epithelium and endothelium boundary segmentation and correction, wound detection, corneal boundary fitting and wound analysis. The method was tested on a data set of longitudinal corneal OCT images from 20 subjects. Each subject has five images acquired after corneal operation over a period of time. The segmentation and classification accuracy of the proposed algorithm is high, and it can be used for analyzing wound recovery after corneal surgery.
In most cases, a pituitary tumor compresses the optic chiasma and causes optic nerve atrophy, which is reflected in the retina. In this paper, an AdaBoost classification based method is proposed for the first time to screen for pituitary tumor from retinal spectral-domain optical coherence tomography (SD-OCT) images. The method includes four parts: pre-processing, feature extraction and selection, training and testing. First, in the pre-processing step, the retinal OCT image is segmented into 10 layers and the first 5 layers are extracted as our volume of interest (VOI). Second, 19 textural and spatial features are extracted from the VOI, and principal component analysis (PCA) is utilized to select the primary features. Third, in the training step, an AdaBoost based classifier is trained using the above features. Finally, in the testing phase, the trained model is utilized to screen for pituitary tumor. The proposed method was evaluated on 40 retinal OCT images from 30 patients and 30 OCT images from 15 normal subjects. The accuracy rate for the diseased retinas was (85.00±16.58)% and the rate for normal retinas was (76.68±21.34)%. The overall average accuracy of the AdaBoost classifier was (81.43±9.15)%. The preliminary results demonstrated the feasibility of the proposed method.
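The feature-selection-plus-classification stage corresponds to a standard scikit-learn pipeline; a sketch follows. `X_train`/`X_test` are assumed (n_images, 19) feature matrices, and the number of retained components and estimator count are illustrative assumptions, not the paper's values.

```python
from sklearn.decomposition import PCA
from sklearn.ensemble import AdaBoostClassifier
from sklearn.pipeline import make_pipeline

# PCA reduces the 19 textural/spatial features before AdaBoost (sketch).
clf = make_pipeline(
    PCA(n_components=10),            # assumed number of components
    AdaBoostClassifier(n_estimators=100))
clf.fit(X_train, y_train)            # X_train: (n_images, 19) features
accuracy = clf.score(X_test, y_test)
```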
In this paper, we propose a 3D registration method for retinal optical coherence tomography (OCT) volumes. The proposed method consists of five main steps. First, a projection image of the 3D OCT scan is created. Second, a vessel enhancement filter is applied on the projection image to detect vessel shadows. Third, landmark points are extracted based on both vessel positions and layer information. Fourth, the coherent point drift method is used to align retinal OCT volumes. Finally, a nonrigid B-spline-based registration method is applied to find the optimal transform to match the data. We applied this registration method on 15 3D OCT scans of patients with choroidal neovascularization (CNV). The Dice coefficients (DSC) between layers are greatly improved after applying the nonrigid registration.
In this paper, a novel approach combining the active appearance model (AAM) and graph search is proposed to segment retinal layers in optic nerve head (ONH) centered optical coherence tomography (OCT) images. The method includes two parts: preprocessing and layer segmentation. During the preprocessing phase, images are first filtered for denoising, then the B-scans are flattened. During layer segmentation, the AAM is first used to obtain coarse segmentation results. Then a multi-resolution GS-AAM algorithm is applied to further refine the results, in which the AAM is efficiently integrated into the graph search segmentation process. The proposed method was tested on a dataset which contained 11 3-D SD-OCT images, and compared to the manual tracings of two observers on all the volumetric scans. The overall mean border positioning error for layer segmentation was found to be 7.09±6.18 μm for normal subjects, comparable to the results of the traditional graph search method (8.03±10.47 μm) and the mean inter-observer variability (6.35±6.93 μm). The preliminary results demonstrated the feasibility and efficiency of the proposed method.
Choroid thickness and volume estimated from optical coherence tomography (OCT) images have emerged as important metrics in disease management. This paper presents an automated three-dimensional (3-D) method for segmenting the choroid from 1-μm wide-view swept source OCT image volumes, including the Bruch's membrane (BM) and the choroidal-scleral interface (CSI) segmentation. Two auxiliary boundaries are first detected by modified Canny operators and then the optic nerve head is detected and removed. The BM and the initial CSI segmentation are achieved by 3-D multiresolution graph search with gradient-based cost. The CSI is further refined by adding a regional cost, calculated from the wavelet-based gradual intensity distance. The segmentation accuracy is quantitatively evaluated on 32 normal eyes by comparing with manual segmentation and by a reproducibility test. The mean choroid thickness difference from the manual segmentation is 19.16±4.32 μm, the mean Dice similarity coefficient is 93.17±1.30%, and the correlation coefficients between fovea-centered volumes obtained on repeated scans are larger than 0.97.
In this paper, an automatic method is proposed to recognize the liver in clinical 3D CT images. The proposed method effectively uses a statistical shape model of the liver. Our approach consists of three main parts: (1) model training, in which shape variability is detected using principal component analysis from the manual annotations; (2) model localization, in which a fast Euclidean distance transformation based method localizes the liver in CT images; (3) liver recognition, in which the initial mesh is locally and iteratively adapted to the liver boundary, constrained by the trained shape model. We validated our algorithm on a dataset consisting of 20 3D CT images obtained from different patients. The average ARVD was 8.99%, the average ASSD was 2.69 mm, the average RMSD was 4.92 mm, the average MSD was 28.841 mm, and the average MSD was 13.31%.
Choroid neovascularization (CNV) is a pathology of the choroid, and CNV-related disease is an important cause of vision loss. It is desirable to predict the CNV growth rate so that appropriate treatment can be planned. In this paper, we seek a method to predict the growth of CNV based on 3D longitudinal Optical Coherence Tomography (OCT) images, and propose a reaction-diffusion model for the prediction. The method consists of four phases: pre-processing, meshing, CNV growth modeling and prediction. We apply the reaction-diffusion model not only to the disease region but also to the surrounding tissues, including the outer retinal layer, inner retinal layer and choroid layer; the diffusion in these tissues is considered isotropic. The finite element method (FEM) is used to solve the partial differential equations (PDEs) of the diffusion model. The curve of CNV growth under treatment is fitted, and then the CNV status at a future time point can be predicted. The preliminary results demonstrated the accuracy of the proposed method and the validity and feasibility of the model.
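To make the reaction-diffusion idea concrete, here is one explicit time step of a common form of such growth models, du/dt = D∇²u + ρu(1−u), on a 2-D grid. This is a simplified sketch: the paper solves the PDE with FEM on a mesh over the retinal and choroidal tissues with tissue-specific isotropic diffusivities, whereas this sketch uses finite differences with periodic boundaries for brevity.

```python
import numpy as np

def reaction_diffusion_step(u, D, rho, dt=0.1, dx=1.0):
    """One explicit finite-difference step of du/dt = D*lap(u) + rho*u*(1-u).
    u: 2-D tumor-density grid; D, rho: diffusivity and proliferation rate
    (scalar or per-cell arrays). Periodic boundaries via np.roll (simplification)."""
    lap = (np.roll(u, 1, 0) + np.roll(u, -1, 0) +
           np.roll(u, 1, 1) + np.roll(u, -1, 1) - 4 * u) / dx ** 2
    return u + dt * (D * lap + rho * u * (1 - u))
```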
In this paper, we propose a method to automatically segment and count rhesus choroid-retinal vascular endothelial cells (RF/6A) in fluorescence microscopy images based on shape classification, bottleneck detection and an accelerated Dijkstra algorithm. The proposed method includes four main steps. First, a thresholding filter and morphological operations are applied to reduce the noise. Second, a shape classifier is used to decide whether a connected component needs to be segmented; in this step, an AdaBoost classifier is applied with a set of shape features. Third, the bottleneck positions are found based on the contours of the connected components. Finally, cell segmentation and counting are completed using the accelerated Dijkstra algorithm with the gradient information between the bottleneck positions. The results show the feasibility and efficiency of the proposed method.
In this paper, a fully automatic method is proposed to segment the lung tumor in clinical 3D PET-CT images. The proposed method effectively combines PET and CT information to make full use of the high contrast of PET images and the superior spatial resolution of CT images. Our approach consists of three main parts: (1) initial segmentation, in which spines are removed in CT images and initial connected regions are obtained by threshold-based segmentation in PET images; (2) coarse segmentation, in which a monotonic downhill function is applied to rule out structures which have standardized uptake values (SUV) similar to the lung tumor but do not satisfy a monotonic property in PET images; (3) fine segmentation, in which a random forest method is applied to accurately segment the lung tumor by extracting effective features from PET and CT images simultaneously. We validated our algorithm on a dataset consisting of 24 3D PET-CT images from different patients with non-small cell lung cancer (NSCLC). The average TPVF, FPVF and accuracy rate (ACC) were 83.65%, 0.05% and 99.93%, respectively. The correlation analysis shows that our segmented lung tumor volumes have a strong correlation (average 0.985) with ground truth 1 and ground truth 2 labeled by a clinical expert.
Branch retinal artery occlusion (BRAO) is an ocular emergency which can lead to blindness. Quantitative analysis of the BRAO region in the retina is needed to assess the severity of retinal ischemia. In this paper, a fully automatic framework is proposed to classify and segment BRAO based on 3D spectral-domain optical coherence tomography (SD-OCT) images. To the best of our knowledge, this is the first automatic 3D BRAO segmentation framework. First, a support vector machine (SVM) based classifier is designed to differentiate BRAO into acute phase and chronic phase, and the two types are segmented separately. To segment BRAO in the chronic phase, a threshold-based method is proposed based on the thickness of the inner retina, while for segmenting BRAO in the acute phase, a two-step segmentation is performed, which includes a Bayesian posterior probability based initialization and a graph-search-graph-cut based segmentation. The proposed method was tested on SD-OCT images of 23 patients (12 acute and 11 chronic phase) using a leave-one-out strategy. The overall classification accuracy of the SVM classifier was 87.0%, and the TPVF and FPVF were 91.1% and 5.5% for the acute phase, and 90.5% and 8.7% for the chronic phase, respectively.
Positron Emission Tomography (PET) and Computed Tomography (CT) have been widely used in clinical practice for radiation therapy. Most existing methods use only one image modality, either PET or CT, which suffers from the low spatial resolution of PET or the low contrast of CT. In this paper, a novel 3D graph cut method is proposed, which integrates Gaussian Mixture Models (GMMs) into the graph cut method. We also employ the random walk method as an initialization step to provide object seeds for improving the graph cut based segmentation of PET and CT images. The constructed graph consists of two sub-graphs and a special link between the sub-graphs which penalizes segmentation differences between the two modalities. Finally, the segmentation problem is solved by the max-flow/min-cut method. The proposed method was tested on 20 patients' PET-CT images, and the experimental results demonstrated the accuracy and efficiency of the proposed algorithm.
In this paper, we propose a method based on the Freeman chain code to automatically segment and count rhesus choroid-retinal vascular endothelial cells (RF/6A) in fluorescence microscopy images. The proposed method consists of four main steps. First, a threshold filter and morphological transforms are applied to reduce the noise. Second, the boundary information is used to generate the Freeman chain codes. Third, the concave points are found based on the relationship between the difference of the chain code and the curvature. Finally, cell segmentation and counting are completed based on the number of concave points and the area and shape of the cells. The proposed method was tested on 100 fluorescence microscopy cell images; the average true positive rate (TPR) is 98.13% and the average false positive rate (FPR) is 4.47%. The preliminary results showed the feasibility and efficiency of the proposed method.
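The Freeman chain code itself is a standard encoding and can be sketched in a few lines: each step along an 8-connected boundary is mapped to a direction code, and concave points are then sought where consecutive codes change direction sharply. The threshold on the direction change is an assumption, not the paper's value.

```python
# Freeman 8-connectivity codes: index = code, value = (row, col) offset.
OFFSETS = [(0, 1), (-1, 1), (-1, 0), (-1, -1),
           (0, -1), (1, -1), (1, 0), (1, 1)]

def freeman_chain_code(contour):
    """Encode an ordered boundary (list of (row, col) tuples, each point
    8-adjacent to the next) as a Freeman chain code."""
    codes = []
    for (r0, c0), (r1, c1) in zip(contour, contour[1:]):
        codes.append(OFFSETS.index((r1 - r0, c1 - c0)))
    return codes

# Usage sketch: codes = freeman_chain_code(boundary_points)
# Direction change between consecutive codes, wrapped to [-4, 4):
#   diffs = [((b - a + 4) % 8) - 4 for a, b in zip(codes, codes[1:])]
# Large |diff| values flag candidate concave points.
```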
In this paper, we sought a method to automatically detect the Inner Segment/Outer Segment (IS/OS) disruption region. A novel support vector machine (SVM) based method is proposed for IS/OS disruption detection. The method includes two parts: training and testing. During the training phase, 7 features from the region around the fovea are calculated, and an SVM is utilized as the classifier. In the testing phase, the trained model is utilized to classify the disruption and non-disruption regions of the IS/OS, and the accuracy is calculated separately. The proposed method was tested on 9 patients' SD-OCT images using a leave-one-out strategy. The preliminary results demonstrated the feasibility and efficiency of the proposed method.