Ensemble of deep convolutional neural networks using multi-spectral vision transformer for diabetic macular edema classification from multicolor image
10 October 2023
Min Cui, Xinke Gao, Jianyi Song, Ping Zhang
Proceedings Volume 12799, Third International Conference on Advanced Algorithms and Signal Image Processing (AASIP 2023); 127995S (2023) https://doi.org/10.1117/12.3005817
Event: 3rd International Conference on Advanced Algorithms and Signal Image Processing (AASIP 2023), 2023, Kuala Lumpur, Malaysia
Abstract
Diabetic macular edema (DME) is a prevalent, vision-threatening condition. Early and accurate detection of DME, coupled with appropriate treatment, can reduce patients' risk of blindness. Multicolor images (MCIs) facilitate the diagnosis of DME by providing images of retinal structures at multiple spectra. Recent deep learning methods for multimodal analysis have demonstrated superior performance over manual analysis. However, existing techniques for DME classification still suffer from low accuracy because they make insufficient use of the properties of MCIs. To address this issue, we propose a Multi-Spectral Vision Transformer (MSVT) model. The model first resizes the images from each spectrum and splits them into patches. The patches are linearly projected into embeddings, to which positional embeddings are added. A Transformer encoder then extracts features. The features of each spectrum are processed by a multilayer perceptron (MLP) head and then combined by a feature fusion module. Finally, the fused features are compressed and fed into a classifier. We evaluate the proposed model on our in-house datasets. The classifier achieves an accuracy of 0.935, a sensitivity of 0.912, a specificity of 0.924, and an AUC of 0.938 in predicting the DME status of MCIs.
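The front end of the pipeline described in the abstract (patch splitting, linear projection, and positional embeddings, applied per spectrum) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the image size, patch size, embedding dimension, spectrum names, and the randomly initialized weights are all assumptions chosen for illustration, and the Transformer encoder, MLP heads, fusion module, and classifier are omitted because the abstract does not specify them.

```python
import numpy as np


def split_into_patches(image, patch_size):
    """Split an (H, W) single-spectrum image into flattened, non-overlapping patches."""
    height, width = image.shape
    if height % patch_size or width % patch_size:
        raise ValueError("image sides must be multiples of patch_size")
    rows, cols = height // patch_size, width // patch_size
    # (rows, patch, cols, patch) -> (rows, cols, patch, patch) -> (rows*cols, patch*patch)
    patches = image.reshape(rows, patch_size, cols, patch_size).transpose(0, 2, 1, 3)
    return patches.reshape(rows * cols, patch_size * patch_size)


def embed_patches(patches, projection, positional):
    """Project flattened patches linearly, then add one positional embedding per patch."""
    return patches @ projection + positional


# Hypothetical settings; the abstract gives no sizes.
IMAGE_SIZE, PATCH_SIZE, EMBED_DIM = 32, 8, 16
NUM_PATCHES = (IMAGE_SIZE // PATCH_SIZE) ** 2

rng = np.random.default_rng(0)

# Placeholder names and random pixels standing in for resized images of each spectrum.
spectra = {
    name: rng.random((IMAGE_SIZE, IMAGE_SIZE))
    for name in ("spectrum_0", "spectrum_1", "spectrum_2")
}

tokens = {}
for name, image in spectra.items():
    # Assumption: each spectrum has its own projection and positional table.
    # In a trained model these would be learned parameters, not random draws.
    projection = rng.normal(scale=0.02, size=(PATCH_SIZE * PATCH_SIZE, EMBED_DIM))
    positional = rng.normal(scale=0.02, size=(NUM_PATCHES, EMBED_DIM))
    tokens[name] = embed_patches(split_into_patches(image, PATCH_SIZE), projection, positional)

for name, sequence in tokens.items():
    print(name, sequence.shape)
```

Each spectrum yields one token sequence of shape (number of patches, embedding dimension), which is the input a Transformer encoder would consume before the per-spectrum features are fused.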
(2023) Published by SPIE. Downloading of the abstract is permitted for personal use only.
Min Cui, Xinke Gao, Jianyi Song, and Ping Zhang "Ensemble of deep convolutional neural networks using multi-spectral vision transformer for diabetic macular edema classification from multicolor image", Proc. SPIE 12799, Third International Conference on Advanced Algorithms and Signal Image Processing (AASIP 2023), 127995S (10 October 2023); https://doi.org/10.1117/12.3005817
KEYWORDS
Image classification
Transformers
Deep learning
Image processing
Feature extraction
Medical imaging