SPIE Journal Paper | 17 February 2022
KEYWORDS: Data modeling, Performance modeling, Detection and tracking algorithms, Medical imaging, Image classification, Mining, Pathology, Visual process modeling, Quantitative analysis, Kidney
Purpose: Recent studies have demonstrated the diagnostic and prognostic value of global glomerulosclerosis (GGS) in IgA nephropathy, aging, and end-stage renal disease. However, fine-grained quantitative analysis of multiple GGS subtypes (e.g., obsolescent, solidified, and disappearing glomerulosclerosis) is typically a resource-intensive manual process. Few, if any, automatic methods have been developed to bridge this gap. We present a holistic pipeline that quantifies GGS (with both detection and classification) from a whole slide image (WSI) in a fully automatic manner. In addition, we perform fine-grained classification of the GGS subtypes. Our study releases an open-source quantitative analytical tool for fine-grained GGS characterization while tackling the technical challenges of imbalanced classification and of integrating detection with classification.
Approach: We present a deep learning-based framework with a hierarchical two-stage design to perform fine-grained detection and classification of GGS. Moreover, we incorporate state-of-the-art transfer learning techniques to obtain a more generalizable deep learning model that copes with the imbalanced class distribution of our dataset. This way, we build a highly efficient WSI-to-results GGS characterization pipeline. We also investigate the largest fine-grained GGS cohort to date, with 11,462 glomeruli and 10,619 nonglomeruli, which include 7,841 globally sclerotic glomeruli of three distinct categories. With these data, we apply deep learning techniques to achieve (1) fine-grained GGS characterization, (2) GGS versus non-GGS classification, and (3) improved glomeruli detection results.
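As a rough illustration of how a detect-then-classify, WSI-to-results pipeline can be wired together, the following Python sketch counts objects per predicted class. It is only a sketch under stated assumptions: the names `quantify_slide`, `toy_detect`, and `toy_classify` are hypothetical, the toy functions stand in for trained models, and the released code may organize its two stages differently.

```python
from collections import Counter
from typing import Callable, Iterable, Tuple

# Hypothetical box format: (x, y, width, height) in slide pixel coordinates.
Box = Tuple[int, int, int, int]


def quantify_slide(
    slide: object,
    detect: Callable[[object], Iterable[Box]],
    classify: Callable[[object, Box], str],
) -> Counter:
    """Stage 1 proposes candidate regions; stage 2 labels each one.

    Returns the number of candidates assigned to each label.
    """
    counts: Counter = Counter()
    for box in detect(slide):
        counts[classify(slide, box)] += 1
    return counts


# Toy stand-ins for trained models (assumptions for illustration only).
def toy_detect(slide: object) -> Iterable[Box]:
    return [(0, 0, 10, 10), (20, 20, 10, 10), (40, 40, 10, 10)]


def toy_classify(slide: object, box: Box) -> str:
    x = box[0]
    if x == 0:
        return "nonglomerulus"  # detection false positive filtered by stage 2
    if x == 20:
        return "obsolescent"
    return "normal"


counts = quantify_slide(None, toy_detect, toy_classify)
print(dict(counts))
```

Passing the detector and classifier as callables keeps the two stages independent, so either one can be retrained or replaced without touching the counting logic.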
Results: For fine-grained GGS characterization, our model achieves a macro-F1 score of 0.778 when pretrained on the larger dataset, compared with 0.746 when using regular ImageNet-pretrained weights. On the external dataset, our best model achieves an area under the curve (AUC) of 0.994 when differentiating GGS from normal glomeruli. Using our dataset, we are able to build algorithms that allow for fine-grained classification of glomerular lesions and are robust to distribution shifts.
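Macro-F1 is the unweighted mean of the per-class F1 scores, so rare classes count as much as common ones. The short Python sketch below uses made-up toy labels, not data from the study, and a hypothetical helper `macro_f1` to show how a classifier that always predicts the majority class can look acceptable on accuracy while scoring poorly on macro-F1.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class is never hit.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Toy, imbalanced labels (illustrative only).
y_true = ["obsolescent"] * 6 + ["solidified"] * 2 + ["disappearing"] * 2
y_pred = ["obsolescent"] * 10  # always predict the majority class

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
score = macro_f1(y_true, y_pred)
print(f"accuracy={accuracy:.2f}, macro-F1={score:.2f}")  # accuracy=0.60, macro-F1=0.25
```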
Conclusion: Our study shows that the proposed methods consistently improve detection and fine-grained classification performance in both cross-validation and external validation. Our code and pretrained models have been released for public use at https://github.com/luyuzhe111/glomeruli.