In computational pathology, training and inference of conventional deep convolutional neural networks (CNNs) are usually limited to small patches (e.g., 256 × 256) sampled from whole slide images. In practice, however, diagnostic and prognostic information could lie within the context of the tumor microenvironment across multiple regions, far beyond the scope of individual patches. For instance, the spatial relationship of tumor-infiltrating lymphocytes (TIL) across regions of interest might be prognostic for non-small cell lung cancer (NSCLC). This poses a multi-instance learning (MIL) problem, and a single-patch-driven CNN typically fails to learn the context shared between multiple patches, especially their spatial relationship. In this work, we present a cell graph-based MIL framework to predict the risk of death for early-stage NSCLC by aggregating feature representations of TIL-enclosing patches according to their spatial relationship. Inspired by PATCHY-SAN, a graph-embedding framework for CNNs, we use graph kernel-based approaches to embed a bag of patches into a sequence with their spatial information encoded into the sequence order. A transformer model is then trained to aggregate patch-level features based on spatial information. We demonstrate the capability of this framework to predict the likelihood that patients with NSCLC in two cohorts (n=240) survive for more than 5 years. The training cohort (n=195) comprised hematoxylin and eosin (H&E)-stained whole slide images (WSI), while the testing cohort (n=45) comprised H&E-stained tissue microarrays (TMA). We show that, with the spatial context of multiple patches encoded as an ordered patch sequence, our approach achieves in the testing cohort an area under the receiver operating characteristic curve (AUC) of 0.836 (p=0.009; HR=5.62), as opposed to a baseline conventional CNN with an AUC of 0.542 (p=0.105; HR=1.66).
The results suggest that the Transformer is a generic spatial information aware MIL framework that can learn the spatial relationship of multiple TIL-enclosing patches from the graph representation of immune cells.
The tumor microenvironment (TME) comprises multiple cell types, whose spatial organization has previously been studied to identify associations with disease progression and response to therapy. These works, however, have focused on spatial interactions of a single cell type, ignoring the spatial interplay between the remaining cells. Here, we introduce a framework, called spatial connectivity of tumor and associated cells (SpaCell), to quantify complex spatial interactions between multiple cell families simultaneously within the TME on H&E-stained images. First, nuclei are segmented and classified into different families (e.g., cancerous cells and lymphocytes) using a combination of image processing and machine learning techniques. Local clusters of proximal nuclei are then built for each family. Next, quantitative metrics are extracted from these clusters to capture inter- and intra-family relationships, namely: density of clusters, area intersected between clusters, diversity of clusters surrounding a cluster, and architecture of clusters, among others. When evaluated for predicting risk of recurrence in HPV-associated oropharyngeal squamous cell carcinoma (n=233, 107 vs 126 patients for training vs testing) and non-small cell lung cancer (n=186, 70 vs 116 patients for training vs testing), SpaCell was able to differentiate between patients at high and low risk of recurrence (p=0.03 and p=0.02, respectively). SpaCell was compared against a deep learning approach and a state-of-the-art approach that uses single-family cell cluster graphs (CCG). CCG-extracted metrics were not prognostic of disease-free survival (DFS) for oropharyngeal (p=0.98) or lung (p=0.15) cancer, and deep learning was prognostic of DFS for lung (p=0.03) but not for oropharyngeal cancer (p=0.26). SpaCell was not only prognostic for both cancer types but also provided more explainability in terms of tumor biology.
Stage III colorectal cancer (CRC) is treated with surgery followed by chemotherapy. Yet >20% of clinically low-risk patients develop recurrence. It is critical to identify high-risk stage III patients who can benefit from closer monitoring and escalation of therapy. Previous studies showed promising results in predicting risk from H&E slides in CRC using deep learning-based algorithms. One heavily studied biomarker that showed good results in predicting risk is tumor-infiltrating lymphocytes (TILs). In CRC, TIL density has been shown to be significantly and independently associated with overall patient survival [1]. Furthermore, additional studies have demonstrated that analyzing the spatial organization of TILs could be more informative than density alone [2]. Hence, our study aimed to stratify stage III CRC patients into distinct risk groups based on features derived from TILs and to determine whether this classification could have independent prognostic significance. The training set (D1) included 50 patients and the validation set (D2) consisted of 70 patients from an independent site. A survival model was trained to predict the risk of recurrence in stage III CRC patients. First, a deep learning (DL) model was used to segment TILs on WSIs. Next, 1036 features related to spatial architecture (SpaTIL) and density of TILs (DenTIL) were extracted. A Cox proportional hazards regression model in conjunction with the least absolute shrinkage and selection operator (Lasso) was used to select the top 5 features and the feature coefficients associated with progression-free survival (PFS) and to assign a risk score to each patient. The risk scores for the training dataset were computed using the selected features with their respective coefficients. A cut-off value was determined from these risk scores; patients above it were labelled high-risk and those below it low-risk. In the validation set, the median PFS was 15.1 months for the high-risk group and 27 months for the low-risk group.
The model was able to accurately predict higher incidence of progression in patients in the high-risk group (HR = 3.76, 95% CI 1.3-10.9, p-value=0.0053, c-index=0.687) in the validation set. Future work will entail additional multi-site, multi-institutional validation of our biomarker to further understand its strengths and applications.
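The risk-scoring step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the coefficient values, the feature vectors, and the use of the training-set median as the cut-off are all assumptions, since the abstract only states that a cut-off was derived from the training risk scores.

```python
from statistics import median

def risk_score(features, coefficients):
    """Linear predictor of a Cox model: sum of coefficient * feature value."""
    return sum(c * x for c, x in zip(coefficients, features))

def stratify(score, cutoff):
    """Label a patient as high-risk when the score exceeds the cut-off."""
    return "high-risk" if score > cutoff else "low-risk"

# Hypothetical coefficients for 5 selected features (not from the paper).
coefficients = [0.8, -0.5, 0.3, 0.1, -0.2]

# Hypothetical training patients, one row of 5 feature values each.
training_features = [
    [1.0, 0.2, 0.5, 0.0, 1.0],
    [0.1, 1.5, 0.2, 0.3, 0.4],
    [0.7, 0.7, 0.9, 0.1, 0.0],
]

training_scores = [risk_score(f, coefficients) for f in training_features]

# Assumption: the cut-off is the median training risk score.
cutoff = median(training_scores)

# The cut-off learned on the training set is applied unchanged to a new patient.
validation_patient = [0.9, 0.1, 0.8, 0.2, 0.1]
label = stratify(risk_score(validation_patient, coefficients), cutoff)
print(label)
```

The key design point is that both the coefficients and the cut-off are fixed on the training set and then applied as-is to the validation set.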
Machine learning techniques have shown great promise in digital pathology. However, a major bottleneck is the difficulty of annotating the amount of tissue needed to cover several sources of variability, namely chemical fixation, sample slicing, and staining. Usually, models are trained using sets of annotated small image patches, but the number of required patches may then increase exponentially, and the patches must still represent such variability. This paper presents a method for automatic sample selection to train a classifier for ovarian cancer by integrating a novel soft clustering strategy. The method starts by classifying a large set of patches with a previously trained classifier and dividing patches from the cancer class into highly and moderately confident ones. An unsupervised selection of moderately confident patches by Probabilistic Latent Semantic Analysis (PLSA) picks samples from relevant and meaningful groups with maximum within-group variance. A new model is re-trained using the highly confident patches together with the patches obtained from the associated PLSA. This strategy outperforms a model trained with a larger set of annotated patches, while the training time and the number of samples are much smaller. The strategy was evaluated on a set of patches from 18 patients with serous ovarian cancer, obtaining a reduction of 54.62% in training time and 73.66% in the number of samples, while the recall rate improved from 0.69 to 0.73.
Purpose: We used computerized image analysis and machine learning approaches to characterize spatial arrangement features of the immune response from digitized autopsied H&E tissue images of the lung in coronavirus disease 2019 (COVID-19) patients. Additionally, we applied our approach to tease out potential morphometric differences from autopsies of patients who succumbed to COVID-19 versus H1N1.
Approach: H&E lung whole slide images from autopsy specimens of nine COVID-19 and two H1N1 patients were computationally interrogated. 606 image patches (∼55 per patient) of 1024 × 882 pixels were extracted from the 11 autopsied patient studies. A watershed-based segmentation approach in conjunction with a machine learning classifier was employed to identify two nuclei families: lymphocytes and non-lymphocytes (i.e., other nucleated cells such as pneumocytes, macrophages, and neutrophils). Based on the proximity of the individual nuclei, clusters for each nuclei family were constructed. For each of the resulting clusters, a series of quantitative measurements relating to the architecture and density of nuclei clusters was calculated. A receiver operating characteristics-based feature selection method, violin plots, and the t-distributed stochastic neighbor embedding algorithm were employed to study differences in immune patterns.
Results: In COVID-19, the immune response consistently showed multiple small-size lymphocyte clusters, suggesting that lymphocyte response is rather modest, possibly due to lymphocytopenia. In H1N1, we found larger lymphocyte clusters that were proximal to large clusters of non-lymphocytes, a possible reflection of increased prevalence of macrophages and other immune cells.
Conclusion: Our study shows the potential of computational pathology to uncover immune response features that may not be obvious by routine histopathology visual inspection.
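The proximity-based cluster construction mentioned in the Approach can be sketched as follows. This is a minimal illustration under stated assumptions: nuclei are reduced to 2-D centroids, two nuclei are linked when their Euclidean distance is at most `max_distance` (a hypothetical parameter), and clusters are the connected components of the resulting graph. The abstract does not specify the actual linkage rule.

```python
from math import dist

def proximity_clusters(points, max_distance):
    """Group points into connected components of the 'closer than max_distance' graph."""
    unvisited = set(range(len(points)))
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        cluster, frontier = [seed], [seed]
        while frontier:
            current = frontier.pop()
            # Collect every unvisited point within reach of the current one.
            neighbors = [j for j in unvisited
                         if dist(points[current], points[j]) <= max_distance]
            for j in neighbors:
                unvisited.remove(j)
            cluster.extend(neighbors)
            frontier.extend(neighbors)
        clusters.append(sorted(cluster))
    return sorted(clusters)

# Hypothetical lymphocyte centroids in pixel coordinates.
lymphocytes = [(0, 0), (1, 0), (2, 0), (10, 10), (11, 10)]
clusters = proximity_clusters(lymphocytes, max_distance=1.5)
cluster_sizes = [len(c) for c in clusters]
print(clusters, cluster_sizes)
```

Running the same function once per nuclei family (lymphocytes, non-lymphocytes) yields the per-family clusters from which size and density measurements can then be derived.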
Lung adenocarcinoma (LUAD), the most common type of lung cancer, has an average 5-year survival rate of 15%. In LUAD, interaction between tumor and immune cells has been shown to be highly associated with the likelihood of disease progression and metastases. We have previously demonstrated the association between the spatial architecture and arrangement of tumor-infiltrating lymphocytes (TILs) and the likelihood of recurrence in early-stage NSCLC. Recently, gene set enrichment analysis-derived immune scores have been found to be prognostic of outcome. However, this requires transcriptomics techniques as a precursor, which involve mechanical disruption of cells and tissues. In this work (N = 170), we extracted graph-based histomorphometric features on segmented nuclei from digitized H&E biopsy images and then performed principal component analysis (PCA) to select the most representative tiles from each patient. We then identified TILs and quantitative histomorphometric attributes of different nuclei groups (all-nuclei, TILs, non-TILs) prognostic of overall patient survival (OS) and further investigated their associations with immune scores and biological pathways implicated in immune response using gene-set enrichment analysis (GSEA). We found that TIL-compactness (a set of TIL density features) derived risk scores were prognostic of OS (Hazard Ratio (HR) = 3.26, p = 0.012, C-index = 0.634). The median immune score (IS) in the cohort was used as a threshold to divide the cases into low and high IS expression groups. The TIL compactness measures prognostic of OS were also statistically significantly correlated with the IS and with biological pathways related to immune response (Immune System Process, Immune Response, Adaptive Immune Response, and Humoral Immune Response Mediated by Circulating Immunoglobulin).
The presence of tumor-infiltrating lymphocytes (TILs) is correlated with outcome and prognosis in epithelial ovarian cancer (EOC). In this study, automated image analysis was used to analyze the association between overall survival (OS) and TIL spatial arrangement and density in a multi-site cohort of 102 EOC patients who received adjuvant chemotherapy following debulking surgery. Features of the spatial arrangement of TILs (SpaTIL) were used to quantify the spatial co-localization of TILs and tumor cells on digitized pathology slides of the malignant neoplasm of excised specimens. A multivariable Cox regression model of SpaTIL features was fit on the n1 = 51 patient training set and was evaluated in the n2 = 51 patient validation set. The SpaTIL signature was significantly associated with OS, both in the training set (hazard ratio (HR) = 2.81, 95% confidence interval (CI) = 1.33 − 5.92, and p = 0.003) and the validation set (HR = 2.06, 95% CI = 1.04 − 4.07, and p = 0.008). In addition, fusing our spaTIL risk score and the clinical staging further improved the results of the predictive model (HR = 4.045, 95% CI = 4.11−5.41, and p = 0.0002 in the validation set) and outperformed clinical staging alone. This finding illustrates that a spaTIL risk score is not only able to predict OS independent of clinical data, but also offers prognostic value complementary to current clinical standard-of-care. Patients with longer survival times had significantly higher heterogeneity of non-TIL cluster area, while shorter time survivors had mostly same-sized, evenly-distributed non-TIL clusters and smaller average TIL cluster area. These findings suggest that dispersion of TILs throughout the tumor is associated with better treatment response to post-treatment adjuvant chemotherapy, and therefore longer survival time.
A number of papers have established that a high density of tumor-infiltrating lymphocytes (TILs) is highly correlated with a better prognosis for many different cancer types. More recently, some studies have shown that the spatial interplay between different subtypes of TILs (e.g. CD3, CD4, CD8) is more prognostic of disease outcome than metrics related to TIL density alone. A challenge with TIL subtyping is that it relies on quantitative immunofluorescence or immunohistochemistry, which are complex and tissue-destructive technologies. In this paper we present a new approach called PhenoTIL to identify TIL sub-populations and quantify the interplay between these sub-populations, and we show the association of these interplay features with recurrence in early-stage lung cancer. The approach comprises a Dirichlet Process Gaussian Mixture Model that clusters lymphocytes on H&E images. The approach was evaluated on a cohort of N=178 early-stage non-small cell lung cancer patients, with N=100 used for model training and N=78 used for independent validation. A Linear Discriminant Analysis classifier was trained in conjunction with 186 PhenoTIL features to predict the likelihood of recurrence in the test set. The PhenoTIL features yielded an AUC=0.84, compared with an approach involving TIL density alone (AUC=0.58). In addition, a Kaplan-Meier analysis showed that the PhenoTIL features were able to statistically significantly distinguish early from late recurrence (p = 4 × 10⁻⁵).
Dermatopathology education relies heavily on consultation of books, which are expensive, quickly outdated, and limited in what they can show. In recent years, virtual microscopy, a method that enables examination of digitized microscopy samples, has gained interest because of its possibilities in terms of interaction, availability, usability, low cost, and adaptation to multiple clinical scenarios. This work introduces a customized low-cost system for consultation of dermatopathology samples. First, physical slides are digitized using an optical microscope coupled to a digital camera controlled by a custom-motorized scanner. Then, the digitized images are automatically stitched to assemble the Whole Slide Image (WSI). A web application, developed using open source tools, gives access to such WSIs and allows users to interact with the content by panning and zooming. The application also allows free-hand annotation of specific regions. A set of 100 dermatopathology slides, provided by the Pathology Department of the Universidad Nacional de Colombia and representing basic lesions and inflammatory skin diseases (based on Ackerman patterns), was digitized. Each WSI contains the diagnosis and annotations of relevant regions. The platform is currently being used by trainees, who highlight the benefits of this kind of tool in complementing their training and helping to improve their diagnostic skills.
Automatic detection and quantification of glands in gastric cancer may help to objectively measure lesion severity, to develop strategies for early diagnosis, and, most importantly, to improve patient categorization. This article presents a complete framework for automatic detection of glands in gastric cancer images. The approach starts by selecting gland candidates from a binarized version of the hematoxylin channel. Next, each gland's shape and nuclei are characterized using local features, which feed a classifier previously trained with manually labeled images under a Monte Carlo cross-validation scheme. Validation was carried out using a dataset with 1330 annotated structures (2372 glands) from seven fields of view extracted from gastric cancer whole slide images. Results showed an accuracy of 93% using a simple linear classifier. The presented strategy is simple, flexible, and easily adapted to an actual pathology laboratory.
Automatic detection of lymphocytes could contribute to developing objective measures of the infiltration grade of tumors, which can be used by pathologists to improve decision making and treatment planning. In this article, a simple framework to automatically detect lymphocytes in lung cancer images is presented. The approach starts by automatically segmenting nuclei using a watershed-based method. Nuclei shape, texture, and color features are then used by a trained SVM classifier to classify each candidate nucleus as either lymphocyte or non-lymphocyte. Validation was carried out using a dataset containing 3420 annotated structures (lymphocytes and non-lymphocytes) from 13 fields of view of 1000 × 1000 pixels extracted from lung cancer whole slide images. A Deep Learning model was trained as a baseline. Results show an F-score 30% higher with the presented framework than with the Deep Learning approach. The presented strategy is, in addition, more flexible, requires less computational power, and requires much shorter training times.
During a diagnosis task, a pathologist looks over a Whole Slide Image (WSI), searching for relevant pathological patterns. A virtual microscope, however, captures not only these structures but also other cellular patterns with different or no diagnostic meaning. Annotation of these images depends on manual delineation, which in practice is a hard task. This article contributes a new method for detecting relevant regions in WSIs using routine navigations in a virtual microscope. The method constructs a sparse representation, or dictionary, of each navigation path and determines the hidden relevance by maximizing the incoherence between several paths. The resulting dictionaries are then projected onto each other, and relevant information is assigned to the dictionary atoms whose similarity is higher than a custom threshold. Evaluation was performed with 6 pathological images segmented from a skin biopsy already diagnosed with basal cell carcinoma (BCC). Results show that our proposal outperforms the baseline by more than 20%.
KEYWORDS: Fetus, Signal detection, Wavelets, Electrocardiography, Electronic filtering, Independent component analysis, Discrete wavelet transforms, Detection and tracking algorithms, Data modeling, Signal to noise ratio
Non-invasive fetal electrocardiography (fECG) has attracted the interest of the medical community because of the importance of fetal monitoring. However, its implementation in clinical practice is challenging: the fetal signal has a low signal-to-noise ratio, and several signal sources are present in the maternal abdominal electrocardiogram (AECG). This paper presents a novel method to detect the fetal signal from a multi-channel maternal AECG. The method begins by filtering and detrending the AECG signals. Afterwards, the maternal QRS complexes are identified and subtracted. The residual signals are used to detect the fetal QRS complex. Intervals of these signals are analyzed using a wavelet decomposition. The resulting representation feeds a previously trained Random Forest (RF) classifier that identifies signal intervals associated with the fetal QRS complex. The method was evaluated on a publicly available dataset: the Physionet2013 challenge. A set of 50 maternal AECG records was used to train the RF classifier. The evaluation was carried out on signal intervals extracted from an additional 25 maternal AECG records. The proposed method yielded an accuracy of 83.77% in the fetal QRS complex classification task.
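The wavelet step can be illustrated with a multi-level Haar decomposition of one signal interval. This is only a sketch: the abstract does not name the wavelet family, the number of levels, or the interval length, so the Haar wavelet, the two levels, and the 8-sample interval below are assumptions chosen for simplicity.

```python
from math import sqrt

def haar_step(signal):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = sqrt(2)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail

def haar_decompose(signal, levels):
    """Return [detail_1, ..., detail_levels, approx_levels], finest band first."""
    bands = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        bands.append(detail)
    bands.append(approx)
    return bands

# Hypothetical residual-signal interval containing a sharp QRS-like peak.
interval = [0, 0, 1, 5, 1, 0, 0, 0]
bands = haar_decompose(interval, levels=2)

# Flattened coefficients form the feature vector passed to a classifier.
feature_vector = [c for band in bands for c in band]
print(len(feature_vector))
```

Because the transform is orthonormal, the feature vector has the same length and the same energy as the input interval, while the sharp peak concentrates in a few detail coefficients.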
Tumor-infiltrating lymphocytes (TILs) arise when various classes of white blood cells migrate from the bloodstream towards the tumor, infiltrating it. The presence of TILs is predictive of the patient's response to therapy. In this paper, we show how the automatic detection of lymphocytes in digital H&E histopathological images, together with a quantitative evaluation of the global lymphocyte configuration through global features extracted from non-parametric graphs built on the detected lymphocyte positions, can be correlated with patient outcome in early-stage non-small cell lung cancer (NSCLC). The method was assessed on a tissue microarray cohort composed of 63 NSCLC cases. Of the evaluated graphs, minimum spanning trees and k-nn graphs showed the highest predictive ability, yielding F1 scores of 0.75 and 0.72 and accuracies of 0.67 and 0.69, respectively. The predictive power of the proposed methodology indicates that graphs may be used to develop objective measures of the infiltration grade of tumors, which can, in turn, be used by pathologists to improve decision making and treatment planning.
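As an illustration of global graph features, the sketch below builds a minimum spanning tree over lymphocyte centroids with Prim's algorithm and summarizes its edge lengths. The specific features (total, mean, and maximum edge length) and the coordinates are assumptions for illustration; the abstract does not list the features actually used.

```python
from math import dist

def minimum_spanning_tree(points):
    """Prim's algorithm on the complete Euclidean graph; returns (parent, child, length) edges."""
    if len(points) < 2:
        return []
    # best[j] = (shortest known distance from the tree to j, tree node achieving it)
    best = {j: (dist(points[0], points[j]), 0) for j in range(1, len(points))}
    edges = []
    while best:
        j = min(best, key=lambda k: best[k][0])
        length, parent = best.pop(j)
        edges.append((parent, j, length))
        # Relax distances of the remaining nodes through the newly added node.
        for k in best:
            d = dist(points[j], points[k])
            if d < best[k][0]:
                best[k] = (d, j)
    return edges

def mst_features(points):
    """Global summary statistics of the MST edge lengths."""
    lengths = [length for _, _, length in minimum_spanning_tree(points)]
    if not lengths:
        return {"n_edges": 0, "total_length": 0.0, "mean_edge": 0.0, "max_edge": 0.0}
    return {
        "n_edges": len(lengths),
        "total_length": sum(lengths),
        "mean_edge": sum(lengths) / len(lengths),
        "max_edge": max(lengths),
    }

# Hypothetical lymphocyte centroids at the corners of a 3 x 4 rectangle.
lymphocytes = [(0, 0), (3, 0), (3, 4), (0, 4)]
features = mst_features(lymphocytes)
print(features)
```

A vector of such statistics per image could then be fed to any standard classifier.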
Tumor-infiltrating lymphocytes (TILs) have proved to play an important role in predicting prognosis, survival, and response to treatment in patients with a variety of solid tumors. Unfortunately, there is currently no standardized methodology to quantify the infiltration grade. The aim of this work is to evaluate the variability among the TIL reports given by a group of pathologists who examined a set of digitized non-small cell lung cancer samples (n=60). Twenty-eight pathologists each assessed a different number of histopathological images. The agreement among pathologists was evaluated by computing the Kappa coefficient and the standard deviation of their estimations. Furthermore, TIL reports were correlated with patient prognosis and survival using Pearson's correlation coefficient. Overall, the results show that the agreement among experts grading TILs in the dataset is low: Kappa values remain below 0.4, and the standard deviation values show that full consensus was not reached for any image. Finally, the correlation coefficient for each pathologist also reveals a low association between the pathologists' predictions and the prognosis/survival data. The results suggest the need to define standardized, objective, and effective strategies to evaluate TILs, so that they can be used as a biomarker in daily routine.
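For reference, a chance-corrected agreement score of this kind can be computed as below. The sketch implements Cohen's kappa for two raters; this is an assumption, since the abstract does not say which Kappa variant was used for the 28 pathologists, and the grades shown are hypothetical.

```python
from collections import Counter

def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa: (observed agreement - chance agreement) / (1 - chance agreement)."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    # Chance agreement from each rater's marginal label frequencies.
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)

# Hypothetical TIL grades given by two pathologists to the same six images.
pathologist_1 = ["low", "low", "high", "high", "low", "high"]
pathologist_2 = ["low", "high", "high", "low", "low", "high"]
kappa = cohen_kappa(pathologist_1, pathologist_2)
print(round(kappa, 3))
```

In this toy example the raters agree on 4 of 6 images, yet kappa is only about 0.33 once chance agreement (0.5) is removed, which shows why raw agreement overstates consensus.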
Computational histomorphometric approaches typically use low-level image features for building machine learning classifiers. However, these approaches usually ignore high-level expert knowledge. A computational model (M_im) combines low-, mid-, and high-level image information to predict the likelihood of cancer in whole slide images. Handcrafted low- and mid-level features are computed from area, color, and spatial nuclei distributions. High-level information is implicitly captured from the recorded navigations of pathologists while exploring whole slide images during diagnostic tasks. This model was validated by predicting the presence of cancer in a set of unseen fields of view. The available database was composed of 24 cases of basal-cell carcinoma, from which 17 served to estimate the model parameters and the remaining 7 comprised the evaluation set. A total of 274 fields of view of size 1024×1024 pixels were extracted from the evaluation set. Then 176 patches from this set were used to train a support vector machine classifier to predict the presence of cancer on a patch-by-patch basis while the remaining 98 image patches were used for independent testing, ensuring that the training and test sets do not comprise patches from the same patient. A baseline model (M_ex) estimated the cancer likelihood for each of the image patches. M_ex uses the same visual features as M_im, but its weights are estimated from nuclei manually labeled as cancerous or noncancerous by a pathologist. M_im achieved an accuracy of 74.49% and an F-measure of 80.31%, while M_ex yielded corresponding accuracy and F-measures of 73.47% and 77.97%, respectively.
Evidence-based medicine aims to provide a quantifiable framework to support optimal cancer treatment selection. Pathological examination is the main evidence used in medical management, yet the level of quantification is low and highly dependent on the examiner's expertise. This paper presents and evaluates a method to extract graph-based topological features from skin tissue images to identify cancerous regions associated with basal cell carcinoma. The graph features constitute a quantitative measure of the architectural organization of the tissue. Results show that graph topological features extracted from a nuclei-based distance graph, particularly those related to local density, have a high predictive value in the automated detection of basal cell carcinoma. The method was evaluated using a leave-one-out validation scheme on a set of 9 skin whole slide images, obtaining an F-score of 0.76 in distinguishing basal cell carcinoma regions.
This article introduces a computer-aided solution for radiology education integrated with clinical practice, built on modern technologies that facilitate the processing of large amounts of stored information. Radiology training faces several challenges, such as image retrieval, knowledge extraction, problem-oriented education, and interaction with the huge repositories known as Picture Archiving and Communication Systems (PACS). This project proposes a user-based system that learns from user interaction, retrieving not just the requested information but also recommending related cases and interesting images. The recommended images are retrieved using a click-through rate (CTR) strategy to define the most similar cases in the database. This is a fully web-based proposal, potentially useful in the classroom or at home, that allows students to develop the clinical skills they need in a more realistic scenario.
The use of low-level visual features to assign high level labels in datasets of histopathology images is a possible solution to the problems derived from manual labeling by experts. However, in many cases, the visual cues are not enough. In this article we propose the use of features derived exclusively from the spatial distribution of the cell nuclei. These features are calculated using the weight of k-nn graphs constructed from the distances between cells. Results show that there are k values with enhanced discriminatory power, especially when comparing cancerous and non-cancerous tissue.
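One way to read "the weight of k-nn graphs" is the total length of the edges that connect every nucleus to its k nearest neighbors. The sketch below computes that quantity for several values of k; this interpretation, the directed-edge convention, and the coordinates are assumptions made for illustration.

```python
from math import dist

def knn_graph_weight(points, k):
    """Sum of distances from every point to its k nearest neighbors (directed k-nn graph)."""
    if not 1 <= k < len(points):
        raise ValueError("k must satisfy 1 <= k < number of points")
    total = 0.0
    for i, p in enumerate(points):
        distances = sorted(dist(p, q) for j, q in enumerate(points) if j != i)
        total += sum(distances[:k])
    return total

# Hypothetical nuclei centroids: a tight group of three plus one isolated nucleus.
nuclei = [(0, 0), (1, 0), (0, 1), (10, 10)]

# One feature per k value; densely packed tissue yields smaller weights.
weights = {k: knn_graph_weight(nuclei, k) for k in (1, 2, 3)}
print(weights)
```

Computing the weight over a range of k values gives a small feature vector per image, in which some k values may separate tissue classes better than others.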
Reconstruction of the heartbeat is a useful tool to detect and diagnose some pathologies. However, this process is challenging because the heart is a moving organ inside a moving body, so that similar regions are hard to identify and some regions constantly appear and disappear. This article presents a method for reconstructing the right ventricle using SURF points in irregular regions. The SURF points, invariant to image scale and rotation, provide robust features of a right ventricle slice that can then be traced to the other slices. By using such points together with a nonrigid registration, it is possible to perform a volumetric reconstruction from these images.
Accessing information of interest in collections of histopathology images is a challenging task. To address this issue, previous works have designed search strategies based on keywords and low-level features. However, those methods have proved to be insufficient or impractical for this purpose. Alternative low-level features such as cell area, distance among cells, and cell density are directly associated with simple histological concepts and could serve as good descriptors. In this paper, a statistical model is adapted to represent the distribution of the areas occupied by cells, for use in the characterization of whole histopathology images. This novel descriptor facilitates the design of metrics based on distribution parameters and also provides new elements for a better understanding of the image. The proposed model was validated using image processing and statistical techniques. Results showed low error rates, demonstrating the accuracy of the model.
Virtual microscopy (VM) facilitates visualization and deployment of histopathological virtual slides (VS), a useful tool for education, research, and diagnosis. In recent years it has become popular, yet its use is still limited, mainly because of the very large size of VS, typically of the order of gigabytes. Such a volume of data requires effective and efficient strategies to access the VS content. In an educational or research scenario, several users may need to access and interact with VS at the same time, so, due to the large data size, a very expensive and powerful infrastructure is usually required. This article introduces a novel JPEG2000-based service-oriented architecture for streaming and visualizing very large images under scalable strategies, which in addition does not require very specialized infrastructure. Results suggest that the proposed architecture enables transmission and simultaneous visualization of large images, while using resources efficiently and offering users adequate response times.