The performance of Deep Learning (DL) segmentation algorithms is routinely determined using quantitative metrics like the Dice score and Hausdorff distance. However, these metrics show a low concordance with humans’ perception of segmentation quality. The successful collaboration of health care professionals with DL segmentation algorithms will require a detailed understanding of experts’ assessment of segmentation quality. Here, we present the results of a study on expert quality perception of brain tumor segmentations of brain MR images generated by a DL segmentation algorithm. Eight expert medical professionals were asked to grade the quality of segmentations on a scale from 1 (worst) to 4 (best). To this end, we collected four ratings for a dataset of 60 cases. We observed a low inter-rater agreement among all raters (Krippendorff’s alpha: 0.34), potentially a result of the raters’ different internal cutoffs for the quality ratings. Several factors, including the volume of the segmentation and model uncertainty, were associated with high disagreement between raters. Furthermore, the correlations between the ratings and commonly used quantitative segmentation quality metrics ranged from none to moderate. We conclude that, similar to the inter-rater variability observed for manual brain tumor segmentation, segmentation quality ratings are prone to variability due to the ambiguity of tumor boundaries and individual perceptual differences. Clearer guidelines for quality evaluation could help to mitigate these differences. Importantly, existing technical metrics do not capture clinical perception of segmentation quality. A better understanding of expert quality perception is expected to support the design of more human-centered DL algorithms for integration into the clinical workflow.
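The agreement statistic used above, Krippendorff’s alpha for ordinal ratings, can be computed from a raters-by-cases matrix via a coincidence matrix and an ordinal distance metric. The sketch below is a minimal from-scratch implementation; the `example` matrix is invented illustrative data, not the study’s 60-case dataset.

```python
import numpy as np

def krippendorff_alpha_ordinal(ratings):
    """Krippendorff's alpha for ordinal data.

    ratings: 2D array (raters x units); use np.nan for missing ratings.
    Returns 1.0 for perfect agreement; lower values indicate disagreement.
    """
    ratings = np.asarray(ratings, dtype=float)
    values = np.unique(ratings[~np.isnan(ratings)])
    v_index = {v: i for i, v in enumerate(values)}
    k = len(values)

    # Coincidence matrix: value pairs co-assigned within each unit,
    # weighted by 1 / (m - 1) for a unit with m valid ratings.
    o = np.zeros((k, k))
    for u in range(ratings.shape[1]):
        unit = ratings[:, u]
        unit = unit[~np.isnan(unit)]
        m = len(unit)
        if m < 2:
            continue
        for a in range(m):
            for b in range(m):
                if a != b:
                    o[v_index[unit[a]], v_index[unit[b]]] += 1.0 / (m - 1)

    n_c = o.sum(axis=1)
    n = n_c.sum()

    # Ordinal distance: squared span of cumulative category frequencies.
    delta = np.zeros((k, k))
    for c in range(k):
        for d in range(c + 1, k):
            dist = n_c[c:d + 1].sum() - (n_c[c] + n_c[d]) / 2.0
            delta[c, d] = delta[d, c] = dist ** 2

    d_o = (o * delta).sum()                                  # observed disagreement
    d_e = (np.outer(n_c, n_c) * delta).sum() / (n - 1)       # expected disagreement
    return 1.0 - d_o / d_e if d_e > 0 else 1.0

# Invented example: 4 raters x 6 cases, quality grades 1-4.
example = np.array([
    [1, 2, 3, 3, 2, 4],
    [1, 2, 4, 3, 2, 4],
    [2, 3, 3, 2, 2, 4],
    [1, 2, 3, 3, 1, 4],
], dtype=float)
alpha = krippendorff_alpha_ordinal(example)
```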
Several digital reference objects (DROs) for DCE-MRI have been created to test the accuracy of pharmacokinetic modeling software under a variety of different noise conditions. However, there are few DROs that mimic the anatomical distribution of voxels found in real data, and similarly few DROs that are based on both malignant and normal tissue. We propose a series of DROs for modeling Ktrans and Ve derived from a publicly available RIDER DCE-MRI dataset of 19 patients with gliomas. For each patient’s DCE-MRI data, we generate Ktrans and Ve parameter maps using an algorithm validated on the QIBA Tofts model phantoms. These parameter maps are denoised and then used to generate noiseless time-intensity curves for each of the original voxels. This is accomplished by reversing the Tofts model to generate concentration-time curves from Ktrans and Ve inputs, and subsequently converting those curves into intensity values by normalizing to each patient’s average pre-bolus image intensity. The result is a noiseless DRO in the shape of the original patient data with known ground-truth Ktrans and Ve values. We make this dataset publicly available for download for all 19 patients of the original RIDER dataset.
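The per-voxel curve generation described above amounts to running the standard Tofts model forward: C_t(t) = Ktrans ∫ Cp(τ) exp(−(Ktrans/Ve)(t−τ)) dτ, followed by a mapping from concentration to signal intensity. The sketch below illustrates one voxel under stated assumptions: the biexponential arterial input function, the linear concentration-to-intensity scaling, and all numeric parameters are hypothetical placeholders, not values from the paper.

```python
import numpy as np

def tofts_concentration(t, ktrans, ve, cp):
    """Standard Tofts model via discrete convolution of the AIF with an
    exponential residue function: C_t = Ktrans * (Cp ⊛ exp(-kep * t))."""
    kep = ktrans / ve
    dt = t[1] - t[0]
    residue = np.exp(-kep * t)
    return ktrans * np.convolve(cp, residue)[: len(t)] * dt

# Hypothetical biexponential AIF (Weinmann-style coefficients, assumed).
t = np.arange(0, 5, 1 / 60.0)                                # minutes, 1 s steps
cp = 3.99 * np.exp(-0.144 * t) + 4.78 * np.exp(-0.0111 * t)  # mM

# Example per-voxel parameters (assumed, not from the RIDER maps).
ct = tofts_concentration(t, ktrans=0.25, ve=0.3, cp=cp)

# Convert concentration to a noiseless intensity curve by linear scaling
# against an assumed pre-bolus baseline; the paper normalizes to each
# patient's average pre-bolus image intensity.
s0 = 1000.0        # assumed baseline signal intensity
scale = 0.1        # assumed linear concentration-to-signal factor
intensity = s0 * (1.0 + scale * ct)
```

In practice the same forward pass would be applied to every voxel of the denoised Ktrans and Ve maps, yielding a DRO with the anatomy of the original patient and exactly known ground truth.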
In the last five years, advances in processing power and computational efficiency in graphics processing units have catalyzed dozens of deep neural network segmentation algorithms for a variety of target tissues and malignancies. However, few of these algorithms incorporate any prior biological context about the tissues they segment, instead relying on the neural network’s optimizer to develop such associations de novo. We present a novel method for applying deep neural networks to the problem of glioma tissue segmentation that takes into account the structured nature of gliomas – edematous tissue surrounding mutually exclusive regions of enhancing and non-enhancing tumor. We trained separate deep neural networks with a 3D U-Net architecture in a tree structure to create segmentations for edema, non-enhancing tumor, and enhancing tumor regions. Specifically, training was configured such that the whole tumor region including edema was predicted first, and its output segmentation was fed as input into separate models to predict enhancing and non-enhancing tumor. We trained our model on publicly available pre- and post-contrast T1 images, T2 images, and FLAIR images, and validated our trained model on patient data from an ongoing clinical trial.
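The tree-structured inference described above can be sketched as a cascade: the whole-tumor prediction gates and augments the inputs of the downstream subregion models. The sketch below shows only the cascade logic; the three model callables are placeholders standing in for the trained 3D U-Nets, and the channel layout, threshold, and label encoding are assumptions, not details from the paper.

```python
import numpy as np

def cascade_segment(volume, whole_model, enh_model, nonenh_model, thr=0.5):
    """Tree-structured cascade: predict the whole tumor (including edema)
    first, feed that mask to the subregion models as an extra input channel,
    and keep the enhancing / non-enhancing regions mutually exclusive."""
    whole = whole_model(volume) > thr                       # edema + tumor core
    gated = np.concatenate(
        [volume, whole[None].astype(volume.dtype)], axis=0  # append mask channel
    )
    enh = (enh_model(gated) > thr) & whole                  # enhancing tumor
    nonenh = (nonenh_model(gated) > thr) & whole & ~enh     # mutually exclusive

    labels = np.zeros(volume.shape[1:], dtype=np.uint8)
    labels[whole] = 1    # whole tumor / edema
    labels[nonenh] = 2   # non-enhancing tumor
    labels[enh] = 3      # enhancing tumor
    return labels

# Toy usage with random stand-ins for the trained networks: each "model"
# maps a multichannel volume to a per-voxel probability map.
rng = np.random.default_rng(0)
vol = rng.random((4, 8, 8, 8), dtype=np.float32)  # 4 MR channels (T1, T1c, T2, FLAIR assumed)
dummy = lambda x: rng.random(x.shape[1:])         # placeholder for a 3D U-Net
labels = cascade_segment(vol, dummy, dummy, dummy)
```

Constraining the subregion predictions to lie inside the whole-tumor mask is what encodes the structural prior: enhancing and non-enhancing tumor can only appear within the region the first model identifies.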