ABSTRACT When training a deep learning model, the dataset is of great importance for ensuring that the model learns relevant features of the data and that it will generalize to new data. However, it is typically difficult to produce a dataset without some bias toward specific features. Deep learning models used in histopathology tend to overfit to the stain appearance of the training data: if the model is trained on data from one lab only, it will usually not generalize to data from other labs. The standard technique to overcome this problem is color augmentation of the training data, which artificially generates more variation for the network to learn from. In this work we instead test a so-called domain-adversarial neural network, which is designed to prevent the model from being biased toward features that are in reality irrelevant, such as the origin of an image. To test the technique, we use four datasets from different hospitals for Gleason grading of prostate cancer. We achieve state-of-the-art results on these datasets, and on two of our three test datasets the approach furthermore outperforms color augmentation.
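The abstract does not spell out the mechanism, but a domain-adversarial neural network is commonly built around a gradient reversal layer: a shared feature extractor feeds both a class head and a domain head (here, which hospital an image came from), and the reversal layer flips the domain gradients so the backbone learns features that are uninformative about origin. The following is a minimal sketch of that idea; the choice of PyTorch, the toy backbone, the feature dimension, the head sizes, and the reversal strength `lam` are illustrative assumptions, not details from the paper.

```python
import torch
import torch.nn as nn

class GradientReversal(torch.autograd.Function):
    """Identity in the forward pass; multiplies gradients by -lam on the way back."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Negate the gradient flowing into the feature extractor; no gradient for lam.
        return -ctx.lam * grad_output, None

class DANN(nn.Module):
    def __init__(self, feat_dim=256, n_classes=4, n_domains=4, lam=1.0):
        super().__init__()
        self.lam = lam  # strength of the gradient reversal
        # Toy backbone standing in for the actual CNN; any feature extractor works.
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, feat_dim), nn.ReLU(),
        )
        self.class_head = nn.Linear(feat_dim, n_classes)   # benign, Gleason 3, 4, 5
        self.domain_head = nn.Linear(feat_dim, n_domains)  # which hospital/lab

    def forward(self, x):
        f = self.features(x)
        y_class = self.class_head(f)
        # The domain head trains normally, but the reversed gradient pushes the
        # backbone toward features that do NOT reveal the image's origin.
        y_domain = self.domain_head(GradientReversal.apply(f, self.lam))
        return y_class, y_domain
```

During training one would minimize the sum of a classification loss on `y_class` and a domain loss on `y_domain`; because of the reversal, the backbone and the domain head play a minimax game, which is what removes the stain- and lab-specific bias.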
Prostate cancer is the most commonly diagnosed cancer in men. The diagnosis is confirmed by pathologists through ocular inspection of prostate biopsies, which are classified according to Gleason score. The main goal of this paper is to automate this classification using convolutional neural networks (CNNs). The introduction of CNNs has broadened the field of pattern recognition: it replaces the classical approach of designing and extracting hand-crafted features for classification with the substantially different strategy of letting the computer itself decide which features are important.
For automated classification of prostate cancer into the classes benign and Gleason grade 3, 4, and 5, we propose a CNN with small convolutional filters, trained from scratch using stochastic gradient descent with momentum. The input consists of microscopy images of haematoxylin and eosin stained tissue; the output is a coarse segmentation into regions of the four classes. The dataset consists of 213 images, each considered to belong to one class only. Using four-fold cross-validation we obtained an error rate of 7.3%, which is significantly better than the previous state of the art on the same dataset. Although the dataset was rather small, the results are good, and we conclude that CNNs are a promising method for this problem. Future work includes obtaining a larger dataset, which could potentially reduce the error rate further.
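As a sketch of how such a network might look, the fully convolutional model below stacks blocks of small 3x3 filters and ends with a 1x1 convolution producing a four-channel score map (benign, Gleason 3, 4, 5), which yields the coarse segmentation described above. The framework (PyTorch), the number of blocks and channels, and the learning-rate and momentum values are illustrative assumptions; the paper only specifies small filters, training from scratch, and SGD with momentum.

```python
import torch
import torch.nn as nn

def make_block(c_in, c_out):
    # Two 3x3 ("small filter") convolutions followed by 2x2 max pooling.
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
        nn.Conv2d(c_out, c_out, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
        nn.MaxPool2d(2),
    )

# Fully convolutional classifier: 3-channel H&E input, 4-channel output map
# (benign, Gleason 3, 4, 5) giving a coarse segmentation of the tissue.
model = nn.Sequential(
    make_block(3, 32),
    make_block(32, 64),
    make_block(64, 128),
    nn.Conv2d(128, 4, kernel_size=1),  # per-region class scores
)

# Trained from scratch with SGD plus momentum, as described in the text;
# the learning rate and momentum values here are illustrative.
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
criterion = nn.CrossEntropyLoss()  # applied per spatial position of the score map
```

A network of this form outputs one class score per downsampled spatial position, so each region of the biopsy image receives its own label, matching the coarse segmentation setting of the paper.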