Transfer learning based deep CNN for segmentation and detection of mitoses in breast cancer histopathological images

Microscopy, 2019, 216–233
doi: 10.1093/jmicro/dfz002
Advance Access Publication Date: 5 February 2019
Article

Noorul Wahab1, Asifullah Khan1,2,*, and Yeon Soo Lee3

1 Pattern Recognition Lab, Department of Computer and Information Sciences, Pakistan Institute of Engineering and Applied Sciences, Nilore 45650, Islamabad
2 Deep Learning Lab, Centre for Mathematical Sciences, Pakistan Institute of Engineering and Applied Sciences, Nilore 45650, Islamabad
3 Department of Biomedical Engineering, College of Medical Science, Catholic University of Daegu, Gyoungsangbuk-do 38430, Republic of Korea
* To whom correspondence should be addressed. E-mail:

Received 16 September 2018; Editorial Decision 9 January 2019; Accepted 11 January 2019

Abstract

Segmentation and detection of mitotic nuclei is a challenging task. To address this problem, a fast and accurate system based on Transfer Learning is proposed. To give the classifier a balanced dataset, this work exploits the concept of Transfer Learning by first using a pre-trained convolutional neural network (CNN) for segmentation, and then another Hybrid-CNN (with Weights Transfer and custom layers) for classification of mitoses. First, mitotic nuclei are automatically annotated, based on the ground truth centroids. The segmentation module then segments mitotic nuclei and also produces some false positives. Finally, the detection module is trained on the patches from the segmentation module and performs the final detection. Fine-tuning based Transfer Learning reduced training time, provided good initial weights, and improved the detection rate, achieving an F-measure of 0.713 and an area under the precision-recall curve of 76% on the challenging task of mitosis detection.
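As a reminder of how the F-measure reported above relates to precision and recall, the following minimal sketch computes all three from detection counts. The function name and the counts are hypothetical and are not the paper's results. How a detection is matched to a ground-truth centroid is not specified here and is left out.

```python
def precision_recall_f(tp, fp, fn):
    """Return (precision, recall, F-measure) from detection counts.

    tp: detections matched to a ground-truth mitosis
    fp: detections with no matching ground-truth mitosis
    fn: ground-truth mitoses that were missed
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F-measure: harmonic mean of precision and recall
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical counts for illustration only (not taken from the paper)
p, r, f = precision_recall_f(tp=70, fp=30, fn=30)
print(round(p, 3), round(r, 3), round(f, 3))
```

Because the F-measure is a harmonic mean, it is high only when both precision and recall are high, which makes it a suitable summary for a task where mitoses are rare.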
Key words: breast cancer, mitosis count, convolutional neural networks, transfer learning, nuclei segmentation

Introduction

Pathologists examine glass slides of breast cancer tissue samples under a light microscope and assign a grade based on tissue characteristics, which helps in advising treatment for the disease. To automate this process, the glass slides are scanned with high-resolution scanners so that digital image processing and machine learning can be applied. Proper diagnosis of breast cancer is important for its timely treatment. Recently, Whole Slide Imaging (WSI) has spurred research in automating the process of medical diagnosis [1,2]. In the case of breast cancer, a biopsy is performed if recommended by the doctor on the basis of observed changes, a mammogram or ultrasound results. To help automate the diagnosis process, the biopsy slides are properly stained and then scanned with special scanners [3].

© The Author(s) 2019. Published by Oxford University Press on behalf of The Japanese Society of Microscopy. All rights reserved. For permissions, please e-mail:
Microscopy, 2019, Vol. 68, No. 3

Fig. 1. Issues in accurate segmentation of nuclei. 1-overlapping, 2-cluttering, 3-obscure boundaries.
Fig. 2. Phases of mitosis.

To classify images, features such as texture-based, shape-based or statistical features are usually extracted to train a classifier. Such features are designed specifically for a task and are referred to as handcrafted features. However, because they require accurate segmentation of the nuclei, and because the staining process varies across laboratories, these features do not scale well. Moreover, using expert knowledge to hand-design features is difficult because identifying mitotic nuclei is subjective. Recently, research has shown that automatic features [9,10] extracted by convolutional neural networks (CNNs) can outperform (e.g.
ResNet on ImageNet classification [11]) specifically designed features (i.e. handcrafted features), especially on big datasets. On the other hand, there are very few mitoses per 10 High Power Fields (HPF), an area that is visible under the microscope, so training a classifier from scratch is not effective. The area covered by 10 HPFs varies slightly from microscope to microscope but is roughly 2 mm². In addition, non-mitotic nuclei outnumber mitoses and cause class-imbalance problems. Class imbalance has previously been addressed with techniques such as random sampling from the non-mitotic nuclei [12], ensembles [13], or a combination of adaptive learning and class balancing [14].

To make use of a deep CNN on a small dataset, researchers have recently shown that Transfer Learning (TL) can be useful in many cases [15,16]. TL refers to adapting a CNN that was previously trained, usually on a big dataset, to a new problem. When the images from the source and target domains are somewhat similar, fine-tuning just the last few layers can produce results comparable to a trained-from-scratch CNN. This is because the lower layers learn basic shapes such as edges, which are common to most cases, whereas the last layers are tuned towards the target domain. In this article, we use the term trained-from-scratch to refer to a model trained from scratch on the target domain data, as opposed to a model that is pre-trained on source domain data and then adapted and fine-tuned for the target domain. When the domains are totally different, however, TL can be especially beneficial, since it provides a good weight initialization, and fine-tuning all the layers can produce comparable results with faster convergence [17,18].

Mitotic count, which refers to the density of cells undergoing division, is regarded by pathologists as an important factor in breast cancer grading.
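One of the class-balancing techniques mentioned in this section, random sampling from the non-mitotic nuclei [12], can be illustrated with a minimal sketch. The function name, the label encoding (1 for mitotic, 0 for non-mitotic), the `ratio` parameter and the toy data are assumptions made for illustration. They are not details of the cited work or of the method proposed in this article.

```python
import random


def undersample_majority(samples, labels, minority_label=1, ratio=1.0, seed=0):
    """Keep every minority sample and a random subset of the majority class.

    ratio is the number of majority samples kept per minority sample
    (a hypothetical parameter chosen for this sketch).
    """
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y == minority_label]
    majority = [i for i, y in enumerate(labels) if y != minority_label]
    # Never ask for more majority samples than exist
    n_keep = min(len(majority), int(round(ratio * len(minority))))
    kept = minority + rng.sample(majority, n_keep)
    rng.shuffle(kept)  # avoid all minority samples appearing first
    return [samples[i] for i in kept], [labels[i] for i in kept]


# Toy data: 5 mitotic candidates (label 1) among 100 non-mitotic ones (label 0)
labels = [1] * 5 + [0] * 100
samples = [f"patch_{i}" for i in range(len(labels))]
balanced_samples, balanced_labels = undersample_majority(samples, labels)
print(balanced_labels.count(1), balanced_labels.count(0))
```

Random undersampling is simple but discards most non-mitotic examples, which is one reason the alternatives cited above (ensembles, adaptive learning) have also been explored.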
Because counting mitoses by observing them through a microscope is a tedious and subjective task, recent research has focused on automating this process. Several international competitions (MITOS12 [4] from ICPR 2012, AMIDA13 [5] from MICCAI 2013, MITOS14 [6] from ICPR 2014, TUPAC16 [7] from MICCAI 2016) have been organized to encourage new and improved algorithms. For each of these competitions, a dataset was prepared in which the centroid information (row and column coordinates) of the mitotic nuclei was provided for the training set. Based on these centroids, different works have extracted patches to train a classifier. Some methods used specific features for classification and required accurate segmentation of the mitotic nuclei to extract such features [4,8]. However, pixel-by-pixel annotation of mitotic nuclei is a difficult and time-consuming task, so for these datasets the pathologists provided only the centroids of the nuclei. Furthermore, the cells are very difficult to segment accurately because of overlapping, cluttering and sometimes obscure boundaries (Fig. 1). Moreover, the mitotic figures can have different shapes at different phases (Fig. 2 (...truncated)