Multi-scale Gaussian representation and outline-learning based cell image segmentation

BMC Bioinformatics, Aug 2013

Background: High-throughput genome-wide screening to study gene-specific functions, e.g. for drug discovery, demands fast automated image analysis methods to realize the full potential of such studies. Image segmentation is typically at the forefront of such analysis because the performance of subsequent steps, such as cell classification and cell tracking, often relies on the segmentation results.

Methods: We present a cell cytoplasm segmentation framework which first separates cell cytoplasm from image background using a novel approach based on image enhancement and the coefficient of variation of a multi-scale Gaussian scale-space representation. A novel outline-learning based classification method is developed using regularized logistic regression with embedded feature selection, which classifies image pixels as outline/non-outline to give cytoplasm outlines. Refinement of the detected outlines to separate cells from each other is performed in a post-processing step where the nuclei segmentation is used as contextual information.

Results and conclusions: We evaluate the proposed segmentation methodology using two challenging test cases, presenting images with completely different characteristics, with cells of varying size, shape, texture and degrees of overlap. The feature selection and classification framework for outline detection produces very simple sparse models which use only a small subset of the large, generic feature set, that is, only 7 and 5 features for the two cases. Quantitative comparison of the results for the two test cases against state-of-the-art methods shows that our methodology outperforms them with an increase of 4-9% in segmentation accuracy and a maximum accuracy of 93%. Finally, the results obtained for diverse datasets demonstrate that our framework not only produces accurate segmentation but also generalizes well to different segmentation tasks.
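To make the idea of a coefficient of variation over a multi-scale Gaussian representation concrete, the following is a minimal sketch and not the authors' implementation. It assumes that the image is blurred at a few scales and that the per-pixel standard deviation across scales is divided by the per-pixel mean. The function names (`gaussian_kernel`, `gaussian_blur`, `scale_space_cv`), the choice of sigmas, the 3-sigma kernel truncation and the reflect padding are all illustrative assumptions; the enhancement step mentioned in the abstract is not covered.

```python
import numpy as np


def gaussian_kernel(sigma: float) -> np.ndarray:
    """Return a normalized 1-D Gaussian kernel truncated at about 3 sigma."""
    radius = max(1, int(round(3.0 * sigma)))
    x = np.arange(-radius, radius + 1, dtype=float)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()


def gaussian_blur(image: np.ndarray, sigma: float) -> np.ndarray:
    """Separable Gaussian blur with reflect padding; output has the input shape."""
    kernel = gaussian_kernel(sigma)
    radius = kernel.size // 2
    padded = np.pad(image.astype(float), radius, mode="reflect")
    # Filter along rows first, then along columns, keeping only the valid part.
    rows = np.apply_along_axis(np.convolve, 1, padded, kernel, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, kernel, mode="valid")


def scale_space_cv(image: np.ndarray,
                   sigmas=(1.0, 2.0, 4.0),   # assumed scales, not from the paper
                   eps: float = 1e-8) -> np.ndarray:
    """Per-pixel coefficient of variation (std / mean) across Gaussian scales."""
    stack = np.stack([gaussian_blur(image, s) for s in sigmas])
    return stack.std(axis=0) / (stack.mean(axis=0) + eps)


# Synthetic example: a bright square on a flat background.
demo = np.full((64, 64), 10.0)
demo[24:40, 24:40] = 110.0
demo_cv = scale_space_cv(demo)
```

In this toy image the flat background is unchanged by blurring, so its coefficient of variation stays near zero, while pixels near the square change from scale to scale and receive larger values. Thresholding such a map is one plausible way to separate structure from background.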


Muhammad Farhan (0), Pekka Ruusuvuori (0), Mario Emmenlauer (1), Pauli Rämö (1), Christoph Dehio (1), Olli Yli-Harja (0)

(0) Department of Signal Processing, Tampere University of Technology, 33720 Tampere, Finland
(1) Biozentrum, Universität Basel, 4056 Basel, Switzerland

From 10th International Workshop on Computational Systems Biology, Tampere, Finland. 10-12 June 2013

Introduction

High-throughput screening used in drug design involves identification of genes which modulate a particular biomolecular pathway. RNA interference (RNAi), by decreasing the expression of particular genes in a cell culture, helps in identifying and analyzing the target gene functions in the cells by observing the cell behavior after gene knockdown [1-3]. Image analysis is at the center of such studies, where cell cultures are imaged with automated fluorescence microscopy to study the cell behavior in knockdown as well as in normal conditions. Genome-wide high-content siRNA screening involves studying the dynamics of gene expression in cellular functions for the whole genome and therefore yields hundreds of thousands of images, making their manual analysis impractical [3]. Quantitative image analysis is needed for the identification, classification and quantification of the phenotypes, which is also not possible through manual analysis [3,4]. Consequently, sufficiently fast automated image analysis methods are needed to fulfill the potential of high-throughput systems.

Segmentation of cells is typically at the core of the image analysis pipelines dealing with high-content genome-wide screening experiments [4,5]. This is generally the step that performs cell detection, and further analysis, such as cell tracking, lineage reconstruction and cell classification, is based on the results of cell detection. However, in such experiments, segmentation is challenging due to the presence of a large number of phenotypes.
Different cell phenotypes have different characteristics and appearances and, for some complex and heterogeneous cell cultures, it is difficult to build an analysis capable of detecting all the phenotypes, potentially leading to the loss of some phenotypes. Accurate cell segmentation and detection is therefore essential for quantification of phenotypes.

One of the main challenges in cell segmentation is that cells touch and cluster together, forming clumps. Not only do cytoplasms form clumps, but clustering of nuclei is also quite common. The latter problem has been tackled in our recent article [6]. The problem with cytoplasm regions in general, and specifically with their clumps, is that they often do not have visible boundaries. For this reason, and also because of their irregular shapes, the methods typically used for clump splitting often fail [7].

The other challenge often faced in cytoplasm segmentation is an uneven and varying actin signal. Imaging aberrations cause the actin signal to be saturated at some locations and to be too low at other locations to be regarded as part of the cell. This causes global image segmentation methods to fail. A similar challenge in cytoplasm segmentation is that the inside of the cells is inhomogeneous, and consequently the intensity variations are large. Sometimes part of the cell cytoplasm resembles the background, and methods based solely on image intensity often struggle in such situations [4]. However, if other features local to those regions are examined along with image intensity, the difference between background and cytoplasm can be highlighted. In addition, uneven illumination and out-of-focus regions of the image also cause problems in obtaining accurate segmentation results.

Methods for cell cytoplasm segmentation available in the literature can be mainly divided into two approaches: classic segmentation methods and deformable model-based methods.
The former includes the watershed transform, region growing and mathematical morphology methods, see for example [8,9], whereas the latter comprises active contour [10], level set [11,12] and graph cut based methods [5]. The authors of [7] developed a method in which a watershed algorithm with double thresholds is followed by splitting and merging of cellular regions based on a quality metric obtained from correctly classified cells. Classification of cells is performed using a set of features with a priori information about the cells. In [13], enhancement of high intensity variations in the actin channel is performed by variance filtering. The enhanced image is then smoothed and thresholded using the Otsu thresholding method. Subsequently, a seeded watershed transform is applied, restricted to the binary image of the cytoplasm. In another method [5], a region growing algorithm and modified Otsu thresholding are used to extract the cytoplasm. Long and thin protrusions on spiky cells are extracted by a scale-adaptive steerable filter. Finally, a constraint factor graph cut-based active contour method and morphological algorithms are combined to separate tightly clustered cells. In a met (...truncated)
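The first steps of the pipeline described for [13] above (variance filtering, smoothing, Otsu thresholding) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the code of [13]: the window radius, the box filter used for smoothing, the 256-bin histogram and all function names (`local_variance`, `box_smooth`, `otsu_threshold`, `cytoplasm_mask`) are hypothetical choices, and the seeded watershed step is omitted.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def _windows(image: np.ndarray, radius: int) -> np.ndarray:
    """View of all (2*radius+1)^2 neighbourhoods of a reflect-padded image."""
    size = 2 * radius + 1
    padded = np.pad(image.astype(float), radius, mode="reflect")
    return sliding_window_view(padded, (size, size))


def local_variance(image: np.ndarray, radius: int = 2) -> np.ndarray:
    """Variance filter: variance of each pixel's square neighbourhood."""
    return _windows(image, radius).var(axis=(-2, -1))


def box_smooth(image: np.ndarray, radius: int = 2) -> np.ndarray:
    """Box (mean) filter, used here as an assumed stand-in for smoothing."""
    return _windows(image, radius).mean(axis=(-2, -1))


def otsu_threshold(values: np.ndarray, bins: int = 256) -> float:
    """Otsu's method: pick the cut that maximizes between-class variance."""
    hist, edges = np.histogram(values.ravel(), bins=bins)
    centers = 0.5 * (edges[:-1] + edges[1:])
    weights = hist.astype(float)
    w0 = np.cumsum(weights)                  # count in bins 0..k
    w1 = w0[-1] - w0                         # count in bins k+1..end
    cum = np.cumsum(weights * centers)
    valid = (w0 > 0) & (w1 > 0)              # both classes must be non-empty
    if not valid.any():                      # constant input: nothing to split
        return float(values.mean())
    mu0 = cum[valid] / w0[valid]
    mu1 = (cum[-1] - cum[valid]) / w1[valid]
    between = w0[valid] * w1[valid] * (mu0 - mu1) ** 2
    # Return the right edge of the best bin so that `values > t` is class 1.
    return float(edges[1:][valid][np.argmax(between)])


def cytoplasm_mask(actin: np.ndarray, radius: int = 2) -> np.ndarray:
    """Variance filter, smoothing, then Otsu threshold; returns a boolean mask."""
    enhanced = box_smooth(local_variance(actin, radius), radius)
    return enhanced > otsu_threshold(enhanced)


# Synthetic example: a textured (checkerboard) patch on a flat background.
demo = np.full((64, 64), 10.0)
yy, xx = np.mgrid[20:44, 20:44]
demo[20:44, 20:44] += 40.0 * ((yy + xx) % 2)
demo_mask = cytoplasm_mask(demo)
```

The variance filter responds to texture rather than to absolute brightness, which is why the textured patch is picked up even though its darker pixels have the same intensity as the background.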


This is a preview of a remote PDF: http://www.biomedcentral.com/content/pdf/1471-2105-14-S10-S6.pdf
Article home page: http://www.biomedcentral.com/1471-2105/14/S10/S6

Muhammad Farhan, Pekka Ruusuvuori, Mario Emmenlauer, Pauli Rämö, Christoph Dehio, Olli Yli-Harja. Multi-scale Gaussian representation and outline-learning based cell image segmentation. BMC Bioinformatics, 2013, 14(Suppl 10):S6. DOI: 10.1186/1471-2105-14-S10-S6