Multi-scale Gaussian representation and outline-learning based cell image segmentation
Muhammad Farhan
0
Pekka Ruusuvuori
0
Mario Emmenlauer
1
Pauli Rämö
1
Christoph Dehio
1
Olli Yli-Harja
0
0 Department of Signal Processing, Tampere University of Technology, 33720 Tampere, Finland
1 Biozentrum, Universität Basel, 4056 Basel, Switzerland
Background: High-throughput genome-wide screening to study gene-specific functions, e.g. for drug discovery, demands fast automated image analysis methods to realize the full potential of such studies. Image segmentation is typically at the forefront of such analysis, as the performance of subsequent steps, for example cell classification and cell tracking, often relies on the results of segmentation.
Methods: We present a cell cytoplasm segmentation framework which first separates cell cytoplasm from image background using a novel approach based on image enhancement and the coefficient of variation of a multi-scale Gaussian scale-space representation. A novel outline-learning based classification method is developed using regularized logistic regression with embedded feature selection, which classifies image pixels as outline/non-outline to give cytoplasm outlines. The detected outlines are refined to separate cells from each other in a post-processing step where the nuclei segmentation is used as contextual information.
Results and conclusions: We evaluate the proposed segmentation methodology using two challenging test cases, presenting images with completely different characteristics, with cells of varying size, shape, texture and degrees of overlap. The feature selection and classification framework for outline detection produces very simple sparse models which use only a small subset of the large, generic feature set, that is, only 7 and 5 features for the two cases. Quantitative comparison of the results for the two test cases against state-of-the-art methods shows that our methodology outperforms them, with an increase of 4-9% in segmentation accuracy and a maximum accuracy of 93%. Finally, the results obtained for diverse datasets demonstrate that our framework not only produces accurate segmentation but also generalizes well to different segmentation tasks.
From 10th International Workshop on Computational Systems Biology
Tampere, Finland. 10-12 June 2013
Introduction
High-throughput screening used in drug design involves
identifying genes that modulate a particular
biomolecular pathway. RNA interference (RNAi), by decreasing
the expression of particular genes in a cell culture, helps in
identifying and analyzing target gene functions
by observing cell behavior after gene knockdown
[1-3]. Image analysis takes center stage in such studies,
where cell cultures are imaged with automated fluorescence
microscopy to study cell behavior under knockdown as
well as normal conditions. Genome-wide high-content
siRNA screening involves studying the dynamics of gene
expression in cellular functions for the whole genome and
therefore yields hundreds of thousands of images, making
manual analysis impractical [3]. Quantitative image
analysis is needed for the identification, classification and
quantification of phenotypes, which is likewise not feasible
manually [3,4]. Consequently, sufficiently fast
automated image analysis methods are needed to realize
the potential of high-throughput systems.
Segmentation of cells is typically at the core of
image analysis pipelines for high-content
genome-wide screening experiments [4,5]. It is generally
the step that performs cell detection, and further
analysis, such as cell tracking, lineage reconstruction and
cell classification, builds on the results of cell detection.
In such experiments, however, segmentation is
challenging due to the presence of a large number of phenotypes.
Different cell phenotypes have different characteristics
and appearances and, for some complex and
heterogeneous cell cultures, it is difficult to build an analysis capable
of detecting all the phenotypes, so some phenotypes may be
missed. Accurate cell segmentation and
detection is therefore essential for quantification of
phenotypes.
One of the main challenges in cell segmentation is that
cells touch and cluster together, forming clumps.
Not only do the cytoplasms form clumps, but clustering of
nuclei is also quite common. The latter problem has
been tackled in our recent article [6]. The problem with
cytoplasm regions in general, and with their
clumps in particular, is that they often lack visible boundaries.
For this reason, and also because of their irregular shapes,
the methods typically used for clump splitting often fail
[7]. Another challenge often faced in cytoplasm
segmentation is an uneven and varying actin signal. Imaging
aberrations cause the actin signal to be saturated at some
locations and too low at others to be
regarded as part of the cell. This causes methods based
on global image segmentation to fail. A
related challenge in cytoplasm segmentation is
that the interior of the cells is inhomogeneous, so
the intensity variations are large. Sometimes part
of the cell cytoplasm resembles the background, and
methods based solely on image intensity often
struggle in such situations [4]. However, if
other features local to those regions are
examined along with image intensity, the difference between background and
cytoplasm can be highlighted. In addition, uneven
illumination and out-of-focus regions of the image also
make accurate segmentation difficult.
Methods for cell cytoplasm segmentation available in the
literature can be broadly divided into two approaches: classic
segmentation methods and deformable model-based
methods. The former include the watershed transform,
region growing and mathematical morphology methods,
see for example [8,9], whereas the latter comprise
active contour [10], level set [11,12] and graph cut-based
methods [5]. The authors of [7] developed a method in which
a watershed algorithm with double thresholds is followed by
splitting and merging of cellular regions based on a quality
metric obtained from correctly classified cells. Classification
of cells is performed using a set of features with a priori
information about the cells. In [13], high
intensity variations in the actin channel are enhanced by
variance filtering. The enhanced image is then smoothed
and thresholded using Otsu's method.
Subsequently, a seeded watershed transform is applied, which is
restricted to the binary image of the cytoplasm. In another
method [5], a region growing algorithm and modified Otsu
thresholding are used to extract the cytoplasm. Long and
thin protrusions on spiky cells are extracted by a
scale-adaptive steerable filter. Finally, a constraint factor graph
cut-based active contour method and morphological
algorithms are combined to separate tightly clustered cells.
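To make the variance-filtering and Otsu thresholding steps described for [13] concrete, the following is a minimal sketch in plain Python. It is not the implementation used in [13]: the smoothing and seeded watershed steps are omitted, and the function names `variance_filter` and `otsu_threshold`, the 3x3 window, the 64-bin histogram and the toy image are all assumptions made for illustration. The toy image has a flat "background" half and a textured "cytoplasm" half with the same mean intensity, so intensity alone cannot separate them but local variance can.

```python
# Illustrative sketch only (assumed names and parameters, not the code of [13]).

def variance_filter(img, radius=1):
    """Local variance in a (2*radius+1)^2 window, clipped at the image border."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [img[j][i]
                    for j in range(max(0, y - radius), min(h, y + radius + 1))
                    for i in range(max(0, x - radius), min(w, x + radius + 1))]
            mean = sum(vals) / len(vals)
            out[y][x] = sum((v - mean) ** 2 for v in vals) / len(vals)
    return out


def otsu_threshold(values, bins=64):
    """Threshold that maximizes the between-class variance of a histogram."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return lo
    width = (hi - lo) / bins
    hist = [0] * bins
    for v in values:
        hist[min(int((v - lo) / width), bins - 1)] += 1
    centers = [lo + (k + 0.5) * width for k in range(bins)]
    total = len(values)
    sum_all = sum(c * n for c, n in zip(centers, hist))
    best_k, best_score = 0, -1.0
    w0, sum0 = 0, 0.0
    for k in range(bins - 1):
        w0 += hist[k]
        sum0 += centers[k] * hist[k]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        score = w0 * w1 * (m0 - m1) ** 2  # proportional to between-class variance
        if score > best_score:
            best_score, best_k = score, k
    return lo + (best_k + 1) * width  # upper edge of the last "low" bin


# Toy 8x8 image: columns 0-3 are flat (20), columns 4-7 alternate 10/30 (mean 20).
img = [[20] * 4 + [10 if (x + y) % 2 == 0 else 30 for x in range(4, 8)]
       for y in range(8)]

var_img = variance_filter(img)
t = otsu_threshold([v for row in var_img for v in row])
mask = [[v > t for v in row] for row in var_img]  # True marks textured pixels
```

In this toy case the thresholded variance image marks the textured half as foreground, which is the kind of binary cytoplasm mask a seeded watershed could then be restricted to.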
In a met (...truncated)