FogBank: a single cell segmentation across multiple cell lines and image modalities
Joe Chalfoun (0), Michael Majurski (0), Alden Dima (0), Christina Stuelten (1), Adele Peskin (0), Mary Brady (0)
0 Information Technology Laboratory, National Institute of Standards and Technology, Gaithersburg, MD, USA
1 Laboratory of Cellular and Molecular Biology, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
Background: Many cell lines currently used in medical research, such as cancer cells or stem cells, grow in confluent sheets or colonies. The biology of individual cells provides valuable information, so separating touching cells in these microscopy images is critical for counting, identifying and measuring individual cells. Over-segmentation of single cells continues to be a major problem for methods based on morphological watershed because of the high level of noise in microscopy cell images. There is a need for a new segmentation method that is robust over a wide variety of biological images and can accurately separate individual cells even in challenging datasets such as confluent sheets or colonies.
Results: We present a new automated segmentation method called FogBank that accurately separates cells that are confluent and touching each other. The technique has been applied successfully to phase contrast, bright field, fluorescence microscopy and binary images. The method is based on morphological watershed principles with two new features that improve accuracy and minimize over-segmentation. First, FogBank uses histogram binning to quantize pixel intensities, which minimizes the image noise that causes over-segmentation. Second, FogBank uses a geodesic distance mask derived from raw images to detect the shapes of individual cells, in contrast to the more linear cell edges that other watershed-like algorithms produce. We evaluated the segmentation accuracy against manually segmented datasets using two metrics. FogBank achieved segmentation accuracy on the order of 0.75 (1 being a perfect match). We compared our method with other available segmentation techniques in terms of performance on the reference datasets. FogBank outperformed all of the compared algorithms. The accuracy has also been visually verified on datasets with 14 cell lines across 3 imaging modalities, yielding 876 segmentation evaluation images.
Conclusions: FogBank produces single cell segmentation from confluent cell sheets with high accuracy. It can be applied to microscopy images of multiple cell lines and a variety of imaging modalities. The code for the segmentation method is available as open source and includes a Graphical User Interface for user-friendly execution.
Background
Many cell lines that are currently being studied for
medical purposes, such as cancer cell lines, grow in
confluent sheets. These cell sheets typically exhibit cell line
specific biological properties such as the morphology of
the sheet, protein expression, proliferation rate, and
invasive/metastatic potential. However, cell sheets are
composed of cells of different phenotypes. For example,
individual cells in a sheet can differ in migration
pattern and cell shape, express different proteins, or
differentiate differently. Identifying phenotypes of
individual cells is highly desirable, as it will contribute to
our understanding of biological phenomena such as tumor
metastasis, stem cell differentiation, or cell plasticity.
Time-lapse microscopy now enables the observation of
cell cultures over extended time periods and at high
spatiotemporal resolution. Furthermore, it is now possible
not only to label cells with fluorescent markers, but also
to express fluorescently labeled proteins, enabling
spatiotemporal analysis of protein distribution in a cell sheet
at a cellular level. To assess properties of individual cells
within the observed sheet, however, it is necessary to
accurately track these cells in a fully automated fashion.
Thus, an automated image analysis method must provide
highly accurate single cell segmentation at each time
step and be applicable to a wide range of cell types.
Ideally, the method should also handle the image types
typically acquired in biomedical science, for example,
phase contrast, differential interference contrast, and
fluorescence images.
Segmentation methods based on morphological
watersheds are used for object separation and have appeared
throughout the image processing and analysis literature and
patents since the method was first applied to image
segmentation [1]. Most watershed methods work by dividing
the image surface into regions based on pixel intensity
gradient contours. However, the high level of noise in
biological images leads to over-segmentation, a major
problem when morphological watersheds are used [2-5]. This
noise creates small minima across the regions of interest
in an image, and gives rise to numerous small segmented
regions that do not have biological significance. Therefore,
a new segmentation method that accurately separates
confluent cells into single cells for a wide range of applications
is needed.
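The effect of noise on the number of minima can be illustrated with a toy example. The sketch below is not taken from the FogBank code; it counts regional minima in a one-dimensional intensity profile before and after the intensities are quantized into a few equal-width histogram bins. The profile values, the bin count, and the function names quantize and count_regional_minima are assumptions made for illustration only.

```python
def quantize(values, n_bins):
    """Map each value to the index of its equal-width histogram bin."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / n_bins
    # The maximum value falls on the upper edge, so clamp it into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def count_regional_minima(values):
    """Count minima, treating each plateau of equal values as one candidate."""
    # Collapse runs of equal neighbours so a flat valley counts once.
    runs = [values[0]]
    for v in values[1:]:
        if v != runs[-1]:
            runs.append(v)
    count = 0
    for i, v in enumerate(runs):
        left = runs[i - 1] if i > 0 else float("inf")
        right = runs[i + 1] if i + 1 < len(runs) else float("inf")
        if v < left and v < right:
            count += 1
    return count


# Hypothetical profile: two valleys (two objects) with small noisy ripples.
profile = [9.0, 6.2, 6.4, 3.9, 4.2, 1.0, 1.6, 1.3, 4.1, 3.8, 6.3,
           9.0, 6.1, 6.4, 3.8, 1.4, 1.2, 1.7, 4.2, 6.0, 6.3, 9.0]

raw_minima = count_regional_minima(profile)
binned_minima = count_regional_minima(quantize(profile, 4))
print(raw_minima, binned_minima)
```

In this toy profile the ripples produce seven regional minima in the raw values, while the four-bin version keeps only the two valleys. Values close to a bin edge can still split into spurious minima, so the bin count is a tuning choice rather than a guarantee.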
In general, watershed regions are formed either by a
flooding process, expanding out from gradient minima,
or by a watershed transform which computes a direct
solution. Either of these methods can include the entire
image, or begin from user-defined seed points. For
flooding techniques, typically the regions are flooded
according to intensity levels, through an immersion
simulation [6] that treats the image as a topographic surface.
Minima can be detected automatically, for example, from low
frequency components in the morphological gradient of an
image [7]. Distance transforms can also be used for
watershed segmentation, flooded from localized distance
maxima [8]. Traditional watershed flooding by gradient
level has been improved by adding local neighborhood
comparisons and geodesic distance checking as the
flooding occurs [9]. Gradient vector flow (GVF) [10], a diffusion
of the classical gradient, has been used to give more
weight to important feature edges. The viscous watershed
technique [11] simulates flooding on a filtered relief of the
image. More user-dependent methods extract regions
through selected localized watershed flooding [12].
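As a rough illustration of flooding from seed points in order of intensity level, the following sketch grows labeled regions from user-supplied markers using a priority queue. It is a simplified, generic seeded flooding written for this example, not the algorithm of any specific reference above and not FogBank itself. The function name flood_from_markers, the 4-connected neighborhood, and the tiny test image are all assumptions.

```python
import heapq


def flood_from_markers(height, markers):
    """Label every pixel by flooding outward from seeds, lowest level first.

    height  -- 2D list of numbers (the relief, e.g. a gradient image)
    markers -- 2D list of ints, 0 for unlabeled pixels, >0 for seed labels
    """
    rows, cols = len(height), len(height[0])
    labels = [row[:] for row in markers]
    heap = []
    order = 0  # insertion counter, breaks ties deterministically
    for r in range(rows):
        for c in range(cols):
            if labels[r][c] != 0:
                heapq.heappush(heap, (height[r][c], order, r, c))
                order += 1
    while heap:
        level, _, r, c = heapq.heappop(heap)
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and labels[nr][nc] == 0:
                # The neighbour joins the region that reached it first.
                labels[nr][nc] = labels[r][c]
                # The water level never drops below the current level.
                new_level = max(height[nr][nc], level)
                heapq.heappush(heap, (new_level, order, nr, nc))
                order += 1
    return labels


# Hypothetical relief: two low basins separated by a ridge in the middle column.
relief = [
    [1, 2, 5, 2, 1],
    [1, 2, 5, 2, 1],
    [1, 2, 5, 2, 1],
]
seeds = [
    [0, 0, 0, 0, 0],
    [1, 0, 0, 0, 2],
    [0, 0, 0, 0, 0],
]
result = flood_from_markers(relief, seeds)
for row in result:
    print(row)
```

Each basin is filled by its own seed, and the two regions meet at the ridge. Which region claims the ridge pixels depends on tie-breaking, so this sketch does not mark an explicit watershed line.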
A variety of watershed transforms are
available, dating back to Meyer's watershed transform,
which uses topographic distance to solve a shortest path
problem [11]. The Image Foresting Transform (IFT) [13]
transforms an image into a weighted graph, in which
each pixel is represented by a node in the graph. Cost
functions are calculated for all possible paths within the
graph to find the optimal region separation. The Tie-Zone
Watershed (TZWS) transform [14] is derived from the
IFT. It identifies tie-zones, where regions overlap and
the forests could produce multiple solutions, and
defines unique optimal partitions between regions.
Defining an energy minimization function to partition regions
[15] more effi (...truncated)