Mapping Topographic Structure in White Matter Pathways with Level Set Trees
Citation: Kent BP, Rinaldo A, Yeh F-C, Verstynen T (
Brian P. Kent
Alessandro Rinaldo
Fang-Cheng Yeh
Timothy Verstynen
Editor: Karl Herholz, University of Manchester, United Kingdom
1 Department of Statistics, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America, 2 Department of Biomedical Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America, 3 Department of Psychology and Center for the Neural Basis of Computation, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
Fiber tractography on diffusion imaging data offers rich potential for describing white matter pathways in the human brain, but characterizing the spatial organization in these large and complex data sets remains a challenge. We show that level set trees, which provide a concise representation of the hierarchical mode structure of probability density functions, offer a statistically principled framework for visualizing and analyzing topography in fiber streamlines. Using diffusion spectrum imaging data collected on neurologically healthy controls (N = 30), we mapped white matter pathways from the cortex into the striatum using a deterministic tractography algorithm that estimates fiber bundles as dimensionless streamlines. Level set trees were used for interactive exploration of patterns in the endpoint distributions of the mapped fiber pathways and for an efficient segmentation of the pathways, with empirical accuracy comparable to standard nonparametric clustering techniques. We show that level set trees can also be generalized to model pseudo-density functions in order to analyze a broader array of data types, including entire fiber streamlines. Finally, resampling methods show the reliability of the level set tree as a descriptive measure of topographic structure, illustrating its potential as a statistical descriptor in brain imaging analysis. These results highlight the broad applicability of level set trees for visualizing and analyzing high-dimensional data like fiber tractography output.
Funding: This research was sponsored by the Army Research Laboratory and was accomplished under Cooperative Agreement Number W911NF-10-2-0022. The
views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or
implied, of the Army Research Laboratory or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for Government
purposes notwithstanding any copyright notation herein. This research was also supported by NSF CAREER grant DMS 114967. The funders had no role in study
design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing Interests: The authors have declared that no competing interests exist.
Fiber tractography on diffusion weighted imaging (DWI) data
can provide a high-resolution map of the anatomical connections
between two brain areas [1]. The deterministic variant of fiber
tractography generates a set of simulated fiber streamlines that
provide rich information about the topographic structure of white
matter pathways [2–4]. This method has been used recently to
characterize the sheet-like layout of large, myelinated pathways
[5], map the organization of fiber bundles within the same
pathway [6–8], identify novel neuroanatomical patterns [9–12]
and quantify the global structural connectivity between large sets
of brain regions [3,13], providing a so-called structural
connectome of the human brain (see Van Essen et al. (2012) [14]).
The topography and connectivity of the structural connections
identified with fiber tractography have also been shown to relate
directly to corresponding functional connectivity [15] and
task-evoked functional dynamics [6,16], highlighting the relationship
between structure and function in neural systems. Despite these
advances, the lack of descriptive metrics for the spatial
topography of white matter pathways remains a standing
problem with structural connectivity analysis (see Jbabdi et al.
(2013) [17]).
Clustering is a popular method for summarizing the spatial
organization of white matter pathways [18,19], but clustering is
often a difficult and ill-defined task. Many of the proposed
approaches, such as fuzzy c-means [20,21], spectral clustering
[22,23], diffusion maps [24], local linear embedding [25],
geometric clustering [26,27] and white matter atlas matching
[28,29], assume there is a single well-defined partition of the data
into K separate groups, where K is presumed known a priori.
However, when the data are noisy or have a high degree of
complexity or spatial heterogeneity, as is often the case in
neuroimaging, it is more appropriate to assume the data have
multi-scale clustering features that can be captured by a hierarchy
of nested partitions of different sizes. These partitions and their
hierarchy provide a wealth of information about the data beyond
typical clustering results, unburdening the practitioner from the
need to guess the right number of clusters, providing a global
summary of the entire data set and offering the ability to select
sub-clusters at different levels of spatial resolution depending on
the scientific problem at hand.
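
The idea of a hierarchy of nested partitions can be illustrated with a toy example. The sketch below is not the method used in this paper; it assumes a hand-picked one-dimensional mixture of three Gaussians (the weights, means, and grid are hypothetical) and counts the connected components of the upper level set {x : f(x) >= lambda} at a few density levels. As lambda rises, one component splits into two and then three, which is the kind of nested mode structure a level set tree records.

```python
import math

def gaussian(x, mean, sd):
    """Density of a normal distribution with the given mean and sd."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def density(x):
    """Hypothetical three-mode mixture, chosen only for illustration."""
    return (0.4 * gaussian(x, 0.0, 0.5)
            + 0.3 * gaussian(x, 3.0, 0.5)
            + 0.3 * gaussian(x, 6.5, 0.5))

def count_components(values, level):
    """Count maximal runs of consecutive grid values at or above `level`."""
    count = 0
    inside = False
    for v in values:
        above = v >= level
        if above and not inside:  # a new component starts here
            count += 1
        inside = above
    return count

# Evaluate the density on a regular grid over [-4, 10].
grid = [-4.0 + 0.005 * k for k in range(2801)]
values = [density(x) for x in grid]

# Density levels from low to high (assumed values for the demo).
levels = [0.0005, 0.003, 0.02, 0.28]
counts = [count_components(values, lam) for lam in levels]

for lam, n in zip(levels, counts):
    print(f"level {lam}: {n} connected component(s)")
```

With these assumed parameters the component counts are 1, 2, 3 and 1: the two closest modes separate from the third first, then from each other, and at the highest level only the tallest mode remains.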
There are many well-established hierarchical clustering
methods, some of which have been applied to the problem of fiber track
segmentation [30–33]. However, these methods often suffer from a
lack of statistical justification. Single linkage clustering, for
example, is known to be inconsistent in dimensions greater than
one [34] and suffers from the problem of chaining [18]. In
addition, the dendrograms that result from agglomerative
hierarchical clustering do not indicate the optimal number of clusters;
the practitioner must specify the desired number of clusters or a
threshold at which to cut the dendrogram. Furthermore, the
dendrograms that result from these methods are rarely used as
statistical descriptors in their own right.
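
The dependence on a user-chosen cut can be seen in a small sketch. It assumes nine hypothetical one-dimensional points and relies on the standard fact that cutting a single linkage dendrogram at height t gives the connected components of the graph joining points at distance at most t. The function name `single_linkage_cut` is introduced here for illustration and does not come from any library.

```python
from itertools import combinations

def single_linkage_cut(points, threshold):
    """Flat cluster labels from cutting single linkage at `threshold`.

    Equivalent to the connected components of the graph that links
    every pair of points whose distance is at most `threshold`.
    """
    parent = list(range(len(points)))

    def find(i):
        # Follow parent pointers to the root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(points)), 2):
        if abs(points[i] - points[j]) <= threshold:
            parent[find(i)] = find(j)  # merge the two components

    return [find(i) for i in range(len(points))]

# Hypothetical data: three tight groups, two of them close together.
points = [0.0, 0.1, 0.2, 1.0, 1.1, 1.2, 5.0, 5.1, 5.2]

# The same hierarchy yields different answers at different cut heights.
n_clusters = {t: len(set(single_linkage_cut(points, t)))
              for t in (0.5, 2.0, 4.0)}
print(n_clusters)
```

The three cut heights give three, two and one clusters respectively. Nothing in the hierarchy itself says which of these is the right answer, which is the limitation described above.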
Several recent fiber clustering analyses propose more
sophisticated methods that do not require a priori knowledge of the
number of clusters. Wasserman and Deriche (2008) [35] and
Zvitia et al. (2008) [36] use the mean-shift clustering algorithm,
which finds clusters that correspond to the modes of an assumed
probability density function. Brun et al. (2004) use spectral
clustering but avoid choosing a cluster number by doing recursive
binary data partitions [37]. Wang et al. (2011) use a hierarchical
Bayesian mixture model over supervoxels to estimate white matter
segmentation, with the number of clusters chosen automatically by
a Dirichlet process [38]. Different clustering scales are achieved by
defining supervoxels of various sizes. Many of these methods are
capable of clustering at multiple data resolutions, but this is
generally not t (...truncated)