Classification of Crystallographic Data Using Canonical Correlation Analysis
Hindawi Publishing Corporation
EURASIP Journal on Advances in Signal Processing
Volume 2007, Article ID 19260, 8 pages
doi:10.1155/2007/19260
Research Article
M. Ladisa,1 A. Lamura,2 and T. Laudadio2
1 Istituto di Cristallografia (IC), CNR, Via Amendola 122/O, 70126 Bari, Italy
2 Istituto Applicazioni Calcolo (IAC), CNR, Via Amendola 122/D, 70126 Bari, Italy
Received 28 September 2006; Revised 10 January 2007; Accepted 4 March 2007
Recommended by Sabine Van Huffel
A reliable and automatic method is applied to crystallographic data for tissue typing. The technique is based on canonical correlation analysis, a statistical method which makes use of the spectral-spatial information characterizing X-ray diffraction data
measured from bone samples with implanted tissues. The performance has been compared with a standard crystallographic technique in terms of accuracy and automation. The proposed approach is able to provide reliable tissue classification with a direct
tissue visualization without requiring any user interaction.
Copyright © 2007 M. Ladisa et al. This is an open access article distributed under the Creative Commons Attribution License,
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
1. INTRODUCTION
One of the main goals of tissue engineering is the reconstruction of highly damaged bony segments. To this aim, it
is possible to exploit the patient’s own cells, which are isolated, expanded in vitro, loaded onto a bioceramic scaffold,
and, finally, reimplanted into the lesion site. Generally, bone
marrow stromal cells (BMSC) are adopted, as described in
[1]. In this respect it would be important to characterize the
structure of the engineered bone and to evaluate whether the
BMSC extracellular matrix deposition on a bioceramic scaffold repeats the morphogenesis of the natural bone development. In addition, it is also interesting to look into the interaction between the newly deposited bone and the scaffold in
order to recuperate damaged tissues. This is due to the fact
that the spatial organization of the new bone and the bone-biomaterial integration are regulated by the chemistry and the
geometry of the scaffold used to place BMSC in the lesion site
[1–3].
In this context the standard crystallographic approach to
detect the different tissues is based on a quantitative analysis performed by the Rietveld technique [4, 5]. This method
makes it possible to determine the relative amounts of different tissue
components but it is rather sophisticated and computationally demanding. The aim of this paper is to propose a new
technique based on a statistical method called canonical correlation analysis (CCA) [6]. This method is the multivariate variant of the ordinary correlation analysis (OCA) and
has already been successfully applied to several applications
in biomedical signal processing [7, 8]. Here, CCA is applied
to X-ray diffraction data in order to construct a nosologic
image [9] of the bone sample in which all the detected tissues are visualized. The goal is achieved by combining the
spectral-spatial information provided by the X-ray diffraction patterns and a signal subspace that models the spectrum
of a characteristic tissue type. Such images can be easily interpreted by crystallographers. The paper is organized as follows. In Section 2, we present the mathematical aspects of
the CCA method. Then the application of CCA to crystallographic data is reported in Section 3. In Section 4, the numerical results are described and discussed and, finally, we
draw our conclusions.
2. CCA
CCA is a statistical technique developed by Hotelling in 1936
in order to assess the relationship between two sets of variables [6]. It is a multichannel generalization of OCA, which
quantifies the relationship between two random variables x
and y by means of the so-called correlation coefficient
$$\rho = \frac{\mathrm{Cov}[x, y]}{\sqrt{V[x]\,V[y]}}, \tag{1}$$
where Cov and V stand for covariance and variance, respectively. The correlation coefficient is a scalar with value
between −1 and 1 that measures the degree of linear dependence between x and y. For zero-mean variables, (1) is replaced by
$$\rho = \frac{E[xy]}{\sqrt{E[x^2]\,E[y^2]}}, \tag{2}$$
where E stands for expected value. Canonical correlation
analysis can be applied to multichannel signal processing as
follows: consider two zero-mean multivariate random vectors $x = [x_1(t), \ldots, x_m(t)]^T$ and $y = [y_1(t), \ldots, y_n(t)]^T$, with
$t = 1, \ldots, N$, where the superscript $T$ denotes the transpose.
The following linear combinations of the components in x
and y are defined, which, respectively, represent two new
scalar random variables X and Y :
$$
\begin{aligned}
X &= w_{x_1} x_1 + \cdots + w_{x_m} x_m = w_x^T x, \\
Y &= w_{y_1} y_1 + \cdots + w_{y_n} y_n = w_y^T y.
\end{aligned}
\tag{3}
$$
CCA computes the linear combination coefficients $w_x = [w_{x_1}, \ldots, w_{x_m}]^T$ and $w_y = [w_{y_1}, \ldots, w_{y_n}]^T$, called regression
weights, so that the correlation between the new variables $X$
and $Y$ is maximum. The solution $w_x = w_y = 0$ is not allowed
and the new variables $X$ and $Y$ are called canonical variates.
Several implementations of CCA are available in the literature. However, as shown in [7], the most reliable and fastest
implementation is based on the interpretation of CCA in
terms of principal angles between linear subspaces [6, 10].
For further details the reader is referred to [7] and references
therein. Here, an outline of the aforementioned implementation is provided for the sake of clarity.
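As a minimal illustration of (2) and (3), the sketch below evaluates the sample correlation between two variates for a fixed, hand-picked choice of weights. It is not taken from the paper: the function names `correlation` and `variate_correlation`, the toy data, and the weights are all assumptions made for this example, and expected values are replaced by sample means.

```python
import numpy as np

def correlation(x, y):
    """Sample version of the correlation coefficient for two 1-D signals."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Remove the sample means so that the zero-mean form, Eq. (2), applies.
    x = x - x.mean()
    y = y - y.mean()
    return np.mean(x * y) / np.sqrt(np.mean(x**2) * np.mean(y**2))

def variate_correlation(x_data, y_data, w_x, w_y):
    """Correlation between X = w_x^T x and Y = w_y^T y, as in Eq. (3).

    x_data has shape (N, m) and y_data has shape (N, n): one row per sample t.
    """
    X = np.asarray(x_data, dtype=float) @ np.asarray(w_x, dtype=float)
    Y = np.asarray(y_data, dtype=float) @ np.asarray(w_y, dtype=float)
    return correlation(X, Y)

# Toy data (hypothetical): the first channel of y_data copies the first
# channel of x_data, the remaining channels are independent noise.
rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only
x_data = rng.standard_normal((200, 3))
y_data = np.column_stack([x_data[:, 0], rng.standard_normal(200)])

# Weights that pick out the shared channel versus weights that miss it.
rho_aligned = variate_correlation(x_data, y_data, [1, 0, 0], [1, 0])
rho_other = variate_correlation(x_data, y_data, [0, 1, 0], [0, 1])
```

The two weight choices give very different correlations for the same data, which is why CCA searches over the weights for the pair that maximizes the correlation rather than fixing them in advance.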
2.1. Algorithm CCA (CCA by computing principal angles)
Given the zero-mean multivariate random vectors $x = [x_1(t), \ldots, x_m(t)]$ and $y = [y_1(t), \ldots, y_n(t)]$, with $t = 1, \ldots, N$.
Step 1. Consider the matrices $\tilde{X}$ and $\tilde{Y}$, defined as follows:
$$
\tilde{X} = \begin{bmatrix} x_1(1) & \cdots & x_m(1) \\ \vdots & & \vdots \\ x_1(N) & \cdots & x_m(N) \end{bmatrix},
\qquad
\tilde{Y} = \begin{bmatrix} y_1(1) & \cdots & y_n(1) \\ \vdots & & \vdots \\ y_1(N) & \cdots & y_n(N) \end{bmatrix}.
\tag{4}
$$
Step 2. Compute the QR decompositions [11] of $\tilde{X}$ and $\tilde{Y}$:
$$\tilde{X} = Q_X R_X, \qquad \tilde{Y} = Q_Y R_Y, \tag{5}$$
where QX and QY are orthogonal matrices and RX and RY
are upper triangular matrices.
Step 3. Compute the SVD [11] of $Q_X^T Q_Y$:
$$Q_X^T Q_Y = U S V^T, \tag{6}$$
where S is a diagonal matrix and U and V are orthogonal
matrices. The cosines of the principal angles are given by the
diagonal elements of S.
Figure 1: X-ray diffraction patterns of the investigated bone sample.
Step 4. Set the canonical correlation coefficients equal to the
diagonal elements of the matrix $S$ and compute the corresponding regression weights as $w_X = R_X^{-1} U$ and $w_Y = R_Y^{-1} V$.
The computation of the principal angles yields the most
robust implementation of CCA, since it is able to provide reliable results even when the matrices $\tilde{X}$ and $\tilde{Y}$ are singular.
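Steps 1 to 4 can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function name `cca_principal_angles` and the toy data are hypothetical, the data matrices are assumed to have at least as many rows as columns, and, unlike the robust variant described above, the sketch assumes full column rank so that the triangular factors can be inverted directly.

```python
import numpy as np

def cca_principal_angles(X, Y):
    """Canonical correlations and regression weights via QR and SVD.

    X: (N, m) array and Y: (N, n) array, one row per sample t = 1, ..., N.
    Assumption of this sketch: N >= max(m, n) and both matrices have full
    column rank, so that R_X and R_Y are invertible.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    # Step 1: build the data matrices; centre the columns so that the
    # zero-mean assumption holds for the sample data.
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)

    # Step 2: reduced QR decompositions, Q has orthonormal columns.
    Qx, Rx = np.linalg.qr(X)
    Qy, Ry = np.linalg.qr(Y)

    # Step 3: SVD of Qx^T Qy; the singular values are the cosines of the
    # principal angles between the two column spaces.
    U, s, Vt = np.linalg.svd(Qx.T @ Qy, full_matrices=False)
    s = np.clip(s, 0.0, 1.0)  # guard against round-off slightly above 1

    # Step 4: regression weights, obtained by solving R w = U (or V)
    # instead of forming the inverse explicitly.
    w_x = np.linalg.solve(Rx, U)
    w_y = np.linalg.solve(Ry, Vt.T)
    return s, w_x, w_y

# Toy usage (hypothetical data): the first column of Y is an exact linear
# combination of the columns of X, the second column is independent noise.
rng = np.random.default_rng(1)  # arbitrary seed, for reproducibility only
X = rng.standard_normal((500, 3))
Y = np.column_stack([X @ np.array([1.0, -2.0, 0.5]), rng.standard_normal(500)])

s, w_x, w_y = cca_principal_angles(X, Y)
# Correlation between the first pair of canonical variates.
r_first = np.corrcoef(X @ w_x[:, 0], Y @ w_y[:, 0])[0, 1]
```

In this toy case the largest canonical correlation is essentially 1, because one channel of the second set lies in the span of the first set, and the correlation of the first pair of canonical variates reproduces the first singular value. Handling rank-deficient data would require a rank-revealing step that this sketch deliberately omits.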
3. CCA APPLIED TO CRYSTALLOGRAPHIC DATA
During the data acquisition procedure, a number of microscopic X-ray diffraction images (XRDI) displaying the spatial
v (...truncated)