Overview of constrained PARAFAC models

Article PDF: https://link.springer.com/content/pdf/10.1186%2F1687-6180-2014-142.pdf


Favier and de Almeida EURASIP Journal on Advances in Signal Processing 2014, 2014:142
http://asp.eurasipjournals.com/content/2014/1/142

REVIEW Open Access

Gérard Favier1*† and André LF de Almeida2†

Abstract

In this paper, we present an overview of constrained parallel factor (PARAFAC) models where the constraints model linear dependencies among columns of the factor matrices of the tensor decomposition or, alternatively, the pattern of interactions between different modes of the tensor, which are captured by the equivalent core tensor. Some tensor prerequisites are first introduced, with particular emphasis on mode combination using Kronecker products of canonical vectors, which makes matricization operations easier. This Kronecker product-based approach is also formulated in terms of an index notation, which provides an original and concise formalism for both matricizing tensors and writing tensor models. Then, after a brief reminder of PARAFAC and Tucker models, two families of constrained tensor models, the so-called PARALIND/CONFAC and PARATUCK models, are described in a unified framework for Nth-order tensors. New tensor models, called nested Tucker models and block PARALIND/CONFAC models, are also introduced. A link between PARATUCK models and constrained PARAFAC models is then established. Finally, new uniqueness properties of PARATUCK models are deduced from sufficient conditions for essential uniqueness of their associated constrained PARAFAC models.
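The mode combination mentioned in the abstract can be illustrated with a minimal NumPy sketch. It assumes a small random third-order tensor, 0-based indices, and one particular ordering of the combined modes (mode 2 varying more slowly than mode 3); the helper `canonical` and all variable names are hypothetical, and the ordering convention used in the article itself may differ.

```python
import numpy as np

def canonical(n, size):
    """Return the n-th canonical basis vector of R^size (0-based index)."""
    e = np.zeros(size)
    e[n] = 1.0
    return e

# Hypothetical third-order tensor X of size I x J x K (illustration only).
I, J, K = 2, 3, 4
rng = np.random.default_rng(0)
X = rng.standard_normal((I, J, K))

# Combine modes 2 and 3 into a single column index: the entry X[i, j, k]
# is placed at row e_i and column kron(e_j, e_k), whose only nonzero
# entry sits at position j*K + k (0-based).
X_unf = np.zeros((I, J * K))
for i in range(I):
    for j in range(J):
        for k in range(K):
            col = np.kron(canonical(j, J), canonical(k, K))
            X_unf += X[i, j, k] * np.outer(canonical(i, I), col)

# Under this assumed ordering, the unfolding equals a row-major reshape.
matches_reshape = np.allclose(X_unf, X.reshape(I, J * K))
```

The explicit triple loop is only meant to expose the role of the Kronecker product of canonical vectors; in practice a reshape (possibly after a transpose) gives the same matrix far more cheaply.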
Keywords: Constrained PARAFAC; PARALIND/CONFAC; PARATUCK; Tensor models; Tucker models

†Equal contributors
1 I3S Laboratory, University of Nice-Sophia Antipolis, CNRS, 2000 route des Lucioles, Les Algorithmes-B, 06903 Sophia Antipolis, France
Full list of author information is available at the end of the article

1 Review

1.1 Introduction

Tensor calculus was introduced in differential geometry at the end of the nineteenth century. Tensor analysis was then developed at the beginning of the twentieth century in the context of Einstein’s theory of general relativity, with the introduction of index notation, the so-called Einstein summation convention, which makes it possible to simplify and shorten physics equations involving tensors. Index notation is also useful for simplifying multivariate statistical calculations, particularly those involving cumulant tensors [1]. Generally speaking, tensors are used in physics and differential geometry for characterizing the properties of a physical system, representing fundamental laws of physics, and defining geometrical objects whose components are functions. When these functions are defined over a continuum of points of a mathematical space, the tensor forms what is called a tensor field, a generalization of a vector field used to solve problems involving curved surfaces or spaces, as is the case for curved space-time in general relativity. From a mathematical point of view, two other approaches are possible for defining tensors: in terms of tensor products of vector spaces, or in terms of multilinear maps. Symmetric tensors can also be linked with homogeneous polynomials [2].
After the first tensor developments by mathematicians and physicists, the need to analyze collections of data matrices that can be seen as three-way data arrays gave rise to three-way models for data analysis, with the pioneering works of Tucker in psychometrics [3] and Harshman in phonetics [4], who proposed what are now referred to as the Tucker and parallel factor (PARAFAC) decompositions, respectively. The PARAFAC decomposition was independently proposed by Carroll and Chang [5] under the name canonical decomposition (CANDECOMP) and then called CANDECOMP/PARAFAC (CP) in [6]. For a history of the development of multi-way models in the context of data analysis, see [7]. Since the 1990s, multi-way analysis has enjoyed growing success in chemistry and especially in chemometrics (see Bro’s thesis [8] and the book by Smilde et al. [9] for a description of various chemical applications of three-way models, with a pedagogical presentation of these models and of various algorithms for estimating their parameters). During the same period, tensor tools were developed for signal processing applications, more particularly for solving the so-called blind source separation (BSS) problem using cumulant tensors (see [10-12] and De Lathauwer’s thesis [13], which introduces the concept of high-order singular value decomposition (HOSVD), a tensor tool generalizing the standard matrix SVD to arrays of order higher than two).

© 2014 Favier and de Almeida; licensee Springer. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited.
A recent overview of BSS approaches and applications can be found in the handbook co-edited by Comon and Jutten [14].

Nowadays, (high-order) tensors, also called multi-way arrays in the data analysis community, play an important role in many fields of application for representing and analyzing multidimensional data, as in psychometrics, chemometrics, food industry, environmental sciences, signal/image processing, computer vision, neuroscience, information sciences, data mining, and pattern recognition, among many others. In these fields, they are simply considered as multidimensional arrays of numbers, constituting a generalization of vectors and matrices, which are first- and second-order tensors, respectively, to orders higher than two. Tensor decompositions, also called tensor models, are very useful for analyzing multidimensional data in the form of signals, images, speech, music sequences, or texts, and also for designing new systems, as is the case for wireless communication systems since the publication of the seminal paper by Sidiropoulos et al. [15]. Besides the references already cited, overviews of tensor tools, models, algorithms, and applications can be found in [16-19].

Tensor models incorporating constraints (sparsity; nonnegativity; smoothness; symmetry; column orthonormality of factor matrices; Hankel, Toeplitz, and Vandermonde structured matrix factors; allocation constraints...) have been the subject of intensive work in recent years. Such constraints can be inherent to the problem under study or the result of a system design. An overview of the constraints on components of tensor models most often encountered in multi-way data analysis can be found in [7]. Incorporating constraints in tensor models may facilitate the physical interpretability of matrix factors. Moreover, imposing constraints may make it possible to relax uniqueness conditions and to develop specialized parameter estimation algorith (...truncated)