Wasserstein Riemannian Geometry on Statistical Manifold

International Electronic Journal of Geometry, Oct 2020

In this paper, we study some geometric properties of a statistical manifold equipped with the Riemannian Otto metric, which is related to the $L^2$-Wasserstein distance of optimal mass transport. We construct some α-connections on such a manifold and we prove that the proposed connections are torsion-free and coincide with the Levi-Civita connection when α = 0. In addition, the exponential families and the mixture families are shown to be respectively (1)-flat and (−1)-flat.



INTERNATIONAL ELECTRONIC JOURNAL OF GEOMETRY, VOLUME 13, NO. 2, PAGE 144–151 (2020). DOI: https://doi.org/10.36890/iejg.689702

Carlos Ogouyandjou* and Nestor Wadagni

(Communicated by Murat Tosun)

Keywords: Statistical manifold; Riemannian metric; Otto metric; α-connections; Wasserstein Riemannian space; flatness.

AMS Subject Classification (2020): 15B48; 53C23; 53C25; 60D05.

1. Introduction

Information geometry started as the investigation of the differential geometric structure of a set of probability distributions that constitutes a statistical manifold. Since the seminal work of Rao [11], in which the Fisher information is viewed as a Riemannian metric on a space of probability distributions, it has become clear that, as a differentiable manifold, a space of probability distributions can be equipped with a multitude of Riemannian metrics that are not necessarily the Fisher metric. Considering the Riemannian structure obtained from the Fisher information on a statistical manifold, Amari [2] defines a one-parameter family of affine connections called α-connections. Since then, α-connections have become key tools in information geometry and have been widely investigated by several authors, such as Gbaguidi et al. [7], who constructed a family of α-connections on a Hilbert bundle of a generalized statistical manifold.
In this paper we are interested in statistical manifolds equipped with the Wasserstein metric, which is related to optimal transport. Kantorovich and Rubinstein [8] stated that the Wasserstein metric can be taken as a reasonable distance on spaces of random variables or of probability distributions. However, explicit calculations based on that metric seem to be somewhat difficult to perform. Lott [9] showed that the Riemannian Otto metric related to the Wasserstein metric makes calculations on Wasserstein space easier. We make use of the Otto metric to investigate the Wasserstein Riemannian geometry on a statistical manifold. Let $\mathcal{M}$ be a set of probability densities endowed with the Otto Riemannian metric. We construct on $\mathcal{M}$ a family $\nabla^{(\alpha)}$ of torsion-free α-connections that is exactly the Levi-Civita connection on $\mathcal{M}$ when α = 0. We also find that the exponential families and the mixture families are respectively (1)-flat and (−1)-flat.

The rest of the paper is organized as follows: we recall some preliminaries on α-connections in section 2, and we present useful results on the Otto metric and the Wasserstein metric in section 3. Finally, the main results are given in section 4.

Received: 15-February-2020, Accepted: 16-August-2020
* Corresponding author

2. Preliminary remarks on α-connections

For some integer $d \geq 1$, let $\mathcal{X}$ be a non-empty subset of $\mathbb{R}^d$ and $\mathcal{M}$ be a family of probability distributions on $\mathcal{X}$. Each element $p_\theta$ of $\mathcal{M}$ can be identified with a parameter $\theta = (\theta^1, \cdots, \theta^n) \in \Theta$, where $\Theta$ is a subset of $\mathbb{R}^n$, and the mapping $\theta \mapsto p_\theta$ is injective. $\mathcal{M}$ is a $C^\infty$ differentiable manifold.

Example 2.1. $\mathcal{X} = \mathbb{R}$, $n = 2$, $\theta = (\mu, \sigma)$, $\Theta = \{(\mu, \sigma) : \mu \in \mathbb{R},\ \sigma \in \mathbb{R}_+^*\}$,
\[ p(x, \theta) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right). \]

Put $\ell(\cdot\,; \theta) = \log p(\cdot\,, \theta)$. The functions $\frac{\partial \ell(\cdot\,;\theta)}{\partial \theta^i}$, for $i = 1, \cdots, n$, are the score functions. The tangent space $T_\theta(\mathcal{M})$ can be identified with $\tilde{T}_\theta(\mathcal{M})$, the vector space spanned by the functions $\frac{\partial \ell(x;\theta)}{\partial \theta^i}$ and endowed with the inner product $\langle \tilde{X}, \tilde{Y} \rangle_\theta = E_\theta[\tilde{X}\tilde{Y}]$.
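As a worked illustration of Example 2.1 (not taken from the paper), the score functions of the normal family and their pairwise inner products under $\langle\cdot,\cdot\rangle_\theta$ can be computed by hand. The sketch below assumes only the density of Example 2.1 and the standard Gaussian central moments $E_\theta[x-\mu] = 0$, $E_\theta[(x-\mu)^2] = \sigma^2$, $E_\theta[(x-\mu)^3] = 0$ and $E_\theta[(x-\mu)^4] = 3\sigma^4$.

```latex
\begin{align*}
\ell(x;\theta) &= -\log\sigma - \tfrac{1}{2}\log(2\pi) - \frac{(x-\mu)^2}{2\sigma^2}, \\
\frac{\partial \ell}{\partial \mu} &= \frac{x-\mu}{\sigma^2}, \qquad
\frac{\partial \ell}{\partial \sigma} = \frac{(x-\mu)^2 - \sigma^2}{\sigma^3}, \\
E_\theta\!\left[\left(\frac{\partial \ell}{\partial \mu}\right)^{2}\right]
  &= \frac{E_\theta[(x-\mu)^2]}{\sigma^4} = \frac{1}{\sigma^2}, \\
E_\theta\!\left[\frac{\partial \ell}{\partial \mu}\,\frac{\partial \ell}{\partial \sigma}\right]
  &= \frac{E_\theta[(x-\mu)^3] - \sigma^2 E_\theta[x-\mu]}{\sigma^5} = 0, \\
E_\theta\!\left[\left(\frac{\partial \ell}{\partial \sigma}\right)^{2}\right]
  &= \frac{E_\theta[(x-\mu)^4] - 2\sigma^2 E_\theta[(x-\mu)^2] + \sigma^4}{\sigma^6}
   = \frac{3\sigma^4 - 2\sigma^4 + \sigma^4}{\sigma^6} = \frac{2}{\sigma^2}.
\end{align*}
```

Under these assumptions the two score functions are orthogonal at every $\theta$, with squared norms $1/\sigma^2$ and $2/\sigma^2$.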
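The mapping
\[ \sum_i a^i \frac{\partial}{\partial \theta^i} \mapsto \sum_i a^i \frac{\partial \ell(x; \theta)}{\partial \theta^i} \]
defines an isometry between $T_\theta(\mathcal{M})$ and $\tilde{T}_\theta(\mathcal{M})$ (see [12]).

Definition 2.1 (The Fisher information metric). The Fisher information matrix of $\mathcal{M}$ at $\theta$ is the $n \times n$ matrix $G(\theta) = (\tilde{g}_{ij}(\theta))$ defined by
\[ \tilde{g}_{ij}(\theta) := E_\theta[\partial_i \ell(X, \theta)\, \partial_j \ell(X, \theta)] = \int_{\mathcal{X}} \partial_i \ell(x, \theta)\, \partial_j \ell(x, \theta)\, p(x; \theta)\, dx, \]
where $\partial_i := \frac{\partial}{\partial \theta^i}$ and $\ell(x, \theta) = \log p(x; \theta)$. In particular, when $n = 1$, we call this the Fisher information.

The inner product of the natural basis of the coordinate system $(\theta^1, \cdots, \theta^n)$, $\langle \partial_i, \partial_j \rangle = \tilde{g}_{ij}$, uniquely determines a Riemannian metric $\tilde{g} = \langle \cdot, \cdot \rangle$ such that, for all $\theta \in \Theta$ and for all $X, Y \in T_\theta(\mathcal{M})$,
\[ \tilde{g}_\theta(X, Y) = \langle X, Y \rangle_\theta = E_\theta[(X\ell)(Y\ell)]. \]
$\tilde{g}$ is called the Fisher metric or, alternatively, the information metric.

Definition 2.2. An affine connection $\nabla$ on a differentiable manifold $\mathcal{M}$ is a mapping $\nabla : \mathfrak{X}(\mathcal{M}) \times \mathfrak{X}(\mathcal{M}) \to \mathfrak{X}(\mathcal{M})$, denoted by $(X, Y) \mapsto \nabla_X Y$, which satisfies the following properties:
• $\nabla_{fX + gY} Z = f \nabla_X Z + g \nabla_Y Z$
• $\nabla_X (Y + Z) = \nabla_X Y + \nabla_X Z$
• $\nabla_X (fY) = f \nabla_X Y + X(f) Y$
in which $X, Y, Z \in \mathfrak{X}(\mathcal{M})$ and $f, g \in C^\infty(\mathcal{M})$.

Theorem 2.1. [6] Given a Riemannian manifold $(\mathcal{M}, g)$, there exists a unique affine connection $\nabla$ on $\mathcal{M}$ satisfying the conditions:
• $\nabla$ is symmetric.
• $\nabla$ is compatible with the Riemannian metric $g$.
This affine connection is the Levi-Civita connection on the manifold $(\mathcal{M}, g)$.

In a coordinate system $(U, \theta)$, the functions $\overset{\circ}{\Gamma}{}^{k}_{ij}$ defined on $U$ by $\nabla_{\partial_i} \partial_j = \sum_k \overset{\circ}{\Gamma}{}^{k}_{ij}\, \partial_k$ are called the Christoffel symbols of the Levi-Civita connection, and we have (with summation over the repeated index $m$)
\[ \overset{\circ}{\Gamma}{}^{k}_{ij} = \frac{1}{2} \left( \frac{\partial g_{jm}}{\partial \theta^i} + \frac{\partial g_{mi}}{\partial \theta^j} - \frac{\partial g_{ij}}{\partial \theta^m} \right) g^{mk}. \tag{2.1} \]

Amari [2] considers the function $\Gamma^{(\alpha)}_{ij,k}$ which maps each point $\theta$ to the following value:
\[ \Gamma^{(\alpha)}_{ij,k} := E_\theta\left[ \left( \partial_i \partial_j \ell(X, \theta) + \frac{1-\alpha}{2}\, \partial_i \ell(X, \theta)\, \partial_j \ell(X, \theta) \right) \partial_k \ell(X, \theta) \right], \]
where $\alpha$ is an arbitrary real number. The α-connection $\nabla^{(\alpha)}$, which is an affine connection, is defined by
\[ \langle \nabla^{(\alpha)}_{\partial_i} \partial_j, \partial_k \rangle = \Gamma^{(\alpha)}_{ij,k}, \]
where $\tilde{g} = \langle \cdot, \cdot \rangle$ is the Fisher metric and $\nabla^{(\alpha)}_{\partial_i} \partial_j$ is the α-covariant derivative of $\partial_j$ in the direction of $\partial_i$.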
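The Fisher information matrix and Amari's coefficients $\Gamma^{(\alpha)}_{ij,k}$ recalled above can be checked numerically on the normal family of Example 2.1. The following is a minimal sketch, not code from the paper: it assumes NumPy, approximates $E_\theta$ by Gauss-Hermite quadrature, and the function names (`quadrature`, `scores`, `hessian`, `fisher_matrix`, `alpha_coefficients`) are hypothetical. It concerns the Fisher metric of this section only, not the Otto metric.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss


def quadrature(mu, sigma, order=40):
    """Nodes x_n and weights w_n such that sum_n w_n f(x_n) ~ E[f(X)], X ~ N(mu, sigma^2)."""
    z, w = hermegauss(order)  # quadrature rule for the weight exp(-z^2 / 2)
    return mu + sigma * z, w / np.sqrt(2.0 * np.pi)


def scores(x, mu, sigma):
    """Score functions d l / d mu and d l / d sigma of the normal density; shape (2, N)."""
    z = (x - mu) / sigma
    return np.stack([z / sigma, (z**2 - 1.0) / sigma])


def hessian(x, mu, sigma):
    """Second derivatives d_i d_j l with respect to (mu, sigma); shape (2, 2, N)."""
    z = (x - mu) / sigma
    h = np.empty((2, 2, x.size))
    h[0, 0] = -1.0 / sigma**2
    h[0, 1] = h[1, 0] = -2.0 * z / sigma**2
    h[1, 1] = (1.0 - 3.0 * z**2) / sigma**2
    return h


def fisher_matrix(mu, sigma):
    """g_ij = E[d_i l * d_j l], as in Definition 2.1."""
    x, w = quadrature(mu, sigma)
    s = scores(x, mu, sigma)
    return np.einsum("in,jn,n->ij", s, s, w)


def alpha_coefficients(mu, sigma, alpha):
    """Gamma^(alpha)_{ij,k} = E[(d_i d_j l + (1 - alpha)/2 * d_i l * d_j l) * d_k l]."""
    x, w = quadrature(mu, sigma)
    s = scores(x, mu, sigma)
    integrand = hessian(x, mu, sigma) + 0.5 * (1.0 - alpha) * s[:, None, :] * s[None, :, :]
    return np.einsum("ijn,kn,n->ijk", integrand, s, w)
```

Calling `fisher_matrix(0.0, 2.0)` should return a matrix close to diag(1/4, 1/2), that is diag(1/σ², 2/σ²). The array returned by `alpha_coefficients` is symmetric in its first two indices, which is the coordinate expression of torsion-freeness for this family. Quadrature is used here only because the integrands are polynomials in x, so the rule is exact up to rounding; a general family would need a different integration scheme.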
Next, we recall some important results on the Otto metric, which is a Riemannian metric on the Wasserstein space.

3. Otto metric

3.1. Wasserstein metric

Let $(\mathcal{X}, \mu)$ and $(\mathcal{Y}, \nu)$ be two probability spaces. A coupling of $(\mu, \nu)$ is a random vector $(X, Y)$ such that the law of $X$ is $\mu$ and the law of $Y$ is $\nu$. By abuse of language, the law of $(X, Y)$ is also called a coupling of $(\mu, \nu)$. We denote by $\Pi(\mu, \nu)$ the set of couplings of $(\mu, \nu)$.

Definition 3.1. Let $\mathcal{X}$ be a subset of $\mathbb{R}^n$, $n \in \mathbb{N}^*$, and let $p \in [1, \infty)$. For any two probability measures $\mu, \nu$ on $\mathcal{X}$, the Wasserstein distance of order $p$ between $\mu$ and $\nu$ is defined by
\[ W_p(\mu, \nu) = \left( \inf_{\pi \in \Pi(\mu, \nu)} \int_{\mathcal{X} \times \mathcal{X}} \|x - y\|^p \, d\pi(x, y) \right)^{1/p}. \tag{3.1} \]

Definition 3.2. Let $P(\mathcal{X})$ be the set of probability measures on $\mathcal{X}$. The Wasserstein space of order $p$, $p \in [1, \infty)$, is defined as
\[ P_p(\mathcal{X}) = \left\{ \mu \in P(\mathcal{X});\ \int_{\mathcal{X}} \|x\|^p \, d\mu(x) < +\infty \right\}. \tag{3.2} \]
$W_p$ defines a (finite) distance on $P_p(\mathcal{X})$. For more details on Wasserstein space see [13].

3.2. Otto metric

We consider an n-dimensional regular statistical manifold M = {p(·; θ); θ = (θ1, · · · (...truncated)


This is a preview of a remote PDF: https://dergipark.org.tr/en/download/article-file/970344
Article home page: https://dergipark.org.tr/en/pub/iejg/issue/56935/689702

Carlos OGOUYANDJOU, Nestor WADAGNI. Wasserstein Riemannian Geometry on Statistical Manifold, International Electronic Journal of Geometry, 2020, pp. 144-151, Volume 13, Issue 2, DOI: 10.36890/iejg.689702