Optimal Monte Carlo integration on closed manifolds
Statistics and Computing (2019) 29:1203–1214
https://doi.org/10.1007/s11222-019-09894-w
Optimal Monte Carlo integration on closed manifolds
Martin Ehler1 · Manuel Gräf2 · Chris. J. Oates3
Published online: 30 October 2019
© The Author(s) 2019
Abstract
The worst case integration error in reproducing kernel Hilbert spaces of standard Monte Carlo methods with n random points
decays as n −1/2 . However, the re-weighting of random points, as exemplified in the Bayesian Monte Carlo method, can
sometimes be used to improve the convergence order. This paper contributes general theoretical results for Sobolev spaces on
closed Riemannian manifolds, where we verify that such re-weighting yields optimal approximation rates up to a logarithmic
factor. We also provide numerical experiments matching the theoretical results for some Sobolev spaces on the sphere S2 and
on the Grassmannian manifold G2,4 . Our theoretical findings also cover function spaces on more general sets such as the unit
ball, the cube, and the simplex.
Keywords Bayesian cubature · Covering radius · Reproducing kernel
1 Introduction
Many problems in statistics and the applied sciences require
the numerical integration of one or more functions belonging
to a particular class. Given M ⊂ R D , endowed with some
probability measure μ, and an integrable function f : M →
R,
standard Monte Carlo methods approximate the integral
M f (x)dμ(x) by the finite sum
1
f (x j ),
n
n
(1)
j=1
where {x j }nj=1 ⊂ M are independent samples from μ. On the
one hand, Monte Carlo integration is widely used in many
Handling Editor T. J. Sullivan.
B Martin Ehler
Manuel Gräf
Chris. J. Oates
1
Department of Mathematics, University of Vienna, Vienna,
Austria
2
Acoustics Research Institute, Austrian Academy of Sciences,
Vienna, Austria
3
School of Mathematics, Statistics and Physics, Newcastle
University, Newcastle-Upon-Tyne, UK
numerical and statistical applications (Robert and Casella
2013). It is well-known, however, that the expected worst
case integration error for n random points using (1) in reproducing kernel Hilbert spaces does not decay faster than n −1/2 ,
cf. Brauchart et al. (2014), Breger et al. (2018), Hinrichs
(2010), Novak and Wozniakowski (2010), Plaskota et al.
(2009) and Gräf (2013), proof of Corollary 2.8. To improve
the approximation, it has been proposed to re-weight the random points (Briol et al. 2018; Oettershagen 2017; Rasmussen
and Ghahramani 2003; Sommariva and Vianello 2006; Ullrich 2017), which is of particular importance when μ can
only be sampled (Oates et al. 2017) and evaluating f is rather
expensive.
That re-weighting of deterministic points can lead to optimal convergence order has been known since the pioneering
work of Bakhvalov (1959). For Sobolev spaces on the sphere
and more generally on compact Riemannian manifolds, there
are numerically feasible strategies to select deterministic
points and weights matching optimal worst case error rates,
cf. Brandolini et al. (2014), Brauchart et al. (2014), Breger
et al. (2017), see also Hellekalek et al. (2016), Hinrichs et al.
(2016) and Niederreiter (2003).
The use of random points avoids the need to manually
specify a point set and can potentially lead to simpler algorithms if the geometry of the manifold M is complicated.
For random points, it was derived in Briol et al. (2018) that
the optimal rate for [0, 1]d , the sphere, and quite general
domains in Rd can be matched up to a logarithmic factor
123
1204
Statistics and Computing (2019) 29:1203–1214
if the weights are optimized with respect to the underlying
reproducing kernel. Decay rates of the worst case integration
error for Sobolev spaces of dominating mixed smoothness
on the torus and the unit cube were studied in Oettershagen
(2017). Gaussian kernel quadrature is studied in Karvonen
and Särkkä (2019). Numerical experiments on the Grassmannian manifold were provided in Ehler and Gräf (2017). We
refer to Trefethen (2017a, b), for further related results.
The present paper is dedicated to verify that, for Sobolev
spaces on closed Riemannian manifolds, random points with
optimized weights yield optimal decay rates of the worst case
error up to a logarithmic factor. We should point out that we
additionally allow for the restriction to nonnegative weights,
a desirable property not considered in Briol et al. (2018). Our
findings also transfer to functions defined on more general
sets such as the d-dimensional unit ball and the simplex.
The paper is structured as follows: First, we bound the
worst case error by the covering radius of the underlying
points. Second, we use estimates on the covering radius
of random points from Reznikov and Saff (2015), see also
Brauchart et al. (2018) for the sphere, to establish the optimal
approximation rate up to a logarithmic factor. Some consequences for the Bayesian Monte Carlo method are then
presented. Numerical experiments for the sphere and the
Grassmannian manifold are provided that support our theoretical findings. We also discuss the extension to the unit
ball, the cube, and the simplex.
wce({(x j , w j )}nj=1 , H K )2
=
n
wi w j K (xi , x j ) − 2
i, j=1
+
Let M ⊂ R D be a smooth, connected, closed Riemannian
manifold of dimension d, endowed with the normalized Riemannian measure μ throughout the manuscript. Prototypical
examples for M are the sphere and the Grassmannian
wj
j=1
M M
M
K (x j , y)dμ(y)
K (x, y)dμ(x)dμ(y).
(3)
If x1 , . . . , xn ∈ M are random points, independently distributed according to μ, then it holds
1
E wce({(x j , n1 )}nj=1 , H K )2 n − 2 ,
(4)
cf. Brauchart et al. (2014), Breger et al. (2018), Novak
and Wozniakowski (2010) and Gräf (2013), Proof of Corollary 2.8. Hence, even if H K consists of arbitrarily smooth
functions, the left hand side of (4) decays only like n −1/2 .
The present paper is dedicated to the question if and, as
the case may be, how much one can actually improve the
error rate in (4) when replacing the equal weights n1 with
weights {w j }nj=1 that are customized to the random points
{x j }nj=1 . From a practical perspective, the methods studied
in this paper require that the integrals appearing in (3) can be
analytically evaluated.
Remark 1 The restriction to integrals with respect to the normalized Riemannian measure μ is without too much loss of
generality, for if ν is any σ -finite measure that is absolutely
continuous with respect to μ, then
2 Preliminaries
n
M
f (x)dν(x) =
M
g(x)dμ(x),
dν
dν
where g = f dμ
and dμ
is the Radon–Nikodym derivative of
dν
ν with respect to μ. Often, H is an algebra, so that f , dμ
∈H
also yield g ∈ H and our considerations for μ do apply.
Sd = {x ∈ Rd+1 : x = 1},
Gk,m = {x ∈ Rm×m : x = x, x 2 = x, rank(x) = k},
respectively, where d = k(m − k) with D = m 2 in case of
the Grassmannian.
Let H be any normed space of continuous functions f :
M → R. For points {x j }nj=1 ⊂ M and weights {w j }nj=1 ⊂
R, the worst case (...truncated)