Precision matrix expansion – efficient use of numerical simulations in estimating errors on cosmological parameters
MNRAS 473, 4150–4163 (2018)
doi:10.1093/mnras/stx2566
Advance Access publication 2017 October 5
Precision matrix expansion – efficient use of numerical simulations in
estimating errors on cosmological parameters
Oliver Friedrich1,2‹ and Tim Eifler3,4
1 Universitäts-Sternwarte,
Fakultät für Physik, Ludwig-Maximilians Universität München, Scheinerstr 1, D-81679 München, Germany
Planck Institute for Extraterrestrial Physics, Giessenbachstrasse, D-85748 Garching, Germany
3 Jet Propulsion Laboratory, California Institute of Technology, Pasadena, CA 91109, USA
4 Department of Physics, California Institute of Technology, Pasadena, CA 91125, USA
2 Max
ABSTRACT
Computing the inverse covariance matrix (or precision matrix) of large data vectors is crucial
in weak lensing (and multiprobe) analyses of the large-scale structure of the Universe. Analytically computed covariances are noise-free and hence straightforward to invert; however, the
model approximations might be insufficient for the statistical precision of future cosmological
data. Estimating covariances from numerical simulations improves on these approximations,
but the sample covariance estimator is inherently noisy, which introduces uncertainties in
the error bars on cosmological parameters and also additional scatter in their best-fitting values. For future surveys, reducing both effects to an acceptable level requires an unfeasibly
large number of simulations. In this paper we describe a way to expand the precision matrix
around a covariance model and show how to estimate the leading order terms of this expansion from simulations. This is especially powerful if the covariance matrix is the sum of two
contributions, C = A + B, where A is well understood analytically and can be turned off in
simulations (e.g. shape noise for cosmic shear) to yield a direct estimate of B. We test our
method in mock experiments resembling tomographic weak lensing data vectors from the
Dark Energy Survey (DES) and the Large Synoptic Survey Telescope (LSST). For DES we
find that 400 N-body simulations are sufficient to achieve negligible statistical uncertainties
on parameter constraints. For LSST this is achieved with 2400 simulations. The standard
covariance estimator would require >105 simulations to reach a similar precision. We extend
our analysis to a DES multiprobe case finding a similar performance.
Key words: methods: statistical – cosmological parameters – large-scale structure of Universe.
1 I N T RO D U C T I O N
Wide area surveys such as the currently running Dark Energy Survey
(DES, Flaugher 2005) or the upcoming Large Synoptic Survey
Telescope (LSST, Ivezic et al. 2009) will collect vast amounts of
data about the large-scale structure on the Universe. In cosmological
analyses this data can e.g. be compressed into measurements of twopoint correlation functions of galaxy clustering or cosmic shear. In
a redshift-tomographic analysis this will easily accumulate to data
vectors with several hundreds of data points. Testing cosmological
models from a measurement of such a large data vector requires
precise knowledge of the covariance matrix of the noise in this data
vector and especially of the inverse covariance, which is also called
the precision matrix. To obtain good estimates of these matrices,
E-mail:
survey collaborations have in the past e.g. taken the following route
(Heymans et al. 2013; Kilbinger et al. 2013; Becker et al. 2016): they
ran a set of high-precision numerical simulations, that in the limit
of infinite realizations would in principle allow for a calculation
of the true underlying covariance matrix of their observable. Given
that numerical simulations are expensive, they however estimated
the covariance and precision matrix from only a limited amount of
realizations, leaving them with possibly significant uncertainties of
their exact error budget.
There has been extensive research on the impact of errors
associated with covariance estimation on the constraints derived on cosmological parameters. Hartlap, Simon & Schneider
(2007) discussed the fact that the inverse of an unbiased covariance estimator is not an unbiased estimator for the inverse
covariance matrix (the precision matrix). They also described a
way to correct for this when assuming that the covariance estimate follows a Wishart distribution (see also Kaufman 1967 and
C 2017 The Authors
Published by Oxford University Press on behalf of the Royal Astronomical Society
Accepted 2017 September 29. Received 2017 September 6; in original form 2017 March 21
Precision matrix expansion
2 PA R A M E T E R C O N S T R A I N T S F RO M N O I S Y
C OVA R I A N C E E S T I M AT E S
We begin by outlining the main task of this paper. Let ξ̂ be a vector
of Nd data points measured from observational data and let ξ [π ] be a
model for this data vector that depends on a vector of Np parameters
π . If C is the covariance matrix of ξ̂ then a standard way to constrain
the parameters π is to assign a posterior distribution p(π|ξ̂ ) to them
as
1
p(π)
(1)
p(π|ξ̂ ) ∼ exp − χ 2 π | ξ̂ , C
2
with
T
χ 2 π | ξ̂ , C = ξ̂ − ξ [π] C−1 ξ̂ − ξ [π]
(2)
and p(π) being a prior density incorporating a priori knowledge
or assumptions on π. These expressions in fact ignore that C also
can be dependent on π . We will do this throughout this paper and
refer the reader to Eifler, Schneider & Hartlap (2009) who investigated the impact of cosmology-dependent covariance matrices on
cosmic shear likelihood analyses. Another assumption that goes
into equation (1) is that the measured data vector ξ̂ is drawn from
a multivariate Gaussian distribution. In wide area surveys this is
justified in the limit where one can consider the survey to consist of
many independent sub-regions, such that the measurements in those
regions add up to a Gaussian data vector by means of the central
limit theorem.
If the covariance matrix C is not exactly known, it can e.g. be
estimated from N-body simulations. If ξ̂ i , i = 1...Ns , are a number
of independent measurements of ξ in simulations then an unbiased
estimate of C is given by
Ĉ :=
Ns
T
1
ξ̂ i − ξ̄
ξ̂ i − ξ̄ ,
ν i=1
(3)
where ν = Ns − 1 and ξ̄ is the sample mean of the ξ̂ i . We will
assume Ĉ to have a Wishart distribution with ν degrees of freedom
which follows from our assumption that ξ̂ and the ξ̂ i are Gaussian
distributed (cf. Taylor et al. 2013). Also, we will assume that Ĉ is
an unbiased estimator for the covariance matrix of actual data, i.e.
if Ĉ is indeed an estimate from N-body simulations, then we will
assume these simulations to well resemble the error constributions
present in actual data.
To compute the likelihood in equation (1) we need to know the
precision matrix, i.e. is the inverse covariance matrix = C−1 .
According to Kaufman (1967, see also Hartlap et al. 2007; Taylor
et al. 2013) an unbiased estimator for can be constructed from Ĉ
as
ˆ =
ν − N d − 1 −1
Ĉ
ν
(4)
and (...truncated)