Quantifying lost information due to covariance matrix estimation in parameter inference
MNRAS 464, 4658–4665 (2017)
doi:10.1093/mnras/stw2697
Advance Access publication 2016 October 19
Elena Sellentin and Alan F. Heavens
Imperial Centre for Inference and Cosmology (ICIC), Department of Physics, Imperial College, Blackett Laboratory,
Prince Consort Road, London SW7 2AZ, UK
Accepted 2016 October 17. Received 2016 October 17; in original form 2016 September 6
ABSTRACT
Parameter inference with an estimated covariance matrix systematically loses information due
to the remaining uncertainty of the covariance matrix. Here, we quantify this loss of precision
and develop a framework to hypothetically restore it, which allows one to judge how far away a
given analysis is from the ideal case of a known covariance matrix. We point out that it is
insufficient to estimate this loss by debiasing the Fisher matrix as previously done, due to a
fundamental inequality that describes how biases arise in non-linear functions. We therefore
develop direct estimators for parameter credibility contours and the figure of merit, finding
that significantly fewer simulations than previously thought are sufficient to reach satisfactory
precisions. We apply our results to DES Science Verification weak lensing data, detecting a
10 per cent loss of information that increases their credibility contours. No significant loss of
information is found for KiDS. For a Euclid-like survey, with about 10 nuisance parameters we
find that 2900 simulations are sufficient to limit the systematically lost information to 1 per cent,
with an additional uncertainty of about 2 per cent. Without any nuisance parameters, 1900
simulations are sufficient to only lose 1 per cent of information. We further derive estimators
for all quantities needed for forecasting with estimated covariance matrices. Our formalism
allows one to determine the sweet spot between running sophisticated simulations to reduce the
number of nuisance parameters and running as many fast simulations as possible.
Key words: methods: data analysis – methods: statistical – cosmology: observations.
1 INTRODUCTION
For surveys of the cosmic large-scale structure, for example those described in
Abbott et al. (2015), Heymans et al. (2013), Laureijs et al. (2011) and
Jain et al. (2015), cosmology currently has difficulty reliably describing
measurement uncertainties via a covariance matrix. Although analytical approximations for
the covariance matrix exist and have been exploited, e.g. in KiDS
(Hildebrandt et al. 2016), covariance matrices are usually estimated
from numerical simulations, in the hope that these model the specifics of
a survey and the non-linear structure growth better than analytical
approximations (Sato et al. 2011; Blot et al. 2015; Harnois-Déraps
& van Waerbeke 2015).
Uncertainty in the covariance matrix inevitably leads to loss of
information and a modified likelihood function (Sellentin & Heavens 2016). Here, we investigate further the loss of information, to
quantify the expected gains or losses obtainable by increasing or
decreasing the number of simulated data sets.
We begin with an unbiased sample estimator for the covariance
matrix. If $N_s$ independent simulations each yield a synthetic data
vector $X_i$, with the sample average being $\bar{X} = \frac{1}{N_s}\sum_{i=1}^{N_s} X_i$, then
an unbiased estimator of the true but unknown covariance matrix $\Sigma$
is

$$ S = \frac{1}{N_s - 1} \sum_{i=1}^{N_s} (X_i - \bar{X})(X_i - \bar{X})^T . \qquad (1) $$
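As an illustration of equation (1), the following minimal Python sketch computes the unbiased sample covariance from a stack of simulated data vectors. The function name `sample_covariance`, the toy covariance matrix and the random seed are assumptions made for this example and do not come from the paper.

```python
import numpy as np


def sample_covariance(X):
    """Unbiased sample covariance of equation (1).

    X has shape (N_s, p): one simulated data vector per row.
    """
    X = np.asarray(X, dtype=float)
    n_s = X.shape[0]
    x_bar = X.mean(axis=0)        # sample average over the N_s simulations
    D = X - x_bar                 # deviations from the mean, shape (N_s, p)
    return D.T @ D / (n_s - 1)    # sum of outer products divided by N_s - 1


# Hypothetical toy setup: 500 Gaussian "simulations" of a 2-point data vector.
rng = np.random.default_rng(0)
true_cov = np.array([[2.0, 0.5], [0.5, 1.0]])
X = rng.multivariate_normal(np.zeros(2), true_cov, size=500)
S = sample_covariance(X)
print(S)
```

The same matrix is returned by `np.cov(X, rowvar=False)`, which also uses the $N_s - 1$ normalization by default; the explicit version is shown only to mirror the equation term by term.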
This estimator is a matrix-variate random variable. In Sellentin
& Heavens (2016), we propagated the uncertainty on the true covariance into parameter inference by transferring the randomness of
S on to $\Sigma$ by virtue of a prior and Bayes' theorem. This led to the
likelihood of a p-dimensional data set $X_o$, conditioned on the mean
$\mu$ and S from $N_s$ simulations:

$$ P(X_o | \mu, S, N_s) = \frac{\bar{c}_p \, |S|^{-1/2}}{\left[ 1 + \frac{Q}{N_s - 1} \right]^{N_s/2}} . \qquad (2) $$
This is an ellipsoidally contoured t-distribution, using

$$ Q = (X_o - \mu)^T S^{-1} (X_o - \mu), \qquad (3) $$

and the normalization

$$ \bar{c}_p = \frac{\Gamma(N_s/2)}{[\pi (N_s - 1)]^{p/2} \, \Gamma[(N_s - p)/2]} . \qquad (4) $$

Figure 2. Scaling the confidence contours with equation (40) reduces their
size to the optimal case of a known covariance matrix. The remaining scatter
is then described by equation (38) and depicted in Fig. 3 for a Euclid-like
survey.

© 2016 The Authors. Published by Oxford University Press on behalf of the Royal Astronomical Society
Previous approximate results had been proposed in a series of papers (Hartlap, Simon & Schneider 2007; Dodelson & Schneider
2013; Taylor, Joachimi & Kitching 2013; Percival et al. 2014; Taylor & Joachimi 2014) which employ only the first two moments
instead of the entire distribution of S, by construction maintaining
a Gaussian likelihood. In comparison to a Gaussian, a t-distribution
has broader wings, and a more peaked core. Its width is primarily
set by the number of simulations, as these determine how certain
our estimate of the true covariance matrix is. The lower the number
of simulations, the wider the t-distribution, which reflects the
systematic loss of information due to estimating the covariance matrix. The obvious question in this framework is therefore how much
information is lost for a given finite $N_s$, in comparison to a known
covariance matrix; this is the question we address in this paper.
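To make equations (2)-(4) concrete, the sketch below evaluates the logarithm of the t-distribution likelihood and, for comparison, a Gaussian log-likelihood that treats S as if it were the true covariance. The function names and toy numbers are illustrative assumptions, not the authors' code, and the Gaussian used here carries no debiasing correction of the kind applied in the earlier papers cited above.

```python
import math

import numpy as np


def log_t_likelihood(x_obs, mu, S, n_s):
    """Log of equation (2) for a covariance S estimated from n_s simulations."""
    x_obs = np.asarray(x_obs, dtype=float)
    mu = np.asarray(mu, dtype=float)
    p = x_obs.size
    if n_s <= p:
        raise ValueError("the normalization of equation (4) requires N_s > p")
    r = x_obs - mu
    Q = float(r @ np.linalg.solve(S, r))          # equation (3)
    _, logdet = np.linalg.slogdet(S)              # log |S|
    # Log of the normalization in equation (4), via log-gamma for stability.
    log_c = (math.lgamma(n_s / 2) - math.lgamma((n_s - p) / 2)
             - 0.5 * p * math.log(math.pi * (n_s - 1)))
    return log_c - 0.5 * logdet - 0.5 * n_s * math.log1p(Q / (n_s - 1))


def log_gauss_likelihood(x_obs, mu, S):
    """Gaussian log-likelihood that takes S at face value (no correction)."""
    x_obs = np.asarray(x_obs, dtype=float)
    mu = np.asarray(mu, dtype=float)
    p = x_obs.size
    r = x_obs - mu
    Q = float(r @ np.linalg.solve(S, r))
    _, logdet = np.linalg.slogdet(S)
    return -0.5 * (p * math.log(2 * math.pi) + logdet + Q)


# Hypothetical one-dimensional example with very few simulations (N_s = 5).
S1 = np.array([[1.0]])
far = np.array([5.0])                             # residual far out in the wing
wing_t = log_t_likelihood(far, [0.0], S1, n_s=5)
wing_gauss = log_gauss_likelihood(far, [0.0], S1)
print(wing_t > wing_gauss)
```

For very large $N_s$ the two log-likelihoods agree, while for small $N_s$ the t-distribution penalizes large residuals much less strongly than the Gaussian, which is the broader-wing behaviour described in the text.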
We illustrate this loss with a toy model in Fig. 1 for a survey
with similar characteristics to ESA’s Euclid weak lensing survey
(Laureijs et al. 2011). Our agnostic model for Euclid assumes 10 redshift bins, leading to approximately p = 1500 data points. For Fig. 1,
we use Ns = 1550 simulations and Np = 50 parameters that include
cosmological parameters and a representative number of nuisance
parameters. The relative uncertainties due to estimated covariance
matrices do not depend on any further physical specifications. The
solid contour is the joint 1σ credible region of two arbitrary parameters $\theta_1$ and $\theta_2$, derived from an assumed true covariance matrix.
In contrast, the open contours are derived from different draws of
estimated covariance matrices, and are systematically inflated, due
to losing information when estimating the covariance matrix.
We will assess this loss by developing the Fisher matrix approximation to the t-distribution in order to completely include the
uncertainty of $\Sigma$. However, we will not stop at the Fisher matrix
level, as previous works have done, because an important inequality implies that the information lost in joint credibility regions and
figures of merit (FoM) cannot be calculated from the information
lost in the Fisher matrix: Jensen's inequality describes the fundamental problem that non-linear functions and averages of estimators
do not commute. For any matrix-variate function f of the estimated
covariance matrix S, Jensen's inequality reads
$$ f(\langle S \rangle) \le \langle f(S) \rangle \qquad (5) $$
for convex functions f, and the inequality inverts for concave functi (...truncated)