Quantifying lost information due to covariance matrix estimation in parameter inference


MNRAS 464, 4658–4665 (2017)
doi:10.1093/mnras/stw2697
Advance Access publication 2016 October 19

Elena Sellentin and Alan F. Heavens
Imperial Centre for Inference and Cosmology (ICIC), Department of Physics, Imperial College, Blackett Laboratory, Prince Consort Road, London SW7 2AZ, UK

Accepted 2016 October 17. Received 2016 October 17; in original form 2016 September 6

ABSTRACT
Parameter inference with an estimated covariance matrix systematically loses information due to the remaining uncertainty of the covariance matrix. Here, we quantify this loss of precision and develop a framework to hypothetically restore it, which allows one to judge how far a given analysis is from the ideal case of a known covariance matrix. We point out that it is insufficient to estimate this loss by debiasing the Fisher matrix as previously done, due to a fundamental inequality that describes how biases arise in non-linear functions. We therefore develop direct estimators for parameter credibility contours and the figure of merit, finding that significantly fewer simulations than previously thought are sufficient to reach satisfactory precision. We apply our results to DES Science Verification weak lensing data, detecting a 10 per cent loss of information that increases their credibility contours. No significant loss of information is found for KiDS. For a Euclid-like survey with about 10 nuisance parameters, we find that 2900 simulations are sufficient to limit the systematically lost information to 1 per cent, with an additional uncertainty of about 2 per cent. Without any nuisance parameters, 1900 simulations are sufficient to lose only 1 per cent of information. We further derive estimators for all quantities needed for forecasting with estimated covariance matrices.
Our formalism allows one to determine the sweet spot between running sophisticated simulations to reduce the number of nuisance parameters, and running as many fast simulations as possible.

Key words: methods: data analysis – methods: statistical – cosmology: observations.

1 INTRODUCTION

For surveys of the cosmic large-scale structure, for example those described in Abbott et al. (2015), Heymans et al. (2013), Laureijs et al. (2011) and Jain et al. (2015), cosmology currently has difficulty reliably describing measurement uncertainties via a covariance matrix. Although analytical approximations for the covariance matrix exist and have been exploited, e.g. in KiDS (Hildebrandt et al. 2016), covariance matrices are usually estimated from numerical simulations, in the hope that these model the specifics of a survey and the non-linear structure growth better than analytical approximations (Sato et al. 2011; Blot et al. 2015; Harnois-Déraps & van Waerbeke 2015). Uncertainty in the covariance matrix inevitably leads to loss of information and a modified likelihood function (Sellentin & Heavens 2016). Here, we investigate this loss of information further, to quantify the expected gains or losses obtainable by increasing or decreasing the number of simulated data sets.

We begin with an unbiased sample estimator for the covariance matrix. If $N_s$ independent simulations each yield a synthetic data vector $\boldsymbol{X}_i$, with the sample average being $\bar{\boldsymbol{X}} = \frac{1}{N_s} \sum_{i=1}^{N_s} \boldsymbol{X}_i$, then an unbiased estimator of the true but unknown covariance matrix $\boldsymbol{\Sigma}$ is

$$ \mathsf{S} = \frac{1}{N_s - 1} \sum_{i=1}^{N_s} (\boldsymbol{X}_i - \bar{\boldsymbol{X}})(\boldsymbol{X}_i - \bar{\boldsymbol{X}})^T. \tag{1} $$

This estimator is a matrix-variate random variable, and in Sellentin & Heavens (2016) we propagated the uncertainty on the true covariance $\boldsymbol{\Sigma}$ into parameter inference, by transferring the randomness of $\mathsf{S}$ on to $\boldsymbol{\Sigma}$ by virtue of a prior and Bayes' theorem.
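As an illustration of equation (1), the following minimal sketch computes the sample mean and the unbiased sample covariance from a set of simulated data vectors. It uses only the Python standard library; the function name `sample_covariance` and the toy Gaussian simulations are assumptions made for this example and are not taken from the paper.

```python
import random


def sample_covariance(sims):
    """Return (mean, S) for a list of N_s data vectors, following equation (1).

    `sims` is a list of N_s simulated data vectors, each a list of p floats.
    `sample_covariance` is a hypothetical helper name used for illustration.
    """
    n_s = len(sims)
    if n_s < 2:
        raise ValueError("at least two simulations are needed")
    p = len(sims[0])

    # Sample average: X-bar = (1 / N_s) * sum_i X_i
    mean = [sum(x[a] for x in sims) / n_s for a in range(p)]

    # Accumulate the outer products (X_i - X-bar)(X_i - X-bar)^T
    cov = [[0.0] * p for _ in range(p)]
    for x in sims:
        d = [x[a] - mean[a] for a in range(p)]
        for a in range(p):
            for b in range(p):
                cov[a][b] += d[a] * d[b]

    # The 1 / (N_s - 1) normalization makes the estimator unbiased
    cov = [[c / (n_s - 1) for c in row] for row in cov]
    return mean, cov


# Tiny deterministic check with N_s = 2 and p = 2
print(sample_covariance([[1.0, 2.0], [3.0, 6.0]]))
# -> ([2.0, 4.0], [[2.0, 4.0], [4.0, 8.0]])

# Toy "simulations" (an assumption): independent Gaussians with variances 1 and 4
rng = random.Random(0)
sims = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 2.0)] for _ in range(2000)]
mean, S = sample_covariance(sims)
```

Each call with a different set of simulations returns a different `S`, which is the randomness that the text propagates into the inference. For realistic data vector sizes one would normally use an array library instead; NumPy's `numpy.cov`, for example, applies the same $N_s - 1$ normalization by default (with `rowvar=False` when each row holds one simulation).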
This led to the likelihood of a $p$-dimensional data set $\boldsymbol{X}_o$, conditioned on the mean $\boldsymbol{\mu}$ and $\mathsf{S}$ from $N_s$ simulations:

$$ P(\boldsymbol{X}_o \mid \boldsymbol{\mu}, \mathsf{S}, N_s) = \frac{\bar{c}_p \, |\mathsf{S}|^{-1/2}}{\left[ 1 + \frac{Q}{N_s - 1} \right]^{N_s/2}}. \tag{2} $$

This is an ellipsoidally contoured t-distribution, using

$$ Q = (\boldsymbol{X}_o - \boldsymbol{\mu})^T \mathsf{S}^{-1} (\boldsymbol{X}_o - \boldsymbol{\mu}), \tag{3} $$

and the normalization

$$ \bar{c}_p = \frac{\Gamma\left(\frac{N_s}{2}\right)}{[\pi (N_s - 1)]^{p/2} \, \Gamma\left(\frac{N_s - p}{2}\right)}. \tag{4} $$

© 2016 The Authors. Published by Oxford University Press on behalf of the Royal Astronomical Society

Figure 2. Scaling the confidence contours with equation (40) reduces their size to the optimal case of a known covariance matrix. The remaining scatter is then described by equation (38) and depicted in Fig. 3 for a Euclid-like survey.

Previous approximate results were proposed in a series of papers (Hartlap, Simon & Schneider 2007; Dodelson & Schneider 2013; Taylor, Joachimi & Kitching 2013; Percival et al. 2014; Taylor & Joachimi 2014), which employ only the first two moments instead of the entire distribution of $\mathsf{S}$, by construction maintaining a Gaussian likelihood. In comparison to a Gaussian, a t-distribution has broader wings and a more peaked core. Its width is primarily set by the number of simulations, as these determine how certain our estimate of the true covariance matrix is. The lower the number of simulations, the wider the t-distribution, which reflects the systematic loss of information due to estimating the covariance matrix. The obvious question in this framework is therefore how much information is lost for a given finite $N_s$, in comparison to a known covariance matrix, and this is the question we address in this paper.

We illustrate this loss with a toy model in Fig. 1 for a survey with similar characteristics to ESA's Euclid weak lensing survey (Laureijs et al. 2011). Our agnostic model for Euclid assumes 10 redshift bins, leading to approximately $p = 1500$ data points. For Fig.
1, we use $N_s = 1550$ simulations and $N_p = 50$ parameters that include cosmological parameters and a representative number of nuisance parameters. The relative uncertainties due to estimated covariance matrices do not depend on any further physical specifications. The solid contour is the joint 1σ credible region of two arbitrary parameters $\theta_1$ and $\theta_2$, derived from an assumed true covariance matrix. In contrast, the open contours are derived from different draws of estimated covariance matrices, and are systematically inflated, due to losing information when estimating the covariance matrix.

We will assess this loss by developing the Fisher matrix approximation to the t-distribution in order to completely include the uncertainty of $\boldsymbol{\Sigma}$. However, we will not stop at the Fisher matrix level, as previous works have done, because an important inequality enforces that the lost information of joint credibility regions and figures of merit (FoM) cannot be calculated from the information lost in the Fisher matrix: Jensen's inequality describes the fundamental problem that non-linear functions and averages of estimators do not commute. For any matrix-variate function $f$ of the estimated covariance matrix $\mathsf{S}$, Jensen's inequality reads

$$ f(\langle \mathsf{S} \rangle) \leq \langle f(\mathsf{S}) \rangle \tag{5} $$

for convex functions $f$, and the inequality inverts for concave functi (...truncated)