Precision matrix expansion – efficient use of numerical simulations in estimating errors on cosmological parameters (pdf)

Article PDF cannot be displayed. You can download it here:

https://academic.oup.com/mnras/article-pdf/473/3/4150/21860938/stx2566.pdf

Precision matrix expansion – efficient use of numerical simulations in estimating errors on cosmological parameters

MNRAS 473, 4150–4163 (2018) doi:10.1093/mnras/stx2566 Advance Access publication 2017 October 5 Precision matrix expansion – efficient use of numerical simulations in estimating errors on cosmological parameters Oliver Friedrich1,2‹ and Tim Eifler3,4 1 Universitäts-Sternwarte, Fakultät für Physik, Ludwig-Maximilians Universität München, Scheinerstr 1, D-81679 München, Germany Planck Institute for Extraterrestrial Physics, Giessenbachstrasse, D-85748 Garching, Germany 3 Jet Propulsion Laboratory, California Institute of Technology, Pasadena, CA 91109, USA 4 Department of Physics, California Institute of Technology, Pasadena, CA 91125, USA 2 Max ABSTRACT Computing the inverse covariance matrix (or precision matrix) of large data vectors is crucial in weak lensing (and multiprobe) analyses of the large-scale structure of the Universe. Analytically computed covariances are noise-free and hence straightforward to invert; however, the model approximations might be insufficient for the statistical precision of future cosmological data. Estimating covariances from numerical simulations improves on these approximations, but the sample covariance estimator is inherently noisy, which introduces uncertainties in the error bars on cosmological parameters and also additional scatter in their best-fitting values. For future surveys, reducing both effects to an acceptable level requires an unfeasibly large number of simulations. In this paper we describe a way to expand the precision matrix around a covariance model and show how to estimate the leading order terms of this expansion from simulations. This is especially powerful if the covariance matrix is the sum of two contributions, C = A + B, where A is well understood analytically and can be turned off in simulations (e.g. shape noise for cosmic shear) to yield a direct estimate of B. We test our method in mock experiments resembling tomographic weak lensing data vectors from the Dark Energy Survey (DES) and the Large Synoptic Survey Telescope (LSST). For DES we find that 400 N-body simulations are sufficient to achieve negligible statistical uncertainties on parameter constraints. For LSST this is achieved with 2400 simulations. The standard covariance estimator would require >105 simulations to reach a similar precision. We extend our analysis to a DES multiprobe case finding a similar performance. Key words: methods: statistical – cosmological parameters – large-scale structure of Universe. 1 I N T RO D U C T I O N Wide area surveys such as the currently running Dark Energy Survey (DES, Flaugher 2005) or the upcoming Large Synoptic Survey Telescope (LSST, Ivezic et al. 2009) will collect vast amounts of data about the large-scale structure on the Universe. In cosmological analyses this data can e.g. be compressed into measurements of twopoint correlation functions of galaxy clustering or cosmic shear. In a redshift-tomographic analysis this will easily accumulate to data vectors with several hundreds of data points. Testing cosmological models from a measurement of such a large data vector requires precise knowledge of the covariance matrix of the noise in this data vector and especially of the inverse covariance, which is also called the precision matrix. To obtain good estimates of these matrices, E-mail: survey collaborations have in the past e.g. taken the following route (Heymans et al. 2013; Kilbinger et al. 2013; Becker et al. 2016): they ran a set of high-precision numerical simulations, that in the limit of infinite realizations would in principle allow for a calculation of the true underlying covariance matrix of their observable. Given that numerical simulations are expensive, they however estimated the covariance and precision matrix from only a limited amount of realizations, leaving them with possibly significant uncertainties of their exact error budget. There has been extensive research on the impact of errors associated with covariance estimation on the constraints derived on cosmological parameters. Hartlap, Simon & Schneider (2007) discussed the fact that the inverse of an unbiased covariance estimator is not an unbiased estimator for the inverse covariance matrix (the precision matrix). They also described a way to correct for this when assuming that the covariance estimate follows a Wishart distribution (see also Kaufman 1967 and C 2017 The Authors Published by Oxford University Press on behalf of the Royal Astronomical Society Accepted 2017 September 29. Received 2017 September 6; in original form 2017 March 21 Precision matrix expansion 2 PA R A M E T E R C O N S T R A I N T S F RO M N O I S Y C OVA R I A N C E E S T I M AT E S We begin by outlining the main task of this paper. Let ξ̂ be a vector of Nd data points measured from observational data and let ξ [π ] be a model for this data vector that depends on a vector of Np parameters π . If C is the covariance matrix of ξ̂ then a standard way to constrain the parameters π is to assign a posterior distribution p(π|ξ̂ ) to them as 1 p(π) (1) p(π|ξ̂ ) ∼ exp − χ 2 π | ξ̂ , C 2 with T χ 2 π | ξ̂ , C = ξ̂ − ξ [π] C−1 ξ̂ − ξ [π] (2) and p(π) being a prior density incorporating a priori knowledge or assumptions on π. These expressions in fact ignore that C also can be dependent on π . We will do this throughout this paper and refer the reader to Eifler, Schneider & Hartlap (2009) who investigated the impact of cosmology-dependent covariance matrices on cosmic shear likelihood analyses. Another assumption that goes into equation (1) is that the measured data vector ξ̂ is drawn from a multivariate Gaussian distribution. In wide area surveys this is justified in the limit where one can consider the survey to consist of many independent sub-regions, such that the measurements in those regions add up to a Gaussian data vector by means of the central limit theorem. If the covariance matrix C is not exactly known, it can e.g. be estimated from N-body simulations. If ξ̂ i , i = 1...Ns , are a number of independent measurements of ξ in simulations then an unbiased estimate of C is given by Ĉ := Ns T 1 ξ̂ i − ξ̄ ξ̂ i − ξ̄ , ν i=1 (3) where ν = Ns − 1 and ξ̄ is the sample mean of the ξ̂ i . We will assume Ĉ to have a Wishart distribution with ν degrees of freedom which follows from our assumption that ξ̂ and the ξ̂ i are Gaussian distributed (cf. Taylor et al. 2013). Also, we will assume that Ĉ is an unbiased estimator for the covariance matrix of actual data, i.e. if Ĉ is indeed an estimate from N-body simulations, then we will assume these simulations to well resemble the error constributions present in actual data. To compute the likelihood in equation (1) we need to know the precision matrix, i.e. is the inverse covariance matrix = C−1 . According to Kaufman (1967, see also Hartlap et al. 2007; Taylor et al. 2013) an unbiased estimator for can be constructed from Ĉ as ˆ = ν − N d − 1 −1 Ĉ ν (4) and (...truncated)