Mixed effects: a unifying framework for statistical modelling in fisheries biology


ICES Journal of Marine Science (2015), 72(5), 1245–1256. doi:10.1093/icesjms/fsu213

Review Article

James T. Thorson 1* and Cóilín Minto 2

1 Fisheries Resource Analysis and Monitoring Division, Northwest Fisheries Science Center, National Marine Fisheries Service, National Oceanic and Atmospheric Administration, 2725 Montlake Boulevard East, Seattle, WA 98112, USA
2 Marine and Freshwater Research Centre, Galway-Mayo Institute of Technology, Dublin Road, Galway, Ireland
*Corresponding author: tel: 1 206 302 1772; fax: 1 206 860 6792; e-mail:

Thorson, J. T., and Minto, C. Mixed effects: a unifying framework for statistical modelling in fisheries biology. – ICES Journal of Marine Science, 72: 1245–1256.

Received 11 June 2014; revised 30 October 2014; accepted 31 October 2014; advance access publication 4 December 2014.

Fisheries biology encompasses a tremendous diversity of research questions, methods, and models. Many sub-fields use observational or experimental data to make inference about biological characteristics that are not directly observed (called “latent states”), such as heritability of phenotypic traits, habitat suitability, and population densities to name a few. Latent states will generally cause model residuals to be correlated, violating the assumption of statistical independence made in many statistical modelling approaches. In this exposition, we argue that mixed-effect modelling (i) is an important and generic solution to non-independence caused by latent states; (ii) provides a unifying framework for disparate statistical methods such as time-series, spatial, and individual-based models; and (iii) is increasingly practical to implement and customize for problem-specific models.
We proceed by summarizing the distinctions between fixed and random effects, reviewing a generic approach for parameter estimation, and distinguishing general categories of non-linear mixed-effect models. We then provide four worked examples, including state-space, spatial, individual-level variability, and quantitative genetics applications (with working code for each), while providing comparison with conventional fixed-effect implementations. We conclude by summarizing directions for future research in this important framework for modelling and statistical analysis in fisheries biology.

Keywords: Gaussian random field, hierarchical, individual-level variability, integration, latent variable, measurement error, mixed-effects model, random effects, spatial variation, state space.

Introduction

Counting fish is like counting trees - except they are invisible and they keep moving
John Shepherd, quoted in Hilborn (2002)

Role of models in fisheries science

Marine populations are dispersed across varied habitats and are influenced by ecosystem and climatic factors that change over time (Hilborn and Walters, 1992). Population and community dynamics arise from somatic growth, reproduction, maturation, natural and fishing mortality, movement, between-species interactions, and spatial variation in habitat quality (Hilborn and Walters, 1992; Quinn and Deriso, 1999). These processes are generally not possible to measure directly at the spatial scale of the population or community. Predicting the net effect of all these processes simultaneously therefore requires statistical models that are used to reconcile population and community theory with available data. These statistical models must, explicitly or implicitly, make inference from data to biological characteristics that are not directly observed (i.e. statistical models are often used to count invisible moving trees).
The existence of states that are not directly observed (termed “latent states” in the following) causes complications for many statistical models in fisheries biology (see Millar and Anderson, 2004). Spatial variability may cause unexplained residuals in data from multiple sampling units to be statistically correlated (“spatial non-independence”), and interannual changes in population-dynamics

Published by Oxford University Press on behalf of International Council for the Exploration of the Sea 2014. This work is written by (a) US Government employee(s) and is in the public domain in the US.

(i) Models using random effects are important for inference when analysing fisheries data that exhibit non-independence.
(ii) Random effects provide a unifying statistical framework for models that might otherwise seem unrelated, for example, time-series models for populations, spatial models, genetics models, and models for variation among individuals.
(iii) Models that include random effects are increasingly easy to build and customize for specific fisheries problems using publicly available modelling tools and software.

Discussions of random effects are available elsewhere (Searle et al., 1992; Gelman and Hill, 2007; Pinheiro and Bates, 2009) so we instead seek to provide an accessible introduction for fisheries biologists using fishery and aquatic examples. The remainder of this paper is devoted to quickly reviewing inference and parameter estimation for models with random effects, followed by four case-study applications that include example code for flexible model development and parameter estimation.

Random effects

In this review, we use random effects to broadly refer to parameters that are assumed to arise from a shared stochastic process, where the distribution of likely values can be estimated.
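The definition above can be illustrated with a minimal simulation sketch. It assumes a balanced design, Gaussian random effects, and Gaussian observation error; the function names, parameter values, and the simple method-of-moments estimator are all illustrative assumptions chosen for brevity, not the estimation approach or code associated with this paper.

```python
import random
import statistics


def simulate_groups(n_groups, n_obs, mu, sigma, tau, seed=0):
    """Draw one random effect per group from a shared Normal(mu, sigma)
    distribution, then n_obs noisy observations (error SD tau) per group.
    All names and values here are hypothetical."""
    rng = random.Random(seed)
    effects = [rng.gauss(mu, sigma) for _ in range(n_groups)]
    data = [[rng.gauss(b, tau) for _ in range(n_obs)] for b in effects]
    return effects, data


def estimate_shared_distribution(data):
    """Method-of-moments estimates for a balanced one-way layout.

    Returns (mu_hat, sigma_hat, tau_hat): the mean and SD of the shared
    distribution of random effects, and the observation-error SD.
    """
    n_obs = len(data[0])
    group_means = [statistics.fmean(group) for group in data]
    mu_hat = statistics.fmean(group_means)
    # Pooled within-group variance estimates tau^2.
    within_var = statistics.fmean([statistics.variance(g) for g in data])
    # Var(group mean) = sigma^2 + tau^2 / n_obs, so subtract the noise part.
    between_var = statistics.variance(group_means)
    sigma2_hat = max(between_var - within_var / n_obs, 0.0)
    return mu_hat, sigma2_hat ** 0.5, within_var ** 0.5


if __name__ == "__main__":
    _, data = simulate_groups(n_groups=500, n_obs=5, mu=1.0, sigma=0.3, tau=0.2)
    mu_hat, sigma_hat, tau_hat = estimate_shared_distribution(data)
    print(f"mean={mu_hat:.3f}, sd among groups={sigma_hat:.3f}, error sd={tau_hat:.3f}")
```

The point of the sketch is that replication across groups is what makes the spread of the shared distribution estimable: with a single group, the between-group variance could not be separated from observation error.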
Fixed effects refers to other model parameters (including the parameters governing the distribution of random effects) where there are not sufficient data to estimate these parameters as arising from a shared distribution of likely values. Finally, a mixed-effects model is any model that has a mix of random and fixed effects (see Gelman, 2005 for more discussion).

For example, an ecologist might analyse data from a growth experiment involving many individual fish subject to similar environmental conditions (e.g. Shelton et al., 2013). Because individual growth arises from a shared stochastic process (i.e. the conditions of the experiment), parameters representing growth rate for each fish can be treated as if they are randomly drawn from a shared distribution of likely values. By treating growth parameters as if they are drawn from a shared distribution, the ecologist can then estimate characteristics of the distribution of likely growth rates (i.e. its mean and standard deviation). The parameter representing average growth rate is in this case fixed, while individual growth rate parameters are random (i.e. the growth rates of individual fish are replicated in the experiment, so the magnitude of variation in growth rate among individuals can be estimated).

Random effects are particularly useful thanks to the phenomenon called shrinkage. Shrinkage occurs when random effects are shrunk towards the average value f (...truncated)