On the use of tagging data in statistical multispecies multi-area models of marine populations
On the use of tagging data in statistical multispecies multi-area
models of marine populations. - ICES Journal of Marine Science
On the use of tagging data in statistical multispecies multi-area models of marine populations
Sigurdur Hannesson 0
Audbjorg Jakobsdottir 0
James Begley 0
Lorna Taylor 0
Gunnar Stefansson 0
0 S. Hannesson: Mathematical Institute, University of Oxford , 24-29 St Giles', Oxford OX1 3LB , UK. A. Jakobsdottir, J. Begley, L. Taylor
The use of multispecies models of marine stocks along with recognition of the importance of temporal differences in spatial overlap has resulted in migration rates playing an increasingly important role in models of fish stocks. Moreover, traditional estimates of growth based on samples from fishing gear are confounded with the selection pattern, which is exacerbated when multispecies issues are considered. For these and other reasons, there is a need to include explicit tagging data as a component of fisheries models. A statistical multispecies multi-area framework has been extended to predict tag returns and subsequently to incorporate tagging data in likelihood components to be used when estimating migration rates. The information content of such data is not clear a priori, but simulations indicate the point estimates to be quite reasonable. A bootstrap method is proposed, based on bootstrapping entire tagging experiments (rather than individual fish). The resulting bootstrapped uncertainty estimates are generally applicable and are found to be close to the true values in the simulated examples.
bootstrapping; migration estimation; multispecies models
Recent multispecies models incorporate multiple areas to account
for species overlap, an important concern when considering
(Stefansson and Palsson, 1997a, 1998; Begley
and Howell, 2004; Taylor et al., 2007)
. This implies an immediate
need to account for migration in such models, and these
migrations are typically poorly known. The rates therefore need
to be estimated and the errors associated with these estimates
need to be determined.
This is all the more important in the light of recent
developments in the management of fisheries, where the potential for
marine protected areas (MPAs) as management tools is becoming
better acknowledged, although still under debate
1998; Lubchenco et al., 2003)
. Such management procedures
have been criticized for lack of rigour, because they may not
have been subject to the same level of testing as traditional
management systems (Hilborn et al., 2004), although some such
testing has been initiated through simulations
In some cases, it is possible to use such simulations of different
migration scenarios to evaluate the effects of MPAs. However, a
more common requirement will be to obtain reasonable point
estimates of migration rates for a given environment. In particular,
this type of information is likely to allow some leeway in how,
for example, MPAs are set up. The evaluation of corresponding
harvest control rules will require knowledge of the corresponding
error rates. These error rates are quite important, because fairly
small changes in assumed migration rates can seriously affect
(Stefansson and Rosenberg, 2006)
. The same
concerns will apply when considering multispecies models
where, for example, species overlap is an important parameter
and is a function of migration rates.
Here, we describe several simulation experiments designed to
investigate the effects of using tagging data to estimate migration
rates. The experiments are intended to shed some light on the
estimability of migration rates in the context of statistical
multispecies multi-area models and to provide a means to measure
the uncertainty associated with the estimates.
The concept of a substock (or, equivalently, a stock
component) is used to denote the fish in a given area and maturity
stage. Most of this work uses data simulated quarterly on two
areas, with fish migrating between the areas while subject to
harvest mortality, natural mortality, and other processes.
Bootstrap methods are designed and used to ascertain the
estimation errors associated with maximum likelihood estimation of
migration rates in the non-linear population dynamics models.
Simulated populations are used to verify the estimation of error
Finally, we present a case study for immature and mature cod
(Gadus morhua) around Iceland, using a real dataset.
General multispecies population dynamics
The following describes in fairly generic terms the population
dynamics which can be incorporated in current spatially explicit
multispecies models of marine animal populations. All of these
processes are implemented in a computer program, Gadget (see
http://www.hafro.is/gadget), designed for generic multispecies
(Begley, 2004; Begley and Howell, 2004)
. If considered
important in the parent population (or substock in Gadget
terminology), these processes also need to be taken into account when
implementing tagged subpopulations.
Denote by Nalmsrt the number (N) of animals of species s, age a,
length l, and maturity stage m, alive in region r, and time-step t.
This group of individuals is a subset of the entire collection of
animals of a given species. Such a collection will be referred
to as a substock. Mathematically, this is just a collection of
numbers, but they are usually unknown and will need to be
estimated using statistical techniques.
Normally, there are only two maturity stages within the model,
immature and mature animals (m ? 0,1), though for closed life
cycles, there is a need to include the egg and larva stages as a
separate population. Always, only a fixed area, time-step, and maturity
stage are considered, and in this case, such indices are omitted and
the notation is simplified to Nals.
Several processes can affect such a group of animals and cause a
change in numbers in the group. A framework such as Gadget
(Taylor et al., 2007)
includes the following:
(i) Migration undertaken by a population on a given time-step
is described by matrices, P ? ?pr1r2 ?rr21??11;;......RR, containing the
proportion pr1r2 of the population that moves from area
r2 to area r1, and R is the number of areas. Hence, if u ?
(u1, . . . , uR)0 denotes abundance numbers by area for an
age ? length ? maturity ? species group in a population, and
P is such a matrix, then the area distribution after the
migration has taken place is Pu. When the migration
matrices vary by age, time, maturity stage, etc., the notation
Palmst may be needed.
(ii) Consumption of a prey by a predator and fishing by a fleet is
defined through a set of equations which describe the
desired food supply of a predator and the suitability of
each prey class. This is only used in the context of fishing
in the present paper.
(iii) Natural mortality other than predation/fishing mortality is
usually taken as a fixed number for each species, but it may
of course vary with time, age, or size.
(iv) Growth in length within multispecies population dynamics
models is most naturally implemented through growth in
weight, because the weight increase is more naturally
linked to consumption. Alternatively, growth may be
according to a fixed growth schedule in either length or
weight. In any case, as the model is based on the frequency
in length groups, growth in length needs to correspond to
transfers of fish between length groups in such a manner
that the predicted mean growth is maintained for the
length group. This is implemented through the use of a
length-update probability distribution, where the updating
distribution is a beta-binomial which has the mean length
increment as one parameter, and the other is a free
parameter (b) and can be estimated or fixed.
(v) A (mature) population may spawn to generate offspring and
lose biomass. This possibly results in spawning mortality.
(vi) Maturation involves shifting fish from a population of
immature fish to another population of mature fish. This
is done using proportions (as in the migration process),
which are designed to mimic the resulting proportion of
mature fish in each age ? length cell.
(vii) Ageing involves increasing the age by 1 year, which occurs at
the last time-step of a year (except for the oldest age group,
which is usually a plus group).
The order in which the processes are implemented is important.
The above sequence is the order in which these are implemented
When fish have been tagged, it is useful to think of the tagged fish
as a subpopulation of the population (or substock) from which
they were taken. A tagged subpopulation can be modelled as a
subgroup, Na0lmsrte of the numbers in a population, Nalmsrt, where
e denotes a given tagging experiment. Each tagging experiment
forms a separate subgroup.
Ideally one would like to tag fish and record, at the time of
tagging, every possible piece of information, so they can be
entered as a completely known population into the model.
Naturally, when fish are tagged, they are also length-measured
and the position is known, but the age and maturity stage are
generally not identified. Therefore, when forming the model
population from the tagging data, the tagged fish are allocated
to age groups in accordance with the age composition of the
substock at the time-step (t ) of tagging: Na0lmsrte ? N.0lmsrte
Nalmsrt/N.lmsrt, where ?.? denotes aggregation over the
In addition to the processes described above, tag loss may need
to be taken into account. This is done through a parameter, g,
referred to as tag loss. At each time-step 1 2 e2g, a proportion
of the tagged subpopulation loses its tags at the same time as
natural mortality acts on the subpopulation.
Ignoring for the moment this additional process, the new
tagged subpopulations have exactly the same dynamics as the
original population. Assuming that the number of tagged fish is low
compared with the population totals, the tagged fish do not
affect the dynamics of the population as a whole. Therefore, the
tagged dynamics can be viewed conditionally on the dynamics of
the whole population, but from an implementational point of
view, these components will normally be simulated
simultaneously. It is clear that computational effort will increase as
the number of tagging experiments increases.
For technical reasons, it is useful to think not only of the
number of tagged fish, but also of the proportion of tags, p?e ?
Ne0/N, in a given group (with a, l, m, s, r and t fixed).
The tagged subpopulations need to be subjected to exactly
the same processes as described for the original populations.
For practical reasons, the implementation of this is quite
important. Consider first the mortality processes that affect
the numbers in a given cell through reduction alone. They do
not change the proportion of tagged fish, and the values of p?e
remain fixed throughout these processes. It follows that these
processes can be implemented for the parent population alone,
and the proportions p?e can be used after the process to
recompute the tagged numbers. This is in contrast to implementing
migration, because in this case the proportions can change
drastically and are not easy to track until after the migration process
has been completed, so here the process needs to be applied to
each tagged subpopulation separately. It is therefore clear that
judicious use alternately of numbers and proportions can
simplify implementation considerably. Given the computational
resources required for a simulation environment such as
Gadget, this can be quite important with regard to
As the same population dynamics process applies to the tagged
subpopulation, the predicted number of tag returns is just the
predicted tagged catch, C? almsrfte, which is predicted like the
catch of any other substock
(Begley, 2004; Begley and Howell,
. The corresponding data or observed count, Calmsrfte,
should have this expected value if the model is correct. The new
subscript, f, refers to fleet, because there may be many fleets in
The models used in this paper and implemented in Gadget have
deterministic forward-projection models as a basic internal
simulation mechanism. Any such forward projection is based
on assumptions, several of which may be formulated in terms
of parameters which are potentially unknown but can be
A forward projection of the system results in a set of predicted
values for all model components, including the catch in numbers
of each length and age group at each time-step. This also provides
predicted mean lengths-at-age along with predictions of arbitrary
aggregations of catches in weight or numbers. Any of these can be
used to obtain likelihood components if corresponding data are
available. The estimation itself proceeds by minimizing the sum
of all negative log-likelihood components, possibly after weighting
each appropriately. In addition to proper likelihood components,
it is common to add penalties when bounds are exceeded and a
special penalty component which enters as the squared difference
between predicted and observed landings, when the model does
not provide enough catch in an area.
The mean length-at-age can be entered as sums of squares, and
data on length distributions can be entered as Gaussian
log-likelihoods (sums of squares) or using either multinomial
or multivariate Gaussian distributions. The Gadget likelihood
components are described in
Begley and Howell (2004)
accommodate tag-recapture data, e.g. as needed in this work, a
likelihood component has been added to Gadget as described
The recapture data are initially taken as Poisson counts, as
. As the Poisson parameter is now
Calmsrfte, the corresponding likelihood contribution becomes
Pee C^ almsrfte
C^ aClamlmssrrffttee :
This is the ideal situation, applicable when information about
length, maturity, age, and location of recaptures is available for
each individual fish recaptured. In cases when one or more of
these pieces of information are unavailable, aggregation will be
needed in one form or another. Normally, the area, period and
tagging experiment are available, but one or more of the others
may be missing. It is not clear how to combine data with
varying levels of information, but it is certainly possible to
use only the information which tends to be available always, e.g.
Pee C^ ...:rfte C...:rfte ;
when information is only available on region, fleet, time-step, and
recapture (r, f, t, e), so aggregates over age, length, maturity stage,
and substock (a, l, m, s) are used.
The emphasis in this approach is on predicting the observed
counts. It follows that this is mainly designed to resolve issues of
numbers, i.e. to provide some added resolution to mortality
rates and population numbers.
Simulation, data, and estimation
The level of information available in tagging data is not clear, nor
is the required number of tagging experiments or number of fish in
each tagging experiment, to obtain a given reduction in variance of
migration parameters. It should also be acknowledged that tagging
experiments are rarely carried out on fish selected at random,
whereas use within a model would generally be to calculate
average migration rates. Similarly, it is not at all clear how the
nonlinear nature of the population models affects the parameter
estimates, but in general, some bias is to be expected. A simulation
approach has been developed to verify the assumption and
estimation accuracy in the parameters of multispecies, multi-area
Biological parameters in the simulated stock
To focus on the effects of migration and associated uncertainty, the
population models in simulations are kept as simple as possible.
The model is run for 20 years with four (quarterly) time-steps
each year. A single stock is simulated, but the stock is distributed
across two areas, with an age range of 1 ? 10 years and a length
range of 4.5 ? 90.5 cm, at 1-cm intervals, giving 86 length groups.
The simulated stock grows according to a von Bertalanffy
growth function [Gadget?s ?lengthvbsimple? growth function
], with parameter values L1 ? 95, k ? 0.08. The
weight-at-length relationship is a simple power relationship, w ?
clb, with c ? 1 1025 and b ? 3. The maximum length group
growth was 6, and in the beta-binomial, b was set at 12.
Recruitment enters the stock once a year, and for simplicity, the
simulated recruitment enters as 1 million recruits in each area,
each year. The stock will be assumed to migrate every time-step.
To simulate variable spawning, returning and feeding behaviour,
there will be three different migration matrices. These will
assume the same migration pattern for the second and third
timestep each year. These matrices are:
with i ? 1,4, p1 ? 0.4, and p4 ; 0; and
The fishing fleet catches a fixed proportion of the stock each
year, implemented as 6.25% of the stock each quarter,
corresponding to a constant fishing mortality (as a proportion) of F ? 0.25,
but alternative scenarios are considered. Natural mortality is set
to 0.2 for all ages.
In some real situations, the catch information is disaggregated
by season and area, and this will provide some bounds on the
migration parameters. In the present models, the catch from the
stock was aggregated across areas, so those likelihood components
did not affect the estimation of migration rates. This is done to
avoid confusion over which part of the model provides
information on migration.
Simulated stocks and components
A simulation model has been developed in the R statistical package
(see http://www.r-project.org), simulating the population
dynamics described in the previous section. This simulation
model is an extension of
Sigurdardottir et al. (2005)
, with the
addition of generating recaptures from tagging experiments. The
simulated data are generated separately from the estimation
procedure, which is to be verified.
The simulation predicts the expected number of tag returns
from the catch of the tagged stock, C? almsrfte. Stochastic simulated
data are generated from a Poisson distribution with this parameter,
and the generated number is rounded to the nearest integer. These
simulated data are then fed into the estimation procedure to give
estimated migration rates (or other parameters if needed).
Repeating these simulations provides multiple sets of estimated
migration rates, which can be used to estimate the variability in
estimated migration rates (in addition to any bias). As these
values are obtained by simulating different ?realities?, they can
be obtained with any desired accuracy simply by repeating the
simulations. It should be noted that these then become ?true?
standard errors of estimated migration rates, i.e. these give sp?i.
In real situations, one attempts to estimate the errors using a
numerical procedure (such as Hessian matrices or bootstrapping),
which gives s? ?i. The simulation approach can be used to verify
whether the estimates are realistic at least within the bounds of
the simulated world.
A model with two areas and two stock components was considered
using biological assumptions detailed elsewhere
Stefansson, 2004; Taylor et al., 2007)
. The model includes
immature and mature cod stock components and only the mature fish
migrate between the areas. There are 12 time-steps in each year.
Many parameters were fixed to the values estimated from a
simpler single area model, such as growth and maturity
parameters. Migration rates and two sets of initial population
parameters and recruitment parameters were estimated using tagging
data, survey indices, and age distribution data. This is because
migration is linked to the estimation of initial population and
recruitment, which strongly affect the biomass and in turn
influences the number of recaptures.
In the model, immature cod can be 1 ? 10 years old and occupy
1-cm intervals in the range 4 ? 130 cm. These grow according to a
von Bertalanffy growth function
growth function; Begley, 2004)
with a maximum growth of ten
length groups per time-step. Recruitment occurs in the first
timestep each year, and this substock matures into the mature stock.
Mature cod are 3 ? 12 years old and 20 ? 140 cm long (1-cm
intervals). They grow like immature cod. This substock migrates
with the migration pattern similar to that in the simulated
model, described below.
There are two fleets, commercial and the March groundfish
survey. Further, there are a considerable number of likelihood
components, but the most important, here, are the recapture
likelihood components. The others are:
(i) bound likelihood component: a penalty is generated when
the bounds given for the parameters to be estimated are
(ii) migration penalty likelihood component: generates a penalty
if there is a negative entry in a migration matrix;
(iii) catch distribution likelihood components: sums of squares of
the difference of the proportion of the data and the model
sample of fish caught by time, area, and age, to estimate
the initial population;
(iv) understocking likelihood component: sums of squares of
overconsumption in the model are included to ensure that
no more fish are removed from the stock than a certain
(Begley and Howell, 2004; Taylor
and Stefansson, 2004)
(v) survey indices likelihood components
(Taylor et al., 2007)
These likelihood components are described in detail in
A basic split of Icelandic and surrounding waters is given in
Figure 1. In this Figure, subareas 101 ? 108 basically give waters
,500 m deep and are usually considered the main cod areas.
Area 102 is a feeding area, subject to considerable migration,
whereas the main spawning grounds are in 101 and 108. For
testing with real data, the models use two areas: a ?southern
area?, Area 1 (subareas 101, 102, 107, and 108) and a ?northern
area?, Area 2 (subareas 103, 104, and 105). These are all based
on the definitions set out in Taylor (2003b). This is not completely
in accordance with convention, because common assessments
include 102 with the northern part.
When analyses have a fine time-scale and with most of the
spawning in subareas 101 and 108, together with the main
feeding ground in 102, the migration rates are expected to be
fairly low. They are therefore scaled to percentages for
Tag-recapture data for the years 1991 ? 1997 were extracted from a
(Kupca and Sandbeck, 2003; Taylor, 2003a, b;
. Tagged fish were aggregated by month, year, and
area, giving 13 tagging experiments. All tagged fish .40 cm
were classified as mature, because these tagging experiments
targeted spawning aggregations. The tagging experiments are
summarized in Table 1.
Tag loss was fixed at 0.02 for all 13 tagging experiments, and
other parameters were estimated. At this rate, more tagged fish
lose tags than die in each time-step, because natural mortality is
0.2 per year but tag loss is 0.02 per month. Note that tagging
mortality and non-reporting are not specifically taken into account in
Gadget, so the tag-loss parameter also represents these processes.
The weights on the recapture data in the negative log-likelihood
function were all set equal. Although alternatives exist
et al., 2007)
, the emphasis here is on a generic estimation
Figure 1. Separation of waters around Iceland into areas. Area 1, south: subareas 101, 102, 107, and 108. Area 2, north: subareas 103, 104, and
105. Subarea 102 is an important feeding area for cod, whereas 101 and 108 are the main spawning grounds. See text for detail.
method and generic estimation of uncertainty. It is important that
this generic uncertainty estimation works also when the weights
are uncertain or incorrect, so no attempt is made to adjust these.
To differentiate between spawning, returning, and feeding
behaviour, three migration matrices are implemented. These matrices
are as in Equations (1) and (2), with p1 and p2 ; p3 to be
estimated, but p4 ; 0.
From January to April (inclusive), migration matrix P1 is used,
from May to August (inclusive) and December, migration matrix
P2 is used, and from September to November (inclusive), there is
no migration (P4 is used). It is assumed that no fish will move to
areas other than the two areas involved.
The above population dynamics models and estimation
procedures have been implemented within Gadget. Estimation is
always implemented with an initial global search, the simulated
(Corana et al., 1987)
, followed by a local
search, the Hooke and Jeeves algorithm
(Hooke and Jeeves,
, both implemented within Gadget
(Begley, 2004; Begley
and Howell, 2004)
Bootstrap estimation of uncertainty
is a very general method that can be
used to obtain estimates of uncertainty. Given the complexity,
specialized bootstrap methods have been developed to shed
some light on the issue of estimability and accuracy of parameter
estimates in typical multi-area fisheries models. Some care needs
to be taken when implementing the bootstrap for tagging data,
or any fisheries data for that matter. In particular, it is very clear
that the fish within a single tagged sample are not randomly
sampled from the entire population of wild fish, so the traditional
(Efron and Tibshirani, 1994)
needs to be modified. The
correlated nature of these data is commonly termed the
(Pennington and V?lstad, 1994)
. It is taken
into account here by taking the sampling units for bootstrapping
purposes to be individual tagging experiments rather than
individual tagged fish.
For a scenario with a collection of N tagging experiments,
bootstrapping is implemented by sampling N experiments (with
replacement) from this collection, giving a bootstrap sample. Given this
bootstrapped sample, the parameters are estimated giving a single
bootstrap parameter estimate. This procedure is then repeated to
obtain repeated bootstrap estimates of parameters, subsequently
providing estimated standard errors.
It is not known a priori whether the bootstrap procedure will
provide reasonable estimates of uncertainty. As with any proposed
method, the bootstrap needs to be evaluated and this will be done
within the context of a simulated population, where the
distribution of migration estimates around the expected value is known.
Initial tests based on data without any error and deterministic
recaptures were used to verify the consistency between the R
simulator and the Gadget estimator. When as many as 50 000 fish are
tagged, the estimated values of the parameters are quite close to
the ?true? values (Table 2; maximum difference ,1.5%). There
will always be some error in the recapture likelihood, because
the real recaptures, which are integers, are compared with the
predicted values from Gadget, which are real (positive) numbers.
Never is there an obvious problem or bias, although there is
some deviation when estimating all three values as opposed to
restricting the rates to be equal (Table 2). When only 500 fish
are tagged, the point estimates deviate more from the true values
than with 50 000 tagged fish (Table 2). The difference in the
estimates of p2 and p3 when all three are estimated freely is now in the
opposite direction from before. Again, there is no obvious
indication of bias.
The optimization might potentially be sensitive to the initial
values of the migration parameters, but repeated optimization
with alternative initial values gave the same estimates as before
The likelihood score for the migration rates in the domain ]0,
1[ ]0, 1[ was computed using Gadget. A plot of the likelihood
surface for the migration rates p1 and p2 shows that, as a function
of these two parameters, the negative log-likelihood surface is
quite well behaved, with a single minimum (Figure 2).
Estimation from simulated stochastic data
In all the following simulations, recaptures are drawn
stochastically. When each tagging experiment is considered an independent
experiment and the number tagged are treated as a single
subpopulation throughout each simulation, the computational
requirements increase drastically, because the number of
experiments increases. It is therefore of considerable interest to test a
different approach, where all tagged individuals are considered a
part of the same tagged population, and during a simulation
period, tagged individuals are simply added to the existing
The following subsections consider two tagging scenarios. In
the first scenario with 40 tagging experiments, tagging occurs in
all time-steps in both areas for 5 years. The second scenario
considers eight tagging experiments, where the year of tagging is
randomly chosen from 6 potential years, and there is tagging on every
time-step in both areas for that year.
Scenario 1: all steps ? all areas ? all years
A tagging experiment is simulated in each of two areas every
quarter in each of 5 years, yielding 40 tagging experiments.
Each tagging as a separate experiment: The number of tagged
fish in each experiment is normally distributed (and rounded to
the nearest integer), with a mean of 280 fish and a standard
deviation (s.d.) of 30. For such a generated dataset, the migration rates
are estimated. The procedure is repeated 40 times to verify the
variability in the estimated migration rates. The true values of
the parameters were fixed at 0.4, and the resulting estimates
came out with an average of 0.398 (s.d. ? 0.018) for p?1 and
0.406 (s.d. ? 0.012) for p?2, and a correlation between the two is
0.698. Figure 3a ? c shows the distribution of the estimate of the
migration rates p1 and p2, along with a scatterplot of the estimates
of the two parameters.
An aggregated experiment: Consider again the 40 experiments
conducted above. This time, these are considered as a single
aggregated experiment, where fish tagged subsequently are added to the
existing tagged stock during the simulation. The advantage of this
approach is that the simulation time is reduced by an order of
magnitude as the number of populations stored in Gadget is
reduced considerably. Again, the true values of the parameters
were fixed at 0.4. The resulting estimates came out with an
average of 0.399 (s.d. ? 0.017) for p?1 and 0.398 (s.d. ? 0.012)
for p?2, and a correlation between p1 and p2 of 0.719, compared
with 0.698 earlier, with the tagging experiments disaggregated.
These results are depicted in Figure 3d ? f in the form of histograms
of the distribution of estimated migration rates, along with a
scatterplot of the two.
Bootstrap estimation of uncertainty: The same scenario as before
is considered here. Bootstrapping was done 100 times to obtain
100 estimates of migration rates, providing estimated standard
errors. The ?base value? in this case is the migration rate estimated
from the initial simulated data from which the bootstrap samples
are simulated. For p?1, the resulting estimates came out with an
average of 0.391 (s.d. ? 0.012, base value ? 0.392) and with
0.383 (s.d. ? 0.011, base value ? 0.383) for p?2, and a correlation
of 0.518 (Figure 3g ? i). Note that the bootstrap mean is similar
to the true value, indicating that there is not a great bias. Also
notable is the fact that the standard errors are 0.011, slightly
lower than the simulated values of 0.012 ? 0.018, but certainly of
the same order.
Scenario 2: all steps ? all areas ? year chosen randomly
In this scenario, there are eight tagging experiments, two at each
time-step from the year, one from each area. The year at tagging
is randomly chosen from 6 years for each of the eight tagging
experiments. Tagging data were simulated with Poisson error
(100 datasets), using R as before. Gadget was then run with
optimization, to estimate the two migration rates.
When each tagging experiment is treated as a separate
experiment, fixing the true migration rates at 0.4 results in an average
of 0.402 (s.d. ? 0.035) for p?1 and 0.410 (s.d. ? 0.029) for p?2.
Figure 4a ? c shows the distribution of estimated migration rates
along with their bivariate distribution migration rates, where the
correlation is now 0.671. When the tagging experiments are
aggregated, the results obtained are again similar. As before, the true
values are set to 0.4, and the average point estimates of p1 and
p2 are 0.397 (s.d. ? 0.033) and 0.400 (s.d. ? 0.024), respectively,
with a correlation of 0.554.
Note that the standard deviation of these migration rates is
lower in the multiple experiment than it was in the single
experiment. The resulting plots are similar to those obtained earlier
For the bootstrap, the same scenario with eight tagging
experiments is considered again. Bootstrapping these 100 times with a
base value of 0.400 for p1 results in a mean value of 0.397
(s.d. ? 0.016). For p2, the base value was 0.399 and the bootstrap
mean was 0.396 (s.d. ? 0.0257). In this setting, the correlation was
0.626 or slightly lower than with the disaggregated tagging
experiments. The bootstrap results are seen in Figure 5a ? c.
Changing fleet and fishing mortality
In all the simulations above, the catch was modelled as a simple
proportion of the population (Gadget?s ?linearfleet? predation
function) with fishing mortality (F, but as a proportion of the
stock size) set constant across areas, time-steps, and years.
When fishing mortality varies among time-steps (i.e.
proportional fishing mortalities of F ? 0.25, 0.375, 0.125, 0.25 for
time-steps 1 ? 4), the estimates of migration parameters are
similar to those when fishing mortality is identical for all
The model described above was implemented in Gadget, and
parameters were estimated. Bootstrapping was subsequently used to
obtain confidence statements for the migration rates. Although
the model predicted fewer recaptured fish than were actually
recaptured, the proportion of recaptured fish by area was
similar. The point estimates of the migration rates were 1.559
Entire tagging experiments were bootstrapped, and parameters
were estimated for each bootstrap sample. Within each
bootstrapped sample, the experiments were chosen randomly, with
the restriction that there had to be at least one tagging experiment
from each area (resulting in 97 rather than 100 bootstraps). The
results are given in Table 3 and Figure 6.
The point estimate of absolute bias in the percentage migrating
is from 0.08 to 1.5 when the correct percentage is 1 ? 1.5%. The
95% confidence intervals based on the bootstrap are (0.246,
3.100%) for migration rate p1 and (0.396, 1.750%) for migration
rate p2. A boxplot of the migration rates from the bootstrap runs
is provided in Figure 7.
The bootstrap means are close to the base value (point estimate
from the initial data) but the standard deviation is relatively high.
In Figure 6, both the histogram (Figure 6a) of the estimated
migration rate p?1 and the scatterplot (Figure 6c) show that the
points lie in three groups. In the first, p?1 , 1.1, in the second
1.1 p?1 2.6, and in the third p?1 . 2.6. Of the observed tagging
data, one particular experiment was very large (2818 fish tagged)
with a high return rate (38.1%), including almost one-third of all
recaptures. Optimization runs using a bootstrapped dataset
including this experiment never optimize to solutions with p?1 ,
1.1. Runs optimizing with 1.1 p?1 2.6 include this experiment
1 ? 3 times in the bootstrap data, and with p?1 . 2.6 the experiment
is present 2 ? 5 times in the bootstrap data.
Adding tagging data to the existing (multispecies) multi-area
models allows the estimation of migration rates, at least in
principle. Results here indicate that these migration rates are indeed
estimable, at least for few areas and when reliable estimates of
some other model parameters have been obtained from other
sources. More-general models will typically be fitted to several
datasets, and in these cases, one should expect the tagging data to
resolve the migration rates if other parameters become estimable
from the remaining datasets.
The estimators of migration seem to be biased when using real
datasets, and this bias appears to be of the order of 10% of the true
value. It is very likely that one could fit better models to the dataset
in question, e.g. using a reweighting scheme to estimate the
weights given to each likelihood component, etc., but this is
somewhat outside the scope of this work. On the other hand, by
their very nature, tagging experiments cannot be a randomized
sample from the underlying population being modelled. It is
therefore quite natural for there to be some biases in the estimates,
particularly with areas as large as those used here.
In general, a bias may occur for many reasons. Any model
including a tagging process will include assumptions or models
for tag loss, survival in the population, survival after tagging, the
migration process, and growth. A simple example of a failure of
such models is the lack of very high spatial resolution, which is
the only way to explain some of the microscale fish behaviour or
grouping, and this can easily lead to higher observed tag returns
than can be explained with large-area models. Finally, although
growth is not linked to the tagging data in this particular model,
in the general case, fitting a growth submodel will affect the
These initial simulations indicate that repeated tagging
experiments may, at least in some cases, be treated as a single
experiment, where new tags are added to tagged populations already
in the model. This greatly reduces the computations required.
It is not known, however, how variable survival rates or other
complications might affect these results. Preliminary results
indicate that weights attached to likelihood components may be
important. Indications are that a tagging experiment with a large
number of tagged fish and recaptures changes the estimate of
the migration rates. The weights should depend on the number
of comparisons or the number of fish recaptured, but the exact
form is not obvious.
In addition to estimating migration rates, an important issue is
obtaining the errors of these estimates. This is non-trivial, because
the models are highly non-linear, and several earlier simulations
indicate that simple approaches to variance estimation in fisheries
models provide highly misleading results, even in single-stock,
(Gavaris et al., 2000; Restrepo et al., 2000;
Patterson et al., 2001)
Bootstrap-based methods based on resampling entire tagging
experiments appear to capture the uncertainty in estimating
migration parameters. Although this result may be conditional
on the models used, the results obtained appear promising.
Gadget is simple when it comes to reducing the number of
tagged fish. Only tag loss in time is considered here, but there is
also tagging mortality, which should reduce the initial number
of tagged fish. Other issues include non-reporting, which would
be modelled by reducing the predicted recaptures by a fraction.
These two issues cannot be distinguished by the data considered
here and should therefore appear as one parameter to be estimated
In addition to this layout, it must be noted that there is
information in the recapture data on the growth of each recaptured fish.
This information is only used implicitly in the above, and not at all
when the likelihood components are aggregated over length and
age groups. Gadget is a Markovian model, neglecting the length
of fish before the previous time-step. It is therefore not trivial
how information on the history of the individual fish should be
brought into a new likelihood component, but this is certainly
not insoluble and needs to be explored. The easiest way would
be to put each length group as a specific tagging experiment, but
that would increase the computational time dramatically. It is,
however, possible to compare the mean length of recaptured fish
from the model and observed recaptures, and this has in fact
been implemented within Gadget.
Only one likelihood function for tag-recapture data has been
considered here but there are other possibilities, such as a
. The exact choice of a
likelihood function is likely to affect the precision of results as is true
in ordinary multiple linear regression. The important issue is,
however, that the bootstrap approach can be used with any such
estimation method to obtain uncertainty estimates. Further, it is
fairly well established that standard statistical assumptions rarely
apply to fisheries data, be these length distributions, landings,
age compositions, stomach samples, or tagging experiments
(Pennington and V?lstad, 1994; Pennington, 1996; Stefansson
and Palsson, 1997b; Brynjarsdottir and Stefansson, 2004;
. One can therefore not expect to find ?the?
correct likelihood function. If one can use a bootstrap method
to evaluate the uncertainty even for incorrect likelihoods, then
this potentially gives a means to evaluate which likelihoods give
less biases and variance.
Two-area examples have been considered here. Adding more
areas to a model increases the computing time for single
simulations. In the ?real data? example, some 30 parameters need to
be estimated for each area, in addition to the migration rates
whose number would increase too. Although it might be
interesting to investigate the effects of more complex geometries, this
should probably be done in the light of particular applications.
An interesting result from this analysis is the low estimate of
migration of mature fish between the areas chosen here. This is
probably a consequence of classifying subarea 102 with the south.
If this result holds under further scrutiny, it may imply that fairly
homogeneous (sub)populations are obtained using this split. This
should therefore be considered for assessments, because this may
lead to more consistent estimates of catches in numbers-at-age,
survey indices, etc., than when the area split corresponds to
higher migration rates.
Much of the work described here was undertaken while the
authors were employed at the MRI (Marine Research Institute,
Reykjav??k) and uses data from the MRI databases. The R simulator
was originally written by A. J. Sigurdardottir and E. Olafsdottir,
whereas the Gadget code has been in development for more
than a decade by many programmers at the MRI and IMR
(Institute of Marine Research, Bergen). The work was supported
in part by EU grant QLK5-CT1999-01609, a grant from The
Icelandic Centre for Research (Rannis), and the FMAP project
within the Census of Marine Life.
Begley , J. 2004 . Gadget User Guide . Technical Report 120 , Marine Research Institute, Reykjav??k. 93 pp.
Begley , J. , and Howell , D. 2004 . An overview of Gadget, the Globally applicable Area - Disaggregated General Ecosystem Toolbox . ICES Document CM 2004/FF: 13 . 16 pp.
Brynjarsdottir , J. , and Stefansson , G. 2004 . Analysis of cod catch data from Icelandic groundfish surveys using generalized linear models . Fisheries Research , 70 : 195 - 208 .
Corana , A. , Marchesi , M. , Martini , C. , and Ridella , S. 1987 . Minimizing multimodal functions of continuous variables with the simulated annealing algorithm . ACM Transactions on Mathematical Software (TOMS) , 13 : 262 - 280 .
Efron , B. 1979 . Bootstrap methods: another look at the jackknife . Annals of Statistics , 7 : 1 - 26 .
Efron , B. , and Tibshirani , R. J. 1994 . An Introduction to the Bootstrap . Chapman and Hall/CRC, New York. 464 pp.
Elvarsson , B. 2004 . An application of the multivariate normal distribution in a statistical fisheries model . In Final Report: dst2: Development of Structurally Detailed Statistically Testable Models of Marine Populations , pp. 127 - 131 . Technical Report 118 , Marine Research Institute, Reykjav??k.
Gavaris , S. , Patterson , K. R. , Darby , C. D. , Lewy , P. , Mesnil , B. , Punt , A. E. , Cook , R. M. et al. 2000 . Comparison of uncertainty estimates in the short term using real data . ICES Document CM 2000 /V: 03 . 30 pp.
Hannesson , R. 1998 . Marine reserves: what would they accomplish? Marine Resource Economics , 13 : 159 - 170 .
Hilborn , R. 1990 . Determination of fish movement patterns from tag recoveries using maximum likelihood estimators . Canadian Journal of Fisheries and Aquatic Sciences , 47 : 635 - 643 .
Hilborn , R. , Stokes , K. , Maguire , J-J. , Smith , T. , Botsford , L. W. , Mangel , M. , Orensanz , J. et al. 2004 . When can marine reserves improve fisheries management? Ocean and Coastal Management , 47 : 197 - 205 .
Hooke , R. , and Jeeves , T. A. 1961 . Direct search. Solution of numerical and statistical problems . Journal of Association of Computational Machinery , 8 : 212 - 229 .
Kupca , V. 2004 . A standardized database for fisheries data . ICES Document CM 2004/FF: 15 . 34 pp.
Kupca , V. , and Sandbeck , P. 2003 . dst2 Datawarehouse structure and data import . In Progress Report: dst2: Development of Structurally Detailed Statistically Testable Models of Marine Populations , pp. 37 - 46 . Technical Report 98 , Marine Research Institute, Reykjav??k.
Lubchenco , J. , Palumbi , S. R. , Gaines , S. D. , and Andelman , S. 2003 . Plugging a hole in the ocean: the emerging science of marine reserves . Ecological Applications , 13 : 3 - 7 .
Patterson , K. , Cook , R. , Darby , C. , Gavaris , S. , Kell , L. , Lewy , P. , Mesnil , B. et al. 2001 . Estimating uncertainty in fish stock assessment and forecasting . Fish and Fisheries , 2 : 125 - 157 .
Pennington , M. 1996 . Estimating the mean and variance from highly skewed marine data . Fishery Bulletin US , 94 : 498 - 505 .
Pennington , M. , and V?lstad , J. H. 1994 . Assessing the effect of intrahaul correlation and variable density on estimates of population characteristics from marine surveys . Biometrics , 50 : 1 - 8 .
Restrepo , V. R. , Patterson , K. R. , Darby , C. D. , Gavaris , S. , Kell , L. T. , Lewy , P. , Mesnil , B. et al. 2000 . Do different methods provide accurate probability statements in the short term? ICES Document CM 2000/V: 08 . 18 pp.
Sigurdardottir , A. J. , Olafsdottir , E. I. , and Taylor , L. 2005 . An R program to simulate gadget population dynamics . In Final Report: dst2: Development of Structurally Detailed Statistically Testable Models of Marine Populations , pp. 109 - 117 . Technical Report 118 , Marine Research Institute, Reykjavik.
Stefansson , G. , and Palsson , O. K. 1997a . BORMICON A Boreal Migration and Consumption model . Technical Report 58 , Marine Research Institute, Reykjavik. 253 pp.
Stefansson , G. , and Palsson , O. K. 1997b. Statistical evaluation and modelling of the stomach contents of Icelandic cod (Gadus morhua) . Canadian Journal of Fisheries and Aquatic Sciences , 54 : 169 - 181 .
Stefansson , G. , and Palsson , O. K. 1998 . A framework for multispecies modelling of boreal systems . Reviews in Fish Biology and Fisheries , 8 : 101 - 104 .
Stefansson , G. , and Rosenberg , A. A. 2005 . Combining control measures for more effective management of fisheries under uncertainty: quotas, effort limitation and protected areas . Philosophical Transactions of the Royal Society of London, B: Biological Sciences , 360 : 133 - 146 .
Stefansson , G. , and Rosenberg , A. A. 2006 . Designing marine protected areas for migrating fish stocks . Journal of Fish Biology , 69 : 66 - 78 .
Taylor , L. 2003a . Datawarehouse for Icelandic waters . In Progress Report: dst2: Development of Structurally Detailed Statistically Testable Models of Marine Populations , pp. 64 - 75 . Technical Report 98 , Marine Research Institute, Reykjav??k.
Taylor , L. 2003b . Definition of areas in Icelandic waters . In Progress Report: dst2: Development of Structurally Detailed Statistically Testable Models of Marine Populations , pp. 222 - 230 . Technical Report 98 , Marine Research Institute, Reykjav??k.
Taylor , L., Begley , J. , Kupca , V. , and Stefansson , G. 2007 . A simple implementation of the statistical modelling framework Gadget for cod in Icelandic waters . African Journal of Marine Science , 29 : 223 - 245 .
Taylor , L., and Stefansson , G. 2004 . Gadget cod-capelin-shrimp interactions in Icelandic Document CM 2004/FF: 29 . 30 pp.