#### Copula–entropy theory for multivariate stochastic modeling in water engineering

Singh and Zhang Geosci. Lett.
Copula-entropy theory for multivariate stochastic modeling in water engineering
Vijay P. Singh 0 1
Lan Zhang 0
0 Department of Biological & Agricultural Engineering, Texas A&M University, College Station , TX 77843-2117 , USA
1 Zachry Department of Civil Engineering, Texas A&M University, College Station , TX 77843-2117 , USA
The copula-entropy theory combines the entropy theory and the copula theory. The entropy theory has been extensively applied to derive the most probable univariate distribution subject to specified constraints by applying the principle of maximum entropy. With the flexibility to model nonlinear dependence structure, parametric copulas (e.g., Archimedean, extreme value, meta-elliptical, etc.) have been applied to multivariate modeling in water engineering. This study evaluates the copula-entropy theory using a sample dataset with known population information and a flood dataset from the experimental watershed at the Walnut Gulch, Arizona. The study finds the following: (1) both univariate and joint distributions can be derived using the entropy theory. (2) The parametric copula fits the true copula better using empirical marginals than using fitted parametric/entropy-based marginals. This suggests that marginals and copula may be identified separately in which the copula is investigated with empirical marginals. (3) For a given set of constraints, the most entropic canonical copula (MECC) is unique and independent of the marginals. This allows the universal solution for the proposed analysis. (4) The MECC successfully models the joint distribution of bivariate random variables. (5) Using the “AND” case return period analysis as an example, the derived MECC captures the change of return period resulting from different marginals.
Copula theory; Entropy theory; Multivariate stochastic modeling; Probability density function; Most entropic canonical copula; Return period
Introduction
A multitude of processes in water engineering involve
more than one random variable. For example, floods
are characterized by peak, duration, volume, and
interarrival time, which are all random in nature. Droughts
are described by their severity, duration, inter-arrival
time, and areal extent, which are also random. Extreme
precipitation events are represented by their intensity,
amount, duration, and inter-arrival time, which are all
random. Inter-basin water transfer involves transfer
of excess water from one basin (say, donor) to a water
deficient basin (say, recipient). The transfer involves the
volume of water, availability of water in both donor and
recipient basins, duration of transfer, rate of transfer,
and time interval between water transfers which are all
random variables. Water quality entails pollutant load,
duration for which the load is higher than the
protection limits, and peak pollutant concentration, which
are all random variables. Likewise, erosion in a basin
may be characterized by sediment yield, number of
erosion events, duration of events, intensity of events, and
time interval between two consecutive events. These
are all random variables. Flooding in a coastal
watershed may be caused by the simultaneous occurrence
of high precipitation and high tides where both
precipitation and tide are random variables. Examples of
processes involving more than one random variable
abound in hydrologic, hydraulic, environmental, and
water resources engineering. There usually exists some
degree of dependence among the random variables or
at least among some of the variables. Often we are
concerned with multivariate stochastic modeling and risk
analysis of the systems and processes that involve the
derivation of probability distributions of the random
variables considering the dependence structure among
them. Nowadays, these stochastic processes can be
modeled with the copula–entropy theory that has proven
to be more flexible and accurate than the traditional
approaches. The objective of this paper therefore is to
reflect on some recent advances made in the application
of the copula–entropy theory and future challenges.
Methods
Copula–entropy theory
The copula–entropy theory (CET) is an amalgam of the
copula theory and the entropy theory. These two theories
are now discussed.
Entropy theory
The entropy theory comprises (1) formulation of
entropy, (2) principle of maximum entropy (POME),
and (3) theorem of concentration (TOC). Entropy can
be defined in the real domain or frequency domain. In
the real domain, the most famous form of entropy is
the Shannon entropy
(Shannon 1948)
, although Tsallis
entropy
(Tsallis 1988)
and Renyi entropy
(Renyi 1951)
have been receiving much attention in recent years.
Another popular formulation of entropy is the
crossentropy or relative entropy due to
Kullback and Leibler
(1951)
which is a generalization of the Shannon entropy.
For a continuous random variable X with a probability
density function (PDF), f(x), and cumulative probability
distribution function (CDF), F(x), the Shannon entropy,
H(X) or H[f(x)], can be defined as
∞
0
H (X ) = H [f (x)] = −
f (x) ln f (x) dx,
(1)
where X ∊ [0,∞] but can also vary from − ∞ to + ∞ or
from a finite lower limit to a finite upper limit. The
Shannon entropy can also be defined in an analogous manner
for a discrete variable.
The principle of maximum entropy (POME),
propounded by
Jaynes (1957)
, states that of all the
distributions that satisfy the given constraints, the distribution
yielding the maximum entropy is the least-biased
distribution and should hence be preferred. If there are no
constraints then POME says that the resulting
distribution would be a uniform distribution, which is consistent
with the Laplacian principle of insufficient reason.
The theorem of concentration states that POME yields
the best constrained probability distribution and is the
preferred method for inferring this distribution, and
this distribution best represents our state of knowledge
about the behavior of the system. This is a consequence
of Shannon’s inequality and the relation between entropy
and Chi square statistic.
Copula theory
The foundation of the copula theory is the Sklar theorem
(Sklar 1959)
. The theorem states that the joint
(multivariate) probability distribution of two or more random
variables is a function of the probability distributions of
individual variables (also referred to as marginal
distributions which are one-dimensional). In other words, the
multivariate distribution is coupled to its marginal
distributions. It is implied that these random variables are not
independent of each other. The copula theory does not
specify the way to derive the marginal distributions and
does not lead to a unique copula. There are different ways
to construct copulas and different ways to select the best
copula.
Methodology for application of copula–entropy theory
The copula–entropy theory can be applied in different
ways: (1) the marginal distributions are derived using the
entropy theory and the joint distribution using the copula
theory
(e.g., Hao and Singh 2012; Zhang and Singh 2012)
.
Since there can be more than one joint distribution
fitted to the multivariate random variables, the best
distribution is then selected from either visual goodness-of-fit
plot (e.g. Q–Q plot) or formal goodness-of-fit test
statistics (Genest et al. 2009). (2) With the marginal
distributions derived using the entropy theory, the best copula
is selected as the copula function yielding the maximum
entropy. (3) Both marginal and joint distributions are
derived using the entropy theory
(e.g., Chu 2011; Chen
et al. 2013; Aghakouchak 2014)
. The methodology for
application of the copula–entropy theory will depend
on the way it is applied. Each of the three ways is now
outlined. First, the methodology for application of the
entropy theory is outlined, since entropy is needed in all
three ways.
Methodology for application of entropy theory
Fundamental to applying the entropy theory is the
specification of constraints the derived probability
distribution must satisfy. There can be any number of constraints
which can be defined in different ways but the easiest way
is to define them in terms of moments. Let g(x) be any
function of random variable X. Then, the ith constraint,
Ci, can be expressed as
gi(x)f (x) dx = E[gi(x)], i = 1, 2, . . . , m,
(2)
f (x) ln f (x) dx − ( 0 − 1)
∞
i
0
gi(x)f (x) dx − Ci,
where i, i = 0, 1, . . . , m, are the unknown Lagrange
multipliers. Applying the Lagrange–Euler calculus of
variation, Eq. (4) leads to the maximum entropy distribution:
f (x) = exp −
igi(x) .
−
Now the unknown Lagrange multipliers are
determined from the known constraints. The multipliers can
be determined in two ways: regular entropy method and
parameter space expansion method
(Singh 1998; Singh
and Rajagopal 1986)
. Substituting Eq. (5) in Eq. (3), we
get
exp( 0) = Z =
0 = ln Z = ln
∞
0
∞
0
exp −
or
exp −
m
i=1
m
i=1
igi(x) dx
igi(x) dx
where Z is called the partition function, and
,
(6)
(4)
(5)
(7)
(8)
For obtaining parameters, the derivatives in
Eq. (9) are equated to the derivatives obtained from
0 = 0( 1, 2, . . . , m). Similarly, it can be shown that
Hmax = −
f (x) ln f (x) dx
∞
0
∞
0
= −
− 0 −
× exp − 0 −
m
i=1
m
i=1
m
i=1
igi(x)
igi(x) dx
m
i=0
(13)
(14)
∂2 0
∂ i2 = E gi2(x)
−
E[gi(x)] 2
= Var[gi(x)], i = 1, 2, . . . , m
(10)
= E gi(x)gj(x)
− E[gi(x)]E gj(x)
= Cov gi(x)gj(x) , i, j = 1, 2, . . . , m, i = j. (11)
The maximum entropy, Hmax, of the derived
POMEbased PDF can be expressed as
ZCi =
gi(x) exp −
igi(x) dx, i = 1, 2, . . . , m.
Equation (6) shows that λ0 is a function of
1, 2, . . . , m, i.e., 0 = 0( 1, 2, . . . , m), and the
function is convex. Differentiating λ0 with respect to
1, 2, . . . , m individually, we get the relations between
Lagrange multipliers. Substituting Eq. (6) into Eq. (7), we
obtain
Equation (8) shows that Ci, i = 1, 2, . . . , m, are
functions of 1, 2, . . . , m.
With the Chi square distribution as the limiting
distribution, it is shown that 2NΔH is Chi square distributed.
Hence, the Chi square statistic may be applied to assess if
the fitted parametric distribution is close to the
POMEbased distribution (i.e., the reference distribution of
random variable).
Methodology for application of copula theory
Definition and main properties for copula
As stated by
Sklar (1959)
, copula couples the
multivariate distribution to its marginal distributions which are
uniformly distributed on [0,1]. In other words, copula is a
mapping function as [0, 1]d → [0, 1]. For d-dimensional
continuous random variables, there is a unique copula
function (C) to represent the joint distribution function (H) as
H (x1, x2, . . . , xd ) = C(u1, u2, . . . , ud );
ui = Fi(xi) ∼ uniform (0, 1), i = 1, . . . , d.
(15)
As shown in Eq. (15), ui is the CDF of random variable
Xi. Representing the joint distribution, the copula
function has the following properties:
1. 0 ≤ C(u1, . . . , ud ) ≤ 1;
2. if any ui = 0, then C(u1, . . . , ud ) = 0;
3. if all uj = 1, j = 1, . . . , d and j = i;
C(1, . . . , ui, . . . , 1) = ui;
4. C is bounded by the Fréchet–Hoeffding bounds as
W ≤ C ≤ M;
W = max 1 − d +
ui, 0 ,
d
i=1
M = min (u1, . . . , ud )
Frank copula may model the entire range of dependence
structure. Given its easy construction, the Archimedean
copulas have been extensively applied in bivariate
hydrological frequency analysis
(e.g., Sraj et al. 2015; Salvadori and
Michele 2015; Requena et al. 2016a, b)
.
Meta-elliptical copulas
(Fang et al. 2002)
, as the name
suggests, is derived from the elliptical joint distribution. The
popularly applied meta-elliptical copulas are
meta-Gaussian and meta-Student t copulas. Unlike the Archimedean
copulas, the meta-elliptical copulas can model the entire
range of dependence structure and can be easily applied to
high-dimensional multivariate modeling. Comparing the
two popularly applied meta-elliptical copulas, there exists
the symmetric tail dependence for meta-Student t copula,
while no tail dependence exists for meta-Gaussian copula
(e.g. Genest et al. 2007; Song and Singh 2010)
.
The extreme value copula is derived in accordance with
the extreme value theory which may be applied to model
the rare events. As stated by
Gudendorf and Segers
(2009)
and
Joe (2014)
, the following relation exists:
CF (u11/n, . . . , u1d/n) → C(u1, . . . , ud );
∃ n → ∞.
(18)
In Eq. (18), C denotes the extreme value copula, and CF
denotes that the copula fulfills the limiting relation.
In other words, the extreme value copula must be
maxstable. For the bivariate case, the extreme value copula
may be written as
log(v)
C(u, v) = uv exp A log(uv)
,
u, v ∈ [0, 1].
(19)
In Eq. (19), A denotes the Pickands dependence
function
(Pickands 1981; Falk and Reiss 2005)
that is convex as
A : [0, 1] → [1/2, 1] and max (t, 1 − t) ≤ A(t) ≤ 1 for t ∈ [0, 1].
The Gumbel–Hougaard copula (Archimedean copula
family) is the only Archimedean copula that belongs to
the extreme value family. Hence, the Gumbel–Hougaard
copula has been popularly applied in bivariate flood
frequency analysis, storm analysis, drought analysis, etc.
Vine copula is constructed, based on the probability
density decomposition. The vine copula is applied for
high-dimensional analysis (i.e., d ≥ 3). It is usually
categorized into Canonical (C)-Vine copula, D-Vine
copula, and Regular R-Vine copula
(Aas et al. 2007)
. Using
3-dimensional analysis as an example, we can write the
joint probability density function as
f (x1, x2, x3) =
fi(xi)c12(F1(x1), F2(x2))c23(F2(x2), F3(x3))c13|2
In Eq. (16), W represents the perfectly negative
dependence, while M represents the perfect
positive dependence. For independent random variables,
the corresponding copula function is simply given as
Π = u1u2 · · · ud = F1(x1)F2(x2) · · · Fd (xd ); and
5. C is d-increasing, that is, the C(u1, . . . , ud ) volume for
any given d-dimensional interval is non-negative.
Copula families and parameter estimation
The major copula families are Archimedean copulas,
metaelliptical copulas, extreme value copulas, vine copulas, and
entropic copulas. The Archimedean copula (2-dimensional)
is symmetric and easy to construct through the generating
function as
C(u, v) = φ−1(φ (u) + φ (v)),
where φ is the generating function which is non-increasing.
Based on the choice of Archimedean copulas, different
copulas within the family may cover different ranges of
dependence
(Nelsen 2006)
. For example, the Gumbel–Hougaard
copula may only model the positive dependence, while
In Eq. (20), c denotes the copula density function. As
seen in Eq. (20), the vine copula is very flexible, since the
bivariate copula is applied at all the levels. The vine
copula has also been applied in high-dimensional
hydrological frequency analysis (e.g., Pham et al. 2016; Arya and
Zhang 2017; Verneiuwe et al. 2015)
The parameters of the parametric copula functions
constructed above may be estimated with one of the
following three approaches:
(i) Full-Maximum Likelihood Estimation (Full-MLE): In
this method, the parameters of the marginal
distributions and copula functions are estimated
simultaneously.
(ii) Two-Stage Maximum Likelihood Estimation
(TwoStage MLE): In this method, one first estimates the
parameters of marginal distributions and then the
parameters of the copula function are estimated
using MLE with the marginals computed from the
previously fitted marginal distributions.
(iii) Semi-Parametric (or Pseudo) Maximum
Likelihood Estimation (Pseudo-MLE): In this method,
the parameters of the copula function are
estimated from the empirical marginals (i.e., empirical
CDF computed from the plotting position formula
or kernel density function).
Of the three estimation methods for parametric copula
functions, the Pseudo-MLE is considered least impacted by
the possible misidentification of marginal distributions. The
advantage of Pseudo-MLE is the separate parameter
estimation of marginal distributions and the copula function.
The most entropic canonical copula may be derived using
the entropy theory, similar to the application of entropy
theory to the univariate random variables. The Shannon
entropy of the copula function for two variables is written as
H (u, v) = −
Substituting Eq. (22) into Eq. (21), one may conclude, with
some simple algebra, that the negative copula entropy [i.e.,
Eq. (21)] denotes the mutual information of random
variables X and Y through the Kullback–Leibler cross-entropy as
c(u, v) ln[c(u, v)] dudv
fX (x)fY (y) dxdy
= −
= −
f (x, y)
fX (x)fY (y)
f (x, y) ln
ln f (x, y)
fX (x)fY (y)
f (x, y)
fX (x)fY (y)
dxdy
= − KLCE(fX :fY ) = − I(X; Y ).
According to the information theory, the mutual
information [i.e., I (X ;Y )] is a measure of the total correlation
between random variables, that is, the mutual
dependence between random variables X and Y. From the
copula theory [e.g., Eq. (22) for bivariate random variables],
the copula density [i.e., c(u, v)] also denotes the mutual
dependence between variables X and Y. Thus, the
information maintained in the copula function is the mutual
information (i.e., total correlation) between X and Y
which results in the copula entropy being negative.
In other words, a higher absolute value of the copula
entropy represents higher mutual dependence (or total
correlation) among the random variables.
Similar to the POME-based univariate distribution, the
common constraints are the constraints of total
probability of marginals (i.e., for uniform distributed
variable on [0,1]), and a measure of dependence (also called
association):
[0,1]2
[0,1]2
Applying
Eq. (25a) as
c(u, v) dudv = 1
total probability
(24)
ur c(u, v) dudv = E(ur ) =
1
,
r + 1
r = 1, 2, . . . (constraints of u = FX (x)). (25a)
f (x) =
f (x, y) dy,
we
can
evaluate
ur c(u, v) dudv =
=
0
1
0
1
ur du
0
1
ur f (u) du =
ur du = E(ur ) =
0
1
c(u, v) dv
1
r + 1
In Eq. (25b), f (u) = 1 since u ~ uniform (0,1). Similarly,
vr c(u, v) dudv = E(vr ) =
1
r + 1
r = 1, 2, . . . constraints of v = FY (y) .
aj(u, v)c(u, v) dudv = E[aj(u, v)] = Θj,
j = 1, 2, . . . (constraints of dependence measure).
(27)
In Eq. (27), Spearman’s rho is commonly applied
as the constraint to measure the dependence with
aj(u, v) = uv ⇒ E(uv) = ρs1+23. One can also apply other
dependence measures discussed in
Nelsen (2006)
and
Chu (2011)
.
Using the constraints [Eqs. (24)–(27)], the Lagrange
function for the most entropic canonical copula (MECC)
can be written as
−
−
−
i=1
n
i=1
k
j=1
i
γi
In Eq. (28), 0, . . . , n, γ1, . . . , γn, n+1, . . . , n+k
are the Lagrange multipliers. More specifically for
MECC, r = γr , r = 1, . . . , n. The Lagrange multipliers
n+1, . . . , n+k are pertaining to the constraints in
relation to the rank-based dependence measures.
Differentiating Eq. (28) with respect to c(u, v), we have
Z(Λ) = ln [0,1]2
exp −
iui −
γivi −
n
i=1
n
i=1
In Eq. (30), Λ = [ 1, . . . , n, γ1, . . . , γn, n+1, . . . , n+k ].
The most entropic canonical copula (MECC) may be
generalized to most entropic copula (MEC) with respect
to a given parametric copula
(Chu 2011)
. In the case of
MEC, Eqs. (29)–(30) can be re-written as
c(u, v) =
exp −
[0,1]2 exp −
in=1 iui −
n
i=1 iui −
n
i=1 γivi −
n
i=1 γivi −
k
j=1 n+j aj(u, v) − bc˜(u, v)
k
j=1 n+j aj(u, v) − bc˜(u, v) dudv
n+jaj(u, v) dudv
n+j aj(u, v) − bc˜(u, v) dudv
+
n
i=1
+
n
i=1
In Eq. (31), b is a generic constant, c˜(u, v) is the given
reference copula. It is seen that the MECC is obtained by
setting b = 0. In what follows, we will focus on the
application of MECC for bivariate cases through examples.
Copula–entropy for multivariate modeling
Following the discussion of Shannon entropy and copula
theory in the previous sections, we will outline the
copula–entropy theory for stochastic modeling in this
section. In general, we can apply the copula–entropy theory
in three ways:
(i) The marginal distributions are derived using the
entropy theory, while the joint distribution (i.e.,
copula function) is modeled through the
parametric copula function with its parameter estimated
through the K–K plot or statistically with the
formal goodness-of-fit test statistics
(Genest et al.
2009)
.
c(u, v) =
exp −
[0,1]2 exp −
in=1 iui −
n
i=1 iui −
n
i=1 γivi −
n
i=1 γivi −
k
j=1 n+jaj(u, v)
k
j=1 n+jaj(u, v) dudv
.
Based on the principle of maximum entropy,
maximizing Eq. (21) is equivalent to minimizing the objective
function
using the Full-MLE, Two-Stage MLE, or
PseudoMLE. In this approach, the goodness-of-fit of the
copula function may be assessed either graphically
Z(Λ) = ln [0,1]2
exp −
n
i=1
n
Lognormal distribution
The POME-based lognormal distribution may be written
as
f (x) = exp
− 0 − 1 ln x − 2(ln x)2
The relation of Lagrange multipliers to the parameters
of gamma distribution
(Singh 1998)
is given as
(ii) The difference of this second approach from (i)
above is that the parametric copula function is
selected such that it yields the maximum entropy
among all copula candidates.
(iii) The approach (iii) takes full advantage of the
entropy theory. Both marginal and joint
distributions are derived using the entropy theory. The
Lagrange multipliers are estimated by maximizing
entropy or minimizing the corresponding
objective function which is the dual problem of
maximizing entropy. The Lagrange multipliers of the
MECC (joint distribution) may be optimized from
the fitted POME-based marginal distributions
or from the empirical marginal distribution. The
approach (iii) is further adopted for the
applications.
Application to multivariate data of known population
Here, we will first show the application of copula–
entropy theory to the bivariate sample dataset with the
known true population. In this sample study, the sample
dataset (N = 1000) is generated from the known
Gumbel–Hougaard copula (θ = 4.5) with the true marginal
distributions:
1 x
X ∼ Gamma (10.5, 4.3):
10.5Γ (4.3) 10.5
1
Y ∼ Lognormal (4, 0.72): y(0.7)√2π exp −
3.3e−(x/10.5)
(ln y − 4)2
2(0.72)
Study of univariate variates
In
Singh (1998)
, it was shown that E[X ], and E[ln(X )]
should be applied as constraints to derive the
POMEbased gamma distribution; while E[ln (x)] and E (ln x)2
are the constraints to derive the POME-based
lognormal distribution. Following
Singh (1998)
, we have the
following:
Gamma distribution
The POME-based gamma distribution may be written as
f (x) = exp(− 0 − 1x − 2 ln x)
∂ 0
∂ 1 =
2 − 1
1
= − E(X ) ≈ − x¯
(32a)
(32b)
In Eq. (33d), y = ln (x) and s2y represents the sample
variance of y.
Using the bivariate data sampled from the true
population, Table 1 lists the Lagrange parameters that are
estimated for the univariate variables based on both sample
moments and population moments.
Besides applying the constraints directly related to the
parametric distribution that may be fitted to the observed
dataset, one may also directly apply the first three or four
monocentral moments [i.e., E(X ), E(X 2), E(X 3), E(X 4)],
given that the moments about the origin govern the
shape and mode of the univariate probability density
functions
(Zellner and Highfield 1988; Cobb et al. 1983)
.
The POME-based distribution so derived is given as
To avoid the possible integration problem, the
univariate variable is commonly scaled to [0,1] or [− 1,1]
(Hao
and Singh 2012; Zhang and Singh 2014)
. In this study, the
univariate variables are scaled to [0,1] to assess its
appropriateness. The scaled variable xs is given as
x − (1 − d)min(x)
xs = (1 + d)max(x) − (1 − d)min(x)
.
In Eq. (36), d is a small number such that the scaled
variable will not reach either the lower limit or the upper
(35)
(36)
Gamma Population
Lognormal Population
hist
Gamma population
Entropy poulation
Entropy sample
Entopy scaled
hist
LN population
Entropy poulation
Entropy sample
Entopy scaled
300
250
200
s
t
n
u
o
C
cy150
n
e
u
q
e
r
F100
50
0
0
600
500
400
300
200
100
150
a Test statistics computed for the Chi square test
b Critical value for α = 0.05 of the Chi square distribution with certain degrees of freedom
c Fitted parametric distribution
d POME scaled
df
limit, i.e., avoiding P(X ≤ max (x)) = 1. Here, we chose
d = 0.01. Equations (34)–(35) may then be re-organized
with the use of the scaled variable as
f (x) = f (xs)
dxs
dx
1
f (y) = 582.2347 exp 1.2604 + 11.6222ys
− 103.6613ys2 + 182.3986ys3 − 101.5606ys4 .
Furthermore, Fig. 1 plots the relative frequency and
the frequency computed from the POME-based
distributions. As shown in Fig. 1, the POME-based univariate
distributions derived (using the constraints pertaining
to certain population, and first four moments about the
origin) visually fit the observed data very well. Using
the true population from the reference distribution,
Table 3 lists the Chi square test for the fitted parametric
and POME-based distributions constructed. Results in
Table 3 clearly indicate the POME-derived distributions
may be applied to model the univariate variables. Thus, it
is safe to conclude that one may directly use the moments
about origin as the constraints to model the univariate
random variables.
Study of dependence
As previously discussed, one may apply three different
approaches to study the dependence using the copula–
entropy theory. Hereafter, each approach is evaluated.
Within the objective of the study, the Gumbel–Hougaard,
Clayton, Frank and meta-t copulas
(Nelsen 2006)
were
applied as parametric copulas. The MECC copula was
derived with the constraints of E(U ), E U 2 , E(V ), E V 2
and E(UV ). According to the discussion in “Univariate
analysis of peak discharge and flood volume” section for
Now applying the first four moments about origin to
the scaled variable X and the first three moments about
origin to the scaled variable Y, Table 2 lists the
parameters estimated by optimizing the objective function using
the first four moments about origin [i.e., Eq. (38)].
The POME-based distributions for the original
variables are expressed as
1
f (x) = 153.4792 exp (− 1.6011 + 29.2444xs
−101.8716xs2 + 125.7947xs3 − 57.5913xs4
(39a)
Italic values indicate the best fitted copula function under each condition
a Entropy computed from the parametric copula using Eq. (40). The copula with the largest absolute value is the best copula candidate
univariate analysis, we will simply apply the POME-based
distribution derived using the moments about the origin
with the use of scaled variables.
POME‑based marginals with parametric copulas
In this approach, the copula parameters were estimated
with the use of POME-based marginals and by
maximizing the log-likelihood function (it may be also called
Two-Stage MLE). Table 4 lists the estimated
parameters as well as the corresponding log-likelihood. In this
approach, the copula yielding the largest log-likelihood
was selected for further analysis. As seen in Table 4, the
Gumbel–Hougaard copula was the best candidate. It was
in agreement with the sample data actually generated
from the Gumbel–Hougaard copula discussed earlier in
the section.
POME‑based marginals with parametric copulas selected
based on the entropy
In this approach, the parameters of copulas again were
estimated by Two-Stage MLE. The difference is that the
copula–entropy was computed for the fitted parametric
copula. Here, the copula–entropy was estimated using
HC = − E[ln c(u, v; θ )]
1 n
= − n
ln c(u, v; θ );
i=1
n: sample size.
(40)
The computed entropy is also listed in Table 4. From
the computed entropy using Eq. (40), it is seen that the
Gumbel–Hougaard copula yielded the highest mutual
information (the absolute value of the copula entropy)
among all the copula candidates.
Parametric copulas estimated using Pseudo‑MLE
In this approach, the parameters of the copula were
directly estimated using the empirical distribution (e.g.,
empirical distribution using the Weibull plotting
position formula) which is listed in Table 4. It is seen that
with the Pseudo-MLE, the Gumbel–Hougaard copula
again yielded the largest MLE and the highest mutual
information.
Most entropic canonical copula with POME‑based marginals
(or empirical marginals)
In this approach, copulas were derived from the entropy
theory with the constraints of E(U ) = E(V ) = 21 ,
E(U 2) = E(V 2) = 31 ; ρspearman = 0.9287 (sample
Spearman’s rho) with the parameters listed in Table 5.
In Eqs. (24)–(31), it is seen that the dependence
structure (i.e., Spearman’s rho in this sample study) was the
controlling factor to optimize the objective function of
MECC. Thus, it did not matter how the marginals were
handled, the MECC would not change for given
Spearman’s rho which is shown in Table 5. To further evaluate
the impact of Spearman’s rho correlation coefficient, we
changed Spearman’s rho to population Spearman’s rho
as the constraint (i.e., ρ = 0.9298 from the true Gumbel–
Hougaard copula). As seen in Table 5, there was a
significant difference in the Lagrange multipliers estimated for
MECC.
To assess whether the MECC so derived fulfilled the
fundamental properties of copula function:
POME marginal
Empirical marginal
POME marginal
Empirical marginal
1
0.9
0.8
0.7
0.6
0.4
0.3
0.2
0.1
FX0.5
0
0
FY0.5
1
0.9
0.8
0.7
0.6
0.4
0.3
0.2
0.1
C(u,1) C(1,v)
Fig. 2 Comparison of empirical and POME-univariate marginals to computed C(u, 1) and C(1,v) computed from MECC
True GH
POME marginal GH
True GH
empirical marginal GH
0.2
0.4
0.6
0.8
1
0.2
0.4
0.6
0.8
True GH
MECC (true spearman rho)
True GH
MECC (sample spearmanrho)
1
0.9
Figure 2 compares the marginal variables computed using
Eq. (41) from the MECC with both empirical and
POMEbased univariate marginals. As seen in Fig. 2, C(u,1) and
C(1,v) were in good agreement with their empirical and
POME-based univariate marginals. This also implied
the appropriateness of POME-based univariate
marginals derived using the first four and three non-central
moments for random variables X and Y, respectively.
With the fundamental properties fulfilled with the use
of MECC derived, one may want to further evaluate the
goodness-of-fit of the derived MECC. Here, the Kendall
distribution plots were generated and compared to the
Kendall distribution of the underlying true Gumbel–
Hougaard copula:
t(θ − ln t)
Kθ (t) = θ . (42)
(41)
The Kendall distribution [K (t)] for MECC may be
approximated following the procedure discussed in
Genest et al. (2009) as follows:
1. Generate random variables [U1,U2] with sample N
from the MECC derived, where N is greater than the
sample size of the observed dataset.
2. Approximate [K (t)] using:
1 N
Vi = N
j=1
1 U1j ≤ U1i ∩ U2j ≤ U2i ;
i = 1, 2, . . . , N
1 N
K approx(t) = N i=1 (Vi ≤ t);
t ∈ [0, 1]
(43a)
(43b)
Comparisons shown in Fig. 3 conclude that (i) both
fitted Gumbel–Hougaard (GH) copula and MECC
derived may properly represent the true GH (θ = 4.5)
population through the comparison of the Kendall
distribution plots; (ii) visually, there is minimal
difference of GH fitted with empirical marginals (b) and the
MECC derived based on sample or population
Spearman’s rho (c and d); (iii) the reason for the minimal
difference is due to the rank-based empirical distribution
does not impose any external bias on parameter
estimation for the parametric copulas; and the MECC derived
here does not rely on the actual marginal values, but
the population moments about origin for the uniformly
distributed variables; and (iv) though the POME-based
marginals well represent the univariate random
variables, they do introduce external bias to the estimation
of parametric copulas (a).
Overall, from the bivariate analysis of sample data,
MECC may be directly applied to model the
dependence structure of the random variables. In the case of the
MECC application, the impact of the marginal
distributions is eliminated. In the next section, we will use the
real watershed data as a case study to further illustrate
the copula–entropy theory as well as risk analysis.
Case study with real watershed data
Collected from Flume 1 at Walnut Gulch Watershed in
Arizona, the annual maximum flood data [i.e., peak
discharge (Q) and flood volume (V)] from 1957 to 2012 were
considered for the case study. Based on the findings from
analysis of sample data, the case study proceeded as
follows: (i) the POME-based univariate distribution was
applied to model the univariate peak discharge and flood
volume; and (ii) the MECC was applied to model the
dependence between peak discharge and flood volume.
Univariate analysis of peak discharge and flood volume
As discussed in “Univariate analysis of peak discharge
and flood volume” section, the moments about origin
for the scaled variables were considered as constraints to
capture the shape and mode of the univariate flood
variables. Choosing d = 0.1 in Eq. (36), Table 6 lists the
sample statistics for the scaled variables. In Table 6, T and P
denote the test statistic and the corresponding P value to
evaluate whether kurtosis was significantly different from
3 using
γ ex
2 = γ2 − 3
n − 1
G2 = (n − 2)(n − 3) ((n + 1)γ2 + 6)
In Eqs. (44a)–(44c), γ2 and γe2x denote the sample
kurtosis and excessive kurtosis; n is the sample size; SEK is
the standard error of kurtosis; and T is the test
statistic with the underlying distribution of standard normal
distribution.
Results in Table 6 indicate that the first three moments
about origin were necessary to derive the POME-based
distribution for the scaled peak discharge and flood
volume variables. The Lagrange multipliers were optimized
and listed in Table 7. Figure 4 compares the
POMEbased probability density to the histogram, as well as the
POME-based CDF to the empirical CDF. Comparisons
confirmed the appropriateness of the POME-based
univariate distribution.
Bivariate flood frequency analysis with MECC
Let U and V represent the univariate marginals for
peak discharge and flood volume, the same
constraints to construct MECC for sample data [i.e.,
E(U ), E U 2 , E(V ), E V 2 , E(UV )] were applied to
model the dependence of peak discharge and flood
volume. The Lagrange multipliers were optimized by
minimizing the objective function of Eq. (31a) with b = 0.
With the observed data, sample Spearman’s rho
was computed as ρˆspearman = 0.9419, we
approximated E(UV) from sample Spearman’s rho as
E(UV ) = ρspear1m2an+3 ≈ 0.3285. With these constraints,
the copula density function for the MECC was obtained
to model bivariate flood frequency as
c(u, v) = exp(−1.8194 − 1.1352u − 44.9511u2
− 1.1352v − 44.9511v2 + 92.1726uv). (45)
Using Eq. (45), Fig. 5 compares (a) the C(u, 1), C(1, v)
to the corresponding empirical and POME-based
marginals, and (b) the approximated parametric Kendall
distribution for MECC to the empirical Kendall
distribution. Comparisons in Fig. 5 indicated that (a) the MECC
constructed successfully fulfilled the copula properties of
C(u, 1) = u, C(1, v) = v; and (b) there was a good
agreement between the empirical and parametric (i.e., MECC)
Kendall distributions, which indicated the appropriateness
of the MECC constructed. Applying the POME-based
univariate distribution, Fig. 6 plots the simulated random
variates versus the observed random variables. Figure 6 shows
the dependence structure was well preserved with the
application of MECC and POME-based marginals. To
further compare the MECC with the empirical copula, Fig. 7
compares the copula and the survival copula with different
contour levels. The plot on the left is for the copula
function, while the plot on the right is for the survival copula.
As shown in Fig. 7, there is good agreement between the
contours obtained from the empirical copula (and its
P
≪ 0.05
≪ 0.05
survival copula) and the MECC (and its survival copula)
for different probability levels. This finding further assured
the appropriateness of the MECC.
Risk analysis
With the assurance to apply the MECC to model the
dependence of annual flood sequences (i.e., peak
discharge and flood volume), one may proceed to study the
associated risk measure, which may be used as an
engineering design criterion. In hydrology and water
engineering, risk has commonly been assessed through the
return period. In what follows, the joint return period of
“AND” case was applied for risk analysis. The joint return
period of “AND” case is given as
μ
Tand = P(X ≥ x, Y ≥ y)
μ
= 1 − FX (x) − FY (y) + C FX (x), FY (y)
To assess the return period “AND” case, the peak
discharge and flood volume with univariate CDF of P = 0.8,
0.9, 0.96, and 0.98 were used. Given the limitation of the
sample size (n = 56), P = 0.99 was not chosen for the
study for comparison purposes. Table 8 lists the joint
Hist
Entropy
0.2
50
100
150 200 250
Peak discharge (cms)
300
350
400
4 6
Flood volume (m3)
8
10
x 105
empirical CDF: discharge
POME v.s. empirical: discharge
empirical CDF: flood volume
POME v.s. empirical: flood volume
0.2 0.4 0.6 0.8 1
Empirical CDF
Fig. 4 Comparison of POME-based probability density function to the histogram
0.4 0.6
Empirical CDF
0.8
1
POME marginal
Empirical marginal
POME marginal
Empirical marginal
FY0.5
1
0.9
3 )m 8
(
e
m
lu 6
o
v
d
o
o
lF 4
2
return period estimated from both empirical copula and
MECC. Results in Table 8 indicated the following:
(i)
There was a small difference between the joint
CDFs computed from empirical copula and the
MECC. The absolute relative difference was in
the range from 0.96% for C(0.8,0.8) to 2.17% for
C(0.8,0.9). Thus, in regard to the joint CDF, the
differences were insignificant.
(ii) Though the difference with joint CDF estimated
may not be significant, it resulted in larger
differences in regard to the “AND” case return period. It
is seen that with the increased marginal
probability, the discrepancy also increased between the Tand
estimated from empirical copula and the MECC.
(iii) There was an interesting finding which was in
agreement with Tand estimated from empirical
copula and MECC. Using volume = 6.44 × 105
m3 corresponding to P = 0.96 as an example, the
joint return period computed from smaller peak
discharge (e.g., Q = 73.2 cms corresponding to
P = 0.8) was less than that computed with larger
peak discharge (e.g., Q = 109.9 cms). This was true
in reality, since it was more likely for (Q ≥ 109.9
cms and V ≥ 6.44 × 105 m3) to occur
simultaneously compared to that for (Q ≥ 73.2 cms and
V ≥ 6.44 × 105 m ). This finding was also in the
3
agreement that higher discharge was most likely
associated higher flood volume. This scenario also
happened for large flood volume with relatively
low peak discharge and vice versa.
Discussion and conclusions
In this study, we investigate the copula–entropy theory in
bivariate analysis. Using the sample data with the known
univariate populations (i.e., gamma and lognormal) and
known dependence (Gumbel–Hougaard), it is concluded
that the POME-based distribution derived may model
the univariate distribution well. There is minimal
difference for POME-based distribution based on the moment
of the observed variable and that derived based on the
scaled variable (i.e., scaling the observed variable to [0,1]).
To avoid the improper integrals, the scaled variable is
suggested to derive the POME-based distribution.
Comparing to the true Gumbel–Hougaard copula, the MECC
derived using the constraints of E(U), E(U2), E(V), E(V2),
and E(UV) can properly model the dependence structure
of the sample data. The MECC constructed successfully
fulfills the fundamental properties of the copula, i.e.,
C(u,1) = u; C(1,v) = v. In addition, the derived MECC
can well present the true dependence structure
represented with the Gumbel–Hougaard copula.
Using the real watershed data (i.e., Flume 1 at Walnut
Gulch, Arizona), the case study shows the
appropriateness of POME-univariate distribution of scaled variable
to model the univariate distribution for the observed
variates. With the constraints E(U), E(U2), E(V), and E(V2)
converging to the population moments of the uniform
distributed variables as E(U i) = E(V i) = 1 (i + 1);
the MECC constructed only depends on the rank-based
dependence measure (in this case, Spearman’s rho). The
derived MECC properly models the dependence of annual
peak discharge and flood volume, which is independent of
the marginal distributions (non-parametric or
parametric). The evaluation of the flood risk (using “AND” case
return period) indicates that the MECC copula reasonably
represents the change of the return period of “AND” case.
Overall, the study concludes as follows:
(i) For the bivariate random variables investigated,
the MECC may be easily and efficiently applied to
model the dependence structure. Unlike other
copulas, the MECC is uniquely defined for a given set
of constraints. Its uniqueness allows one universal
solution for the proposed frequency analysis.
(ii) Similar to other copula families (e.g., Archimedean
copulas, meta-elliptical copulas, vine copulas, etc.),
the MECC may be applied for multivariate
analysis in hydrology and water engineering, including
multivariate rainfall analysis, multivariate drought
analysis, spatial analysis of drainage networks, and
spatial analysis of water quality as few examples.
(iii) The bivariate MECC copula may be easily extended
to higher dimensions. For example, for the
d-dimensional variables [X1, X2, . . . , Xd ] with the marginals
of Ui = Fi(Xi), i = 1, 2, . . . , d; the MECC may be
constructed using the set of constraints, i.e.,
marginal E(Uir ) = 1 (r + 1), i = 1, 2, . . . , d and
pair-wise E(UiUj); i, j ∈ [1, d], i = j estimated
from rank-based Spearman’s coefficient of
correlation. The same optimization procedure applied for
the bivariate case may be applied to construct the
MECC for dependence structure in higher
dimensions.
Authors’ contributions
VPS conceptualized the paper, helped with data interpretation and crafting
the manuscript. LZ did analysis, processed the data, constructed all the graphs
and wrote the first draft. Both authors read and approved the final manuscript.
Acknowledgements
No applicable.
Competing interests
The authors declare that they have no competing interests.
Availability of data and materials
The data used is in public domain and is available for anyone to use.
Consent for publication
We consent for publication.
Ethics approval and consent to participate
Not applicable.
Funding
No funding source is available.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in
published maps and institutional affiliations.
Aas K , Czado C , Frigessi A , Bakken H ( 2007 ) Pair-copula constructions of multiple dependence . Insur Math Econ . https://doi.org/10.1016/j. insmatheco. 2007 . 02 .001
Aghakouchak A ( 2014 ) Entropy-copula in hydrology and climatology . J Hydrometeorol 15 : 2176 - 2189 . https://doi.org/10.1175/jhm-d - 13 -0207 . 1
Arya FK , Zhang L ( 2017 ) Copula-based Markov process for forecasting and analyzing risk of water quality time series . J Hydrol Eng 22 ( 6 ): 04017005 . https://doi.org/10.1061/(asce)he. 1943 - 5584 . 00001494
Chen L , Singh VP , Guo S ( 2013 ) Measure of correlation between river flows using entropy-copula theory . J Hydrol Eng 18 ( 12 ): 1591 - 1608 . https://doi. org/10.1061/(asce)he. 1943 - 5584 . 0000714
Chu B ( 2011 ) Recovering copulas from limited information and an application to asset allocation . J Bank Finance 35 : 1824 - 1842 . https://doi. org/10.1016/j.jbankfin. 2010 . 12 .011
Cobb L , Koppstein P , Chen NH ( 1983 ) Estimation and moment recursion relations for multimodal distributions of the exponential family . J Am Stat Assoc 78 : 124 - 130
Falk M , Reiss R-D ( 2005 ) On pickands coordinates in arbitrary dimensions . J Multivar Anal 92 : 426 - 453
Fang HB , Fang KT , Kotz S ( 2002 ) The meta-elliptical distributions with given marginals . J Multivar Anal 82 : 1 - 16
Genest C , Favre A-C , Béliveau J , Jacques C ( 2007 ) Metaelliptical copulas and their use in frequency analysis of multivariate hydrological data . Water Resour Res 43 : W09401 . https://doi.org/10.1029/2006wr005275
Genest C , Remillard B , Beaudoin D ( 2009 ) Goodness-of-fit tests for copulas: a review and a power study . Insur Math Econ 44 ( 2 ): 199 - 213 . https://doi. org/10.1016/j.insmatheco. 2007 . 10 .005
Gudendorf G , Segers J ( 2009 ) Extreme-value copulas . arXiv:0911.1015v2
Hao Z , Singh VP ( 2012 ) Entropy-copula method for single-site monthly streamflow simulation . Water Resour Res 48 : W06604 . https://doi. org/10.1029/wr011419
Jaynes ET ( 1957 ) Information theory and statistical mechanics . Phys Rev 106 : 620 - 630
Joe H ( 2014 ) Dependence modeling with copulas . CRC Press, Boca Raton
Kullback S , Leibler RA ( 1951 ) On information and sufficiency . Ann Math Stat 22 : 79 - 86
Nelsen RB ( 2006 ) An introduction to copulas, 2nd edn . Springer Science + Business Media, Inc., Berlin
Pham MT , Vernieuwe H , Baets BD , Willems P , Verhoest NEC ( 2016 ) Stochastic simulation of precipitation-consistent daily reference evapotranspiration using vine copulas . Stoch Environ Res Risk Assess 30 : 2197 - 2214 . https:// doi.org/10.1007/s00477-015-1181-7
Pickands J ( 1981 ) Multivariate extreme value distribution . Bull Int Stat Inst 49 : 859 - 878
Renyi A ( 1951 ) On measure of entropy and information . In: Proceedings, 4th Berkeley symposium, mathematics, statistics, and probability, Berkeley, California, pp 547 - 561
Requena AI , Chebana F , Mediero L ( 2016a ) A complete procedure for multivariate index-flood model application . J Hydrol 535 : 559 - 580 . https://doi. org/10.1016/j.jhydrol. 2016 . 02 .004
Requena AI , Flores I , Mediero L , Garrote L ( 2016b) Extension of observed flood series by combining a distributed hydro-meteorological model and a copula-based model . Stoch Environ Res Risk Assess 30 : 1363 - 1378 . https://doi.org/10.1007/ 200477 -015-1138-x
Salvadori G , Michele CD ( 2015 ) Multivariate real-time assessment of droughts via copula-based multi-site hazard trajectories and fans . J Hydrol 526 : 101 - 115 . https://doi.org/10.1016/j.jhydrol. 2014 . 11 .056
Shannon CE ( 1948 ) A mathematical theory of communication . Bell Syst Technol J 27 : 379 - 423
Singh VP ( 1998 ) Entropy-based parameter estimation in hydrology . Springer, Dordrecht
Singh VP , Rajagopal AK ( 1986 ) A new method of parameter estimation for hydrologic frequency analysis . Hydrol Sci Technol 2 ( 3 ): 33 - 40
Sklar M ( 1959 ) Fonctions de repartition an dimensions et leurs marges . Universite Paris, Paris, p 8
Song S , Singh VP ( 2010 ) Meta-elliptical copulas for drought frequency analysis of periodic hydrologic data . Stoch Environ Res Risk Assess 24 ( 3 ): 425 - 444 . https://doi.org/10.1007/s00477-009-0331-1
Sraj M , Bezak N , Brilly M ( 2015 ) Bivariate flood frequency analysis using the copula function: a case study of the Litija station on the Sava River . Hydrol Process 29 : 225 - 238 . https://doi.org/10.1002/hyp.10145
Tsallis C ( 1988 ) Possible generalizations of Boltzmann-Gibbs statistics . J Stat Phys 52 ( 1 /2): 479 - 487
Verneiuwe H , Vandenberghe S , Baets BD , Verhoest NEC ( 2015 ) A continuous rainfall model based on vine copulas . Hydrol Earth Syst Sci 19 : 2685 - 2699 . https://doi.org/10.5194/hess-19- 2685 -2015
Zellner A , Highfield RA ( 1988 ) Calculation of maximum entropy distribution and approximation of marginal posterior distributions . J Econom 37 : 95 - 209
Zhang L , Singh VP ( 2012 ) Bivariate rainfall and runoff analysis using entropy and copula theories . Entropy 14 : 1784 - 1812 . https://doi.org/10.3390/ e14091784
Zhang L , Singh VP ( 2014 ) Joint conditional probability distributions of runoff depth and peak discharge using entropy theory . J Hydrol Eng 19 ( 6 ): 1150 - 1159