\(U\)-Statistics of Ornstein–Uhlenbeck Branching Particle System
Radosław Adamczak
Piotr Miłoś
We consider a branching particle system consisting of particles moving according to the Ornstein–Uhlenbeck process in \(\mathbb{R}^d\) and undergoing a binary, supercritical branching with a constant rate \(\lambda > 0\). This system is known to fulfill a law of large numbers (under exponential scaling). Recently the question of the corresponding central limit theorem (CLT) has been addressed. It turns out that the normalization and the form of the limit in the CLT fall into three qualitatively different regimes, depending on the relation between the branching intensity and the parameters of the Ornstein–Uhlenbeck process. In the present paper, we extend those results to \(U\)-statistics of the system, proving a law of large numbers and CLT.
1 Introduction
We consider a single particle located at time \(t = 0\) at \(x \in \mathbb{R}^d\), moving according to the Ornstein–Uhlenbeck process and branching after an exponential time independent of the spatial movement. The branching is binary and supercritical: with probability \(p > 1/2\) the particle is replaced by two offspring, and with probability \(1 - p\) it vanishes.

The offspring particles follow the same dynamics (independently of each other).
We will refer to this system of particles as the OU branching process and denote it by
\(X = \{X_t\}_{t \ge 0}\).
We identify the system with the empirical process, i.e., \(X\) takes values in the space
of Borel measures on \(\mathbb{R}^d\) and for each Borel set \(A\), \(X_t(A)\) is the (random) number
of particles at time \(t\) in \(A\). We refer to [14] for the general construction of \(X\) as a
measure-valued stochastic process.
It is well known (see, e.g., [16]) that the system satisfies the law of large numbers,
i.e., for any bounded continuous function \(f\), conditionally on the set of non-extinction,
\[ |X_t|^{-1} \langle X_t, f \rangle \to \langle f, \varphi \rangle \quad \text{a.s.}, \tag{1} \]
where \(|X_t|\) is the number of particles at time \(t\), \(\{X_t(1), X_t(2), \ldots, X_t(|X_t|)\}\) are their
positions, \(\langle X_t, f \rangle := \sum_{i=1}^{|X_t|} f(X_t(i))\) and \(\varphi\) is the invariant measure of the
Ornstein–Uhlenbeck process.
In a recent article [1], we investigated second-order behavior of this system and
proved central limit theorems (CLT) corresponding to (1). We found three qualitatively
different regimes, depending on the relation between the branching intensity and the
parameters of the Ornstein–Uhlenbeck process.
In the present article, we extend these results on the LLN and CLT to the case of
\(U\)-statistics of the system of arbitrary order \(n \ge 1\), i.e., to random variables of the form
\[ U_t^n(f) := \sum_{\substack{i_1, i_2, \ldots, i_n = 1 \\ i_k \ne i_j \text{ if } k \ne j}}^{|X_t|} f(X_t(i_1), X_t(i_2), \ldots, X_t(i_n)) \tag{2} \]
(note that \(U_t^1(f) = \langle X_t, f \rangle\)). Our investigation parallels the classical and
well-developed theory of \(U\)-statistics of independent random variables; however, we would
like to point out that in our context, additional interest in this type of functionals of
the process \(X\) stems from the fact that they capture average dependencies between
particles of the system. This will be seen from the form of the limit, which turns out to
be more complicated than in the i.i.d. case. This is one of the motivations for studying
\(U\)-statistics in a more general context than for independent random variables. Another
one is that while being structurally the simplest generalization of additive functionals
(which are \(U\)-statistics of degree one), they may be considered building blocks for
other, more complicated statistics as they appear naturally in their Taylor expansions
(see [7], where in the i.i.d. context such expansions are used in the analysis of some
statistical estimators). In another context, they also appear in the study of tree-based
expansions and propagation of chaos in interacting particle systems (see [9–11,23]).
The organization of the paper is as follows. After introducing the basic notation
and preliminary facts in Sect. 2, we describe the main results of the paper in Sect.
3. Next (Sect. 4), we restate the results in the special case of n = 1 (as proven in
[1]) to serve as a starting point for the general case. Finally, in Sect. 5, we provide
proofs for arbitrary n, postponing some of the technical details (which may obscure
the main ideas of the proofs) to Sect. 6. We conclude with some remarks concerning
the so-called non-degenerate case (Sect. 7).
2 Preliminaries
2.1 Notation
For a branching system \(\{X_t\}_{t \ge 0}\), we denote by \(|X_t|\) the number of particles at time
\(t\) and by \(X_t(i)\) the position of the \(i\)th (in a certain ordering) particle at time \(t\). We
sometimes use \(\mathbb{E}_x\) or \(\mathbb{P}_x\) to denote the fact that we calculate the expectation for the
system starting from a particle located at \(x\). We also use \(\mathbb{E}\) and \(\mathbb{P}\) when this location is
not relevant.
By \(\to^d\), we denote convergence in law. We use \(\lesssim\) to denote the situation when
an inequality holds with a constant \(c > 0\), which is irrelevant to calculations, e.g.,
\(f(x) \lesssim g(x)\) means that there exists a constant \(c > 0\) such that \(f(x) \le c\,g(x)\).
By \(x \circ y = \sum_{i=1}^d x_i y_i\), we denote the standard scalar product of \(x, y \in \mathbb{R}^d\), by
\(\|\cdot\|\) the corresponding Euclidean norm. By \(\otimes^n\), we denote the \(n\)-fold tensor product.
We also use \(\langle f, \mu \rangle := \int_{\mathbb{R}^d} f(x)\,\mu(dx)\). We will write \(X \sim \mu\) to describe the fact
that a random variable \(X\) is distributed according to the measure \(\mu\); similarly \(X \sim Y\)
will mean that \(X\) and \(Y\) have the same law.
For a subset \(A\) of a linear space, by \(\mathrm{span}(A)\) we denote the set of finite linear
combinations of elements of \(A\).
In the paper, we will use Feynman diagrams. A diagram \(\gamma\) on a set of vertices
\(\{1, 2, \ldots, n\}\) is a graph on \(\{1, 2, \ldots, n\}\) consisting of a set of edges \(E_\gamma\) not having
common endpoints and a set of unpaired vertices \(A_\gamma\). We will use \(r(\gamma)\) to denote the
rank of the diagram, i.e., the number of edges. For properties and more information,
we refer to [21, Definition 1.35].
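In the paper, we will use the space
\[ \mathcal{P} = \mathcal{P}(\mathbb{R}^d) := \{ f : \mathbb{R}^d \to \mathbb{R} : f \text{ is continuous and } \exists_k \text{ such that } |f(x)| / \|x\|^k \to 0 \text{ as } \|x\| \to +\infty \}. \]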
We endow this space with the following norm
where n(x ) := exp
We will also use \(C_c = C_c(\mathbb{R}^d)\)
to denote the space of continuous compactly supported functions.
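Given a function \(f \in \mathcal{P}(\mathbb{R}^d)\), we will implicitly understand its derivatives (e.g.,
\(\frac{\partial f}{\partial x_i}\)) in the space of tempered distributions (see, e.g., [25, p. 173]).
By \(f(a\,\cdot)\), we denote the function \(x \mapsto f(ax)\).
Let \(\mu_1, \mu_2\) be two probability measures on \(\mathbb{R}\), and \(\mathrm{Lip}(1)\) be the space of 1-Lipschitz
functions \(\mathbb{R} \to [-1, 1]\). We define
\[ m(\mu_1, \mu_2) := \sup_{g \in \mathrm{Lip}(1)} |\langle g, \mu_1 \rangle - \langle g, \mu_2 \rangle|. \tag{5} \]
It is well known that \(m\) is a distance metrizing the weak convergence (see, e.g., [12,
Theorem 11.3.3]). One easily checks that when \(\mu_1, \mu_2\) correspond to two random
variables \(X_1, X_2\) on the same probability space then we have
\[ m(\mu_1, \mu_2) \le \|X_1 - X_2\|_1 \le \|X_1 - X_2\|_2. \]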
2.2 Basic Facts on the Galton–Watson Process
The number of particles \(\{|X_t|\}_{t \ge 0}\) is the celebrated Galton–Watson process. We present
basic properties of this process used in the paper. The main reference in this section is
[2]. In our case, the expected total number of particles grows exponentially at the rate
\[ \lambda_p := (2p - 1)\lambda. \tag{7} \]
The process becomes extinct with probability (see [2, Theorem I.5.1])
\[ p_e = \frac{1 - p}{p}. \]
We will denote the extinction and non-extinction events by \(Ext\) and \(Ext^c\), respectively.
The process \(V_t := e^{-\lambda_p t} |X_t|\) is a positive martingale. Therefore, it converges (see also
[2, Theorem 1.6.1])
\[ V_t \to V_\infty \quad \text{a.s. as } t \to +\infty. \tag{8} \]
We have the following simple fact (we refer to [1] for the proof).
Proposition 1 We have \(\{V_\infty = 0\} = Ext\) and conditioned on non-extinction \(V_\infty\)
has the exponential distribution with parameter \(\frac{2p - 1}{p}\). We have \(\mathbb{E}(V_\infty) = 1\) and
\(\mathrm{Var}(V_\infty) = \frac{1}{2p - 1}\). The quantity \(\mathbb{E}\, e^{-4\lambda_p t} |X_t|^4\) is uniformly bounded, i.e., there exists \(C > 0\) such
that for any \(t \ge 0\) we have \(\mathbb{E}\, e^{-4\lambda_p t} |X_t|^4 \le C\). Moreover, all moments are finite, i.e.,
for any \(n \in \mathbb{N}\) and \(t \ge 0\) we have \(\mathbb{E} |X_t|^n < +\infty\).
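The constants in Proposition 1 can be cross-checked with exact arithmetic. The following is a minimal sketch, assuming only the statements above (extinction probability \((1-p)/p\), exponential law with parameter \((2p-1)/p\) on survival); the function name `limit_constants` is hypothetical and not taken from the paper.

```python
from fractions import Fraction

def limit_constants(p):
    """Exact constants for the binary branching described in the text.

    p is the probability of splitting into two offspring (p > 1/2).
    The name and the return format are illustrative assumptions.
    """
    p = Fraction(p)
    if not Fraction(1, 2) < p <= 1:
        raise ValueError("supercritical binary branching needs 1/2 < p <= 1")
    p_ext = (1 - p) / p            # extinction probability
    rate = (2 * p - 1) / p         # parameter of the exponential law on survival
    p_surv = 1 - p_ext             # survival probability
    # The limit is 0 on extinction and Exp(rate) on survival.
    mean = p_surv / rate           # first moment of the limit
    second = p_surv * 2 / rate**2  # second moment of the limit
    return {"p_ext": p_ext, "rate": rate, "mean": mean, "var": second - mean**2}
```

For instance, `limit_constants(Fraction(3, 4))` gives mean 1 and variance 2, which agrees with \(1/(2p-1)\) at \(p = 3/4\).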
2.3 Basic Facts on the Ornstein–Uhlenbeck Process
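We recall that the Ornstein–Uhlenbeck process with parameters \(\sigma, \mu > 0\) is a
time-homogeneous Markov process with the infinitesimal operator
\[ L := \frac{1}{2} \sigma^2 \Delta - \mu (x \circ \nabla). \tag{9} \]
The corresponding semigroup will be denoted by \(T\). The density of the invariant
measure of the Ornstein–Uhlenbeck process is given by
\[ \varphi(x) := \Big( \frac{\mu}{\pi \sigma^2} \Big)^{d/2} \exp\Big( -\frac{\mu}{\sigma^2} \|x\|^2 \Big). \]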
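The dynamics described so far can be simulated directly. The sketch below is only an illustration under stated assumptions: it takes the Ornstein–Uhlenbeck process to have drift parameter `mu` and diffusion parameter `sigma`, so that over a time step `dt` a particle at `x` moves to a Gaussian point with mean `x * exp(-mu * dt)` and variance `sigma**2 / (2 * mu) * (1 - exp(-2 * mu * dt))` per coordinate. The function name and all default values are hypothetical.

```python
import numpy as np

def simulate_ou_branching(t_end, x0, lam=1.0, p=0.75, mu=1.0, sigma=1.0, seed=0):
    """Return the positions (k x d array) of the particles alive at time t_end.

    Assumed model: exponential lifetimes with rate lam, binary branching with
    probability p (death otherwise), exact OU transitions between events.
    """
    rng = np.random.default_rng(seed)
    x0 = np.asarray(x0, dtype=float)
    stack = [(0.0, x0)]              # (birth time, birth position)
    alive = []
    while stack:
        t, x = stack.pop()
        life = rng.exponential(1.0 / lam)
        dt = min(life, t_end - t)
        # exact Ornstein-Uhlenbeck transition over a time step dt
        mean = x * np.exp(-mu * dt)
        var = sigma**2 / (2 * mu) * (1 - np.exp(-2 * mu * dt))
        y = mean + np.sqrt(var) * rng.standard_normal(x.shape)
        if t + life >= t_end:
            alive.append(y)          # still alive at the final time
        elif rng.random() < p:
            stack.append((t + life, y))          # first offspring
            stack.append((t + life, y.copy()))   # second offspring
        # otherwise the particle vanishes without offspring
    return np.array(alive).reshape(len(alive), x0.size)
```

A call such as `simulate_ou_branching(3.0, [0.0, 0.0])` returns an array with one row per surviving particle; an empty array corresponds to extinction before `t_end`.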
2.4 Basic Facts Concerning \(U\)-Statistics
We will now briefly recall basic notation and facts concerning \(U\)-statistics. A
\(U\)-statistic of degree \(n\) based on an \(\mathcal{X}\)-valued sample \(X_1, \ldots, X_N\) and a function
\(f : \mathcal{X}^n \to \mathbb{R}\) is a random variable of the form
\[ \sum_{\substack{i_1, \ldots, i_n = 1 \\ i_k \ne i_j \text{ if } k \ne j}}^{N} f(X_{i_1}, \ldots, X_{i_n}). \]
The function \(f\) is usually referred to as the kernel of the \(U\)-statistic. Without loss
of generality, it can be assumed that \(f\) is symmetric, i.e., invariant under permutation
of its arguments. We refer the reader to [7,22] for more information on \(U\)-statistics of
sequences of independent random variables.
In our case, we will consider \(U\)-statistics based on positions of particles from the
branching system as defined by (2). We will be interested in weak convergence of
properly normalized \(U\)-statistics when \(t \to \infty\). Similarly as in the classical theory,
the asymptotic behavior of \(U\)-statistics depends heavily on the so-called order of
degeneracy of the kernel \(f\), which we will briefly recall in Sect. 5.2.
A function \(f\) is called completely degenerate or canonical (with respect to some
measure of reference \(\mu\), which in our case will be the stationary measure \(\varphi\) of the
Ornstein–Uhlenbeck process) if
\[ \int_{\mathcal{X}} f(x_1, \ldots, x_n)\,\mu(dx_k) = 0 \]
for all \(k = 1, \ldots, n\) and all \(x_1, \ldots, x_{k-1}, x_{k+1}, \ldots, x_n \in \mathcal{X}\). The complete degeneracy may be considered
a centeredness condition; in the classical theory of \(U\)-statistics canonical kernels are
counterparts of centered random variables from the theory of sums of independent
random variables. Their importance stems from the fact that each \(U\)-statistic can be
decomposed into a sum of canonical \(U\)-statistics of different degrees, a fact known
as the Hoeffding decomposition (see Sect. 5.2). Thus, in the main part of the article,
we prove results only for canonical \(U\)-statistics. Their counterparts for general
\(U\)-statistics, which can be easily obtained via the Hoeffding decomposition, are stated in
Sect. 7.
3 Main Results
This section is devoted to the presentation of our results. The proofs are deferred to
Sect. 5.
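We start with the following law of large numbers (throughout the article when
dealing with \(U\)-statistics of order \(n\) we will identify \(\mathbb{R}^d \times \cdots \times \mathbb{R}^d\) with \(\mathbb{R}^{nd}\)).
Theorem 2 Let \(\{X_t\}_{t \ge 0}\) be the OU branching system starting from \(x \in \mathbb{R}^d\). Let us
assume that \(f : \mathbb{R}^{nd} \to \mathbb{R}\) is a bounded continuous function. Then, on the set of
non-extinction \(Ext^c\) there is the convergence
\[ \lim_{t \to +\infty} |X_t|^{-n} U_t^n(f) = \langle f, \varphi^{\otimes n} \rangle \quad \text{a.s.} \]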
Having formulated the law of large numbers, let us now pass to the corresponding
CLTs. We recall that \(\mu\) is the drift parameter in (9) and \(\lambda_p\) is the growth rate (7).
As already mentioned in the introduction, the form of the limit theorems depends on
the relation between \(\lambda_p\) and \(\mu\); more specifically, we distinguish three cases: \(\lambda_p < 2\mu\),
\(\lambda_p = 2\mu\) and \(\lambda_p > 2\mu\). We refer the reader to [1] (Introduction and Section 3)
for a detailed discussion of this phenomenon as well as its heuristic explanation and
interpretation. Here, we only stress that the situation for \(\lambda_p > 2\mu\) differs substantially
from the remaining two cases, as we obtain convergence in probability and the limit
is not Gaussian even for \(n = 1\). Intuitively, this is caused by large branching intensity
which lets local correlations between particles prevail over the ergodic properties of
the Ornstein–Uhlenbeck process.
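The trichotomy can be stated as a small helper. This is a sketch only, assuming the growth rate \(\lambda_p = (2p-1)\lambda\) from (7) and the drift parameter \(\mu\) from (9); the function name and the returned labels are hypothetical.

```python
import math

def branching_regime(lam, p, mu):
    """Classify the CLT regime from the branching rate lam, the splitting
    probability p and the drift parameter mu (illustrative helper)."""
    lam_p = (2 * p - 1) * lam      # growth rate of the expected population
    if math.isclose(lam_p, 2 * mu):
        return "critical"
    return "slow" if lam_p < 2 * mu else "fast"
```

With `lam=1.0`, `p=0.75`, `mu=1.0` the growth rate is 0.5, which is below \(2\mu = 2\), so the system falls into the slow branching case.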
3.1 Slow Branching Case: \(\lambda_p < 2\mu\)
Let \(Z\) be a Gaussian stochastic measure on \(\mathbb{R}^{d+1}\) with intensity
\[ \mu_1(dt\,dx) := \big( \delta_0(dt) + 2\lambda p\, e^{\lambda_p t}\,dt \big) \varphi(x)\,dx, \]
defined according to [21, Definition 7.17], where \(\delta_0\) is the Dirac measure concentrated
at 0. We denote the stochastic integral with respect to \(Z\) by \(I\) and the corresponding
multiple stochastic integral by \(I_n\) [21, Section 7.2]. We assume that \(Z\) is defined on
some probability space \((\Omega, \mathcal{F}, \mathbb{P})\).
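For \(f \in \mathcal{P}(\mathbb{R}^{nd})\) we define (we recall that \(T\) is the semigroup of the
Ornstein–Uhlenbeck process)
\[ H(f)(s_1, x_1, s_2, x_2, \ldots, s_n, x_n) := \Big( \bigotimes_{i=1}^n T_{s_i} \Big) f(x_1, x_2, \ldots, x_n), \quad s_i \in \mathbb{R}_+,\ x_i \in \mathbb{R}^d. \tag{12} \]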
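For a Feynman diagram \(\gamma\) labeled by \(\{1, 2, \ldots, n\}\), with edges \(E_\gamma\) and unpaired vertices \(A_\gamma\), let
\[ \mu_2(ds\,dx) := 2\lambda p\, e^{\lambda_p s} \varphi(x)\,ds\,dx, \tag{13} \]
\[ L(f, \gamma) := I_{|A_\gamma|}\Big( \int H(f)(u_1, u_2, \ldots, u_n) \prod_{(j,k) \in E_\gamma} \mu_2(dz_{j,k}) \Big), \]
where \(u_i = z_{j,k}\) if \((j, k) \in E_\gamma\) and (\(i = j\) or \(i = k\)) and \(u_i = z_i\) if \(i \in A_\gamma\).
Less formally, for each pair \((j, k)\), we integrate over the diagonal of coordinates \(j\) and \(k\)
with respect to \(\mu_2\). The function obtained in this way is integrated using the multiple
stochastic integral \(I_{|A_\gamma|}\). We define
\[ L_1(f) := \sum_{\gamma} L(f, \gamma), \tag{14} \]
where the sum spans over all Feynman diagrams labeled by \(\{1, 2, \ldots, n\}\).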
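As an illustration, consider the simplest nontrivial case \(n = 2\). The display below is a worked sketch, assuming the reading of the definitions above in which every edge of a diagram contributes an integral over the diagonal with respect to \(\mu_2(ds\,dx) = 2\lambda p\, e^{\lambda_p s}\varphi(x)\,ds\,dx\) and the unpaired vertices are handled by a multiple stochastic integral. There are only two Feynman diagrams on \(\{1, 2\}\): the one without edges and the one with the single edge \((1,2)\).

```latex
\[
  L_1(f) \;=\; I_2\bigl(H(f)\bigr)
  \;+\; \int_{\mathbb{R}_+ \times \mathbb{R}^d} H(f)(z, z)\,\mu_2(dz),
\]
% for a product kernel f = f_1 \otimes f_2 the deterministic term becomes
\[
  \int_{\mathbb{R}_+ \times \mathbb{R}^d} H(f_1 \otimes f_2)(z, z)\,\mu_2(dz)
  \;=\; 2\lambda p \int_0^{+\infty} e^{\lambda_p s}\,
        \bigl\langle T_s f_1 \, T_s f_2,\ \varphi \bigr\rangle \, ds .
\]
```

For centered \(f_1, f_2\) the functions \(T_s f_i\) decay in \(L_2(\varphi)\) at rate \(e^{-\mu s}\), so the last integral is finite when \(\lambda_p < 2\mu\), which is consistent with the restriction of this construction to the slow branching case.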
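We are now ready to formulate our main result for processes with small branching
rate. Recall (8) and that \(W\) is \(V_\infty\) conditioned on \(Ext^c\).
Theorem 3 Let \(\{X_t\}_{t \ge 0}\) be the OU branching system starting from \(x \in \mathbb{R}^d\). Let us
assume that \(f \in \mathcal{P}(\mathbb{R}^{nd})\) is a canonical kernel and \(\lambda_p < 2\mu\). Then conditionally on
the set of non-extinction \(Ext^c\) there is the convergence
\[ \Big( e^{-\lambda_p t} |X_t|,\ \frac{|X_t| - e^{\lambda_p t} V_\infty}{\sqrt{|X_t|}},\ \frac{U_t^n(f)}{|X_t|^{n/2}} \Big) \to^d (W, G, L_1(f)), \]
where \(G \sim N(0, 1/(2p - 1))\) and \(W\), \(G\), \(L_1(f)\) are independent random variables.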
3.2 Critical Branching Case: \(\lambda_p = 2\mu\)
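Before we present the main results in the critical branching case, we need to introduce
some additional notation.
Consider the orthonormal Hermite basis \(\{h_i\}_{i \ge 0}\) for the measure \(\gamma = N(0, \frac{\sigma^2}{2\mu})\)
(i.e., \(h_0 = 1\), \(h_i\) is a polynomial of degree \(i\) and \(\int h_i h_j \, d\gamma = \delta_{ij}\)). Then for any
positive integer \(n\), the set \(\{h_{i_1} \otimes \cdots \otimes h_{i_{nd}}\}_{i_1, \ldots, i_{nd} \ge 0}\) of multivariate Hermite
polynomials is an orthonormal basis in \(L_2(\mathbb{R}^{nd}, \varphi^{\otimes n})\). For a function \(f \in L_2(\mathbb{R}^{nd}, \varphi^{\otimes n})\)
let \(f_{i_1, \ldots, i_{nd}}\) be the sequence of coefficients of \(f\) with respect to this basis, i.e.,
\[ f_{i_1, \ldots, i_{nd}} = \langle f\, h_{i_1} \otimes \cdots \otimes h_{i_{nd}}, \varphi^{\otimes n} \rangle. \]
Let \((G_f)_{f \in L_2(\mathbb{R}^d, \varphi)}\) be a centered Gaussian process with the covariance structure
\[ \mathrm{Cov}(G_f, G_g) = 2\lambda p\,( f_{1,0,\ldots,0}\, g_{1,0,\ldots,0} + f_{0,1,\ldots,0}\, g_{0,1,\ldots,0} + \cdots + f_{0,\ldots,0,1}\, g_{0,\ldots,0,1} ). \tag{16} \]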
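We will identify this process with a map \(I : L_2(\mathbb{R}^d, \varphi) \to L_2(\Omega, \mathcal{F}, \mathbb{P})\), such that
\(I(f) = G_f\). One can easily check that \(I\) is a bounded linear operator. Moreover,
\(I = IP\). In fact \(I(f)\) is the stochastic integral of \(Pf\) with respect to the random Gaussian
measure on \(\mathbb{R}^d\) with intensity \(2\lambda p\,\varphi\) (however, we will not use this fact in the sequel).
Since \(H^n = (H^1)^{\otimes n}\), there exists a unique linear operator \(\tilde{L}_2 : H^n \to L_2(\Omega, \mathcal{F}, \mathbb{P})\)
such that for any functions \(f_1, \ldots, f_n \in H^1\),
\[ \tilde{L}_2(f_1 \otimes \cdots \otimes f_n) = I(f_1) \cdots I(f_n) \]
(we used here the fact that Gaussian variables have all moments finite).
Let now \(P_n : L_2(\mathbb{R}^{nd}, \varphi^{\otimes n}) \to H^n\) be the orthogonal projection onto \(H^n\). We have
\(P_n = P^{\otimes n}\) and using the fact that \(H^n\) are finite dimensional we obtain that the linear
operator \(L_2 : L_2(\mathbb{R}^{nd}, \varphi^{\otimes n}) \to L_2(\Omega, \mathcal{F}, \mathbb{P})\) defined as
\[ L_2(f) = \tilde{L}_2(P_n f) \]
is bounded and satisfies
\[ L_2(f_1 \otimes \cdots \otimes f_n) = G_{f_1} \cdots G_{f_n}. \]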
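We are now ready to formulate the theorem, which describes the asymptotic
behavior of \(U\)-statistics in the critical case.
Theorem 4 Let \(\{X_t\}_{t \ge 0}\) be the OU branching system starting from \(x \in \mathbb{R}^d\). Let us
assume that \(f \in \mathcal{P}(\mathbb{R}^{nd})\) is a canonical kernel and \(\lambda_p = 2\mu\). Then conditionally on
the set of non-extinction \(Ext^c\) there is the convergence
\[ \Big( e^{-\lambda_p t} |X_t|,\ \frac{|X_t| - e^{\lambda_p t} V_\infty}{\sqrt{|X_t|}},\ \frac{U_t^n(f)}{(t |X_t|)^{n/2}} \Big) \to^d (W, G, L_2(f)), \]
where \(G \sim N(0, 1/(2p - 1))\) and \(W\), \(G\), \(L_2(f)\) are independent random variables.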
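Remark 5 One can express \(L_2(f)\) in terms of the Hermite expansion of the function
\(f\), which might give more insight into the structure of the limiting law. Indeed, define
\(\tilde{h}_i : \mathbb{R}^d \to \mathbb{R}\), \(i = 1, \ldots, d\), by \(\tilde{h}_i(x_1, \ldots, x_d) = h_1(x_i)\) (thus,
\(\tilde{h}_i = h_0^{\otimes (i-1)} \otimes h_1 \otimes h_0^{\otimes (d-i)}\)). Then for \(f \in L_2(\mathbb{R}^{dn}, \varphi^{\otimes n})\),
\[ L_2(f) = (2\lambda p)^{n/2} \sum_{i_1, \ldots, i_n \le d} \langle f\, \tilde{h}_{i_1} \otimes \cdots \otimes \tilde{h}_{i_n}, \varphi^{\otimes n} \rangle\, G_{i_1} \cdots G_{i_n}, \]
where \(G_i = (2\lambda p)^{-1/2} G_{\tilde{h}_i}\). In particular \((G_1, \ldots, G_d)\) is a vector of independent
standard Gaussian variables (this follows easily from the covariance structure of the
process \((G_f)\) and the fact that the functions \(\tilde{h}_i\) form an orthonormal system).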
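The orthonormality of the Hermite basis used in this section can be checked numerically. The sketch below assumes that the reference measure is a centered Gaussian law with variance \(s^2\) (in the setting of the paper one would take \(s^2 = \sigma^2/(2\mu)\), which is our reading of the text) and uses Gauss–Hermite quadrature from NumPy. The function names are hypothetical.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss, hermeval
from math import factorial, sqrt, pi

def hermite_orthonormal(i, x, s):
    """i-th orthonormal Hermite polynomial for N(0, s^2), evaluated at x."""
    coeffs = np.zeros(i + 1)
    coeffs[i] = 1.0                   # select the probabilists' polynomial He_i
    return hermeval(np.asarray(x) / s, coeffs) / sqrt(factorial(i))

def gram_matrix(max_degree, s, nodes=40):
    """Matrix of integrals of h_i * h_j against N(0, s^2), by quadrature."""
    z, w = hermegauss(nodes)          # nodes and weights for the weight exp(-z^2 / 2)
    w = w / sqrt(2 * pi)              # normalise the weights to the standard Gaussian law
    x = s * z                         # rescale the nodes to N(0, s^2)
    H = np.array([hermite_orthonormal(i, x, s) for i in range(max_degree + 1)])
    return (H * w) @ H.T
```

Calling `gram_matrix(4, sqrt(0.5))` should return a matrix numerically close to the identity, which is the relation \(\int h_i h_j = \delta_{ij}\) for the first five basis functions.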
3.3 Fast Branching Case: \(\lambda_p > 2\mu\)
Let
\[ H_t := e^{-(\lambda_p - \mu) t} \sum_{i=1}^{|X_t|} X_t(i). \]
The following two facts have been proved in [1, Propositions 3.9, 3.10].
Proposition 6 \(H\) is a martingale with respect to the filtration of the OU branching
system starting from \(x \in \mathbb{R}^d\). Moreover for \(\lambda_p > 2\mu\), we have \(\sup_t \mathbb{E} \|H_t\|^2 < +\infty\),
therefore there exists \(H_\infty := \lim_{t \to +\infty} H_t\) (a.s. limit) and \(H_\infty \in L_2\). When the OU
branching system starts from 0, then martingales \(V_t\) and \(H_t\) are orthogonal.
It is worthwhile to note that the distribution of \(H_\infty\) depends on the starting
conditions.
Proposition 7 Let \(\{X_t\}_{t \ge 0}\) and \(\{\tilde{X}_t\}_{t \ge 0}\) be two OU branching processes, the first one
starting from 0 and the second one from \(x\). Let us denote the limit of the corresponding
martingales by \(H_\infty\), \(\tilde{H}_\infty\), respectively. Then
\[ \tilde{H}_\infty \sim H_\infty + x V_\infty, \]
where \(V_\infty\) is the limit given by (8) for the system \(X\).
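\(H_\infty\) is \(\mathbb{R}^d\)-valued; we denote its coordinates by \(H_\infty^i\). Let \(f \in \mathcal{P}(\mathbb{R}^{nd})\). We define
\[ \tilde{L}_3(f) := \sum_{i_1, i_2, \ldots, i_n = 1}^{d} \Big\langle \frac{\partial^n f}{\partial x_{1,i_1} \partial x_{2,i_2} \cdots \partial x_{n,i_n}}, \varphi^{\otimes n} \Big\rangle H_\infty^{i_1} H_\infty^{i_2} \cdots H_\infty^{i_n}, \]
where we adopted the convention that \(x_{j,l}\) is the \(l\)th coordinate of the \(j\)th variable. By
\(L_3(f)\), we will denote \(\tilde{L}_3(f)\) conditioned on \(Ext^c\).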
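Theorem 8 Let \(\{X_t\}_{t \ge 0}\) be the OU branching system starting from \(x \in \mathbb{R}^d\). Let us
assume that \(f \in \mathcal{P}(\mathbb{R}^{nd})\) is a canonical kernel and \(\lambda_p > 2\mu\). Then conditionally on
the set of non-extinction \(Ext^c\) there is the convergence
\[ \Big( e^{-\lambda_p t} |X_t|,\ \frac{|X_t| - e^{\lambda_p t} V_\infty}{\sqrt{|X_t|}},\ e^{-n(\lambda_p - \mu) t} U_t^n(f) \Big) \to^d (W, G, L_3(f)), \]
where \(G \sim N(0, 1/(2p - 1))\) and \((W, L_3(f))\), \(G\) are independent. Moreover
\[ \big( e^{-\lambda_p t} |X_t|,\ e^{-n(\lambda_p - \mu) t} U_t^n(f) \big) \to (V_\infty, \tilde{L}_3(f)) \quad \text{in probability.} \]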
3.4 Remarks on the CLT for \(U\)-Statistics of i.i.d. Random Variables
For comparison purposes, we will now briefly recall known results on the CLT for
\(U\)-statistics of independent random variables. \(U\)-statistics were introduced in the 1940s
in the context of unbiased estimation by Halmos [19] and Hoeffding who obtained
the CLT for non-degenerate (degenerate of order 0, see Sect. 5.2) kernels [20]. The
full description of the CLT was obtained in [15,24] (see also the article [18] where
the CLT is proven for a related class of \(V\)-statistics). Similarly as in our case, the
asymptotic behavior of \(U\)-statistics based on a function \(f : \mathcal{X}^n \to \mathbb{R}\) and an i.i.d.
\(\mathcal{X}\)-valued sequence \(X_1, X_2, \ldots\) is governed by the order of degeneracy of the function
\(f\) (see Sect. 5.2) with respect to the law of \(X_1\) (call it \(P\)). The case of general \(f\) can
be reduced to the canonical one, for which one has the weak convergence
\[ N^{-n/2} \sum_{\substack{i_1, \ldots, i_n = 1 \\ i_k \ne i_j \text{ if } k \ne j}}^{N} f(X_{i_1}, \ldots, X_{i_n}) \to^d J_n(f), \]
where \(J_n\) is the \(n\)-fold stochastic integral with respect to the so-called isonormal
process on \(\mathcal{X}\), i.e., the stochastic Gaussian measure with intensity \(P\).
For the small branching rate case, the behavior of \(U\)-statistics in our setting resembles
the classical one as the limit is a sum of multiple stochastic integrals of different orders.
In the remaining two cases, the behavior differs substantially. This can be regarded as
a result of the lack of independence. Although asymptotically the particles' positions
become less and less dependent, on a short timescale, offspring of the same particle stay
close to one another.
Let us finally mention some results for \(U\)-statistics in dependent situations, which
have been obtained in recent years. In [6], the authors analyzed the behavior of
\(U\)-statistics of stationary absolutely regular sequences and obtained the CLT in the
non-degenerate case (with Gaussian limit). In [5], the authors considered \(\alpha\)- and
\(\varphi\)-mixing sequences and obtained a general CLT for canonical kernels. Interesting results for
long-range dependent sequences have also been obtained in [8]. A more recent
interesting work (already mentioned in the introduction) is [9–11,23], where the authors
consider \(U\)-statistics of interacting particle systems.
4 The Case of n = 1
In the special case of \(n = 1\), the results presented in the previous section were proven
in [1]. Although this case obviously follows immediately from the results for general
\(n\), it is actually a starting point in the proof of the general result (similarly as in the case
of \(U\)-statistics of i.i.d. random variables). Therefore, for the reader's convenience, we
will now restate this case in the simpler language of [1], not involving multiple stochastic
integrals.
We start with the law of large numbers.
Theorem 9 Let \(\{X_t\}_{t \ge 0}\) be the OU branching system starting from \(x \in \mathbb{R}^d\). Let
us assume that \(f \in \mathcal{P}(\mathbb{R}^d)\). Then
\[ \lim_{t \to +\infty} e^{-\lambda_p t} \langle X_t, f \rangle = \langle f, \varphi \rangle V_\infty \quad \text{in probability}, \]
or equivalently on the set of non-extinction, \(Ext^c\), we have
\[ \lim_{t \to +\infty} |X_t|^{-1} \langle X_t, f \rangle = \langle f, \varphi \rangle \quad \text{in probability}. \]
Moreover, if \(f\) is bounded then the almost sure convergence holds.
4.1 Small Branching Rate: \(\lambda_p < 2\mu\)
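We denote \(\tilde{f}(x) := f(x) - \langle f, \varphi \rangle\) and
\[ \sigma_f^2 := \langle \tilde{f}^2, \varphi \rangle + 2\lambda p \int_0^{+\infty} e^{\lambda_p s} \langle (T_s \tilde{f})^2, \varphi \rangle\, ds. \]
Let us also recall (8) and that \(W\) is \(V_\infty\) conditioned on \(Ext^c\). In this case, the behavior
of \(X\) is given by the following
Theorem 10 Let \(\{X_t\}_{t \ge 0}\) be the OU branching system starting from \(x \in \mathbb{R}^d\). Let us
assume that \(\lambda_p < 2\mu\) and \(f \in \mathcal{P}(\mathbb{R}^d)\). Then \(\sigma_f^2 < +\infty\) and conditionally on the set
of non-extinction \(Ext^c\), there is the convergence
\[ \Big( e^{-\lambda_p t} |X_t|,\ \frac{|X_t| - e^{\lambda_p t} V_\infty}{\sqrt{|X_t|}},\ \frac{\langle X_t, f \rangle - |X_t| \langle f, \varphi \rangle}{\sqrt{|X_t|}} \Big) \to^d (W, G_1, G_2), \]
where \(G_1 \sim N(0, 1/(2p - 1))\), \(G_2 \sim N(0, \sigma_f^2)\) and \(W\), \(G_1\), \(G_2\) are independent
random variables.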
4.2 Critical Branching Rate: \(\lambda_p = 2\mu\)
Recall the notation related to Hermite polynomials introduced in Sect. 3.2 and denote
2
xi
i=1
\[ = 2\lambda p\,( f_{1,0,\ldots,0}^2 + f_{0,1,\ldots,0}^2 + \cdots + f_{0,\ldots,0,1}^2 ) \]
(where the first equality follows from the form of the Gaussian density and its relation
to Hermite polynomials, whereas the second one from the definition of P).
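Note that the same symbol \(\sigma_f^2\) has already been used to denote the asymptotic
variance in the small branching case. However, since these cases will always be treated
separately, this should not lead to ambiguity.
Theorem 11 Let \(\{X_t\}_{t \ge 0}\) be the OU branching system starting from \(x \in \mathbb{R}^d\). Let us
assume that \(\lambda_p = 2\mu\) and \(f \in \mathcal{P}(\mathbb{R}^d)\). Then \(\sigma_f^2 < +\infty\) and conditionally on the set
of non-extinction \(Ext^c\) there is the convergence
\[ \Big( e^{-\lambda_p t} |X_t|,\ \frac{|X_t| - e^{\lambda_p t} V_\infty}{\sqrt{|X_t|}},\ \frac{\langle X_t, f \rangle - |X_t| \langle f, \varphi \rangle}{\sqrt{t |X_t|}} \Big) \to^d (W, G_1, G_2), \]
where \(G_1 \sim N(0, 1/(2p - 1))\), \(G_2 \sim N(0, \sigma_f^2)\) and \(W\), \(G_1\), \(G_2\) are independent
random variables.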
4.3 Fast Branching Rate: \(\lambda_p > 2\mu\)
In the following theorem, we use the notation introduced in Sect. 3.3.
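Theorem 12 Let \(\{X_t\}_{t \ge 0}\) be the OU branching system starting from \(x \in \mathbb{R}^d\). Let us
assume that \(\lambda_p > 2\mu\) and \(f \in \mathcal{P}(\mathbb{R}^d)\). Then conditionally on the set of non-extinction
\(Ext^c\), there is the convergence
\[ \Big( e^{-\lambda_p t} |X_t|,\ \frac{|X_t| - e^{\lambda_p t} V_\infty}{\sqrt{|X_t|}},\ e^{-(\lambda_p - \mu) t} \big( \langle X_t, f \rangle - |X_t| \langle f, \varphi \rangle \big) \Big) \to^d (W, G, \langle \nabla f, \varphi \rangle \circ J), \]
where \(G \sim N(0, 1/(2p - 1))\), \((W, J)\), \(G\) are independent and \(J\) is \(H_\infty\) conditioned
on \(Ext^c\). Moreover
\[ \Big( e^{-\lambda_p t} |X_t|,\ e^{-(\lambda_p - \mu) t} \big( \langle X_t, f \rangle - |X_t| \langle f, \varphi \rangle \big) \Big) \to (V_\infty, \langle \nabla f, \varphi \rangle \circ H_\infty) \quad \text{in probability.} \]
5 Proofs of Main Results
5.1 Outline of the Proofs
We will now pass to the proofs of the results announced in Sect. 3. Their general
structure is similar to the case of \(U\)-statistics of independent random variables, i.e.,
all the theorems will be proved first for linear combinations of tensor products and
then via suitable approximations extended to the function space \(\mathcal{P}(\mathbb{R}^{nd})\). Below we
provide a brief outline of the proofs, common for all the cases considered in the paper.
1. Using the one-dimensional versions of the results, presented in Sect. 4, and the
Cramér–Wold device, one proves convergence for functions \(f\), which are linear
combinations of tensor products. This class is shown to be dense in \(\mathcal{P}\).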
2. Using algebraic properties of the covariance, one obtains explicit formulas for the
limit, which are well defined for any function \(f \in \mathcal{P}\). Further, one shows that they
depend on \(f\) in a continuous way.
3. One obtains a uniform in \(t\) bound on the distance between the laws of \(U_t^n(f)\) and
of \(U_t^n(g)\) in terms of the distance between \(f\) and \(g\) in \(\mathcal{P}\).
This is the most involved and technical step as it relies on the analysis of moments
of \(U\)-statistics. It turns out that the formulas for moments can be expressed in
terms of auxiliary branching processes indexed by combinatorial structures, more
specifically by labeled trees of a special type (introduced in Sect. 6.3). Having this
representation, one can then obtain moment bounds via combinatorial arguments.
4. Combining the above three steps, one can easily conclude the proofs by standard
metric-theoretic arguments. By step 3, a general \(U\)-statistic based on a function \(f\)
can be approximated (uniformly in \(t\)) by a \(U\)-statistic based on special functions
\(f_n\) whose laws converge by step 1 as \(t \to \infty\) to some limiting measure \(\mu_n\). By step
2, when the approximation becomes finer and finer (\(n \to \infty\)), one has \(\mu_n \to \mu\) for
some probability measure \(\mu\). Finally, it is easy to see that \(\mu\) is the limiting measure
for the original \(U\)-statistic.
The organization of the rest of the paper is as follows. First, we recall some basic
facts about \(U\)- and \(V\)-statistics and Hoeffding projections, which we will need already
at step 1. Then we present the proof of the law of large numbers and CLTs. In the
latter proofs, we formulate and use the estimates related to step 3 without proving
them. Only later in Sect. 6 do we introduce the necessary notation and combinatorial
arguments which give those estimates.
We choose this way of presentation since it allows the readers to see the structure of
the proofs without being distracted by rather heavy notation and quite lengthy technical
arguments related to step 3.
From now on, we will often work conditionally on the set of non-extinction \(Ext^c\),
which will not be explicitly mentioned in the proofs (however, it should be clear from
the context).
5.2 Basic Facts on \(U\)- and \(V\)-Statistics
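We will now briefly recall one of the standard tools of the theory of \(U\)-statistics, which
we will use in the sequel, namely the Hoeffding decomposition.
Let us introduce for \(I \subseteq \{1, \ldots, n\}\) the Hoeffding projection of \(f : \mathbb{R}^{nd} \to \mathbb{R}\)
corresponding to \(I\) as the function \(\Pi_I f : \mathbb{R}^{|I| d} \to \mathbb{R}\), given by the formula
\[ \Pi_I f((x_i)_{i \in I}) = \int_{\mathbb{R}^{nd}} f(y_1, \ldots, y_n) \prod_{i \notin I} \varphi(dy_i) \prod_{i \in I} (\delta_{x_i} - \varphi)(dy_i). \tag{24} \]
One can easily see that for \(|I| \ge 1\), \(\Pi_I f\) is a canonical kernel. Moreover
\(\Pi_\emptyset f = \int_{\mathbb{R}^{nd}} f(x_1, \ldots, x_n) \prod_{i=1}^n \varphi(dx_i)\).
Note that if \(f\) is symmetric (i.e., invariant with respect to permutations of
arguments), \(\Pi_I f\) depends only on the cardinality of \(I\). In this case, we speak about the
\(k\)th Hoeffding projection (\(k = 0, \ldots, n\)), given by
\[ \Pi_k f(x_1, \ldots, x_k) = \langle (\delta_{x_1} - \varphi) \otimes \cdots \otimes (\delta_{x_k} - \varphi) \otimes \varphi^{\otimes (n-k)}, f \rangle. \]
A symmetric kernel in \(n\) variables is called degenerate of order \(k - 1\) (\(1 \le k \le n\))
iff \(k = \min\{i > 0 : \Pi_i f \not\equiv 0\}\). The order of degeneracy is responsible for the
normalization and the form of the limit in the CLT for \(U\)-statistics, e.g., if the kernel is
non-degenerate, i.e., \(\Pi_1 f \not\equiv 0\), then the corresponding \(U\)-statistic of an i.i.d. sequence
behaves like a sum of independent random variables and converges to a Gaussian limit.
The same phenomenon will be present also in our situation (see Sect. 7).
In the particular case \(k = n\), the definition of the Hoeffding projection reads as
\[ \Pi_n f(x_1, \ldots, x_n) = \langle (\delta_{x_1} - \varphi) \otimes \cdots \otimes (\delta_{x_n} - \varphi), f \rangle. \]
One easily checks that
\[ f(x_1, \ldots, x_n) = \sum_{I \subseteq \{1, \ldots, n\}} \Pi_I f((x_i)_{i \in I}), \]
which gives us the aforementioned Hoeffding decomposition of \(U\)-statistics
\[ U_t^n(f) = \sum_{I \subseteq \{1, \ldots, n\}} \frac{(|X_t| - |I|)!}{(|X_t| - n)!}\, U_t^{|I|}(\Pi_I f), \]
which in the case of symmetric kernels simplifies to
\[ U_t^n(f) = \sum_{k=0}^{n} \binom{n}{k} \frac{(|X_t| - k)!}{(|X_t| - n)!}\, U_t^k(\Pi_k f), \]
where we use the convention \(U_t^0(a) = a\) for any constant \(a\).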
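The Hoeffding projections are easy to compute explicitly when the reference measure is finitely supported. The sketch below is an illustration only: it replaces the stationary measure by a hypothetical probability vector `phi` on a finite state space and a kernel of two variables by a matrix `F`, then computes the four projections for \(n = 2\). The function name is an assumption of this example.

```python
import numpy as np

def hoeffding_projections(F, phi):
    """Hoeffding projections of a two-variable kernel F on a finite state space.

    F[x, y] is the kernel and phi a probability vector standing in for the
    reference measure (both are illustrative assumptions).
    """
    c = phi @ F @ phi                 # projection for the empty set: full integral
    g1 = F @ phi - c                  # I = {1}: integrate out the second variable
    g2 = phi @ F - c                  # I = {2}: integrate out the first variable
    g12 = F - (F @ phi)[:, None] - (phi @ F)[None, :] + c   # I = {1, 2}
    return c, g1, g2, g12
```

The projections for nonempty index sets integrate to zero in each variable (they are canonical), and `c + g1[:, None] + g2[None, :] + g12` recovers `F`, which is the decomposition of the kernel into its projections.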
For technical reasons, we will also consider the notion of a \(V\)-statistic which is
closely related to \(U\)-statistics, and is defined as
\[ V_t^n(f) := \sum_{i_1, i_2, \ldots, i_n = 1}^{|X_t|} f(X_t(i_1), X_t(i_2), \ldots, X_t(i_n)). \tag{25} \]
The corresponding Hoeffding decomposition is
\[ V_t^n(f) = \sum_{I \subseteq \{1, \ldots, n\}} |X_t|^{n - |I|}\, V_t^{|I|}(\Pi_I f), \]
where again we set \(V_t^0(a) = a\) for any constant \(a\).
In the proofs of our results, we will use a standard observation that a \(U\)-statistic can
be written as a sum of \(V\)-statistics. More precisely, let \(\mathcal{J}\) be the collection of partitions
of \(\{1, \ldots, n\}\), i.e., of all sets \(J = \{J_1, \ldots, J_k\}\), where the \(J_i\) are nonempty, pairwise
disjoint and \(\bigcup_i J_i = \{1, \ldots, n\}\). For \(J\) as above let \(f_J\) be a function of \(|J|\) variables
\(x_1, \ldots, x_{|J|}\), obtained by substituting \(x_i\) for all the arguments of \(f\) corresponding to
the set \(J_i\), e.g., for \(n = 3\) and \(J = \{\{1, 2\}, \{3\}\}\), \(f_J(x_1, x_2) = f(x_1, x_1, x_2)\). An easy
application of the inclusion–exclusion formula yields that
\[ U_t^n(f) = \sum_{J \in \mathcal{J}} a_J V_t^{|J|}(f_J), \tag{27} \]
where \(a_J\) are some integers depending only on the partition \(J\). Moreover one can
easily check that if \(J = \{\{1\}, \ldots, \{n\}\}\), then \(a_J = 1\), whereas if \(J\) consists of sets
with at most two elements then \(a_J = (-1)^k\) where \(k\) is the number of two-element
sets in \(J\). Let us also note that partitions consisting only of one- and two-element sets
can be in a natural way identified with Feynman diagrams (defined in Sect. 2.1).
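The passage from \(U\)-statistics to \(V\)-statistics can be verified by brute force for small \(n\). The following sketch uses hypothetical helper names and writes out the case \(n = 3\). The coefficient 2 for the one-block partition is not stated in the text; it comes from the standard Möbius function of the partition lattice and is an assumption of this example.

```python
from itertools import permutations, product

def u_statistic(f, points, n):
    """Sum of f over all n-tuples of pairwise distinct indices."""
    idx = range(len(points))
    return sum(f(*(points[i] for i in tup)) for tup in permutations(idx, n))

def v_statistic(f, points, n):
    """Sum of f over all n-tuples of indices, repetitions allowed."""
    idx = range(len(points))
    return sum(f(*(points[i] for i in tup)) for tup in product(idx, repeat=n))

def u_from_v_order3(f, points):
    """Inclusion-exclusion for n = 3, written as a signed sum of V-statistics."""
    v3 = v_statistic(f, points, 3)                          # partition {{1},{2},{3}}
    v12 = v_statistic(lambda a, b: f(a, a, b), points, 2)   # partition {{1,2},{3}}
    v13 = v_statistic(lambda a, b: f(a, b, a), points, 2)   # partition {{1,3},{2}}
    v23 = v_statistic(lambda a, b: f(b, a, a), points, 2)   # partition {{2,3},{1}}
    v123 = v_statistic(lambda a: f(a, a, a), points, 1)     # partition {{1,2,3}}
    return v3 - v12 - v13 - v23 + 2 * v123
```

Each partition with one two-element set enters with coefficient \(-1\), in agreement with \(a_J = (-1)^k\); for a constant kernel the identity reduces to \(N(N-1)(N-2) = N^3 - 3N^2 + 2N\).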
5.3 Proof of the Law of Large Numbers
Proof of Theorem 2 Consider the random probability measure \(\mu_t = |X_t|^{-1} X_t\) (recall
that formally we identify \(X_t\) with the corresponding counting measure). By Theorem
9 with probability one (conditionally on \(Ext^c\)), \(\mu_t\) converges weakly to \(\varphi\). Thus, by
Theorem 3.2 in [3], \(\mu_t^{\otimes n}\) converges weakly to \(\varphi^{\otimes n}\).
Let \(f\) be bounded and continuous. We notice that \(\langle f, \mu_t^{\otimes n} \rangle = |X_t|^{-n} V_t^n(f)\), which
gives the almost sure convergence \(|X_t|^{-n} V_t^n(f) \to \langle f, \varphi^{\otimes n} \rangle\). Now it is enough to note
that the number of terms with at least two equal indices in the sum (25) defining \(V_t^n(f)\),
i.e., the terms absent from (2), is of order
\(|X_t|^{n-1}\) and use the fact that \(|X_t| \to \infty\) a.s. on \(Ext^c\).
We note that in the proofs below we will use this fact only in the
version for \(f \in C_c(\mathbb{R}^{nd})\) which we have just proven. The proof of convergence
in probability for \(f \in \mathcal{P}(\mathbb{R}^{nd})\) follows directly from the CLT presented in
Sect. 7.
5.4 Approximation
Before we proceed to the proofs of CLTs, we will demonstrate the simple fact that any
function in \(\mathcal{P}(\mathbb{R}^{nd})\) can be approximated by tensor functions.
Lemma 13 Let \(A := \{ \bigotimes_{i=1}^n g_i : g_i \text{ bounded continuous} \}\) and \(f \in \mathcal{P}(\mathbb{R}^{nd})\) be a
canonical kernel. For every \(m > 0\) there exists a sequence \(\{f_k\} \subset \mathrm{span}(A)\) such
that each \(f_k\) is canonical and
\[ \| f_k(m\,\cdot) - f(m\,\cdot) \|_{\mathcal{P}} \to 0, \]
as \(k \to +\infty\).
Proof First, we prove that \(\mathrm{span}(A)\) is dense in \(\mathcal{P}(\mathbb{R}^{nd})\). Let us notice that given
a function \(f \in \mathcal{P}(\mathbb{R}^{nd})\) it suffices to approximate it uniformly on some box
\([-M, M]^{nd}\), \(M > 0\). The box is a compact set and an approximation exists due to
the Stone–Weierstrass theorem.
Now, let \(f \in \mathcal{P}(\mathbb{R}^{nd})\). We may find a sequence \(\{h_k\} \subset \mathrm{span}(A)\) such that \(h_k(m\,\cdot) \to
f(m\,\cdot)\) in \(\mathcal{P}\). Let us recall the Hoeffding projection (24) and denote \(I = \{1, 2, \ldots, n\}\).
Now direct calculation (using exponential integrability of Gaussian variables) reveals
that the sequence \(f_k := \Pi_I h_k\) fulfills the conditions of the lemma.
5.5 Small Branching Rate: Proof of Theorem 3
Let us first formulate two crucial facts, whose rather technical proofs we defer to Sect.
6.4. The first one corresponds to Step 2 in the outline of the proof presented in Sect.
5.1. Recall the definition of \(L_1\) given in (14).
Proposition 14 For any canonical \(f \in \mathcal{P}(\mathbb{R}^{nd})\) we have \(\mathbb{E} L_1(f)^2 < +\infty\). Moreover
\(L_1\) is a continuous function \(L_1 : \mathrm{Can} \to L_2(\Omega, \mathcal{F}, \mathbb{P})\), where
\[ \mathrm{Can} = \{ f \in \mathcal{P}(\mathbb{R}^{nd}) : f \text{ is a canonical kernel} \}, \]
and \(\mathrm{Can}\) is endowed with the norm \(\|\cdot\|_{\mathcal{P}}\).
The other fact we will use allows for a uniform in \(t\) approximation of general
canonical \(U\)-statistics by those whose kernels are sums of tensor products. This corresponds
to step 3 of the outline. Recall the distance \(m\) given by (5).
Proposition 15 Let \(\{X_t\}_{t \ge 0}\) be the OU branching system starting from \(x \in \mathbb{R}^d\)
and \(\lambda_p < 2\mu\). For any \(n \ge 2\) there exists a function \(l_n : \mathbb{R}_+ \to \mathbb{R}_+\), fulfilling
\(\lim_{s \to 0} l_n(s) = 0\) and such that for any \(f_1, f_2 \in \mathrm{Can}\) and any \(t > 1\) we have
\[ m(\mu_1, \mu_2) \le l_n( \| f_1(2n\,\cdot) - f_2(2n\,\cdot) \|_{\mathcal{P}} ), \]
where \(\mu_1 \sim |X_t|^{-n/2} U_t^n(f_1)\), \(\mu_2 \sim |X_t|^{-n/2} U_t^n(f_2)\) (the \(U\)-statistics are considered
here conditionally on \(Ext^c\)).
We can now proceed with the proof of Theorem 3.
Proof of Theorem 3 For simplicity, we concentrate on the third coordinate. The joint
convergence can be easily obtained by a straightforward modification of the arguments
below (using the joint convergence in Theorem 10 for \(n = 1\)). In the whole proof, we
work conditionally on the set of non-extinction \(Ext^c\).
Let us consider bounded continuous functions \(f_i^l : \mathbb{R}^d \to \mathbb{R}\), \(l = 1, \ldots, m\), \(i =
1, \ldots, n\), which are centered with respect to \(\varphi\) and set \(f_l := \bigotimes_{i=1}^n f_i^l\) and \(f := \sum_{l=1}^m f_l\).
In this case the \(U\)-statistic (2) writes as
\[ U_t^n(f) = \sum_{l=1}^{m} \sum_{\substack{i_1, i_2, \ldots, i_n = 1 \\ i_j \ne i_k \text{ for } j \ne k}}^{|X_t|} f_1^l(X_t(i_1)) f_2^l(X_t(i_2)) \cdots f_n^l(X_t(i_n)). \]
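Let \(\gamma\) be a Feynman diagram labeled by \(\{1, 2, \ldots, n\}\), with edges \(E_\gamma\) and unpaired
vertices \(A_\gamma\). Let
\[ S_t(\gamma) := \sum_{\substack{i_1, i_2, \ldots, i_n = 1 \\ i_j = i_k \text{ if } (j,k) \in E_\gamma}}^{|X_t|} f(X_t(i_1), X_t(i_2), \ldots, X_t(i_n)). \]
Decomposition (27) writes here as
\[ U_t^n(f) = \sum_{\gamma} (-1)^{r(\gamma)} S_t(\gamma) + R_t, \tag{28} \]
where the sum spans over all Feynman diagrams labeled by \(\{1, 2, \ldots, n\}\) (note that
when \(\gamma\) has no edges, then \(S_t(\gamma) = V_t^n(f)\)), and the remainder \(R_t\) is the sum of
\(V\)-statistics corresponding to partitions of \(\{1, \ldots, n\}\) containing at least one set with
more than two elements. First, we will prove that \(|X_t|^{-n/2} R_t \to^d 0\).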
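To this end, let us consider a partition \(J = \{A_r\}_{1 \le r \le m_1} \cup \{B_r\}_{1 \le r \le m_2} \cup \{C_r\}_{1 \le r \le m_3}\)
of the set \(\{1, 2, \ldots, n\}\), in which \(|A_r| \ge 3\), \(|B_r| = 2\) and \(|C_r| = 1\). Assume that \(m_1 \ge 1\)
and recall the definition of \(f_J\) used in (27). For any \(l = 1, \ldots, m\), we have
\[ |X_t|^{-n/2} V_t^{|J|}((f_l)_J) = \prod_{r \le m_1} \Big( |X_t|^{-|A_r|/2} \Big\langle X_t, \prod_{k \in A_r} f_k^l \Big\rangle \Big) \prod_{r \le m_2} \Big( |X_t|^{-1} \Big\langle X_t, \prod_{k \in B_r} f_k^l \Big\rangle \Big) \prod_{r \le m_3} \Big( |X_t|^{-1/2} \Big\langle X_t, \prod_{k \in C_r} f_k^l \Big\rangle \Big). \]
By the first part of Theorem 9, the first product on the right-hand side converges
almost surely to 0 and the second one converges to a finite limit. Each factor of the
third product, by Theorem 10, converges (in law) to a Gaussian random variable.
We conclude that \(|X_t|^{-n/2} V_t^{|J|}((f_l)_J) \to^d 0\). Thus, only the first summand of (28) is
relevant for the asymptotics of \(|X_t|^{-n/2} U_t^n(f)\).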
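Consider now
\[ |X_t|^{-n/2} S_t(\gamma) = \sum_{l=1}^{m} \prod_{(j,k) \in E_\gamma} \Big( |X_t|^{-1} \sum_{i=1}^{|X_t|} f_j^l(X_t(i)) f_k^l(X_t(i)) \Big) \prod_{r \in A_\gamma} \Big( |X_t|^{-1/2} \sum_{i=1}^{|X_t|} f_r^l(X_t(i)) \Big). \]
Let us denote \(Z_{f_j^l}(t) := |X_t|^{-1/2} \sum_{i=1}^{|X_t|} f_j^l(X_t(i))\). By Theorem 10 and the
Cramér–Wold device, we get that
\[ (Z_{f_j^l}(t))_{1 \le j \le n, 1 \le l \le m} \to^d (G_{f_j^l})_{1 \le j \le n, 1 \le l \le m}, \]
where \((G_{f_j^l})_{j,l}\) is a centered Gaussian vector with covariances
\[ \mathrm{Cov}(G_{f_j^{l_1}}, G_{f_k^{l_2}}) = \langle f_j^{l_1} f_k^{l_2}, \varphi \rangle + 2\lambda p \int_0^{+\infty} e^{\lambda_p s} \langle T_s f_j^{l_1}\, T_s f_k^{l_2}, \varphi \rangle\, ds. \]
Let \(D := \mathbb{R}_+ \times \mathbb{R}^d\) and note that
\[ \mathrm{Cov}(G_{f_j^{l_1}}, G_{f_k^{l_2}}) = \int_D H(f_j^{l_1})(z)\, H(f_k^{l_2})(z)\, \mu_1(dz) = \mathrm{Cov}(I_1(H(f_j^{l_1})), I_1(H(f_k^{l_2}))). \]
We conclude that
\[ (G_{f_j^l})_{1 \le j \le n, 1 \le l \le m} \sim (I_1(H(f_j^l)))_{1 \le j \le n, 1 \le l \le m} \]
(recall that \(I_1\) is the Gaussian stochastic integral with respect to the random Gaussian
measure with intensity \(\mu_1\)). Thus, without loss of generality, we can assume that
\(G_{f_j^l} = I_1(H(f_j^l))\) for all \(l, j\).
On the other hand by Theorem 9, one easily obtains
\[ |X_t|^{-1} \sum_{i=1}^{|X_t|} f_j^l(X_t(i)) f_k^l(X_t(i)) \to \langle f_j^l f_k^l, \varphi \rangle \quad \text{a.s.} \]
Thus, by decomposition (28) and the considerations above, we obtain
\[ |X_t|^{-n/2} U_t^n(f) \to^d L := \sum_{l=1}^{m} \sum_{\gamma} (-1)^{r(\gamma)} \prod_{(j,k) \in E_\gamma} \langle f_j^l f_k^l, \varphi \rangle \prod_{r \in A_\gamma} G_{f_r^l}. \]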
We will now show that $L$ is equal to $L_1(f)$ given by (14). By linearity of $L_1(f)$ it is
enough to consider the case of $m = 1$. We will therefore drop the superscript and write
$f_i$ instead of $f_i^l$. We recall (12) and denote $P(f_i, f_j) := \int_D H(f_i \otimes f_j)(z, z)\, \mu_2(dz)$,
where $D = \mathbb{R}_+ \times \mathbb{R}^d$. By (29) and the definition of $\mu_2$ given in (13)
L =
(EG f j G fk P( f j , fk ))
L =
Let us notice that the inner sum can be written as
where runs over all Feynman diagrams on the set of vertices A. Thus, by [21,
Theorem 3.4 and Theorem 7.26] this equals IA H (iA fi ) and in consequence
L =
P( f j , fk )IA H (iA fi ) .
It is easy to see that in the case of $f = \otimes_{i=1}^n f_i$, the expression above is equivalent
to (14). Thus, for each $f$ which is a finite sum of tensors, we have $L = L_1(f)$ and in
consequence $|X_t|^{-n/2} U_t^n(f) \xrightarrow{d} L_1(f)$.
Let us now consider a general canonical function $f \in \mathcal{P}$. We put $h(x) := f(2nx)$.
By Lemma 13 we may find a sequence of canonical functions $\{f_k\}_k \subset \mathrm{span}(A)$ such
that $f_k(2n\,\cdot) \to h$ in $\mathcal{P}$. Now by Proposition 15, we may approximate $|X_t|^{-n/2} U_t^n(f)$
with $|X_t|^{-n/2} U_t^n(f_k)$ uniformly in $t > 1$. This together with Proposition 14 and
standard metric-theoretic considerations concludes the proof.
5.6 Critical Branching Rate: Proof of Theorem 4

For the critical case, we will need the following counterpart of Proposition 15, which will be proved in Sect. 6.5.
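Proposition 16 Let $\{X_t\}_{t \ge 0}$ be the OU branching system starting from $x \in \mathbb{R}^d$
and $\lambda_p = 2\mu$. For any $n \ge 2$ there exists a function $l_n : \mathbb{R}_+ \to \mathbb{R}_+$, fulfilling
$\lim_{s \to 0} l_n(s) = 0$ and such that for any canonical $f_1, f_2 \in \mathcal{P}(\mathbb{R}^{nd})$ and any $t > 1$ we
have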
where 1 (t Xt )n/2Utn( f1), 2 (t Xt )n/2Utn( f2) (the U statistics are
considered here conditionally on $Ext^c$).
Proof of Theorem 4 As in the subcritical case, we will focus on the third coordinate.
The proof is slightly easier than the one of Theorem 3 since, because of the larger
normalization, the notions of $U$-statistics and $V$-statistics coincide in the limit. Indeed, let
us consider bounded continuous functions $f_1^l, f_2^l, \dots, f_n^l$, $l = 1, \dots, m$, which are
centered with respect to $\varphi$ and denote $f := \sum_{l=1}^m \otimes_{i=1}^n f_i^l$. By (28) we have
Utn( f ) Vtn( f ) =
simply by the fact that the Feynman diagram without edges corresponds to $V_t^n(f)$.
Analogously as in the proof of Theorem 3, we have $(t|X_t|)^{-n/2} R_t \xrightarrow{d} 0$. Let us now
fix some diagram $\gamma$ with at least one edge. Without loss of generality, we assume that
$E_\gamma = \{(1, 2), (3, 4), \dots, (2k-1, 2k)\}$ for $k \ge 1$. We have
l=1 ik
(Xt t )1/2 Xt , fil
By Theorem 11 each of the factors in the second product converges in distribution,
whereas by the first part of Theorem 2 each factor in the first product converges almost
surely to 0. In consequence $(t|X_t|)^{-n/2} S_t(\gamma)$ converges in probability to 0, which shows
that $(t|X_t|)^{-n/2} U_t^n(f)$ and $(t|X_t|)^{-n/2} V_t^n(f)$ are asymptotically equivalent.
Let us denote $Z_{f_j^l}(t) := (t|X_t|)^{-1/2} \sum_{i=1}^{|X_t|} f_j^l(X_t(i))$. By Theorem 11 and the
Cramér–Wold device, we get that
\[ (Z_{f_j^l}(t))_{1 \le j \le n,\, l \le m} \xrightarrow{d} (G_{f_j^l})_{1 \le j \le n,\, l \le m}, \]
where $(G_{f_j^l})_{j \le n,\, l \le m}$ is centered Gaussian with the covariances given by (16). Thus,
\[ (t|X_t|)^{-n/2} V_t^n(f) \xrightarrow{d} \sum_{l=1}^m \prod_{j=1}^n G_{f_j^l} = L_2(f), \]
where $L_2$ is defined by (17).
Now we pass to general canonical functions $f \in \mathcal{P}$. By Lemma 13, we can
approximate $f$ by canonical $f_k$ from $\mathrm{span}(A)$ in such a way that $\|f - f_k\|_{\mathcal{P}} \le
\|f(2n\,\cdot) - f_k(2n\,\cdot)\|_{\mathcal{P}} \to 0$. Thus, by Proposition 16, the law of $(t|X_t|)^{-n/2} U_t^n(f_k)$
converges to the one of $(t|X_t|)^{-n/2} U_t^n(f)$ as $k \to \infty$, uniformly in $t > 1$.
Moreover, by the fact that $L_2$ is bounded on $L_2(\mathbb{R}^{nd}, \varphi^{\otimes n})$ and there exists $C < \infty$ such
that $\|\cdot\|_{L_2(\mathbb{R}^{dn}, \varphi^{\otimes n})} \le C \|\cdot\|_{\mathcal{P}}$ (which follows easily from exponential integrability
of Gaussian variables), we obtain $L_2(f_k) \to L_2(f)$ in the space $L_2(\Omega, \mathcal{F}, \mathbb{P})$. The
proof may now be concluded by standard metric-theoretic arguments.
As in the previous two cases, we start with a fact which allows us to approximate general
$U$-statistics by those with simpler kernels. It is slightly different from the
corresponding statements in the small and critical branching cases, which is related to a different
type of convergence and a deterministic normalization which we have for large
branching. The proof is deferred to Sect. 6.6.
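Proposition 17 Let $\{X_t\}_{t \ge 0}$ be the OU branching system with $\lambda_p > 2\mu$. There exist
constants $C, c > 0$ such that for any canonical $f \in \mathcal{P}(\mathbb{R}^{nd})$ we have
\[ \mathbb{E}_x \big| e^{-n(\lambda_p - \mu)t} U_t^n(f) \big| \le C \exp\big(c|x|\big) \|f(2n\,\cdot)\|_{\mathcal{P}}. \]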
Proof of Theorem 8 Again we concentrate on the third coordinate. The joint
convergence can be easily obtained by a modification of the arguments below (using the joint
convergence in Theorem 12 for $n = 1$).
First, note that $U$-statistics and $V$-statistics are asymptotically equivalent. The
argument is analogous to the one presented in the proof of Theorem 4, since under the
assumption $\lambda_p > 2\mu$ we have
as $t \to \infty$ and consequently we can disregard the sum over all multi-indices
$(i_1, \dots, i_n)$ in which the coordinates are not pairwise distinct.
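To make the distinction between the two sums concrete, the following minimal Python sketch computes both for a small, arbitrary list of positions and a product kernel of order n = 2. The helper names (`v_statistic`, `u_statistic`) and the sample data are illustrative assumptions, not notation from the paper; the only fact used is that the V-statistic sums over all multi-indices, while the U-statistic keeps only those with pairwise distinct coordinates.

```python
from itertools import permutations, product


def v_statistic(f, points, n):
    """Sum of f over all n-tuples of points (repeated indices allowed)."""
    return sum(f(*tup) for tup in product(points, repeat=n))


def u_statistic(f, points, n):
    """Sum of f over n-tuples of points with pairwise distinct indices."""
    return sum(
        f(*(points[i] for i in idx))
        for idx in permutations(range(len(points)), n)
    )


# Hypothetical particle positions and a simple tensor kernel f(x, y) = x * y.
points = [0.5, -1.0, 2.0, 0.25]
f = lambda x, y: x * y

v_value = v_statistic(f, points, 2)
u_value = u_statistic(f, points, 2)
# For n = 2 the multi-indices with repeated coordinates are exactly the diagonal.
diagonal = sum(f(x, x) for x in points)
print(v_value, u_value, diagonal)
```

For n = 2 the difference between the two sums is exactly the diagonal sum of f(x_i, x_i), which is the kind of term that the argument above shows to be negligible after normalization.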
Let us consider bounded continuous functions $f_1^l, f_2^l, \dots, f_n^l : \mathbb{R}^d \to \mathbb{R}$,
$l = 1, \dots, m$, which are centered with respect to $\varphi$ and denote $f := \sum_{l=1}^m \otimes_{i=1}^n f_i^l$. By
Theorem 12 for $n = 1$ we have
fil , H
l=1 i=1
in probability. Before our final step, we recall that the convergence in probability can
be metrized by $d(X, Y) := \mathbb{E}\, \frac{|X - Y|}{|X - Y| + 1} \le \|X - Y\|_1$. Let us now consider a function
$f \in \mathcal{P}$. By Lemma 13 we may find a sequence of canonical functions $\{f_k\} \subset \mathrm{span}(A)$
such that $f_k(2n\,\cdot) \to f(2n\,\cdot)$ in $\mathcal{P}$. Now by Proposition 17, we may approximate
$e^{-n(\lambda_p - \mu)t} U_t^n(f)$ with $e^{-n(\lambda_p - \mu)t} U_t^n(f_k)$ uniformly in $t$ in the sense of the metric $d$.
Moreover, one can easily show that $\lim_{k \to +\infty} d(L_3(f_k), L_3(f)) = 0$. This concludes
the proof.
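The role of the metric in the last step can be illustrated numerically. The sketch below estimates, from paired samples, the quantity E[|X - Y| / (1 + |X - Y|)], a standard metric for convergence in probability, and compares it with the L1 distance, which always dominates it. The function names and the Gaussian test data are assumptions made purely for illustration.

```python
import random


def prob_metric(xs, ys):
    """Empirical version of E[|X - Y| / (1 + |X - Y|)] from paired samples."""
    return sum(abs(x - y) / (1.0 + abs(x - y)) for x, y in zip(xs, ys)) / len(xs)


def l1_distance(xs, ys):
    """Empirical version of E|X - Y| from paired samples."""
    return sum(abs(x - y) for x, y in zip(xs, ys)) / len(xs)


rng = random.Random(0)
xs = [rng.gauss(0.0, 1.0) for _ in range(10_000)]

results = []
for eps in (1.0, 0.1, 0.01):
    # Y is a perturbation of X; the perturbation shrinks with eps.
    ys = [x + eps * rng.gauss(0.0, 1.0) for x in xs]
    results.append((eps, prob_metric(xs, ys), l1_distance(xs, ys)))

for eps, d_value, l1_value in results:
    print(f"eps={eps}: d={d_value:.4f}, L1={l1_value:.4f}")
```

Since a / (1 + a) is at most a for a nonnegative, an L1 bound of the kind given in Proposition 17 immediately controls the metric, which is how the approximation step above can be read.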
6 Proofs of Technical Lemmas
We will now provide the proofs of the technical facts formulated in Sect. 5. The proofs
are quite technical and require several preparatory steps. In what follows, we first
recall some additional properties of the Ornstein–Uhlenbeck process, and then we
introduce certain auxiliary combinatorial structures which will play a prominent role
in the proofs.
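The semigroup of the Ornstein–Uhlenbeck process can be represented by
\[ T_t f(x) = (g_t * f)(x_t), \qquad x_t := e^{-\mu t} x, \]
where
\[ g_t(x) = (2\pi\sigma_t^2)^{-d/2} \exp\Big(-\frac{|x|^2}{2\sigma_t^2}\Big), \qquad \sigma_t^2 := \frac{\sigma^2}{2\mu}\big(1 - e^{-2\mu t}\big). \]
Let us recall (10). We denote $ou(t) := \sqrt{1 - e^{-2\mu t}}$ and let $G \sim \varphi$. Then (30) can be
written as
\[ T_t f(x) = \int_{\mathbb{R}^d} f(x_t - y)\, g_t(y)\, dy = \int_{\mathbb{R}^d} f\big(x e^{-\mu t} + ou(t) y\big)\, \varphi(y)\, dy = \mathbb{E} f\big(x e^{-\mu t} + ou(t) G\big). \]
We also denote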
It is well known that the Ornstein–Uhlenbeck semigroup increases the smoothness
of a function. We will now introduce some simple auxiliary lemmas which quantify
this statement and give bounds on the $\mathcal{P}$-norms of derivatives of certain functions
obtained from $f$ by applying the Ornstein–Uhlenbeck semigroup on a subset of
coordinates. Such bounds will be useful, since they will allow us to pass in the analysis to
smooth functions.
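As a numerical illustration of the Gaussian representation of the semigroup, the sketch below estimates T_t f(x) = E f(x e^{-mu t} + ou(t) G) by Monte Carlo in dimension d = 1 and compares the result with the closed form available for f(y) = y^2. It assumes the parametrization in which the invariant law is N(0, sigma^2 / (2 mu)); all function names and numerical values are hypothetical and chosen only for the example.

```python
import math
import random


def ou(t, mu):
    """Noise scale sqrt(1 - exp(-2*mu*t)) accumulated up to time t."""
    return math.sqrt(1.0 - math.exp(-2.0 * mu * t))


def semigroup_mc(f, x, t, mu, sigma, n_samples, rng):
    """Monte Carlo estimate of T_t f(x) = E f(x*exp(-mu*t) + ou(t)*G), d = 1.

    G is drawn from the assumed invariant law N(0, sigma^2 / (2*mu)).
    """
    stat_sd = sigma / math.sqrt(2.0 * mu)
    decay = math.exp(-mu * t)
    scale = ou(t, mu)
    total = 0.0
    for _ in range(n_samples):
        g = rng.gauss(0.0, stat_sd)
        total += f(x * decay + scale * g)
    return total / n_samples


def semigroup_square_exact(x, t, mu, sigma):
    """Closed form of T_t f(x) for f(y) = y^2: squared mean plus variance."""
    mean = x * math.exp(-mu * t)
    variance = sigma ** 2 / (2.0 * mu) * (1.0 - math.exp(-2.0 * mu * t))
    return mean ** 2 + variance


estimate = semigroup_mc(lambda y: y * y, 1.0, 0.5, 1.0, 1.0, 100_000, random.Random(0))
exact = semigroup_square_exact(1.0, 0.5, 1.0, 1.0)
print(estimate, exact)
```

As t grows, the mean term decays and the variance term approaches sigma^2 / (2 mu), so the estimate tends to the integral of f against the invariant law, in line with the invariance property used later in the proofs.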
Let $f \in \mathcal{P}(\mathbb{R}^{nd})$ and $I \subset \{1, 2, \dots, n\}$ with $|I| = k$. We define
fI (x1, x2, . . . , xn ) :=
f (z1, z2, . . . , zn)
iI
Lemma 18 Let f P (Rnd ) and l N. Then for any I {1, 2, . . . , n} the function
fI is smooth with respect to coordinates in I . For any multiindex = (i1, . . . , il )
{1, . . . , nd}l such that { i j /d : j = 1, . . . , l} I we have
C f
where C > 0 does not depend on f and depends only on the parameters of the system
(that is , , d, l, n). Moreover, when f is canonical, so is fI .
K :=

f (z1, z2, . . . , zn) x iI g1(xi e yi )dyi .
Therefore, by (4), the properties of the Gaussian density g1 and easy calculations we
arrive at
fI (x1, x2, . . . , xn)(x j )dx j = 0.
There are two cases, the first when $j \notin I$. Then we have
fI (x1, x2, . . . , xn)(x j )dx j
f (z1, z2, . . . , zn)
iI
The second case is when $j \in I$. Then
fI (x1, x2, . . . , xn)(x j )dx j
f (z1, z2, . . . , zn)
f (z1, z2, . . . , zn)
iI\{ j}
iI\{ j}
where the second equality holds by the fact that $\varphi$ is the invariant measure of the
Ornstein–Uhlenbeck process. Now the proof reduces to the first case.
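We will also need the following simple identity. We consider $\{x_i\}_{i=1,2,\dots,n}$,
$\{\tilde{x}_i\}_{i=1,2,\dots,n}$. By induction one easily checks that the following lemma holds (we
slightly abuse the notation here, e.g., $\frac{\partial}{\partial y_i}$ denotes the derivative in direction $\tilde{x}_i - x_i$
and $\int_a^b$ the integral over the segment $[a, b] \subset \mathbb{R}^d$).
Lemma 19 Let $f$ be a smooth function, then
\[ \sum_{(\epsilon_1, \epsilon_2, \dots, \epsilon_n) \in \{0,1\}^n} (-1)^{\sum_{i=1}^n \epsilon_i} f\big(x_1 + \epsilon_1(\tilde{x}_1 - x_1), x_2 + \epsilon_2(\tilde{x}_2 - x_2), \dots, x_n + \epsilon_n(\tilde{x}_n - x_n)\big) = (-1)^n \int_{x_1}^{\tilde{x}_1} \int_{x_2}^{\tilde{x}_2} \cdots \int_{x_n}^{\tilde{x}_n} \frac{\partial^n}{\partial y_1 \partial y_2 \cdots \partial y_n} f(y_1, y_2, \dots, y_n)\, dy_n\, dy_{n-1} \cdots dy_1. \]

6.3 Bookkeeping of Trees
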
We will now introduce the bookkeeping of trees technique (for similar considerations
see, e.g., [13, Section 2] or [4]), which via some combinatorics and the introduction of
auxiliary branching processes will allow us to pass from equations on the Laplace
transform in the case of $n = 1$ to estimates of moments of $V$-statistics and consequently
$U$-statistics, which will be crucial for proving Propositions 15, 16 and 17.
Our starting point is classical. We will use the equation on the Laplace transform
of the branching process to obtain, via integration, recursive formulas for moments of
$V$-statistics generated by tensors.
Recall thus (25). Let $f_1, f_2, \dots, f_n \in C_c(\mathbb{R}^d)$ and $f_i \ge 0$. We would like to
calculate
i=1
Ex Vtn(in=1 fi ) = Ex
Note that this differentiation is valid by Proposition 1 and properties of the Laplace
transform (e.g., [17, Chapter XIII.2]). By the calculations from Section 4.1. in [1] we
know that
It is easy to check that
Tts 4 pw2(, s, ) w(, s, ) + (1 p)5 (x )ds.
The last formula is much easier to handle if written in terms of auxiliary branching
processes. Firstly, we introduce the following notation. For $n \in \mathbb{N} \setminus \{0\}$ we denote by
$T_n$ the set of rooted trees described below. The root has a single offspring. All inner
vertices (we exclude the root and the leaves) have exactly two offspring. For $\tau \in T_n$,
= Tt
= Tt
We evaluate it at = 0, (let us notice that 11 w(x , s, 0) = 1
1 w1 (x , s, 0) =
Assume that  > 0. We denote by P1() all pairs (1, 2) such that 1 2 =
and 1 2 = , and by P2() P1() pairs with an additional restriction that
1 = and 2 = . Using (36) we easily check that
Tts

w(, s, ) w(, s, ) (x )ds.
by $l(\tau)$ we denote the set of its leaves. Each leaf $l \in l(\tau)$ is assigned a label, denoted by
$\mathrm{lab}(l)$, which is a non-empty subset of $\{1, 2, \dots, n\}$. The labels fulfill two conditions:
\[ \bigcup_{l \in l(\tau)} \mathrm{lab}(l) = \{1, 2, \dots, n\}, \qquad \forall_{l_1, l_2 \in l(\tau)} \big( l_1 \neq l_2 \Rightarrow \mathrm{lab}(l_1) \cap \mathrm{lab}(l_2) = \emptyset \big). \]
In other words, the labels form a partition of $\{1, 2, \dots, n\}$. For a given $\tau \in T_n$ let
$i(\tau)$ denote the set of inner vertices (we exclude the root and the leaves); clearly
$|i(\tau)| = |l(\tau)| - 1$ (as usual $|\cdot|$ denotes the cardinality). Let us identify the vertices
of $\tau$ with $\{0, 1, 2, \dots, |\tau| - 1\}$ in such a way that for any vertex $i$ its parent, denoted
by $p(i)$, is smaller. Obviously, this implies that 0 is the root and that the inner vertices
have numbers in the set $\{1, 2, \dots, |\tau| - 1\}$. We denote also by $s(\tau)$ and $m(\tau)$ the sets of
leaves with singleton and non-singleton label sets, respectively.
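The combinatorial structure just described can be encoded directly. The following Python sketch is one possible representation, with a hypothetical class name `LabelledTree` and an example tree chosen for illustration; it checks the conditions stated above (parents have smaller numbers, the root has a single offspring, inner vertices have two offspring, labels form a partition) and the relation between the numbers of inner vertices and leaves.

```python
from dataclasses import dataclass


@dataclass
class LabelledTree:
    parent: list  # parent[v] for v >= 1; parent[0] is None (vertex 0 is the root)
    labels: dict  # leaf -> frozenset of labels from {1, ..., n}

    def children(self, v):
        return [w for w, p in enumerate(self.parent) if p == v]

    def leaves(self):
        return [v for v in range(1, len(self.parent)) if not self.children(v)]

    def inner(self):
        # Inner vertices: neither the root nor a leaf.
        return [v for v in range(1, len(self.parent)) if self.children(v)]

    def singleton_leaves(self):
        return [v for v in self.leaves() if len(self.labels[v]) == 1]

    def multiple_leaves(self):
        return [v for v in self.leaves() if len(self.labels[v]) > 1]

    def is_valid(self, n):
        ordered = all(
            self.parent[v] is not None and self.parent[v] < v
            for v in range(1, len(self.parent))
        )
        root_ok = len(self.children(0)) == 1
        binary = all(len(self.children(v)) == 2 for v in self.inner())
        labelled = set(self.labels) == set(self.leaves())
        nonempty = all(len(lab) > 0 for lab in self.labels.values())
        # Sorted concatenation equals 1..n iff the labels are disjoint and cover 1..n.
        flat = sorted(a for lab in self.labels.values() for a in lab)
        partition = flat == list(range(1, n + 1))
        return ordered and root_ok and binary and labelled and nonempty and partition


# Example for n = 3: root 0, inner vertex 1, leaves 2 and 3.
tree = LabelledTree(
    parent=[None, 0, 1, 1],
    labels={2: frozenset({1, 3}), 3: frozenset({2})},
)
print(tree.is_valid(3), tree.inner(), tree.leaves())
```

In this example leaf 3 carries a singleton label and leaf 2 a non-singleton one, so the two families of leaves distinguished in the text are both represented.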
Given $\tau \in T_n$, $t \in \mathbb{R}_+$ and $\{t_i\}_{i \in i(\tau)}$, we consider an Ornstein–Uhlenbeck
branching walk on $\tau$ as follows. The initial particle is placed at time 0 at location $x$; it
evolves up to the time $t - t_1$ according to the Ornstein–Uhlenbeck process and splits
into two offspring, the first one associated with the left branch of vertex 1 in the tree $\tau$
and the second one with the right branch. Further, each of them evolves independently
until time $t - t_i$, where $i$ is the first vertex in the corresponding subtree, when it splits,
and so on. At time $t$, the particles are stopped and their positions are denoted by
$\{Y_i\}_{i \in l(\tau)}$ (the number of particles at the end is equal to the number of leaves). The
construction makes sense provided that $t_i \le t$ and $t_i \le t_{p(i)}$ for all $i \in i(\tau)$ (which
we implicitly assume). We define
\[ OU\big(\otimes_{a=1}^n f_a, \tau, t, \{t_i\}_{i \in i(\tau)}, x\big) := \mathbb{E}_x \prod_{a=1}^n f_a(Y_{j(a)}), \]
where we set $t_0 = t$. The reason to study the above objects becomes apparent by the
following statement.
Proof The claim is a consequence of the identity
where {1, 2, . . . , n} and T() is the set of trees, as T, with the exception that
the labels are in the set . This identity in turn will follow by induction with respect to
the cardinality of . For = {i } Eq. (38) reads as v(x , t ) = ept Tt fi (x ) (note that
P2() = ). The space T() contains only one tree, denoted by s , consisting of the
root and a single leaf labeled by {i }. We have i ( ) = and S(s , t, x ) = ept Tt fi (x )
so (43) follows.
Let now  = k > 1 and assume that (43) holds for all sets of cardinality at most
k 1. Apply again (38). Similarly as before, the first term corresponds to s . Let p be
the transition density of the OrnsteinUhlenbeck semigroup. By induction the second
term of (38) can be written as
ept1 Ttt1
By (41) the above expression equals
(1,2)P2() 1T(1) 2T(2)
t1
OU i2 fi , 2, t1, {ti }ii(2) , y dydt1.
Now we create a new tree $\tau$ by setting $\tau_1$ and $\tau_2$ to be descendants of the vertex
born from the root at time $t - t_1$ (thus this vertex is assigned the split time $t_1$). We
keep labels and the remaining split times unchanged. Consider the branching random
walk on $\tau$ with the initial position of the first particle equal to $x$. Note that by the
branching property the evolution of this process on subtrees $\tau_1$ and $\tau_2$ is conditionally
independent given the evolution of the first particle up to time $t - t_1$. Thus, by the
Markov property of the Ornstein–Uhlenbeck process, we can identify the branching
random walk on $\tau_1$ and $\tau_2$ with the branching random walk on $\tau$. More precisely we
have
p(t t1, x , y)OU i1 fi , 1, t1, {ti }ii(1) , y
Using the Fubini theorem together with the equality i ( ) = i (1) + i (2) + 1 we
see that the summand corresponding to 1, 2 in (44) equals S(, t, x ). It is also easy to
check that the described correspondence is a bijection from the set of pairs (1, 2) [as
in (44)] to Tn \ {s } and therefore the expression (44) is equal to Tn\{s } S(, t, x ),
which ends the proof.
The calculations will be more tractable when we derive an explicit formula for
$\{Y_i\}_{i \in l(\tau)}$. Let us recall the notation introduced in (32) and consider a family of
independent random variables $\{G_i\}_{i \in \tau}$, such that $G_i \sim \varphi$ for $i \neq 0$ and $G_0 \sim \delta_x$. Recall
also that $ou(t) = \sqrt{1 - e^{-2\mu t}}$. The following proposition follows easily from the
construction of the branching walk on $\tau$ and (32).
Proposition 21 Let $\{Y_i\}_{i \in l(\tau)}$ be positions of particles at time $t$ of the Ornstein–Uhlenbeck
process on tree $\tau$ with labels $\{t_i\}_{i \in i(\tau)}$. We have $\{Y_i\}_{i \in l(\tau)} \stackrel{d}{=} \{Z_i\}_{i \in l(\tau)}$, where
\[ Z_i := \sum_{l \in P(i)} ou(t_{p(l)} - t_l)\, G_l\, e^{-\mu t_l} + ou(t_{p(i)})\, G_i, \]
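The construction of the branching walk on a tree can be sketched in code by composing exact Ornstein–Uhlenbeck transitions along the branches. The sketch below works in dimension d = 1 and assumes that the invariant law is N(0, sigma^2 / (2 mu)); the encoding of the tree by a `parent` list and a `split_time` dictionary, as well as the function names, are hypothetical choices made for this illustration rather than the paper's notation.

```python
import math
import random


def ou(u, mu):
    """Noise scale sqrt(1 - exp(-2*mu*u)) accumulated over a time span u."""
    return math.sqrt(1.0 - math.exp(-2.0 * mu * u))


def simulate_leaf_positions(parent, split_time, t, x, mu, sigma, rng):
    """Simulate one OU branching walk on a tree (d = 1).

    parent[v]     -- parent of vertex v (parent[0] is None, parent[v] < v)
    split_time[v] -- t_v for inner vertices: vertex v splits at time t - t_v
    Leaves are stopped at time t, i.e. their remaining time is 0.
    """
    stat_sd = sigma / math.sqrt(2.0 * mu)  # std of the invariant law (assumed)
    remaining = {0: t}                      # convention t_0 = t
    position = {0: x}                       # the walk starts from x at time 0
    for v in range(1, len(parent)):
        p = parent[v]
        remaining[v] = split_time.get(v, 0.0)
        u = remaining[p] - remaining[v]     # life span of the particle ending at v
        if u < 0:
            raise ValueError("split times must decrease along each branch")
        g = rng.gauss(0.0, stat_sd)
        # Exact OU transition over a time span u, started from the parent's position.
        position[v] = position[p] * math.exp(-mu * u) + ou(u, mu) * g
    has_children = {p for p in parent if p is not None}
    return {v: position[v] for v in range(1, len(parent)) if v not in has_children}


# Example tree: root 0, inner vertices 1 and 3, leaves 2, 4 and 5.
parent = [None, 0, 1, 1, 3, 3]
split_time = {1: 0.7, 3: 0.3}
leaves = simulate_leaf_positions(
    parent, split_time, t=1.0, x=2.0, mu=1.0, sigma=1.0, rng=random.Random(1)
)
print(leaves)
```

Each leaf position is marginally distributed as the Ornstein–Uhlenbeck process at time t started from x, while leaves sharing ancestors are correlated through the common part of their paths; this is the dependence that the explicit representation in Proposition 21 makes tractable.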
We are now ready to prove an extended version of Proposition 20. This result will
be instrumental in proving bounds needed to implement step 3 of the outline presented
in Sect. 5.1.
Proposition 22 Let $\{X_t\}_{t \ge 0}$ be the OU branching system starting from $x \in \mathbb{R}^d$ and
$f \in \mathcal{P}(\mathbb{R}^{nd})$. Then
where in (41) we extend the definition of $OU$ in (40) by putting
\[ OU\big(f, \tau, t, \{t_i\}_{i \in i(\tau)}, x\big) := \mathbb{E}_x f\big(Y_{j(1)}, Y_{j(2)}, \dots, Y_{j(n)}\big). \]
Moreover all the quantities above are finite.
Proof (Sketch) Using Proposition 20, Proposition 1, (35) and Lebesgue's monotone
convergence theorem one may prove that (45) is valid for $f \equiv C$, $C > 0$. Using
standard methods, we may drop the positivity assumption in (35) and (42). Therefore,
by the Stone–Weierstrass theorem, linearity and Lebesgue's dominated convergence
theorem, (45) is valid for any $f \in C_c(\mathbb{R}^{nd})$. Let now $f \in \mathcal{P}(\mathbb{R}^{nd})$, $f \ge 0$. We
notice that for any $\tau \in T_n$ the expression $OU(f, \tau, t, \{t_i\}_{i \in i(\tau)}, x)$ is finite, which
follows easily from Proposition 21. Further, one can find a sequence $\{f_k\}$ such that
$f_k \in C_c(\mathbb{R}^{nd})$, $f_k \ge 0$ and $f_k \nearrow f$ (pointwise). Appealing to Lebesgue's monotone
convergence theorem yields that (45) still holds (and both sides are finite). To conclude,
once more we remove the positivity condition.
As a simple corollary we obtain
Corollary 23 Let $\{X_t\}_{t \ge 0}$ be the OU branching system, then for any $n \ge 1$ there
exists $C_n$ such that
Proof We apply the above proposition with $f = 1$. Using the definition (41) and the
inequality $|i(\tau)| \le n - 1$ for $\tau \in T_n$, it is easy to check that for any $\tau \in T_n$ we have
S(, t, x ) C enpt , for a certain constant depending only on and p, .
Let us recall the notation of (39). The following proposition will be crucial in
proving moment estimates for $V$- and $U$-statistics.
Proposition 24 For any $n > 0$ there exist $C, c > 0$, such that for any $\tau \in T_n$, any
split times $\{t_i\}_{i \in i(\tau)}$ and any canonical $f \in \mathcal{P}(\mathbb{R}^{nd})$ we have
exp c x ! exp
Proof Let $k \le n$. Without loss of generality we may assume that $I := \{1, 2, \dots, k\}$
are single numbers (i.e., $j(i) \in s(\tau)$) and $\{k + 1, \dots, n\}$ are multiple ones. Let
us also assume for a moment that for $i \in \{1, 2, \dots, k\}$ we have $t_{p(j(i))} \ge 1$.
Let $Z_i$ and $G_i$ be as in Proposition 21. We have $\mathbb{E} f(Y_{j(1)}, Y_{j(2)}, \dots, Y_{j(n)}) =
\mathbb{E} f(Z_{j(1)}, Z_{j(2)}, \dots, Z_{j(n)})$. For $i \le k$ we define
\[ \tilde{Z}_i := \sum_{l \in P(i)} ou(t_{p(l)} - t_l)\, G_l\, e^{-\mu(t_l - 1)} + ou(t_{p(i)} - 1)\, G_i. \]
A = E
n
(1) i=1 i f(Z j (1) + 1(G j (1) Z j (1)), . . . , Z j (k)
( 1,..., k ){0,1}k
+ k (G j (k) Z j (k)), Z j (k+1), . . . , Z j (n))
= E
f(y1, . . . , yk , Z j (k+1), . . . , Z j (n))dyk . . . dy1.
From now on, we restrict to the case d = 1. The proof for general d proceeds along
the same lines but it is notationally more cumbersome. Using Lemma 18 and applying
the Schwarz inequality multiple times we have
 A C f P
C f
/
/
/
exp Z j (i)!/
/
/
/
Note that by the definition of Z i we have Z i Gi = Hi e(tp(i)1) +
ou(t p(i) 1) 1 Gi , where Hi is independent of Gi and Hi N (xi , i2) with
i /2 and xi x . Thus, Z i Gi is a Gaussian variable with the mean
bounded by C xi etp(i) and the standard deviation of order etp(i) . In particular
Z i Gi l Cl exp(Cl x t p(i)). Since
exp {yi } dyi
eG j(i) + eZ j(i) Z j (i) G j (i),
the proof can be concluded by yet another application of the Schwarz inequality and
standard facts on exponential integrability of Gaussian variables.
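Finally, if some $i$'s do not fulfill $t_{p(i)} \ge 1$, we repeat the above proof with $s(\tau)$
replaced by the set $s'$ of indices from $s(\tau)$ for which additionally $t_{p(i)} \ge 1$. In this
way, we obtain (46) with $\sum_{i \in s'} t_{p(i)}$ instead of $\sum_{i \in s(\tau)} t_{p(i)}$. In our setting
\[ \sum_{i \in s(\tau)} t_{p(i)} - \sum_{i \in s'} t_{p(i)} = \sum_{i \in s(\tau) \setminus s'} t_{p(i)} \le n, \]
hence (46) still holds (with a worse constant $C$).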
Proof of Proposition 14 The sum (14) is finite, hence it is enough to prove our
claim for one $L(f, \gamma)$. Without loss of generality let us assume that $E_\gamma =
\{(1, 2), (3, 4), \dots, (2k-1, 2k)\}$ and $A_\gamma = \{2k + 1, \dots, n\}$ (we recall notation in Sect.
2.1). Using the same notation as in (13) we write
J (z2k+1, . . . , zn) :=
2(dz j,k )
H ( f )(u1, u2, . . . , un)
H ( f )(z1, z1, z2, z2, . . . , zk , zk , z2k+1 . . . , zn)
where D := R+Rd . We have L( f, ) = In2k ( J (x2k+1, . . . , xn)). By the properties
of the multiple stochastic integral [21, Theorem 7.26] we know that EL( f, )2
iI c
By Lemma 18, we have (in order to simplify the notation, we calculate for d = 1, the
general case is an easy modification)
f (yi )iI , (Yi (si ))iI c .
/
/
eyi dyi //
// iI c
/
(1) iI i f
Yi (si 1) + i (Yi Yi (si 1))
iI
iI c
For any $x, y \in \mathbb{R}$ we have $\max(\exp(x), \exp(y)) \le \exp(x) + \exp(y)$. Therefore, by
the mean value theorem, we get
+ EYi (si 1) Yi  exp(Yi ).
Using the Schwarz inequality and performing easy calculations, we get
E exp(Yi (si 1)) exp(Yi )
F (z) = H ( f )(z11, z11, z21, z21, . . . , zk1, zk1, z2k+1 . . . , zn)
H ( f )(z12, z12, z22, z22, . . . , zk2, zk2, z2k+1 . . . , zn)
()
Now, using (48) in combination with the Fubini theorem, the definition of the
measures $\mu_i$ (given in Sect. 3.1) and our assumption $\lambda_p < 2\mu$, we get
I1,I2{1,...,k} I3{2k+1,...,n}
A(I1, I2, I3).
A(I1, I2, I3)
To conclude the proof, we use the fact that $f \mapsto L(f, \gamma)$ is linear and $\|\cdot\|_{\mathcal{P}}$ is a
norm.
Our next goal is to prove Proposition 15, which is the last remaining ingredient used
in the proof of Theorem 3. This is where we will use for the first time the bookkeeping
of trees technique introduced in Sect. 6.3. We will proceed in three steps. First we will
obtain $L_2$ bounds on $V$-statistics with deterministic normalization (Proposition 25),
then we will pass to an $L_1$ bound on $U$-statistics with random normalization, restricted
to the subset of the probability space where $|X_t|$ is large (Corollary 26). Finally, we
will obtain bounds on the distance between the distributions of two $U$-statistics (with
random normalization) in terms of the distance in $\mathcal{P}$ of the generating kernels (proof
of Proposition 15).
Proposition 25 Let $\{X_t\}_{t \ge 0}$ be the OU branching particle system with $\lambda_p < 2\mu$.
There exist $C, c > 0$ such that for any canonical kernel $f \in \mathcal{P}(\mathbb{R}^{nd})$ we have
Proof We need to estimate
i1,i2,...i2n=1
f (Xt (i1), . . . , Xt (in)) f (Xt (in+1), . . . , Xt (i2n)).
f 2 exp(c x ) exp ((n 1) + P1( )/2 + P3( ))pt
P
f 2 exp(c x )e(n1)pt
P
etp(i)
f 2 exp(c x ) exp ((n 1) + P1( )/2 + P3( ))pt .
P
Note that $|P_1(\tau)| + 2|P_2(\tau)| = |s(\tau)|$ and $\sum_{i=1}^3 |P_i(\tau)| = |i(\tau)| = |l(\tau)| - 1$.
Thus, $|P_1(\tau)| + 2|P_3(\tau)| = 2|l(\tau)| - 2 - |s(\tau)| = |l(\tau)| + |m(\tau)| - 2 \le 2n - 2$
(recall that $m(\tau)$ denotes the set of leaves with multiple labels). This ends the proof.
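The counting step at the end of this proof can be spelled out as follows. The derivation below uses only the two identities stated in the text together with $|l(\tau)| = |s(\tau)| + |m(\tau)|$; the final inequality relies on the elementary observation, added here as an assumption about the setting, that the tree belongs to $T_{2n}$, so its labels partition $\{1, \dots, 2n\}$ and every non-singleton label contains at least two elements.

```latex
\begin{align*}
|P_1(\tau)| + 2|P_3(\tau)|
  &= 2\bigl(|P_1(\tau)| + |P_2(\tau)| + |P_3(\tau)|\bigr)
     - \bigl(|P_1(\tau)| + 2|P_2(\tau)|\bigr) \\
  &= 2\bigl(|l(\tau)| - 1\bigr) - |s(\tau)| \\
  &= |l(\tau)| + |m(\tau)| - 2 \\
  &= |s(\tau)| + 2|m(\tau)| - 2 \\
  &\le \sum_{l \in l(\tau)} |\mathrm{lab}(l)| - 2 = 2n - 2 .
\end{align*}
```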
The next corollary is a technical step toward the proof of Proposition 15. Since
we would like to normalize the $U$-statistic by the random quantity $|X_t|^{n/2}$, we need
to restrict the range of integration in the moment bound to the set on which $|X_t|$ is
relatively large. It will not be an obstacle in the proof of Proposition 15, since the
probability that $|X_t|$ is small will be negligible (on the set of non-extinction), which
will allow us to pass from restricted $L_1$ estimates to bounds on the distance between
distributions.
Corollary 26 Let $\{X_t\}_{t \ge 0}$ be the OU branching system with $\lambda_p < 2\mu$. There exist
constants $C, c > 0$ such that for any canonical $f \in \mathcal{P}(\mathbb{R}^{nd})$ and $r \in (0, 1)$ we have
\[ \mathbb{E}_x \Big( \big| |X_t|^{-n/2} U_t^n(f) \big| \, 1_{\{|X_t| \ge r e^{\lambda_p t}\}} \Big) \le C \exp\big(c|x|\big)\, r^{-n/2}\, \|f(2n\,\cdot)\|_{\mathcal{P}}. \]
Proof Let $\mathcal{J}$ be the collection of partitions of $\{1, \dots, n\}$, i.e., of all sets $J =
\{J_1, \dots, J_k\}$, where the $J_i$'s are nonempty, pairwise disjoint and $\bigcup_i J_i = \{1, \dots, n\}$.
Using (27) and the notation introduced there, we have
n 2I  k I  = I c.
Xt n/2Utn( f ) = Xt n/2
J J
where $a_J$ are some integers depending only on the partition $J$. Since the cardinality
of $\mathcal{J}$ depends only on $n$, it is enough to show that for each $J \in \mathcal{J}$ and some constants
$C, c > 0$ we have
where I c := {1, . . . , k} \ I . Let us notice that n 2k l, so n k k l I ,
which gives
We consider a single term of (50). We have
Ex Xt I n/2VtI c(I c f J )1 Xt rept ! r n/2+I 
Ex ept (n2I )/2VtI c(I c f J ) r n/2+I 
for I = {1, . . . , k}, where in the third inequality we used Proposition 25. One can check
that for any n 2 there exists C > 0 such that for any I, J we have I f J P (RId )
C f J P (RJd ) and f J P (RJd ) f (2n) P (Rnd ). Therefore, it remains to bound
the contribution from $I = \{1, \dots, k\}$ (in the case $l = 0$). But in this case $I^c = \emptyset$, so
VtI (I c f J ) = I c f J  =  k , f J  C f J P and exp( pt (n 2I )) 1,
which easily gives the desired estimate.
Let h(x ) := f1(2nx ) f2(2nx ), take r :=
Corollary 26 we get
On the other hand,
2 g
P E x t c Xt ept < r ! .
Since on $Ext^c$ we have $|X_t| \ge 1$ and $|X_t| e^{-\lambda_p t}$ converges to an absolutely
continuous random variable,
which ends the proof.
As the proofs in this section follow closely the line of those in the subcritical case, we
present only outlines, emphasizing differences.
Proposition 27 Let $\{X_t\}_{t \ge 0}$ be the OU branching particle system with $\lambda_p = 2\mu$.
There exist $C, c > 0$ such that for any canonical kernel $f \in \mathcal{P}(\mathbb{R}^{nd})$ and $t > 1$ we
have
Proof We will use similar ideas as in the proof of Proposition 25 as well as the notation
introduced therein. Consider any $\tau \in T_{2n}$. By the definition of $S(\tau, t, x)$, Proposition
24 and the assumption $\lambda_p = 2\mu$, we obtain
C2 f 2 exp(c x )t n exp ((n 1) +  P1( )/2 +  P3( )) pt
P
where we used the fact that $|P_2(\tau)| \le n$ and the estimate $|P_1(\tau)| + 2|P_3(\tau)| \le 2n - 2$
obtained in the proof of Proposition 25.
Now we can repeat the proof of Corollary 26 using Proposition 27 instead of
Proposition 25 and obtain the following corollary, whose role is analogous to the one
played by Corollary 26 in the slow branching case.
Corollary 28 Let $\{X_t\}_{t \ge 0}$ be the OU branching system with $\lambda_p = 2\mu$. There exist
constants $C, c$ such that for any canonical $f \in \mathcal{P}(\mathbb{R}^{nd})$ and $r \in (0, 1)$ we have for
$t \ge 1$,
The proofs in this section diverge slightly from those in the critical and subcritical
cases, and hence, we present more details.
Proposition 29 Let $\{X_t\}_{t \ge 0}$ be the OU branching particle system with $\lambda_p > 2\mu$.
There exist $C, c > 0$ such that for any canonical kernel $f \in \mathcal{P}(\mathbb{R}^{nd})$ we have
Proof As in the previous cases, consider any $\tau \in T_{2n}$. We use the same notation as in
the proof of Proposition 25. We have
e2n(p)t S(, t, x ) C1e2n(p)t+pt
Thus, it is enough to prove that
2n( p )t + pt +  P1( )( p )t
where for simplicity we write $P_i$ instead of $P_i(\tau)$ (in the rest of the proof we will use the
same convention with other characteristics of $\tau$). Using the equality $|s| = |P_1| + 2|P_2|$,
we may rewrite (52) as
p + s( p )  P2 p +  P3 p 2n( p ),
2 + s 2 P2 + 2 P3 2n.
But  P3 = i   P2  P1 and so
which ends the proof.
Ex
Proof of Proposition 17 Using the notation from the proof of Corollary 26, we get
en(p)t+pI t VtI c(I c f J )
7 Remarks on the Nondegenerate Case
As in the case of $U$-statistics of i.i.d. random variables, by combining the results for
completely degenerate $U$-statistics with the Hoeffding decomposition, we can obtain
limit theorems for general $U$-statistics, with a normalization which depends on the
order of degeneracy of the kernel. For instance, in the slow branching case, Theorem 3,
the Hoeffding decomposition and the fact that $\Pi_k : \mathcal{P}(\mathbb{R}^{nd}) \to \mathcal{P}(\mathbb{R}^{kd})$ is continuous
give the following
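Corollary 30 Let $\{X_t\}_{t \ge 0}$ be the OU branching system starting from $x \in \mathbb{R}^d$. Assume
that $\lambda_p < 2\mu$ and let $f \in \mathcal{P}(\mathbb{R}^{nd})$ be symmetric and degenerate of order $k - 1$.
Then conditionally on $Ext^c$, $|X_t|^{-(n - k/2)} U_t^n(f - \langle f, \varphi^{\otimes n} \rangle)$ converges in distribution
to $\binom{n}{k} L_1(\Pi_k f)$.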
Similar results can be derived in the remaining two cases. Using the fact that on the
set of non-extinction $|X_t|$ grows exponentially in $t$, we obtain
Corollary 31 Let $\{X_t\}_{t \ge 0}$ be the OU branching system starting from $x \in \mathbb{R}^d$. Assume
that $\lambda_p = 2\mu$ and let $f \in \mathcal{P}(\mathbb{R}^{nd})$ be symmetric and degenerate of order $k - 1$. Then
conditionally on $Ext^c$, $t^{-k/2} |X_t|^{-(n - k/2)} U_t^n(f - \langle f, \varphi^{\otimes n} \rangle)$ converges in distribution
to $\binom{n}{k} L_2(\Pi_k f)$.
Similarly, using (8) and the definition of $W$ we obtain
Corollary 32 Let $\{X_t\}_{t \ge 0}$ be the OU branching system starting from $x \in \mathbb{R}^d$. Assume
that $\lambda_p > 2\mu$ and let $f \in \mathcal{P}(\mathbb{R}^{nd})$ be symmetric and degenerate of order $k - 1$. Then
conditionally on $Ext^c$, $\exp(-(\lambda_p n - k\mu)t)\, U_t^n(f - \langle f, \varphi^{\otimes n} \rangle)$ converges in probability
to $\binom{n}{k} W^{n-k} L_3(\Pi_k f)$.
Since in all the corollaries above the normalization is strictly smaller than $|X_t|^n$,
they in particular imply that $|X_t|^{-n} U_t^n(f - \langle f, \varphi^{\otimes n} \rangle) \to 0$ in probability, which
proves the second part of Theorem 9 (as announced in Sect. 5.3).
Acknowledgments Research of R.A. was partially supported by the MNiSW grant N N201 397437 and
by the Foundation for Polish Science. Research of P.M. was partially supported by the MNiSW grant
N N201 397537. The authors would like to thank the referees for their constructive remarks.
Open Access This article is distributed under the terms of the Creative Commons Attribution License
which permits any use, distribution, and reproduction in any medium, provided the original author(s) and
the source are credited.