#### Merging meets matching in MC@NLO

Rikkert Frederix^2 and Stefano Frixione^{0,1}

E-mail:

^0 PH Department, TH Unit, CERN, CH-1211 Geneva 23, Switzerland
^1 ITPP, EPFL, CH-1015 Lausanne, Switzerland
^2 Institut für Theoretische Physik, Universität Zürich, Winterthurerstrasse 190, CH-8057 Zürich, Switzerland
The next-to-leading order accuracy for MC@NLO results exclusive in J light jets is achieved if the computation is based on matrix elements that feature J and J + 1 QCD partons. The simultaneous prediction of observables which are exclusive in different light-jet multiplicities cannot simply be obtained by summing the above results over the relevant range in J ; rather, a suitable merging procedure must be defined. We address the problem of such a merging, and propose a solution that can be easily incorporated into existing MC@NLO implementations. We use the automated aMC@NLO framework to illustrate how the method works in practice, by considering the production at the 8 TeV LHC of a Standard Model Higgs in association with up to J = 2 jets, and of an e⁺e⁻ pair or a tt̄ pair in association with up to J = 1 jet.
Contents

1 Introduction and definitions
2 Merging at the NLO
 2.1 Outline of the procedure
 2.2 Technicalities
  2.2.1 Merging 0- and 1-parton samples
  2.2.2 The general case
  2.2.3 Sudakov reweighting
3 Results
 3.1 Standard Model Higgs production
Introduction and definitions
Let us consider the process
P1 + P2 → S + M jets ,   (1.1)
where P1 and P2 can be either hadrons or leptons, and in order to simplify the discussion we
assume that the final state is defined at the parton level (that is, before the hadronization
phase in an event generator). S is a set of particles which does not contain any QCD
massless partons, and the M jets are light, i.e., obtained by clustering light quarks and
gluons. If the definition of a given observable O explicitly involves J jets, with 0 J M
(with or without a further dependence on the four-momenta of the particles in S), we refer
to such an observable as exclusive in J jets, and inclusive in the remaining M J jets;
when J = 0, the observable is typically called fully inclusive.
In Monte Carlos (MCs) based on the leading-order (LO) approximation, the description
of the process in eq. (1.1) stems from the underlying tree-level matrix elements relevant to
the process
a1 + a2 → S + i partons ,   (1.2)
with a1 and a2 partons or leptons (according to the identities of Pk), and i a fixed number
with 0 ≤ i ≤ M ; the shower must generate at least M − i partons, in order to obtain a final
state that can contribute to eq. (1.1). The most straightforward approach is that of using
matrix elements with i = 0 (which, for later use, we suppose to be of O(α_S^b)), and thus
of letting the shower generate all partons. The apparent simplicity of this procedure has
the drawback that cross sections are severely underestimated in the hard regions (i.e., for
energetic and well-separated jets). This problem, which obviously worsens when increasing
the c.m. energy, stems from the fact that showers give only a leading-logarithmic (LL)
approximation to the full matrix elements that would be the correct description in those
phase-space regions. As a result, one may say that with i = 0 one obtains an LL accuracy
for observables exclusive in J jets, with J ≥ 1; for fully-inclusive observables, the accuracy
is LO + LL. In order to partly remedy the lack of hard emissions in Monte Carlos, the
shower scale (which is directly related to the largest hardness attainable) can be increased
in the latter to very large values. While this is formally correct, it entails the stretching of
Monte Carlo simulations outside their range of validity, and thus a loss of predictive power.
If one is interested in J -jet exclusive observables, for J given and fixed, it is best to
use matrix elements with i = J , which promotes such observables to LO + LL accuracy.
This has the disadvantage that the event samples generated in this way cannot be used
for observables exclusive in less than J jets, since in doing so one would be forced to
integrate matrix elements over soft and collinear regions, where they diverge. There are
technical implications as well. Firstly, the matrix-element divergences must be avoided by
introducing (unphysical) cutoffs, chosen so as not to bias physical predictions. Secondly,
one requires that the largest jet hardness generated in the shower phase be less than that at
the matrix-element level, which can be typically done with a suitable choice of the shower
scale. This requirement is not necessary in order to avoid double counting (which cannot
occur in this context: since each shower emission is associated with one power of α_S, after
a single emission one has already a contribution of relative O(α_S^{i+1})), but rather in order to
maintain the expected LO + LL accuracy of the J -jet exclusive observables. The two issues
discussed here are specific manifestations of the general problem of the matching between
matrix-element calculations and parton shower simulations.
A first improvement on the situation presented in the preceding paragraph aims at
the LO + LL-accurate description of fully inclusive and 1-jet exclusive observables, in the
context of a single simulation; the corresponding techniques are generally known as matrix
element corrections (MECs), although this terminology has been lately adopted to identify
other, and more sophisticated, approaches. MECs rely on the computations of i = 0
and i = 1 matrix elements, which are combined by either switching between the two
descriptions they underpin [1, 2], or by directly including the information on i = 1 in
the showers, which is still initiated starting from an i = 0 configuration [3–5]. These
consistent combinations of matrix elements characterized by different multiplicities are the
first examples of merging procedures.1
Extensions of MECs to arbitrarily-large multiplicities have attracted considerable
interest, in view of their relevance to Tevatron and LHC phenomenology, where multi-jet
processes are ubiquitous. There are now well-established procedures, such as CKKW [6, 7],
CKKW-L [8, 9] (and their later improvements, see refs. [10–12]), and MLM [13]. These
merging approaches are based on the computations of all tree-level matrix elements (which
1 We point out that in ref. [5] the approach of refs. [1, 2] is called matching, and that of refs. [3–5]
merging. This naming convention becomes a source of confusion when one considers the problem of NLO
corrections, and hence is not adopted here.
is why they can be referred to as attaining an LO merging) with 0 ≤ i ≤ N , where N is
as large as a computer can handle; for J ≤ N the J -jet exclusive observables are LO + LL
accurate. Since LO-merging techniques use tree-level results, cutoffs are necessary that
prevent infrared (IR) divergences from occurring. The avoidance of biases in physical results
is much more difficult to achieve than in the case of the two parton multiplicities relevant
to MECs, and the interplay between the merging and matching conditions is in fact
non-trivial. The matching can be expressed in terms of the IR cutoffs, which can be collectively
(and slightly improperly) called matching scale.2 The dependence of observables on the
matching scale never completely vanishes, but in practice it is sufficiently small numerically
(consistently with the proof [6] that for CKKW in e⁺e⁻ collisions it is suppressed by two
logarithmic powers w.r.t. the nominal accuracy of the shower). We point out that such
a dependence is the unavoidable consequence of the fact that a cutoff-free cross section
at relative O(α_S^N) can only be achieved by a full N^NLO computation (i.e., which includes
loop corrections). Clever matching conditions reduce the cutoff dependence, but cannot
eliminate it.
In parallel to the development of LO-merging procedures, the problem was considered
of matching next-to-leading order (NLO) QCD computations with parton showers. This
spurred quite a lot of theoretical activity [14–26], but currently the only proposals that are
systematically applied to hadroproduction processes are MC@NLO [15] and POWHEG [18]
(possibly in their SHERPA implementations, refs. [26] and [25] respectively). The problem
is the analogue of the LO matching for J -jet exclusive observables, with J fixed,
discussed above. The novelty is that the underlying matrix elements are of O(α_S^{b+J}) and of
O(α_S^{b+J+1}). The former are the same as those relevant to the LO matching, but play the
role of Born contributions in this context; the latter include the real corrections, their IR
subtraction terms, and the one-loop contributions. Although real matrix elements have
parton multiplicities one unit larger than the other matrix elements involved, which is also
what happens in MECs, the problem at hand is a matching, and not a merging, one. In
fact, its aim is not that of enlarging the number of jet multiplicities which can be
predicted to LO + LL accuracy, but rather that of promoting J -jet exclusive observables from
LO + LL to NLO + LL accuracy; the fact that (J + 1)-jet exclusive observables are indeed
predicted at LO + LL is beside the point, since this fact is a spinoff of the method, rather
than its primary motivation. Technically, what happens is that one set of IR divergences
(those arising from one parton becoming soft, or two partons becoming collinear) are
cancelled without the need of introducing a cutoff. Apart from the case of zero jet at the
Born level (J = 0), cutoffs are still required in order to prevent the cross sections from
diverging because of multi-parton IR configurations (e.g., two partons becoming soft, or
three partons becoming collinear); however, their impact on fully-showered events is much
reduced w.r.t. the analogous situation at the LO (loosely speaking, by a factor of α_S;
see e.g. ref. [27] for a discussion on this point).
The scope of LO-merging procedures is larger than that of NLO matching, since in the
latter the relevance of matrix-element information is limited to a couple of jet multiplicities
(of which only one is truly NLO, as mentioned above). Still, NLO-matched approaches
2 It is in fact a matching and merging scale. We stick to the standard notation.
must be the method of choice whenever possible, because they are capable of giving
predictions affected by theoretical uncertainties (such as cutoff biases and scale dependences)
significantly smaller than at the LO. It is therefore clear that the situation can be improved
by combining the two strategies, in order to merge consistently event samples which are
individually matched with parton showers to NLO accuracy (this, we call for brevity an
NLO-merging approach). Merging at NLO requires one to tackle two issues. Firstly, there
is the purely theoretical problem of devising an acceptable solution, which has stimulated
much work lately [24, 28–33].3 Secondly, in order for the above solution not to remain an
academic achievement, it must be feasible to compute all the matrix elements involved in
the relevant NLO cross sections (i.e., up to relatively large multiplicities). This is indeed the
case thanks to the extremely high level of automation achieved in the past few years for the
one-loop [34–44] and the subtracted real [45–50] contributions, both of which have greatly
benefitted from the thorough understanding of tree-level-amplitude calculations [51–61].
The goal of this paper is that of proposing a prescription for an NLO merging built
upon the matching achieved according to the MC@NLO formalism [15], and which requires
only very minimal modifications to the latter. This allows us to easily implement such a
merging scheme into the existing aMC@NLO framework (see e.g. refs. [27, 62] for recent
applications), and to test it by considering, at the 8 TeV LHC, the production of S = H
(Standard Model Higgs), S = e⁺e⁻, and S = tt̄, for observables exclusive in up to one
or two jets at the NLO. The paper is organized as follows: in section 2 we describe our
approach; in section 3 we present sample predictions; finally, in section 4 we draw our
conclusions.
Merging at the NLO
Outline of the procedure
Let us consider the process in eq. (1.2), and interpret it as the O(α_S^{b+i}) Born contribution
to a (J = i)-jet exclusive cross section; hence, all final-state partons are hard and well
separated. The total transverse momentum of such a configuration, which we denote by pT^{Born}, vanishes because of momentum conservation: pT^{Born} = 0.
The O(α_S^{b+i+1}) real corrections feature (S + i + 1)-body final states, which define the
H-event configurations in MC@NLO. For such final states, pT^{Born} is different from zero;4
this property will be exploited in what follows as an intuitive way to measure the extra
3We expect, in particular, that refs. [32, 33] may have features in common with the present paper, since
use is made there of the MC@NLO formalism, as it is also advocated here. A detailed comparison between
the two approaches is beyond the scope of this work, not least because they differ by higher orders already
at the un-merged level [15, 26].
4 pT^{Born} is not an observable, but can be defined operatively in both fixed-order computations and MC
simulations. However, we shall not need to do that here.
radiation w.r.t. a given (Born) kinematic configuration.5 The (i + 1)th parton can be
arbitrarily soft or collinear to any other parton (which implies a small pT^{Born}), but also
hard and well separated (whence a large pT^{Born}). All the other O(α_S^{b+i+1}) contributions to
the cross section have an (S + i)-body kinematics, identical to that of the Born; these are
the S-event configurations in MC@NLO.
After processing hard events with parton showers, one will obtain configurations quite
different from those of the S and H events; in particular, final-state multiplicities will have
greatly increased. However, these differences may be irrelevant to physics observables,
which may be almost identical, in shape and normalization, to those resulting from an
NLO parton-level computation.6 For this not to be the case, two conditions must be
fulfilled. Firstly, the observable must be IR-sensitive (i.e., large logarithms can appear
in the coefficients of its perturbative expansion). Secondly, one must be in an IR
phase-space region, where partons are soft and/or collinear (which causes those logarithms to
grow large); this corresponds to having a small pT^{Born}. When this happens, the shape
of the observable is determined by the MC (large logarithms are resummed), while its
normalization is still dictated by the underlying NLO matrix elements (thanks to the
unitarity property of the shower). This implies, in particular, that the value of pT^{Born} of the
configuration emerging from the shower can be markedly different from that relevant to
the H event from where the shower started (which is trivially true for S events, since they
have pT^{Born} = 0). On average, one can say that in the IR regions S and H events provide
the normalization, while the kinematics is controlled by the MC.
Let us now consider the hard regions, where pT^{Born} is large. S events do not contribute
there, since in order to do so the shower would have to provide all the extra radiation
leading to a large pT^{Born} (which is still possible, but at the price of choosing unjustifiably large
shower scales). On the other hand, H events do contribute; more specifically, the values of
pT^{Born} before and after the shower do not differ significantly. Thus, on average, in the hard
regions H events provide one with both the normalization and the kinematic configurations.
Finally, it should be stressed that the characteristics of the S and H events discussed here are
quite directly related to the fact that MC@NLO is designed to perturb in a minimal manner
both the MC and the matrix-element results (in particular, there are no contributions of
relative O(α_S^2) which are not of MC origin).
The above observations underpin the proposal for the NLO-merging strategy that we
sketch here.
1. For any given Born multiplicity, except for the largest one considered, there must not be contributions to the hard regions. This implies, in particular, that in such regions real emissions must not occur, and the corresponding matrix elements must rather be viewed as defining the Born process for the next (i.e., one unit larger) multiplicity.
2. Suitable choices of veto scales in showers must be made for consistency with item 1.
5Other variables can be devised so as to distinguish Born from real-emission configurations; they are all
equivalent for the sake of the present discussion.
6 In some cases, hadronization effects can blur this picture, and it is therefore convenient to consider them
switched off in MCs for the time being.
3. Further conditions can be imposed which are similar to those used in LO-merging
procedures; for a given multiplicity, the combination of H- and S-event contributions
plays essentially the same role as a tree-level matrix element in LO mergings.
A few comments are in order here. The definition of a hard region is realised by
introducing a (generalised) matching scale, analogously to what is done in LO mergings. Item 1
is achieved by cuts at the matrix-element level, defined by means of the matching scale.
Since MC@NLO and NLO parton-level predictions are quite similar in the hard regions,
by getting rid of the latter, one also eliminates the former without the need of any extra
conditions at the MC level. However, the hard region defined here may still be populated
by the Monte Carlo (whose choice of shower scales is made a priori, and independently of
any merging procedure) when showering S events, hence item 2. Note that the relevant
scale choices there are easily worked out by taking into account what is done in item 1,
owing to the interplay between H and S events.
As far as item 3 is concerned, let us suppose that, for a given (S + i)-body Born
kinematics, MC@NLO were used to obtain only i-jet exclusive observables. This is effectively
as if the (i + 1)th parton were never resolved, and always integrated over. It thus suggests
that one formally treat the combination of S and H events as equivalent to (S + i)-body tree-level
matrix elements in an LO merging; this implies, among other things, the reweighting
of these events by suitable combinations of Sudakov factors (or equivalent suppressions,
as in the MLM procedure). By construction, such factors must then be obtained using
(S + i)-body configurations (possibly effective), also in the case of H events. One readily
observes that Sudakov reweightings are used in an LO merging (together with conditions
on showers) in order to prevent different multiplicities from double counting, which, in the
present context, is supposed to be guaranteed by items 1 and 2. This is correct; the idea is
indeed that of employing reweighting factors which contribute at a relative O(α_S^2) (hence,
beyond NLO), so that the perturbative expansions of the cross sections that define S and
H events are identical to the original ones up to O(α_S^{b+i+1}). This is a condition whose
application guarantees that the accuracy of the MC@NLO calculation is not spoiled, and
which results in the insertion into the MC@NLO short-distance cross sections of an extra
relative O(α_S) term, that we shall call d_i(∆). It also implies that Sudakov reweightings are
totally general, and do not constrain the type of observables that one predicts starting from
a given multiplicity; it should be clear that the assumption made at the beginning of this
paragraph has the sole role of simplifying the picture. Still, it will remain true (essentially
because of item 1) that i-jet exclusive observables will receive the dominant contributions
from MC@NLO samples associated with an underlying (S + i)-body Born kinematics. This
is directly related to the fact that, when Sudakov reweightings are applied, the term d_i(∆)
is numerically small; in fact, as we shall explicitly show later, the procedures in items 1
and 2 are sufficient to obtain smooth results for most observables.
The contributions to the process in eq. (1.2) will be denoted by T_i and V_i: T_i are the
O(α_S^{b+i}) tree-level matrix elements, and V_i is the O(α_S^{b+i+1}) sum of the
finite part of the one-loop amplitude times that of the Born, plus the finite remainders of
the soft and collinear subtractions. The MC@NLO cross section for what we shall call the
i-parton sample can thus be written schematically as follows:

dσ_{S,i} = T_i + V_i − T_i K + T_i K^{MC} ,   (2.4)
dσ_{H,i} = T_{i+1} − T_i K^{MC} ,   (2.5)

where K and K^{MC} indicate symbolically the kernels relevant to the NLO and MC
subtractions respectively. In order to simplify the notation, in eqs. (2.4) and (2.5) we have
understood the interface-to-MCs IMC of ref. [15] (in turn equivalent to the generating
functionals F M(kC) of ref. [63]), since they will not play any role in what follows. In fact,
all manipulations of the short-distance cross sections that will be carried out below are
relevant to hard events, i.e. at the parton level and before the shower phase (with one
exception, discussed in section 2.2.3).
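Taking eqs. (2.4) and (2.5) schematically as dσ_{S,i} = T_i + V_i − T_i K + T_i K^{MC} and dσ_{H,i} = T_{i+1} − T_i K^{MC}, the K^{MC}-dependent pieces cancel in the sum of S and H events, so that the fully-inclusive rate is the pure-NLO one irrespective of the MC subtraction kernel. A toy numerical check of this cancellation, with scalar stand-ins for the symbolic quantities (function names and values are ours, purely illustrative):

```python
def sigma_total(T_i, V_i, T_ip1, K, K_MC):
    """Toy, scalar version of the schematic MC@NLO split into S and H events:
    the MC subtraction kernel K_MC cancels in the sum, leaving the NLO rate."""
    d_sigma_S = T_i + V_i - T_i * K + T_i * K_MC   # S events
    d_sigma_H = T_ip1 - T_i * K_MC                 # H events
    return d_sigma_S + d_sigma_H                   # = T_i + V_i - T_i*K + T_ip1
```

Varying K_MC leaves sigma_total unchanged; this is the unitarity property of the shower invoked above for the normalization of IR-dominated observables.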
For our merging scheme we introduce a function:

D(d; μ1, μ2) ,  with  D = 1 for d ≤ μ1 ,  D = 0 for d ≥ μ2 ,  and D monotonic for μ1 < d < μ2 .   (2.6)

Although one may want to choose a smooth function D for numerical reasons, it is more
transparent from the physics viewpoint to adopt a sharp version:

D(d; μ1, μ2) = θ(Q − d) ,   Q ≡ μ1 = μ2 ,   (2.7)

which is a particular case of eq. (2.6). In eq. (2.7), the identification of Q as the matching
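To make the two regimes concrete, the D function of eq. (2.6) and its sharp limit, eq. (2.7), can be sketched as follows; the cubic-smoothstep interpolation between μ1 and μ2 is our own illustrative choice, since eq. (2.6) only fixes the endpoint values and monotonicity:

```python
def D_sharp(d, Q):
    """Sharp merging function, eq. (2.7): D(d) = theta(Q - d)."""
    return 1.0 if d < Q else 0.0

def D_smooth(d, mu1, mu2):
    """A smooth D(d; mu1, mu2): equal to 1 for d <= mu1, to 0 for d >= mu2,
    and monotonically decreasing in between (the interpolation is arbitrary)."""
    if d <= mu1:
        return 1.0
    if d >= mu2:
        return 0.0
    x = (d - mu1) / (mu2 - mu1)
    return 1.0 - x * x * (3.0 - 2.0 * x)   # decreasing smoothstep
```

In the sharp case, μ1 = μ2 = Q, the intermediate region disappears altogether.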
scale is obvious. We shall also denote by d_j
the scale (with canonical dimension equal to one, i.e. mass) at which a given S + partons
configuration passes from being reconstructed as a j-jet one to being reconstructed as a
(j − 1)-jet one, according to a kT jet-finding algorithm [64] (in other words, there are j
jets of hardness d_j − ε, and (j − 1) jets of hardness d_j + ε, with ε arbitrarily small). For
example:

a1 + a2 → S + a3   ⟹   d1 = pT(a3) = pT(S) .   (2.9)
In general, for n final-state partons one will have

d_n ≤ d_{n−1} ≤ · · · ≤ d_2 ≤ d_1 .   (2.10)

It will also turn out to be convenient to define

d_j = √s ,   j ≤ 0 ,   (2.11)

with √s the parton c.m. energy, i.e. the largest energy scale available event-by-event.
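For the single-emission example of eq. (2.9), d1 is simply the transverse momentum of the parton; for more partons, the d_j are the merging scales of a kT clustering. The following is a minimal sketch for up to two partons, using longitudinally-invariant kT measures, with each parton given as (pT, rapidity, φ); it is our own toy stand-in for a full clustering code such as that of ref. [64]:

```python
import math

def kt_scales(partons, R=1.0):
    """Toy hardness scales d_j (cf. eqs. (2.9)-(2.10)) for one or two partons,
    each given as (pT, rapidity, phi). Returns [d1] or [d1, d2] with d1 >= d2.
    Illustrative sketch only, not a general jet-clustering implementation."""
    if len(partons) == 1:
        return [partons[0][0]]                 # d1 = pT(a3), cf. eq. (2.9)
    (pt1, y1, ph1), (pt2, y2, ph2) = partons
    dphi = abs(ph1 - ph2)
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    dR2 = (y1 - y2) ** 2 + dphi ** 2
    # scale at which the two partons stop being resolved as two separate jets
    d2 = math.sqrt(min(pt1 ** 2, pt2 ** 2) * dR2 / R ** 2)
    d2 = min(d2, pt1, pt2)   # a parton may instead become unresolved against the beam
    d1 = max(pt1, pt2)       # scale at which at least one jet is still resolved
    return [d1, min(d1, d2)]
```

The ordering d2 ≤ d1 of eq. (2.10) is enforced by construction: a soft or collinear second parton drives d2 towards zero while leaving d1 hard.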
Equations (2.3)–(2.5) imply that the i-parton MC@NLO sample gets contributions
from both the i- and the (i + 1)-parton tree-level matrix elements. This is what usually
happens in MC@NLO, when eq. (2.3) is used to compute the i-jet exclusive cross section
with NLO + LL accuracy. However, in the context where MC@NLO samples with
different multiplicities must be consistently merged, this implies that a given tree-level matrix
element T_i will contribute to both the i-parton sample (as Born contribution) and to the
(i − 1)-parton sample (as real correction). This fact is peculiar to the merging at the NLO
(since at the LO one imposes that T_i contributes solely to the i-jet exclusive cross section),
and can lead to problems of double-counting nature even before considering the matching
to showers. In order to avoid these problems, we introduce the following rule:

R.1 For any given (S + i)-body kinematic configuration (with i ≥ 1) at the matrix-element
level, the sum of the contributions due to T_i to the i- and (i − 1)-parton
samples must be equal to T_i (possibly times a factor smaller than one if this helps
prevent the reconstruction of a number of hard jets smaller than (i − 1)).
We now proceed to incorporate rule R.1 into the MC@NLO short-distance cross sections.
In order to be definite, let us consider the merging of the i-parton samples with

0 ≤ i ≤ N ,   (2.12)

that is, the largest final-state multiplicity will be N + 1 partons, relevant to the real
corrections to the N -parton sample. We then formally define modified MC@NLO formulae
in the following way:

dσ̄_i = dσ_i D(d_{i+1}) (1 − D(d_i)) θ(d_{i−1} − μ2) ,   i ≤ N − 1 ,   (2.13)
dσ̄_N = dσ_N (1 − D(d_N)) θ(d_{N−1} − μ2) ,   (2.14)

with d_i given in eq. (2.3). As will be discussed in what follows, these expressions are
incorrect; however, their intuitive meaning is easy to grasp. Thus, we discuss here their
physics contents, and refine them later. The D and θ functions in eq. (2.13) imply that,
out of the (i + 1) jets in the i-parton sample (i < N ), there are at least:

A. (i − 1) jets harder than μ2 (owing to θ(d_{i−1} − μ2));

B. one jet harder than μ1 (owing to (1 − D(d_i)));

C. one jet softer than μ2 (owing to D(d_{i+1})).
Note that one jet is degenerate (i.e., has zero four-momentum) in the case of S events, and
that condition C. does not apply to H events when i = N . We stress again that no shower
is involved yet, and the jets are thus defined at the matrix-element level. Furthermore,
according to the definition of d_i, it would be more appropriate to talk about a hardness
scale at which one resolves i or (i − 1) jets; in practice, a less precise language is acceptable,
since no confusion is possible here.
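With the sharp D of eq. (2.7), the suppression factor D(d_{i+1}) (1 − D(d_i)) θ(d_{i−1} − μ2) acting on the i-parton sample reduces to a product of θ functions, and items A–C can be checked numerically. A sketch (the list-based bookkeeping and function names are ours):

```python
def theta(x):
    return 1.0 if x > 0 else 0.0

def sample_weight(ds, i, Q, N):
    """Sharp-D suppression factor for the i-parton sample (0 <= i <= N):
    requires i jets harder than Q and, for i < N, the next jet softer than Q.
    ds[j-1] holds d_j; d_j for j <= 0 is effectively infinite, cf. eq. (2.11)."""
    D = lambda d: theta(Q - d)                  # sharp D, eq. (2.7)
    d = lambda j: float("inf") if j <= 0 else ds[j - 1]
    w = (1.0 - D(d(i))) * theta(d(i - 1) - Q)   # i jets harder than Q
    if i < N:
        w *= D(d(i + 1))                        # (i+1)-th jet softer than Q
    return w
```

For any ordered set of d_j, exactly one multiplicity sample takes the event, with unit weight, which is the sharp-D picture described in the text.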
It is easier to start analysing the implications of the previous formulae by considering
first a sharp D function, eq. (2.7). In such a case, items A–C imply that there are i jets
harder than Q, and one jet softer than Q. The idea is that the matrix element description
of the former i jets is adequate (and of NLO accuracy), while the latter one jet will be
heavily affected by MC showers. Consistently with this picture, one will not want the MC
to generate emissions harder than Q. When a generic and smooth D function is considered
instead (eq. (2.6)), the interpretation is basically the same, only slightly more involved. In
particular, the description of the (i − 1) jets harder than μ2 is still a fully matrix-element
one. In the intermediate-hardness region (μ1, μ2), item B implies the presence of an extra
jet, with probability given by 1 − D. This is more correctly interpreted as our confidence
in the correctness of a matrix-element description for such a jet, which is maximal (i.e.,
equal to one) for a hardness equal to μ2, and minimal (i.e., equal to zero) for a hardness
equal to μ1. This damping factor 1 − D is arbitrary, and must be compensated. This
will happen thanks to the contribution to the i-jet cross section due to the (i − 1)-parton
sample, the extra jet being generated in the intermediate region (μ1, μ2) by means of MC
radiation. This compensation is consistent with the idea that in the intermediate-hardness
region both the matrix-element and the MC descriptions are on equal footing. Finally, the
case of item C is analogous to that of item B, but specular. In particular, the matrix-element
description of the extra jet relevant here is turned off with probability D, so that
an MC description is generally dominant.
The latter point is not only motivated by physical arguments, but is actually necessary
in the context of a well-behaved NLO computation. In fact, while the last two factors on
the r.h.s. of eq. (2.13) limit the hardness of i jets from below, the factor D(d_{i+1}) limits that
of the (i + 1)th jet from above. It is clear that for such a jet (which generally corresponds
to the softest parton in the event) there must not be a lower bound on hardness, because
of infrared safety. On the other hand, the bound from above prevents one from having an
(i + 1)-jet configuration generated by the i-parton sample, since this would effectively be
an LO (rather than an NLO) prediction. The exception is of course that of the largest
parton multiplicity available to the calculation, simply because one cannot do better than
LO there; this is the reason for the special case considered in eq. (2.14).
We can now check the consistency of eqs. (2.13) and (2.14) with rule R.1. As can
be seen from eqs. (2.4) and (2.5), T_i enters the S events of the i-parton sample, and the H
events of the (i − 1)-parton sample. Explicitly, one obtains:

dσ_{S,i} = · · · + T_i (1 − D(d_i)) θ(d_{i−1} − μ2) ,   (2.15)
dσ_{H,i−1} = · · · + T_i D(d_i) (1 − D(d_{i−1})) θ(d_{i−2} − μ2) ,   (2.16)

where in eq. (2.15) we have exploited the fact that, for S events in the i-parton sample
(with i < N ), d_{i+1} = 0 and hence D(d_{i+1}) = 1 (when i = N , this D factor simply does not
appear in eq. (2.14)). By using the identity

θ(d_{i−2} − μ2) = θ(d_{i−1} − μ2) + θ(μ2 − d_{i−1}) θ(d_{i−2} − μ2) ,

which follows from d_{i−1} ≤ d_{i−2}, and the properties of the D function, eq. (2.16) can be rewritten as follows:

dσ_{H,i−1} = · · · + T_i D(d_i) θ(d_{i−1} − μ2)
  + T_i D(d_i) (1 − D(d_{i−1})) θ(μ2 − d_{i−1}) θ(d_{i−2} − μ2) .   (2.19)
This result indeed obeys rule R.1. In fact, the first term on the r.h.s. of eq. (2.19) is a
contribution to the hard region for (i − 1) jets; the hardness of the remaining one jet can be
either small (thus playing the role of an NLO correction to an (i − 1)-jet cross section), or
large (thus being a Born contribution to an i-jet cross section). In both cases, the use of a
matrix-element description is fully justified, and T_i appears with its proper weight (which
implies that there is no double counting of matrix-element origin in the combination of
dσ_{H,i−1} and dσ_{S,i}). On the other hand, in the second term on the r.h.s. of eq. (2.19) T_i is
multiplied by a non-trivial weight factor. However, that term corresponds to having only
(i − 2) jets in the hard region, while one of them is forced to be in the intermediate region
(owing to (1 − D(d_{i−1})) θ(μ2 − d_{i−1})), and another one to be either in the intermediate or
in the soft region (owing to D(d_i)). As was discussed before, in this situation one should
not expect the matrix elements to give the only correct description, and it therefore appears
desirable that T_i be multiplied by a number smaller than one.
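The weight recombination just described can be verified numerically in the sharp-D limit: the coefficient of T_i coming from the S events of the i-parton sample, (1 − D(d_i)) θ(d_{i−1} − μ2) of eq. (2.15), and the one coming from the H events of the (i − 1)-parton sample, D(d_i) (1 − D(d_{i−1})) θ(d_{i−2} − μ2) of eq. (2.16), add up to θ(d_{i−1} − Q), i.e. T_i enters with unit weight whenever (i − 1) jets are hard. A sketch (function names are ours):

```python
def theta(x):
    return 1.0 if x > 0 else 0.0

def ti_coeff_sum(d_im2, d_im1, d_i, Q):
    """Sum of the coefficients of T_i in eqs. (2.15) and (2.16), for the
    sharp D(d) = theta(Q - d) and ordered scales d_im2 >= d_im1 >= d_i."""
    D = lambda d: theta(Q - d)
    c_S = (1.0 - D(d_i)) * theta(d_im1 - Q)              # i-parton S events
    c_H = D(d_i) * (1.0 - D(d_im1)) * theta(d_im2 - Q)   # (i-1)-parton H events
    return c_S + c_H
```

For ordered inputs the result equals θ(d_{i−1} − Q), confirming that the second term on the r.h.s. of eq. (2.19) vanishes identically for a sharp D.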
Ultimately, effects such as that giving rise to the second term on the r.h.s. of eq. (2.19)
can be ascribed to the systematics of the merging scheme. This can be checked e.g. by
changing the values of μ1 and μ2, and the functional form of D in the intermediate region.
For example, one can see immediately that the sharp D form of eq. (2.7) simply sets the
term above identically equal to zero (as it must happen, since with eq. (2.7) there is no
intermediate region). It has to be stressed that, in the case of a smooth D, eq. (2.19) will
only give an upper bound to the merging systematics. In fact, we expect a compensating
effect, mainly due to the S events of the (i − 1)-parton sample, giving rise through showers
to at least two extra jets, one in the intermediate region, and one in the soft region. Finally,
the particular source of systematics we are discussing here is essentially due to the fact
that two jets can simultaneously be present in the intermediate region. In order to avoid
this, one can think of variants of the prescription given in eq. (2.13) such as:

D(d_{i+1}) (1 − D(d_i))   −→   θ(μ1 − d_{i+1}) (1 − D(d_i)) .   (2.20)

This option and its possible variants will not be considered in this paper.
What was done so far brings out the physics contents of eqs. (2.13) and (2.14). As
they stand, however, those equations are ambiguous, since the two components of dσ_i,
namely dσ_{S,i} and dσ_{H,i}, are associated with different kinematic configurations (2 → S + i
and 2 → S + i + 1 respectively), and one needs to specify which of these is used in the
computation of the d_i's. Let us denote by

χ_{S,i} ,   χ_{H,i}   (2.21)

these kinematic configurations (the notation reminds one that they are associated with the
S and H events of the i-parton sample). It is fairly obvious that χ_{S,i} must be used in the
dσ_{S,i} contribution to eq. (2.13), while χ_{H,i} must be used in the dσ_{H,i} bit. In fact, by doing
so the manipulations carried out in eqs. (2.15)–(2.19) are still correct, since one can always
identify χ_{H,i−1} with χ_{S,i}. As an explicit example of the cross sections that one obtains with
these kinematic assignments, we write here the results relevant to the simplest case of the
merging of the two lowest parton multiplicities (N = 1):

dσ_{S,0} = T_0 + V_0 − T_0 K + T_0 K^{MC} ,   (2.22)
dσ_{H,0} = (T_1 − T_0 K^{MC}) D(d_1(χ_{H,0})) ,   (2.23)
dσ_{S,1} = (T_1 + V_1 − T_1 K + T_1 K^{MC}) (1 − D(d_1(χ_{S,1}))) ,   (2.24)
dσ_{H,1} = (T_2 − T_1 K^{MC}) (1 − D(d_1(χ_{H,1}))) .   (2.25)

Here, we have used eq. (2.11), d_1(χ_{S,0}) = 0, and the properties of the D function in eq. (2.6)
(obviously, μ2 < d_0).
While the above prescription removes the ambiguity in eqs. (2.13) and (2.14), and gives
those equations an operative meaning, it leads to cross sections affected by double counting
within the i-parton sample (i.e., even before merging different multiplicities), as one can
readily prove by using eqs. (2.22)–(2.25), and by proceeding e.g. as is done in appendix B
of ref. [15]. However, the correct formulae can easily be obtained by means of a few simple
modifications. We shall illustrate them in the following, starting from the case N = 1 in
order to simplify the discussion, and moving next to the fully general case.
Merging 0- and 1-parton samples
The correct, non-double-counting versions of eqs. (2.22)–(2.25) read as follows:

dσ_{S,0} = T_0 + V_0 − T_0 K + T_0 K^{MC} D(d_1(χ_{H,0})) ,   (2.26)
dσ_{H,0} = (T_1 − T_0 K^{MC}) D(d_1(χ_{H,0})) ,   (2.27)
where changes have occurred in the definitions of dσ_{S,0} and of dσ_{H,1}. The factor D(d_1(χ_{H,0}))
in eq. (2.27) limits the hardness of the final-state parton as prescribed by the function D
(with a sharp D, eq. (2.7), the parton relative transverse momentum will obey p_T < Q,
see eq. (2.9)). While this condition is imposed at the matrix-element level, one should
keep in mind that the MC subtraction term, T_0 K_MC, appears in eq. (2.27) in order to
prevent double counting at the NLO. Hence, consistency demands that its modification
due to the D-dependent prefactor be accompanied by a prescription for the shower scale
that limits emissions within the same hardness range. Given the NLO accuracy of the
MC subtraction terms, this can be conveniently done by means of the LH-interface [65]
parameter SCALUP, which will be chosen event-by-event in a random manner (so as to
avoid biases) according to the inverse of the function D (for example, with a sharp D
function and SCALUP having the meaning of relative p_T, such a scale will always be set
equal to Q). The modifications of the shower scale and of the MC subtraction term in
H events imply that the MC subtraction term must be modified in S events as well; this
is the reason for the factor D(d_1(χ_{H,0})) in eq. (2.26). As far as the 1-parton sample is
concerned (eqs. (2.28) and (2.29)), the factors 1 − D limit from below what is essentially
the relative p_T of the Born-level parton; in the case of a sharp D function, this is therefore
equivalent to imposing hard Born-level cuts. Thus, it should be intuitively clear, and could
be formally proven using again the techniques of appendix B of ref. [15], that the proper
1 − D prefactor for the H-event MC subtraction term is that in eq. (2.29), and not that in
eq. (2.25).
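As a concrete illustration of the event-by-event shower-scale assignment described above, the following Python sketch draws SCALUP = D^{-1}(r); the sharp case always returns Q, while the smooth case uses a hypothetical linear D profile, a stand-in for the actual functional form (which is only specified later in the paper):

```python
import random

def d_inverse_sharp(r, q):
    """Inverse of a sharp D function: any r in (0,1) maps to the matching scale Q."""
    return q

def d_inverse_smooth(r, mu1, mu2):
    """Inverse of a hypothetical linear D profile, decreasing from 1 at mu1 to 0 at mu2.
    This linear shape is an illustrative assumption, not the paper's actual D."""
    return mu1 + (1.0 - r) * (mu2 - mu1)

def sample_scalup(d_inverse, **params):
    """Choose the shower scale SCALUP event by event as D^{-1}(r), r uniform in (0,1),
    so that the distribution of scales follows the profile of D without biases."""
    r = random.random()
    return d_inverse(r, **params)
```

With the linear stand-in, scales are generated uniformly across the merging range; any monotonic D simply reshapes this distribution.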
The general case
What is done in section 2.2.1 is sufficient to sketch the procedure one has to follow in
order to convert the naive prescriptions of eqs. (2.13) and (2.14) into correct expressions
for MC@NLO short-distance cross sections. We obtain:
dσ_{S,0} = T_0 + V_0 − T_0 K + T_0 K_MC D(d_1(χ_{H,0})) ,
dσ_{S,i} = [T_i + V_i − T_i K + T_i K_MC D(d_{i+1}(χ_{H,i}))] [1 − D(d_i(χ_{S,i}))] ,
We stress that eqs. (2.30) and (2.31) are redundant, since they are just eqs. (2.32) and (2.33)
respectively, with i = 0; we report them explicitly only for the sake of clarity. Furthermore,
eqs. (2.34) and (2.35) are identical to eqs. (2.32) and (2.33) respectively, with i = N , except
for the fact that the hardness of the (N + 1)th parton is not bounded from above. This is
correct, since there is no higher multiplicity whose Born-level kinematics could compensate
for the lack of hard emissions in the N -parton sample.
As was already mentioned in section 2.2.1, the equations above entail specific choices
of shower scales. We have chosen:
min{ d_i(χ_{S,i}), p_T^{(D)} }  and  min{ max( d_i(χ_{H,i}) − d_{i+1}(χ_{H,i}), 0 ), p_T^{(D)} } ,
for S and H i-sample events respectively.7 We have introduced the quantity
p_T^{(D)} = D^{−1}(r) ,
with r a random number, to be generated event by event; one has p_T^{(D)} = Q with a sharp
D function. In the case of the N-parton sample, the p_T^{(D)}-dependent part in eqs. (2.36)
and (2.37) must be dropped. The p_T^{(D)} dependence in eq. (2.36) has already been discussed
in section 2.2.1. That on d_i(χ_{S,i}) is intuitively clear: since such a quantity is directly
related to the hardness of the softest Born-level jet, one does not want the shower to
generate jets harder than that. This argument can be substantiated analytically in the
context of the toy model of ref. [15], where one can actually show that any monotonically-
growing function of d_i, subject to the conditions d_i/2 ≤ f(d_i) ≤ d_i, will do. For un-merged
MC@NLO samples, the toy model does not give any prescription for the shower-scale choice
of H events; when merging, however, one obtains the constraint that the scale be either a
monotonically-decreasing function of the hardness of the real emission, or a constant, which
motivates eq. (2.37) (but does not determine it uniquely). Incidentally, such a choice is
just the generalisation of what was done in ref. [15], where it was motivated simply by the
argument that the shower scale has to be related to the hardness left in the system after
the first emission.
Sudakov reweighting
The formulae presented in section 2.2.2 achieve the strategy described in items 1 and 2 of
section 2.1. As far as item 3 there is concerned, the basic idea has already been discussed,
which is essentially that of following the CKKW prescription [6] with a reweighting of the
short-distance cross sections by a combination of Sudakov factors. We implement this by
defining modified MC@NLO cross sections as follows:
with dσ_{S,i} and dσ_{H,i} given in eqs. (2.32) and (2.33) respectively (or eqs. (2.34) and (2.35)
when i = N). The compensating factor alluded to in section 2.1, necessary in order to
avoid double counting in the presence of the Sudakovs, reads as follows:
7The settings relevant to the Q2-ordered Pythia6 [66] may be different. We have postponed the study
of this issue to a forthcoming work.
where by Δ_i^{(1)} we have denoted the O(α_S) term in the perturbative expansion of Δ_i.
We point out that the quantities Δ_i are products of ordinary Sudakov factors, which
we define following the CKKW prescription of reconstructing the most probable shower
history, according to the jet-finding algorithm that determines the d_i's. When doing that,
we use the (S + i)-body kinematic configurations when dealing with S events (eq. (2.39)),
and the (S + i + 1)-body kinematic configurations for H events (eq. (2.40)), as one would
naively expect. However, in the latter case the softest of the d's is discarded. This is in
keeping with what has been discussed at the end of section 2.1, that H events have to be
treated on the same footing as S ones as far as the multiplicity relevant to the definition
of Sudakovs is concerned. The scales entering the Sudakov factors in eqs. (2.39)(2.41) are
defined as follows:
Here, μ_ME is a hard scale, which can be generically associated with matrix-element
computations (e.g. an NLO parton-level result which corresponds to a given MC@NLO
simulation); explicit examples will be given below. Note that, in the case of a sharp D function,
the r.h.s. of eqs. (2.44) and (2.45) are equal to p_T^{(D)} = Q.
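To make the reweighting concrete, here is a toy sketch of a product of Sudakov factors over a reconstructed shower history; the fixed-coupling, leading-log form below and the pairing of scales are illustrative assumptions of ours, not the NLL forms considered in ref. [6]:

```python
import math

def sudakov(q0, q, alpha_s=0.118, c_a=3.0):
    """Toy fixed-coupling, leading-log Sudakov for a gluon line between scales q0 and q.
    A stand-in for the NLL forms used in the text; returns 1 for q <= q0."""
    if q <= q0:
        return 1.0
    log_ratio = math.log(q / q0)
    return math.exp(-alpha_s * c_a / math.pi * log_ratio ** 2)

def history_weight(nodal_scales, mu_min, mu_max):
    """CKKW-style weight: one Sudakov factor per node of the reconstructed shower
    history, clamped from below at mu_min (illustrative pairing of scales)."""
    weight = 1.0
    for d in sorted(nodal_scales):
        weight *= sudakov(max(d, mu_min), mu_max)
    return weight
```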
In the CKKW procedure, a reweighting by α_S factors is also performed. Although
this would technically be possible in the context of MC@NLO, the complications it entails
(owing to the more involved dependence on the renormalization scale of NLO cross sections
w.r.t. that of LO ones) do not seem justified, in view of the fact that in MC@NLO there is
already an O(α_S) cancellation between matrix elements and Monte Carlo effects. On the
other hand, it is probably best to choose a renormalization scale whose definition exploits
the CKKW-like considerations which lead to eqs. (2.39) and (2.40). We therefore adopt
what is used in the MINLO procedure [67], which is tailored for NLO computations and
devised in view of its connections with CKKW:
We remind the reader that b is the power of α_S that appears in the Born contribution to
the 0-parton sample; furthermore, in eq. (2.48) both μ_ME and the d_j's are meant to be
computed with the kinematics proper of either S or H events. For consistency with ref. [67],
we also set the factorization scale equal to d_i. We remark that both renormalization and
factorization scales are set equal to μ_ME for NLO mergings that do not include the Sudakov
reweighting discussed here, and for un-merged MC@NLO predictions.
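Since eq. (2.48) is not reproduced above, the sketch below encodes our reading of the MINLO-type choice as a geometric mean of the hard scale and the nodal scales; the function name and the exact combination are assumptions to be checked against eq. (2.48):

```python
def minlo_mu_r(mu_me, nodal_scales, b):
    """Geometric-mean renormalization scale, mu_R^(b+i) = mu_ME^b * prod_j d_j,
    with b the Born power of alpha_S and i the number of nodal scales d_j.
    A sketch of the MINLO-type choice, not the authoritative eq. (2.48)."""
    i = len(nodal_scales)
    product = mu_me ** b
    for d in nodal_scales:
        product *= d
    return product ** (1.0 / (b + i))
```

For instance, with b = 2 and a single nodal scale equal to μ_ME itself, the combined scale reduces to μ_ME; softer nodal scales pull μ_R below μ_ME, as one expects from a CKKW-like assignment.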
Reference [67] also suggests a simplification in the implementation of eqs. (2.39)
and (2.40), which is useful because of a known issue with out-of-the-box CKKW (see
e.g. refs. [9, 68]). Namely, Sudakov reweighting leads to better results from the numerical
viewpoint if the Sudakov factors used in the short-distance cross sections are equal to those
that enter the Monte Carlo which is matched to the matrix elements. However,
analytical NLL Sudakovs, such as those considered in ref. [6], are appealing precisely because,
being MC-independent, they give one the possibility of an error-free, easy, and universal
implementation. The differences induced by analytical or actual-MC Sudakovs will grow
with the distance between the largest and smallest scales entering them. We can therefore
envisage the following possibility: in eqs. (2.39) and (2.40), when i < N we use the identity:
Δ_i(μ_{i,min}, μ_{i,max}) = [ Δ_i(p_T^{(D)}, d_i) θ(d_i − p_T^{(D)}) + θ(p_T^{(D)} − d_i) ] Δ_i(d_i, μ_{i,max}) ,
having taken eqs. (2.44) and (2.45) into account. The Sudakov factors outside the square
brackets on the r.h.s. of eq. (2.49) are then computed with the NLL analytical forms (which,
given the scales used, is what is done in ref. [67]). On the other hand, the Sudakov factors
inside the square brackets are effectively computed using an MLM-type rejection procedure,
which is supposed to be a good approximation of the use of actual-MC Sudakovs (in that
it exploits information obtained from the Monte Carlo). When such a procedure is applied
at the LO, one matches partons with jets.8 In order to extend this idea to the NLO, we
must make use of jets, reconstructed at the level of the hard subprocess (which corresponds
to the matrix-element level), instead of partons; this preserves IR safety. We then require
these hard-subprocess jets to match parton-level jets after shower, in essentially the same
way as in the original MLM procedure [13]. The only difference is that, while MLM uses
a cone jet-finding algorithm, we adopt a kT one [64], as is also done in the MLM LO
implementation in MadGraph [69], and consistently with the construction of the d_i's.
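For reference, the pairwise kT-type distances that underlie the jet reconstruction (and hence the d_i's) can be sketched as follows; this is the standard inclusive-kT measure, shown only as an illustration of the kind of hardness variable involved, with particles as hypothetical (pT, η, φ) tuples:

```python
import math

def kt_distance(p1, p2, r0=1.0):
    """Standard kT-algorithm pairwise distance: min(pT1^2, pT2^2) * dR^2 / R0^2,
    with dR the separation in the (eta, phi) plane."""
    dphi = abs(p1[2] - p2[2])
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi  # wrap the azimuthal difference
    dr2 = (p1[1] - p2[1]) ** 2 + dphi ** 2
    return min(p1[0] ** 2, p2[0] ** 2) * dr2 / r0 ** 2

def kt_beam_distance(p):
    """Beam distance of the inclusive-kT algorithm: pT^2."""
    return p[0] ** 2
```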
More specifically, if jets are defined by the kT algorithm with a given radius R0, we tag
a jet after shower as matched with a hard-subprocess-level one if the two are less than
1.5 R0 apart in the (η, φ) plane. All the results we shall present in section 3 have been
obtained with R0 = 1 (large radii are to be preferred, but there does not appear to be a
strict constraint on what to choose; for example, we have verified that with R0 = 0.8 our
results are unchanged). Finally, after having obtained a set of shower-level matched jets,
we impose that in the i-parton sample there be exactly i jets with hardness larger than
p_T^{(D)}. In practice, as a further measure to ensure that there be no biases after applying the
MLM rejection, we generate hard events by relaxing the conditions enforced by the 1 − D
factors in eqs. (2.32)-(2.35). For i < N, this implies a loss of efficiency, while for i = N
it requires that an MLM condition be imposed as well, with hardness d_N. In particular,
one demands that, if there are N or N + 1 jets at the hard-subprocess level, then these
should match the N or N + 1 hardest ones after shower. We finally point out that the
MLM conditions above are identical in the cases of S and H events, again consistently with
the treatment of these two event classes in the CKKW-type procedure set up here.
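The matching step above can be sketched as follows; jets are represented as hypothetical (η, φ) pairs, and the greedy nearest-neighbour pairing is our simplification of the procedure:

```python
import math

def delta_r(j1, j2):
    """Separation in the (eta, phi) plane, with the azimuthal difference wrapped."""
    deta = j1[0] - j2[0]
    dphi = abs(j1[1] - j2[1])
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    return math.hypot(deta, dphi)

def mlm_matched(hard_jets, shower_jets, r0=1.0, f=1.5):
    """Tag the event as matched if every hard-subprocess jet lies within f*R0 of a
    distinct shower-level jet; greedy nearest-neighbour pairing (a simplification)."""
    available = list(shower_jets)
    for hj in hard_jets:
        if not available:
            return False
        best = min(available, key=lambda sj: delta_r(hj, sj))
        if delta_r(hj, best) > f * r0:
            return False
        available.remove(best)
    return True
```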
8We point out that the verb "to match" used in the context of the MLM procedure does not have the same
meaning as in the rest of this paper.
Results
In this section, we present results obtained with the NLO-merging procedure described
in section 2 which are relevant to the production of a Standard Model Higgs (denoted
by H henceforth), of an e+e pair, and of a tt pair, at the 8 TeV LHC. We concentrate
in particular on H production, which is an ideal test case since it features a very large
amount of radiation, in this way helping expose any problems in the matching and merging
techniques. Furthermore, the matrix elements relevant to this process are relatively simple,
and thus fast to evaluate. We consider the cases of sharp and smooth D functions, and of
Sudakov reweighting; we study the merging with N = 1 and with N = 2. As far as e+e and
tt-pair production are concerned, we limit ourselves to presenting a few key observables with
N = 1 and Sudakov reweighting, which is sufficient to show the generality and flexibility
of the procedure. The latter is also guaranteed by its implementation in the automated
aMC@NLO framework, which has been used to obtain all the results shown below.
The relevant information is the following:
a. The merged result.
b. The i-parton-sample results, 0 i N .
c. The un-merged (standalone) MC@NLO results for the corresponding multiplicities.
d. The merged result with different matching scales and/or merging conditions.
e. The merged result with different N .
For each observable, we present all the results above in one plot, with the following layout.
The main frame displays a, superimposed on b, thus allowing one to check the origin of
the features of the merged results, and to see the interplay among the various i-parton-
sample contributions. An upper inset shows the ratios (c/a); in this way, it is easy to
assess how much the merged results differ from the standalone MC@NLO ones, in both
shape and normalization. It should be kept in mind that the latter, if obtained with an
underlying (S + J)-parton description, are physical only for observables that feature at
least J hard jets (be they obtained by explicit cuts, or by effective ones enforced by other
constraints, such as reconstructing the pT of recoiling objects). A lower inset shows the
ratios (d/a) and/or (e/a); the idea here is that of assessing the merging systematics,
and of studying the effect of increasing the largest multiplicity that enters the merging
procedure. Finally, in the cases of H and tt production we have also compared our results
with those of Alpgen [57] (including parton showers and MLM merging). The latter have
been renormalized by a process-dependent overall factor, for the sole purpose of rendering
them more visible in the figures, where they typically appear as ratios in the lower insets.
All the Monte Carlo simulations have been performed with HERWIG6 [70]; however,
the merging procedure has been implemented in such a way that no or only minor changes
are foreseen in the cases of Herwig++ [71] and Pythia8 [72] (while the case of Q2-ordered
Pythia6 [66] may require further consideration), which are currently available and under
testing respectively in aMC@NLO. Underlying events have not been generated. Jets are
reconstructed (by clustering all final-state stable hadrons) with the kT [64], anti-kT [73], and
Cambridge/Aachen [74, 75] algorithms as implemented in FastJet [76], for several
different jet radii; our default choice, to be shown in the plots, is the kT algorithm with
R = 0.6. We have used the PDF set MSTW2008nlo68cl [77].
Standard Model Higgs production
Our runs have been performed with mH = 125 GeV; we have set μ_ME equal to the Higgs
transverse mass. We have also considered μ_ME = HT/2, and found the same patterns
as with our default choice; hence, the corresponding results will not be presented here.
The one-loop matrix elements relevant to the 1- and 2-parton samples have been taken
from the MCFM code [78] as implemented in ref. [79]. We have studied the following
four merging scenarios: N = 1, sharp-D, non-Sudakov-reweighted; N = 1, smooth-D,
non-Sudakov-reweighted; N = 1, sharp-D, Sudakov-reweighted; and N = 2, sharp-D,
Sudakov-reweighted.
We start by discussing the first of these scenarios, for the three choices Q = 30, 50, and 70 GeV.
Sample observables are shown in figure 1; the merged results in the main frame correspond
to Q = 30 GeV, in order to facilitate the direct comparison with Alpgen, where the
matching scale has also been set equal to 30 GeV, which is fairly typical for this process. The
two observables displayed on the top panels of figure 1 are representative of the behaviour of
all observables which are not directly related to jet transverse momenta: namely, the 0- and
1-parton samples merge smoothly. For the Higgs transverse momentum, this smoothness,
and the agreement with the standalone H + 1j prediction at large pT (H) (which results
naturally from the merging procedure), give an effective constraint on the total rate, which
can only act as a normalization effect at small pT (H), where differences can be seen w.r.t.
the standalone H + 0j prediction (while the shapes are identical, as they should be by
construction). We note that normalization effects have to be expected in this case, where
the fact that the shower may cause leaks into larger exclusive multiplicities w.r.t. those
of the underlying parton cross sections is not compensated by any suppression (e.g. by
Sudakov reweighting). In spite of this, and of the fairly severe conditions posed by a sharp
D function, it is reassuring that these effects are smaller than 20%, as shown by the two
lower insets of the upper-left panel. From the comparison between those two insets, one can
also see that, by setting Q = 50 GeV, the merged result agrees almost perfectly (in shape
and normalization) with the standalone H + 0j and H + 1j predictions (where relevant,
i.e. at small and large pT s respectively). Although this may suggest a way towards the
definition of an optimal matching scale, it is important to keep in mind that the capability of
changing the merging conditions is essential in order to assess the systematics that affects
the procedure. The dotted-plus-symbols green histograms in figure 1, also presented in
terms of ratios over the merged MC@NLO results in the lower insets, are the Alpgen
predictions; for consistency with the N = 1 case considered here, the H + 0, H + 1, and
H + 2 parton samples in Alpgen have been generated, and combined according to the
MLM prescription. Although in reasonable agreement, we see that the pT (H) shape as
given by Alpgen is harder than that of MC@NLO; we point out, however, that to some
extent this difference might be due to the fact that we have run Alpgen with its default
scales, and with the LO version of the NLO PDFs we have used for MC@NLO.
In the bottom panels of figure 1 we present two observables directly related to jet
hardness, namely the pT of the leading jet, and d1. They display similar features, the
most striking of which is a kink at pT(j1) ∼ d1 ∼ Q. It is worth stressing that the same
kink appears in the Alpgen results. This emphasises the fact that such a kink is a more
general feature than being simply an artifact of the merging prescriptions adopted in these
plots (which are obviously very different from each other9), since it is ultimately caused by
9Just as obviously, it is not implied that kinks will appear regardless of the merging technique employed;
we shall show later how to get rid of them within the NLO approach proposed here. However, they are
a persistent characteristic; see e.g. ref. [68] for examples relevant to CKKW (although for processes
different from Higgs hadroproduction), and ref. [11] for a possible amendment, shown there to work in
e+e collisions.
a significant mismatch between the Monte Carlo and matrix-element descriptions in the
region chosen for merging. As far as we know, the problem posed by this kink is very often
ignored: in fact, if one is interested in 1-jet exclusive observables, a lower bound p_T^cut on
the jet transverse momentum is imposed, and the issue of the kink is simply bypassed, in
the context of a merging procedure, by choosing Q < p_T^cut. With this, one is essentially
assuming that the proper description for such a jet is a matrix-element one. However, this
is not necessarily the case; for example, a 40-GeV-pT jet is presumably well described by
matrix elements if produced in association with an 80 GeV Higgs, but likely much better
modeled by parton showers when the mass of the Higgs is 600 GeV. Furthermore, the
mass scales involved in the problem are not the only relevant factors: for example, in a
gg-dominated process such as Higgs production, Monte Carlo effects will extend farther in pT
in comparison with a process characterized by exactly the same mass scales, but qq-initiated
(e.g., Z production with mZ = mH). The bottom line is that, even for J-jet exclusive
observables, the choice Q < p_T^cut can be misleading, if anything because it prevents one
from assessing theoretical uncertainties in a complete manner. Merging procedures do offer
a systematic way of addressing this problem, provided that p_T^cut is not regarded as a natural
upper bound for the matching scale; jet transverse momenta (or related quantities) must
be studied in a range that includes the matching scale. We conclude this discussion by
mentioning that the kink that appears in the bottom panels of figure 1 is not peculiar
to the kT algorithm with R = 0.6; we have found the same feature for all the jet algorithms
we have considered.
One interesting aspect of figure 1 is the general agreement between MC@NLO and
Alpgen for the basic features of the observables. This may be surprising at first, given
the fact that only Alpgen implements (effectively) a Sudakov suppression, which
incorporates information on the behaviour of the Monte Carlo in the merging scheme.
In fact, such information is also included in MC@NLO via the matching procedure (in
particular, in the MC subtractions), in a way that differs from that of Alpgen at relative
O(α_S^2). This confirms the naive expectation that effects due to the mismatch between
matrix elements and Monte Carlos, which are mitigated at the LO by merging, are largely
dealt with by matching at the NLO. It also leads one to expect that, if similar merging
procedures can be implemented at both the LO and the NLO, the latter results will be
better behaved than the former. We shall later show explicit examples demonstrating that
this is indeed the case.
In figure 2 we plot the same observables as in figure 1, computed with the same settings
as before, except for the fact that here the function D is smooth, with μ1 = 30 GeV and
μ2 = 70 GeV. We have adopted the following functional form:
D(μ) = f(x(μ)) ,
with α = c = 1. The histograms shown in the lower insets are computed by taking the
ratios of the predictions obtained with a sharp D (for the three values of Q previously
considered), over those obtained with the smooth D of eq. (3.1). Perhaps not surprisingly,
the new results are in a very good overall agreement with those relevant to a sharp D
function with Q = 50 GeV. The exception is the marked disagreement around the pT(j1) ∼
d1 ∼ Q region, where the results obtained with a smooth D function do not present any
kinks, and are fairly regular. We thus conclude that the merging procedure with a smooth
D function (and without Sudakov reweighting) leads to satisfactory results.10 Its drawback
is that the assessment of the theoretical systematics is rendered more involved because of
the presence of two scales, which define the position and width of a merging range. One
possibility is that of taking the envelope of the predictions obtained with a sharp D,
10This suggests that it could be useful to implement the CKKW or MLM merging procedures by randomly
generating the matching scale in pre-assigned ranges.
for values of Q that span the merging range (which is essentially what is done in the
lower insets of figure 2). On the one hand, this overestimates the systematics, since the
contributions due to scales close to the end-points of the merging range are less important
(in the effective average performed by the smooth D function) than those at its center. On
the other hand, this is not equivalent to assessing the effect of changing the position and
width of the merging range, which should probably also be done. In any case, these appear
to be pretty minor issues, given that the theoretical systematics associated with merging
cannot be given a precise statistical meaning, and some degree of arbitrariness is always
present.
We now study the effect of the Sudakov reweighting, following the procedure described
in section 2.2.3. We start by considering again the N = 1 case, which we generate with
a sharp D function, and the three values Q = 30, 50, and 70 GeV already employed.
In figure 3 we plot the same observables as in figures 1 and 2; a few more jet-related
observables are displayed in figures 4 and 5. In all these figures, the main frame presents
the Q = 50 GeV results, our central predictions henceforth. The histograms in the
lower insets are the ratios of the Sudakov-reweighted Q = 30 GeV and 70 GeV results
over the central ones (in other words, there are no merged predictions in these plots that
do not include the Sudakov reweighting). Also shown there are the ratios computed using
Alpgen in the numerator, over the central NLO-merged results.
The comparison of figure 3 with figures 1 and 2 shows that the Sudakov reweighting
on top of a sharp D function is as effective as the use of a smooth D function (without
Sudakov reweighting) in removing the kinks. There are quite small residual wiggles,11
which may be seen in some of the lower insets in the vicinity of the transition regions,
that are in any case within a modest theoretical systematics; this is smaller than 10%
everywhere, and in most regions quite negligible. As one can infer from the comparison
of the upper insets of figure 3 with those of figure 2 (keeping in mind that the
standalone H + 0j and H + 1j MC@NLO predictions are the same in these two figures),
the sharp-D, Sudakov-reweighted results are quite close to the smooth-D, non-Sudakov-
reweighted ones, with differences of the order of 5% or smaller. The same holds true for
the sharp-D, non-Sudakov-reweighted results obtained with Q = 50 GeV, in keeping with
the idea that Sudakov effects in the context of the present merging procedure are beyond
NLO, hence small. Obviously, this smallness is true in a parametric sense, and numerically
the case Q = 50 GeV is particularly fortunate, just because such a value happens to be
quite suited for this process. When Q = 30 GeV or Q = 70 GeV the differences induced
by the Sudakov reweighting on top of a sharp D function are numerically a bit larger, but
there is no difference of principle among the various scale choices.
The transverse momentum of the second-hardest jet, and d2, presented in the bottom
panels of figure 4, show the same patterns as pT(j1) and d1 (possibly with an even smaller
theoretical systematics). This also applies to the comparison with Alpgen, since the latter
11These can be eliminated with a smooth D function (plus Sudakov reweighting). We did not test this
option here, since it appears to be just a phenomenological issue analogous to tuning. We reckon that a
relatively small range (μ1, μ2) will be sufficient; a further handle can be provided by the parameters α and
c that appear in eq. (3.1).
has kinks at pT(j2) ∼ d2 ∼ 30 GeV, analogous to those affecting pT(j1) and d1 (if a bit
smaller). The pseudorapidities of the hardest and second-hardest jets are presented in the
upper panels of figure 4, for pT(jk) > p_T^cut = 30 GeV. In both cases, the merged predictions
are more central than those obtained with the standalone H + 0j MC@NLO simulations,
owing to the contribution of the 1-parton sample. Still, since pseudorapidities receive the
most important contributions from the region pT(jk) ∼ p_T^cut, the 0-parton sample will
typically be dominant (except for very large p_T^cut, which is not the case here), with the
1-parton sample providing a subleading correction. Hence, the onset of the 1-parton sample
regime determines directly the amount of migration towards central values w.r.t. the
standalone H + 0j results. In turn, this onset is controlled by the matching scale; this
explains why the systematics affecting η(j1) and η(j2) is larger than for other observables.
In the case of η(j1), the Alpgen result is in fact quite close to the NLO-merged prediction
Figure 4. As in figure 3, for the pseudorapidity of the hardest jet (upper left), the pseudorapidity
(upper right) and pT (lower left) of the second-hardest jet, and d2 (lower right). In the case of
η(jk), we have imposed a pT(jk) > 30 GeV cut.
obtained with the same matching scale (30 GeV). On the other hand, for η(j2) Alpgen
is significantly more central than MC@NLO, even with the same matching scale. This
can be understood as follows. Let us consider the H + 2 parton matrix element which,
if surviving the merging cuts in Alpgen, will result into two hard jets. This is not
quite the case in MC@NLO (except in the large-pT region, which is not important here),
where its contribution to the H events of the 1-parton sample is partly compensated by the
MC subtractions. This is what underscores the intuitive picture of the 1-parton sample
being kinematically a 1-jet cross section, plus (small) corrections, and is consistent with
the fact that NLO computations must be inclusive to a certain degree. Hence, despite
receiving contributions from the same tree-level matrix elements, Alpgen results will be
more matrix-element driven than MC@NLO ones, for observables sensitive to the largest
multiplicity.12 Indeed, we shall show later that the inclusion of the 2-parton sample in
MC@NLO results in a more central (j2) distribution.
As the final example for the N = 1, sharp-D function, Sudakov-reweighted merging,
we present in figure 5 the difference in rapidity between the Higgs and the hardest jet, by
imposing that the pT of the latter be larger than 10, 30, 50, and 70 GeV. This observable
has attracted some attention in the past, because of the presence in the standalone H +
0j MC@NLO results of a dip in the central region (analogous features can be found in
12We also note that the one-loop contribution to the N-parton sample also has an N-jet kinematics,
which will give a further MC-driven contribution to (N + 1)-jet observables, not present in Alpgen.
y(S) − y(j1) or y(j1) for standalone S + 0j MC@NLO runs; see e.g. ref. [80] for a discussion
of the case S = tt, to which we shall return later). It should be pretty clear that the dip is
inherited by MC@NLO from the underlying Monte Carlo, which has in fact a deeper dip,
partly filled in MC@NLO by the H-event contribution. This is documented in ref. [81],
where MC@NLO was matched with Q2-ordered Pythia; since Pythia, at variance with
HERWIG6 or Herwig++, does not have a dip (at least in the low-pT region), MC@NLO
does not have a dip. Having clarified this, the natural question is the following: if, for a
given pT(j1) > p_T^cut, the underlying Monte Carlo has a dip in the rapidity difference, why
does MC@NLO not remove it completely? The answer is pretty simple: as mentioned
above, MC@NLO fills the dip through the H-event contribution, which is matrix-element
driven and of relative O(α_S). Such a contribution will thus be overwhelmed by Monte Carlo
effects (from the first emission onwards), when radiation of O(p_T^cut) can be easily achieved
by parton showers.13 In other words, in spite of the fact that p_T^cut may seem to define a
hard scale, pT(j1) ∼ p_T^cut could still be in an MC-dominated region (which is a process-
and MC-dependent statement). Hence, in such a case MC@NLO will not assume that a
matrix element description is correct, but will rather follow the pattern of the underlying
Monte Carlo. The implication is that, if the dip is phenomenologically untenable, a solution
has to be found at the Monte Carlo level, and not in the matching procedure.
An alternative point of view, which will however lead one to the same conclusions, is
the following. Since y(S) − y(j1) is one-jet exclusive, standalone S + 0j MC@NLO will give
an LO-accurate description at best, so we are much better off by considering standalone
S + 1j MC@NLO instead. While this is true, it is indeed equivalent to assuming that
for pT(j1) ∼ p_T^cut a matrix-element description is correct. This is obviously the same
problem as that of pT(j1), discussed in the context of figure 1. Therefore, as in that case,
a merging procedure will certainly constitute an improvement over standalone results,
provided that the merging systematics is correctly assessed (i.e., Q must be chosen and
varied independently of p_T^cut). As can be seen from the four panels of figure 5, the NLO-
merged results are in much better agreement with what one expects from matrix elements
(equivalent to the standalone H + 1j histograms here) than with the standalone H + 0j
predictions. There are, however, significant differences among the results obtained with
different Qs, since the dip is typically present only when pcTut < Q (but not necessarily
so: e.g. when Q = 30 GeV, there is no dip even for pcTut = 10 GeV). For example,
when pcTut = 30 GeV, the NLO-merged result has a dip when Q = 50 and 70 GeV, while
it has no dip when Q = 30 GeV. It is interesting to observe that Alpgen does have
dips when pcTut Q = 30 GeV, while the NLO-merged prediction obtained with the
same matching scale does not, as mentioned above. This is not surprising, on the basis
of what was discussed before about the inheritance of this feature from the underlying
Monte Carlo. The conclusion is that, in the context of merging at both the LO and the
NLO accuracy, the dip can be tuned away by a suitable choice of merging parameters;
this is acceptable only if the merging systematics is exhaustively assessed. We expect
that, for small pcTut (where small can be precisely defined given the production process),
13Which also explains why, if pcTut is sufficiently large, MC@NLO does fill the dip completely.
such systematics will be large if the underlying Monte Carlo features the dip, and much
smaller otherwise hence, theoretical uncertainties will be strictly MC-dependent. This
dependence is bound to disappear, and the merging-parameter dependence reduced, when
pcTut becomes large.
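To make the one-jet-exclusive observable discussed above concrete, the following is a minimal sketch of how y(S) − y(j1) is built for a given pT^cut; the tuple-based jet representation and the function name are ours, not part of any MC@NLO code. Events with no jet above the cut simply do not contribute.

```python
# Illustrative only: jets as (pT [GeV], y, phi) tuples; names are ours.
def delta_y_S_j1(y_S, jets, pt_cut):
    """y(S) - y(j1), with j1 the hardest jet satisfying pT > pt_cut.
    Returns None when no jet passes the cut: such an event does not
    contribute to this one-jet-exclusive observable."""
    passing = [j for j in jets if j[0] > pt_cut]
    if not passing:
        return None
    j1 = max(passing, key=lambda j: j[0])  # hardest jet above the cut
    return y_S - j1[1]

# Example: system at y(S) = 0.3 with two jets; only one passes a 30 GeV cut.
jets = [(42.0, -1.1, 0.5), (12.0, 2.0, 2.8)]
```

Raising pt_cut past the hardest-jet pT turns the return value into None, which is how the exclusive definition removes the event from the histogram.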
We finally turn to discussing the case of the N = 2, sharp-D function,
Sudakov-reweighted merging; that is, we increase the largest multiplicity by one unit w.r.t. what
was done before. The settings are the same as in the N = 1 case, and figures 6, 7, and 8
are the analogues of figures 3, 4, and 5 respectively (with the exception of one panel in
figure 8). The numerators of the ratios that appear in the upper insets are the same as
before for the H +0j and H +1j cases; that for H +2j is obviously specific to N = 2. In the
lower insets, together with the ratios that allow one to assess the merging systematics, we
have plotted (as histograms overlaid with open circles) the ratios of the N = 1 results over
the N = 2 ones, both for Q = 50 GeV. We have also recomputed the Alpgen predictions,
by adding the H + 3 parton sample, for consistency with N = 2. The corresponding results
will not be shown in the plots, since these are already quite busy, and there is no difference
at all in the patterns discussed above, except in a very few cases which we shall comment
upon when appropriate.
The common feature of all but one of the observables presented in figures 6-8 is that
they are extremely close, in both shape and normalization, to their N = 1 counterparts
of figures 3-5. This is highly non-trivial, since the individual i-parton contributions are
different in the two cases. The exception is the pseudorapidity of the second-hardest jet
(upper right panel of figure 7), which the inclusion of the 2-parton sample turns into a
more central distribution, as anticipated in the discussion relevant to figure 4, bringing
it very close to the Alpgen result obtained with the same Q.
Figure 8. As in figure 5, with N = 2. The y(H) − y(j1) observable with pT(j1) > 10 GeV of the
upper left panel of figure 5 has been replaced here by a plot presenting the ratios of the N = 1
results over the N = 2 results for the azimuthal separation between the two hardest jets, with three
different pT cuts on the second-hardest jet.
The small impact of the increase of the largest multiplicity is also generally in
agreement with what is found in Alpgen, where the inclusion of the H + 3 parton contribution
changes the fully-inclusive rate by +0.3%. The effects on differential observables are also
comparably small, growing to only a few percent in the tails of pT distributions. The
exceptions are pT(j2) and pT(j3), which are significantly harder in Alpgen after the inclusion of
the H + 3 parton sample (a 20% and 30% effect respectively). However, this may be related
to the fact that such an inclusion also leads to larger kinks (of which we see no trace in
the NLO-merged results) at pT ≃ Q for these two observables; thus, the hardening may
be partly an artifact of the LO-merging procedure. In order to investigate this point further, it will
be useful to study the merging systematics affecting pT(j2) and pT(j3) in Alpgen. It is
obvious that the relative stability of the NLO-merged results against the inclusion of higher
multiplicities is observable-dependent. η(j2) gives a counterexample, but more spectacular
ones can be found by looking at correlations. As an example of this, we show in the upper
left panel of figure 8 the azimuthal distance Δφ(j1, j2) between the two hardest jets, for
three different cuts pT(j2) > pT^cut on the subleading jet.14 The plots are presented in the
form of ratios of the N = 1 results over the N = 2 ones. When pT^cut is well below the
matching scale (10 GeV vs 50 GeV, see the upper inset), Δφ(j1, j2) is MC-dominated, and
the inclusion of the 2-parton sample is irrelevant. However, matrix-element effects start to
be felt when pT^cut is increased (middle inset), to become quite important at pT^cut = 50 GeV
(lower inset). This clearly shows the impact of the largest matrix-element multiplicity on
Δφ(j1, j2), and demonstrates that for the prediction of observables exclusive in up to J
jets it is best to have N ≥ J. We conclude this section by remarking that, in general, the
merging systematics is smaller (although for some observables only marginally so) when N = 2
than in the N = 1 case, which is exactly what one expects of a converging procedure,
where the N = 2 results are better (i.e., more accurate) than the N = 1 ones.
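The Δφ(j1, j2) observable discussed above can be sketched as follows; this is an illustrative helper of our own (jets again as hypothetical (pT, y, φ) tuples), not code from aMC@NLO or Alpgen.

```python
import math

def delta_phi(phi1, phi2):
    """Azimuthal separation, wrapped into [0, pi]."""
    dphi = abs(phi1 - phi2) % (2.0 * math.pi)
    return min(dphi, 2.0 * math.pi - dphi)

def dphi_j1_j2(jets, pt_cut):
    """Delta phi between the two hardest jets, requiring pT(j2) > pt_cut.
    Jets are (pT, y, phi) tuples; returns None if the event fails the cut."""
    ordered = sorted(jets, key=lambda j: j[0], reverse=True)
    if len(ordered) < 2 or ordered[1][0] <= pt_cut:
        return None
    return delta_phi(ordered[0][2], ordered[1][2])
```

Varying pt_cut in this helper mirrors the three insets of figure 8: below the matching scale the observable is filled by shower-dominated events, above it by matrix-element-dominated ones.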
e+νe production
In this section, we present the results for e+νe production, limiting ourselves to the case
N = 1, sharp-D function, and Sudakov-reweighted merging. We have treated the electron
as massless, and have set mW = 80.419 GeV, ΓW = 2.0467 GeV, and
μME = ( M(e+νe)^2 + pT(e+νe)^2 )^(1/2) ,
where M(e+νe) and pT(e+νe) are the invariant mass and transverse momentum of the lepton
pair respectively; the former is also required to obey the following constraints:
The one-loop matrix elements are computed with MadLoop [37]. We have considered
three values for the matching scale, Q = 35 (our default), 20, and 50 GeV. As a shorthand
notation, we may denote by W the e+νe pair in the following and in the labels of the plots.
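As an illustration, a scale built from M(e+νe) and pT(e+νe) can be computed directly from the lepton four-momenta; assuming here that the scale is the transverse mass of the pair, sqrt(M^2 + pT^2), and with a function name of our own choosing.

```python
import math

def pair_scale(p1, p2):
    """Pair transverse mass sqrt(M^2 + pT^2), assumed here as the hard scale.
    Momenta are (E, px, py, pz) tuples in GeV."""
    E  = p1[0] + p2[0]
    px = p1[1] + p2[1]
    py = p1[2] + p2[2]
    pz = p1[3] + p2[3]
    m2  = max(E * E - px * px - py * py - pz * pz, 0.0)  # pair invariant mass squared
    pt2 = px * px + py * py                              # pair transverse momentum squared
    return math.sqrt(m2 + pt2)
```

For a pair at rest in the transverse plane this reduces to M(e+νe); for a highly boosted, low-mass pair it reduces to pT(e+νe).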
Our predictions are presented in figures 9-11, where we have used the same layout and
patterns as e.g. in figure 3 (except for the Alpgen results, which we did not generate for
this process). As in the case of Higgs production, the merged results are fairly smooth,
and affected by merging systematics which are at most a 15% effect, but typically much
smaller than that. Similarly to what happens for Higgs, some of the pT or d distributions
display wiggles at the transition between regions dominated by different i-parton samples.
While this is just a manifestation of the merging systematics, in phenomenology-oriented
applications one may consider using a smooth D function, as was already suggested in
section 3.1. As is expected, the inclusion of the 1-parton sample induces a hardening in the
14For reasons of space, this plot has replaced that of the rapidity difference with pT (j1) > 10 GeV. The
latter is in fact not particularly illuminating since, exactly as its analogues shown in figure 8 for larger pT
cuts, the merged N = 2 result is identical to the N = 1 one shown in figure 5.
Figure 9. e+e production, with N = 1, sharp D function, and Sudakov reweighting. Pair pT
(upper left), hardest-jet pT (upper right), d1 (lower left), and d2 (lower right) are shown.
tails of pT distributions w.r.t. the standalone W +0j results (see the dashed blue histograms
in the upper insets), while affecting the fully-inclusive rates only in a marginal manner.
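The d1 and d2 shown in the figures are kT-type clustering scales whose precise definition is given in section 2.2. Purely as an illustration of the kind of quantity involved, the sketch below computes exclusive clustering scales with a longitudinally-invariant kT measure and a deliberately crude recombination (pT-weighted averages of y and φ, ignoring φ wrap-around); it is not the definition used in this paper.

```python
import math

def _kt_dist(pi, pj, R=1.0):
    """Longitudinally-invariant kT measure between (pT, y, phi) pseudo-particles."""
    dphi = abs(pi[2] - pj[2]) % (2.0 * math.pi)
    dphi = min(dphi, 2.0 * math.pi - dphi)
    dr2 = (pi[1] - pj[1]) ** 2 + dphi ** 2
    return min(pi[0] ** 2, pj[0] ** 2) * dr2 / (R * R)

def kt_clustering_scales(partons, R=1.0):
    """Squared scales at which exclusive kT clustering removes structure from
    the event; the last entry plays the role of d1, the previous one of d2, etc.
    Illustration only: crude pT-weighted recombination, no phi wrap-around."""
    parts = list(partons)
    scales = []
    while parts:
        beam = min((p[0] ** 2, i) for i, p in enumerate(parts))  # d_iB = pT_i^2
        pair = min(((_kt_dist(parts[i], parts[j], R), i, j)
                    for i in range(len(parts))
                    for j in range(i + 1, len(parts))), default=None)
        if pair is None or beam[0] <= pair[0]:
            scales.append(beam[0])          # parton clustered with the beam
            parts.pop(beam[1])
        else:
            d, i, j = pair                  # merge the closest pair
            scales.append(d)
            pi, pj = parts[i], parts[j]
            pt = pi[0] + pj[0]
            merged = (pt,
                      (pi[0] * pi[1] + pj[0] * pj[1]) / pt,
                      (pi[0] * pi[2] + pj[0] * pj[2]) / pt)
            parts = [p for k, p in enumerate(parts) if k not in (i, j)]
            parts.append(merged)
    return scales
```

For two well-separated partons the two scales are simply the squared transverse momenta; for two nearby partons the first (softest) scale is the small pairwise measure.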
In figure 11 we show the rapidity difference between the lepton pair and the hardest
jet, for four different pT cuts on the latter. This is in full analogy with the case of figure 5,
and it is immediately seen that the general discussion given there applies to e+νe production
as well. In particular, the pattern of the presence or absence of the dip is exactly the same,
while the specific behaviour at a given pT^cut is different because of the differences between
the two processes (i.e., mW vs mH and qq̄ vs gg). We also point out that figure 11 can
be quite directly compared with figures 5 and 6 of ref. [81], where the same observable is
computed with standalone MC@NLO matched with Q^2-ordered Pythia6; this underlines
again the MC-dependence of these distributions for small pT^cut.
Figure 10. As in figure 9, for the electron pT (upper left) and pseudorapidity (upper right), the
pair rapidity (lower left), and the hardest-jet pseudorapidity (lower right). The latter observable is
obtained with a pT (j1) > 30 GeV cut.
tt production
We conclude this phenomenology section by presenting our predictions for tt production,
again limiting ourselves to the case N = 1, sharp D function, and Sudakov-reweighted
merging. The top quarks are produced on the mass shell; they are decayed leptonically
in the shower phase, in order to limit the contamination of the hadronic activity of the
events. Furthermore, the b quarks emerging from the top decays are not included in the
jets whose distributions we present below. We have set mt = 172.5 GeV, and
μME = max( mT(t), mT(t̄) ) ,
where we have denoted by mT(X) the transverse mass of X. We point out that the top
quarks are not included in the clustering algorithm that determines the di's of section 2.2,
which enter the merging procedure. This implies that gluon radiation off the t and t̄ is not
constrained, which is an acceptable (if approximate) solution only because such radiation
is quite unlikely to be hard, given the large top mass. The one-loop matrix
elements which contribute to the 0- and 1-parton samples are those originally computed
in refs. [82] and [83] respectively. We have considered three values for the matching scale,
Q = 100 (our default), 45, and 155 GeV. We have also obtained results with Alpgen,
using 45 GeV as matching scale in the MLM procedure, and generating the tt + 0, tt + 1
and tt + 2 parton samples for consistency with the N = 1 case.
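The scale choice μME = max(mT(t), mT(t̄)) quoted above can be sketched as follows, taking the transverse mass in the usual sense sqrt(m^2 + pT^2) for an on-shell top; function and variable names are ours.

```python
import math

M_TOP = 172.5  # GeV, as set in the text

def m_T(mass, px, py):
    """Transverse mass in the usual sense, sqrt(m^2 + pT^2)."""
    return math.sqrt(mass * mass + px * px + py * py)

def me_scale(top_pxpy, antitop_pxpy):
    """mu_ME = max(mT(t), mT(tbar)) for on-shell tops; (px, py) in GeV."""
    return max(m_T(M_TOP, *top_pxpy), m_T(M_TOP, *antitop_pxpy))
```

At threshold (both tops at rest in the transverse plane) the scale is simply mt; it grows with the transverse momentum of the harder top.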
Figure 12. tt production, with N = 1, sharp D function, and Sudakov reweighting. Pair pT (upper
left), hardest-jet pT (upper right), d1 (lower left), and d2 (lower right) are shown.
For tt production, one could repeat most of what has been said in sections 3.1 and 3.2.
However, there are a few specific features that are worth stressing. Firstly, the merging
systematics is greater than previously observed. In part, this is due to the very large
range of matching scales adopted here, but it is also related to the dynamics of the present
process. Namely, up to quite large pT values (one can use the pair transverse momentum
to be definite) the standalone MC@NLO tt + 0j result is larger in absolute normalization
than the tt + 1j one; this is the combined effect of the fact that the shower easily produces
hard radiation (as a consequence of the top mass driving the setting of the shower scale
to relatively large values), and of the large K factor in the tt + 0j NLO computation.
This feature is easily seen e.g. in the upper inset of the upper-left panel of figure 12:
the relative difference between the dashed blue and dotted red histograms is about
30% for pT(tt) ≲ 100 GeV.
Figure 13. As in figure 12, for the top-quark pT (upper left) and rapidity (upper right), the pair
rapidity (lower left), and the hardest-jet pseudorapidity (lower right). The latter observable is
obtained with a pT(j1) > 30 GeV cut.
Secondly, there is a good agreement between Alpgen and
the merged-NLO results obtained with the same matching scale Q = 45 GeV (see the
comparison between the dot-dashed blue and the solid-plus-crosses green histograms in
the lower insets). It will be interesting to see whether this agreement holds also for much
larger matching scales, such as those adopted here at the NLO. However, one starts to see
deviations between the two computations when approaching large pT's, which can be in
part due to the feature of the NLO cross sections outlined above, and in part to the different
scale and PDF choices made in Alpgen and MC@NLO. Thirdly, the agreement between
the NLO-merged and Alpgen results for η(j1) (see the lower right panel in figure 13)
shows how the inclusion of the 1-parton sample addresses the issue raised for standalone
MC@NLO at the end of section 4 of ref. [80] for such a variable.15 However, from that
figure one can also see that the systematics affecting the shape is quite large. Finally, for
the y(tt) − y(j1) difference, shown in figure 14, a dip or depletion is present for all the pT(j1)
cuts if Q = 100 GeV is used, while it disappears in all cases except the pT(j1) > 10 GeV
one when Q = 45 GeV, for both the NLO-merged and Alpgen results.
15Ref. [80] actually discussed the case of rapidity (and with different jet-finding parameters), which is
obviously fully analogous to what is done here.
Conclusions
We have presented a procedure for the merging of MC@NLO simulations characterized
by underlying processes with different parton multiplicities. It entails only minor changes
to the standard MC@NLO short-distance cross sections, and is therefore not difficult to
incorporate into existing implementations. We have done so in the automated aMC@NLO
framework, which guarantees maximum flexibility and process independence.
We have provided no proof of, and made no claims about, the formal accuracy of the
merging. Instead, we have thoroughly compared the merged results with those of standalone
MC@NLO, for the three sample cases of Standard Model Higgs, e+νe, and tt
hadroproduction, and found agreement in shape and normalization where relevant. We have also
studied the theoretical uncertainties that affect the merging, which have turned out to
be quite small. Although the agreement with LO-merged results computed with Alpgen
and the MLM approach is generally good, there are a few differences which will deserve
further investigation.
There are obviously several open questions, both theoretical and phenomenological,
and this work should be seen only as a first step towards further improvements. To name
just a few: the application to processes that feature b quarks or light jets at the Born level
of the lowest multiplicity; the use of alternative definitions of the dis; the use of different
scales in the short-distance cross sections and in the showers; the possibility of including
other features of the CKKW scheme, such as PDF reweighting; the potential role of
vetoed-truncated showers; the use of tree-level matrix elements only (as opposed to NLO cross
sections) for large multiplicities; the issue of higher-order logarithms in showers. In the
near future, we plan to apply the method proposed here to the Herwig++ and Pythia8
Monte Carlos (which should not require any modifications), to extensively compare our
results with other LO and NLO merging techniques, and to validate our approach using
LHC data.
Acknowledgments
We wish to thank F. Maltoni, M. Mangano, P. Torrielli, B. Webber and M. Zaro for the
discussions we have had with them during the course of this work. This work has been
performed in the framework of the ERC grant 291377, LHCtheory: theoretical predictions
and analyses of LHC physics: advancing the precision frontier.
Open Access. This article is distributed under the terms of the Creative Commons
Attribution License which permits any use, distribution and reproduction in any medium,
provided the original author(s) and source are credited.