On jet substructure methods for signal jets

Journal of High Energy Physics, Aug 2015

We carry out simple analytical calculations and Monte Carlo studies to better understand the impact of QCD radiation on some well-known jet substructure methods for jets arising from the decay of boosted Higgs bosons. Understanding differences between taggers for these signal jets assumes particular significance in situations where they perform similarly on QCD background jets. As an explicit example of this we compare the Y-splitter method to the more recently proposed Y-pruning technique. We demonstrate how the insight we gain can be used to significantly improve the performance of Y-splitter by combining it with trimming and show that this combination outperforms the other taggers studied here at high pT. We also make analytical estimates for optimal parameter values for a range of methods, and compare to results from Monte Carlo studies.


Mrinal Dasgupta

Consortium for Fundamental Physics, School of Physics & Astronomy, University of Manchester, Oxford Road, Manchester M13 9PL, U.K.
Institute of Nuclear Physics, Polish Academy of Sciences, ul. Radzikowskiego 152, 31-342 Kraków, Poland
CH-1211 Geneva 23, Switzerland

Received: March
Keywords: QCD Phenomenology; Jets

Contents

1 Introduction
2 Results for plain jet mass
2.1 Initial state radiation
2.2 Final state radiation
2.3 Non-perturbative contributions
3 Trimming
3.1 Lowest order result
3.2 Initial state radiation
3.3 Final state radiation
3.4 Non-perturbative contributions
4 Pruning and mMDT
4.1 Pruning
4.2 Modified mass drop tagger
4.3 Non-perturbative effects and MC results
5 Y-pruning and Y-splitter
5.1 Y-pruning
5.2 Y-splitter
6 Y-splitter with trimming
7 Optimal parameter values
7.1 mMDT
7.2 Pruning and Y-pruning
7.3 Trimming
8 Conclusions
A Angular integration for FSR
B Fixed-order results vs parton showers for FSR corrections

1 Introduction

In recent years the detailed study and analysis of the internal structure of hadron jets has become an area of very active investigation. The principal reason for such high interest has been in the context of Higgs boson and new physics searches at the LHC and associated phenomenology. Due to the large (TeV scale) transverse momenta that can be accessed at the LHC, electroweak scale particles, such as the Higgs boson, can be directly produced with large boosts. Alternatively the decay of as yet undiscovered heavy new particles to comparatively light standard model particles, such as top quarks or W/Z bosons, would also result in the production of boosted particles whose decay products would consequently be collimated. This in turn means that rather than producing multiple resolved jets, a significant fraction of the time the decay products are encompassed in a single fat jet. Understanding the substructure of such jets therefore becomes crucial in the context of discriminating between jets originating from QCD background and those originating from signal processes involving e.g. Higgs production and its hadronic decay.

Though pioneering studies were carried out by Seymour several years ago [1], and the Y-splitter method for tagging W bosons was subsequently introduced in ref. [2] over a decade ago, the revival of interest in jet substructure is relatively recent and owes essentially to seminal work by Butterworth, Davison, Rubin and Salam [3]. These authors revealed the power of substructure analyses by studying the discovery potential for a light Higgs boson: they showed that exploiting the boosted regime and applying jet substructure methods, specifically a mass-drop and filtering analysis, was sufficient to turn what was previously regarded as an unpromising channel into one of the best channels for Higgs discovery at the LHC. Several other applications followed, dedicated to new physics searches as well as top and W/Z boson tagging, and numerous substructure techniques are now in existence and commonly employed in experimental analyses, both in the context of QCD measurements and for searches [4–14]. For the original articles introducing a selection of these methods we refer the reader to refs. [15–25], while comprehensive reviews of the field and further studies are available in refs. [26–29].

Most recently research has started to emerge [30–33] which aims at enhancing our understanding of jet substructure methods via the use of analytical calculations that, where possible, lend greater insight and provide powerful complementary information to that available purely from traditional Monte Carlo (MC) based investigations of jet substructure. In ref. [31] in particular, resummed results were provided for jet mass distributions for QCD background jets after the application of a variety of jet substructure methods (that we shall collectively refer to as ‘taggers’) and detailed comparisons to MC studies were carried out. Jet mass distributions were examined for the case of trimmed [21] and pruned jets [19, 20] as well as for jets obtained after the application of the mass-drop tagger [3]. One of the main aims of ref. [31] was to better understand how aspects of tagger definition and design may interplay with QCD dynamics to dictate the performance of taggers as reflected by their action on background jets. The improved analytical understanding that was achieved led to a better appreciation of the role of tagger parameters (including those of the mass-drop tagger [3]). The analytical studies also paved the way for improvement of theoretical properties of taggers. Examples of improvements that were suggested or made in ref. [31] included the design of taggers with a perturbative expansion more amenable to resummation, as for the modified mass-drop tagger (mMDT), as well as the removal of undesirable tagger features, as for the case of pruning via the Y-pruning modification. Subsequent work has also demonstrated how an analytical understanding of the action of taggers on QCD background can be exploited to construct valuable new tools such as the soft drop technique introduced in ref. [33].

While thus far there has been heavy focus on taggers applied to QCD background, until now radiative effects for signal jets have not been investigated in the same level of analytical detail for many commonly used substructure methods. Detailed analytical calculations have however been performed to study the action of filtering for H → bb̄ [34] and for N-subjettiness [35], while the role of QCD radiation in the context of template tagging was discussed in ref. [23].

We first observe that it is common to study high pT signal jets in some relatively narrow mass window around the mass of the resonance of interest, this mass cut being a first step in tagging signal jets.
One then has a situation where there are various disparate scales involved in the problem such as the (potentially) TeV scale transverse momenta of the fat jets, the mass M of the resonance (which for our studies we can consider to be around the electroweak scale) and the width δM of the mass window. These scales are in addition, of course, to the various parameters corresponding to angular distances and energy cuts introduced by tagging and jet finding. It is well known that in such multi-scale problems radiative corrections have the potential to produce large logarithms involving ratios of disparate scales. In the example of filtering studied in ref. [34], large radiative corrections were found to arise from soft emissions, which were accompanied by collinear logarithmic enhancements in Rbb̄/Rfilt, the ratio of the bb̄ opening angle to the filtering radius. On the other hand ref. [31] observed via MC studies of the signal that for the taggers studied there (mMDT, pruning, trimming, Y-pruning) the tagger performance was primarily driven by the action of taggers on QCD background, with signals not appearing to display very sizeable radiative corrections for the default parameters chosen there. In order to better understand these apparently contrasting observations it is desirable to acquire a higher level of analytical insight into the action of taggers on signal jets.

When comparing the performance of taggers one may also meet a situation where two taggers act essentially similarly on background jets, so that their action on signal becomes of critical significance. We shall in fact provide an explicit example of this situation later in this article. It is also important to understand and assess the impact of QCD radiative corrections and non-perturbative effects on tagger efficiency for signals, to ascertain what theoretical tools (fixed-order calculations, MC methods, resummed calculations or combinations thereof) should ideally be deployed to get the most reliable picture of the signal efficiency for a given tagger.
With all the above aims in mind, here we embark on a more detailed study dedicated to signal jets. We shall focus our attention on the case of a jet arising from a boosted Higgs boson for a process such as Higgs production in association with a vector boson pp → W/Z, H, with H → bb̄, where we will work in a narrow width approximation. For our analytical approximations we shall also typically consider highly boosted configurations, i.e. those where the Higgs has transverse momentum pT ≫ MH, and shall further take MH/pT ≪ R. We shall work in a small-angle approximation throughout for the fat jet with radius R, though we will often consider R ∼ 1. We stress here that we do not intend to provide precise high-order calculations for radiative corrections to any given process but seek mainly to understand and compare the behaviour of taggers via a combination of approximate analytics and MC cross-checks. For an example of exact fixed-order calculations involving jet substructure and signal processes we refer the reader to ref. [36].

We start in the next section by analysing the case of a plain jet mass cut, focussing on a mass window around the signal mass and deeming a jet to be tagged as signal if the jet mass falls within this window. We consider the impact on signal efficiency of initial state radiation (ISR) and final state radiation (FSR), both analytically and in MC studies. We also study the impact of non-perturbative effects (NP) with MC. The results so obtained then provide a point of reference and comparison to judge the improvements that are offered by use of substructure taggers, which impose requirements in addition to a simple cut on mass.

Next we move to analysing jets with application of various taggers. In section 3 we study trimming at lowest order and with ISR and FSR corrections. We investigate these corrections and compare to MC results where appropriate. We also study non-perturbative corrections, though purely with MC results.
In section 4 we analyse along similar lines pruning and the modified mass-drop tagger (mMDT). FSR is analysed further for these taggers in appendices A and B, where we also compare parton shower results to those from full leading-order calculations, i.e. those that go beyond the soft/collinear approximation. In section 5 we study Y-pruning and the Y-splitter tagger [2]. We observe that while the action of Y-splitter on QCD background is similar to Y-pruning, the signal jet with Y-splitter is subject to severe loss of resolution due to ISR and underlying event (UE) effects. We show with MC studies that combining Y-splitter with trimming dramatically improves the signal behaviour while leaving the background largely unmodified. As a consequence we show that Y-splitter with trimming outperforms the other taggers we study here, especially at high pT. To our mind this example further illustrates how even a relatively basic analytical understanding of all aspects of taggers (for both signal and background) can be exploited to achieve important performance gains [footnote 1]. Finally we carry out analytical studies of optimal values for tagger parameters, obtained by maximising signal significance, and compare to MC results. We conclude with a summary and mention prospects for future work.

Footnote 1: One other recent example of simple analytical arguments, based on power counting, being used to good effect for top tagging can be found in ref. [37].

2 Results for plain jet mass

Here we shall consider the plain jet-mass distribution for fat signal jets without the application of substructure taggers, as previously stated. As also mentioned before, we shall consider the case of Higgs boson production in association with an electroweak vector boson pp → W/Z, H with Higgs decay to a bb̄ quark pair and shall work in a narrow width approximation throughout.
For the purposes of examining the jet substructure we shall not need to write down matrix elements for the production of the high pT Higgs boson and shall instead be concerned purely with the details of the Higgs decay and the resulting fat jet, as well as the impact of ISR, FSR and non-perturbative effects.

Let us take a boosted Higgs boson produced with transverse momentum pT ≫ MH and purely for convenience set it to be at zero rapidity with respect to the beam direction, so that the corresponding energy is $\sqrt{p_T^2 + M_H^2}$. We denote by z and 1 − z the energy fractions carried by the b and b̄ quarks and by θbb̄ the angle between them. Thus the invariant mass of the Higgs can be expressed as

$$p_H^2 = M_H^2 = 2\, p_b \cdot p_{\bar b} = 2 z (1-z) \left(p_T^2 + M_H^2\right) \left(1 - \cos\theta_{b\bar b}\right), \qquad (2.1)$$

where we neglected the b quark masses. Furthermore we shall consider the highly boosted limit $\Delta \equiv M_H^2/p_T^2 \ll 1$. Then from eq. (2.1), taking a small-angle approximation, we obtain the standard result:

$$\theta_{b\bar b}^2 \approx \frac{\Delta}{z(1-z)}.$$

Requiring the bb̄ pair to be contained in the fat jet, θbb̄ < R, thus translates into a constraint on z:

$$z(1-z) > \frac{\Delta}{R^2}.$$

We first consider the signal efficiency taking just the H → bb̄ decay without any radiative corrections. This can be considered as the fraction of decays that are reconstructed inside a fat jet of radius R. Here one has to consider the relevant Feynman amplitude for H → bb̄ and the full decay phase-space with an integral over the final state parton momenta. However for the fraction of decays inside the fat jet one is left with an integral over z alone:

$$\varepsilon^{(0)} = \int_0^1 dz\, \Theta\!\left(z(1-z) > \frac{\Delta}{R^2}\right) = \sqrt{1 - \frac{4\Delta}{R^2}} \approx 1 - \frac{2\Delta}{R^2} + \mathcal{O}\!\left(\Delta^2\right),$$

which is trivially in good agreement with corresponding MC event generator results with all ISR, FSR and non-perturbative effects turned off. The above result simply suggests that, as pT increases, the fraction of decays reconstructed inside a single jet increases. At this lowest-order level there is of course no dependence on the mass window, since the jet mass coincides with MH.

2.1 Initial state radiation

Let us now account for the impact of initial state radiation on the jet mass distribution. We can anticipate that the impact of soft radiation may be significant here because we require the jet mass to lie within a narrow window, of width δM ≪ pT, around MH. This requirement imposes a constraint on real emissions, arising from ISR, that enter the jet, since these contribute directly to the deviation of the jet mass from MH. Hence one can expect that large logarithmic corrections arise as a consequence.
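The lowest-order containment fraction for the plain jet can be checked numerically. The following is a minimal sketch, assuming a flat distribution in z for the H → bb̄ decay, the definition Δ = MH²/pT², and illustrative values MH = 125 GeV and R = 1; the function names are hypothetical and not taken from the paper.

```python
import math


def plain_efficiency_closed_form(pt, m_h=125.0, radius=1.0):
    """Closed form sqrt(1 - 4*Delta/R^2), with Delta = M_H^2 / pT^2.

    Returns 0 when the b-bbar pair can never fit inside the jet.
    """
    delta = (m_h / pt) ** 2
    arg = 1.0 - 4.0 * delta / radius**2
    return math.sqrt(arg) if arg > 0.0 else 0.0


def plain_efficiency_numeric(pt, m_h=125.0, radius=1.0, n=200_000):
    """Fraction of a flat midpoint grid in z on (0, 1) with z(1-z) > Delta/R^2."""
    bound = (m_h / pt) ** 2 / radius**2
    hits = 0
    for i in range(n):
        z = (i + 0.5) / n
        if z * (1.0 - z) > bound:
            hits += 1
    return hits / n


if __name__ == "__main__":
    for pt in (300.0, 500.0, 1000.0, 3000.0):
        print(pt, plain_efficiency_closed_form(pt), plain_efficiency_numeric(pt))
```

The grid scan and the closed form should agree up to the grid spacing, and both tend to 1 as pT grows, which is the statement that boosted decays are increasingly captured in a single fat jet.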
In order to understand the structure of the logarithmic corrections that arise, we consider the process pp → ZH with the additional production of soft gluons radiated by the incoming hard partons (a qq̄ pair). Let us start by taking a single ISR gluon which is soft, i.e. has energy much smaller than pT. In the soft limit we can work with the eikonal approximation in which production of the ISR factorises from the Born-level hard process pp → ZH.

To compute the signal efficiency we shall require the jet invariant mass Mj to be within a relatively narrow window of width δM around MH. As we observed previously, at lowest order (Born level) this requirement is always satisfied, since there $M_j^2 = p_H^2 = M_H^2$, where pH is the four-momentum of the Higgs (or equivalently the sum of the four-momenta of the b and b̄ quarks). With a soft ISR gluon of four-momentum k inside the jet one has instead $M_j^2 = (p_H + k)^2 = M_H^2 + 2\, p_H \cdot k$. Writing $p_H \cdot k$ in terms of the angle between the soft emission and the Higgs direction, we can expand in small quantities and obtain a constraint on the gluon energy, eq. (2.8).

We wish to examine only the leading logarithmic structure that arises from soft ISR emissions, starting with a single emission, i.e. to leading order in the strong coupling. Since we are considering an emission that enters the high pT fat jet we are concerned with large-angle radiation from the incoming hard partons. This in turn implies that there are no collinear enhancements associated to such radiation and the resulting leading logarithmic structure ought to be single-logarithmic, arising purely from the infrared singularities in the gluon emission probability [footnote 2].

Footnote 2: In practice this expectation is challenged by the discovery of superleading logs [39]. These do not affect the arguments we make here.

The gluon emission probability is in turn given by the standard two-particle antenna for soft emissions off the incoming quark/anti-quark pair, eq. (2.9). As for QCD jets (see for example the calculations in ref. [38]), the ISR contribution can be written by integrating eq. (2.9) over the gluon emission phase space, which gives eq. (2.10).

Eq. (2.10) contains an integral over the energy fraction of the partonic offspring involved in the Higgs decay (i.e. that over z), with a constraint that is identical to the zeroth order requirement that the hard quarks be contained in the fat jet, which is unmodified by the presence of soft ISR at leading logarithmic level, i.e. in the limit where the gluon energy is much smaller than pT. The step function involving a restriction on the transverse momentum kt follows directly from eq. (2.8) and the subsequent arguments. Virtual corrections are included through the −1 term in eq. (2.10).

The jet is formed by the recombination of three particles within the fat jet, namely the b, the b̄ and the ISR gluon. In our approximation Δ ≪ R², i.e. in the limit of large boosts, we are considering a highly collimated quark pair, relative to the radius of the fat jet. One can thus ignore the effect of the finite bb̄ opening angle as these effects contribute only terms that are relatively suppressed. We then only need to consider the fact that the soft ISR gluon is in the interior of the fat jet, which amounts to the condition η² + φ² < R² on the gluon rapidity η and azimuth φ measured from the Higgs direction, since we had taken the Higgs rapidity as zero.

We first take the coupling as fixed at scale pT and ignore its running. Running coupling effects are of course important to include for leading logarithmic resummation and we shall do so for our final answers. We define $\varepsilon_{S,\mathrm{ISR}} = \varepsilon_S^{(0)} + \varepsilon_{S,\mathrm{ISR}}^{(1)}$ and, carrying out the relevant integrations with fixed coupling, we obtain from eq. (2.10) the leading logarithmic result. In obtaining this result from eq. (2.10) we neglect a formally subleading logarithmic dependence on R, which can become important at smaller values of R, e.g. R ∼ 0.4.

Given the large single logarithms that emerge from the above approximate fixed-order calculation, it is natural to wish to attempt to resum at least the leading logarithms to all perturbative orders. This is far from a straightforward exercise. One of the main obstacles to performing a soft single log resummation, in the present context, is the presence of non-global logarithms, associated clustering logarithms [40–43] as well as the superleading logarithms referred to previously.
Such calculations pose a serious challenge to the current state of the art and are beyond the scope of our work. In the absence of a complete resummed calculation one can still obtain a working estimate by exponentiating the single-gluon result obtained above and by including running coupling effects. The exponentiated result including the running of the QCD coupling is expressed in terms of a single-log evolution variable t. We denote the exponentiated result by the superfix P, which indicates the resummed contribution from primary emissions alone, i.e. excluding secondary emissions which lead to non-global logarithms.

Although we have emphasised that our estimate of the ISR corrections to the signal efficiency is incomplete, even to leading logarithmic accuracy, it is nevertheless of interest to compare to MC event generators. This is at least in part because MC generators themselves do not attain full single logarithmic accuracy and certainly exclude superleading logarithms. They do however contain a number of effects that would be formally subleading from the viewpoint of our calculation but could be of non-negligible significance numerically. Hence while we do not intend to make a detailed quantitative comparison we do expect to find qualitative similarities with MC results.

Figure 1. Analytic and MC signal efficiency for the plain jet mass, for a range of mass windows, as a function of a generator level cut on minimum jet transverse momentum pT. The MC result has been generated using Herwig++ 2.7.0 for pp → ZH at 14 TeV with the Z decaying leptonically. We have included ISR only and divided out the contribution due to the lowest order result in both panels for clarity.

To make this comparison, we generate pp → ZH events at 14 TeV using Herwig++ 2.7.0 with the UE-EE-5-MRST tune [46] and constrain the Higgs and Z boson to decay hadronically and leptonically respectively. Each generated event is directly handed over to the Rivet package [47], which implements our analyses.
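To illustrate what a single-log evolution variable with running coupling looks like in practice, here is a minimal sketch. It assumes the convention common in the non-global logarithm literature, t = (1/2π) ∫ dkt/kt αs(kt) between a lower scale kt_min and pT, with one-loop running; this definition, the lower scale and all numerical values are assumptions for illustration and are not taken from the text.

```python
import math


def beta0(n_f=5, c_a=3.0):
    """One-loop beta-function coefficient in the convention b0 = (11 CA - 2 nf) / (12 pi)."""
    return (11.0 * c_a - 2.0 * n_f) / (12.0 * math.pi)


def alpha_s_one_loop(kt, pt, alpha_s_pt, n_f=5):
    """One-loop running coupling at scale kt, given its value at scale pt."""
    return alpha_s_pt / (1.0 + 2.0 * alpha_s_pt * beta0(n_f) * math.log(kt / pt))


def t_closed_form(kt_min, pt, alpha_s_pt, n_f=5):
    """Assumed definition: t = (1/2pi) * integral_{kt_min}^{pt} dkt/kt alpha_s(kt)."""
    big_l = math.log(pt / kt_min)
    b0 = beta0(n_f)
    return math.log(1.0 / (1.0 - 2.0 * alpha_s_pt * b0 * big_l)) / (4.0 * math.pi * b0)


def t_numeric(kt_min, pt, alpha_s_pt, n_f=5, steps=20_000):
    """Midpoint-rule evaluation of the same integral in the variable ln(kt)."""
    lo, hi = math.log(kt_min), math.log(pt)
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        kt = math.exp(lo + (i + 0.5) * h)
        total += alpha_s_one_loop(kt, pt, alpha_s_pt, n_f)
    return total * h / (2.0 * math.pi)


if __name__ == "__main__":
    # Hypothetical inputs: pT = 1 TeV, alpha_s(pT) = 0.1, kt_min = 4 GeV.
    print(t_closed_form(4.0, 1000.0, 0.1), t_numeric(4.0, 1000.0, 0.1))
```

In the fixed-coupling limit this t reduces to αs L/(2π) with L = ln(pT/kt_min); the running of the coupling makes t larger than that, which is why it matters for the size of the Sudakov-like suppression.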
We tag the signal jet as the highest pT Cambridge/Aachen [48, 49] jet. To compare with the analytical ISR estimate we omit FSR and non-perturbative corrections, including hadronisation and underlying event (UE) corrections, switching them off for the MC results. The resulting comparison is shown in figure 1, where results are displayed for the ratio of the signal efficiency to the lowest order result. We observe that the signal efficiency, from MC, decreases with increasing pT, in qualitative agreement with the analytical estimate. With a wider mass window one obtains a smaller Sudakov suppression and hence a larger efficiency, but starts to lose the association with a well defined signal peak.

2.2 Final state radiation

For the case of plain jet mass we would expect that the correction due to final state radiation can be neglected in our region of interest where pT ≫ MH. Physically FSR is associated to the bb̄ dipole originating from Higgs decay. It is captured within the fat jet as long as the FSR gluons are not radiated at angles beyond those corresponding to the jet radius R. Due to angular ordering however, we would expect that in the boosted limit the final state emission is always recombined inside the fat jet. To be more precise, large-angle radiation beyond the jet radius R is cut off by the ratio of the dipole size (the bb̄ opening angle) to the jet radius. Thus at large transverse momenta the correction due to final state radiation is of negligible magnitude. We shall consider FSR more carefully when it comes to analysing the taggers in future sections.

2.3 Non-perturbative contributions

In order to get a complete picture of the physical effects that dictate the signal efficiency we also need to study how the signal efficiency changes after including non-perturbative effects such as hadronisation and underlying event (UE). In order to estimate those effects we used Herwig++ 2.7.0 with improved modelling of the underlying event [51] and the most recent UE-EE-5-MRST tune [46], which is able to describe the double-parton scattering cross section [52] and underlying event data from √s = 300 GeV to √s = 7 TeV.
It can readily be anticipated that the underlying event effect in particular will significantly degrade the mass peak and hence lead to a loss of signal. For this study, we consider all final state hadrons to be stable, therefore we switch off the decay handler module in Herwig++. In doing so, we eliminate the chance of b flavour hadrons decaying into invisible particles such as neutrinos. If one were to include hadronic decays via invisible particles, one would notice a universal reduction in signal efficiency for each tagger due to a loss of signal mass resolution. This is particularly important for the jets formed from the decay H → bb̄ as compared to W/Z jets, because these electroweak bosons instead couple strongly to light quarks. For further information on experimental techniques to mitigate the impact of these particular sources of missing transverse energy, see for example [53]. We also assume a b-tagging efficiency of 100%, which is sufficient for a relative comparison of tagger performance and behaviour. The reader is referred to ref. [3] for a discussion of the impact of b-tagging efficiency on signal significance.

In figure 2 we see how non-perturbative effects such as hadronisation and underlying event change the signal efficiency. One immediately notices that whilst hadronisation has a more moderate effect on the signal efficiency, the dominant contribution comes from underlying event contamination, which reduces the efficiency substantially. This indicates that one needs to consider removal of the UE for efficient tagging, which we shall discuss in later sections. Note that the averaged UE contribution to the squared jet mass varies as R⁴ [54]. Thus working with smaller R jets one may expect this contribution to be less significant. One should of course consider also the presence of considerable pile-up contamination, which we do not treat in this paper (see [55–57] for discussion of pile-up subtraction techniques), but to which the plain jet mass will also be very susceptible.

Figure 2. Signal efficiency for a plain jet mass cut as a function of the minimum jet transverse momentum. One can see the sizeable impact of both hadronisation and especially underlying event on the signal efficiency. Results include FSR at parton level.

For now it is evident (as is well known) that the plain jet, with a mass window cut, is not a useful option from the viewpoint of tagging signal jets, due principally to effects such as ISR, UE and pile-up contamination. It however provides a reference point for the discussions to follow.

3 Trimming

Trimming [21] takes all the particles in a jet defined with radius R and reclusters them into subjets using a new jet definition with radius Rtrim < R. It retains only the subjets which carry a fraction of the jet transverse momentum above fcut, i.e. pT(subjet) > fcut × pT(jet), and discards the others. The final subjets are merged to form the trimmed jet. It is standard to use the Cambridge/Aachen (C/A) jet algorithm [48, 49] for substructure studies with trimming (and other taggers) and this is what we shall employ here.

3.1 Lowest order result

Compared to the plain jet mass, trimming already has a more interesting structure even without considering any additional radiation. If the opening angle between the bb̄ pair is less than Rtrim then trimming is inactive. However, if the angle is greater than Rtrim, one removes the softer particle if its energy fraction is below fcut. The result for signal efficiency is therefore given by an integral over z which can be expressed as

$$\varepsilon^{(0)} = \int_0^1 dz \left[ 1 - \Theta\!\left(\frac{\Delta}{z(1-z)} > R_{\rm trim}^2\right) \Theta\!\left(f_{\rm cut} > \min(z, 1-z)\right) \right]. \qquad (3.1)$$

Strictly we should also have written above the condition for the hard prongs to be inside the fat jet, as we did for the plain jet case. However, since this condition only results in terms that are relatively suppressed, we neglect it here. The subtracted term in the above equation represents the removal of any prong that has energy fraction below fcut, in the region where trimming is active. Evaluating the integral in eq. (3.1) gives the result of eq. (3.2), which exhibits a transition point: below a transition value of pT the efficiency is pT independent, while above it one obtains $\sqrt{1 - 4\Delta/R_{\rm trim}^2}$.

For now we shall consider values of fcut that are standard in trimming analyses and therefore are considerably smaller than 0.5.
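For fcut < 1/2 the lowest-order trimming efficiency can be worked out directly from the description above: a decay survives either if the bb̄ pair lies within Rtrim or if the softer prong passes the fcut condition. The following is a minimal sketch under those assumptions, with a flat z distribution, Δ = MH²/pT², and the fat-jet containment condition neglected as in the text. The closed form max(1 − 2 fcut, √(1 − 4Δ/Rtrim²)) is derived here for illustration rather than copied from eq. (3.2), and the function names and the value Rtrim = 0.3 are hypothetical.

```python
import math


def trimming_efficiency_lo(pt, f_cut, r_trim, m_h=125.0):
    """Lowest-order trimming efficiency for f_cut < 1/2 (derived sketch).

    Survival requires theta_bb < R_trim OR min(z, 1-z) > f_cut; both regions are
    intervals symmetric about z = 1/2, so the union is the larger of the two.
    """
    if not 0.0 <= f_cut < 0.5:
        raise ValueError("this sketch only covers 0 <= f_cut < 1/2")
    delta = (m_h / pt) ** 2
    arg = 1.0 - 4.0 * delta / r_trim**2
    inside_cone = math.sqrt(arg) if arg > 0.0 else 0.0
    return max(1.0 - 2.0 * f_cut, inside_cone)


def trimming_efficiency_scan(pt, f_cut, r_trim, m_h=125.0, n=200_000):
    """Brute-force check: scan a flat midpoint grid in z and apply the trimming logic."""
    delta = (m_h / pt) ** 2
    kept = 0
    for i in range(n):
        z = (i + 0.5) / n
        theta2 = delta / (z * (1.0 - z))          # small-angle opening angle squared
        trimming_active = theta2 > r_trim**2       # prongs resolved as separate subjets
        softer_prong_fails = min(z, 1.0 - z) < f_cut
        if not (trimming_active and softer_prong_fails):
            kept += 1
    return kept / n


def transition_pt(f_cut, r_trim, m_h=125.0):
    """pT at which Delta = f_cut * (1 - f_cut) * R_trim^2, where the two regimes meet."""
    return m_h / (r_trim * math.sqrt(f_cut * (1.0 - f_cut)))


if __name__ == "__main__":
    for f_cut in (0.1, 0.05):
        print(f_cut, transition_pt(f_cut, r_trim=0.3))
```

Below the transition the efficiency sits at the pT-independent value 1 − 2 fcut; above it the square-root form takes over, and for small fcut the transition condition reduces to Δ ≃ fcut Rtrim².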
For such choices of fcut the second term in eq. (3.2), which requires fcut > 1/2, clearly does not contribute. While the result in eq. (3.2) is general, let us for illustrative purposes consider values of Rtrim not too small and fcut ≪ 1. Then eq. (3.2) implies a transition point at Δ ≃ fcut Rtrim². A transition at the same point occurs for the QCD background [31]: it coincides with that reported above for the signal and corresponds to the minimal jet mass that can be obtained with two prongs separated by more than Rtrim, both of which pass the fcut condition. For pT² beyond Mj²/(fcut Rtrim²) the background distribution starts to grow due to the onset of a double logarithmic behaviour, so the mistag rate increases, while the signal efficiency follows the square-root form and hence acquires a pT dependence. We remind the reader that these results apply specifically to the Higgs decay, and for processes involving W/Z tagging different results will be obtained. This is due to the different splitting functions involved in hadronic W/Z decay.

In figure 3 we compare the signal efficiency using Herwig++ 2.7.0 for trimming applied to boosted Higgs jets, with no ISR, FSR or non-perturbative effects, to the analytical calculation of eq. (3.2). We compute the tagging efficiency for two different fcut values and find, as we would expect, that the MC clearly reproduces the analytic behaviour of the tagger at lowest order including, for our choice of parameters, the expected transition points.

Figure 3. Trimming tagging efficiency for two values of fcut (fcut = 0.1 and fcut = 0.05) as a function of a generator level cut on jet transverse momentum. The MC result has been generated using Herwig++ 2.7.0 at parton level with no additional radiation for H → bb̄ jets. We note that the locations of the transition points are reproduced by MC.

3.2 Initial state radiation

Let us consider the action of trimming on ISR and compare to the case of the plain jet. For the plain jet we found a large logarithmic term that results in loss of signal with increasing pT. On the other hand we would expect trimming to substantially remove ISR radiation and hence wish to check the impact on the logarithmically enhanced terms that emerge from considering soft ISR. The key difference with the plain jet case is that when the angle between the ISR gluon and the jet axis exceeds Rtrim the soft gluon is retained only if it has kt/pT greater than fcut, where kt is the transverse momentum of the soft gluon. If the kt fraction is below fcut the ISR emission is removed by trimming, thus not contributing to the jet mass, and hence in this region there is a complete cancellation with virtual corrections. Alternatively, if the ISR falls within the trimming radius, we always retain the emission, much like the plain jet case.

One can then repeat the calculation carried out for the plain jet mass in the previous section using these constraints on real emission. Taking the ISR emission probability in the eikonal approximation as before, and incorporating virtual corrections, we work in a fixed-coupling approximation. We can evaluate the integrals straightforwardly and again shall discard terms that are relatively suppressed, which leads to eq. (3.5). Trimming eliminates the logarithm we obtained for the plain jet mass, replacing it by a less harmful ln 1/fcut, provided one chooses fcut not too small. On the other hand for smaller fcut we see a transition to the logarithmic dependence seen for the plain mass.

There is an additional correction term in eq. (3.5) that represents the region of integration where the gluon lies within the trimming radius, θ² < Rtrim². This term vanishes as Rtrim → 0 and suggests that choosing smaller Rtrim values will result in less contamination from ISR, as one may readily expect. We shall however see later, when studying FSR radiative corrections, that we cannot choose Rtrim too small. If on the other hand one chooses Rtrim to not be too small, then at very high pT one should also consider the presence of this term, which appears only for $f_{\rm cut} > 2 M_H \delta M / (p_T^2 R_{\rm trim}^2)$. For most practical purposes, with commonly used parameter values, this term can safely be ignored. Where it does appear, it contributes corrections of order 10 percent relative to the main ln 1/fcut piece.
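The changeover between plain-jet-mass-like behaviour and ln 1/fcut behaviour for ISR can be estimated numerically. The sketch below assumes that the changeover happens where the trimming threshold fcut pT equals the plain-jet bound on the gluon transverse momentum, taken here as 2 MH δM/(pT R²). That condition and the values MH = 125 GeV, δM = 16 GeV and R = 1 are assumptions, chosen because they reproduce the transition points of roughly 200, 280 and 890 GeV quoted in this section for fcut = 0.1, 0.05 and 0.005; the function name is hypothetical.

```python
import math


def isr_transition_pt(f_cut, m_h=125.0, delta_m=16.0, radius=1.0):
    """pT at which f_cut * pT equals the assumed plain-jet bound 2 M_H deltaM / (pT R^2).

    Equivalent to solving f_cut = 2 M_H deltaM / (pT^2 R^2) for pT.
    All default values are illustrative assumptions, not taken from the text.
    """
    return math.sqrt(2.0 * m_h * delta_m / (f_cut * radius**2))


if __name__ == "__main__":
    for f_cut in (0.1, 0.05, 0.005):
        print(f_cut, round(isr_transition_pt(f_cut)))
    # prints 200, 283 and 894 (GeV) for the three f_cut values
```

Because the transition pT scales as 1/√fcut, only very small fcut values push the changeover into the TeV-scale range where it would be visible in a high-pT study.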
In principle, we should also resum the logarithms of fcut that are obtained with Such a resummation is however also beset by non-global and clustering logarithms and therefore highly involved. Moreover the ln 1/fcut terms also play only a not particularly motivated on phenomenological grounds. We note that ln 1/fcut enhanced terms are also produced in corresponding calculations for QCD background [31] and were not resummed in that case either. Consequently, unlike the plain jet case, we do not exponentiate the radiative corrections to the signal efficiency for trimming, or any of the other taggers studied in this paper. Let us then compare the main features of our simple analytical NLO approximation, augmented to include running coupling effects, to what is seen in MC event generators. In figure 4 we again compare our analytical approximations, with running coupling effects as in the plain mass case eq. (2.13), to Herwig++ 2.7.0. For the MC studies we turn on ISR effects with boosted H → b¯b jets for a range of fcut values, as a function of jet pT , keeping Plotting the ratio of the ISR corrected signal efficiency to the lowest order result, we can see that the approximate NLO analytic result reproduces the MC trends reasonably values shown (transition points are expected at roughly 200 GeV and 280 GeV, which are beyond the range shown) and none is seen in the MC plots. The behaviour over the entire plotted pT range is quite flat with pT since it depends mainly on ln 1/fcut, with running coupling and uncalculated subleading effects (in the case of the MC results) providing the fcut = 0.1 fcut = 0.05 fcut = 0.005 fcut = 0.1 fcut = 0.05 fcut = 0.005 efficiencies for a range of fcut values as a function of a generator level cut on jet transverse momentum. This result has been generated using Herwig++ 2.7.0 [44] at parton level with ISR only for change from plain jet mass like behaviour to a ln fcut term, discussed in the main text. 
like degradation of the signal efficiency until the transition at about 890 GeV and then for higher pT a flatter behaviour with pT , consistent with MC results. This relative flatness over a large range of pT is of course in contrast to the pure mass cut case. Final state radiation Let us consider the response of trimming to final state radiation. In principle there are a into the fat jet, results in a shift in mass which can cause the resulting jet to fall outside the appearance of large logarithms whose structure we examine here. Additionally a relatively hard FSR gluon can also result in one of the primary b quarks falling below the asymmetry cuts that are used in taggers, and hence loss of the signal. Such hard configurations can still come with collinear enhancements and so their role should also be considered. In order for an FSR gluon to be removed by trimming it has to be emitted at an angle larger than Rtrim w.r.t. both the hard primary partons. In addition its energy, expressed as a fraction of the fat jet energy, must fall below the fcut cut-off. Lastly for the resulting One can therefore write the following result for real emission contributions, valid in E¯b = (1 − z) pT the region Rtrim x outside a cone with radius Rtrim. radiation has been expressed in terms of the standard antenna pattern with the notation be carried out over the region where trimming is active i.e. when the emitted gluon makes an angle larger than Rtrim with both b and ¯b. Moreover there is an additional step function There are three distinct regimes one can consider according to the value of Rtrim. Firstly when one has Rtrim This should be the most singular contribution one obtains for trimming so we analyse it collinear enhancement and one obtains a pure soft single logarithm. In this region trimming is similar to pruning and the mMDT as far as FSR is concerned, and we shall comment on the results in somewhat more detail in the next section. 
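The two conditions stated above for an FSR gluon to be removed by trimming (emission at an angle larger than Rtrim with respect to both hard primary partons, and an energy fraction below fcut) can be written as a short predicate. This is a hedged sketch with hypothetical names; it covers only these two conditions and not the further mass-window requirement on the resulting jet.

```python
def fsr_gluon_removed(theta_b, theta_bbar, omega, e_jet, r_trim, f_cut):
    """Return True if trimming removes an FSR gluon.

    theta_b, theta_bbar : angles between the gluon and the b and anti-b
    omega               : gluon energy
    e_jet               : fat jet energy
    r_trim, f_cut       : trimming parameters (Rtrim, fcut in the text)

    All names are hypothetical and chosen for this illustration.
    """
    # Trimming is active only outside cones of radius Rtrim around BOTH prongs.
    outside_both_prongs = theta_b > r_trim and theta_bbar > r_trim
    # The gluon energy fraction, relative to the fat jet, must be below fcut.
    below_cut = omega / e_jet < f_cut
    return outside_both_prongs and below_cut
```

A gluon that lies within Rtrim of either prong is therefore never removed in this picture, whatever its energy.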
Finally in the region where angle corrections are strongly suppressed. For the soft and collinear enhanced region Rtrim calculation. First let us examine the loss in mass in more detail. One has: M H2 − Mj2 = 2 (pb · k + p¯b · k) = ω Ebθb2k + E¯bθ¯b2k . Consider first that the gluon k is emitted collinear to the b quark with momentum Δ and set θ¯b2k ≈ θb2¯b = z(1−z) . Requiring MH − Mj < δM and neglecting terms of order δM 2 x as the energy fraction of the soft gluon w.r.t. the energy of the hard emitting prong i.e. on x just gives fcut/z > x. In the collinear limit, one can also simplify the angular integration in eq. (3.7) which also possible to perform the angular integration exactly i.e. beyond the collinear limit, to account for less singular soft large-angle contributions. More details of the derivation and results on the angular integration are provided in appendix A. We can therefore express the soft-collinear contribution to the FSR corrections as: − x − 1 , where a factor of 2 has been inserted to account for an identical result from the region where k is collinear to ¯b rather than b and virtual corrections have been introduced corresponding to the −1 term in square brackets. − ln(z(1 − z)) ln fcut Θ (fcut − z ) , where we introduced collinear divergence when Rtrim → 0. Let us take Rt2rim → 0 and an accompanying fcut − 1 + − fcut C2 = − fcut − fcut ln . We note that for values of fcut the signal efficiency will be dominated by a ln fcut term in the coefficient C1. The presence of the fcut constraint however means that in practice such logarithms make only modest or negligible contributions for a wide range of ref. [34], which has an identical collinear divergence to that for trimming above, but where additionally the absence of an fcut condition leads to a much stronger ln 1 enhancement, which needs to be treated with resummation. 
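The mass-loss relation quoted earlier in this discussion, read here as MH² − Mj² = 2(pb·k + pb̄·k) ≈ ω(Eb θbk² + Eb̄ θb̄k²), follows from 2 p·k = 2Eω(1 − cos θ) for massless partons together with a small-angle expansion. The sketch below compares the exact massless expression with the small-angle form; the function names are hypothetical and the massless-parton treatment is an assumption of this illustration.

```python
import math


def mass_loss_exact(omega, e_b, e_bbar, theta_bk, theta_bbark):
    """2 (p_b.k + p_bbar.k) for massless partons: 2 E w (1 - cos(theta)) per prong."""
    return 2.0 * omega * (
        e_b * (1.0 - math.cos(theta_bk)) + e_bbar * (1.0 - math.cos(theta_bbark))
    )


def mass_loss_small_angle(omega, e_b, e_bbar, theta_bk, theta_bbark):
    """Small-angle form: w (E_b theta_bk^2 + E_bbar theta_bbark^2)."""
    return omega * (e_b * theta_bk**2 + e_bbar * theta_bbark**2)


# Illustrative numbers only (energies in GeV, angles in radians).
exact = mass_loss_exact(10.0, 600.0, 400.0, 0.05, 0.25)
approx = mass_loss_small_angle(10.0, 600.0, 400.0, 0.05, 0.25)
relative_difference = abs(exact - approx) / approx
```

For the angles typical of boosted configurations the two forms agree to better than a percent, which is why the small-angle expression is adequate for identifying the logarithmic structure.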
It is also straightforward to include the effects of hard collinear radiation by considering the full pgq splitting function rather than just its single subjet and are always subject to the asymmetry condition x > fcut. divergent 1/x piece. In this region it is possible for the quark to fall below the fcut threshold and therefore to be removed. Such corrections do not come with soft enhancements and produce terms that vanish with either fcut or , hence do not have a sizeable numerical effect that would require resummation. For this reason, we do not calculate these terms explicitly, continuing to work in the soft and collinear limit. 1−2fcut = 0.8. If one chooses a value of Rtrim = 0.1 then ln Δ/Rt2rim ∼ ln 17 and one may resummation, implying a much more modest FSR contribution. However we should also examine the effect of this increased Rtrim on ISR and UE contributions. For our choice of parameters it is evident from MC studies that we do not pay a significant price for the increased Rtrim value in terms of the ISR contribution. At the same impact of ISR and the underlying event. This illustrates that by an appropriate choice of Rtrim one can negate large radiative losses due to FSR, without necessarily suffering from large ISR/UE effects. In general the optimal value of Rtrim will involve a trade-off between FSR radiative corrections and ISR/UE effects. We shall return to this point in section 7. We should also examine the role of soft divergences that are formally important in the = 0.032. For our choice of fcut = 0.1, 2 GeV, approximately 1.58 and for δM = 10 GeV approximately 0.34, thus indicating that resummation of soft logarithms is not a necessity. Expressed as a percentage of the tree level Hence, we find that even the leading soft-collinear enhanced contribution makes only tions. The main implication of this finding is that full fixed-order calculations or combinations of fixed-order results with parton showers (see refs. 
[58, 59] for a review of the latter methods), would give a better description of the signal efficiency than pure soft showers. We further explore in appendix B, in somewhat more detail, the role of fixed-order calculations in a description of the signal efficiency.

Eq. (3.11) is intended to address the formal limit Rtrim² [...] such small values of Rtrim is problematic due to degradation of the jet from FSR loss. In the opposite limit i.e. Rtrim² [...] mass i.e. one may expect FSR losses to be negligible. On the other hand eq. (3.5) for ISR corrections warns us that large choices of Rtrim may not be optimal due to increased ISR [...] of FSR corrections for trimming is therefore expected to be similar to that for pruning for which a detailed calculation is carried out in section 4.1. We simply note here that in the [...]

[Footnote 4: As we shall see later, this constitutes a somewhat non-optimal choice for Rtrim and is made here for purely illustrative purposes, in order to estimate the size of soft but non-collinear enhanced effects.]

[Figure: Herwig++ signal efficiency for trimming with Rtrim = 0.3 and with Rtrim = 0.1. [...] momentum for two different values of Rtrim. One can see the impact of hadronisation and underlying [...]]

Non-perturbative contributions

Let us now study the impact of non-perturbative corrections to the signal efficiency using trimming on boosted Higgs jets. In figure 6 we show the signal efficiency for a boosted Higgs signal jet after application [...] function of the jet transverse momentum. One can see that hadronisation has little effect on the tagging rate of signal jets, due to the action of trimming on contributions which are soft and wide angle in the jet. UE has a larger impact on the signal efficiency due to soft contamination which is not checked for energy asymmetry.
In other words, inside the trimming radius the algorithm is inactive, and we automatically include all contamination coming from UE, which inside this region would contribute on average to a change in the jet mass squared varying as Rtrim⁴. The UE contribution could thus be substantially reduced by choosing a smaller Rtrim. This is in particular required at higher pT, as is evident from figure 6. Also, in contrast to the plain jet result in figure 2, one notes a significant reduction in sensitivity to non-perturbative effects when tagging signal jets using trimming.

[Figure: [...] M [GeV], for pT = 3 TeV, R = 1, generated using Herwig++ 2.7.0 [44] at parton level. A minimum pT cut on generation of the hard process qq → qq was made at 3 TeV for 14 TeV pp collisions. We take the mass of the two hardest [...]]

[...] that the action of trimming, for the chosen parameters, appears to have only a subleading effect, and hence the desirable property of Y-splitter, that of reducing background via a Sudakov suppression term (see eq. (5.6)), is largely unaffected. Such findings are certainly worthy of analytical follow-up for general choices of parameters, which we shall provide in our forthcoming work.

Given the improvement in signal efficiency that we have achieved with Y-splitter with trimming, and the fact that the backgrounds are comparably (and in fact apparently somewhat more) suppressed compared to Y-pruning in the mass region of interest, it is worth [...] root of background efficiency) that can be achieved with the various taggers, as a function of transverse momentum. These are shown in figure 13 for quark and gluon backgrounds. One observes that the Y-splitter with trimming method outperforms the taggers discussed [...]ming. For trimming however this represents a non-optimal choice at high pT (see figure 18 [...] detailed study of optimal parameters for Y-splitter+trimming remains to be carried out and we shall aim to present the results of such a study in forthcoming work.
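The figure of merit used in the comparison above can be sketched in a few lines, assuming (as the surviving parenthetical "root of background efficiency" suggests) that the signal significance is the signal efficiency divided by the square root of the background efficiency. The function name and the numbers in the example are hypothetical and are not results from the paper.

```python
import math


def signal_significance(eff_signal, eff_background):
    """Signal efficiency divided by the square root of the background efficiency.

    Both arguments are tagging efficiencies in (0, 1]; names are hypothetical.
    """
    if eff_background <= 0.0:
        raise ValueError("background efficiency must be positive")
    return eff_signal / math.sqrt(eff_background)


# Purely illustrative inputs: 50% signal efficiency, 1% background efficiency.
example = signal_significance(0.5, 0.01)
```

With this definition, a tagger that halves the signal efficiency must reduce the background efficiency by more than a factor of four to improve the significance.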
The results shown in figure 13 are for our standard process, pp → ZH, but similar results are also obtained for W tagging as shown in figure 14. Here we observe that Y-splitter with trimming now consistently outperforms the other taggers discussed over a [...] mass distribution of Y-splitter+trimming is smaller relative to Y-pruning than the window around MH (see figure 12). Hence, we observe a greater signal significance tagging W rather than Higgs relative to the other taggers for large pT.⁸

[Figure 13: Signal significance with quark backgrounds and with gluon backgrounds for Y-split+Trimming (fcut = ycut = 0.1, Rtrim = 0.3), Trimming (fcut = 0.075, Rtrim = 0.1), mMDT (ycut = 0.1, µ = 0.67), Y-splitter (ycut = 0.1), Pruned (zcut = 0.1) and Y-pruned (zcut = 0.1). [...] panel) backgrounds using Herwig++ 2.7.0 [44] with underlying event and hadronisation as a function of a generator level cut pT on transverse jet momentum. We compare the signal significance for different algorithms to Y-splitter+trimming and find that the latter outperforms the others at high pT.]

Optimal parameter values

In this section we shall use analytical expressions to derive values of parameters that [...] We do not expect the values so derived to really be optimal in the sense that they will not take into account non-perturbative effects. Indeed we should emphasise that optimal parameter values have already been extracted using full MC studies for all methods considered in the original papers and also examined in subsequent studies such as in ref. [26]. Analytical studies of optimal parameters have also been carried out by Rubin in ref. [34] in the context of a filtering analysis, which we do not consider here.
Nevertheless we can regard it as one of the tests of the robustness of these methods that the values derived here with analytical formulae as inputs should be reasonable approximations to what one obtains in complete MC studies. This is because one ideally wants to have substructure methods where statements about performance are largely independent of our detailed knowledge about non-perturbative corrections. We are also interested in examining to what extent general trends that emerge with analytics, such as the dependence of optimal parameters on pT, are replicated in full MC studies.

[Footnote 8: [...] splitter+mMDT/pruning/soft drop. These all have a similar qualitative effect on both the background and signal jet mass distribution as Y-splitter+trimming. Hence, one observes a comparable gain in signal significance over Y-splitter for all of these combinations. However, we find that Y-splitter+trimming has the best signal significance for tagging W bosons over background in the high pT limit.]

[Figure 14: Signal significance with quark backgrounds and with gluon backgrounds for Y-split+Trimming (fcut = ycut = 0.1, Rtrim = 0.3), Trimming (fcut = 0.05, Rtrim = 0.3), mMDT (ycut = 0.11, µ = 0.67), Y-splitter (ycut = 0.1), Y-pruned (zcut = 0.1) and Pruned (zcut = 0.1). [...] panel) backgrounds using Herwig++ 2.7.0 [44] with underlying event and hadronisation as a function of a generator level cut pT on transverse jet momentum. We deem a jet tagged if it has a final [...] splitter+trimming and find that the latter outperforms the others at high pT. In this plot, all tagger parameters match those used for W tagging in ref. [31] for ease of comparison.]

For the following
studies we confine ourselves to quark backgrounds as we have no reason to believe that gluon backgrounds will differ significantly in terms of the conclusions we reach here. Having observed in this paper the relatively small radiative corrections, both for ISR and FSR, that emerge for signal processes over a broad range of parameter values, one feels encouraged in a first approximation to turn off these effects and treat the signal in a treelevel approximation, except for the case of trimming as we discuss below in more detail. In other words we anticipate that the signal significance ought to primarily be driven by the tree-level results for signal while for the background we shall use the resummed formulae first derived in [31]. For self-consistency, one should then also verify that for the optimal values one derives, the radiative corrections to signal efficiency can indeed be considered small relative to the tree level result. Let us follow the above described procedure for the mMDT and extract the optimal value where we have used R = 1. We then have the following expression for the signal significance: 1 − 2ycut pΣ (ycut, ρH + δρ) − Σ (ycut, ρH − δρ) −4 1 − 2ycut − 4 We can use this result in eq. (7.3), assuming that the optimal value lies in the region −4ymax 1 − 2ymax 3 + 4 ln ymax the integral of the background jet-mass distribution over the mass window corresponding to signal tagging with mMDT. Note that we have treated the signal efficiency at lowest order. We can find the value of ycut that maximises signal significance by taking the derivative of the r.h.s. of eq. (7.1) w.r.t. ycut and setting it to zero which gives: −4 1 − 2ycut optimal value for ycut satisfies One can numerically solve the above equation, which contains the essential information included running coupling effects in the above derivation, one finds it is straightforward to do so. Using the full calculation of ref. [31] for the background, i.e. 
including running [...] the analytical signal significance plotted in figure 15 as a function of ycut. From figure 15 we note firstly that the peak position of the analytical signal significance is approximately in agreement with the numbers we quoted immediately above for the [...] also shown, in the same figure, results from Herwig++ 2.7.0 at both parton level and at full hadron level including UE. We find the Herwig++ 2.7.0 results at parton level in quite reasonable agreement with the simple analytical estimates we have made, for both the peak positions and the evolution of optimal ycut with pT, though the values of the peak signal significance itself differ somewhat. It is noteworthy also that hadronisation and UE do not change the picture significantly at the pT values we have studied here. One other feature that emerges from both analytical and MC studies is that the peaks themselves are fairly broad, so that choosing a slightly non-optimal ycut does not greatly impact the tagger [...]

[...] quoted in the literature.

[Figure 15: panels for mMDT at parton level and for the Herwig++ signal significance at pT = 1 TeV, 2 TeV and 3 TeV, and a panel of optimal parameters with Herwig++ at parton level. [...] Higgs and Z to decay hadronically and leptonically respectively with quark backgrounds. We place a generator level cut on the Higgs transverse momentum pT of 1, 2 and 3 TeV. Jets are tagged around [...] optimal ycut values as a function of pT (red line) with a 2% variation in signal significance about the peak (red shaded area). We overlay the optimal results for ycut obtained using Herwig++ 2.7.0 with hadronisation and underlying event at 1, 2 and 3 TeV, with an equivalent 2% variation about the peak signal significance (blue bars) and at parton level (black bars).]

We have also provided in figure 15 a direct comparison between optimal values from Herwig++ 2.7.0 (including all effects) and analytical estimates.
We show the results for the range of ycut values (denoted by the pink shaded region) that correspond to a ±2% variation around the peak signal significance. For Herwig++ 2.7.0 we instead indicate the same range of ycut values by the blue bars shown. We find a good degree of overlap within this tolerance band between full Herwig++ 2.7.0 results and analytical [...]

One can draw at least a couple of inferences from our observations above. Firstly, as we have argued, radiative corrections to the signal are clearly of minor significance to the tagger performance for mMDT. The fact that the analytics are generally in good agreement with Herwig++ 2.7.0 points to the importance of the background contribution in the context of the signal significance and the success of analytical approaches in describing this background [31]. The fact that non-perturbative effects play an evidently minor role at the values of pT studied above is also reassuring from the point of view of a robust understanding of tagger performance.

We end with a caveat. If one moves to still lower pT values then one has to reconsider some of the arguments above. Here one would have a situation where say at 200-300 GeV [...] apart, perhaps more significantly one can expect UE to start playing a larger role due to the larger effective radius ∼ Mj/pT where UE particles accumulate without being removed by the asymmetry cut. Here one ought to consider the use of mMDT with filtering and optimise the parameters of both methods together as in the original analysis [3].

Pruning and Y-pruning

Here we carry out a similar analysis for the case of pruning. The resummed expression for pruning, for QCD jets, is considerably more complicated than for mMDT. The result essentially has two components which in ref. [31] were dubbed the Y and I components respectively. We have already dealt with Y-pruning in some detail in this article in the context of signal jets.
For the background, as we have also discussed in a previous section, [...] single-logarithmic result by a Sudakov-like form factor and gives rise to a desirable suppression of the background in the signal region, for high pT values. The I-pruning contribution, [...] logarithmic. For the sum of Y and I components, i.e. for pruning as a whole, one observes [...] for mMDT, while for ρ < zcut we see that the I-pruning contribution starts to become more important, which can cause growth of the background and the appearance of a second peak for quark jets and a shoulder-like structure for gluon jets.

[Figure 16: panels for pruning at parton level and for the Herwig++ signal significance at pT = 1 TeV, 2 TeV and 3 TeV, and a panel of optimal parameters with Herwig++ at parton level. [...] as a function of zcut compared to Herwig++ 2.7.0 at parton level and with hadronisation and MPI. Details of generation given in figure 15.]

We do not, for brevity, present here the resummed results for pruning for QCD jets, referring the reader instead to section 5.3 of ref. [31]. Here we simply plot the analytical signal significance for pruning as for mMDT, with neglect of radiative corrections to the signal efficiency, but with the full resummed calculation for QCD background, which we take to be quark jets alone. The resulting signal significance is displayed in figure 16 along with MC results at parton and hadron level. One would expect the optimal zcut to lie in [...] a larger zcut would push us into the region where the background starts to grow due to [...] may expect a value closer to 0.04 and these expectations are roughly consistent with what one notes with both the analytical and MC results shown. Once again we observe that non-perturbative effects do not change the essential picture one obtains from analytics and have only a limited impact on the signal significance relative to parton level.

The pruning results have clear qualitative differences from the case of mMDT.
In particular, at higher pT we have to be more precise about the choice of zcut due to the somewhat narrower peak in the signal significance. We can compare, as for mMDT, analytical results to those from Herwig++ 2.7.0, once again with a ±2% tolerance band, shown in the bottom right panel of figure 16. We observe that within this small tolerance band the results are compatible, though at higher pT perhaps less so than for mMDT. In the original paper [20], the authors conclude that the optimal zcut value for pruning is 0.1 when using the C/A algorithm to cluster the initial jet, as we do here. [...] W bosons) compared to this paper; however, our results are consistent as we approach this region. For larger boosts, we observe that the optimal choice of zcut tends to slightly smaller values (zcut ∼ 0.075).

We also present in figure 17 results for the signal significance of Y-pruning, again taking quark jets as background. Here we note firstly that analytics are again broadly in agreement with MC results for the shape of the signal significance as a function of zcut. Secondly, the peaks are quite broad and so choosing a somewhat non-optimal value of zcut does not critically affect the significance. Furthermore, the optimal zcut does not depend strongly on pT and is virtually constant over the limited pT range studied. Lastly, within a ±2% tolerance band there is good agreement between full MC results and simple analytics on the optimal values of zcut.

Hence for mMDT and Y-pruning, and to a slightly smaller degree for pruning, we find that, over the pT values we studied here, analytical results based on resummed calculations for QCD background and lowest order results for signals, with neglect of non-perturbative effects, capture the essential features of tagger performance, as reflected in the signal significance.
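The tolerance bands used in these comparisons can be extracted from a scan of the signal significance over a tagger parameter as sketched below. This is a minimal illustration under stated assumptions: the function name is hypothetical, the band is taken to be all scanned values whose significance lies within a fractional tolerance of the peak, and the scan values in the example are invented for illustration and are not results from the paper.

```python
def tolerance_band(params, significances, tolerance=0.02):
    """Return (best_param, band_low, band_high) for a 1D parameter scan.

    params        : scanned parameter values (e.g. ycut or zcut)
    significances : signal significance at each scanned value
    tolerance     : fractional drop from the peak that is still accepted

    The band is reported as the smallest and largest scanned values that
    pass, which assumes a single broad peak (no disconnected regions).
    """
    if not params or len(params) != len(significances):
        raise ValueError("params and significances must be non-empty and equal length")
    peak = max(significances)
    best_param = params[significances.index(peak)]
    accepted = [
        p for p, s in zip(params, significances) if s >= (1.0 - tolerance) * peak
    ]
    return best_param, min(accepted), max(accepted)


# Invented scan values, for illustration only.
scan_zcut = [0.02, 0.05, 0.075, 0.10, 0.15]
scan_significance = [1.80, 2.00, 2.04, 1.99, 1.50]
band = tolerance_band(scan_zcut, scan_significance)
```

Comparing two such bands, one from analytics and one from MC, then amounts to checking whether the two parameter intervals overlap.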
An extension of our studies to lower pT values would be of interest in order to ascertain the further validity of the simple picture we have used for our analytical results and probe in more detail the role of radiative corrections to the signal and that of non-perturbative contributions.

[Figure 17: panels for Y-pruning at parton level and for the Herwig++ signal significance at pT = 1 TeV, 2 TeV and 3 TeV, and a panel of optimal parameters with Herwig++ at parton level. [...] background as a function of zcut compared to Herwig++ 2.7.0 at parton level and with hadronisation and MPI. Details of generation given in figure 15.]

We shall next examine the more involved case of optimal parameters for trimming. Here we carry out a similar analysis for trimming, but one now has to optimise two parameters, Rtrim and fcut. As for the pruning analysis, we use the analytic resummed expression for QCD jets given in ref. [31]. The result for the background jet mass distribution consists of a region with single log behaviour (equivalent in structure to mMDT) for [...] radiative corrections to the signal efficiency are crucial for optimisation. If one naively uses the tree level result given in eq. (3.2), it follows that the optimum value for Rtrim tends to zero. This is because one can ensure that the signal mass window is within the single logarithmic [...] to small values of the QCD jet mass, thereby avoiding the double logarithmic peak. However, as shown in this paper, in the limit Rtrim → 0, one encounters large logarithmic corrections to the signal efficiency associated with final state radiation from the signal jet (see eq. (3.11)). This puts a limit on how far one can reduce the trimming radius whilst maintaining reasonable signal mass resolution. Hence, we now include FSR radiative corrections to the signal efficiency by integrating the expression given in eq. (3.10) over z and adding this term to the Born level result eq. (3.2).
Including this radiative correction, along with the resummed QCD background, we can obtain analytical estimates for the signal significance.

[Figure 18: [...] fcut using Herwig++ 2.7.0 for H → bb̄ jets with quark backgrounds with a minimum jet transverse momentum cut. The top panels are generated at parton level with transverse momenta 2 TeV and 3 TeV (left and right) and the bottom panels include hadronisation and underlying event. The area inside the black contour represents the analytic prediction with FSR radiative corrections to the signal efficiency for optimal values within 2% of the analytic peak signal significance.]

In figure 18 we show a 2D density plot for the signal significance with trimming over a range of Rtrim and fcut values, using Monte Carlo at parton level (top) and with full hadronisation and underlying event (bottom), with a transverse momentum cut at 2 and 3 TeV (left and right respectively). We overlay a black analytical contour representing the region in which the analytical signal significance is within 2% of the analytically derived peak value for Rtrim and fcut. One can see that we have reasonable agreement between the simple analytical estimates and the Herwig++ 2.7.0 results at parton level. However, when one includes non-perturbative effects, we observe that contamination from underlying event significantly reduces the signal significance as Rtrim increases. [...] significance in the limit Rtrim [...]

[Figure 19: panels for trimming with Rtrim = 0.06 at parton level and with hadronisation and MPI, at pT = 1 TeV, 2 TeV and 3 TeV. [...] parton level and with hadronisation and MPI using Herwig++ 2.7.0.]

We can use our simple analytical estimates to comment on the optimal values we observe from MC.
Firstly, for optimal values of fcut and Rtrim, one would expect the signal mass window to reside in the single logarithmic mMDT-like region of the background, [...] This expectation is consistent with both the analytical contour (top right edge) and MC results both at parton and full hadron level. This background driven effect is manifest as a suppression in signal significance when the product fcut Rtrim² becomes large (i.e. top right [...] These numbers are in agreement with the analytical contour and MC results.

Secondly, FSR corrections to the signal efficiency become significant in the region Rtrim² [...] one would expect the optimal trimming radius to reside in the region Rtrim ≳ [...] 3 TeV this corresponds to Rtrim > 0.04 and at 2 TeV corresponds to Rtrim > 0.06. This is consistent with the analytical contour and MC, where we observe a reduction in signal [...] region Rtrim ≈ [...]

We notice that, like mMDT and pruning, the signal significance is fairly insensitive [...] signal significance is subject to non-perturbative corrections which increase with Rtrim, and consequently one should favour the small Rtrim limit of the analytical optimal contour [...] the signal significance on the choice of fcut as in figures 15, 16, 17. The results can be found [...] pT values. With the given choice of Rtrim, reminiscent of the pruning radius, it is natural to compare the results to those for pruning reported in figure 16. One notes that even with a similar choice of radius there are differences between the two techniques. While for pruning the optimal zcut decreases with increasing pT, the optimal value for trimming stays more constant. The peak signal significance itself increases with pT in both cases. For a given pT the behaviour as a function of fcut is also different, especially at larger fcut.
These differences originate in a number of sources: the difference in FSR corrections and their pT dependence, which is more pronounced for trimming; differences in the definitions of fcut and zcut; and, last but not least, differences arising from QCD background jets with pruning and trimming (see ref. [31]). In order to better understand the role, for example, of FSR effects in the above context, we note that for pruning one can simply replace the signal [...] optimal fcut values show a very similar trend with pT to those for pruning. However, for trimming, pT dependent FSR corrections cannot be neglected, especially at low pT, and play an important role in pushing the optimal fcut to smaller values than would be obtained by turning off FSR effects. This is the main reason behind the relative insensitivity of optimal fcut values seen with trimming, over the pT range studied in figure 19.

In this article we have studied perturbative radiative corrections and non-perturbative effects for the case of signal jets, specifically for boosted Higgs production with H → bb̄, with the application of jet substructure taggers. For the former we have carried out relatively simple analytical calculations both to assess the impact of ISR and FSR as well as to study the signal mass, the fat jet pT, the mass of the resonance MH, and the parameters of the various taggers. To examine non-perturbative effects we have confined ourselves to MC studies.

Our study was motivated by relatively recent calculations dedicated to the case of QCD background jets and in particular work presented in ref. [31]. There it was noted that while taggers should in principle discriminate against jets from QCD background, the degree to which this happened and the impact on the background jet mass distribution was not always as desired. While taggers such as pruning, mMDT and trimming were es[...] which especially at high pT corresponded to masses in the signal mass region of interest.
Likewise taggers should, in principle, not significantly affect signal jets, retaining them as far as possible. Additionally, most taggers have a grooming element (via the fcut/ycut/zcut criteria) that is responsible for clearing the jet of contamination from ISR/UE, thereby helping in the reconstruction of sharper mass peaks. Here our aim was to carry out analytical and MC studies to investigate in detail the impact of taggers on signal, especially with regard to the interplay between tagger parameters as well as kinematic cuts such as jet pT, masses and mass windows.

Our findings on the whole indicate that tagger performance is more robust for the case of signal jets than was apparent for QCD background. Most taggers are quite similar in their response to ISR and generally significantly ameliorate the loss of signal efficiency seen for plain jet mass cuts without these substructure techniques. An exception to this situation was the case of Y-splitter, where the ISR and UE contamination resulted in a loss of signal efficiency identical to that seen for plain jets. Likewise for FSR, the radiative losses that one sees are on the whole modest for a reasonably wide range of tagger parameters.

Here an interesting question opens up about the potential role for fixed-order calculations in the context of jet substructure studies. This is because one observes an absence of genuine logarithmic enhancements for sensibly chosen tagger parameter values. The signal efficiency, for the taggers studied here, ought then to be better described by exact calculations that incorporate hard gluon radiation, or by combinations of matrix element corrections and parton showers, than by the soft/collinear emissions encoded in pure parton showers. We carried out a comparison [...] taggers, reported in appendix B. We find that we can reasonably adjust parameters [...] descriptions.
Such observations may also be useful beyond the immediate context of our work, in situations where differences in tagger performance could come from regions of phase space that are not under the control of a soft eikonal approximation. In these situations one would ideally want to combine resummed calculations, where necessary, with fixed-order calculations, i.e. carry out matched resummed calculations. A summary of the results presented in this paper for the logarithmic structure of radiative corrections to the signal efficiency for each tagger is given in table 1.

A development we have made here is the introduction of a combination of Y-splitter with trimming, in an attempt to improve the response of Y-splitter to ISR/UE contamination. We made this effort because we observed that Y-splitter was very effective at suppressing QCD background in the signal region. The resulting improvement in signal efficiency, coupled with the fact that the background suppression from Y-splitter remained essentially intact after the use of trimming, meant that the combination Y-splitter+trimming actually outperforms the other taggers studied here, in particular at high pT. Our observation is in keeping with the general idea that suitably chosen tagger combinations may prove to be superior discovery tools compared to currently proposed individual methods [64]. In fact it is becoming increasingly common to use combinations of techniques, such as N-subjettiness [62] with, for instance, mMDT, in an effort to maximise tagger performance (see e.g. [65]). There is also much effort aimed at better understanding tagger correlations, and we expect that our forthcoming analytical calculations for Y-splitter with trimming will shed further light on some of these issues [63].

Lastly, we have carried out an analytical study of optimal parameter values for various taggers.
Having observed modest radiative corrections to the signal, we neglected these effects and found that analytical estimates, based on lowest-order results for the signal and resummed calculations for the QCD background, generally provide a good indicator of the dependence of the signal significance on the tagger parameters. The analytical formulae, which also do not include non-perturbative effects, give rise to optimal values that are fairly compatible with those produced by full MC studies. This is encouraging from the point of view of the robustness of the various methods considered, since a dependence of optimal values on MC features (hadronisation models or MC tunes) is potentially not ideal.

The coefficient C2 for the trimming FSR logarithm is given in eq. (3.12).

We note in closing that for other methods, such as N-subjettiness, there will also be a suppression of signal jets, because such observables directly restrict radiation from the signal prongs. In those cases radiative corrections arising from soft/collinear emissions by the signal prongs are highly significant, as can be noted from ref. [35]. We hope that our work, taken together with studies of such observables, will enable a more complete understanding of the features of signal jets in jet substructure studies and provide yet stronger foundations for future developments.

We are particularly grateful to Gavin Salam and Mike Seymour for illuminating discussions on the topic of jet substructure. We also thank Simone Marzani and Gregory Soyez for further useful discussions. We thank an anonymous referee for helpful remarks and suggestions, which we have implemented in the current version of the article. We are also grateful to the Cloud Computing for Science and Economy project (CC1) at IFJ PAN (POIG 02.03.0300-033/09-04) in Cracow, whose resources were used to carry out most of the numerical calculations for this project.
Thanks also to Mariusz Witek and Milosz Zdybal for their help with CC1. This work was funded in part by the MCnetITN FP7 Marie Curie Initial Training Network PITN-GA-2012-315877. We would like to thank the U.K.'s STFC for financial support. This work is supported in part by the Lancaster-Manchester-Sheffield Consortium for Fundamental Physics under STFC grant ST/L000520/1.

Angular integration for FSR

To work out the coefficient of the soft FSR contribution we need to perform the angular and z integrals for the antenna pattern in eq. (3.7) for trimming, and likewise for all taggers. Generally, for a single gluon emission, one has to evaluate the contribution from FSR emission outside two cones of radius r centred on the b and b̄ quarks. The choice of r depends on the tagger in question, so after carrying out the angular integration one can set $r^2$ to $R_{\rm trim}^2$ for [...]. One then has to evaluate the integral

$$ I = \int \frac{d\Omega}{2\pi}\, \frac{1-\cos\theta_{b\bar b}}{(1-\cos\theta_{bk})(1-\cos\theta_{\bar b k})}\, \Theta(\theta_{bk} > r)\, \Theta(\theta_{\bar b k} > r), $$

where $\theta_{bk}$ and $\theta_{\bar b k}$ are the angles of the emitted gluon k w.r.t. both hard partons (see footnote 10).

The simplest way to evaluate this integral is to first integrate over the entire solid angle and then remove the contribution from inside two cones around the hard parton directions. We shall assume that the cones do not overlap [...] we perform a numerical calculation and find that our results agree with those of Rubin [34]. We therefore write

$$ I = I_{\rm all} - I_{C_b} - I_{C_{\bar b}}, $$

where $I_{\rm all}$ is the integral over the full solid angle and $I_{C_b}$, $I_{C_{\bar b}}$ are the integrals over the regions inside the cones around the b and b̄ directions respectively. $I_{\rm all}$ can be evaluated by standard techniques and yields, after azimuthal averaging, the textbook result corresponding to angular ordering of soft emission:

$$ I_{\rm all} = \int \frac{d\cos\theta_{bk}}{1-\cos\theta_{bk}}\, \Theta(\cos\theta_{bk} > \cos\theta_{b\bar b}) + \int \frac{d\cos\theta_{\bar b k}}{1-\cos\theta_{\bar b k}}\, \Theta(\cos\theta_{\bar b k} > \cos\theta_{b\bar b}). $$

The contribution inside the cone around b, $I_{C_b}$, can be evaluated as follows.
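Taking the b direction as the "z" axis, we define the parton directions by the unit vectors

$$ \vec n_b = (0, 0, 1), \qquad \vec n_{\bar b} = (\sin\theta_{b\bar b}, 0, \cos\theta_{b\bar b}), \qquad \vec n_k = (\sin\theta_{bk}\cos\phi, \sin\theta_{bk}\sin\phi, \cos\theta_{bk}). $$

The in-cone subtraction term for C1 can then be written as

$$ I_{C_b} = \int_0^{2\pi} \frac{d\phi}{2\pi} \int_{\cos r}^{1} d(\cos\theta_{bk})\, \frac{1-\cos\theta_{b\bar b}}{(1-\cos\theta_{bk})\,(1-\cos\theta_{b\bar b}\cos\theta_{bk}-\sin\theta_{bk}\sin\theta_{b\bar b}\cos\phi)}. $$

Performing the azimuthal integration gives

$$ I_{C_b} = \int_{\cos r}^{1} d(\cos\theta_{bk})\, \frac{1-\cos\theta_{b\bar b}}{(1-\cos\theta_{bk})\,|\cos\theta_{bk}-\cos\theta_{b\bar b}|}. $$

This term can be combined with the corresponding contribution (the first term) in $I_{\rm all}$,

$$ I = \int_{\cos\theta_{b\bar b}}^{\cos r} \frac{d\cos\theta_{bk}}{1-\cos\theta_{bk}} - \int_{\cos r}^{1} \frac{d\cos\theta_{bk}}{\cos\theta_{bk}-\cos\theta_{b\bar b}} + (b \leftrightarrow \bar b), $$

where we have also included $I_{C_{\bar b}}$ via the interchange b ↔ b̄. The collinear divergence along each hard parton direction is cancelled by the in-cone contributions, leaving only a wide-angle contribution. Carrying out the angular integrations we get

$$ I = 2\log\frac{\cos r-\cos\theta_{b\bar b}}{1-\cos r} \simeq 2\log\frac{\theta_{b\bar b}^2-r^2}{r^2}, $$

where the second form holds in the small-angle approximation. [...]

$$ I = 2\int_{z_{\rm cut}}^{1-z_{\rm cut}} dz\, \log\frac{1-z(1-z)}{z(1-z)} \simeq \frac{2\pi}{\sqrt{3}} \quad (z_{\rm cut} \to 0), $$

which corresponds to the result quoted for pruning in eq. (4.4).

[...] around the b and b̄, does not apply. For this purpose we have evaluated the angular integration numerically, and for MH/pT ≪ 1, i.e. when one can use the small-angle approximation, [...]

Footnote 10: While we have retained, at this stage, the full angular antenna pattern for ease of comparison to standard formulae, we shall later take the small-angle approximation to compute the final answer.

Fixed-order results vs parton showers for FSR corrections

We have noted that FSR computed using the soft approximation gives numerically very small corrections to the leading-order results, for sensible choices of the mass window [...] calculations are not a good guide to the actual tagger performance, i.e. the signal efficiency, since they do not produce genuine logarithmic enhancements. One can expect instead that fixed-order calculations, with correct treatment of hard non-collinear radiation at order [...] resummation effects are not likely to be significant it becomes of interest to compare signal efficiencies obtained with pure fixed-order calculations to those from MC generators. One [...] showers, owing to the dominance of hard radiation and the consequent lack of importance of multiple soft/collinear emissions.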
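As a cross-check of the angular integration of appendix A, the following sketch integrates the soft antenna pattern numerically over the region outside two non-overlapping cones and compares the result with the closed form I = 2 log[(cos r − cos θ_bb̄)/(1 − cos r)] that follows from I = I_all − I_Cb − I_Cb̄. The dΩ/(2π) normalisation, the function names and the illustrative values θ_bb̄ = 1.0, r = 0.3 are assumptions made here, not taken from the paper. The last function evaluates the z integral associated with the pruning result, which tends to 2π/√3 as zcut → 0.

```python
import math


def antenna(cos_bk, cos_bbark, cos_bb):
    """Soft antenna pattern (1 - cos th_bb) / ((1 - cos th_bk)(1 - cos th_bbar,k))."""
    return (1.0 - cos_bb) / ((1.0 - cos_bk) * (1.0 - cos_bbark))


def angular_integral_numeric(theta_bb, r, n_theta=2000, n_phi=200):
    """Midpoint-rule integral of the antenna over dOmega/(2 pi), excluding cones
    of half-angle r around b (the polar axis) and bbar (in the x-z plane).
    Assumes the cones do not overlap (2 r < theta_bb)."""
    cos_bb, sin_bb = math.cos(theta_bb), math.sin(theta_bb)
    cos_r = math.cos(r)
    d_theta = (math.pi - r) / n_theta  # polar angle runs from r to pi
    total = 0.0
    for i in range(n_theta):
        theta = r + (i + 0.5) * d_theta
        c, s = math.cos(theta), math.sin(theta)
        # The gluon lies outside the bbar cone when cos(phi) < x.
        x = (cos_r - cos_bb * c) / (sin_bb * s)
        if x <= -1.0:
            continue  # the whole ring at this theta is inside the bbar cone
        phi_min = math.acos(x) if x < 1.0 else 0.0
        # Use the phi -> -phi symmetry: integrate over (phi_min, pi) and double.
        d_phi = (math.pi - phi_min) / n_phi
        ring = 0.0
        for j in range(n_phi):
            phi = phi_min + (j + 0.5) * d_phi
            cos_bbark = cos_bb * c + sin_bb * s * math.cos(phi)
            ring += antenna(c, cos_bbark, cos_bb)
        total += 2.0 * ring * d_phi * s * d_theta / (2.0 * math.pi)
    return total


def angular_integral_closed(theta_bb, r):
    """Closed form obtained after azimuthal averaging and cone subtraction."""
    return 2.0 * math.log((math.cos(r) - math.cos(theta_bb)) / (1.0 - math.cos(r)))


def pruning_z_integral(z_cut=0.0, n=20000):
    """2 * integral over z in (z_cut, 1 - z_cut) of log[(1 - z(1-z)) / (z(1-z))]."""
    dz = (1.0 - 2.0 * z_cut) / n
    total = 0.0
    for i in range(n):
        z = z_cut + (i + 0.5) * dz
        zz = z * (1.0 - z)
        total += math.log((1.0 - zz) / zz) * dz
    return 2.0 * total


if __name__ == "__main__":
    print(angular_integral_numeric(1.0, 0.3), angular_integral_closed(1.0, 0.3))
    print(pruning_z_integral(0.0), 2.0 * math.pi / math.sqrt(3.0))
```

The midpoint rule never evaluates the integrand on a cone boundary or at z = 0, 1, so no explicit regulator is needed; the two printed pairs should agree to well within a percent for the default grid sizes.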
[Figure 20. Panels: "SHERPA shower vs EVENT2: mMDT" and "SHERPA shower vs EVENT2: Pruning". Caption fragment: [...] boson with a transverse boost to pT = 3 TeV.]

[...] H → bb̄g. Such a calculation can be straightforwardly performed by taking the exact H → bb̄g matrix element and integrating over phase space after application of cuts corresponding to jet finding and tagging in the various algorithms. While straightforward, this exercise proves cumbersome and has in any case to be carried out with numerical integration. One may instead try to obtain the same information more economically by exploiting existing fixed [...]

One of the most reliable and long-standing fixed-order programs available to us is [...] the process e+e− → Z0 → qq̄ at lowest order and with an extra gluon emission, i.e. up to [...] along a given direction and then its decay products will, a significant fraction of the time, form a single fat jet. One can then apply the boosted object taggers to tag the Z boson [...] paper, for the Higgs boson. The situation is similar but not identical to the case of the Higgs we have considered thus far: owing to the polarisation of the Z boson, the matrix element for Z decay to quarks differs from the Higgs case, and efficiencies at tree level and beyond are affected, giving for example a different dependence on zcut, ycut at lowest order. Nevertheless all of our conclusions about radiative corrections apply to this case as well, including our findings about the logarithmic structure of FSR contributions, since these results follow from the radiation of a gluon from the qq̄ pair, which is given by a process-independent antenna pattern that factorises from the process-dependent lowest-order decay of a scalar (i.e. Higgs) or a Z boson.
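The factorisation invoked here can be made explicit in the soft-gluon limit. The expression below is the standard eikonal formula for massless quarks; the symbols (parent X, coupling g_s, momenta p_q, p_q̄, k and gluon energy ω) are notation introduced for this illustration and are not taken from the paper.

```latex
% Soft-gluon (eikonal) limit for a colour-singlet parent X = H or Z:
% the process-dependent Born matrix element multiplies a universal antenna.
\begin{align}
  |\mathcal{M}_{X \to q\bar q g}|^2
    &\simeq |\mathcal{M}_{X \to q\bar q}|^2 \;
       g_s^2\, C_F\, \frac{2\, p_q \cdot p_{\bar q}}{(p_q \cdot k)(p_{\bar q} \cdot k)} , \\
  \frac{2\, p_q \cdot p_{\bar q}}{(p_q \cdot k)(p_{\bar q} \cdot k)}
    &= \frac{2}{\omega^2}\,
       \frac{1-\cos\theta_{q\bar q}}{(1-\cos\theta_{qk})(1-\cos\theta_{\bar q k})} .
\end{align}
```

The second line shows that the angular dependence of the radiation depends only on the directions of the quark pair and the gluon, while all dependence on the spin of the parent sits in the Born factor.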
Therefore, in order to test our basic notion that fixed-order calculations should give an FSR contribution to the tagging efficiency comparable to that from MC event generators, it should suffice to study results for boosted Z bosons from EVENT2 on the one hand and MC on the other. In order to minimise any process dependence one should choose precisely the same hard process for both the fixed-order calculation and the MC, and hence we choose to study [...] boosted to 3 TeV (as in the EVENT2 case) with the MC generator Sherpa 2.0.0 shower [67].

[Figure 21. Panels: "SHERPA shower vs EVENT2 Difference: Pruning" (left) and "SHERPA shower vs EVENT2 Difference: Trimming" (right), with pT = 3 TeV. In the left-hand panel pruning is applied with different values of [...]; in the right-hand panel trimming is applied with different values of [...] and zcut.]

We first study the signal efficiencies, normalised to the lowest-order result, that are [...] These are shown in figure 20 as a function of the mass window, where [...] the main text. A first observation is that there is a reasonable degree of qualitative and quantitative similarity between LO and shower estimates over a wide range of mass windows, which further establishes our point about the essential perturbative stability of taggers against FSR corrections. The difference between the normalised signal efficiencies for SHERPA and [...] pruning respectively. One should not in any case consider mass windows significantly lower than these values at high pT, in order to minimise NP hadronisation corrections from ISR. Differences start to become more marked for very low mass windows, in particular for pruning, signalling the need for resummation and hadronisation corrections. We have also [...] above and hence basically preserve the picture one obtains already at leading order. One can similarly study trimming, where the choice of Rtrim is additionally crucial to ensure that radiative corrections are minimised so that signal efficiency is maintained.
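The comparison of normalised efficiencies can be sketched in a few lines. The snippet below is only an illustration: the function names and the input numbers are hypothetical, and it assumes that the "difference" is the absolute difference between the two efficiencies after each has been divided by the lowest-order efficiency. The 2, 5 and 10 percent thresholds are the tolerance bands quoted for figure 21.

```python
def normalised(eff, eff_lo):
    """Signal efficiency divided by the lowest-order efficiency."""
    return eff / eff_lo


def agreement_band(eff_shower, eff_fixed, eff_lo, thresholds=(0.02, 0.05, 0.10)):
    """Return the smallest threshold that the absolute difference of the
    normalised efficiencies stays below, or None if it exceeds all of them."""
    diff = abs(normalised(eff_shower, eff_lo) - normalised(eff_fixed, eff_lo))
    for t in thresholds:
        if diff < t:
            return t
    return None


if __name__ == "__main__":
    # Hypothetical efficiencies for one (mass window, zcut) point.
    print(agreement_band(eff_shower=0.78, eff_fixed=0.80, eff_lo=0.85))
```

Scanning such a function over a grid of tagger parameters and mass windows would reproduce the kind of shaded-region map described for figure 21, given real efficiencies from the two calculations.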
Another way of making this comparison is provided in figure 21, where we show the difference between EVENT2 and Sherpa 2.0.0 efficiencies (normalised to the lowest-order result) as a function [...] corresponding to the [...] represents parameter values where the difference between the normalised signal efficiencies for Sherpa 2.0.0 and EVENT2 is less than two percent, while the green and pink regions correspond to less than five and ten percent respectively. From the plot for pruning one notes that there is a correlation between the values of [...] and zcut needed to minimise radiative corrections. As one goes up in zcut, to stay within, say, the five percent zone, one has to correspondingly increase the size of the window. This is also in accordance with expectations from our simple analytics, where one can expect large radiative corrections for [...] of Rtrim required to minimise radiative degradation of mass. This is also reflected in figure 21, where once again the green and pink shaded regions represent differences of 5 percent [...] order and shower descriptions. As one lowers Rtrim, radiative losses get progressively larger.

Open Access. This article is distributed under the terms of the Creative Commons Attribution License (CC-BY 4.0), which permits any use, distribution and reproduction in any medium, provided the original author(s) and source are credited.

comparative study, Z. Phys. C 62 (1994) 127 [INSPIRE]. Rev. D 65 (2002) 096014 [hep-ph/0201098] [INSPIRE]. s = 7 TeV pp collisions with the ATLAS experiment, JHEP 05 (2012) 128 [arXiv:1203.4606] [INSPIRE]. [5] ATLAS collaboration, ATLAS measurements of the properties of jets for boosted particle searches, Phys. Rev. D 86 (2012) 072006 [arXiv:1206.5369] [INSPIRE].
[6] ATLAS collaboration, Performance of jet substructure techniques for large-R jets in [7] CMS collaboration, Studies of jet mass in dijet and W/Z + jet events, JHEP 05 (2013) 090 [8] ATLAS collaboration, Search for resonances decaying into top-quark pairs using fully proton-proton collisions at √ [arXiv:1306.4945] [INSPIRE]. [arXiv:1303.4811] [INSPIRE]. hadronic decays in pp collisions with ATLAS at √ [arXiv:1211.2202] [INSPIRE]. top quarks collected in pp collisions at √ [9] ATLAS collaboration, A search for tt¯ resonances in lepton+jets events with highly boosted s = 7 TeV with the ATLAS detector, JHEP 09 quarks with the ATLAS detector in √ 086 [arXiv:1210.4813] [INSPIRE]. [10] ATLAS collaboration, Search for pair production of massive particles decaying into three s = 7 TeV pp collisions at the LHC, JHEP 12 (2012) states with the ATLAS detector in proton-proton collisions at √ s = 7 TeV, Eur. Phys. J. C 73 (2013) 2263 [arXiv:1210.4826] [INSPIRE]. final state, JHEP 09 (2012) 029 [Erratum ibid. 03 (2014) 132] [arXiv:1204.2488] [INSPIRE]. at √s = 7 TeV, JHEP 12 (2012) 015 [arXiv:1209.4397] [INSPIRE]. CERN, Geneva Switzerland (2008). CMS-PAS-EXO-09-002, CERN, Geneva Switzerland (2009). [16] CMS collaboration, Search for high mass tt resonances in the all-hadronic mode, [arXiv:0806.0848] [INSPIRE]. JHEP 10 (2010) 078 [arXiv:1006.2833] [INSPIRE]. [arXiv:0912.0033] [INSPIRE]. massive jets, Phys. Rev. D 82 (2010) 054034 [arXiv:1006.2035] [INSPIRE]. [26] A. Abdesselam et al., Boosted objects: a probe of beyond the standard model physics, Eur. [27] A. Altheimer et al., Jet substructure at the Tevatron and LHC: new results, new tools, new benchmarks, J. Phys. G 39 (2012) 063001 [arXiv:1201.0008] [INSPIRE]. held at IFIC Valencia, 23rd–27th of July 2012, Eur. Phys. J. C 74 (2014) 2792 [arXiv:1311.2708] [INSPIRE]. Z+jet and dijet processes at the LHC, JHEP 10 (2012) 126 [arXiv:1207.1640] [INSPIRE]. observables in QCD, JHEP 08 (2006) 059 [hep-ph/0604094] [INSPIRE]. [41] R.B. 
Appleby and M.H. Seymour, Nonglobal logarithms in interjet energy flow with kt clustering requirement, JHEP 12 (2002) 063 [hep-ph/0211426] [INSPIRE]. [43] Y. Delenda, R. Appleby, M. Dasgupta and A. Banfi, On QCD resummation with kt [45] J. Bellm et al., HERWIG++ 2.7 release note, arXiv:1310.6877 [INSPIRE]. and LHC underlying event data, JHEP 10 (2013) 113 [arXiv:1307.5015] [INSPIRE]. JHEP 08 (1997) 001 [hep-ph/9707323] [INSPIRE]. scattering, in Monte Carlo generators for HERA physics, Hamburg Germany (1999), pg. 270 72 (2012) 2225 [arXiv:1206.0041] [INSPIRE]. Improved b-jet energy correction for H → b¯b searches at CDF, arXiv:1107.3026 [INSPIRE]. colliders, JHEP 02 (2008) 055 [arXiv:0712.3014] [INSPIRE]. the LHC, JHEP 08 (2010) 029 [arXiv:1005.0417] [INSPIRE]. http://www.hep.ucl.ac.uk/boost2014, University College London, London U.K. August [1] M.H. Seymour , Searches for new particles using cone and cluster jet algorithms: a [2] J.M. Butterworth , B.E. Cox and J.R. Forshaw , W W scattering at the CERN LHC , Phys. [3] J.M. Butterworth , A.R. Davison , M. Rubin and G.P. Salam , Jet substructure as a new Higgs search channel at the LHC , Phys. Rev. Lett . 100 ( 2008 ) 242001 [arXiv:0802.2470] [11] ATLAS collaboration, Search for pair-produced massive coloured scalars in four-jet final [12] CMS collaboration, Search for anomalous tt¯ production in the highly-boosted all-hadronic [13] CMS collaboration, Search for resonant tt¯ production in lepton+jets events in pp collisions [14] CMS collaboration, Search for heavy resonances in the W/Z-tagged dijet mass spectrum in pp collisions at 7 TeV, Phys . Lett . B 723 ( 2013 ) 280 [arXiv:1212. 1910 ] [INSPIRE]. [15] G. Brooijmans , High pT hadronic top quark identification , ATL-PHYS-CONF-2008-008 , [17] D.E. Kaplan , K. Rehermann , M.D. Schwartz and B. Tweedie , Top tagging: a method for identifying boosted hadronically decaying top quarks , Phys. Rev. Lett . 101 ( 2008 ) 142001 [18] T. Plehn , M. Spannowsky , M. 
Takeuchi and D. Zerwas , Stop reconstruction with tagged tops , [19] S.D. Ellis , C.K. Vermilion and J.R. Walsh , Techniques for improved heavy particle searches with jet substructure , Phys. Rev. D 80 ( 2009 ) 051501 [arXiv:0903.5081] [INSPIRE]. [20] S.D. Ellis , C.K. Vermilion and J.R. Walsh , Recombination algorithms and jet substructure: pruning as a tool for heavy particle searches , Phys. Rev. D 81 (2010) 094023 [21] D. Krohn , J. Thaler and L.-T. Wang , Jet trimming, JHEP 02 ( 2010 ) 084 [arXiv:0912.1342] [22] L.G. Almeida , S.J. Lee , G. Perez , G. Sterman and I. Sung , Template overlap method for [23] L.G. Almeida , O. Erdogan , J. Juknevich , S.J. Lee , G. Perez and G. Sterman , Three-particle templates for a boosted Higgs boson , Phys. Rev. D 85 ( 2012 ) 114046 [arXiv:1112. 1957 ] [24] S.D. Ellis , A. Hornig , T.S. Roy , D. Krohn and M.D. Schwartz , Qjets: a non-deterministic approach to tree-based jet substructure , Phys. Rev. Lett . 108 ( 2012 ) 182003 [25] D.E. Soper and M. Spannowsky , Finding physics signals with shower deconstruction , Phys. [28] A. Altheimer et al., Boosted objects and jet substructure at the LHC . Report of BOOST2012 , [29] G.P. Salam , Towards jetography, Eur. Phys. J. C 67 ( 2010 ) 637 [arXiv:0906. 1833 ] [30] A.J. Larkoski , G.P. Salam and J. Thaler , Energy correlation functions for jet substructure , [31] M. Dasgupta , A. Fregoso , S. Marzani and G.P. Salam , Towards an understanding of jet [32] M. Dasgupta , A. Fregoso , S. Marzani and A. Powling , Jet substructure with analytical [33] A.J. Larkoski , S. Marzani , G. Soyez and J. Thaler , Soft drop, JHEP 05 ( 2014 ) 146 [34] M. Rubin , Non-global logarithms in filtered jet algorithms , JHEP 05 ( 2010 ) 005 [35] I. Feige , M.D. Schwartz , I.W. Stewart and J. Thaler , Precision jet substructure from boosted event shapes , Phys. Rev. Lett . 109 ( 2012 ) 092001 [arXiv:1204.3898] [INSPIRE]. [36] A. Banfi and J. 
Cancino , Implications of QCD radiative corrections on high-pT Higgs searches , Phys. Lett . B 718 ( 2012 ) 499 [arXiv:1207.0674] [INSPIRE]. [37] A.J. Larkoski , I. Moult and D. Neill , Building a better boosted top tagger , Phys. Rev. D 91 [38] M. Dasgupta , K. Khelifa-Kerfa , S. Marzani and M. Spannowsky , On jet mass distributions in [39] J.R. Forshaw , A. Kyrieleis and M.H. Seymour , Super-leading logarithms in non-global [40] M. Dasgupta and G.P. Salam , Resummation of nonglobal QCD observables, Phys . Lett . B [42] A. Banfi and M. Dasgupta , Problems in resumming interjet energy flows with kt clustering , [44] M. Bahr et al., HERWIG++ physics and manual, Eur. Phys. J. C 58 ( 2008 ) 639 [47] A. Buckley et al., Rivet user manual , Comput. Phys. Commun . 184 ( 2013 ) 2803 [48] Y.L. Dokshitzer , G.D. Leder , S. Moretti and B.R. Webber , Better jet clustering algorithms , [49] M. Wobisch and T. Wengler , Hadronization corrections to jet cross-sections in deep inelastic [50] M. Cacciari , G.P. Salam and G. Soyez , FastJet user manual , Eur. Phys. J. C 72 ( 2012 ) 1896 [51] S. Gieseke , C. Rohr and A. Siodmok , Colour reconnections in HERWIG++, Eur . Phys . J. C [52] M. Bahr , M. Myska , M.H. Seymour and A. Siodmok , Extracting σeffective from the CDF γ + 3 jets measurement , JHEP 03 ( 2013 ) 129 [arXiv:1302.4325] [INSPIRE]. [53] CDF, D0 collaboration, T. Aaltonen , A. Buzatu , B. Kilminster , Y. Nagai and W. Yao , [54] M. Dasgupta , L. Magnea and G.P. Salam , Non-perturbative QCD effects in jets at hadron [55] G. Soyez , G.P. Salam , J. Kim , S. Dutta and M. Cacciari , Pileup subtraction for jet shapes , Phys. Rev. Lett . 110 ( 2013 ) 162001 [arXiv:1211.2811] [INSPIRE]. [56] D. Krohn , M.D. Schwartz , M. Low and L.-T. Wang , Jet cleansing: pileup removal at high luminosity , Phys. Rev. D 90 ( 2014 ) 065020 [arXiv:1309.4777] [INSPIRE]. [57] M. Cacciari and G.P. Salam , Pileup subtraction using jet areas , Phys. Lett . B 659 ( 2008 ) 119 [58] P. Richardson and D. 
Winn , Investigation of Monte Carlo uncertainties on Higgs boson searches using jet substructure , Eur. Phys. J. C 72 ( 2012 ) 2178 [arXiv:1207.0380] [59] A. Buckley et al., General-purpose event generators for LHC physics, Phys. Rept . 504 ( 2011 ) [60] S.D. Ellis and D.E. Soper , Successive combination jet algorithm for hadron collisions , Phys. [61] J. Thaler and L.-T. Wang , Strategies to identify boosted tops , JHEP 07 ( 2008 ) 092 [62] J. Thaler and K. Van Tilburg , Identifying boosted objects with N -subjettiness , JHEP 03 [63] M. Dasgupta , A. Powling and A. Siodmok , work in progress. [64] D.E. Soper and M. Spannowsky , Combining subjet algorithms to enhance ZH detection at [65] G. Soyez , Theory lessons from LHC Run I, talk delivered at BOOST 2014 , [66] S. Catani and M.H. Seymour , The dipole formalism for the calculation of QCD jet cross-sections at next-to-leading order , Phys. Lett . B 378 ( 1996 ) 287 [hep-ph/9602277] [67] T. Gleisberg et al., Event generation with SHERPA 1 . 1 , JHEP 02 ( 2009 ) 007 [68] T. Sj¨ostrand , S. Mrenna and P.Z. Skands , PYTHIA 6.4 physics and manual , JHEP 05 [69] CDF collaboration, T. Aaltonen et al., Studying the underlying event in Drell-Yan and high transverse momentum jet production at the Tevatron , Phys. Rev. D 82 (2010) 034001


Mrinal Dasgupta, Alexander Powling, Andrzej Siodmok. On jet substructure methods for signal jets, Journal of High Energy Physics, 2015, 79, DOI: 10.1007/JHEP08(2015)079