Qjets: A Non-Deterministic Approach to Tree-Based Jet Substructure
Abstract
Jet substructure is typically studied using clustering algorithms, such as $k_T$, which arrange the jets’ constituents into trees. Instead of considering a single tree per jet, we propose that multiple trees should be considered, weighted by an appropriate metric. Then each jet in each event produces a distribution for an observable, rather than a single value. Advantages of this approach include: 1) observables have significantly increased statistical stability; and 2) new observables, such as the variance of the distribution, provide new handles for signal and background discrimination. For example, we find that employing a set of trees substantially reduces the observed fluctuations in the pruned mass distribution, enhancing the likelihood of new particle discovery for a given integrated luminosity. Furthermore, the resulting pruned mass distributions for (background) QCD jets are found to be substantially wider than those for (signal) jets with intrinsic mass scales, e.g., boosted $W$ jets. A cut on this width yields a substantial enhancement in significance relative to a cut on the standard pruned jet mass alone. In particular, the luminosity needed for a given significance requirement decreases by a factor of two relative to standard pruning.
To develop intuition about high-energy collisions like those at the LHC, it is often helpful to think of an event as being produced by a multi-stage process. In this picture, a short-distance scattering produces a few hard partons. The partons then shower soft and collinear QCD radiation. Finally, at long distances, the (colored) partons bind into the (color-singlet) hadrons that we observe in the detector. This parton-shower picture explains how clusters of nearby final-state particles, called jets, defined by a jet algorithm, can reveal something about the short-distance physics. Simulations of the parton shower produce events which, with sufficient tuning, exhibit remarkable agreement with collider data for nearly any conceivable infrared-safe observable.
If one takes the parton-shower picture literally, the constituents of a jet arise from a shower-like series of splittings producing a “tree” structure. Since the shower model for QCD is dominated by soft and collinear splittings, any deviation from this behavior could indicate the presence of contamination within the jet, or might indicate that the jet is not purely of QCD origin (e.g., it could come from a boosted heavy particle). Thus, by associating trees (by “trees,” we mean “clustering histories”) to jets one can obtain useful information, and indeed this is the basis for much of the work in the field of jet substructure (see Refs. Abdesselam, A. et al. (2011); Almeida et al. (2011); Salam (2010); Altheimer, A. et al. (2012) for a review).
The association of a tree to a jet naturally emerges from the parton-shower picture. In the parton shower, soft and collinear radiation is emitted in a particular sequence: a $p_T$-ordered shower builds a tree by adding on emissions in decreasing order of transverse momentum, while an angular-ordered shower adds emissions in a sequence of decreasing angle. The recombination jet algorithms try to match this behavior. The $k_T$ algorithm Ellis and Soper (1993); Catani et al. (1993) assembles a jet in increasing order of the (relative) $k_T$ metric that depends on both angle and the magnitude of the momentum, and the Cambridge/Aachen (C/A) algorithm Wobisch and Wengler (1998); Dokshitzer et al. (1997) assembles in increasing order of angle. Both can be viewed as a reasonable guess for the showering sequence history.
One problem with thinking of jet algorithms as reversing the parton shower is that the parton shower is not invertible: a given set of four-momenta of final-state particles could have evolved through a multitude of intermediate trees. In this paper we propose a way to account for the non-invertible nature of the parton shower by associating to each jet a set of trees instead of a single tree.
Related ideas have been discussed in the past. Long ago a probabilistic approach was used to improve the behavior of seeded jet algorithms Giele and Glover (1997). More recently, it has been shown that combining even highly correlated observables, such as jet masses arising from different grooming techniques Gallicchio et al. (2011); Cui et al. (2011); Soper and Spannowsky (2010), can improve discovery significance. In addition, Ref. Soper and Spannowsky (2011) considered associating multiple trees to a jet to compare with models of showering in signal and background processes, and Ref. Volobouev (2011) proposed a measure of jet fuzziness to gauge the ambiguity in jet reconstruction. However, our approach is fundamentally different from these previous studies. We are interested in observables constructed from a distribution of trees for each jet in each event. For instance, we will show that by averaging tree-based observables over the trees for each jet, their statistical stability can be substantially improved.
Associating a set of trees to a jet would not be feasible if one had to consider every tree which could be formed from a given set of final-state four-momenta in a jet. Fortunately, good approximations to the distributions that would be obtained using every tree can be captured through a procedure analogous to Monte Carlo integration, allowing us to use a very small fraction of the trees. This is possible because infrared- and collinear-safe jet observables must be insensitive to small reshufflings of the momenta, implying that large classes of trees give very similar information.
The algorithm we propose assembles a tree via a series of mergings:

1. At every stage of clustering, a set of weights $\omega_{ij}$ for all pairs $\langle ij \rangle$ of the four-vectors is computed, and a probability $\Omega_{ij} = \omega_{ij}/N$, where $N = \sum_{\langle ij \rangle} \omega_{ij}$, is assigned to each pair.

2. A random number is generated and used to choose a pair $\langle ij \rangle$ with probability $\Omega_{ij}$. The chosen pair is merged, and the procedure is repeated until all particles are clustered.
This algorithm directly produces trees distributed according to their weight $\prod_{\text{mergings}} \Omega_{ij}$. To produce a distribution of trees for each jet, this algorithm is simply repeated $N_{\text{tree}}$ times (not necessarily yielding distinct trees). Note that any algorithm which modifies a tree during its construction (e.g., jet pruning) can be adapted to work with this procedure, as demonstrated below.
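One particularly interesting class of weights is given by
$$\omega_{ij}^{(\alpha)} \equiv \exp\left\{ -\alpha\, \frac{d_{ij} - d^{\min}}{d^{\min}} \right\} \tag{1}$$
with $\alpha$ a real number we call rigidity. Here, $d_{ij}$ is the jet distance measure for the $\langle ij \rangle$ pair, e.g.,
$$d_{ij} = d_{k_T} \equiv \min\{p_{Ti}^2,\, p_{Tj}^2\}\, \Delta R_{ij}^2 \quad \text{or} \quad d_{ij} = d_{\text{C/A}} \equiv \Delta R_{ij}^2 \tag{2}$$
where $\Delta R_{ij}^2 = (y_i - y_j)^2 + (\phi_i - \phi_j)^2$, and $d^{\min}$ is the minimum over all pairs at this stage in the clustering. Note that with this metric, our algorithm reduces to a traditional clustering algorithm when $\alpha \to \infty$, i.e., in that limit the minimal $d_{ij}$ is always chosen. In this sense, it is helpful to think of the traditional, single-tree algorithm as the “classical” approach, with $\alpha$ controlling the deviation from the “classical” clustering behavior. With this analogy, we call the trees constructed in this non-deterministic fashion Qjets (“quantum” jets).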
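The merging procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the FastJet plugin: it assumes the weight takes the exponential form $\exp[-\alpha (d_{ij} - d^{\min})/d^{\min}]$, uses plain four-vector addition for the recombination, and the names `make_particle`, `pair_distance`, and `qjets_tree` are hypothetical.

```python
import math
import random

def make_particle(pt, y, phi):
    """Massless four-vector (px, py, pz, E) from pT, rapidity and azimuth."""
    return (pt * math.cos(phi), pt * math.sin(phi),
            pt * math.sinh(y), pt * math.cosh(y))

def pair_distance(a, b, measure="CA"):
    """Pair distance d_ij: Delta R^2 for C/A, min(pT^2) * Delta R^2 for kT."""
    y_a = 0.5 * math.log((a[3] + a[2]) / (a[3] - a[2]))
    y_b = 0.5 * math.log((b[3] + b[2]) / (b[3] - b[2]))
    dphi = abs(math.atan2(a[1], a[0]) - math.atan2(b[1], b[0]))
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    d = (y_a - y_b) ** 2 + dphi ** 2
    if measure == "kT":
        d *= min(math.hypot(a[0], a[1]), math.hypot(b[0], b[1])) ** 2
    return d

def qjets_tree(particles, alpha, measure="CA", rng=random):
    """Build one tree; return (jet four-vector, list of merged pairs)."""
    ps = list(particles)
    history = []
    while len(ps) > 1:
        pairs = [(i, j) for i in range(len(ps)) for j in range(i + 1, len(ps))]
        dists = [pair_distance(ps[i], ps[j], measure) for i, j in pairs]
        d_min = min(dists)
        if math.isinf(alpha) or d_min == 0.0:
            # Classical limit: always merge the closest pair.
            k = dists.index(d_min)
        else:
            # Assumed weight form: exp(-alpha * (d_ij - d_min) / d_min).
            weights = [math.exp(-alpha * (d - d_min) / d_min) for d in dists]
            # random.choices normalises the weights into probabilities.
            k = rng.choices(range(len(pairs)), weights=weights)[0]
        i, j = pairs[k]
        merged = tuple(u + v for u, v in zip(ps[i], ps[j]))
        history.append((ps[i], ps[j]))
        ps = [p for n, p in enumerate(ps) if n not in (i, j)] + [merged]
    return ps[0], history
```

Calling `qjets_tree` repeatedly with a finite `alpha` yields a sample of trees for the same jet, while `alpha=math.inf` recovers the single classical tree. The brute-force pair loop is quadratic per step, which is acceptable for a toy but not for production use.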
In order to get the most information out of the Qjets, it is logical to consider observables which are sensitive to the ordering of the clusterings in the tree. One such observable is the pruned jet mass, which we will use as our illustrative example. As described in Refs. Ellis et al. (2010a); Ellis et al. (2009), pruning is one of the jet grooming tools Butterworth et al. (2008); Kaplan et al. (2008); Krohn et al. (2010). It is used to sharpen signal and reduce background when considering boosted heavy objects. The basic idea is to move along the tree and try to discard radiation which is soft and not collinear, and therefore likely to represent contamination from a part of the event in which we are not particularly interested (like the underlying event). In detail, if a step in the clustering would merge particles $i$ and $j$ which satisfy
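$$z_{ij} \equiv \frac{\min(p_{Ti},\, p_{Tj})}{|\vec{p}_{Ti} + \vec{p}_{Tj}|} < z_{\text{cut}} \quad \text{and} \quad \Delta R_{ij} > D_{\text{cut}} \tag{3}$$
then the merging is vetoed and the softer of the two four-momenta is discarded. In the specific analysis described here we take $z_{\text{cut}} = 0.1$ and $D_{\text{cut}} = m_{\text{jet}}/p_{T,\text{jet}}$, which are typical cuts for the C/A algorithm.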
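The veto can be written as a small check applied before each merging. The sketch below assumes the usual pruning condition (the softer branch carries a momentum fraction below a threshold and the pair is separated by more than an angular cut); the function names and the `(pt, y, phi)` tuple layout are hypothetical choices for illustration.

```python
import math

def pruning_veto(a, b, z_cut, d_cut):
    """True if merging a and b should be vetoed; a, b are (pt, y, phi)."""
    pt_a, y_a, phi_a = a
    pt_b, y_b, phi_b = b
    # Transverse momentum of the would-be merged object (vector sum).
    px = pt_a * math.cos(phi_a) + pt_b * math.cos(phi_b)
    py = pt_a * math.sin(phi_a) + pt_b * math.sin(phi_b)
    z = min(pt_a, pt_b) / math.hypot(px, py)
    # Azimuthal separation folded into [0, pi].
    dphi = abs(phi_a - phi_b) % (2.0 * math.pi)
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    delta_r = math.hypot(y_a - y_b, dphi)
    return z < z_cut and delta_r > d_cut

def prune_step(a, b, z_cut, d_cut):
    """Return the branches that survive: both, or only the harder one."""
    if pruning_veto(a, b, z_cut, d_cut):
        return [a if a[0] >= b[0] else b]
    return [a, b]
```

In a Qjets tree this check would be applied to the randomly chosen pair at each step, so different trees can prune away different constituents and hence give different pruned masses.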
We apply this pruned Qjets procedure to samples of simulated boosted $W$ (signal) and QCD (background) jets generated with Pythia v6.422 Sjostrand et al. (2006) with $p_T$-ordered showers using the Perugia 2011 tunes Skands (2010) and assuming a LHC. In lieu of detector simulation we group the visible output of Pythia into massless “calorimeter cells” (with ), preserving the energy and the direction to the cell. The cells with energy bigger than become the inputs to the initial jet-finding algorithm (small alterations to this cut have no appreciable impact on our results). To find the initial jets we use the anti-$k_T$ algorithm Cacciari et al. (2008) with as implemented in FastJet v2.4.2 Cacciari et al.; Cacciari and Salam (2006); Cacciari et al. (2011) and require . Once a jet is identified, the cells clustered in the jet become input to the Qjet-pruning algorithm. A FastJet plugin with this implementation of Qjets is available at http://jets.physics.harvard.edu/Qjets.
Consider first a single QCD jet from the sample described above. Fig. 1 exhibits the pruned mass distribution for this jet obtained with the classical procedure for both $k_T$ and C/A pruning (the 2 vertical lines) and with Qjets using both the $k_T$ and C/A metrics for $d_{ij}$ in Eq. (1). The curves illustrate the dependence on the form of $d_{ij}$, as well as on the value of the rigidity parameter $\alpha$. The upper panel is for where the trees are confined to stay close to the classical tree and the pruned masses likewise stay near the corresponding classical result. For small enough $\alpha$ (say, ), a broad spectrum of trees is sampled. This is shown in the lower panel of Fig. 1 for , where the distributions generated with the $k_T$ and C/A definitions of the distance look similar, and have little correspondence with the classical results. This suggests that for a small enough rigidity parameter pruned Qjets become independent of the choice of distance measure used; they are therefore more likely to be characterizing physical features of an event rather than artifacts of using a particular jet algorithm.
We will now discuss two fundamentally different ways in which the discovery potential (e.g., for finding boosted $W$ jets on top of their QCD background) can be enhanced using Qjets:

1. Observables have smaller statistical variation. Even for the same number of background jets, the use of Qjets reduces the background fluctuations and increases the discovery potential $S/\delta B$, where $S$ and $B$ are the numbers of signal and background jets in the signal window and $\delta B$ denotes the fluctuation in $B$.

2. Qualitatively new observables, which depend on there being a distribution of trees for each jet, can now be considered. For example, we define below a powerful observable we call volatility, which measures the width of the pruned Qjet mass distribution for each jet, something inaccessible to a classical jet algorithm.
To quantify the first of these points, we consider a large number of pseudo-experiments, each of which analyses $N_{\text{jets}}$ jets, with $N_{\text{jets}}$ taken from a Poisson distribution with mean $\bar{N}_{\text{jets}}$. With a classical jet algorithm we can extract a significance by counting, in each pseudo-experiment, the numbers $S$ and $B$ of $W$ jets or QCD jets, respectively, with pruned mass in a signal window, say between . The significance is then given by $\langle S \rangle/\delta B$, where $\langle S \rangle$ is the average over the pseudo-experiments of the number of signal events in the window and $\delta B$ is the RMS fluctuation of $B$ over those pseudo-experiments. As expected, $\langle S \rangle$ and $\langle B \rangle$ are proportional to $\bar{N}_{\text{jets}}$, while $\delta S$ and $\delta B$ vary with $\sqrt{\bar{N}_{\text{jets}}}$. In addition to looking at $\langle S \rangle/\delta B$, we can also look at the RMS fluctuations in the average pruned Qjet mass of the signal jets, $\delta\langle m \rangle$, averaged over the signal jets in the signal window for each pseudo-experiment. This tells us the statistical uncertainty with which the mass could be measured from these events.
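The counting procedure can be illustrated with a toy pseudo-experiment loop. Everything numerical here is a hypothetical stand-in: the jets are not simulated, and instead each signal or background jet lands in the window with an assumed probability (`p_sig`, `p_bkg`); the signal and background jet counts are drawn as independent Poisson numbers, which is one possible reading of the setup described above.

```python
import math
import random
import statistics

def poisson(mean, rng):
    """Knuth's multiplication method; adequate for modest means."""
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def significance(mean_sig, mean_bkg, p_sig, p_bkg, n_pseudo, rng):
    """Average signal count in the window over the RMS of the background count."""
    s_counts, b_counts = [], []
    for _ in range(n_pseudo):
        n_s = poisson(mean_sig, rng)  # signal jets in this pseudo-experiment
        n_b = poisson(mean_bkg, rng)  # background jets in this pseudo-experiment
        # Classical counting: each jet contributes 1 (in window) or 0.
        s_counts.append(sum(rng.random() < p_sig for _ in range(n_s)))
        b_counts.append(sum(rng.random() < p_bkg for _ in range(n_b)))
    return statistics.mean(s_counts) / statistics.pstdev(b_counts)
```

In this toy, quadrupling both mean jet counts should roughly double the returned significance, which is the square-root scaling with sample size that makes the squared significance ratio comparable to a luminosity ratio.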
With Qjets, we can do something more sophisticated. Instead of the contribution of a given jet to $S$ or $B$ being 1 or 0 depending on whether the pruned mass is in the signal window or not, the contribution of the jet is now a rational number between 0 and 1, given by the fraction of the pruned masses that fall in the signal mass window. This is a way of reducing the contribution from events which are less signal-like, without discarding them completely. In the limit $\alpha \to \infty$, this reduces to the classical measure, but for finite $\alpha$, we expect an improvement in both significance and in $\delta\langle m \rangle$.
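As a minimal sketch of this fractional counting, the function below takes the list of pruned masses obtained from the trees of one jet and returns that jet's contribution. The function names and the window edges used in the example are illustrative assumptions, not values taken from the analysis.

```python
def window_fraction(pruned_masses, m_lo, m_hi):
    """Fraction of a jet's pruned masses that fall inside [m_lo, m_hi]."""
    inside = sum(1 for m in pruned_masses if m_lo <= m <= m_hi)
    return inside / len(pruned_masses)

def weighted_count(jets, m_lo, m_hi):
    """Sum of per-jet fractions; replaces the classical 0-or-1 count."""
    return sum(window_fraction(masses, m_lo, m_hi) for masses in jets)

# Two hypothetical jets, four trees each, with an illustrative window.
jets = [[78.0, 81.0, 95.0, 60.0], [82.0, 79.0, 80.0, 83.0]]
total = weighted_count(jets, 70.0, 90.0)  # 0.5 + 1.0
```

With a single tree per jet the fraction is either 0 or 1, so the same function reproduces classical counting.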
Vol. cut    Rigidity $\alpha$

Discovery potential $S/\delta B$ (Qjets over classical)
None        1.07(1)   1.13(1)   1.18(1)   1.14(1)   1.06(1)
0.05        1.43(4)   1.44(3)   1.39(3)   1.27(1)   1.08(1)
0.04        1.51(4)   1.45(4)   1.39(3)   1.29(3)   1.10(1)
0.03        1.51(2)   1.45(3)   1.37(4)   1.35(2)   1.10(1)
0.02        1.28(5)   1.24(3)   1.28(3)   1.36(3)   1.13(1)

Mass fluctuation $\delta\langle m \rangle$ (classical over Qjets)
None        1.32(2)   1.31(2)   1.25(2)   1.10(2)   1.03(1)
0.05        0.80(1)   0.80(1)   0.81(1)   0.96(1)   1.01(1)
0.04        0.62(3)   0.69(3)   0.71(2)   0.93(1)   1.00(1)
0.03        0.56(4)   0.57(5)   0.60(4)   0.87(1)   0.98(1)
0.02        0.48(7)   0.49(7)   0.50(7)   0.77(2)   0.95(1)
For numerical analysis we use the C/A algorithm for both the classical and Qjets cases and take . (We find that the results saturate for .) We present results in Table 1 as ratios of the Qjets result to the classical result, indicating the improvement in significance and mass uncertainty we can expect. These ratios should be independent of and so we determine statistical uncertainties by fitting to results for and . The approximate statistical uncertainties are shown in parentheses and apply to the last digit. We performed pseudo-experiments, expecting statistical fluctuations from this procedure.
The first set of rows in Table 1 displays measurements of the discovery potential $S/\delta B$ compared to the results with classical pruning. Focus on the rows labeled “None” for now (volatility is explained below). Since this quantity scales as the square root of the luminosity, the square of the number in the table can be interpreted as an effective luminosity improvement due to employing the Qjet procedure. For example, the largest entry in the first “None” row, 1.18, corresponds to an effective increase in the luminosity by a factor of about 1.39. Larger $\alpha$ values confine the range of trees and yield results very near the classical pruning results, i.e., ratios near unity. Smaller $\alpha$ values (with a much broader range of trees) also tend to degrade (decrease) the discovery potential.
The second set of rows exhibits the average jet mass fluctuation $\delta\langle m \rangle$ (note classical over Qjets here). Values greater than unity mean that the mass can be measured more precisely with the Qjet procedure for the same luminosity. Note that there is continuing improvement in $\delta\langle m \rangle$ as $\alpha$ decreases. That we get sensible results for $\alpha = 0$ (i.e., with a flat distance measure) is presumably because pruning is relatively insensitive to which tree we assign; even for physically unlikely clusterings, the hard radiation that reconstructs the mass is typically not pruned away.
The second way we have considered using Qjets is in constructing qualitatively new types of observables. As an example, consider the volatility of a jet, defined by
$$\mathcal{V} = \Gamma / \langle m \rangle \tag{4}$$
where $\Gamma$ and $\langle m \rangle$ are the RMS deviation and the mean of the pruned jet mass distribution for a single jet. The distribution of volatility for signal and background Qjets with is shown in Fig. 2. We see that $W$ jets have a lower volatility than QCD jets. This is easily understood, since the $W$ jets have an intrinsic physical mass scale, while the QCD jets do not. Cutting on volatility can therefore improve significance in a boosted $W$ search. The improvement is given in Table 1 for various values of the volatility cut.
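Computing the volatility of one jet from its sampled pruned masses is straightforward. The sketch below assumes the definition is the RMS deviation divided by the mean, as the description of the two quantities suggests; the function name and the sample masses are hypothetical.

```python
import statistics

def volatility(pruned_masses):
    """RMS deviation of a jet's pruned masses divided by their mean."""
    mean = statistics.mean(pruned_masses)
    # pstdev is the population standard deviation, i.e. the RMS deviation.
    return statistics.pstdev(pruned_masses) / mean

# A jet whose trees agree on the mass is less volatile than one whose
# trees scatter widely.
narrow = volatility([79.0, 80.0, 81.0, 80.0])
broad = volatility([40.0, 120.0, 60.0, 100.0])
```

A volatility cut then keeps only jets whose value lies below a chosen threshold, which is meaningful only when several trees per jet are available; a single classical tree always gives zero.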
The efficiencies for a volatility cut on signal and background are shown in Fig. 3. These efficiencies are defined as the fraction of the Qjets that yield a pruned mass in the mass bin after the volatility cut. We plot them normalized to the classical results ($\alpha \to \infty$ with no volatility cut). In the limit $\alpha \to \infty$ the curve collapses to the point (1,1). The upper right region of the plot corresponds to large values of the volatility cut, i.e., effectively no volatility cut. We find that the largest signal significance is obtained for a volatility cut of approximately , where for $\alpha$ near zero we achieve a relative of and a relative improvement of (the square of this number is the factor of two quoted in the Abstract). This corresponds to the neighborhood of the point in Fig. 3. Finally, we note that the precision of the mass measurement, shown in the lower rows in the table, is somewhat degraded by placing a cut on the volatility. This should not be a surprise, as the cut discards some of the signal jets. A more comprehensive discussion of the statistics and of volatility will be given in Ellis et al. (in preparation).
In this paper, we have shown that it can be advantageous to consider a large number of trees constructed from the same jet in a single event, rather than a single tree as is done in traditional clustering algorithms. Although this paper has focused on tree-based observables, the Qjets idea, of using non-determinism in event analysis, can naturally be applied in many other ways. Indeed, most observables, including jet substructure observables, such as jet masses, moments, pull Gallicchio and Schwartz (2010), jet shapes Ellis et al. (2010b); Gallicchio and Schwartz (2011), etc., as well as more global observables, such as the number, distribution and 4-momenta for the jets in an event, work by trying to make the best guess at which properties of which final-state particles tell us the most about the underlying physics. The basic idea for Qjets is that there is an inherent ambiguity in this best guess, both due to there not being a precise correspondence between final-state particles and underlying physics, and due to our poor ability to extract that correspondence even if it were well-defined (as in a color-singlet decay, for example). Thus, it would be natural to consider multiple interpretations of any observable, to see whether getting away from the best guess can give us more robust information about the underlying physics, as it has with the tree-based substructure considered here. It will be interesting to see in future work how far this non-deterministic approach can be pushed.
SDE, AH, and TSR were supported in part by the US Department of Energy under contract number DE-FG02-96ER40956. MDS was supported in part by the Department of Energy, under grant DE-SC003916. DK was supported in part by a Simons postdoctoral fellowship and by an LHC-TI travel grant. AH, DK, and TSR were supported in part by the KITP, where a portion of this work was completed, under National Science Foundation Grant No. PHY05-51164. Some computations were performed on the Odyssey cluster at Harvard University.
References
 Abdesselam, A. et al. (2011) Abdesselam, A. et al., Eur.Phys.J. C71, 1661 (2011), arXiv:1012.5412 [hep-ph].
 Almeida et al. (2011) L. G. Almeida, R. Alon, and M. Spannowsky, (2011), arXiv:1110.3684 [hep-ph].
 Salam (2010) G. P. Salam, Eur.Phys.J. C67, 637 (2010), arXiv:0906.1833 [hep-ph].
 Altheimer, A. et al. (2012) Altheimer, A. et al., ArXiv e-prints (2012), arXiv:1201.0008 [hep-ph].
 Ellis and Soper (1993) S. D. Ellis and D. E. Soper, Phys.Rev. D48, 3160 (1993), arXiv:hep-ph/9305266 [hep-ph].
 Catani et al. (1993) S. Catani, Y. L. Dokshitzer, M. H. Seymour, and B. R. Webber, Nucl.Phys. B406, 187 (1993).
 Wobisch and Wengler (1998) M. Wobisch and T. Wengler, (1998), arXiv:hep-ph/9907280 [hep-ph].
 Dokshitzer et al. (1997) Y. L. Dokshitzer, G. D. Leder, S. Moretti, and B. R. Webber, JHEP 08, 001 (1997), arXiv:hep-ph/9707323.
 Giele and Glover (1997) W. Giele and E. Glover, (1997), arXiv:hep-ph/9712355 [hep-ph].
 Gallicchio et al. (2011) J. Gallicchio, J. Huth, M. Kagan, M. D. Schwartz, K. Black, et al., JHEP 1104, 069 (2011), arXiv:1010.3698 [hep-ph].
 Cui et al. (2011) Y. Cui, Z. Han, and M. D. Schwartz, Phys.Rev. D83, 074023 (2011), arXiv:1012.2077 [hep-ph].
 Soper and Spannowsky (2010) D. E. Soper and M. Spannowsky, JHEP 1008, 029 (2010), arXiv:1005.0417 [hep-ph].
 Soper and Spannowsky (2011) D. E. Soper and M. Spannowsky, (2011), arXiv:1102.3480 [hep-ph].
 Volobouev (2011) I. Volobouev, J.Phys.Conf.Ser. 293, 012028 (2011).
 Ellis et al. (2010a) S. D. Ellis, C. K. Vermilion, and J. R. Walsh, Phys.Rev. D81, 094023 (2010a), arXiv:0912.0033 [hep-ph].
 Ellis et al. (2009) S. D. Ellis, C. K. Vermilion, and J. R. Walsh, Phys.Rev. D80, 051501 (2009), arXiv:0903.5081 [hep-ph].
 Butterworth et al. (2008) J. M. Butterworth, A. R. Davison, M. Rubin, and G. P. Salam, Phys.Rev.Lett. 100, 242001 (2008), arXiv:0802.2470 [hep-ph].
 Kaplan et al. (2008) D. E. Kaplan, K. Rehermann, M. D. Schwartz, and B. Tweedie, Phys.Rev.Lett. 101, 142001 (2008), arXiv:0806.0848 [hep-ph].
 Krohn et al. (2010) D. Krohn, J. Thaler, and L.-T. Wang, JHEP 1002, 084 (2010), arXiv:0912.1342 [hep-ph].
 Sjostrand et al. (2006) T. Sjostrand, S. Mrenna, and P. Z. Skands, JHEP 0605, 026 (2006), arXiv:hep-ph/0603175 [hep-ph].
 Skands (2010) P. Z. Skands, Phys.Rev. D82, 074018 (2010), arXiv:1005.3457 [hep-ph].
 Cacciari et al. (2008) M. Cacciari, G. P. Salam, and G. Soyez, JHEP 0804, 063 (2008), arXiv:0802.1189 [hep-ph].
 (23) M. Cacciari, G. Salam, and G. Soyez, “FastJet,” http://fastjet.fr/.
 Cacciari and Salam (2006) M. Cacciari and G. P. Salam, Phys.Lett. B641, 57 (2006), arXiv:hep-ph/0512210 [hep-ph].
 Cacciari et al. (2011) M. Cacciari, G. P. Salam, and G. Soyez, (2011), arXiv:1111.6097 [hep-ph].
 (26) S. D. Ellis, A. Hornig, D. Krohn, T. S. Roy, and M. D. Schwartz, in preparation.
 Gallicchio and Schwartz (2010) J. Gallicchio and M. D. Schwartz, Phys.Rev.Lett. 105, 022001 (2010), arXiv:1001.5027 [hep-ph].
 Ellis et al. (2010b) S. D. Ellis, C. K. Vermilion, J. R. Walsh, A. Hornig, and C. Lee, JHEP 1011, 101 (2010b), arXiv:1001.0014 [hep-ph].
 Gallicchio and Schwartz (2011) J. Gallicchio and M. D. Schwartz, Phys.Rev.Lett. 107, 172001 (2011), arXiv:1106.3076 [hep-ph].