Recursive generation of one-loop amplitudes in the Standard Model

We introduce the computer code Recola for the recursive generation of tree-level and one-loop amplitudes in the Standard Model. Tree-level amplitudes are constructed using off-shell currents instead of Feynman diagrams as basic building blocks. One-loop amplitudes are represented as linear combinations of tensor integrals whose coefficients are calculated similarly to the tree-level amplitudes by recursive construction of loop off-shell currents. We introduce a novel algorithm for the treatment of colour, assigning a colour structure to each off-shell current which enables us to recursively construct the colour structure of the amplitude efficiently. Recola is interfaced with a tensor-integral library and provides complete one-loop Standard Model amplitudes including rational terms and counterterms. As a first application we consider Z+2jets production at the LHC and calculate with Recola the next-to-leading-order electroweak corrections to the dominant partonic channels.


Introduction
The study of the mechanism of electroweak (EW) symmetry breaking and the search for physics beyond the Standard Model (SM) is the primary goal of the Large Hadron Collider (LHC) and the corresponding experiments. With the discovery of a bosonic resonance with a mass of around 125 GeV important progress has been achieved. Still it remains an open question if this resonance is the SM Higgs boson and if there are phenomena of new physics at the TeV scale. Evidence for a discovery of new particles and the precise determination of their masses and couplings on the one hand as well as the establishment of exclusion limits on the other hand are achieved by sophisticated experimental analyses, capable of highlighting a small signal on a huge background. For the interpretation of the data a precise knowledge of the background is essential. This often relies on theoretical descriptions, sometimes also on data-driven estimations where the extrapolation to the signal region is based on theoretical distributions. A sound comparison of experimental signals with theoretical predictions allowing precise tests of the SM (or of theories beyond) requires high precision from both experiment and theory.
Theoretical predictions at leading order (LO) in perturbation theory are usually insufficient to match the experimental precision. At a hadron collider QCD corrections are indispensable, but also EW corrections can have an important impact. For instance, for Higgs-boson production in vector-boson fusion, EW and QCD corrections are of the same order of magnitude [1]. Moreover, the high energies attained by the LHC allow to collect data in phase-space regions where the effects of logarithms of EW origin become sizeable. The high centre-of-mass energies available at the LHC generate a lot of events with many particles in the final state. Therefore a proper theoretical description of LHC physics requires next-to-leading-order (NLO) computations of multiparticle processes (with five, six, or more external legs) in the full SM (including EW corrections).
In the past years many groups have concentrated their efforts to make such calculations feasible. New techniques have been proposed, mainly for the computation of one-loop virtual corrections which are considered the bottleneck of NLO calculations. In the standard approach based on Feynman diagrams the major problem is caused by huge algebraic expressions appearing in the computation of the virtual amplitudes. The development of techniques based on Generalised Unitarity [2][3][4][5][6][7][8][9] allowed a change of perspective. The formal starting point of these methods is the general decomposition of one-loop amplitudes as linear combinations of scalar integrals, as obtained from the standard Passarino-Veltman reduction [10]. The computation of the coefficients of the scalar functions is then reduced to the calculation of tree-level amplitudes by means of cutting equations. The simplicity of these relations allowed the automation of NLO QCD computations leading to the development of computer programs [11][12][13][14][15][16][17][18] which enabled the calculations of many QCD processes [19][20][21][22][23][24][25][26][27][28][29][30][31][32][33][34][35][36] of the Les Houches priority list [37].
More recently, based on ideas of Soper [38] purely numerical methods for the calculation of one-loop QCD amplitudes have been put forward [39]. They are based on subtracting the soft, collinear, and ultraviolet divergences of one-loop amplitudes and performing the loop integration of the remaining finite integrals numerically after suitably deforming the integration contours in the complex space. These methods do not rely on Feynman graphs and have been proven to work for multi-parton amplitudes [40].
The success of the new methods did, however, not supersede the traditional diagrammatic approach. The Generalised Unitarity methods still suffer from the numerical instabilities characteristic of the Passarino-Veltman reduction. These are overcome by computing the amplitude in quadruple or multiple precision at critical phase-space points, reducing however the CPU efficiency [41,42]. While rescue solutions in this context have been proposed in Refs. [43,44], it is well-known that numerical instabilities can be avoided by constructing the amplitude as a linear combination of tensor integrals, which is the natural representation in the Feynman-diagrammatic approach. Various groups [18,[45][46][47][48] have developed efficient computational techniques for the calculation of the tensor integrals that avoid numerical instabilities. Using these improved methods in the Feynman-diagrammatic approach yields more than competitive numerical codes. In particular the methods of Refs. [46,48] have been successfully applied to the calculation of both EW corrections [1,[49][50][51][52] and QCD corrections [53][54][55][56].
Recently the diagrammatic approach has been further boosted by the advent of the Open-Loops algorithm [57]. Organising the diagrams into cut-opened topologies, the coefficients of the tensor integrals are recursively built with tree-level-like techniques; the efficiency is increased by pinch identities that relate higher-point loop diagrams to pre-computed lower-point diagrams.
Since the method is based on individual topologies, colour factorises and is treated algebraically. The present implementation of OpenLoops can handle NLO QCD corrections to any Standard Model process.
An interesting hybrid method, proposed by Andreas van Hameren in Ref. [58] for evaluating one-loop gluonic amplitudes, combines stable and universal tensor-reduction methods for loop integrals with a basic result of Generalised Unitarity, namely the reduction of the computation of a one-loop amplitude to that of a set of tree-level amplitudes. The technique relies on the representation of the amplitude in terms of tensor integrals, whose coefficients are computed recursively [59] without resorting to Feynman diagrams at any stage.
In this paper we present Recola, a generator of SM one-loop (and tree) amplitudes. It is based on an algorithm which implements recursion relations for the computation of the coefficients of the tensor integrals. The goal is to combine the efficiency of the numerically stable tensor-integral reduction with the automation made possible by a completely recursive nondiagrammatic approach.
As a first application of Recola we have calculated EW corrections to pp → Z + 2 jets. Due to its large cross section and similar signatures this process provides a major background for Higgs-boson production in vector-boson fusion kinematics. The dominant NLO QCD corrections of O(α 3 s α) have been investigated in Refs. [60,61], while a subset of EW Z + 2 jets production has been studied at NLO QCD in Ref. [62]. Here, we calculate the EW corrections of O(α 2 s α 2 ) to the dominant partonic channels for Z + 2 jets production. This paper is organised as follows: in Section 2 we present the algorithm for the construction of amplitudes, introducing first our implementation of recursion relations at tree level (Section 2.1) and discussing in Section 2.2 its generalisation to one-loop amplitudes. After some remarks on the computation of the rational terms and counterterms (Section 2.3), we describe in Section 2.4 a new algorithm for the treatment of colour. The setup of our calculation for Z+2 jets production is detailed in Section 3.1, and numerical results are discussed in Section 3.2. Finally, Section 4 contains our conclusions.

Recola: REcursive Computation of One-Loop Amplitudes
Recola is a code written in Fortran90 for the computation of tree and one-loop scattering amplitudes in the SM, based on recursion relations.
Let us consider a process with E external legs, and select a particle P of the model 1 together with a sub-set of n external legs. The off-shell current w(P, {n}) is then defined as the sum of all Feynman sub-graphs which generate P combining the selected n external particles: Here the shaded bubble pictorially represents all possible sub-graphs, and the dot indicates the off-shell part of the current. If according to the Feynman rules of the theory a sub-set of n external particles cannot generate P , the corresponding current vanishes. For n > 1 the propagator of P is included in the definition and the current is called "internal". A current generated by only one (n = 1) external particle P ′ is called "external"; if P = P ′ , it coincides with the wave function of the particle, otherwise it vanishes. In a theory with tri-and quadri-linear couplings only, the internal off-shell currents can be constructed using recursively the Dyson-Schwinger equations: Each term of the sums represents a "branch", which is obtained by multiplication of the generating currents w(P i , {i}), w(P j , {j}) [and w(P k , {k}) for the second term of (2. 2)] with the interaction vertex, marked by the small box, and the propagator of P , marked by the thick line. Branches have to be built for all possible generating currents formed by sub-sets {i}, {j} (and {k}) of the n external particles such that i + j (+k) = n. The recursive construction of the amplitude can be efficiently implemented using lower-multiplicity off-shell currents as seeds for the numerical evaluation of higher-multiplicity ones. First, all possible currents with two external legs (n = 2) are calculated combining through a tri-linear coupling pairs of external currents only. Next the currents with three external particles (n = 3) are generated summing the branches with three external currents linked through a quadri-linear coupling and those with one external current and one of the calculated internal currents with two external legs, linked through a tri-linear coupling. Analogously the currents with n = 4 are then computed combining two or three currents with n = 1, n = 2 and n = 3, and so on. As a consequence of the summation in (2.2), the current w(P, {n}) depends on the particle P and on the set of generating external particles {n} but not on the particular way these particles have been combined in order to get w(P, {n}). In this respect, working with off-shell currents rather than Feynman diagrams allows to avoid recomputing identical sub-graphs contributing to different diagrams, since each current is computed just once; furthermore, the summation in (2.2) reduces the number of generated objects which are passed to the next step of the recursion. In order to calculate the amplitude for a tree-level process A → B we first use crossing symmetry to switch to incoming particles and consider the corresponding process A + B → 0, where B is charge conjugate to B. Next we choose one of the external legs, for definiteness the E th leg, to close the construction of the amplitude. Then all currents resulting from the other E − 1 external legs are recursively constructed. In the last step of the procedure, i.e. for n = E − 1, we require the generated particle P to be the selected E th external particle. As a consequence this last current is unique, and the amplitude M is obtained multiplying with the inverse of its propagator and with the wave function of the selected E th particle (which coincides with the E th external current): The recursive evaluation of the currents begins with the external currents (n = 1), which, for colourless particles of a given polarisation λ and momentum p, are given by the corresponding wave functions: The explicit expressions for the spinors u λ (p),ū λ (p), v λ (p),v λ (p) of fermions and for the polarisation vectors ǫ λ (p), ǫ * λ (p) of vector bosons have been coded using Ref. [69]. For coloured external particles also the information on colour has to be kept, as we explain in Section 2.4.
Once the external currents have been numerically evaluated, the internal ones are built using the Feynman rules of the theory. For example, given a pair of external e − and e + with momenta p 1 and p 2 and currents u λ 1 (p 1 ) andv λ 2 (p 2 ), one can generate the internal current of a photon contracting the two external currents with the QED vertex −i e γ µ and the photon propagator −i g µν /(p 1 + p 2 ) 2 . Once the particle P of the off-shell propagator is fixed, in general, several branches contribute to the same internal current since different combinations of particles with appropriate interaction vertices can generate the same particle P . In these cases the contributions can be simply summed up, according to (2.2).
The recursive algorithm is implemented in the code Recola through two steps.
In the first part, the initialisation phase, the currents are identified by integer numbers which contain all the relevant non-dynamical information: the particle content (e − , e + , γ, etc.), the colour information (see Section 2.4) and an integer tag number. The tag number is assigned according to a binary notation [66,70]. In practice, the external currents get a label 2 i−1 , where i is the i th external leg, i.e. 1 → 1, 2 → 2, 3 → 4, . . . , E → 2 E−1 . The tag number of an internal current is obtained by a summation of the integer tags of the external currents contributing to it. The binary notation ensures that, given the tag number of any internal current, the contributing external legs can be uniquely identified. Moreover, it reflects the basic property that a current depends on the generating external particles {n}, but not on the particular way these particles have been combined in order to obtain the current.
The initialisation part of the code builds a skeleton of the amplitude: all needed off-shell currents are enumerated and, for each branch, all generating off-shell currents and the generated one are identified. This part is run once for all, before giving explicit values to the momenta of the external particles, i.e. before performing the phase-space integration in a Monte Carlo program.
The second part of the code, the dynamical production phase, uses the results of the first part to actually compute the amplitude for each point of the phase-space. First, the external currents are numerically computed; then, branch after branch, all internal currents are recursively evaluated according to the skeleton generated in the initialisation phase. Here the code has to be as efficient as possible in terms of CPU time because it must run on a large grid of points in phase-space; the computation of all non-dynamical quantities in the initialisation phase allows to avoid a repetition of those operations which can be performed once independently of the particular values of the external momenta.

One-loop recursion relations
Let us now move to the one-loop case. After summing all contributing Feynman graphs G i every one-loop amplitude can be written as a linear combination of tensor integrals: Here the tensor coefficients c (j,r j ,N j ) µ 1 ···µr j do not depend on the loop momentum q, which is present only in the tensor integrals The index j classifies the different tensor integrals needed for the process, the integer N j equals the number of loop propagators, and r j ( ≤ N j in the 't Hooft-Feynman gauge) is the rank of the tensor integral.
Leaving aside the computation of the tensor integrals, to be performed with the preferred technique, we focus here on the tensor coefficients c (j,r j ,N j ) µ 1 ···µr j , which for multi-leg processes, due to the complexity of the SM, result in long algebraic expressions in the standard Feynmandiagram approach. An interesting idea has been proposed in Ref. [58], where recursive relations for the tensor coefficients of gluon amplitudes have been derived for colour-ordered amplitudes of purely gluonic processes. We have further developed this approach to deal with the full SM. There is a clear topological correspondence between a one-loop diagram with E external legs and a tree diagram with E + 2 external legs, obtained after cutting one of the loop lines. After uniquely fixing this correspondence, one can compute the tensor coefficients c (j,r j ,N j ) µ 1 ···µr j with recursion relations similar to those used for tree amplitudes.
Given a process A → B at one loop, we consider first the set of all tree processes A + B + P +P → 0 for each particle P of the SM. 2 However, the set {A + B + P +P → 0, ∀P ∈ SM} contains more diagrams than the original one-loop process A → B. This is due to the fact that we can cut the loop diagram at any of its loop lines and that we can run along the loop clockwise and counterclockwise. Therefore, we have to fix some rules to discard the redundant diagrams. To explain these rules, we can work without loss of generality in a theory with a single scalar particle φ with a tri-linear coupling (the generalisation to the presence of a quadri-linear coupling is straightforward). In such a theory the set {A + B + P +P → 0, ∀P ∈ SM} reduces to one process, i.e. A + B + φ +φ → 0. We use tag numbers for external and internal currents as explained in Section 2.1 and assign tag numbers 2 E and 2 E+1 to the currents corresponding to the two additional external legs of P andP . These currents are called "external loop currents", while the legs of P andP are called "external loop legs".
Let us first consider the sets of diagrams with three and four external legs where only external particles enter the loop. Marking with a cross the two external loop legs of the trees, we get: 2 Here all unphysical particles, in particular also Faddeev-Popov ghosts, must be included. . (2.8) The With these rules the way of cutting each loop diagram of (2.8) becomes unique: . (2.9) The rules have to be generalised to other classes of diagrams where external legs can combine in tree sub-graphs before entering the loop flow. To this end we define an identifier number for each current, given by the smallest external tag among those forming its tag number. For example, a current with tag number 13 = 1 + 4 + 8, which has been created combining the external legs 1, 4 and 8, has identifier 1; a current with tag number 6 = 2 + 4 has identifier 2. For external currents the identifier coincides with the tag number. In case of a quadri-linear coupling, the two currents entering the loop are represented by a common identifier, the minimum of the two identifiers for the individual currents. Now the generalisation of the two rules is straightforward: 1) The current with identifier 1 must be attached to the external loop current with tag number 2 E .
2) The currents with the three smallest identifiers must be attached to the loop flow following the ascending order of their identifiers.
The proper treatment of self-energy insertions deserves particular care. For two identical particles flowing in the self-energy loop, the selection rules (actually the first rule alone) reduce the number of cut diagrams to one. On the other hand in this case the self-energy diagram gets a symmetry factor 1/2 because of Wick's theorem. If the two particles in the loop are different, we get two cut diagrams, which however give the same contribution. Therefore, the cutting procedure for self-energies reproduces the loop diagrams correctly if the corresponding tree contributions are multiplied by a factor 1/2 in all cases.
In addition to rules 1) and 2), in most renormalisation schemes some classes of diagrams have to be discarded, namely tadpoles and self-energy insertions on external legs. Therefore, being S = 1+2+4+. . . +2 E−1 = 2 E − 1 the sum of all external tags of the process, we use the following additional rules: 3) A current with tag number equal to S or equal to S − 2 n , n = 0, 1, . . . E − 1, cannot enter the loop flow in a branch with a tri-linear vertex. This eliminates tadpole diagrams and those self-energy contributions made of tri-linear vertices which are inserted on external legs.

4)
If in a branch with a quadri-linear vertex one of the two currents entering the loop flow is external, the sum of their tag numbers cannot be equal to S. This eliminates the self-energy contributions involving one quadri-linear vertex inserted on external legs.
Applying these four rules to the generation of the skeleton for the off-shell currents of the tree processes of the set {A → B + P +P , ∀P ∈ SM}, we obtain the proper skeleton for the one-loop process A → B.
Having reduced the formal generation of the one-loop amplitude to the generation of a set of tree-level processes, we can build the "loop off-shell currents" in a similar way as the tree-level currents in Section 2.1, in order to obtain the tensor coefficients c (j,r j ,N j ) µ 1 ···µr j of (2.5). At one-loop level, the particle which closes the recursive construction is the external loop leg with tag number 2 E+1 . Since all loop lines are virtual lines and retain their propagator, the last step of (2.3), where the last generated current is multiplied with the wave function of the particle closing the recursion, × 2 E+1 , is performed without multiplication by the inverse propagator resulting in The external currents for the first E external legs are defined as in Section 2.1; the external currents of the two external loop legs are defined such that the contraction originally contained in the loop can be easily reproduced. To this end, we introduce a suitable set of spinors ψ i = u i ,v i and polarisation vectors ǫ µ i for the cut fermions and vector bosons, where i denotes the "polarisation", and α, β and µ, ν are spinor and Lorentz indices, respectively. The loops are glued together as: Except for the scalar case, the cutting procedure associates to the one-loop amplitude the sum of four tree-level amplitudes with particular spinors/polarisation vectors for the cut particle.
Having fixed the external currents, we describe how to compute the internal ones. As explained in Section 2.1, for tree amplitudes these are computed summing up the currents generated in branches where the generating currents are multiplied with the Feynman rules for the vertex and the propagator of the generated particle. This is valid also at one-loop level for pure tree currents built by combining the original external legs 1, . . . , 2 E−1 .
The new features of the loop case are connected to the loop off-shell currents involving the external loop leg with tag number 2 E carrying a loop momentum q. The external loop current with tag number 2 E defines the beginning of the loop flow; all currents with tag number ≥ 2 E belong to the loop flow and are called loop currents, while the branches generating them are called loop branches. Every internal loop current contains a q-dependence, generated by the Feynman rules for the vertex and the propagator. Working in the 't Hooft-Feynman gauge in the SM, the q-dependence of (vertex)×(propagator) takes the form where the linear q-dependence in the numerator results from a fermion propagator, from a coupling of three vector bosons, or from a coupling between one vector boson and two scalar/ghost particles (in all other cases a µ = 0). If the interacting particles are fermions and/or vector bosons, the coefficients a µ and b have an additional Dirac and/or Lorentz structure which is not made explicit here for simplicity. Denoting by w 1 (q) the first internal loop current, we have (2.14) where p 1 is the sum of the external momenta entering the first loop branch while d µ 1 1,1 and d 1,0 result from a product of the generating currents with the constants a µ and b in (2.13).
If the current w 1 corresponds to a fermion/vector boson, w 1 as well as d µ 1 1,1 and d 1,0 carry an additional spinor/Lorentz index, suppressed in (2.14). Proceeding along the loop flow, the second internal loop current is built combining w 1 (q) with tree currents and with a product (vertex)×(propagator) of the form (2.13) and reads . (2.15) The recursively constructed l th loop current of the loop flow is of the form . , k = 0, . . . , l are in general not symmetric under the exchange of their indices µ 1 · · · µ k , but, being implicitly multiplied by the symmetric product q µ 1 · · · q µ k , they can be symmetrised at each step. In this way the number of independent coefficients d µ 1 ···µ k l,k is decreased, leading to a reduction of operations in subsequent steps of the recursion.
The recursion relations for loop branches allow us to compute the coefficients of the loop currents, but we are not allowed to sum them unless their denominators are equal. From (2.16) one can see that the denominators are products of propagators and are determined by a sequence of off-set momenta {p 1 , . . . , p l } and masses {m 1 , . . . , m l }. Therefore, while tree currents are defined by the tag number, the particle content, and the colour information, the loop currents need an additional parameter, called sequence number, containing the information on {p 1 , . . . , p l } and {m 1 , . . . , m l }. In this way, loop currents with different q-dependent denominators are distinguished, and contributions from loop branches with the same denominators can be summed as for tree branches.
The introduction of the sequence number spoils the uniqueness of the last current. Given the cut particle and its polarisation [the index i in (2.11)], the coefficients {d N j ,0 , d µ 1 N j ,1 , . . . , d µ 1 ···µ N j N j ,N j } of the last current with sequence number n s give a contribution to the tensor coefficients of (2.5) with j = n s . The index j, describing the class of the tensor integrals, is then identified with the sequence number of the last currents. At this level, contributions from different polarisations of the cut particle can be summed up. For different cut particles one gets in general contributions to different classes of tensor integrals. However, since not the particles but only their masses and momenta enter the sequence number, also contributions to the same tensor integral classes appear and are combined. Also at one-loop level the code is divided into an initialisation and a production phase. In the initialisation phase, the skeleton of the branches is generated and all quantities which do not depend on the momenta are fixed. In particular, the sequence numbers of the last currents allow already at this step to determine the list of needed tensor integrals. Therefore the computation of the tensor integrals can be done independently of the one of the tensor coefficients (in the production phase of the code).

Rational terms and renormalisation
In dimensional regularisation the calculation of Feynman amplitudes is performed in D = 4 − 2ǫ space-time dimensions, and the result is arranged as a power series in ǫ. In this way, UV divergences of the loop integrals manifest themselves as 1/ǫ poles to be subtracted upon renormalisation. For consistency, all space-time related objects entering the amplitude have to be promoted to their D-dimensional generalisations and all manipulations have to be performed in D dimensions. Otherwise terms of order O(ǫ) are missed which, combined with the 1/ǫ pole of the loop integral, give finite contributions to the amplitude. Because these are rational functions of the kinematical invariants, they are conventionally called rational terms. A rational term is dubbed R 1 -term if it results from the ǫ-dependence of the denominators of the loop integrals and it is called R 2 -term if it is generated by a O(ǫ) term in the numerator of the Feynman amplitude [71].
We assume that the tensor integrals, taken as input by Recola, contain the R 1 -terms, either by keeping the denominators of the loop propagators D-dimensional in the calculation or by explicitly adding these terms. Note, however, that even if the calculation of the tensor integrals is performed in D dimensions, they enter Recola as numerical four-dimensional tensors. The numerical construction of the tensor coefficients, on the other hand, works strictly in four spacetime dimensions, so that the R 2 -terms are not taken into account automatically. These terms can, however, be easily computed using effective Feynman rules which have been implemented in our code using the results of Refs. [71][72][73][74] 3 . The insertion of the effective Feynman rules for vertices and propagators is performed in the tree-level amplitude generator, taking care that only one of the vertices results from a rational term.
Renormalisation is performed via counterterms based on the conventions of Ref. [75]. In analogy with the effective Feynman rules for the rational terms, the insertion of counterterms takes place in the tree-level amplitude generator. Presently, counterterms are fixed following the complex-mass scheme of Refs. [76,77] and the results of Ref. [75] for all parameters of the SM. The strong coupling constant is renormalised in the MS-scheme at a general scale Q for contributions coming from gluons and light quarks, while the top-quark contribution is subtracted at zero momentum.

Treatment of colour
In the computation of the currents an important aspect is the treatment of colour, which does not factorise in the recursive construction (contrary to the diagrammatic approach). One could obtain the factorisation of colour by splitting the amplitude in a sum of colour-ordered amplitudes, as done in Ref. [78]. This, however, would increase the number of amplitudes to compute and would become complicated in the full SM. Alternatively, one could compute colour-dressed amplitudes, as for instance in Ref. [66], where off-shell currents would carry explicit colour indices and would have to be computed for each index separately, slowing down the calculation considerably. Although the number of colour-dressed amplitudes to be computed can be decreased by a Monte Carlo sampling over colour configurations [67], the number of operations at intermediate steps remains large. In order to optimise the colour treatment further, we developed an alternative approach based on "structure-dressed" amplitudes, where each current gets an explicit colour structure. This is easily achieved working in the colour-flow representation of the 1/N c expansion [79], introduced in Refs. [80,81] for perturbative QCD computations, where the conventional 8 gluon fields A a µ are replaced by a 3×3 matrix (A µ ) i j = 1 while the vertices become 2g µρ g νσ − g µσ g νρ − g µν g ρσ 2g µν g ρσ − g µρ g νσ − g µσ g νρ 2g µσ g νρ − g µν g ρσ − g µρ g νσ , In all Feynman rules the colour part is described by products of Kronecker δs, and therefore in the colour-flow representation the colour structure of the amplitude can be simply obtained as a linear combination of all possible products of Kronecker δs carrying the colour indices of the external particles. For a process with k external gluons and m external quark-antiquark pairs the amplitude takes the simple form: A α 1 ,...,αn β 1 ,...,βn = P (1,...,n) δ α 1 β 1 · · · δ αn βn A 1,...,n , n = k + m, (2.21) where in general all n! permutations P (1, . . . , n) of the indices β 1 , . . . , β n have to be considered. In a framework based on colour-dressed amplitudes, the colour indices of the external particles would be fixed and at each branch, given the colour of the generating currents, all colour configurations (3 for quarks or antiquarks, 9 for gluons) for the generated current would be computed. Many of them are zero, and the others differ just by simple factors. Since in this approach one would define and compute unnecessarily many currents, we decided to follow a different strategy.
Instead of assigning an explicit colour to the currents, we assign them a "colour structure", which is a product of Kronecker δs. In order to understand how these structures look like, let us first consider the external currents for quarks, antiquarks, and gluons: where α and β are the colour indices of the external particles while i and j are "open" colour indices, which, during the recursive construction of internal currents, are contracted with the indices of the Feynman rules of (2.21). In the recursive procedure these contractions generate products of δs: some of them carry indices of external particles only, as in (2.21), and some others involve the open indices of the generated current. For example, the combination of an external quark with colour structure δ i 1 β 1 with an external gluon with colour structure δ i 2 β 2 δ α 2 j 2 produces, according to the Feynman rules of (2.21), a quark with two possible colour structures: δ α 2 β 1 δ i β 2 and δ α 2 β 2 δ i β 1 . In both structures the first δ carries just external indices α 1 , β 1 , β 2 , while the second one contains also the open index i of the generated quark current.
The resulting colour structures for the off-shell currents are in complete correspondence to the colour structure in (2.21) for the full amplitude. The indices of the δ-structures of a particular off-shell current are given by the colour indices α i , β j of the external particles generating the current and by potential open indices for the generated particle: no open indices for a colourneutral particle, one for a quark/antiquark, two for a gluon. Therefore, in general, the colour structure of a gluon current, obtained from k external gluons and n − k external quark-antiquark pairs, takes the form where permutations P (1, . . . , n + 1) of the indices β 1 , . . . , β n , j on the right-hand side correspond to different currents. For the colour structure of a quark (antiquark) current, obtained from k external gluons, n − k − 1 external quark-antiquark pairs and an additional quark (antiquark) we have where again permutations P (1, . . . , n) of β 1 , . . . , β n in the first case and of β 1 , . . . , β n−1 , j in the second correspond to different currents. We can easily distinguish two parts in the colour structure: the "open part", which is always present in coloured currents, contains one (for quarks and antiquarks) or two (for gluons) δs with open colour indices i and/or j, while the "saturated part", which is absent in external currents, is a product of δs with only external colour indices. Only the open parts of the structures play an active role in the combination of currents, while the saturated parts of the generating currents simply multiply to give the saturated part of the generated current. In our code these two parts are represented by integer numbers, based on a binary notation.
Moving from colour-dressed to structure-dressed currents reduces already the number of currents. For example, there are 9 colour-dressed currents for a gluon generated from the currents of a quark and an antiquark (although many of them vanish), while we have just one structuredressed current. A further optimisation can be achieved introducing a colour label and giving the same colour label to currents differing just by a colour factor (due to subsequent multiplication of different colour coefficients). In the example considered above of an external quark combined with an external gluon, the coefficients of the two possible colour structures δ α 2 β 1 δ i β 2 and δ α 2 β 2 δ i β 1 differ only by a factor −1/N c . For structure-dressed currents this label is easily introduced, together with the corresponding colour factors, already in the initialisation phase and can be used in the production phase of the code to compute just one of the currents with the same colour label.
3 Electroweak corrections to Z + 2 jets production at the LHC As a first example for the application of the code Recola, we consider the EW corrections to the dominant partonic channels contributing to the process pp → Z + 2 jets.

Details of the calculation
At leading order (LO) in perturbation theory, the production of a Z boson at the LHC in association with a pair of hard jets is governed by the partonic subprocesses and their crossing-related counterparts. Since we neglect flavour mixing as well as finite-mass effects for the light quarks, the LO amplitudes do not depend on the quark generation, and the contributions of the various generations to the cross section differ only by their parton luminosities. While the mixed quark-gluon (gluonic) channels (3.1) contribute to the cross section Figure 1: From left to right: Sample tree diagrams for the QCD contributions to q i g → q i g Z and to q iqi → q jqj Z, and the EW contributions to q iqj → q iqj Z.  Fig. 1. If standard experimental acceptance cuts are applied (see Section 3.2.1 for the specification of our cuts), the mixed quark-gluon channels clearly dominate over the four-quark channels, with the subprocesses ug → ugZ and dg → dgZ contributing ∼ 70% and the complete class (3.1) of partonic subprocesses contributing ∼ 80% to the total cross section. 4 Therefore as a first step towards a complete NLO calculation of EW effects in Z + 2 jets we here calculate EW corrections to the gluonic channels (3.1).

General setup
In our calculation we describe potentially resonant Z-boson propagators appearing in loop diagrams (see Fig. 2 left for a sample diagram) by attributing a complex mass to internal Z bosons. To this end, we consistently use the complex-mass scheme [76,77,82] where µ 2 W and µ 2 Z are defined as the poles of the W-and Z-boson propagators in the complex plane. On the other hand, the external Z boson is treated as a stable final-state particle with its invariant mass being fixed to M Z , where M 2 Z is the real part of the complex pole of the Z-boson propagator. The pole values M V and Γ V (V = W, Z) for the mass and width of the W and Z boson are related to the on-shell results M OS V and Γ OS V obtained from the LEP and Tevatron experiments by [83] For the definition of the electromagnetic coupling constant α we adopt the G µ scheme, i.e. we fix the value of α via its tree-level relation with the Fermi constant G µ : Compared to the Thomson-limit definition of α, the definition of α Gµ in the G µ -scheme incorporates effects of the renormalisation-group running from the scale Q 2 = 0 to the scale Q 2 = M 2 W . In addition NLO corrections involving logarithms of light quark masses are avoided as such contributions do not enter the muon decay.

Virtual corrections
The virtual corrections involve O(300) diagrams per partonic channel, including 20 pentagon and 71 box contributions. 5 The most complicated topologies are given by pentagons involving 5-point functions of rank up to r = 4 (see Fig. 2 right for a sample diagram). The virtual amplitude is calculated using the 't Hooft-Feynman gauge. The calculation of the tensor integrals is performed employing recursive numerical reduction to scalar integrals based on Refs. [46,48,[84][85][86].
Numerical instabilities from small Gram determinants are avoided by resorting to various expansion algorithms for the problematic momentum configurations [48]. Both, in the case of UV divergences as well as in the case of infrared (IR) divergences, dimensional regularisation is applied to extract the corresponding singularities. The EW sector of the SM is renormalised using an on-shell prescription for the W-and Z-boson masses in the framework of the complex-mass scheme [77]. As the coupling α Gµ is derived from M W , M Z and G µ , its counterterm inherits a correction term ∆r from the weak corrections to muon decay [87].
For virtual NLO contributions the finite top-quark mass affects partonic channels involving external bottom quarks in a different way than channels with external quarks of the first two generations. While the top-quark mass is properly taken into account in closed fermion loops, finite top-quark-mass effects constrained to diagrams with external bottom quarks are neglected ( bgZ and bb → ggZ are suppressed by the bottom PDFs, gg → bbZ contributes about 1% at LO).

Real corrections
The EW real corrections to the subprocess (3.1) are induced by photon Bremsstrahlung and given by Emission of a soft or a collinear photon from an external quark leads to IR divergences which are regularised dimensionally. If an IR-save event definition is used, the final-state singularities cancel with corresponding IR poles from the virtual corrections. For the initial-state singularities this cancellation is incomplete but the remnant can be absorbed into a redefinition of the quark distribution function. Technically we make use of the Catani-Seymour dipole formalism as formulated in Ref. [88], which we transferred in a straightforward way to the case of dimensionally regularised photon emission. In addition to the singularities from soft and collinear photon emission we face a further source of IR divergences originating from a soft final-state gluon (see Ref. [50]). Isolated soft gluons do not pose any problem as they do not pass our selection cuts because the requirement of two hard jets is not fulfilled. However, in IR-safe observables quarks, and thus all QCD partons, have to be recombined with photons if they are sufficiently collinear. Thus, if a soft gluon is collinear to a photon, it still passes the selection cuts if recombined with the collinear photon, giving rise to a soft-gluon divergence that would be cancelled by the virtual QCD corrections to Z+1 jet+γ production. Following Refs. [50,89] we eliminate this singularity by discarding events which contain a jet consisting of a hard photon recombined with a soft parton a (a = q i ,q i , g) taking the photon-jet energy fraction z γ = E γ /(E γ + E a ) as a discriminator. Photonic jets with z γ above a critical value z cut γ are attributed to the process pp → Z + 1 jet + γ and therefore they are excluded. However, this event definition is still not IR-save because the application of the z γcut to recombined quark-photon jets spoils the cancellation of final-state collinear singularities with the IR-divergences from the virtual corrections. This is cured by absorbing the left-over singularities into the measured quark-photon fragmentation function [90,91].

Implementation
In order to ensure correctness of our results, and in particular of the calculation with Recola, we have performed two independent calculations which we find to be in mutual agreement. While the first one applies the technique of recursive amplitude generation as described in Section 2, the second one relies on the conventional Feynman-diagrammatic approach.
In the first calculation the amplitude generator Recola provides the Feynman amplitudes. For the evaluation of the tensor integrals Recola is interfaced with the Fortran library Collier [92]. To this end Collier has been extended by an efficient algorithm for building up the tensor integrals from the recursively calculated Lorentz-invariant coefficient functions. The phase-space integration is performed by means of a generic in-house Monte-Carlo generator [93] following the multi-channel sampling approach.
The second code uses FeynArts 3.2 [94,95] and FormCalc 3.1 [96] for the generation and simplification of the Feynman amplitudes. For the numerical evaluation the amplitudes are translated into the Weyl-van der Waerden formalism [97] using the program Pole [98]. The tensor integrals are again evaluated by Collier which by itself provides two independent implementations of all its building blocks. Finally, the phase-space integration is performed with the multi-channel generator Lusifer [99].

Accuracy and efficiency of Recola
In this section we estimate the accuracy and efficiency of the purely numerical algorithm Recola by comparing with the code Pole which is based on algebraically generated analytical expressions.
In Table 1 we compare results obtained with Recola (upper numbers) and Pole (lower numbers) for the NLO contribution δσ NLO EW to the total cross section for various classes of partonic channels. We further display separate results for the finite virtual corrections including the integrated dipoles (virtual) and for the finite real corrections including the dipole-subtraction terms (real). In addition we also present the relative deviation |R/P − 1| between the results of Recola (R) and Pole (P). For the results obtained by Recola we have requested 5 × 10 6 accepted events in the case of the virtual corrections and 10 8 accepted events in the case of the real corrections, while (roughly by a factor of 10) lower statistics has been used for the calculation with Pole. The results in Table 1  to the total cross section of various partonic process classed (summed over q = u, d). In addition to the complete NLO correction we separately give the finite virtual and real corrections and the relative differences between Recola and Pole.
A more detailed comparison can be obtained by comparing the weights at individual phase space points. For 10 6 Monte-Carlo generated phase-space points we have compared the pure virtual contribution 2 Re(M * LO δM NLO ) to the squared matrix element (with divergences omitted) calculated by Recola and Pole for the partonic process ug → ugZ. In Fig. 3 we show the integrated fraction of the 10 6 phase-space points for which the agreement |R/P − 1| between Recola and Pole is worse than ∆. We find a typical agreement of 10 −11 − 10 −14 , with less than 1% of the phase-space points showing an agreement worse than 10 −7 and less than 0.02% showing an agreement worse than 10 −5 . The result of this test of the precision of the code is similar to the one performed by the OpenLoops collaboration [57] which also uses the tensor-integral library Collier. Note, however, that we compare two independent codes for the calculation of the tensor coefficients which are based on two entirely different algorithms. Furthermore the distribution of the phase-space points is in our case determined from the multi-channel Monte Carlo generator adapted to the peaking structure of the underlying process.
Finally, we give some details on timing and the amount of memory required. The evaluation of the spin-and colour summed one-loop matrix elements takes about 30 ms per phase-space point for ug → ugZ (or any other partonic process considered in this paper) on a single Intel i7-2720QM core with gfortran 4.6.1. The size of the executable on disk is about 3 MB. The matrix elements are constructed during the run and the required memory depends strongly on the size of the dynamically generated internal arrays and thus on the considered process. For pp → Z + 2 jets memory is no issue. More complicated processes will be considered in the future.

Input parameters and selection cuts
We use the following set of input parameters [100], (3.4). The electromagnetic coupling constant α Gµ is determined from G µ via (3.5). The CKM matrix only appears in loop amplitudes and is set to unity.
For the prediction of the hadronic pp → Z + 2 jets cross section the partonic cross sections have to be convoluted with the corresponding parton distribution functions (PDFs). Since our calculation does not take into account NLO QCD effects, we consistently resort to LO PDFs, using the LHAPDF implementation of the central MSTW2008LO PDF set [102]. From there we infer the value α LO s (M Z ) = 0.1394 (3.8) for the strong coupling constant. We identify the QCD factorisation scale µ F and the renormalisation scale µ R choosing Note that the choice of the scales µ F,R as well as the actual value for the strong coupling α s plays a minor role for our numerical analysis of EW radiative corrections in Section 3.2.2. We focus on the relative importance of the NLO EW corrections considering the ratio σ NLO EW /σ LO from which the α s -and the scale dependence drop out.
For the jet-reconstruction we use the anti-k T clustering algorithm [103] with separation parameter R = 0.4. For our scenario with exactly two partons and one potential photon in the final state this simply amounts to recombining the photon with a parton a if R aγ = (y a − y γ ) 2 + φ 2 aγ < R. Here y = 1 2 ln[(E + p L )/(E − p L )] is the particle's rapidity with E denoting its energy and p L its three-momentum component along the beam axis, and φ aγ is the azimuthal angle between the the photon and the parton a in the plane transverse to the beam axis. In case of recombination, the resulting photon-parton jet is subjected to the cut z γ = E γ /(E γ + E a ) < 0.7 in order to distinguish between Z + 2 jets and Z + 1 jet + γ production as explained in Section 3.1.3. After a possible recombination, we require two hard jets with   The second column provides the corresponding cross section where the numbers in parenthesis give the integration error on the last digit. The third column contains the relative contribution to the total cross section in percent. In the fourth column we list the NLO EW cross section for the gluonic channels and in the last column the relative EW corrections.
for the final event.

Results
In this section we present results for the total cross section and various differential distributions using the numerical input parameters and acceptance cuts introduced above. The total cross section and its composition at LO for the 8 TeV LHC is shown in Table 2 where the absolute and relative contributions of the partonic channels are listed. In the lower part of Table 2 we provide the contribution to the total cross section of partonic processes with external gluons (gluonic) and of the four-quark processes (four-quark). We find the total cross section dominated by processes with external gluons, in particular by the quark-gluon induced processes (69%). We also provide the NLO cross section and the relative EW corrections for the gluonic channels in the last two columns of Table 2. For our set of cuts, they range between −1.0% and −1.4% for the different gluonic channels. In the following we present results for distributions at LO and NLO for the gluonic channels only. Although these channels dominate the total cross section we emphasise that there are certain phase-space regions where the relative importance of the four-quark processes is enhanced and these channels and the corresponding EW corrections should not be neglected to describe pp → Z + 2 jets properly. For each distribution we provide two plots: the upper panels show the LO and NLO prediction for the differential cross section while the lower panels show the NLO result normalised to the LO result. In Fig. 4 we present results for the differential cross section as a function of the transverse momentum and the rapidity for the harder jet j 1 and softer jet j 2 , respectively. Both transverse momentum distributions show steep slopes over six orders of magnitude in the displayed p T range. The EW corrections lower the LO prediction, and their relative size grows in absolute value with increasing transverse momentum due to the well-known EW Sudakov logarithms [104][105][106][107]. For transverse momenta of the softer jet p T,j 2 ≃ 1 TeV the impact of EW corrections amounts up to −10% relative to the Born approximation, while for transverse momenta of the harder jet p T,j 1 ≃ 1 TeV the EW effects are of the order of −15%. Since, the rapidity distributions are not sensitive to Sudakov logarithms, the corresponding EW corrections are flat and around −1% as for the total cross section.
The differential cross section as a function of the di-jet invariant mass and as a function of the transverse momentum of the Z boson is shown on the left-hand side in Fig. 5. For both distributions we find the expected dependence on the Sudakov logarithms, although the sensitivity in the di-jet invariant-mass distribution is less pronounced than in the p T,Z -distribution. For M jj ≃ 500 GeV the EW corrections are of the order of −2%; they amount up to −4% for M jj ≃2 TeV. The transverse momentum distribution of the Z boson receives large corrections, from −15% for p T,Z ≃ 500 GeV to −25% for p T,Z ≃ 1 TeV. On the upper right-hand side in Fig. 5 we present the differential distribution of the relative azimuthal angle φ jj between the two jets. The φ jj -distribution shows that the two jets are preferably back-to-back in the transverse plane and that the EW corrections lower the differential cross section by 1−1.5%. They induce a shape change at the permille level relative to the LO approximation. The rapidity distribution of the Z boson is depicted in the lower right-hand side of Fig. 5. In the central region |y Z | < 2, where most of the Z bosons are produced, the EW corrections lower the LO cross section by 1−1.5% while for |y Z | > 2 their effect drops to the permille level.

Conclusions
The full exploitation of the Large Hadron Collider relies on precise theoretical predictions. To this end QCD and electroweak next-to-leading order corrections have to be calculated for many processes involving many particles in the final state. This requires efficient and reliable automatic tools.
In this paper we have presented Recola, a Fortran90 code for the REcursive Computation of One-Loop Amplitudes. It uses methods based on Dyson-Schwinger equations to calculate the coefficients of all tensor integrals appearing in a one-loop amplitude recursively. The tensor integrals can then be evaluated with efficient numerically stable techniques. The algorithm has been implemented for the full electroweak Standard Model, including counterterms and rational terms, but could be generalised to more complicated theories in a straightforward way. The implementation supports the complex-mass scheme and is thus applicable to processes involving intermediate unstable particles. For the treatment of colour we have developed a new recursive algorithm based on colour structures that naturally appear in the colour-flow representation.
As a first application of Recola, we have calculated the electroweak corrections to the dominant partonic channels in Z + 2 jets production at the LHC. The results have been verified with an independent calculation based on Feynman-diagrammatic methods. For a typical set of cuts, the electroweak corrections are negative at the level of one percent, but become sizeable where large energy scales are relevant. However, in general and in particular for large energies of the jets, a meaningful prediction requires the inclusion of next-to-leading-order corrections to all partonic channels. This will be pursued in a forthcoming publication.