Sengupta, Debajyoti
PIPPIN: Generating variable length full events from partons
Quétant, Guillaume, Raine, John Andrew, Leigh, Matthew, Sengupta, Debajyoti, Golling, Tobias
In the realm of high-energy physics, the simulation of particle collisions is a crucial tool for the downstream analysis of the huge amount of data produced by collider experiments. The classical generation of these simulated events is a complex task. By far the most computationally taxing subtask is the propagation of particles through the detectors, accounting for the interactions with the detector material and modelling the secondary radiation showers this produces. This process is usually performed [...] This occurs before they hit the detector material and greatly increases the number of particles produced by the collision. Following this, all stable particles are propagated through the detector material, where they can interact with it and produce secondary particles. This showering process is typically the most computationally expensive part of the simulation. Finally, the deposited energy in the detector is digitised and reconstructed into objects such as jets, leptons and missing transverse energy (MET).
Improving new physics searches with diffusion models for event observables and jet constituents
Sengupta, Debajyoti, Leigh, Matthew, Raine, John Andrew, Klein, Samuel, Golling, Tobias
We introduce a new technique called Drapes to enhance the sensitivity in searches for new physics at the LHC. By training diffusion models on side-band data, we show how background templates for the signal region can be generated either directly from noise, or by partially applying the diffusion process to existing data. In the partial diffusion case, data can be drawn from side-band regions, with the inverse diffusion performed for new target conditional values, or from the signal region, preserving the distribution over the conditional property that defines the signal region. We apply this technique to the hunt for resonances using the LHCO di-jet dataset, and achieve state-of-the-art performance for background template generation using high level input features. We also show how Drapes can be applied to low level inputs with jet constituents, reducing the model dependence on the choice of input observables. Using jet constituents we can further improve sensitivity to the signal process, but observe a loss in performance where the signal significance before applying any selection is below 4$\sigma$.
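A minimal sketch of the partial-diffusion idea described in the abstract above, assuming a trained conditional noise predictor: side-band events are noised part of the way along the forward diffusion and then denoised while conditioning on new target values of the resonant feature inside the signal region. The stand-in network denoiser, the DDPM-style schedule and all shapes below are illustrative assumptions, not the actual Drapes implementation.

    import torch

    T = 1000                                    # total diffusion steps
    betas = torch.linspace(1e-4, 2e-2, T)       # linear noise schedule
    alphas_bar = torch.cumprod(1.0 - betas, 0)  # cumulative alpha-bar_t

    # stand-in noise predictor: 4 event features + diffusion time + target mass
    denoiser = torch.nn.Sequential(
        torch.nn.Linear(6, 64), torch.nn.ReLU(), torch.nn.Linear(64, 4))

    def partial_diffusion(x_sb, m_target, t_mid=350):
        """Noise side-band events x_sb up to step t_mid, then run the reverse
        diffusion conditioned on new mass values m_target (DDPM-style loop)."""
        a_bar = alphas_bar[t_mid]
        x = a_bar.sqrt() * x_sb + (1 - a_bar).sqrt() * torch.randn_like(x_sb)
        for t in reversed(range(t_mid)):
            t_in = torch.full((x.shape[0], 1), t / T)
            eps = denoiser(torch.cat([x, t_in, m_target], dim=1))
            a, a_bar = 1.0 - betas[t], alphas_bar[t]
            x = (x - betas[t] / (1 - a_bar).sqrt() * eps) / a.sqrt()
            if t > 0:
                x = x + betas[t].sqrt() * torch.randn_like(x)
        return x  # background-like template events in the signal region

    # usage sketch: template = partial_diffusion(sideband_events, target_masses)

Generating directly from noise corresponds to the same reverse loop started from pure Gaussian noise at t = T instead of partially noised side-band data.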
EPiC-ly Fast Particle Cloud Generation with Flow-Matching and Diffusion
Buhmann, Erik, Ewen, Cedric, Faroughy, Darius A., Golling, Tobias, Kasieczka, Gregor, Leigh, Matthew, Quétant, Guillaume, Raine, John Andrew, Sengupta, Debajyoti, Shih, David
Jets at the LHC, typically consisting of a large number of highly correlated particles, are a fascinating laboratory for deep generative modeling. In this paper, we present two novel methods that generate LHC jets as point clouds efficiently and accurately. We introduce EPiC-JeDi, which combines score-matching diffusion models with the Equivariant Point Cloud (EPiC) architecture based on the deep sets framework. This model offers a much faster alternative to previous transformer-based diffusion models without reducing the quality of the generated jets. In addition, we introduce EPiC-FM, the first permutation equivariant continuous normalizing flow (CNF) for particle cloud generation. This model is trained with flow-matching, a scalable and easy-to-train objective based on optimal transport that directly regresses the vector fields connecting the Gaussian noise prior to the data distribution. Our experiments demonstrate that EPiC-JeDi and EPiC-FM both achieve state-of-the-art performance on the top-quark JetNet datasets whilst maintaining fast generation speed. Most notably, we find that the EPiC-FM model consistently outperforms all the other generative models considered here across every metric. Finally, we also introduce two new particle cloud performance metrics: the first based on the Kullback-Leibler divergence between feature distributions, the second on the negative log-posterior of a multi-model ParticleNet classifier.
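A minimal sketch of the flow-matching objective referred to above, under simplifying assumptions: the network regresses the straight-line (optimal-transport) vector field that transports Gaussian noise to data. The small MLP and the toy constituent features are hypothetical placeholders rather than the EPiC architecture used in the paper.

    import torch

    # stand-in vector-field network v_theta(x_t, t); EPiC layers would go here
    v_theta = torch.nn.Sequential(
        torch.nn.Linear(4, 128), torch.nn.SiLU(), torch.nn.Linear(128, 3))
    opt = torch.optim.Adam(v_theta.parameters(), lr=1e-3)
    data = torch.randn(10_000, 3)  # toy (pt, eta, phi) constituent features

    for step in range(1_000):
        x1 = data[torch.randint(len(data), (256,))]  # data sample
        x0 = torch.randn_like(x1)                    # Gaussian prior sample
        t = torch.rand(len(x1), 1)
        x_t = (1 - t) * x0 + t * x1                  # point on the straight path
        target = x1 - x0                             # velocity field to regress
        loss = ((v_theta(torch.cat([x_t, t], dim=1)) - target) ** 2).mean()
        opt.zero_grad(); loss.backward(); opt.step()

Sampling then amounts to integrating dx/dt = v_theta(x, t) from t = 0 (noise) to t = 1 (data) with any ODE solver.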
PC-Droid: Faster diffusion and improved quality for particle cloud generation
Leigh, Matthew, Sengupta, Debajyoti, Raine, John Andrew, Quétant, Guillaume, Golling, Tobias
Building on the success of PC-JeDi, we introduce PC-Droid, a substantially improved diffusion model for the generation of jet particle clouds. By leveraging a new diffusion formulation, studying more recent integration solvers, and training on all jet types simultaneously, we are able to achieve state-of-the-art performance for all types of jets across all evaluation metrics. We study the trade-off between generation speed and quality by comparing two attention-based architectures, as well as the potential of consistency distillation to reduce the number of diffusion steps. Both the faster architecture and the consistency models demonstrate performance surpassing many competing models, with generation times up to two orders of magnitude faster than PC-JeDi and three orders of magnitude faster than Delphes.
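As a rough illustration of the speed/quality trade-off discussed above, here is a Karras-style (EDM) Heun sampler in which the number of integration steps directly sets the number of network evaluations; the closed-form denoise function below is a toy stand-in for a trained denoiser, not the PC-Droid model, and all hyper-parameters are assumptions.

    import torch

    def denoise(x, sigma):            # placeholder denoiser D(x, sigma) -> x0
        return x / (1.0 + sigma ** 2)

    def heun_sample(shape, n_steps=20, sigma_max=80.0, sigma_min=0.002, rho=7.0):
        i = torch.arange(n_steps)
        sigmas = (sigma_max ** (1 / rho) + i / (n_steps - 1)
                  * (sigma_min ** (1 / rho) - sigma_max ** (1 / rho))) ** rho
        sigmas = torch.cat([sigmas, torch.zeros(1)])
        x = torch.randn(shape) * sigmas[0]
        for s, s_next in zip(sigmas[:-1], sigmas[1:]):
            d = (x - denoise(x, s)) / s                  # ODE slope at sigma = s
            x_euler = x + (s_next - s) * d
            if s_next > 0:                               # Heun correction step
                d_next = (x_euler - denoise(x_euler, s_next)) / s_next
                x = x + (s_next - s) * 0.5 * (d + d_next)
            else:
                x = x_euler
        return x

    # fewer steps -> faster generation, at some cost in fidelity
    jets_fast = heun_sample((128, 30, 3), n_steps=10)
    jets_slow = heun_sample((128, 30, 3), n_steps=50)

Consistency distillation pushes this further by training a student network to jump directly along the same ODE trajectory in one or a few steps.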
CURTAINs Flows For Flows: Constructing Unobserved Regions with Maximum Likelihood Estimation
Sengupta, Debajyoti, Klein, Samuel, Raine, John Andrew, Golling, Tobias
Model-independent techniques for constructing background data templates using generative models have shown great promise for use in searches for new physics processes at the LHC. We introduce a major improvement to the CURTAINs method by training the conditional normalizing flow between two side-band regions using maximum likelihood estimation instead of an optimal transport loss. The new training objective improves the robustness and fidelity of the transformed data and is much faster and easier to train. We compare the performance against the previous approach and the current state of the art using the LHC Olympics anomaly detection dataset, where we see a significant improvement in sensitivity over the original CURTAINs method. Furthermore, CURTAINsF4F requires substantially fewer computational resources to cover a large number of signal regions than other fully data-driven approaches. When using an efficient configuration, an order of magnitude more models can be trained in the same time required for ten signal regions, without a significant drop in performance.
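To make the change of training objective concrete, here is a minimal sketch of maximum-likelihood training for a conditional normalizing flow, with a single conditional affine transform standing in for the full flows-for-flows construction; the network sizes, names and toy tensors are illustrative assumptions, not the CURTAINsF4F model.

    import torch

    D = 4                                            # number of input features
    cond_net = torch.nn.Sequential(                  # maps c -> (shift, log-scale)
        torch.nn.Linear(1, 64), torch.nn.ReLU(), torch.nn.Linear(64, 2 * D))
    base = torch.distributions.Normal(0.0, 1.0)
    opt = torch.optim.Adam(cond_net.parameters(), lr=1e-3)

    def nll(x, c):
        shift, log_scale = cond_net(c).chunk(2, dim=-1)
        z = (x - shift) * torch.exp(-log_scale)      # z = f(x; c)
        log_det = -log_scale.sum(-1)                 # log |det df/dx|
        return -(base.log_prob(z).sum(-1) + log_det).mean()

    # one training step on side-band data (stand-in tensors shown here)
    x_sb, c_sb = torch.randn(512, D), torch.rand(512, 1)
    loss = nll(x_sb, c_sb)
    opt.zero_grad(); loss.backward(); opt.step()

In the method described above, this exact likelihood is maximised for a flow that maps events from one side-band onto the other, conditioned on the resonant feature.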
PC-JeDi: Diffusion for Particle Cloud Generation in High Energy Physics
Leigh, Matthew, Sengupta, Debajyoti, Quétant, Guillaume, Raine, John Andrew, Zoch, Knut, Golling, Tobias
In this paper, we present a new method, called PC-JeDi, to efficiently generate jets in High Energy Physics. This method utilises score-based diffusion models in conjunction with transformers, which are well suited to the task of generating jets as particle clouds due to their permutation equivariance. PC-JeDi achieves competitive performance with current state-of-the-art methods across several metrics that evaluate the quality of the generated jets. Although slower than other models, due to the large number of forward passes required by diffusion models, it is still substantially faster than traditional detailed simulation. Furthermore, PC-JeDi uses conditional generation to produce jets with a desired mass and transverse momentum for two different particles, top quarks and gluons.
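A minimal sketch of the kind of conditional, permutation-equivariant diffusion training the abstract describes, under stated assumptions: a transformer encoder without positional encodings predicts the noise added to each constituent, conditioned on the diffusion time and on the jet mass and transverse momentum, with a padding mask handling variable-length clouds. The toy noise schedule, shapes and layer sizes are placeholders rather than the actual PC-JeDi configuration.

    import math
    import torch

    model = torch.nn.ModuleDict({
        "embed": torch.nn.Linear(3 + 1 + 2, 64),   # constituent + time + (mass, pT)
        "encoder": torch.nn.TransformerEncoder(
            torch.nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True),
            num_layers=2),
        "head": torch.nn.Linear(64, 3),
    })
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)

    def loss_fn(x0, cond, pad_mask):
        """x0: (B, N, 3) constituents; cond: (B, 2) jet mass and pT;
        pad_mask: (B, N) bool, True where the slot is padding."""
        t = torch.rand(x0.shape[0], 1, 1)
        a_bar = torch.cos(0.5 * math.pi * t) ** 2             # toy cosine schedule
        eps = torch.randn_like(x0)
        x_t = a_bar.sqrt() * x0 + (1 - a_bar).sqrt() * eps    # forward diffusion
        c = torch.cat([t.expand(-1, x0.shape[1], -1),
                       cond[:, None, :].expand(-1, x0.shape[1], -1)], dim=-1)
        h = model["encoder"](model["embed"](torch.cat([x_t, c], dim=-1)),
                             src_key_padding_mask=pad_mask)
        eps_hat = model["head"](h)
        per_particle = ((eps_hat - eps) ** 2).mean(-1)        # (B, N) errors
        return (per_particle * ~pad_mask).sum() / (~pad_mask).sum()

    # usage sketch: loss = loss_fn(constituents, torch.stack([mass, pt], dim=-1), mask)

Because the encoder uses no positional information and padded slots are masked out, permuting the constituents of a jet leaves the loss unchanged.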