Kieseler, Jan
End-to-End Optimal Detector Design with Mutual Information Surrogates
Wozniak, Kinga Anna, Mulligan, Stephen, Kieseler, Jan, Klute, Markus, Fleuret, Francois, Golling, Tobias
We introduce a novel approach for end-to-end black-box optimization of high energy physics (HEP) detectors using local deep learning (DL) surrogates. These surrogates approximate a scalar objective function that encapsulates the complex interplay of particle-matter interactions and physics analysis goals. In addition to a standard reconstruction-based metric commonly used in the field, we investigate the information-theoretic metric of mutual information. Unlike traditional methods, mutual information is inherently task-agnostic, offering a broader optimization paradigm that is less constrained by predefined targets. We demonstrate the effectiveness of our method in a realistic physics analysis scenario: optimizing the thicknesses of calorimeter detector layers based on simulated particle interactions. The surrogate model learns to approximate objective gradients, enabling efficient optimization with respect to energy resolution. Our findings reveal three key insights: (1) end-to-end black-box optimization using local surrogates is a practical and compelling approach for detector design, providing direct optimization of detector parameters in alignment with physics analysis goals; (2) mutual information-based optimization yields design choices that closely match those from state-of-the-art physics-informed methods, indicating that these approaches operate near optimality and reinforcing their reliability in HEP detector design; and (3) information-theoretic methods provide a powerful, generalizable framework for optimizing scientific instruments. By reframing the optimization process through an information-theoretic lens rather than domain-specific heuristics, mutual information enables the exploration of new avenues for discovery beyond conventional approaches.
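As a reminder of the quantity this abstract refers to, the mutual information between two discrete variables can be computed directly from their joint distribution. The sketch below is illustrative only: the function name and the toy joint tables are assumptions, and it is not the estimator used in the paper, which relies on deep learning surrogates trained on simulated detector data.

```python
import math

def mutual_information(joint):
    """Mutual information I(X;Y) in bits for a discrete joint table.

    `joint[i][j]` is P(X=i, Y=j); the entries are assumed to sum to 1.
    """
    px = [sum(row) for row in joint]            # marginal P(X)
    py = [sum(col) for col in zip(*joint)]      # marginal P(Y)
    mi = 0.0
    for i, row in enumerate(joint):
        for j, pxy in enumerate(row):
            if pxy > 0.0:                       # 0 * log 0 is taken as 0
                mi += pxy * math.log2(pxy / (px[i] * py[j]))
    return mi

# Hypothetical toy cases: a perfectly correlated pair and an independent pair.
correlated = [[0.5, 0.0], [0.0, 0.5]]
independent = [[0.25, 0.25], [0.25, 0.25]]
```

The correlated table carries one bit of shared information and the independent table carries none, which is why the quantity can serve as an objective without a predefined reconstruction target.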
Constrained Optimization of Charged Particle Tracking with Multi-Agent Reinforcement Learning
Kortus, Tobias, Keidel, Ralf, Gauger, Nicolas R., Kieseler, Jan
Reinforcement learning has demonstrated immense success in modelling complex physics-driven systems, providing end-to-end trainable solutions by interacting with a simulated or real environment and maximizing a scalar reward signal. In this work, building upon previous work, we propose a multi-agent reinforcement learning approach with assignment constraints for reconstructing particle tracks in pixelated particle detectors. Our approach collaboratively optimizes a parametrized policy, functioning as a heuristic to a multidimensional assignment problem, by jointly minimizing the total amount of particle scattering over the reconstructed tracks in a readout frame. To satisfy constraints that guarantee a unique assignment of particle hits, we propose a safety layer solving a linear assignment problem for every joint action. Further, to enforce cost margins, which increase the distance of the local policies' predictions to the decision boundaries of the optimizer mappings, we recommend the use of an additional component in the black-box gradient estimation, forcing the policy towards solutions with lower total assignment costs. We empirically show the effectiveness of our approach on simulated data, generated for a particle detector developed for proton imaging, compared to multiple single- and multi-agent baselines. We further demonstrate the effectiveness of constraints with cost margins for both optimization and generalization, reflected in wider regions with high reconstruction performance as well as reduced predictive instabilities. Our results form the basis for further developments in RL-based tracking, offering both enhanced performance with constrained policies and greater flexibility in optimizing tracking algorithms through the option for individual and team rewards.
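The role of a safety layer that solves a linear assignment problem can be illustrated with a toy example. The sketch below is not the authors' implementation: the function name, the cost matrix, and the brute-force search are assumptions chosen to keep the example dependency-free. A dedicated solver such as SciPy's `linear_sum_assignment` would be the practical choice for realistic sizes.

```python
from itertools import permutations

def solve_assignment(cost):
    """Return (assignment, total_cost) minimising the summed cost.

    `cost[a][h]` is the cost of agent `a` choosing hit `h`; the matrix is
    assumed square. Brute force over permutations, so only for tiny inputs.
    """
    n = len(cost)
    best_perm, best_total = None, float("inf")
    for perm in permutations(range(n)):
        total = sum(cost[a][perm[a]] for a in range(n))
        if total < best_total:
            best_perm, best_total = perm, total
    return list(best_perm), best_total

# Hypothetical costs: agents 0 and 1 both prefer hit 0 when acting greedily.
cost = [
    [1.0, 4.0, 5.0],
    [1.5, 6.0, 7.0],
    [6.0, 2.0, 2.5],
]
greedy = [min(range(3), key=lambda h: row[h]) for row in cost]
assignment, total = solve_assignment(cost)
```

Here the greedy per-agent choice assigns hit 0 twice, while the assignment solution gives every agent a distinct hit at the lowest joint cost.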
TomOpt: Differential optimisation for task- and constraint-aware design of particle detectors in the context of muon tomography
Strong, Giles C., Lagrange, Maxime, Orio, Aitor, Bordignon, Anna, Bury, Florian, Dorigo, Tommaso, Giammanco, Andrea, Heikal, Mariam, Kieseler, Jan, Lamparth, Max, del Árbol, Pablo Martínez Ruíz, Nardi, Federico, Vischia, Pietro, Zaraket, Haitham
Over the past two decades, the availability of high-performance computing and the development of neural networks of larger capacity have conspired to fuel a revolution in the way we think about the optimisation of complex systems. When the dimensionality of the space of relevant design parameters exceeds a few units and brute-force scans cease to be a viable option for its exploration, we nowadays have the option of letting automated systems find their way to configurations that correspond to advantageous extrema of carefully specified objective functions. The engine under the hood of these optimisation searches is automatic differentiation, which allows computer programs to keep track of the gradient of the objective function, through the chain rule of differential calculus, as computer code performs arbitrarily complex successions of operations to model the behaviour of the system. Crucial to a successful optimisation of the system is the inclusion in the model of all relevant effects that have an impact on the precision of the inference that the data generated by the system may produce. An incomplete description of the inference itself, or a mock-up of the reconstruction techniques performing the dimensionality reduction step that translates raw data into high-level features informing the objective function, is likely to prevent the identification of designs that maximise the true objective, as either introduces a misalignment.
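The way automatic differentiation carries a gradient through successive operations via the chain rule can be shown with a minimal forward-mode sketch based on dual numbers. This is a teaching example under stated assumptions: the `Dual` class, the `objective` function, and the `gradient` helper are hypothetical and unrelated to the TomOpt code base, which is not described at this level in the abstract.

```python
class Dual:
    """A value together with its derivative with respect to one input."""

    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    def _lift(self, other):
        # Treat plain numbers as constants (derivative zero).
        return other if isinstance(other, Dual) else Dual(float(other))

    def __add__(self, other):
        other = self._lift(other)
        # Sum rule: (u + v)' = u' + v'
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = self._lift(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__

def objective(x):
    """Hypothetical stand-in for a design objective: f(x) = (x*x + 3x) * x."""
    return (x * x + 3 * x) * x

def gradient(f, x0):
    """Evaluate df/dx at x0 by seeding the input derivative with 1."""
    return f(Dual(x0, 1.0)).deriv
```

Calling `gradient(objective, 2.0)` propagates the derivative alongside the value through every addition and multiplication, matching the analytic result 3x^2 + 6x at x = 2.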
Jet Flavour Classification Using DeepJet
Bols, Emil, Kieseler, Jan, Verzetti, Mauro, Stoye, Markus, Stakia, Anna
The Standard Model of particle physics (SM) [1, 2] is a remarkably effective theory, able to describe the experimental observations made thus far in high energy physics with unprecedented precision and completeness. Despite its success, however, this model fails to explain several observations, such as the baryon asymmetry and the presence of dark matter, which inspires searches for extensions to the SM. The study of the recently discovered [3-5] Higgs boson [6-11] and the search for extensions of the electroweak sector are two of the most active research areas in the field. Because of the flavour asymmetry associated with production and decay processes in each case, the ability to classify jets originating from heavy-flavour (bottom and charm) quarks is important. Heavy-flavour (HF) jets contain an open-bottom or open-charm hadron as a result of the fragmentation process. This hadron carries a large fraction of the initial parton momentum. HF hadrons also have a sizeable lifetime, with a cτ of 0.5 mm and 0.3 mm for bottom and charm, respectively.
Learning representations of irregular particle-detector geometry with distance-weighted graph networks
Qasim, Shah Rukh, Kieseler, Jan, Iiyama, Yutaro, Pierini, Maurizio
We explore the use of graph networks to deal with irregular-geometry detectors in the context of particle reconstruction. Thanks to their representation-learning capabilities, graph networks can exploit the full detector granularity, while natively managing the event sparsity and arbitrarily complex detector geometries. We introduce two distance-weighted graph network architectures, dubbed GarNet and GravNet layers, and apply them to a typical particle reconstruction task. The performance of the new architectures is evaluated on a data set of simulated particle interactions on a toy model of a highly granular calorimeter, loosely inspired by the endcap calorimeter to be installed in the CMS detector for the High-Luminosity LHC phase. We study the clustering of energy depositions, which is the basis for calorimetric particle reconstruction, and provide a quantitative comparison to alternative approaches. The proposed algorithms outperform existing methods or reach competitive performance with lower computing-resource consumption. Being geometry-agnostic, the new architectures are not restricted to calorimetry and can be easily adapted to other use cases, such as tracking in silicon detectors.
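The idea of distance-weighted aggregation over an irregular set of detector hits can be sketched in a few lines. Everything specific below is an assumption made for illustration and is not stated in the abstract: the exp(-d^2) weighting, the k-nearest-neighbour selection, the mean and max reductions, and all names. In a trained network the coordinates would be learned rather than fixed.

```python
import math

def distance_weighted_aggregate(coords, features, k):
    """For each vertex, aggregate the features of its k nearest neighbours.

    coords[i]   : position of vertex i in some (here fixed) space
    features[i] : scalar feature of vertex i
    Neighbour features are scaled by exp(-d^2) (an assumed weighting),
    then reduced with mean and max. Returns a list of (mean, max) pairs.
    Requires 1 <= k <= len(coords) - 1.
    """
    out = []
    for i, ci in enumerate(coords):
        # Squared Euclidean distance to every other vertex.
        dists = [
            (sum((a - b) ** 2 for a, b in zip(ci, cj)), j)
            for j, cj in enumerate(coords) if j != i
        ]
        nearest = sorted(dists)[:k]
        weighted = [features[j] * math.exp(-d2) for d2, j in nearest]
        out.append((sum(weighted) / len(weighted), max(weighted)))
    return out

# Hypothetical hits: three close together, one far away.
coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0)]
features = [1.0, 2.0, 3.0, 4.0]
result = distance_weighted_aggregate(coords, features, k=2)
```

Because the function only uses pairwise distances, it makes no assumption about a regular grid, and the isolated hit receives almost no contribution from its distant neighbours.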