Cranmer, Kyle
Simulation Intelligence: Towards a New Generation of Scientific Methods
Lavin, Alexander, Zenil, Hector, Paige, Brooks, Krakauer, David, Gottschlich, Justin, Mattson, Tim, Anandkumar, Anima, Choudry, Sanjay, Rocki, Kamil, Baydin, Atılım Güneş, Prunkl, Carina, Isayev, Olexandr, Peterson, Erik, McMahon, Peter L., Macke, Jakob, Cranmer, Kyle, Zhang, Jiaxin, Wainwright, Haruko, Hanuka, Adi, Veloso, Manuela, Assefa, Samuel, Zheng, Stephan, Pfeffer, Avi
The original "Seven Motifs" set forth a roadmap of essential methods for the field of scientific computing, where a motif is an algorithmic method that captures a pattern of computation and data movement. We present the "Nine Motifs of Simulation Intelligence", a roadmap for the development and integration of the essential algorithms necessary for a merger of scientific computing, scientific simulation, and artificial intelligence. We call this merger simulation intelligence (SI), for short. We argue the motifs of simulation intelligence are interconnected and interdependent, much like the components within the layers of an operating system. Using this metaphor, we explore the nature of each layer of the simulation intelligence operating system stack (SI-stack) and the motifs therein: (1) Multi-physics and multi-scale modeling; (2) Surrogate modeling and emulation; (3) Simulation-based inference; (4) Causal modeling and inference; (5) Agent-based modeling; (6) Probabilistic programming; (7) Differentiable programming; (8) Open-ended optimization; (9) Machine programming. We believe coordinated efforts between motifs offers immense opportunity to accelerate scientific discovery, from solving inverse problems in synthetic biology and climate science, to directing nuclear energy experiments and predicting emergent behavior in socioeconomic settings. We elaborate on each layer of the SI-stack, detailing the state-of-art methods, presenting examples to highlight challenges and opportunities, and advocating for specific ways to advance the motifs and the synergies from their combinations. Advancing and integrating these technologies can enable a robust and efficient hypothesis-simulation-analysis type of scientific method, which we introduce with several use-cases for human-machine teaming and automated science.
Exact and Approximate Hierarchical Clustering Using A*
Greenberg, Craig S., Macaluso, Sebastian, Monath, Nicholas, Dubey, Avinava, Flaherty, Patrick, Zaheer, Manzil, Ahmed, Amr, Cranmer, Kyle, McCallum, Andrew
Hierarchical clustering is a critical task in numerous domains. Many approaches are based on heuristics and the properties of the resulting clusterings are studied post hoc. However, in several applications, there is a natural cost function that can be used to characterize the quality of the clustering. In those cases, hierarchical clustering can be seen as a combinatorial optimization problem. To that end, we introduce a new approach based on A* search. We overcome the prohibitively large search space by combining A* with a novel trellis data structure. This combination results in an exact algorithm that scales beyond previous state of the art, from a search space with $10^{12}$ trees to $10^{15}$ trees, and an approximate algorithm that improves over baselines, even in enormous search spaces that contain more than $10^{1000}$ trees. We empirically demonstrate that our method achieves substantially higher quality results than baselines for a particle physics use case and other clustering benchmarks. We describe how our method provides significantly improved theoretical bounds on the time and space complexity of A* for clustering.
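To make the combinatorial-optimization framing concrete, here is a minimal, illustrative A* sketch over merge states, not the paper's trellis algorithm: a state is the current set of clusters, the goal state is a single root, and pair_cost is a hypothetical placeholder for an application-specific merge cost. With the zero heuristic shown, the search reduces to uniform-cost search, which is still exact; any admissible lower bound on the remaining merge cost can be substituted.

import heapq
import itertools
from itertools import combinations

def pair_cost(a, b):
    # Hypothetical merge cost; a real application supplies its own.
    return len(a) * len(b)

def astar_cluster(points, heuristic=lambda state: 0):
    start = frozenset(frozenset([p]) for p in points)
    tie = itertools.count()  # tie-breaker so the heap never compares states
    frontier = [(heuristic(start), next(tie), 0, start, [])]
    best = {start: 0}
    while frontier:
        _, _, g, state, merges = heapq.heappop(frontier)
        if len(state) == 1:
            return g, merges  # total cost and the merge sequence
        for a, b in combinations(state, 2):
            child = (state - {a, b}) | {a | b}
            g_child = g + pair_cost(a, b)
            if g_child < best.get(child, float("inf")):
                best[child] = g_child
                heapq.heappush(frontier, (g_child + heuristic(child), next(tie), g_child, child, merges + [(a, b)]))

cost, merges = astar_cluster([1, 2, 3, 4])
print(cost, merges)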
Hierarchical clustering in particle physics through reinforcement learning
Brehmer, Johann, Macaluso, Sebastian, Pappadopulo, Duccio, Cranmer, Kyle
Particle physics experiments often require the reconstruction of decay patterns through a hierarchical clustering of the observed final-state particles. We show that this task can be phrased as a Markov Decision Process and adapt reinforcement learning algorithms to solve it. In particular, we show that Monte-Carlo Tree Search guided by a neural policy can construct high-quality hierarchical clusterings and outperform established greedy and beam search baselines.
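As an illustration of this framing, and not the paper's implementation, here is a minimal sketch of the Markov Decision Process: a state is the current set of clusters, an action merges a pair of them, and the hypothetical merge_log_likelihood stands in for a physics-motivated reward. The rollout shown is the greedy baseline; MCTS guided by a neural policy would search the same action space.

from itertools import combinations

def merge_log_likelihood(a, b):
    # Hypothetical reward; a real use case scores the merge under a physics model.
    return -abs(sum(a) - sum(b))

class ClusterMDP:
    def __init__(self, leaves):
        self.state = [(x,) for x in leaves]  # each final-state particle starts alone

    def actions(self):
        return list(combinations(range(len(self.state)), 2))

    def step(self, action):
        i, j = action
        a, b = self.state[i], self.state[j]
        self.state = [c for k, c in enumerate(self.state) if k not in (i, j)]
        self.state.append(a + b)  # the merged cluster
        return self.state, merge_log_likelihood(a, b), len(self.state) == 1

env = ClusterMDP([1.0, 2.0, 3.5, 4.0])
done = False
while not done:
    best = max(env.actions(), key=lambda ij: merge_log_likelihood(env.state[ij[0]], env.state[ij[1]]))
    _, reward, done = env.step(best)
print(env.state)  # a single root cluster; the merge order defines the hierarchy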
Simulation-based inference methods for particle physics
Brehmer, Johann, Cranmer, Kyle
Our predictions for particle physics processes are realized in a chain of complex simulators. They allow us to generate high-fidelity simulated data, but they are not well-suited for inference on the theory parameters with observed data. We explain why the likelihood function of high-dimensional LHC data cannot be explicitly evaluated, why this matters for data analysis, and reframe what the field has traditionally done to circumvent this problem. We then review new simulation-based inference methods that let us directly analyze high-dimensional data by combining machine learning techniques and information from the simulator. Initial studies indicate that these techniques have the potential to substantially improve the precision of LHC measurements. Finally, we discuss probabilistic programming, an emerging paradigm that lets us extend inference to the latent process of the simulator.
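One core idea behind these methods is the likelihood-ratio trick: a classifier trained to separate samples simulated at two parameter points recovers their likelihood ratio from its output. A minimal sketch with a toy one-dimensional Gaussian standing in for the simulation chain (illustrative only):

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
x0 = rng.normal(0.0, 1.0, size=(50_000, 1))  # "simulate" at theta_0
x1 = rng.normal(0.5, 1.0, size=(50_000, 1))  # "simulate" at theta_1
X = np.vstack([x0, x1])
y = np.concatenate([np.zeros(len(x0)), np.ones(len(x1))])

clf = LogisticRegression().fit(X, y)
s = clf.predict_proba([[1.0]])[0, 1]          # classifier score at x = 1
ratio_hat = (1 - s) / s                       # estimates p(x|theta_0) / p(x|theta_1)
exact = np.exp(-0.5 * 1.0**2 + 0.5 * (1.0 - 0.5)**2)
print(ratio_hat, exact)                       # both close to 0.69

For high-dimensional collider data, the same construction holds with a neural network in place of the logistic regression.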
Semi-parametric $\gamma$-ray modeling with Gaussian processes and variational inference
Mishra-Sharma, Siddharth, Cranmer, Kyle
Mismodeling the uncertain, diffuse emission of Galactic origin can seriously bias the characterization of astrophysical gamma-ray data, particularly in the region of the Inner Milky Way where such emission can make up over 80% of the photon counts observed at ~GeV energies. We introduce a novel class of methods that use Gaussian processes and variational inference to build flexible background and signal models for gamma-ray analyses, with the goal of enabling a more robust interpretation of the make-up of the gamma-ray sky. We focus in particular on characterizing potential signals of dark matter in the Galactic Center with data from the Fermi telescope.
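A minimal sketch of the core ingredients rather than the paper's model: Poisson-distributed counts whose latent log-rate has a Gaussian-process prior, fit by stochastic variational inference with the reparameterization trick. The kernel, mean offset, and diagonal-Gaussian posterior below are illustrative assumptions.

import torch

torch.manual_seed(0)
n = 40
x = torch.linspace(0, 1, n)
K = torch.exp(-0.5 * (x[:, None] - x[None, :])**2 / 0.1**2) + 1e-4 * torch.eye(n)
L = torch.linalg.cholesky(K)
true_f = L @ torch.randn(n)
counts = torch.poisson(torch.exp(true_f + 2.0))          # toy photon counts

mu = torch.zeros(n, requires_grad=True)                  # variational mean
log_sigma = torch.full((n,), -2.0, requires_grad=True)   # variational scale
opt = torch.optim.Adam([mu, log_sigma], lr=0.02)
prior = torch.distributions.MultivariateNormal(torch.zeros(n), scale_tril=L)

for step in range(1000):
    opt.zero_grad()
    f = mu + torch.exp(log_sigma) * torch.randn(n)       # reparameterization trick
    q = torch.distributions.Normal(mu, torch.exp(log_sigma))
    log_lik = torch.distributions.Poisson(torch.exp(f + 2.0)).log_prob(counts).sum()
    elbo = log_lik + prior.log_prob(f) - q.log_prob(f).sum()
    (-elbo).backward()
    opt.step()

print(mu.detach()[:5])  # posterior mean of the latent log-rate
print(true_f[:5])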
Sampling using $SU(N)$ gauge equivariant flows
Boyda, Denis, Kanwar, Gurtej, Racanière, Sébastien, Rezende, Danilo Jimenez, Albergo, Michael S., Cranmer, Kyle, Hackett, Daniel C., Shanahan, Phiala E.
Gauge theories based on SU(N) or U(N) groups describe many aspects of nature. For example, the Standard Model of nuclear and particle physics is a nonabelian gauge theory with the symmetry group U(1)×SU(2)×SU(3), candidate theories for physics beyond the Standard Model can be defined based on strongly interacting SU(N) gauge theories [1, 2], SU(N) gauge symmetries emerge in various condensed matter systems [3-7], and SU(N) and U(N) gauge symmetries feature in the low energy limit of certain string-theory vacua [8]. In the context of the rapidly-developing area of machine learning [...] In Ref. [11], this approach was demonstrated in the context of U(1) gauge theory. Here, we develop a class of kernels for SU(N) group elements (and describe a similar construction for U(N) group elements). We show that if an invertible transformation acts only on the eigenvalues of a matrix and is equivariant under permutation of those eigenvalues, then it is equivariant under matrix conjugation and may be used as a kernel. Moreover, by making a connection to the maximal torus within the group and to the Weyl group of the root system, we show that this is in fact a universal way to define a kernel for unitary groups.
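The eigenvalue construction described above can be illustrated numerically. The sketch below is not the paper's flow: it applies an invertible map, acting elementwise on the eigenvalue phases and hence equivariant under their permutation, to a random unitary matrix, and checks equivariance under matrix conjugation; phase_map is a toy stand-in for a learned transformation.

import numpy as np

def phase_map(theta):
    # Invertible on each phase and elementwise, hence permutation-equivariant;
    # a learned flow would replace this toy map.
    return theta + 0.3 * np.sin(theta)

def eigenvalue_kernel(U):
    eigvals, W = np.linalg.eig(U)                 # U = W diag(e^{i theta}) W^{-1}
    new_eigvals = np.exp(1j * phase_map(np.angle(eigvals)))
    return W @ np.diag(new_eigvals) @ np.linalg.inv(W)

rng = np.random.default_rng(0)
randu = lambda: np.linalg.qr(rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3)))[0]
U, V = randu(), randu()
lhs = eigenvalue_kernel(V @ U @ V.conj().T)       # transform the conjugated matrix
rhs = V @ eigenvalue_kernel(U) @ V.conj().T       # conjugate the transformed matrix
print(np.max(np.abs(lhs - rhs)))                  # agrees to machine precision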
Discovering Symbolic Models from Deep Learning with Inductive Biases
Cranmer, Miles, Sanchez-Gonzalez, Alvaro, Battaglia, Peter, Xu, Rui, Cranmer, Kyle, Spergel, David, Ho, Shirley
We develop a general approach to distill symbolic representations of a learned deep model by introducing strong inductive biases. We focus on Graph Neural Networks (GNNs). The technique works as follows: we first encourage sparse latent representations when we train a GNN in a supervised setting, then we apply symbolic regression to components of the learned model to extract explicit physical relations. We find the correct known equations, including force laws and Hamiltonians, can be extracted from the neural network. We then apply our method to a nontrivial cosmology example (a detailed dark matter simulation) and discover a new analytic formula which can predict the concentration of dark matter from the mass distribution of nearby cosmic structures. The symbolic expressions extracted from the GNN using our technique also generalized to out-of-distribution data better than the GNN itself. Our approach offers alternative directions for interpreting neural networks and discovering novel physical principles from the representations they learn.
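A minimal sketch of the first stage, with hypothetical dimensions: an edge-message MLP trained with an L1 penalty so that only a few message components stay active. Symbolic regression (for instance, a tool such as PySR) would then be fit to the surviving components to recover explicit formulas.

import torch
import torch.nn as nn

# Hypothetical sizes: 4 edge features (e.g. relative position, masses), 16 message dims.
message_net = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 16))
opt = torch.optim.Adam(message_net.parameters(), lr=1e-3)

def loss_fn(edge_features, target, l1_weight=1e-2):
    messages = message_net(edge_features)   # one latent message per edge
    prediction = messages.sum(dim=0)        # sum-aggregate over incoming edges
    sparsity = messages.abs().mean()        # L1 term drives most dims toward zero
    return ((prediction - target)**2).mean() + l1_weight * sparsity

edge_features = torch.randn(8, 4)           # toy graph with 8 edges
target = torch.randn(16)                    # toy supervised target
for _ in range(200):
    opt.zero_grad()
    loss_fn(edge_features, target).backward()
    opt.step()
# After training, symbolic regression is fit to (edge_features, active message dims).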
Mining for Dark Matter Substructure: Inferring subhalo population properties from strong lenses with machine learning
Brehmer, Johann, Mishra-Sharma, Siddharth, Hermans, Joeri, Louppe, Gilles, Cranmer, Kyle
The subtle and unique imprint of dark matter substructure on extended arcs in strong lensing systems contains a wealth of information about the properties and distribution of dark matter on small scales and, consequently, about the underlying particle physics. However, teasing out this effect poses a significant challenge since the likelihood function for realistic simulations of population-level parameters is intractable. We apply recently-developed simulation-based inference techniques to the problem of substructure inference in galaxy-galaxy strong lenses. By leveraging additional information extracted from the simulator, neural networks are efficiently trained to estimate likelihood ratios associated with population-level parameters characterizing substructure. Through proof-of-principle application to simulated data, we show that these methods can provide an efficient and principled way to simultaneously analyze an ensemble of strong lenses, and can be used to mine the large sample of lensing images deliverable by near-future surveys for signatures of dark matter substructure.
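The ensemble analysis relies on per-lens likelihood ratios multiplying, so their logarithms add across the sample. A toy sketch, with a closed-form log_ratio_estimator standing in for the trained neural network:

import numpy as np

rng = np.random.default_rng(0)
observed_lenses = rng.normal(0.8, 1.0, size=100)  # stand-ins for lens images

def log_ratio_estimator(x, theta, theta_ref=0.0):
    # Toy closed form: log p(x|theta)/p(x|theta_ref) for unit Gaussians;
    # in practice a neural network estimates this from each image.
    return -0.5 * (x - theta)**2 + 0.5 * (x - theta_ref)**2

thetas = np.linspace(0.0, 2.0, 41)
total = np.array([log_ratio_estimator(observed_lenses, t).sum() for t in thetas])
print(thetas[np.argmax(total)])  # peaks near the true population value, 0.8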
MadMiner: Machine learning-based inference for particle physics
Brehmer, Johann, Kling, Felix, Espejo, Irina, Cranmer, Kyle
The legacy measurements of the LHC will require analyzing high-dimensional event data for subtle kinematic signatures, which is challenging for established analysis methods. Recently, a powerful family of multivariate inference techniques that leverage both matrix element information and machine learning has been developed. This approach neither requires the reduction of high-dimensional data to summary statistics nor any simplifications to the underlying physics or detector response. In this paper we introduce MadMiner, a Python module that streamlines the steps involved in this procedure. Wrapping around MadGraph5_aMC and Pythia 8, it supports almost any physics process and model. To aid phenomenological studies, the tool also wraps around Delphes 3, though it is extendable to a full Geant4-based detector simulation. We demonstrate the use of MadMiner in an example analysis of dimension-six operators in ttH production, finding that the new techniques substantially increase the sensitivity to new physics.
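A minimal sketch of the statistical idea the tool builds on rather than its API: for each simulated event, matrix-element weights evaluated at two parameter points give the joint likelihood ratio exactly, and a regressor trained on these per-event targets converges to the intractable ratio of observed-data likelihoods. All names and distributions below are toy assumptions.

import numpy as np

rng = np.random.default_rng(0)
n = 1000
z = rng.normal(size=n)             # latent parton-level variable, sampled at theta_ref = 0
x = z + 0.1 * rng.normal(size=n)   # smeared observable after a toy "detector"

def matrix_element_weight(z, theta):
    return np.exp(-0.5 * (z - theta)**2)  # toy squared matrix element

# Exact per-event (joint) likelihood ratio between theta = 0.5 and theta_ref = 0:
joint_ratio = matrix_element_weight(z, 0.5) / matrix_element_weight(z, 0.0)
# A regressor r_hat(x) fit to these targets by mean squared error converges to
# p(x|0.5)/p(x|0), even though that marginal ratio is intractable to evaluate.
print(x[:3], joint_ratio[:3])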
Etalumis: Bringing Probabilistic Programming to Scientific Simulators at Scale
Baydin, Atılım Güneş, Shao, Lei, Bhimji, Wahid, Heinrich, Lukas, Meadows, Lawrence, Liu, Jialin, Munk, Andreas, Naderiparizi, Saeid, Gram-Hansen, Bradley, Louppe, Gilles, Ma, Mingfei, Zhao, Xiaohui, Torr, Philip, Lee, Victor, Cranmer, Kyle, Prabhat, Wood, Frank
Probabilistic programming languages (PPLs) are receiving widespread attention for performing Bayesian inference in complex generative models. However, applications to science remain limited because of the impracticability of rewriting complex scientific simulators in a PPL, the computational cost of inference, and the lack of scalable implementations. To address these, we present a novel PPL framework that couples directly to existing scientific simulators through a cross-platform probabilistic execution protocol and provides Markov chain Monte Carlo (MCMC) and deep-learning-based inference compilation (IC) engines for tractable inference. To guide IC inference, we perform distributed training of a dynamic 3DCNN-LSTM architecture with a PyTorch-MPI-based framework on 1,024 32-core CPU nodes of the Cori supercomputer with a global minibatch size of 128k, achieving a performance of 450 Tflop/s through enhancements to PyTorch. We demonstrate a Large Hadron Collider (LHC) use-case with the C++ Sherpa simulator and achieve the largest-scale posterior inference in a Turing-complete PPL.
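A minimal sketch of the execution-protocol idea, not the actual cross-platform protocol: if a simulator requests all randomness through a sample callback keyed by an address, an inference engine can record, replay, or steer the latent choices without modifying the simulator's source. Every name below is hypothetical.

import random

def simulator(sample):
    # A toy simulator; each random choice is routed through the callback.
    energy = sample("energy", lambda: random.gauss(10.0, 2.0))
    n_hits = sample("n_hits", lambda: random.randint(1, 5))
    return energy / n_hits  # toy observable

def prior_sample(address, draw):
    return draw()  # plain forward execution

class TraceRecorder:
    def __init__(self):
        self.trace = {}
    def __call__(self, address, draw):
        value = draw()             # an engine could substitute a proposal here
        self.trace[address] = value
        return value

print(simulator(prior_sample))     # ordinary run
recorder = TraceRecorder()
print(simulator(recorder), recorder.trace)  # run with the latent trace captured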