Models, code, and papers for "Kyle Cranmer":

Backdrop: Stochastic Backpropagation

Jun 04, 2018
Siavash Golkar, Kyle Cranmer

We introduce backdrop, a flexible and simple-to-implement method, intuitively described as dropout acting only along the backpropagation pipeline. Backdrop is implemented via one or more masking layers which are inserted at specific points along the network. Each backdrop masking layer acts as the identity in the forward pass, but randomly masks parts of the backward gradient propagation. Intuitively, inserting a backdrop layer after any convolutional layer leads to stochastic gradients corresponding to features of that scale. Therefore, backdrop is well suited for problems in which the data have a multi-scale, hierarchical structure. Backdrop can also be applied to problems with non-decomposable loss functions where standard SGD methods are not well suited. We perform a number of experiments and demonstrate that backdrop leads to significant improvements in generalization.

* 11 pages, 9 figures, 2 tables. Source code available at https://github.com/dexgen/backdrop 
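
As a rough illustration of the core idea (a sketch, not the authors' released implementation), a backdrop-style masking layer can be written as a PyTorch autograd function that acts as the identity in the forward pass and randomly drops whole-sample gradient blocks in the backward pass; the per-sample masking granularity and the rescaling by the keep probability are assumptions of this sketch.

    import torch

    class BackdropMask(torch.autograd.Function):
        # Identity in the forward pass; randomly masks gradients in the backward pass.

        @staticmethod
        def forward(ctx, x, keep_prob=0.5):
            ctx.keep_prob = keep_prob
            return x.view_as(x)

        @staticmethod
        def backward(ctx, grad_output):
            # Drop the gradient contribution of whole samples with probability
            # 1 - keep_prob, rescaling survivors so the expected gradient is unchanged.
            mask_shape = (grad_output.shape[0],) + (1,) * (grad_output.dim() - 1)
            mask = (torch.rand(mask_shape, device=grad_output.device) < ctx.keep_prob).float()
            return grad_output * mask / ctx.keep_prob, None

    def backdrop(x, keep_prob=0.5):
        return BackdropMask.apply(x, keep_prob)

Inserting backdrop(...) after a convolutional layer then randomizes only the gradients flowing back through that point, while the forward computation is left untouched.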

Inferring the quantum density matrix with machine learning

Apr 11, 2019
Kyle Cranmer, Siavash Golkar, Duccio Pappadopulo

We introduce two methods for estimating the density matrix for a quantum system: Quantum Maximum Likelihood and Quantum Variational Inference. In these methods, we construct a variational family to model the density matrix of a mixed quantum state. We also introduce quantum flows, the quantum analog of normalizing flows, which can be used to increase the expressivity of this variational family. The eigenstates and eigenvalues of interest are then derived by optimizing an appropriate loss function. The approach is qualitatively different from traditional lattice techniques that rely on the time dependence of correlation functions that summarize the lattice configurations. The resulting estimate of the density matrix can then be used to evaluate the expectation of an arbitrary operator, which opens the door to new possibilities.

* 12 pages, 3 figures 
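
A minimal sketch of the quantum maximum likelihood idea described above (the parameter names and the specific unitary parameterization are assumptions, not the paper's construction): the density matrix is modeled as rho = U diag(p) U^dagger with p a softmax and U the exponential of an anti-Hermitian matrix, and the negative log-likelihood of observed measurement outcomes is minimized with a standard optimizer.

    import torch

    def density_matrix(a_real, a_imag, logits):
        # Variational family: rho = U diag(softmax(logits)) U^dagger,
        # with U = exp(A) unitary because A is anti-Hermitian.
        A = torch.complex(a_real, a_imag)
        A = A - A.conj().T
        U = torch.matrix_exp(A)
        p = torch.softmax(logits, dim=0).to(U.dtype)
        return U @ torch.diag(p) @ U.conj().T

    def negative_log_likelihood(rho, projectors):
        # Quantum maximum likelihood: -sum_k log Tr(rho Pi_k) over observed outcomes.
        probs = torch.stack([torch.trace(rho @ P).real for P in projectors])
        return -torch.log(probs.clamp_min(1e-12)).sum()

The eigenvalues and eigenvectors of the optimized rho then follow from an ordinary eigendecomposition, and expectations of operators from Tr(rho O).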

Adversarial Variational Optimization of Non-Differentiable Simulators

Oct 05, 2018
Gilles Louppe, Joeri Hermans, Kyle Cranmer

Complex computer simulators are increasingly used across fields of science as generative models tying parameters of an underlying theory to experimental observations. Inference in this setup is often difficult, as simulators rarely admit a tractable density or likelihood function. We introduce Adversarial Variational Optimization (AVO), a likelihood-free inference algorithm for fitting a non-differentiable generative model incorporating ideas from generative adversarial networks, variational optimization and empirical Bayes. We adapt the training procedure of generative adversarial networks by replacing the differentiable generative network with a domain-specific simulator. We solve the resulting non-differentiable minimax problem by minimizing variational upper bounds of the two adversarial objectives. Effectively, the procedure results in learning a proposal distribution over simulator parameters, such that the JS divergence between the marginal distribution of the synthetic data and the empirical distribution of observed data is minimized. We evaluate and compare the method with simulators producing both discrete and continuous data.
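
A hedged sketch of one AVO-style update (the Gaussian proposal, the score-function gradient, and all names here are illustrative simplifications of the variational upper bounds used in the paper):

    import torch

    def avo_step(simulate, x_real, proposal_mu, proposal_log_sigma,
                 discriminator, opt_proposal, opt_disc, n=64):
        # Sample simulator parameters theta from the proposal q_psi; no gradient is
        # needed here because the simulator is a non-differentiable black box.
        with torch.no_grad():
            sigma = proposal_log_sigma.exp()
            theta = proposal_mu + sigma * torch.randn(n, proposal_mu.shape[0])
        x_sim = torch.stack([simulate(t) for t in theta])

        # Discriminator update: distinguish observed data from simulated data.
        d_loss = -(torch.log(torch.sigmoid(discriminator(x_real)) + 1e-8).mean()
                   + torch.log(1 - torch.sigmoid(discriminator(x_sim)) + 1e-8).mean())
        opt_disc.zero_grad(); d_loss.backward(); opt_disc.step()

        # Proposal update: score-function (REINFORCE) gradient of the generator
        # objective, a simple stand-in for the variational upper bound of the paper.
        with torch.no_grad():
            reward = torch.log(torch.sigmoid(discriminator(x_sim)) + 1e-8).squeeze(-1)
        sigma = proposal_log_sigma.exp()
        log_q = (-0.5 * ((theta - proposal_mu) / sigma) ** 2 - proposal_log_sigma).sum(dim=1)
        g_loss = -(reward * log_q).mean()
        opt_proposal.zero_grad(); g_loss.backward(); opt_proposal.step()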


Learning to Pivot with Adversarial Networks

Jun 01, 2017
Gilles Louppe, Michael Kagan, Kyle Cranmer

Several techniques for domain adaptation have been proposed to account for differences in the distribution of the data used for training and testing. The majority of this work focuses on a binary domain label. Similar problems occur in a scientific context where there may be a continuous family of plausible data generation processes associated with the presence of systematic uncertainties. Robust inference is possible if it is based on a pivot -- a quantity whose distribution does not depend on the unknown values of the nuisance parameters that parametrize this family of data generation processes. In this work, we introduce and derive theoretical results for a training procedure based on adversarial networks for enforcing the pivotal property (or, equivalently, fairness with respect to continuous attributes) on a predictive model. The method includes a hyperparameter to control the trade-off between accuracy and robustness. We demonstrate the effectiveness of this approach with a toy example and examples from particle physics.

* v1: Original submission. v2: Fixed references. v3: version submitted to NIPS'2017. Code available at https://github.com/glouppe/paper-learning-to-pivot 
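
A minimal sketch of the adversarial objective (the regression adversary below is a simplification; the paper's adversary models a full conditional density of the nuisance parameter):

    import torch
    import torch.nn.functional as F

    def pivot_losses(classifier, adversary, x, y, z, lam=1.0):
        # Classifier loss on the primary task.
        f_out = classifier(x)
        clf_loss = F.binary_cross_entropy_with_logits(f_out.squeeze(-1), y)
        # Adversary tries to recover the nuisance parameter z from the classifier
        # output; the classifier is penalized by how well the adversary succeeds.
        z_pred = adversary(torch.sigmoid(f_out))
        adv_loss = F.mse_loss(z_pred.squeeze(-1), z)
        return clf_loss - lam * adv_loss, adv_loss

Training alternates between minimizing the second return value with respect to the adversary and the first with respect to the classifier; lam is the hyperparameter controlling the accuracy/robustness trade-off mentioned above.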

Approximating Likelihood Ratios with Calibrated Discriminative Classifiers

Mar 18, 2016
Kyle Cranmer, Juan Pavez, Gilles Louppe

In many fields of science, generalized likelihood ratio tests are established tools for statistical inference. At the same time, it has become increasingly common that a simulator (or generative model) is used to describe complex processes that tie parameters $\theta$ of an underlying theory and measurement apparatus to high-dimensional observations $\mathbf{x}\in \mathbb{R}^p$. However, simulators often do not provide a way to evaluate the likelihood function for a given observation $\mathbf{x}$, which motivates a new class of likelihood-free inference algorithms. In this paper, we show that likelihood ratios are invariant under a specific class of dimensionality reduction maps $\mathbb{R}^p \mapsto \mathbb{R}$. As a direct consequence, we show that discriminative classifiers can be used to approximate the generalized likelihood ratio statistic when only a generative model for the data is available. This leads to a new machine learning-based approach to likelihood-free inference that is complementary to Approximate Bayesian Computation, and which does not require a prior on the model parameters. Experimental results on artificial problems with known exact likelihoods illustrate the potential of the proposed method.

* 35 pages, 5 figures 
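
A minimal sketch of the classifier trick with calibration (scikit-learn is used purely for illustration; the choice of base classifier and calibration method are assumptions of this sketch):

    import numpy as np
    from sklearn.calibration import CalibratedClassifierCV
    from sklearn.ensemble import GradientBoostingClassifier

    def fit_ratio_estimator(x0, x1):
        # Train a calibrated classifier to separate samples drawn from p(x|theta0)
        # (label 1) and p(x|theta1) (label 0).
        X = np.vstack([x0, x1])
        y = np.concatenate([np.ones(len(x0)), np.zeros(len(x1))])
        clf = CalibratedClassifierCV(GradientBoostingClassifier(), method="isotonic", cv=3)
        clf.fit(X, y)

        def likelihood_ratio(x):
            s = clf.predict_proba(np.atleast_2d(x))[:, 1]
            # With equal-size samples, s / (1 - s) approximates p(x|theta0) / p(x|theta1).
            return s / np.clip(1.0 - s, 1e-12, None)

        return likelihood_ratio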

MadMiner: Machine learning-based inference for particle physics

Jul 24, 2019
Johann Brehmer, Felix Kling, Irina Espejo, Kyle Cranmer

The legacy measurements of the LHC will require analyzing high-dimensional event data for subtle kinematic signatures, which is challenging for established analysis methods. Recently, a powerful family of multivariate inference techniques that leverage both matrix element information and machine learning has been developed. This approach neither requires the reduction of high-dimensional data to summary statistics nor any simplifications to the underlying physics or detector response. In this paper we introduce MadMiner, a Python module that streamlines the steps involved in this procedure. Wrapping around MadGraph5_aMC and Pythia 8, it supports almost any physics process and model. To aid phenomenological studies, the tool also wraps around Delphes 3, though it is extendable to a full Geant4-based detector simulation. We demonstrate the use of MadMiner in an example analysis of dimension-six operators in ttH production, finding that the new techniques substantially increase the sensitivity to new physics.

* MadMiner is available at https://github.com/diana-hep/madminer 

Mining gold from implicit models to improve likelihood-free inference

Oct 09, 2018
Johann Brehmer, Gilles Louppe, Juan Pavez, Kyle Cranmer

Simulators often provide the best description of real-world phenomena. However, they also lead to challenging inverse problems because the density they implicitly define is often intractable. We present a new suite of simulation-based inference techniques that go beyond the traditional Approximate Bayesian Computation approach, which struggles in a high-dimensional setting, and extend methods that use surrogate models based on neural networks. We show that additional information, such as the joint likelihood ratio and the joint score, can often be extracted from simulators and used to augment the training data for these surrogate models. Finally, we demonstrate that these new techniques are more sample efficient and provide higher-fidelity inference than traditional methods.

* Code available at https://github.com/johannbrehmer/simulator-mining-example . v2: Fixed typos. v3: Expanded discussion, added Lotka-Volterra example 
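
A simplified sketch of one of the augmented-data losses (a single pair of hypotheses and a squared error on the log ratio; the paper defines several such losses and their weighting more carefully):

    import torch

    def train_ratio_surrogate(log_ratio_net, x, log_r_joint, epochs=200, lr=1e-3):
        # Regress a surrogate network on the joint log likelihood ratio extracted from
        # the simulator; in expectation the minimizer is the intractable marginal log ratio.
        opt = torch.optim.Adam(log_ratio_net.parameters(), lr=lr)
        for _ in range(epochs):
            opt.zero_grad()
            loss = ((log_ratio_net(x).squeeze(-1) - log_r_joint) ** 2).mean()
            loss.backward()
            opt.step()
        return log_ratio_net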

A Guide to Constraining Effective Field Theories with Machine Learning

Jul 26, 2018
Johann Brehmer, Kyle Cranmer, Gilles Louppe, Juan Pavez

We develop, discuss, and compare several inference techniques to constrain theory parameters in collider experiments. By harnessing the latent-space structure of particle physics processes, we extract extra information from the simulator. This augmented data can be used to train neural networks that precisely estimate the likelihood ratio. The new methods scale well to many observables and high-dimensional parameter spaces, do not require any approximations of the parton shower and detector response, and can be evaluated in microseconds. Using weak-boson-fusion Higgs production as an example process, we compare the performance of several techniques. The best results are found for likelihood ratio estimators trained with extra information about the score, the gradient of the log likelihood function with respect to the theory parameters. The score also provides sufficient statistics that contain all the information needed for inference in the neighborhood of the Standard Model. These methods enable us to put significantly stronger bounds on effective dimension-six operators than the traditional approach based on histograms. They also outperform generic machine learning methods that do not make use of the particle physics structure, demonstrating their potential to substantially improve the new physics reach of the LHC legacy results.

* Phys. Rev. D 98, 052004 (2018) 
* See also the companion publication "Constraining Effective Field Theories with Machine Learning" at arXiv:1805.00013, a brief introduction presenting the key ideas. The code for these studies is available at https://github.com/johannbrehmer/higgs_inference . v2: Added references. v3: Improved description of algorithms, added references. v4: Clarified text, added references 
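
A sketch of the score-regression idea mentioned above (names illustrative): a network is regressed on the joint score extracted from the simulator, and the fitted network approximates the score of the intractable likelihood, i.e. the locally sufficient statistics near the Standard Model point.

    import torch

    def score_regression_loss(score_net, x, joint_score):
        # joint_score: gradient of the joint log likelihood with respect to the theory
        # parameters, available event-by-event from the simulator; score_net outputs
        # one value per theory parameter.
        return ((score_net(x) - joint_score) ** 2).sum(dim=1).mean()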

Constraining Effective Field Theories with Machine Learning

Jul 26, 2018
Johann Brehmer, Kyle Cranmer, Gilles Louppe, Juan Pavez

We present powerful new analysis techniques to constrain effective field theories at the LHC. By leveraging the structure of particle physics processes, we extract extra information from Monte-Carlo simulations, which can be used to train neural network models that estimate the likelihood ratio. These methods scale well to processes with many observables and theory parameters, do not require any approximations of the parton shower or detector response, and can be evaluated in microseconds. We show that they allow us to put significantly stronger bounds on dimension-six operators than existing methods, demonstrating their potential to improve the precision of the LHC legacy constraints.

* Phys. Rev. Lett. 121, 111801 (2018) 
* See also the companion publication "A Guide to Constraining Effective Field Theories with Machine Learning" at arXiv:1805.00020, an in-depth analysis of machine learning techniques for LHC measurements. The code for these studies is available at https://github.com/johannbrehmer/higgs_inference . v2: New schematic figure explaining the new algorithms, added references. v3, v4: Added references 

QCD-Aware Recursive Neural Networks for Jet Physics

Jul 13, 2018
Gilles Louppe, Kyunghyun Cho, Cyril Becot, Kyle Cranmer

Recent progress in applying machine learning for jet physics has been built upon an analogy between calorimeters and images. In this work, we present a novel class of recursive neural networks built instead upon an analogy between QCD and natural languages. In the analogy, four-momenta are like words and the clustering history of sequential recombination jet algorithms is like the parsing of a sentence. Our approach works directly with the four-momenta of a variable-length set of particles, and the jet-based tree structure varies on an event-by-event basis. Our experiments highlight the flexibility of our method for building task-specific jet embeddings and show that recursive architectures are significantly more accurate and data efficient than previous image-based networks. We extend the analogy from individual jets (sentences) to full events (paragraphs), and show for the first time an event-level classifier operating on all the stable particles produced in an LHC event.

* 16 pages, 5 figures, 3 appendices, corresponding code at https://github.com/glouppe/recnn 
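
A hedged sketch of the recursive embedding over a jet clustering tree (the gating and the particular combination function used in the paper are richer than this):

    import torch
    import torch.nn as nn

    class JetRecNN(nn.Module):
        def __init__(self, dim=64):
            super().__init__()
            self.leaf = nn.Linear(4, dim)                # embed a four-momentum
            self.combine = nn.Linear(2 * dim + 4, dim)   # merge children + node momentum

        def embed(self, node):
            # node: {"p4": tensor of shape (4,), "children": [left, right] or []},
            # following the clustering history of a sequential recombination algorithm.
            if not node["children"]:
                return torch.tanh(self.leaf(node["p4"]))
            left = self.embed(node["children"][0])
            right = self.embed(node["children"][1])
            return torch.tanh(self.combine(torch.cat([left, right, node["p4"]])))

The root embedding can then be fed to a classifier head; event-level ("paragraph") embeddings combine the per-jet embeddings in an analogous recursive fashion.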

Hamiltonian Graph Networks with ODE Integrators

Sep 27, 2019
Alvaro Sanchez-Gonzalez, Victor Bapst, Kyle Cranmer, Peter Battaglia

We introduce an approach for imposing physically informed inductive biases in learned simulation models. We combine graph networks with a differentiable ordinary differential equation integrator as a mechanism for predicting future states, and a Hamiltonian as an internal representation. We find that our approach outperforms baselines without these biases in terms of predictive accuracy, energy accuracy, and zero-shot generalization to time-step sizes and integrator orders not experienced during training. This advances the state-of-the-art of learned simulation, and in principle is applicable beyond physical domains.
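
A minimal sketch of the Hamiltonian-plus-integrator idea (an explicit Euler step and a generic differentiable H are used here; the paper uses graph networks for H and higher-order, differentiable integrators):

    import torch

    def hamiltonian_euler_step(H, q, p, dt=0.01):
        # Hamilton's equations, obtained from the learned Hamiltonian by autodiff:
        #   dq/dt = +dH/dp,   dp/dt = -dH/dq
        q = q.detach().requires_grad_(True)
        p = p.detach().requires_grad_(True)
        dHdq, dHdp = torch.autograd.grad(H(q, p).sum(), (q, p))
        q_new = q + dt * dHdp
        p_new = p - dt * dHdq
        return q_new.detach(), p_new.detach()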


Likelihood-free inference with an improved cross-entropy estimator

Aug 02, 2018
Markus Stoye, Johann Brehmer, Gilles Louppe, Juan Pavez, Kyle Cranmer

We extend recent work (Brehmer et al., 2018) that uses neural networks as surrogate models for likelihood-free inference. As in the previous work, we exploit the fact that the joint likelihood ratio and joint score, conditioned on both observed and latent variables, can often be extracted from an implicit generative model or simulator to augment the training data for these surrogate models. We show how this augmented training data can be used to define a new cross-entropy estimator, which provides improved sample efficiency compared to previous loss functions exploiting the same augmented training data.

* 8 pages, 3 figures 
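
One way to read the improved estimator (a sketch of the idea, not necessarily the exact loss of the paper): the hard class labels in the usual cross-entropy are replaced by the label probabilities implied by the joint likelihood ratio that the simulator provides.

    import torch

    def improved_cross_entropy(s_hat, log_r_joint):
        # Soft label y* = 1 / (1 + r_joint), computed from the joint likelihood ratio.
        y_star = torch.sigmoid(-log_r_joint)
        s_hat = s_hat.clamp(1e-6, 1 - 1e-6)
        return -(y_star * torch.log(s_hat) + (1 - y_star) * torch.log(1 - s_hat)).mean()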

Parameterized Machine Learning for High-Energy Physics

Jan 28, 2016
Pierre Baldi, Kyle Cranmer, Taylor Faucett, Peter Sadowski, Daniel Whiteson

We investigate a new structure for machine learning classifiers applied to problems in high-energy physics by expanding the inputs to include not only measured features but also physics parameters. The physics parameters represent a smoothly varying learning task, and the resulting parameterized classifier can smoothly interpolate between them and replace sets of classifiers trained at individual values. This simplifies the training process and gives improved performance at intermediate values, even for complex problems requiring deep learning. Applications include tools parameterized in terms of theoretical model parameters, such as the mass of a particle, which allow for a single network to provide improved discrimination across a range of masses. This concept is simple to implement and allows for optimized interpolatable results.

* For submission to PRD 
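
A minimal sketch of a parameterized classifier (the architecture details are placeholders): the physics parameter is simply concatenated to the measured features, so a single network interpolates across parameter values.

    import torch
    import torch.nn as nn

    class ParameterizedClassifier(nn.Module):
        def __init__(self, n_features, hidden=64):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(n_features + 1, hidden), nn.ReLU(),
                nn.Linear(hidden, hidden), nn.ReLU(),
                nn.Linear(hidden, 1),
            )

        def forward(self, x, theta):
            # x: (batch, n_features) measured features; theta: (batch, 1) physics
            # parameter (e.g. a hypothesized particle mass). At evaluation time theta
            # can be set to any value, including ones not seen during training.
            return torch.sigmoid(self.net(torch.cat([x, theta], dim=1)))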

Mining for Dark Matter Substructure: Inferring subhalo population properties from strong lenses with machine learning

Sep 04, 2019
Johann Brehmer, Siddharth Mishra-Sharma, Joeri Hermans, Gilles Louppe, Kyle Cranmer

The subtle and unique imprint of dark matter substructure on extended arcs in strong lensing systems contains a wealth of information about the properties and distribution of dark matter on small scales and, consequently, about the underlying particle physics. However, teasing out this effect poses a significant challenge since the likelihood function for realistic simulations of population-level parameters is intractable. We apply recently-developed simulation-based inference techniques to the problem of substructure inference in galaxy-galaxy strong lenses. By leveraging additional information extracted from the simulator, neural networks are efficiently trained to estimate likelihood ratios associated with population-level parameters characterizing substructure. Through proof-of-principle application to simulated data, we show that these methods can provide an efficient and principled way to simultaneously analyze an ensemble of strong lenses, and can be used to mine the large sample of lensing images deliverable by near-future surveys for signatures of dark matter substructure.

* 22 pages, 6 figures, code available at https://github.com/smsharma/mining-for-substructure-lens 

Effective LHC measurements with matrix elements and machine learning

Jun 04, 2019
Johann Brehmer, Kyle Cranmer, Irina Espejo, Felix Kling, Gilles Louppe, Juan Pavez

One major challenge for the legacy measurements at the LHC is that the likelihood function is not tractable when the collected data is high-dimensional and the detector response has to be modeled. We review how different analysis strategies solve this issue, including the traditional histogram approach used in most particle physics analyses, the Matrix Element Method, Optimal Observables, and modern techniques based on neural density estimation. We then discuss powerful new inference methods that use a combination of matrix element information and machine learning to accurately estimate the likelihood function. The MadMiner package automates all necessary data-processing steps. In first studies we find that these new techniques have the potential to substantially improve the sensitivity of the LHC legacy measurements.

* Keynote at the 19th International Workshop on Advanced Computing and Analysis Techniques in Physics Research (ACAT 2019) 

Efficient Probabilistic Inference in the Quest for Physics Beyond the Standard Model

Sep 01, 2018
Atilim Gunes Baydin, Lukas Heinrich, Wahid Bhimji, Bradley Gram-Hansen, Gilles Louppe, Lei Shao, Prabhat, Kyle Cranmer, Frank Wood

We present a novel framework that enables efficient probabilistic inference in large-scale scientific models by allowing the execution of existing domain-specific simulators as probabilistic programs, resulting in highly interpretable posterior inference. Our framework is general purpose and scalable, and is based on a cross-platform probabilistic execution protocol through which an inference engine can control simulators in a language-agnostic way. We demonstrate the technique in particle physics, on a scientifically accurate simulation of the tau lepton decay, which is a key ingredient in establishing the properties of the Higgs boson. High-energy physics has a rich set of simulators based on quantum field theory and the interaction of particles in matter. We show how to use probabilistic programming to perform Bayesian inference in these existing simulator codebases directly, in particular conditioning on observable outputs from a simulated particle detector to directly produce an interpretable posterior distribution over decay pathways. Inference efficiency is achieved via inference compilation where a deep recurrent neural network is trained to parameterize proposal distributions and control the stochastic simulator in a sequential importance sampling scheme, at a fraction of the computational cost of Markov chain Monte Carlo sampling.

* 18 pages, 5 figures 
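
A schematic sketch of the sequential importance sampling that inference compilation drives (every function and method name below is illustrative of a generic simulator interface; it is not the framework's actual protocol or API):

    def importance_weighted_trace(simulator, proposal, observation, obs_log_likelihood):
        # Latent random choices are drawn from learned proposals instead of the
        # simulator's priors; the importance weight accumulates the prior/proposal
        # log ratios plus the likelihood of the observed detector output under the
        # completed trace.
        log_w, trace = 0.0, []
        for address, prior in simulator.random_choices():
            q = proposal(address, trace, observation)   # e.g. parameterized by a trained RNN
            z = q.sample()
            log_w += prior.log_prob(z) - q.log_prob(z)
            trace.append((address, z))
            simulator.condition(address, z)
        log_w += obs_log_likelihood(simulator.output(), observation)
        return trace, log_w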

Improvements to Inference Compilation for Probabilistic Programming in Large-Scale Scientific Simulators

Dec 21, 2017
Mario Lezcano Casado, Atilim Gunes Baydin, David Martinez Rubio, Tuan Anh Le, Frank Wood, Lukas Heinrich, Gilles Louppe, Kyle Cranmer, Karen Ng, Wahid Bhimji, Prabhat

We consider the problem of Bayesian inference in the family of probabilistic models implicitly defined by stochastic generative models of data. In scientific fields ranging from population biology to cosmology, low-level mechanistic components are composed to create complex generative models. These models lead to intractable likelihoods and are typically non-differentiable, which poses challenges for traditional approaches to inference. We extend previous work in "inference compilation", which combines universal probabilistic programming and deep learning methods, to large-scale scientific simulators, and introduce a C++ based probabilistic programming library called CPProb. We successfully use CPProb to interface with SHERPA, a large code-base used in particle physics. Here we describe the technical innovations realized and planned for this library.

* 7 pages, 2 figures 

Etalumis: Bringing Probabilistic Programming to Scientific Simulators at Scale

Jul 08, 2019
Atılım Güneş Baydin, Lei Shao, Wahid Bhimji, Lukas Heinrich, Lawrence Meadows, Jialin Liu, Andreas Munk, Saeid Naderiparizi, Bradley Gram-Hansen, Gilles Louppe, Mingfei Ma, Xiaohui Zhao, Philip Torr, Victor Lee, Kyle Cranmer, Prabhat, Frank Wood

Probabilistic programming languages (PPLs) are receiving widespread attention for performing Bayesian inference in complex generative models. However, applications to science remain limited because of the impracticability of rewriting complex scientific simulators in a PPL, the computational cost of inference, and the lack of scalable implementations. To address these, we present a novel PPL framework that couples directly to existing scientific simulators through a cross-platform probabilistic execution protocol and provides Markov chain Monte Carlo (MCMC) and deep-learning-based inference compilation (IC) engines for tractable inference. To guide IC inference, we perform distributed training of a dynamic 3DCNN--LSTM architecture with a PyTorch-MPI-based framework on 1,024 32-core CPU nodes of the Cori supercomputer with a global minibatch size of 128k, achieving a performance of 450 Tflop/s through enhancements to PyTorch. We demonstrate a Large Hadron Collider (LHC) use-case with the C++ Sherpa simulator and achieve the largest-scale posterior inference in a Turing-complete PPL.

* 14 pages, 8 figures 

Machine Learning in High Energy Physics Community White Paper

Jul 08, 2018
Kim Albertsson, Piero Altoe, Dustin Anderson, Michael Andrews, Juan Pedro Araque Espinosa, Adam Aurisano, Laurent Basara, Adrian Bevan, Wahid Bhimji, Daniele Bonacorsi, Paolo Calafiura, Mario Campanelli, Louis Capps, Federico Carminati, Stefano Carrazza, Taylor Childers, Elias Coniavitis, Kyle Cranmer, Claire David, Douglas Davis, Javier Duarte, Martin Erdmann, Jonas Eschle, Amir Farbin, Matthew Feickert, Nuno Filipe Castro, Conor Fitzpatrick, Michele Floris, Alessandra Forti, Jordi Garra-Tico, Jochen Gemmler, Maria Girone, Paul Glaysher, Sergei Gleyzer, Vladimir Gligorov, Tobias Golling, Jonas Graw, Lindsey Gray, Dick Greenwood, Thomas Hacker, John Harvey, Benedikt Hegner, Lukas Heinrich, Ben Hooberman, Johannes Junggeburth, Michael Kagan, Meghan Kane, Konstantin Kanishchev, Przemysław Karpiński, Zahari Kassabov, Gaurav Kaul, Dorian Kcira, Thomas Keck, Alexei Klimentov, Jim Kowalkowski, Luke Kreczko, Alexander Kurepin, Rob Kutschke, Valentin Kuznetsov, Nicolas Köhler, Igor Lakomov, Kevin Lannon, Mario Lassnig, Antonio Limosani, Gilles Louppe, Aashrita Mangu, Pere Mato, Narain Meenakshi, Helge Meinhard, Dario Menasce, Lorenzo Moneta, Seth Moortgat, Mark Neubauer, Harvey Newman, Hans Pabst, Michela Paganini, Manfred Paulini, Gabriel Perdue, Uzziel Perez, Attilio Picazio, Jim Pivarski, Harrison Prosper, Fernanda Psihas, Alexander Radovic, Ryan Reece, Aurelius Rinkevicius, Eduardo Rodrigues, Jamal Rorie, David Rousseau, Aaron Sauers, Steven Schramm, Ariel Schwartzman, Horst Severini, Paul Seyfert, Filip Siroky, Konstantin Skazytkin, Mike Sokoloff, Graeme Stewart, Bob Stienen, Ian Stockdale, Giles Strong, Savannah Thais, Karen Tomko, Eli Upfal, Emanuele Usai, Andrey Ustyuzhanin, Martin Vala, Sofia Vallecorsa, Mauro Verzetti, Xavier Vilasís-Cardona, Jean-Roch Vlimant, Ilija Vukotic, Sean-Jiun Wang, Gordon Watts, Michael Williams, Wenjing Wu, Stefan Wunsch, Omar Zapata

Machine learning is an important research area in particle physics, beginning with applications to high-level physics analysis in the 1990s and 2000s, followed by an explosion of applications in particle and event identification and reconstruction in the 2010s. In this document we discuss promising future research and development areas in machine learning in particle physics with a roadmap for their implementation, software and hardware resource requirements, collaborative initiatives with the data science community, academia and industry, and training the particle physics community in data science. The main objective of the document is to connect and motivate these areas of research and development with the physics drivers of the High-Luminosity Large Hadron Collider and future neutrino experiments and identify the resource needs for their implementation. Additionally we identify areas where collaboration with external communities will be of great benefit.

* Editors: Sergei Gleyzer, Paul Seyfert and Steven Schramm 
