Machine Learning in High Energy Physics Community White Paper
Jul 08, 2018
Kim Albertsson, Piero Altoe, Dustin Anderson, Michael Andrews, Juan Pedro Araque Espinosa, Adam Aurisano, Laurent Basara, Adrian Bevan, Wahid Bhimji, Daniele Bonacorsi, Paolo Calafiura, Mario Campanelli, Louis Capps, Federico Carminati, Stefano Carrazza, Taylor Childers, Elias Coniavitis, Kyle Cranmer, Claire David, Douglas Davis, Javier Duarte, Martin Erdmann, Jonas Eschle, Amir Farbin, Matthew Feickert, Nuno Filipe Castro, Conor Fitzpatrick, Michele Floris, Alessandra Forti, Jordi Garra-Tico, Jochen Gemmler, Maria Girone, Paul Glaysher, Sergei Gleyzer, Vladimir Gligorov, Tobias Golling, Jonas Graw, Lindsey Gray, Dick Greenwood, Thomas Hacker, John Harvey, Benedikt Hegner, Lukas Heinrich, Ben Hooberman, Johannes Junggeburth, Michael Kagan, Meghan Kane, Konstantin Kanishchev, Przemysław Karpiński, Zahari Kassabov, Gaurav Kaul, Dorian Kcira, Thomas Keck, Alexei Klimentov, Jim Kowalkowski, Luke Kreczko, Alexander Kurepin, Rob Kutschke, Valentin Kuznetsov, Nicolas Köhler, Igor Lakomov, Kevin Lannon, Mario Lassnig, Antonio Limosani, Gilles Louppe, Aashrita Mangu, Pere Mato, Narain Meenakshi, Helge Meinhard, Dario Menasce, Lorenzo Moneta, Seth Moortgat, Mark Neubauer, Harvey Newman, Hans Pabst, Michela Paganini, Manfred Paulini, Gabriel Perdue, Uzziel Perez, Attilio Picazio, Jim Pivarski, Harrison Prosper, Fernanda Psihas, Alexander Radovic, Ryan Reece, Aurelius Rinkevicius, Eduardo Rodrigues, Jamal Rorie, David Rousseau, Aaron Sauers, Steven Schramm, Ariel Schwartzman, Horst Severini, Paul Seyfert, Filip Siroky, Konstantin Skazytkin, Mike Sokoloff, Graeme Stewart, Bob Stienen, Ian Stockdale, Giles Strong, Savannah Thais, Karen Tomko, Eli Upfal, Emanuele Usai, Andrey Ustyuzhanin, Martin Vala, Sofia Vallecorsa, Mauro Verzetti, Xavier Vilasís-Cardona, Jean-Roch Vlimant, Ilija Vukotic, Sean-Jiun Wang, Gordon Watts, Michael Williams, Wenjing Wu, Stefan Wunsch, Omar Zapata

Machine learning is an important research area in particle physics, beginning with applications to high-level physics analysis in the 1990s and 2000s, followed by an explosion of applications in particle and event identification and reconstruction in the 2010s. In this document we discuss promising future research and development areas in machine learning in particle physics with a roadmap for their implementation, software and hardware resource requirements, collaborative initiatives with the data science community, academia and industry, and training the particle physics community in data science. The main objective of the document is to connect and motivate these areas of research and development with the physics drivers of the High-Luminosity Large Hadron Collider and future neutrino experiments and identify the resource needs for their implementation. Additionally we identify areas where collaboration with external communities will be of great benefit.

* Editors: Sergei Gleyzer, Paul Seyfert and Steven Schramm
Click to Read Paper
We develop a framework that allows the use of the multi-level Monte Carlo (MLMC) methodology (Giles 2015) to calculate expectations with respect to the invariant measures of ergodic SDEs. In that context, we study the (over-damped) Langevin equations with strongly convex potential. We show that, when appropriate contracting couplings for the numerical integrators are available, one can obtain a time-uniform estimates of the MLMC variance in stark contrast to the majority of the results in the MLMC literature. As a consequence, one can approximate expectations with respect to the invariant measure in an unbiased way without the need of a Metropolis- Hastings step. In addition, a root mean square error of $\mathcal{O}(\epsilon)$ is achieved with $\mathcal{O}(\epsilon^{-2})$ complexity on par with Markov Chain Monte Carlo (MCMC) methods, which however can be computationally intensive when applied to large data sets. Finally, we present a multilevel version of the recently introduced Stochastic Gradient Langevin (SGLD) method (Welling and Teh, 2011) built for large datasets applications. We show that this is the first stochastic gradient MCMC method with complexity $\mathcal{O}(\epsilon^{-2}|\log {\epsilon}|^{3})$, which is asymptotically an order $\epsilon$ lower than the $ \mathcal{O}(\epsilon^{-3})$ complexity of all stochastic gradient MCMC methods that are currently available. Numerical experiments confirm our theoretical findings.

* 30 pages, 6 figures
Click to Read Paper