This paper surveys the field of reinforcement learning from a computer-science perspective. It is written to be accessible to researchers familiar with machine learning. Both the historical basis of the field and a broad selection of current work are summarized. Reinforcement learning is the problem faced by an agent that learns behavior through trial-and-error interactions with a dynamic environment. The work described here has a resemblance to work in psychology, but differs considerably in the details and in the use of the word ``reinforcement.'' The paper discusses central issues of reinforcement learning, including trading off exploration and exploitation, establishing the foundations of the field via Markov decision theory, learning from delayed reinforcement, constructing empirical models to accelerate learning, making use of generalization and hierarchy, and coping with hidden state. It concludes with a survey of some implemented systems and an assessment of the practical utility of current methods for reinforcement learning.

* Journal of Artificial Intelligence Research, Vol 4, (1996), 237-285
* See for any accompanying files
Click to Read Paper
Even in the absence of any explicit semantic annotation, vast collections of audio recordings provide valuable information for learning the categorical structure of sounds. We consider several class-agnostic semantic constraints that apply to unlabeled nonspeech audio: (i) noise and translations in time do not change the underlying sound category, (ii) a mixture of two sound events inherits the categories of the constituents, and (iii) the categories of events in close temporal proximity are likely to be the same or related. Without labels to ground them, these constraints are incompatible with classification loss functions. However, they may still be leveraged to identify geometric inequalities needed for triplet loss-based training of convolutional neural networks. The result is low-dimensional embeddings of the input spectrograms that recover 41% and 84% of the performance of their fully-supervised counterparts when applied to downstream query-by-example sound retrieval and sound event classification tasks, respectively. Moreover, in limited-supervision settings, our unsupervised embeddings double the state-of-the-art classification performance.

* Submitted to ICASSP 2018
Click to Read Paper
We report on the production experience of the D0 experiment at the Fermilab Tevatron, using the SAM data handling system with a variety of computing hardware configurations, batch systems, and mass storage strategies. We have stored more than 300 TB of data in the Fermilab Enstore mass storage system. We deliver data through this system at an average rate of more than 2 TB/day to analysis programs, with a substantial multiplication factor in the consumed data through intelligent cache management. We handle more than 1.7 Million files in this system and provide data delivery to user jobs at Fermilab on four types of systems: a reconstruction farm, a large SMP system, a Linux batch cluster, and a Linux desktop cluster. In addition, we import simulation data generated at 6 sites worldwide, and deliver data to jobs at many more sites. We describe the scope of the data handling deployment worldwide, the operational experience with this system, and the feedback of that experience.

* ECONF C0303241:MOKT002,2003
* Talk from the 2003 Computing in High Energy and Nuclear Physics (CHEP03), La Jolla, Ca, USA, March 2003, 6 pages. PSN MOKT002
Click to Read Paper
Convolutional Neural Networks (CNNs) have proven very effective in image classification and show promise for audio. We use various CNN architectures to classify the soundtracks of a dataset of 70M training videos (5.24 million hours) with 30,871 video-level labels. We examine fully connected Deep Neural Networks (DNNs), AlexNet [1], VGG [2], Inception [3], and ResNet [4]. We investigate varying the size of both training set and label vocabulary, finding that analogs of the CNNs used in image classification do well on our audio classification task, and larger training and label sets help up to a point. A model using embeddings from these classifiers does much better than raw features on the Audio Set [5] Acoustic Event Detection (AED) classification task.

* Accepted for publication at ICASSP 2017 Changes: Added definitions of mAP, AUC, and d-prime. Updated mAP/AUC/d-prime numbers for Audio Set based on changes of latest Audio Set revision. Changed wording to fit 4 page limit with new additions
Click to Read Paper
A Deep Neural Network for Pixel-Level Electromagnetic Particle Identification in the MicroBooNE Liquid Argon Time Projection Chamber
Aug 22, 2018
MicroBooNE collaboration, C. Adams, M. Alrashed, R. An, J. Anthony, J. Asaadi, A. Ashkenazi, M. Auger, S. Balasubramanian, B. Baller, C. Barnes, G. Barr, M. Bass, F. Bay, A. Bhat, K. Bhattacharya, M. Bishai, A. Blake, T. Bolton, L. Camilleri, D. Caratelli, I. Caro Terrazas, R. Carr, R. Castillo Fernandez, F. Cavanna, G. Cerati, Y. Chen, E. Church, D. Cianci, E. Cohen, G. H. Collin, J. M. Conrad, M. Convery, L. Cooper-Troendle, J. I. Crespo-Anadon, M. Del Tutto, D. Devitt, A. Diaz, K. Duffy, S. Dytman, B. Eberly, A. Ereditato, L. Escudero Sanchez, J. Esquivel, J. J. Evans, A. A. Fadeeva, R. S. Fitzpatrick, B. T. Fleming, D. Franco, A. P. Furmanski, D. Garcia-Gamez, G. T. Garvey, V. Genty, D. Goeldi, S. Gollapinni, O. Goodwin, E. Gramellini, H. Greenlee, R. Grosso, R. Guenette, P. Guzowski, A. Hackenburg, P. Hamilton, O. Hen, J. Hewes, C. Hill, G. A. Horton-Smith, A. Hourlier, E. -C. Huang, C. James, J. Jan de Vries, L. Jiang, R. A. Johnson, J. Joshi, H. Jostlein, Y. -J. Jwa, G. Karagiorgi, W. Ketchum, B. Kirby, M. Kirby, T. Kobilarcik, I. Kreslo, Y. Li, A. Lister, B. R. Littlejohn, S. Lockwitz, D. Lorca, W. C. Louis, M. Luethi, B. Lundberg, X. Luo, A. Marchionni, S. Marcocci, C. Mariani, J. Marshall, J. Martin-Albo, D. A. Martinez Caicedo, A. Mastbaum, V. Meddage, T. Mettler, G. B. Mills, K. Mistry, A. Mogan, J. Moon, M. Mooney, C. D. Moore, J. Mousseau, M. Murphy, R. Murrells, D. Naples, P. Nienaber, J. Nowak, O. Palamara, V. Pandey, V. Paolone, A. Papadopoulou, V. Papavassiliou, S. F. Pate, Z. Pavlovic, E. Piasetzky, D. Porzio, G. Pulliam, X. Qian, J. L. Raaf, A. Rafique, L. Rochester, M. Ross-Lonergan, C. Rudolf von Rohr, B. Russell, D. W. Schmitz, A. Schukraft, W. Seligman, M. H. Shaevitz, R. Sharankova, J. Sinclair, A. Smith, E. L. Snider, M. Soderberg, S. Soldner-Rembold, S. R. Soleti, P. Spentzouris, J. Spitz, J. St. John, T. Strauss, K. Sutton, S. Sword-Fehlberg, A. M. Szelc, N. Tagg, W. Tang, K. Terao, M. Thomson, R. T. Thornton, M. Toups, Y. -T. Tsai, S. Tufanli, T. Usher, W. Van De Pontseele, R. G. Van de Water, B. Viren, M. Weber, H. Wei, D. A. Wickremasinghe, K. Wierman, Z. Williams, S. Wolbers, T. Wongjirad, K. Woodruff, T. Yang, G. Yarbrough, L. E. Yates, G. P. Zeller, J. Zennamo, C. Zhang

We have developed a convolutional neural network (CNN) that can make a pixel-level prediction of objects in image data recorded by a liquid argon time projection chamber (LArTPC) for the first time. We describe the network design, training techniques, and software tools developed to train this network. The goal of this work is to develop a complete deep neural network based data reconstruction chain for the MicroBooNE detector. We show the first demonstration of a network's validity on real LArTPC data using MicroBooNE collection plane images. The demonstration is performed for stopping muon and a $\nu_\mu$ charged current neutral pion data samples.

Click to Read Paper
For decades, advances in electronics were directly driven by the scaling of CMOS transistors according to Moore's law. However, both the CMOS scaling and the classical computer architecture are approaching fundamental and practical limits, and new computing architectures based on emerging devices, such as resistive random-access memory (RRAM) devices, are expected to sustain the exponential growth of computing capability. Here we propose a novel memory-centric, reconfigurable, general purpose computing platform that is capable of handling the explosive amount of data in a fast and energy-efficient manner. The proposed computing architecture is based on a uniform, physical, resistive, memory-centric fabric that can be optimally reconfigured and utilized to perform different computing and data storage tasks in a massively parallel approach. The system can be tailored to achieve maximal energy efficiency based on the data flow by dynamically allocating the basic computing fabric for storage, arithmetic, and analog computing including neuromorphic computing tasks.

Click to Read Paper