* Accepted to the Thirty-first AAAI Conference on Artificial Intelligence (AAAI), 2017


Learning Options End-to-End for Continuous Action Tasks

Nov 30, 2017

Martin Klissarov, Pierre-Luc Bacon, Jean Harb, Doina Precup


Attend Before you Act: Leveraging human visual attention for continual learning

Jul 25, 2018

Khimya Khetarpal, Doina Precup


* Lifelong Learning: A Reinforcement Learning Approach (LLARLA) Workshop, ICML 2018


* 24th Annual Proceedings of the Advances in Neural Information Processing Systems (2010) pp. 1-9

* 8 pages, 7 figures


Variational Generative Stochastic Networks with Collaborative Shaping

Aug 02, 2017

Philip Bachman, Doina Precup

We develop an approach to training generative models based on unrolling a variational auto-encoder into a Markov chain, and shaping the chain's trajectories using a technique inspired by recent work in Approximate Bayesian computation. We show that the global minimizer of the resulting objective is achieved when the generative model reproduces the target distribution. To allow finer control over the behavior of the models, we add a regularization term inspired by techniques used for regularizing certain types of policy search in reinforcement learning. We present empirical results on the MNIST and TFD datasets which show that our approach offers state-of-the-art performance, both quantitatively and from a qualitative point of view.

* Published at ICML 2015


Investigating Recurrence and Eligibility Traces in Deep Q-Networks

Apr 18, 2017

Jean Harb, Doina Precup


* 8 pages, 3 figures, NIPS 2016 Deep Reinforcement Learning Workshop


Multi-Timescale, Gradient Descent, Temporal Difference Learning with Linear Options

Mar 19, 2017

Peeyush Kumar, Doina Precup



* Accepted for publication at Advances in Neural Information Processing Systems (NIPS) 2015


* Appears in Proceedings of the 29th International Conference on Machine Learning (ICML 2012)


We show that the Bellman operator underlying the options framework leads to a matrix splitting, an approach traditionally used to speed up convergence of iterative solvers for large linear systems of equations. Based on standard comparison theorems for matrix splittings, we then show how the asymptotic rate of convergence varies as a function of the inherent timescales of the options. This new perspective highlights a trade-off between asymptotic performance and the cost of computation associated with building a good set of options.
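
The trade-off described above can be illustrated numerically. The sketch below (a toy policy-evaluation problem of my own construction, not the paper's experiments) treats the k-step operator induced by fixed-duration options as a matrix splitting of I - γP: its iteration matrix is (γP)^k, so longer options shrink the asymptotic error factor per sweep, while each sweep costs more matrix powers to apply.

```python
import numpy as np

# Toy policy evaluation: solve v = r + gamma * P v for a random chain.
rng = np.random.default_rng(0)
n = 6
P = rng.random((n, n))
P /= P.sum(axis=1, keepdims=True)          # row-stochastic transition matrix
r = rng.random(n)
gamma = 0.95
v_star = np.linalg.solve(np.eye(n) - gamma * P, r)   # exact solution

def iterations_to_tol(k, tol=1e-8, max_iter=100_000):
    """Count sweeps of the k-step (option-like) operator needed to reach tol.

    The k-step operator v <- sum_{t<k} (gamma P)^t r + (gamma P)^k v is the
    iteration induced by the matrix splitting whose iteration matrix is
    (gamma P)^k, with spectral radius at most gamma^k.
    """
    G = np.linalg.matrix_power(gamma * P, k)
    c = sum(np.linalg.matrix_power(gamma * P, t) @ r for t in range(k))
    v = np.zeros(n)
    for i in range(1, max_iter + 1):
        v = c + G @ v
        if np.max(np.abs(v - v_star)) < tol:
            return i
    return max_iter

for k in (1, 2, 5):
    print(f"k={k}: {iterations_to_tol(k)} sweeps")
```

Larger k reaches the tolerance in fewer sweeps, but building and applying (γP)^k is itself more expensive, mirroring the trade-off between asymptotic rate and the cost of constructing good options.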

* The results presented in the previous version of this paper were found to be applicable only to "gating execution" and not "call-and-return". We made this distinction clear in the text and added an extension to the call-and-return model

Learning Modular Safe Policies in the Bandit Setting with Application to Adaptive Clinical Trials

Mar 24, 2019

Hossein Aboutalebi, Doina Precup, Tibor Schuster



Off-Policy Deep Reinforcement Learning without Exploration

Dec 07, 2018

Scott Fujimoto, David Meger, Doina Precup

Reinforcement learning traditionally considers the task of balancing exploration and exploitation. This work examines batch reinforcement learning--the task of maximally exploiting a given batch of off-policy data, without further data collection. We demonstrate that due to errors introduced by extrapolation, standard off-policy deep reinforcement learning algorithms, such as DQN and DDPG, are only capable of learning with data correlated to their current policy, making them ineffective for most off-policy applications. We introduce a novel class of off-policy algorithms, batch-constrained reinforcement learning, which restricts the action space to force the agent towards behaving on-policy with respect to a subset of the given data. We extend this notion to deep reinforcement learning, and to the best of our knowledge, present the first continuous control deep reinforcement learning algorithm which can learn effectively from uncorrelated off-policy data.
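
The core batch-constrained idea can be sketched in a tabular setting (a toy illustration of my own, not the authors' deep algorithm): restrict both the greedy policy and the bootstrap target to actions actually present in the batch, so the backup never extrapolates to unseen (state, action) pairs.

```python
import random
from collections import defaultdict

random.seed(0)
n_states, gamma, alpha = 4, 0.9, 0.1

# Hypothetical fixed batch of (s, a, r, s') transitions; each state's
# batch covers only a subset of the actions (action 0 plus one other).
batch = [(s, a, 1.0 if a == 0 else 0.0, (s + 1) % n_states)
         for s in range(n_states) for a in (0, (s % 2) + 1)]

allowed = defaultdict(set)            # actions the batch supports per state
for s, a, _, _ in batch:
    allowed[s].add(a)

Q = defaultdict(float)
for _ in range(2000):
    s, a, r, s2 = random.choice(batch)
    # Batch-constrained backup: bootstrap only over in-batch actions,
    # avoiding extrapolation error on (s', a') pairs never observed.
    target = r + gamma * max(Q[(s2, a2)] for a2 in allowed[s2])
    Q[(s, a)] += alpha * (target - Q[(s, a)])

policy = {s: max(allowed[s], key=lambda a: Q[(s, a)]) for s in range(n_states)}
print(policy)
```

Here the learned policy stays on the support of the batch by construction; the paper extends this constraint to continuous control with deep networks.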


Safe Option-Critic: Learning Safety in the Option-Critic Architecture

Jul 21, 2018

Arushi Jain, Khimya Khetarpal, Doina Precup


* 9 pages, 13 figures, to be published in ALA - ICML Workshop 2018


Connecting Weighted Automata and Recurrent Neural Networks through Spectral Learning

Jul 04, 2018

Guillaume Rabusseau, Tianyu Li, Doina Precup

In this paper, we unravel a fundamental connection between weighted finite automata~(WFAs) and second-order recurrent neural networks~(2-RNNs): in the case of sequences of discrete symbols, WFAs and 2-RNNs with linear activation functions are expressively equivalent. Motivated by this result, we build upon a recent extension of the spectral learning algorithm to vector-valued WFAs and propose the first provable learning algorithm for linear 2-RNNs defined over sequences of continuous input vectors. This algorithm relies on estimating low-rank sub-blocks of the so-called Hankel tensor, from which the parameters of a linear 2-RNN can be provably recovered. The performance of the proposed method is assessed in a simulation study.
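
The expressive equivalence on discrete symbols can be checked numerically. The sketch below (an assumed toy example, not the paper's construction in detail) builds a random WFA and a linear 2-RNN whose transition tensor stacks the WFA's matrices; fed one-hot symbol encodings, the two compute the same function.

```python
import numpy as np

rng = np.random.default_rng(1)
d, n_sym = 3, 2
alpha = rng.random(d)              # initial weight vector
omega = rng.random(d)              # final weight vector
A = rng.random((n_sym, d, d))      # one transition matrix per symbol

def wfa(word):
    """WFA value: f(x1..xn) = alpha^T A_{x1} ... A_{xn} omega."""
    v = alpha.copy()
    for sym in word:
        v = v @ A[sym]
    return v @ omega

def linear_2rnn(word):
    """Linear 2-RNN with one-hot inputs.

    The second-order (bilinear) recurrence h_t[j] = sum_{i,s} h_{t-1}[i]
    * A[s,i,j] * x_t[s], with linear activation, recovers h @ A[sym]
    whenever x_t is the one-hot encoding of symbol sym.
    """
    h = alpha.copy()
    for sym in word:
        x = np.eye(n_sym)[sym]                      # one-hot input vector
        h = np.einsum('i,sij,s->j', h, A, x)        # bilinear transition
    return h @ omega

word = [0, 1, 1, 0]
print(np.isclose(wfa(word), linear_2rnn(word)))     # prints True
```

The interesting direction in the paper is the converse over continuous inputs, where the 2-RNN parameters are recovered from low-rank sub-blocks of the Hankel tensor.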


Neural Network Based Nonlinear Weighted Finite Automata

Dec 21, 2017

Tianyu Li, Guillaume Rabusseau, Doina Precup


* AISTATS 2018


Testing Visual Attention in Dynamic Environments

Oct 30, 2015

Philip Bachman, David Krueger, Doina Precup



* To appear in Advances in Neural Information Processing Systems 27 (NIPS 2014), Dec. 2014
