Stochastic Hyperparameter Optimization through Hypernetworks

Mar 08, 2018

Jonathan Lorraine, David Duvenaud

* 9 pages, 6 figures; revised figures

Optimally-Weighted Herding is Bayesian Quadrature

Ferenc Huszár, David Duvenaud

Herding and kernel herding are deterministic methods of choosing samples which summarise a probability distribution. A related task is choosing samples for estimating integrals using Bayesian quadrature. We show that the criterion minimised when selecting samples in kernel herding is equivalent to the posterior variance in Bayesian quadrature. We then show that sequential Bayesian quadrature can be viewed as a weighted version of kernel herding which achieves performance superior to any other weighted herding method. We demonstrate empirically a rate of convergence faster than O(1/N). Our results also imply an upper bound on the empirical error of the Bayesian quadrature estimate.
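The greedy selection criterion described in the abstract can be sketched concretely. The snippet below is a minimal illustration, assuming an RBF kernel, a finite candidate grid, and an empirical target distribution; the function name `kernel_herding` and all parameter choices are illustrative, not from the paper:

```python
import numpy as np

def rbf(a, b, ell=1.0):
    # RBF kernel matrix between the row vectors of a and b
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ell**2)

def kernel_herding(candidates, target_samples, n_select=5):
    """Greedily pick points whose kernel mean embedding matches the target's."""
    # mu(x) = E_p[k(x, X)], estimated from samples of the target p
    mu = rbf(candidates, target_samples).mean(axis=1)
    chosen = []
    for _ in range(n_select):
        if chosen:
            S = candidates[chosen]
            # repulsion term: average similarity to points already selected
            penalty = rbf(candidates, S).sum(axis=1) / (len(chosen) + 1)
        else:
            penalty = 0.0
        scores = mu - penalty
        scores[chosen] = -np.inf  # never re-pick a candidate
        chosen.append(int(np.argmax(scores)))
    return candidates[chosen]

rng = np.random.default_rng(0)
target = rng.normal(size=(200, 1))       # samples from the target distribution
grid = np.linspace(-3, 3, 61)[:, None]   # candidate locations
pts = kernel_herding(grid, target, n_select=3)
print(pts)
```

The paper's contribution is to show that this same criterion is the Bayesian-quadrature posterior variance, so reweighting the selected points recovers sequential Bayesian quadrature.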

* Accepted as an oral presentation at Uncertainty in Artificial Intelligence 2012. Updated to fix several typos

Towards Understanding Linear Word Analogies

Oct 27, 2018

Kawin Ethayarajh, David Duvenaud, Graeme Hirst

Inference Suboptimality in Variational Autoencoders

May 27, 2018

Chris Cremer, Xuechen Li, David Duvenaud

* ICML

Reinterpreting Importance-Weighted Autoencoders

Aug 15, 2017

Chris Cremer, Quaid Morris, David Duvenaud

The standard interpretation of importance-weighted autoencoders is that they maximize a tighter lower bound on the marginal likelihood than the standard evidence lower bound. We give an alternate interpretation of this procedure: that it optimizes the standard variational lower bound, but using a more complex distribution. We formally derive this result, present a tighter lower bound, and visualize the implicit importance-weighted distribution.
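The two bounds in the abstract can be compared numerically. The sketch below uses a toy one-dimensional Gaussian model with a deliberately mismatched proposal; `log_joint`, `log_q`, and all parameter values are invented for illustration. Averaging K importance weights inside the log recovers the standard ELBO at K = 1 and tightens toward log p(x) as K grows:

```python
import numpy as np

def log_normal_pdf(x, mean, std):
    return -0.5 * np.log(2 * np.pi * std**2) - 0.5 * ((x - mean) / std) ** 2

rng = np.random.default_rng(0)

# Toy model: z ~ N(0, 1), x | z ~ N(z, 1), with observation x_obs.
x_obs = 1.5
def log_joint(z):
    return log_normal_pdf(z, 0.0, 1.0) + log_normal_pdf(x_obs, z, 1.0)

# Deliberately suboptimal Gaussian proposal q(z | x).
q_mean, q_std = 0.5, 1.2
def log_q(z):
    return log_normal_pdf(z, q_mean, q_std)

def bound(K, n_outer=20000):
    """Monte Carlo estimate of E[log (1/K) sum_k w_k]; K = 1 is the ELBO."""
    z = rng.normal(q_mean, q_std, size=(n_outer, K))
    log_w = log_joint(z) - log_q(z)
    return np.mean(np.log(np.mean(np.exp(log_w), axis=1)))

elbo = bound(K=1)
iwae_5 = bound(K=5)
iwae_50 = bound(K=50)
print(elbo, iwae_5, iwae_50)  # increases toward log p(x_obs) as K grows
```

The paper's reinterpretation is that the K > 1 bound is still a standard ELBO, just taken with respect to a more complex implicit distribution than q.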

* ICLR 2017 Workshop

Sticking the Landing: Simple, Lower-Variance Gradient Estimators for Variational Inference

May 28, 2017

Geoffrey Roeder, Yuhuai Wu, David Duvenaud

Testing MCMC code

Roger Grosse, David Duvenaud

Markov Chain Monte Carlo (MCMC) algorithms are a workhorse of probabilistic modeling and inference, but are difficult to debug, and are prone to silent failure if implemented naively. We outline several strategies for testing the correctness of MCMC algorithms. Specifically, we advocate writing code in a modular way, where conditional probability calculations are kept separate from the logic of the sampler. We discuss strategies for both unit testing and integration testing. As a running example, we show how a Python implementation of Gibbs sampling for a mixture of Gaussians model can be tested.
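The modular pattern the abstract advocates can be sketched in a few lines. This is a minimal illustration on a two-component mixture with a single latent indicator, not the paper's full running example; all names are illustrative. The conditional probability lives in its own function, so it can be unit-tested against the joint before the sampler ever runs:

```python
import numpy as np

def log_joint(z, x, pi=0.3, means=(0.0, 2.0)):
    """log p(z, x) for a two-component, unit-variance Gaussian mixture."""
    prior = np.log(pi) if z == 1 else np.log(1 - pi)
    lik = -0.5 * np.log(2 * np.pi) - 0.5 * (x - means[z]) ** 2
    return prior + lik

def conditional_z(x, pi=0.3, means=(0.0, 2.0)):
    """p(z = 1 | x), the quantity used inside the Gibbs sweep."""
    a = np.exp(log_joint(1, x, pi, means))
    b = np.exp(log_joint(0, x, pi, means))
    return a / (a + b)

def gibbs_step(rng, x, pi=0.3, means=(0.0, 2.0)):
    # Sampler logic is kept separate from the probability calculations above.
    return int(rng.random() < conditional_z(x, pi, means))

# Unit test: the conditional must equal the normalized joint (Bayes' rule).
x = 1.1
p1 = np.exp(log_joint(1, x)) / (np.exp(log_joint(0, x)) + np.exp(log_joint(1, x)))
assert abs(conditional_z(x) - p1) < 1e-12

# Integration test: long-run Gibbs frequencies should match the conditional.
rng = np.random.default_rng(0)
draws = [gibbs_step(rng, x) for _ in range(20000)]
assert abs(np.mean(draws) - p1) < 0.02
print("tests passed")
```

Because `conditional_z` is separate from `gibbs_step`, a bug in either one shows up in a different test, which is exactly the silent-failure mode the paper warns about.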

* Presented at the 2014 NIPS workshop on Software Engineering for Machine Learning

Probabilistic ODE Solvers with Runge-Kutta Means

Oct 24, 2014

Michael Schober, David Duvenaud, Philipp Hennig

Runge-Kutta methods are the classic family of solvers for ordinary differential equations (ODEs), and the basis for the state of the art. Like most numerical methods, they return point estimates. We construct a family of probabilistic numerical methods that instead return a Gauss-Markov process defining a probability distribution over the ODE solution. In contrast to prior work, we construct this family such that posterior means match the outputs of the Runge-Kutta family exactly, thus inheriting their proven good properties. Remaining degrees of freedom not identified by the match to Runge-Kutta are chosen such that the posterior probability measure fits the observed structure of the ODE. Our results shed light on the structure of Runge-Kutta solvers from a new direction, provide a richer, probabilistic output, have low computational cost, and raise new research questions.
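For context, the point-estimate behaviour that the probabilistic solvers generalize can be seen in a single classical Runge-Kutta step. The sketch below is the standard RK4 method, which returns one number per step rather than a distribution; it is not the paper's Gauss-Markov construction:

```python
import numpy as np

def rk4_step(f, t, y, h):
    """One classical fourth-order Runge-Kutta step: a point estimate of y(t + h)."""
    k1 = f(t, y)
    k2 = f(t + h / 2, y + h * k1 / 2)
    k3 = f(t + h / 2, y + h * k2 / 2)
    k4 = f(t + h, y + h * k3)
    return y + (h / 6) * (k1 + 2 * k2 + 2 * k3 + k4)

f = lambda t, y: -y          # dy/dt = -y, exact solution y(t) = exp(-t)
t, y, h = 0.0, 1.0, 0.1
ys = [y]
for _ in range(10):
    y = rk4_step(f, t, y, h)
    t += h
    ys.append(y)
print(y, np.exp(-1.0))       # RK4 estimate vs exact value at t = 1
```

The paper's solvers are built so that their posterior mean reproduces outputs like the one above exactly, while additionally quantifying uncertainty around them.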

* 18 pages (9 page conference paper, plus supplements); appears in Advances in Neural Information Processing Systems (NIPS), 2014

Warped Mixtures for Nonparametric Cluster Shapes

Aug 09, 2014

Tomoharu Iwata, David Duvenaud, Zoubin Ghahramani

* Appears in Proceedings of the Twenty-Ninth Conference on Uncertainty in Artificial Intelligence (UAI2013)

Early Stopping is Nonparametric Variational Inference

Apr 06, 2015

Dougal Maclaurin, David Duvenaud, Ryan P. Adams

* 8 pages, 5 figures

Gradient-based Hyperparameter Optimization through Reversible Learning

Apr 02, 2015

Dougal Maclaurin, David Duvenaud, Ryan P. Adams

* 10 figures. Submitted to ICML

Additive Gaussian Processes

David Duvenaud, Hannes Nickisch, Carl Edward Rasmussen

We introduce a Gaussian process model of functions which are additive. An additive function is one which decomposes into a sum of low-dimensional functions, each depending on only a subset of the input variables. Additive GPs generalize both Generalized Additive Models, and the standard GP models which use squared-exponential kernels. Hyperparameter learning in this model can be seen as Bayesian Hierarchical Kernel Learning (HKL). We introduce an expressive but tractable parameterization of the kernel function, which allows efficient evaluation of all input interaction terms, whose number is exponential in the input dimension. The additional structure discoverable by this model results in increased interpretability, as well as state-of-the-art predictive power in regression tasks.
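The efficient evaluation of exponentially many interaction terms mentioned in the abstract rests on elementary symmetric polynomials of one-dimensional base kernels, computable in polynomial time via the Newton-Girard recursion. The sketch below is a minimal illustration of that trick, assuming squared-exponential base kernels; the function name and weight values are illustrative:

```python
import numpy as np

def additive_kernel(x, y, lengthscales, order_weights):
    """Weighted sum over interaction orders of elementary symmetric polynomials
    of per-dimension base kernels, via the Newton-Girard recursion."""
    # one 1-d squared-exponential base kernel value per input dimension
    z = np.exp(-0.5 * ((x - y) / lengthscales) ** 2)
    D = len(z)
    # power sums s_i = sum_d z_d^i
    s = [None] + [np.sum(z ** i) for i in range(1, D + 1)]
    # Newton-Girard: e_k = (1/k) * sum_{i=1}^{k} (-1)^(i-1) * e_{k-i} * s_i
    e = [1.0]
    for k in range(1, D + 1):
        e.append(sum((-1) ** (i - 1) * e[k - i] * s[i]
                     for i in range(1, k + 1)) / k)
    # e[k] sums all order-k interaction terms; weight and combine them
    return sum(w * e[k] for k, w in enumerate(order_weights, start=1))

x = np.array([0.3, -1.0, 0.7])
y = np.array([0.1, 0.2, 0.5])
print(additive_kernel(x, y, np.ones(3), order_weights=[1.0, 0.5, 0.25]))
```

Each `e[k]` collapses all C(D, k) order-k products of base kernels into one number in O(D^2) total work, which is what makes the full additive kernel tractable.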

* Appearing in Neural Information Processing Systems 2011

Noisy Natural Gradient as Variational Inference

Feb 26, 2018

Guodong Zhang, Shengyang Sun, David Duvenaud, Roger Grosse

Neural networks for the prediction of organic chemistry reactions

Oct 17, 2016

Jennifer N. Wei, David Duvenaud, Alán Aspuru-Guzik

* ACS.Cent.Sci. 2 (2016) 725-732

* 21 pages, 5 figures

Explaining Image Classifiers by Counterfactual Generation

Oct 11, 2018

Chun-Hao Chang, Elliot Creager, Anna Goldenberg, David Duvenaud

Avoiding pathologies in very deep networks

Jul 08, 2016

David Duvenaud, Oren Rippel, Ryan P. Adams, Zoubin Ghahramani

* Fixed a typo regarding number of layers

Isolating Sources of Disentanglement in Variational Autoencoders

Oct 22, 2018

Ricky T. Q. Chen, Xuechen Li, Roger Grosse, David Duvenaud

* Added more experiments and improved clarity
