Mondrian Forests: Efficient Online Random Forests

Feb 16, 2015

Balaji Lakshminarayanan, Daniel M. Roy, Yee Whye Teh

* Advances in Neural Information Processing Systems 27 (NIPS), pages 3140-3148, 2014

**Click to Read Paper**

Scaling up the Automatic Statistician: Scalable Structure Discovery using Gaussian Processes

Feb 14, 2018

Hyunjik Kim, Yee Whye Teh

* AISTATS 2018 (oral)

**Click to Read Paper**

Inferring ground truth from multi-annotator ordinal data: a probabilistic approach

Apr 30, 2013

Balaji Lakshminarayanan, Yee Whye Teh

**Click to Read Paper**

Belief Optimization for Binary Networks: A Stable Alternative to Loopy Belief Propagation

Jan 10, 2013

Max Welling, Yee Whye Teh

* Appears in Proceedings of the Seventeenth Conference on Uncertainty in Artificial Intelligence (UAI2001)

**Click to Read Paper**

A Fast and Simple Algorithm for Training Neural Probabilistic Language Models

Jun 27, 2012

Andriy Mnih, Yee Whye Teh

In spite of their superior performance, neural probabilistic language models (NPLMs) remain far less widely used than n-gram models due to their notoriously long training times, which are measured in weeks even for moderately-sized datasets. Training NPLMs is computationally expensive because they are explicitly normalized, which leads to having to consider all words in the vocabulary when computing the log-likelihood gradients. We propose a fast and simple algorithm for training NPLMs based on noise-contrastive estimation, a newly introduced procedure for estimating unnormalized continuous distributions. We investigate the behaviour of the algorithm on the Penn Treebank corpus and show that it reduces the training times by more than an order of magnitude without affecting the quality of the resulting models. The algorithm is also more efficient and much more stable than importance sampling because it requires far fewer noise samples to perform well. We demonstrate the scalability of the proposed approach by training several neural language models on a 47M-word corpus with an 80K-word vocabulary, obtaining state-of-the-art results on the Microsoft Research Sentence Completion Challenge dataset.

* In Proceedings of the 29th International Conference on Machine Learning, pages 1751-1758, 2012

**Click to Read Paper**
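The abstract above turns the expensive normalized-likelihood problem into a binary classification between data words and noise words. A minimal sketch of the noise-contrastive estimation objective for a single context, assuming the model supplies unnormalized log-scores and the noise distribution's log-probabilities are known (function and argument names here are illustrative, not from the paper's implementation):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def nce_loss(score_data, scores_noise, logp_noise_data, logp_noise_samples, k):
    # Posterior log-odds that a word came from the data rather than from
    # one of k noise samples: model score minus log(k * p_noise(word)).
    delta_data = score_data - (np.log(k) + logp_noise_data)
    delta_noise = scores_noise - (np.log(k) + logp_noise_samples)
    # Negative log-probability of labelling the data word "data"
    # and each of the k noise words "noise".
    return -(np.log(sigmoid(delta_data))
             + np.sum(np.log(1.0 - sigmoid(delta_noise))))
```

Because only the sampled noise words enter the gradient, each update touches a handful of words instead of the whole vocabulary, which is the source of the reported speed-up.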

Fast MCMC sampling for Markov jump processes and continuous time Bayesian networks

Feb 14, 2012

Vinayak Rao, Yee Whye Teh

**Click to Read Paper**

Learning Item Trees for Probabilistic Modelling of Implicit Feedback

Sep 27, 2011

Andriy Mnih, Yee Whye Teh

* 8 pages

**Click to Read Paper**

The Mondrian Kernel

Jun 16, 2016

Matej Balog, Balaji Lakshminarayanan, Zoubin Ghahramani, Daniel M. Roy, Yee Whye Teh

We introduce the Mondrian kernel, a fast random feature approximation to the Laplace kernel. It is suitable for both batch and online learning, and admits a fast kernel-width-selection procedure as the random features can be re-used efficiently for all kernel widths. The features are constructed by sampling trees via a Mondrian process [Roy and Teh, 2009], and we highlight the connection to Mondrian forests [Lakshminarayanan et al., 2014], where trees are also sampled via a Mondrian process, but fit independently. This link provides a new insight into the relationship between kernel methods and random forests.

* Accepted for presentation at the 32nd Conference on Uncertainty in Artificial Intelligence (UAI 2016)

**Click to Read Paper**
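As a toy illustration of the kernel connection stated in the abstract: in one dimension, the cuts of a Mondrian sample form a Poisson process, so the probability that two points fall in the same cell of a tree with lifetime λ is exp(-λ|x - y|), the Laplace kernel. A minimal sketch under that simplification (names are illustrative; this is not the paper's code):

```python
import numpy as np

def mondrian_cells_1d(xs, lifetime, rng, domain=(0.0, 1.0)):
    # In 1D a Mondrian tree reduces to Poisson-process cuts at rate `lifetime`.
    n_cuts = rng.poisson(lifetime * (domain[1] - domain[0]))
    cuts = np.sort(rng.uniform(domain[0], domain[1], size=n_cuts))
    # The cell index of a point is the number of cuts to its left.
    return np.searchsorted(cuts, xs)

def mondrian_kernel_1d(x, y, lifetime, n_trees=2000, seed=0):
    # Fraction of independently sampled trees that put x and y in the
    # same cell; this estimates the Laplace kernel exp(-lifetime * |x - y|).
    rng = np.random.default_rng(seed)
    same = 0
    for _ in range(n_trees):
        cells = mondrian_cells_1d(np.array([x, y]), lifetime, rng)
        same += int(cells[0] == cells[1])
    return same / n_trees
```

Re-using the same sampled cuts for several lifetimes is what makes the kernel-width-selection procedure mentioned above cheap.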

Probabilistic symmetry and invariant neural networks

Jan 18, 2019

Benjamin Bloem-Reddy, Yee Whye Teh

**Click to Read Paper**

A nonparametric HMM for genetic imputation and coalescent inference

Nov 02, 2016

Lloyd T. Elliott, Yee Whye Teh

Genetic sequence data are well described by hidden Markov models (HMMs) in which latent states correspond to clusters of similar mutation patterns. Theory from statistical genetics suggests that these HMMs are nonhomogeneous (their transition probabilities vary along the chromosome) and have large support for self transitions. We develop a new nonparametric model of genetic sequence data, based on the hierarchical Dirichlet process, which supports these self transitions and nonhomogeneity. Our model provides a parameterization of the genetic process that is more parsimonious than other more general nonparametric models which have previously been applied to population genetics. We provide truncation-free MCMC inference for our model using a new auxiliary sampling scheme for Bayesian nonparametric HMMs. In a series of experiments on male X chromosome data from the Thousand Genomes Project and also on data simulated from a population bottleneck we show the benefits of our model over the popular finite model fastPHASE, which can itself be seen as a parametric truncation of our model. We find that the number of HMM states found by our model is correlated with the time to the most recent common ancestor in population bottlenecks. This work demonstrates the flexibility of Bayesian nonparametrics applied to large and complex genetic data.

**Click to Read Paper**

Discovering Multiple Constraints that are Frequently Approximately Satisfied

Jan 10, 2013

Geoffrey E. Hinton, Yee Whye Teh

Some high-dimensional datasets can be modelled by assuming that there are many different linear constraints, each of which is Frequently Approximately Satisfied (FAS) by the data. The probability of a data vector under the model is then proportional to the product of the probabilities of its constraint violations. We describe three methods of learning products of constraints using a heavy-tailed probability distribution for the violations.

* Appears in Proceedings of the Seventeenth Conference on Uncertainty in Artificial Intelligence (UAI2001)

**Click to Read Paper**
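A toy version of the scoring rule described in the abstract, using a Student-t-shaped density for the constraint violations (the specific density and names are illustrative assumptions, not the paper's exact formulation):

```python
import numpy as np

def fas_log_score(x, W, nu=3.0):
    # Violations of the linear constraints: one row of W per constraint.
    v = W @ x
    # Heavy-tailed (Student-t shaped) log-density of each violation, up to
    # an additive constant; the model's log-probability is their sum, i.e.
    # the log of a product over constraint violations.
    return float(np.sum(-0.5 * (nu + 1.0) * np.log1p(v ** 2 / nu)))
```

The heavy tails mean a badly violated constraint is penalized far less than under a Gaussian, which is what lets most constraints be only *approximately* satisfied.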

Modelling sparsity, heterogeneity, reciprocity and community structure in temporal interaction data

Oct 26, 2018

Xenia Miscouridou, François Caron, Yee Whye Teh

We propose a novel class of network models for temporal dyadic interaction data. Our goal is to capture a number of important features often observed in social interactions: sparsity, degree heterogeneity, community structure and reciprocity. We propose a family of models based on self-exciting Hawkes point processes in which events depend on the history of the process. The key component is the conditional intensity function of the Hawkes process, which captures the fact that interactions may arise as a response to past interactions (reciprocity), or due to shared interests between individuals (community structure). In order to capture the sparsity and degree heterogeneity, the base (non time dependent) part of the intensity function builds on compound random measures following Todeschini et al. (2016). We conduct experiments on a variety of real-world temporal interaction data and show that the proposed model outperforms many competing approaches for link prediction, and leads to interpretable parameters.

**Click to Read Paper**
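The key quantity in the abstract is the conditional intensity of a Hawkes process: a base rate plus excitation from past events. A minimal univariate sketch with an exponential triggering kernel (parameter names are illustrative; the paper's base rate builds on compound random measures rather than a constant):

```python
import numpy as np

def hawkes_intensity(t, base, past_times, alpha, beta):
    # Conditional intensity at time t: base rate plus exponentially
    # decaying excitation from every event strictly before t.
    past = np.asarray([s for s in past_times if s < t])
    return base + alpha * np.sum(np.exp(-beta * (t - past)))
```

Reciprocity comes from making events of the pair (j, i) excite the intensity of (i, j), so a message tends to trigger a reply shortly after.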

Causal Inference via Kernel Deviance Measures

Apr 12, 2018

Jovana Mitrovic, Dino Sejdinovic, Yee Whye Teh

**Click to Read Paper**

Poisson intensity estimation with reproducing kernels

Jun 26, 2017

Seth Flaxman, Yee Whye Teh, Dino Sejdinovic

Despite the fundamental nature of the inhomogeneous Poisson process in the theory and application of stochastic processes, and its attractive generalizations (e.g. Cox process), few tractable nonparametric modeling approaches of intensity functions exist, especially when observed points lie in a high-dimensional space. In this paper we develop a new, computationally tractable Reproducing Kernel Hilbert Space (RKHS) formulation for the inhomogeneous Poisson process. We model the square root of the intensity as an RKHS function. Whereas RKHS models used in supervised learning rely on the so-called representer theorem, the form of the inhomogeneous Poisson process likelihood means that the representer theorem does not apply. However, we prove that the representer theorem does hold in an appropriately transformed RKHS, guaranteeing that the optimization of the penalized likelihood can be cast as a tractable finite-dimensional problem. The resulting approach is simple to implement, and readily scales to high dimensions and large-scale datasets.

* AISTATS 2017

**Click to Read Paper**
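The abstract's central modelling move is to put the *square root* of the intensity in an RKHS, so the intensity itself is automatically nonnegative. A minimal 1D sketch of the penalized Poisson likelihood under that parameterization, with an illustrative Laplace kernel and a finite set of basis centers (all names and the specific kernel are assumptions, not the paper's implementation):

```python
import numpy as np

def laplace_k(x, c, scale=1.0):
    # An illustrative kernel choice; the paper works with a general RKHS.
    return np.exp(-np.abs(x - c) / scale)

def penalized_loglik(weights, centers, events, domain, gamma=0.1, n_grid=2001):
    # Square root of the intensity as a kernel expansion: f(x)^2 >= 0.
    f = lambda x: sum(w * laplace_k(x, c) for w, c in zip(weights, centers))
    grid = np.linspace(domain[0], domain[1], n_grid)
    vals = f(grid) ** 2
    # Trapezoidal approximation of the integral term of the Poisson likelihood.
    integral = float(np.sum((vals[1:] + vals[:-1]) / 2.0 * np.diff(grid)))
    loglik = float(np.sum(np.log(f(np.asarray(events)) ** 2))) - integral
    # RKHS-norm-style penalty on the expansion weights.
    return loglik - gamma * float(np.dot(weights, weights))
```

The integral term is what breaks the usual representer theorem; the paper's contribution is showing a representer theorem still holds in a transformed RKHS, so an optimization like the one sketched here stays finite-dimensional.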

Gaussian Processes for Survival Analysis

Nov 02, 2016

Tamara Fernández, Nicolás Rivera, Yee Whye Teh

We introduce a semi-parametric Bayesian model for survival analysis. The model is centred on a parametric baseline hazard, and uses a Gaussian process to model variations away from it nonparametrically, as well as dependence on covariates. As opposed to many other methods in survival analysis, our framework does not impose unnecessary constraints on the hazard rate or the survival function. Furthermore, our model handles left, right and interval censoring mechanisms common in survival analysis. We propose an MCMC algorithm to perform inference and an approximation scheme based on random Fourier features to make computations faster. We report experimental results on synthetic and real data, showing that our model performs better than competing models such as Cox proportional hazards, ANOVA-DDP and random survival forests.

* To appear in NIPS 2016

**Click to Read Paper**
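The abstract mentions a random Fourier feature approximation to speed up Gaussian process computations. A generic sketch of that trick for an RBF kernel (the specific kernel, names and parameters are assumptions; the paper applies the idea within its survival model):

```python
import numpy as np

def random_fourier_features(X, n_features, lengthscale, rng):
    # Random Fourier features: phi(x) @ phi(y) approximates the RBF kernel
    # exp(-||x - y||^2 / (2 * lengthscale^2)), turning O(n^3) GP algebra
    # into linear algebra in the (much smaller) feature dimension.
    d = X.shape[1]
    W = rng.normal(0.0, 1.0 / lengthscale, size=(d, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)
```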

Bayesian nonparametrics for Sparse Dynamic Networks

Jul 06, 2016

Konstantina Palla, Francois Caron, Yee Whye Teh

We propose a Bayesian nonparametric prior for time-varying networks. To each node of the network is associated a positive parameter, modeling the sociability of that node. Sociabilities are assumed to evolve over time, and are modeled via a dynamic point process model. The model is able to (a) capture smooth evolution of the interaction between nodes, allowing edges to appear/disappear over time, (b) capture long-term evolution of the sociabilities of the nodes, and (c) yield sparse graphs, where the number of edges grows subquadratically with the number of nodes. The evolution of the sociabilities is described by a tractable time-varying gamma process. We provide some theoretical insights into the model and apply it to three real world datasets.

* 10 pages, 8 figures

**Click to Read Paper**

DR-ABC: Approximate Bayesian Computation with Kernel-Based Distribution Regression

Feb 15, 2016

Jovana Mitrovic, Dino Sejdinovic, Yee Whye Teh

**Click to Read Paper**