Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Francisco J. R. Ruiz

Quantum Circuit Optimization with AlphaTensor

Mar 05, 2024
Francisco J. R. Ruiz, Tuomas Laakkonen, Johannes Bausch, Matej Balog, Mohammadamin Barekatain, Francisco J. H. Heras, Alexander Novikov, Nathan Fitzpatrick, Bernardino Romera-Paredes, John van de Wetering, Alhussein Fawzi, Konstantinos Meichanetzidis, Pushmeet Kohli

A key challenge in realizing fault-tolerant quantum computers is circuit optimization. Focusing on the most expensive gates in fault-tolerant quantum computation (namely, the T gates), we address the problem of T-count optimization, i.e., minimizing the number of T gates that are needed to implement a given circuit. To achieve this, we develop AlphaTensor-Quantum, a method based on deep reinforcement learning that exploits the relationship between optimizing T-count and tensor decomposition. Unlike existing methods for T-count optimization, AlphaTensor-Quantum can incorporate domain-specific knowledge about quantum computation and leverage gadgets, which significantly reduces the T-count of the optimized circuits. AlphaTensor-Quantum outperforms the existing methods for T-count optimization on a set of arithmetic benchmarks (even when compared without making use of gadgets). Remarkably, it discovers an efficient algorithm akin to Karatsuba's method for multiplication in finite fields. AlphaTensor-Quantum also finds the best human-designed solutions for relevant arithmetic computations used in Shor's algorithm and for quantum chemistry simulation, thus demonstrating it can save hundreds of hours of research by optimizing relevant quantum circuits in a fully automated way.

* 25 pages main paper + 19 pages appendix

Via

Access Paper or Ask Questions

Order Matters: Probabilistic Modeling of Node Sequence for Graph Generation

Jun 14, 2021
Xiaohui Chen, Xu Han, Jiajing Hu, Francisco J. R. Ruiz, Liping Liu

Figure 1 for Order Matters: Probabilistic Modeling of Node Sequence for Graph Generation

Figure 2 for Order Matters: Probabilistic Modeling of Node Sequence for Graph Generation

Figure 3 for Order Matters: Probabilistic Modeling of Node Sequence for Graph Generation

Figure 4 for Order Matters: Probabilistic Modeling of Node Sequence for Graph Generation

A graph generative model defines a distribution over graphs. One type of generative model is constructed by autoregressive neural networks, which sequentially add nodes and edges to generate a graph. However, the likelihood of a graph under the autoregressive model is intractable, as there are numerous sequences leading to the given graph; this makes maximum likelihood estimation challenging. Instead, in this work we derive the exact joint probability over the graph and the node ordering of the sequential process. From the joint, we approximately marginalize out the node orderings and compute a lower bound on the log-likelihood using variational inference. We train graph generative models by maximizing this bound, without using the ad-hoc node orderings of previous methods. Our experiments show that the log-likelihood bound is significantly tighter than the bound of previous schemes. Moreover, the models fitted with the proposed algorithm can generate high-quality graphs that match the structures of target graphs not seen during training. We have made our code publicly available at \hyperref[https://github.com/tufts-ml/graph-generation-vi]{https://github.com/tufts-ml/graph-generation-vi}.

* ICML 2021

Via

Access Paper or Ask Questions

VarGrad: A Low-Variance Gradient Estimator for Variational Inference

Oct 29, 2020
Lorenz Richter, Ayman Boustati, Nikolas Nüsken, Francisco J. R. Ruiz, Ömer Deniz Akyildiz

Figure 1 for VarGrad: A Low-Variance Gradient Estimator for Variational Inference

Figure 2 for VarGrad: A Low-Variance Gradient Estimator for Variational Inference

Figure 3 for VarGrad: A Low-Variance Gradient Estimator for Variational Inference

Figure 4 for VarGrad: A Low-Variance Gradient Estimator for Variational Inference

We analyse the properties of an unbiased gradient estimator of the ELBO for variational inference, based on the score function method with leave-one-out control variates. We show that this gradient estimator can be obtained using a new loss, defined as the variance of the log-ratio between the exact posterior and the variational approximation, which we call the $\textit{log-variance loss}$. Under certain conditions, the gradient of the log-variance loss equals the gradient of the (negative) ELBO. We show theoretically that this gradient estimator, which we call $\textit{VarGrad}$ due to its connection to the log-variance loss, exhibits lower variance than the score function method in certain settings, and that the leave-one-out control variate coefficients are close to the optimal ones. We empirically demonstrate that VarGrad offers a favourable variance versus computation trade-off compared to other state-of-the-art estimators on a discrete VAE.

Via

Access Paper or Ask Questions

Unbiased Gradient Estimation for Variational Auto-Encoders using Coupled Markov Chains

Oct 05, 2020
Francisco J. R. Ruiz, Michalis K. Titsias, Taylan Cemgil, Arnaud Doucet

Figure 1 for Unbiased Gradient Estimation for Variational Auto-Encoders using Coupled Markov Chains

Figure 2 for Unbiased Gradient Estimation for Variational Auto-Encoders using Coupled Markov Chains

Figure 3 for Unbiased Gradient Estimation for Variational Auto-Encoders using Coupled Markov Chains

Figure 4 for Unbiased Gradient Estimation for Variational Auto-Encoders using Coupled Markov Chains

The variational auto-encoder (VAE) is a deep latent variable model that has two neural networks in an autoencoder-like architecture; one of them parameterizes the model's likelihood. Fitting its parameters via maximum likelihood is challenging since the computation of the likelihood involves an intractable integral over the latent space; thus the VAE is trained instead by maximizing a variational lower bound. Here, we develop a maximum likelihood training scheme for VAEs by introducing unbiased gradient estimators of the log-likelihood. We obtain the unbiased estimators by augmenting the latent space with a set of importance samples, similarly to the importance weighted auto-encoder (IWAE), and then constructing a Markov chain Monte Carlo (MCMC) coupling procedure on this augmented space. We provide the conditions under which the estimators can be computed in finite time and have finite variance. We demonstrate experimentally that VAEs fitted with unbiased estimators exhibit better predictive performance on three image datasets.

* 16 pages, 4 figures

Via

Access Paper or Ask Questions

Prescribed Generative Adversarial Networks

Oct 09, 2019
Adji B. Dieng, Francisco J. R. Ruiz, David M. Blei, Michalis K. Titsias

Figure 1 for Prescribed Generative Adversarial Networks

Figure 2 for Prescribed Generative Adversarial Networks

Figure 3 for Prescribed Generative Adversarial Networks

Figure 4 for Prescribed Generative Adversarial Networks

Generative adversarial networks (GANs) are a powerful approach to unsupervised learning. They have achieved state-of-the-art performance in the image domain. However, GANs are limited in two ways. They often learn distributions with low support---a phenomenon known as mode collapse---and they do not guarantee the existence of a probability density, which makes evaluating generalization using predictive log-likelihood impossible. In this paper, we develop the prescribed GAN (PresGAN) to address these shortcomings. PresGANs add noise to the output of a density network and optimize an entropy-regularized adversarial loss. The added noise renders tractable approximations of the predictive log-likelihood and stabilizes the training procedure. The entropy regularizer encourages PresGANs to capture all the modes of the data distribution. Fitting PresGANs involves computing the intractable gradients of the entropy regularization term; PresGANs sidestep this intractability using unbiased stochastic estimates. We evaluate PresGANs on several datasets and found they mitigate mode collapse and generate samples with high perceptual quality. We further found that PresGANs reduce the gap in performance in terms of predictive log-likelihood between traditional GANs and variational autoencoders (VAEs).

* Code for this paper can be found at https://github.com/adjidieng/PresGANs

Via

Access Paper or Ask Questions

The Dynamic Embedded Topic Model

Jul 12, 2019
Adji B. Dieng, Francisco J. R. Ruiz, David M. Blei

Figure 1 for The Dynamic Embedded Topic Model

Figure 2 for The Dynamic Embedded Topic Model

Figure 3 for The Dynamic Embedded Topic Model

Figure 4 for The Dynamic Embedded Topic Model

Topic modeling analyzes documents to learn meaningful patterns of words. Dynamic topic models capture how these patterns vary over time for a set of documents that were collected over a large time span. We develop the dynamic embedded topic model (D-ETM), a generative model of documents that combines dynamic latent Dirichlet allocation (D-LDA) and word embeddings. The D-ETM models each word with a categorical distribution whose parameter is given by the inner product between the word embedding and an embedding representation of its assigned topic at a particular time step. The word embeddings allow the D-ETM to generalize to rare words. The D-ETM learns smooth topic trajectories by defining a random walk prior over the embeddings of the topics. We fit the D-ETM using structured amortized variational inference. On a collection of United Nations debates, we find that the D-ETM learns interpretable topics and outperforms D-LDA in terms of both topic quality and predictive performance.

Via

Access Paper or Ask Questions

Topic Modeling in Embedding Spaces

Jul 08, 2019
Adji B. Dieng, Francisco J. R. Ruiz, David M. Blei

Figure 1 for Topic Modeling in Embedding Spaces

Figure 2 for Topic Modeling in Embedding Spaces

Figure 3 for Topic Modeling in Embedding Spaces

Figure 4 for Topic Modeling in Embedding Spaces

Topic modeling analyzes documents to learn meaningful patterns of words. However, existing topic models fail to learn interpretable topics when working with large and heavy-tailed vocabularies. To this end, we develop the Embedded Topic Model (ETM), a generative model of documents that marries traditional topic models with word embeddings. In particular, it models each word with a categorical distribution whose natural parameter is the inner product between a word embedding and an embedding of its assigned topic. To fit the ETM, we develop an efficient amortized variational inference algorithm. The ETM discovers interpretable topics even with large vocabularies that include rare words and stop words. It outperforms existing document models, such as latent Dirichlet allocation (LDA), in terms of both topic quality and predictive performance.

* Code can be found at https://github.com/adjidieng/ETM

Via

Access Paper or Ask Questions

A Contrastive Divergence for Combining Variational Inference and MCMC

May 28, 2019
Francisco J. R. Ruiz, Michalis K. Titsias

Figure 1 for A Contrastive Divergence for Combining Variational Inference and MCMC

Figure 2 for A Contrastive Divergence for Combining Variational Inference and MCMC

Figure 3 for A Contrastive Divergence for Combining Variational Inference and MCMC

Figure 4 for A Contrastive Divergence for Combining Variational Inference and MCMC

We develop a method to combine Markov chain Monte Carlo (MCMC) and variational inference (VI), leveraging the advantages of both inference approaches. Specifically, we improve the variational distribution by running a few MCMC steps. To make inference tractable, we introduce the variational contrastive divergence (VCD), a new divergence that replaces the standard Kullback-Leibler (KL) divergence used in VI. The VCD captures a notion of discrepancy between the initial variational distribution and its improved version (obtained after running the MCMC steps), and it converges asymptotically to the symmetrized KL divergence between the variational distribution and the posterior of interest. The VCD objective can be optimized efficiently with respect to the variational parameters via stochastic optimization. We show experimentally that optimizing the VCD leads to better predictive performance on two latent variable models: logistic matrix factorization and variational autoencoders (VAEs).

* International Conference on Machine Learning (ICML 2019). 12 pages, 3 figures

Via

Access Paper or Ask Questions

Poisson Multi-Bernoulli Mapping Using Gibbs Sampling

Nov 07, 2018
Maryam Fatemi, Karl Granström, Lennart Svensson, Francisco J. R. Ruiz, Lars Hammarstrand

Figure 1 for Poisson Multi-Bernoulli Mapping Using Gibbs Sampling

Figure 2 for Poisson Multi-Bernoulli Mapping Using Gibbs Sampling

Figure 3 for Poisson Multi-Bernoulli Mapping Using Gibbs Sampling

Figure 4 for Poisson Multi-Bernoulli Mapping Using Gibbs Sampling

This paper addresses the mapping problem. Using a conjugate prior form, we derive the exact theoretical batch multi-object posterior density of the map given a set of measurements. The landmarks in the map are modeled as extended objects, and the measurements are described as a Poisson process, conditioned on the map. We use a Poisson process prior on the map and prove that the posterior distribution is a hybrid Poisson, multi-Bernoulli mixture distribution. We devise a Gibbs sampling algorithm to sample from the batch multi-object posterior. The proposed method can handle uncertainties in the data associations and the cardinality of the set of landmarks, and is parallelizable, making it suitable for large-scale problems. The performance of the proposed method is evaluated on synthetic data and is shown to outperform a state-of-the-art method.

* IEEE Transactions on Signal Processing, Vol. 65, Issue 11, June 2017
* 14 pages, 6 figures

Via

Access Paper or Ask Questions

Infinite Factorial Finite State Machine for Blind Multiuser Channel Estimation

Oct 18, 2018
Francisco J. R. Ruiz, Isabel Valera, Lennart Svensson, Fernando Perez-Cruz

Figure 1 for Infinite Factorial Finite State Machine for Blind Multiuser Channel Estimation

Figure 2 for Infinite Factorial Finite State Machine for Blind Multiuser Channel Estimation

Figure 3 for Infinite Factorial Finite State Machine for Blind Multiuser Channel Estimation

Figure 4 for Infinite Factorial Finite State Machine for Blind Multiuser Channel Estimation

New communication standards need to deal with machine-to-machine communications, in which users may start or stop transmitting at any time in an asynchronous manner. Thus, the number of users is an unknown and time-varying parameter that needs to be accurately estimated in order to properly recover the symbols transmitted by all users in the system. In this paper, we address the problem of joint channel parameter and data estimation in a multiuser communication channel in which the number of transmitters is not known. For that purpose, we develop the infinite factorial finite state machine model, a Bayesian nonparametric model based on the Markov Indian buffet that allows for an unbounded number of transmitters with arbitrary channel length. We propose an inference algorithm that makes use of slice sampling and particle Gibbs with ancestor sampling. Our approach is fully blind as it does not require a prior channel estimation step, prior knowledge of the number of transmitters, or any signaling information. Our experimental results, loosely based on the LTE random access channel, show that the proposed approach can effectively recover the data-generating process for a wide range of scenarios, with varying number of transmitters, number of receivers, constellation order, channel length, and signal-to-noise ratio.

* IEEE Transactions on Cognitive Communications and Networking, June 2018, Vol 2, Issue 2, pages 177-191
* 15 pages, 15 figures

Via

Access Paper or Ask Questions