* Published as a conference paper at ICASSP 2018

Vicious Circle Principle and Logic Programs with Aggregates

Aug 21, 2018

Michael Gelfond, Yuanlin Zhang

The paper presents a knowledge representation language $\mathcal{A}log$ which extends ASP with aggregates. The goal is to have a language based on simple syntax and clear intuitive and mathematical semantics. We give some properties of $\mathcal{A}log$, an algorithm for computing its answer sets, and a comparison with other approaches.

* arXiv admin note: text overlap with arXiv:1405.3637

Vicious Circle Principle and Formation of Sets in ASP Based Languages

Aug 29, 2016

Michael Gelfond, Yuanlin Zhang

The paper continues the investigation of Poincaré and Russell's Vicious Circle Principle (VCP) in the context of the design of logic programming languages with sets. We expand the previously introduced language $\mathcal{A}log$ with aggregates by allowing infinite sets and several additional set-related constructs useful for knowledge representation and teaching. In addition, we propose an alternative formalization of the original VCP and incorporate it into the semantics of a new language, Slog+, which allows more liberal construction of sets and their use in programming rules. We show that, for programs without disjunction and infinite sets, the formal semantics of aggregates in Slog+ coincides with that of several other known languages. Their intuitive and formal semantics, however, are based on quite different ideas and seem to be more involved than that of Slog+.

* Paper presented at the 9th Workshop on Answer Set Programming and Other Computing Paradigms (ASPOCP 2016), New York City, USA, 16 October 2016

Splash: User-friendly Programming Interface for Parallelizing Stochastic Algorithms

Sep 23, 2015

Yuchen Zhang, Michael I. Jordan

* redo experiments to learn bigger models; compare Splash with state-of-the-art implementations on Spark

As a contribution to the challenge of building game-playing AI systems, we develop and analyse a formal language for representing and reasoning about strategies. Our logical language builds on the existing general Game Description Language (GDL) and extends it by a standard modality for linear time along with two dual connectives to express preferences when combining strategies. The semantics of the language is provided by a standard state-transition model. As such, problems that require reasoning about games can be solved by the standard methods for reasoning about actions and change. We also endow the language with a specific semantics by which strategy formulas are understood as move recommendations for a player. To illustrate how our formalism supports automated reasoning about strategies, we demonstrate two example methods of implementation: first, we formalise the semantic interpretation of our language in conjunction with game rules and strategy rules in the Situation Calculus; second, we show how the reasoning problem can be solved with Answer Set Programming.

* Appears in Proceedings of the Twenty-Second Conference on Uncertainty in Artificial Intelligence (UAI2006)

Embarrassingly Parallel Inference for Gaussian Processes

Jun 13, 2018

Michael Minyi Zhang, Sinead A. Williamson

Training Gaussian process-based models typically involves an $O(N^3)$ computational bottleneck due to inverting the covariance matrix. Popular methods for overcoming this matrix inversion problem cannot adequately model all types of latent functions, and are often not parallelizable. However, judicious choice of model structure can ameliorate this problem. A mixture-of-experts model that uses a mixture of $K$ Gaussian processes offers modeling flexibility and opportunities for scalable inference. Our embarrassingly parallel algorithm combines low-dimensional matrix inversions with importance sampling to yield a flexible, scalable mixture-of-experts model that offers comparable performance to Gaussian process regression at a much lower computational cost.
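
As a rough illustration of why smaller matrix inversions help, the following back-of-the-envelope comparison assumes (this is not stated in the abstract) that the $N$ observations are split evenly among the $K$ experts and that the cost of importance sampling is ignored:

$$
\underbrace{O(N^3)}_{\text{one full GP}}
\quad \text{versus} \quad
\underbrace{K \cdot O\!\left((N/K)^3\right)}_{K \text{ experts of size } N/K} = O\!\left(N^3 / K^2\right).
$$

Under these assumptions the $K$ inversions are also independent of one another, so they could in principle be run on separate workers.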

* 13 pages, 8 figures, 5 tables, added DOI and updated to meet ACM formatting requirements, In Proceedings of FAT* (2019)

Investigating the Impact of Data Volume and Domain Similarity on Transfer Learning Applications

May 30, 2018

Michael Bernico, Yuntao Li, Dingchao Zhang

* 9 pages

Accelerated Inference for Latent Variable Models

Nov 06, 2017

Michael Minyi Zhang, Fernando Perez-Cruz

Inference of latent feature models in the Bayesian nonparametric setting is generally difficult, especially in high dimensional settings, because it usually requires proposing features from some prior distribution. In special cases, where the integration is tractable, we could sample feature assignments according to a predictive likelihood. However, this still may not be efficient in high dimensions. We present a novel method to accelerate the mixing of latent variable model inference by proposing feature locations from the data, as opposed to the prior. This sampling method is efficient for proper mixing of the Markov chain Monte Carlo sampler, computationally attractive because this method can be performed in parallel, and is theoretically guaranteed to converge to the posterior distribution as its limiting distribution.

SPARC - Sorted ASP with Consistency Restoring Rules

Jan 08, 2013

Evgenii Balai, Michael Gelfond, Yuanlin Zhang

* Proceedings of Answer Set Programming and Other Computing Paradigms (ASPOCP 2012), 5th International Workshop, September 4, 2012, Budapest, Hungary

Coherence Functions with Applications in Large-Margin Classification Methods

Apr 10, 2012

Zhihua Zhang, Guang Dai, Michael I. Jordan

Support vector machines (SVMs) naturally embody sparseness due to their use of hinge loss functions. However, SVMs cannot directly estimate conditional class probabilities. In this paper we propose and study a family of coherence functions, which are convex and differentiable, as surrogates of the hinge function. The coherence function is derived by using the maximum-entropy principle and is characterized by a temperature parameter. It bridges the hinge function and the logit function in logistic regression. The limit of the coherence function at zero temperature corresponds to the hinge function, and the limit of the minimizer of its expected error is the minimizer of the expected error of the hinge loss. We refer to the use of the coherence function in large-margin classification as $\mathcal{C}$-learning, and we present efficient coordinate descent algorithms for the training of regularized $\mathcal{C}$-learning models.

* 28 pages
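
The zero-temperature limit described in the abstract can be illustrated numerically. The abstract does not give the formula for the coherence function, so the sketch below assumes a temperature-smoothed hinge of the form $\rho \log(1 + \exp((1 - z)/\rho))$, chosen only because it is convex, differentiable, and tends to the hinge as $\rho \to 0$; the function name `smoothed_hinge` and the grid of margins are hypothetical, and the paper's exact definition may differ.

```python
import numpy as np

def hinge(z):
    """Standard hinge loss max(0, 1 - z) as a function of the margin z."""
    return np.maximum(0.0, 1.0 - z)

def smoothed_hinge(z, rho):
    """Assumed smooth surrogate: rho * log(1 + exp((1 - z) / rho)).

    np.logaddexp(0, x) computes log(1 + exp(x)) without overflow.
    """
    return rho * np.logaddexp(0.0, (1.0 - z) / rho)

# Hypothetical grid of margins; it contains z = 1, where the gap is largest.
margins = np.linspace(-3.0, 3.0, 13)

# Largest gap between the surrogate and the hinge at each temperature.
gaps = {
    rho: float(np.max(smoothed_hinge(margins, rho) - hinge(margins)))
    for rho in (1.0, 0.1, 0.01)
}
```

For this assumed form the gap to the hinge is at most $\rho \log 2$, so it shrinks linearly as the temperature decreases.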

Communication Efficient Parallel Algorithms for Optimization on Manifolds

Nov 01, 2018

Bayan Saparbayeva, Michael Minyi Zhang, Lizhen Lin

* 15 pages

Robust and Parallel Bayesian Model Selection

Mar 22, 2018

Michael Minyi Zhang, Henry Lam, Lizhen Lin

Effective and accurate model selection is an important problem in modern data analysis. One of the major challenges is the computational burden required to handle large data sets that cannot be stored or processed on one machine. Another challenge one may encounter is the presence of outliers and contaminations that damage the inference quality. The parallel "divide and conquer" model selection strategy divides the observations of the full data set into roughly equal subsets and performs inference and model selection independently on each subset. After local subset inference, this method aggregates the posterior model probabilities or other model/variable selection criteria to obtain a final model by using the notion of geometric median. This approach leads to improved concentration in finding the "correct" model and model parameters and also is provably robust to outliers and data contamination.

* Computational Statistics & Data Analysis, Volume 127, 2018, Pages 229-247, ISSN 0167-9473
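
The aggregation step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the per-subset posterior model probabilities are plain vectors compared under the Euclidean distance, computes their geometric median with the classical Weiszfeld iteration, and uses made-up numbers in which one subset is contaminated. The names `geometric_median` and `subset_posteriors` are hypothetical.

```python
import numpy as np

def geometric_median(points, tol=1e-9, max_iter=1000):
    """Weiszfeld iteration for the Euclidean geometric median of the rows of `points`."""
    points = np.asarray(points, dtype=float)
    estimate = points.mean(axis=0)  # start from the ordinary mean
    for _ in range(max_iter):
        dists = np.linalg.norm(points - estimate, axis=1)
        # Guard against division by zero if the estimate lands on a data point.
        weights = 1.0 / np.maximum(dists, 1e-12)
        new_estimate = weights @ points / weights.sum()
        if np.linalg.norm(new_estimate - estimate) < tol:
            return new_estimate
        estimate = new_estimate
    return estimate

# Hypothetical posterior probabilities of 3 candidate models from 5 data subsets.
subset_posteriors = np.array([
    [0.80, 0.15, 0.05],
    [0.75, 0.20, 0.05],
    [0.85, 0.10, 0.05],
    [0.78, 0.17, 0.05],
    [0.05, 0.05, 0.90],  # subset affected by outliers (assumed for illustration)
])

median = geometric_median(subset_posteriors)
mean = subset_posteriors.mean(axis=0)
selected_model = int(np.argmax(median))
```

In this toy setting the contaminated subset pulls the plain average toward the third model, while the geometric median stays close to the four subsets that agree.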

Using Deep Neural Networks to Automate Large Scale Statistical Analysis for Big Data Applications

Aug 09, 2017

Rongrong Zhang, Wei Deng, Michael Yu Zhu

Parallel Markov Chain Monte Carlo for the Indian Buffet Process

Mar 09, 2017

Michael M. Zhang, Avinava Dubey, Sinead A. Williamson

Indian Buffet Process-based models are an elegant way to discover underlying features within a data set, but inference in such models can be slow. Inferring underlying features using Markov chain Monte Carlo either relies on an uncollapsed representation, which leads to poor mixing, or on a collapsed representation, which leads to a quadratic increase in computational complexity. Existing attempts at distributing inference have introduced additional approximation within the inference procedure. In this paper we present a novel algorithm to perform asymptotically exact parallel Markov chain Monte Carlo inference for Indian Buffet Process models. We take advantage of the fact that the features are conditionally independent under the beta-Bernoulli process. Because of this conditional independence, we can partition the features into two parts: one part containing only the finitely many instantiated features and the other part containing the infinite tail of uninstantiated features. For the finite partition, parallel inference is simple given the instantiation of features. But for the infinite tail, performing uncollapsed MCMC leads to poor mixing and hence we collapse out the features. The resulting hybrid sampler, while being parallel, produces samples asymptotically from the true posterior.

* Workshop paper in Bayesian Nonparametrics: The Next Generation, NIPS 2015

Optimal prediction for sparse linear models? Lower bounds for coordinate-separable M-estimators

Nov 30, 2015

Yuchen Zhang, Martin J. Wainwright, Michael I. Jordan

For the problem of high-dimensional sparse linear regression, it is known that an $\ell_0$-based estimator can achieve a $1/n$ "fast" rate on the prediction error without any conditions on the design matrix, whereas in the absence of restrictive conditions on the design matrix, popular polynomial-time methods only guarantee the $1/\sqrt{n}$ "slow" rate. In this paper, we show that the slow rate is intrinsic to a broad class of M-estimators. In particular, for estimators based on minimizing a least-squares cost function together with a (possibly non-convex) coordinate-wise separable regularizer, there is always a "bad" local optimum such that the associated prediction error is lower bounded by a constant multiple of $1/\sqrt{n}$. For convex regularizers, this lower bound applies to all global optima. The theory is applicable to many popular estimators, including convex $\ell_1$-based methods as well as M-estimators based on nonconvex regularizers, including the SCAD penalty or the MCP regularizer. In addition, for a broad class of nonconvex regularizers, we show that the bad local optima are very common, in that a broad class of local minimization algorithms with random initialization will typically converge to a bad solution.

* Add more coverage on related work; add a new lower bound for design matrices satisfying the restricted eigenvalue condition

$\ell_1$-regularized Neural Networks are Improperly Learnable in Polynomial Time

Oct 13, 2015

Yuchen Zhang, Jason D. Lee, Michael I. Jordan

* 16 pages

Distributed Estimation of Generalized Matrix Rank: Efficient Algorithms and Lower Bounds

Feb 06, 2015

Yuchen Zhang, Martin J. Wainwright, Michael I. Jordan

We study the following generalized matrix rank estimation problem: given an $n \times n$ matrix and a constant $c \geq 0$, estimate the number of eigenvalues that are greater than $c$. In the distributed setting, the matrix of interest is the sum of $m$ matrices held by separate machines. We show that any deterministic algorithm solving this problem must communicate $\Omega(n^2)$ bits, which is order-equivalent to transmitting the whole matrix. In contrast, we propose a randomized algorithm that communicates only $\widetilde O(n)$ bits. The upper bound is matched by an $\Omega(n)$ lower bound on the randomized communication complexity. We demonstrate the practical effectiveness of the proposed algorithm with some numerical experiments.

* 23 pages, 5 figures
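
To make the problem statement concrete, here is a minimal sketch of the naive centralized baseline: every machine ships its full matrix, the matrices are summed, and the eigenvalues above the threshold $c$ are counted. This is the whole-matrix communication pattern the abstract compares against, not the randomized algorithm the paper proposes. The sketch assumes the summed matrix is symmetric (so that `numpy.linalg.eigvalsh` applies), and the function name and example matrices are hypothetical.

```python
import numpy as np

def generalized_rank(local_matrices, c):
    """Count eigenvalues of the summed matrix that are strictly greater than c.

    Assumes the sum is symmetric; each local matrix is transmitted in full,
    which costs on the order of n^2 numbers per machine.
    """
    total = np.sum(local_matrices, axis=0)
    eigenvalues = np.linalg.eigvalsh(total)
    return int(np.sum(eigenvalues > c))

# Two hypothetical machines, each holding a 3 x 3 symmetric matrix.
local_matrices = [
    np.diag([1.0, 0.2, 0.0]),
    np.diag([1.0, 0.2, 0.05]),
]

# The sum is diag(2.0, 0.4, 0.05); two of its eigenvalues exceed c = 0.1.
rank_estimate = generalized_rank(local_matrices, c=0.1)
```

Setting $c = 0$ in this example counts all strictly positive eigenvalues, which for a positive semidefinite matrix is the ordinary rank.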
