A Non-asymptotic, Sharp, and User-friendly Reverse Chernoff-Cramèr Bound

Oct 21, 2018

Anru Zhang, Yuchen Zhou

The Chernoff-Cramèr bound is a widely used technique for analyzing the upper tail of a random variable based on its moment generating function. By elementary proofs, we develop a user-friendly reverse Chernoff-Cramèr bound that yields non-asymptotic lower tail bounds for generic random variables. The new reverse Chernoff-Cramèr bound is used to derive a series of results, including sharp lower tail bounds for sums of independent sub-Gaussian and sub-exponential random variables, which match the classic Hoeffding-type and Bernstein-type concentration inequalities, respectively. We also provide non-asymptotic matching upper and lower tail bounds for a suite of distributions, including the gamma, beta, (regular, weighted, and noncentral) chi-squared, binomial, Poisson, and Irwin-Hall distributions. We apply these results to develop matching upper and lower bounds for the extreme value expectation of sums of independent sub-Gaussian and sub-exponential random variables. Finally, we study a statistical application to sparse signal identification.
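For reference, the classical Chernoff-Cramèr argument controls the upper tail through the moment generating function; the paper's contribution is a non-asymptotic reverse counterpart of this inequality:

```latex
% Classical Chernoff-Cramèr upper tail bound via the moment
% generating function M_X(\lambda) = \mathbb{E}\, e^{\lambda X}:
\[
  \mathbb{P}(X \ge t)
  \;\le\; \inf_{\lambda > 0} \, e^{-\lambda t}\, \mathbb{E}\, e^{\lambda X}.
\]
% The reverse bound developed in the paper yields matching
% non-asymptotic lower bounds on such tail probabilities.
```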

Neural Ranking Models for Temporal Dependency Structure Parsing

Sep 02, 2018

Yuchen Zhang, Nianwen Xue

* 11 pages, 2 figures, 7 tables, to appear at EMNLP 2018, Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP). 2018

* Yuchen Zhang and Nianwen Xue. 2018. Structured Interpretation of Temporal Relations. In Proceedings of the 11th Language Resources and Evaluation Conference (LREC-2018), Miyazaki, Japan

* 9 pages, 2 figures, 8 tables, LREC-2018

Splash: User-friendly Programming Interface for Parallelizing Stochastic Algorithms

Sep 23, 2015

Yuchen Zhang, Michael I. Jordan

* redo experiments to learn bigger models; compare Splash with state-of-the-art implementations on Spark

Stochastic Primal-Dual Coordinate Method for Regularized Empirical Risk Minimization

Sep 09, 2015

Yuchen Zhang, Lin Xiao

We consider a generic convex optimization problem associated with regularized empirical risk minimization of linear predictors. The problem structure allows us to reformulate it as a convex-concave saddle point problem. We propose a stochastic primal-dual coordinate (SPDC) method, which alternates between maximizing over a randomly chosen dual variable and minimizing over the primal variable. An extrapolation step on the primal variable is performed to obtain an accelerated convergence rate. We also develop a mini-batch version of the SPDC method, which facilitates parallel computing, and an extension with weighted sampling probabilities on the dual variables, which has better complexity than uniform sampling on unnormalized data. Both theoretically and empirically, we show that the SPDC method has comparable or better performance than several state-of-the-art optimization methods.
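The saddle-point reformulation mentioned in the abstract is the standard one obtained via convex conjugates of the loss functions (the notation below is generic, not taken from the paper):

```latex
\[
  \min_{x \in \mathbb{R}^d}\; \frac{1}{n}\sum_{i=1}^{n} \phi_i(a_i^{\top} x) + g(x)
  \;=\;
  \min_{x \in \mathbb{R}^d}\, \max_{y \in \mathbb{R}^n}\;
  \frac{1}{n}\sum_{i=1}^{n}
  \bigl( y_i \langle a_i, x \rangle - \phi_i^{*}(y_i) \bigr) + g(x),
\]
% where \phi_i^{*} is the convex conjugate of \phi_i. SPDC alternates a
% proximal step in one randomly chosen dual coordinate y_i with a
% proximal step in x, followed by extrapolation on x.
```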

Communication-Efficient Distributed Optimization of Self-Concordant Empirical Loss

Jan 01, 2015

Yuchen Zhang, Lin Xiao

A Hitting Time Analysis of Stochastic Gradient Langevin Dynamics

Apr 09, 2018

Yuchen Zhang, Percy Liang, Moses Charikar

* Correct two mistakes in the proofs of Lemma 3 and Lemma 5

Macro Grammars and Holistic Triggering for Efficient Semantic Parsing

Aug 31, 2017

Yuchen Zhang, Panupong Pasupat, Percy Liang

To learn a semantic parser from denotations, a learning algorithm must search over a combinatorially large space of logical forms for ones consistent with the annotated denotations. We propose a new online learning algorithm that searches faster as training progresses. The two key ideas are using macro grammars to cache the abstract patterns of useful logical forms found thus far, and holistic triggering to efficiently retrieve the most relevant patterns based on sentence similarity. On the WikiTableQuestions dataset, we first expand the search space of an existing model to improve the state-of-the-art accuracy from 38.7% to 42.7%, and then use macro grammars and holistic triggering to achieve an 11x speedup and an accuracy of 43.7%.
* EMNLP 2017

Convexified Convolutional Neural Networks

Sep 04, 2016

Yuchen Zhang, Percy Liang, Martin J. Wainwright

* 29 pages

Optimal prediction for sparse linear models? Lower bounds for coordinate-separable M-estimators

Nov 30, 2015

Yuchen Zhang, Martin J. Wainwright, Michael I. Jordan

For the problem of high-dimensional sparse linear regression, it is known that an $\ell_0$-based estimator can achieve the $1/n$ "fast" rate on the prediction error without any conditions on the design matrix, whereas, in the absence of restrictive conditions on the design matrix, popular polynomial-time methods only guarantee the $1/\sqrt{n}$ "slow" rate. In this paper, we show that the slow rate is intrinsic to a broad class of M-estimators. In particular, for estimators based on minimizing a least-squares cost function together with a (possibly non-convex) coordinate-wise separable regularizer, there is always a "bad" local optimum such that the associated prediction error is lower bounded by a constant multiple of $1/\sqrt{n}$. For convex regularizers, this lower bound applies to all global optima. The theory is applicable to many popular estimators, including convex $\ell_1$-based methods as well as M-estimators based on nonconvex regularizers such as the SCAD penalty and the MCP regularizer. In addition, for a broad class of nonconvex regularizers, we show that bad local optima are very common, in that a broad class of local minimization algorithms with random initialization will typically converge to a bad solution.
* Add more coverage on related work; add a new lower bound for design matrices satisfying the restricted eigenvalue condition

$\ell_1$-regularized Neural Networks are Improperly Learnable in Polynomial Time

Oct 13, 2015

Yuchen Zhang, Jason D. Lee, Michael I. Jordan

* 16 pages

Distributed Estimation of Generalized Matrix Rank: Efficient Algorithms and Lower Bounds

Feb 06, 2015

Yuchen Zhang, Martin J. Wainwright, Michael I. Jordan

We study the following generalized matrix rank estimation problem: given an $n \times n$ matrix and a constant $c \geq 0$, estimate the number of eigenvalues that are greater than $c$. In the distributed setting, the matrix of interest is the sum of $m$ matrices held by separate machines. We show that any deterministic algorithm solving this problem must communicate $\Omega(n^2)$ bits, which is order-equivalent to transmitting the whole matrix. In contrast, we propose a randomized algorithm that communicates only $\widetilde O(n)$ bits. The upper bound is matched by an $\Omega(n)$ lower bound on the randomized communication complexity. We demonstrate the practical effectiveness of the proposed algorithm with some numerical experiments.
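As a point of reference, the quantity being estimated can be computed exactly in the centralized setting. This direct sketch (function name is ours, for illustration) is precisely the expensive computation that the paper's $\widetilde O(n)$-communication randomized algorithm avoids:

```python
import numpy as np

def generalized_rank(local_matrices, c):
    """Number of eigenvalues of sum(local_matrices) greater than c.

    In the distributed setting of the paper, each matrix in
    `local_matrices` is held by a separate machine; forming the sum
    centrally is what costs Omega(n^2) bits of communication.
    Assumes the matrices of interest are (approximately) symmetric.
    """
    A = np.sum(local_matrices, axis=0)
    A = (A + A.T) / 2            # symmetrize for a stable eigendecomposition
    eigvals = np.linalg.eigvalsh(A)
    return int(np.sum(eigvals > c))
```

For example, with local matrices `diag(1, 0, 0)` and `diag(2, 1, 0)` and threshold `c = 0.5`, the sum `diag(3, 1, 0)` has two eigenvalues above the threshold.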
* 23 pages, 5 figures

Communication-Efficient Algorithms for Statistical Optimization

Oct 11, 2013

Yuchen Zhang, John C. Duchi, Martin Wainwright

* 44 pages, to appear in Journal of Machine Learning Research (JMLR)

Spectral Methods meet EM: A Provably Optimal Algorithm for Crowdsourcing

Nov 01, 2014

Yuchen Zhang, Xi Chen, Dengyong Zhou, Michael I. Jordan

Crowdsourcing is a popular paradigm for effectively collecting labels at low cost. The Dawid-Skene estimator has been widely used for inferring the true labels from the noisy labels provided by non-expert crowdsourcing workers. However, since the estimator maximizes a non-convex log-likelihood function, it is hard to theoretically justify its performance. In this paper, we propose a two-stage efficient algorithm for multi-class crowd labeling problems. The first stage uses the spectral method to obtain an initial estimate of parameters. Then the second stage refines the estimation by optimizing the objective function of the Dawid-Skene estimator via the EM algorithm. We show that our algorithm achieves the optimal convergence rate up to a logarithmic factor. We conduct extensive experiments on synthetic and real datasets. Experimental results demonstrate that the proposed algorithm is comparable to the most accurate empirical approach, while outperforming several other recently proposed methods.
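To make the two-stage idea concrete, here is a minimal sketch of the EM refinement stage for a simplified binary "one-coin" worker model. The paper uses the full Dawid-Skene confusion-matrix model and a spectral-method first stage; the uniform accuracy initialization below is a crude stand-in for that spectral estimate, and all names are ours:

```python
import numpy as np

def one_coin_em(votes, n_iter=20):
    """EM for a binary one-coin crowdsourcing model.

    votes: (n_tasks, n_workers) array of 0/1 labels.
    Returns (posterior P(true label = 1) per task, worker accuracies).
    """
    n_tasks, n_workers = votes.shape
    p = np.full(n_workers, 0.7)   # crude init; the paper uses a spectral estimate
    for _ in range(n_iter):
        # E-step: posterior of the true label being 1 vs 0,
        # treating workers as independent given the truth.
        log1 = (votes * np.log(p) + (1 - votes) * np.log(1 - p)).sum(axis=1)
        log0 = ((1 - votes) * np.log(p) + votes * np.log(1 - p)).sum(axis=1)
        q = 1.0 / (1.0 + np.exp(log0 - log1))
        # M-step: accuracy = expected fraction of votes agreeing with the truth.
        p = (q[:, None] * votes + (1 - q[:, None]) * (1 - votes)).mean(axis=0)
        p = np.clip(p, 1e-6, 1 - 1e-6)
    return q, p
```

The E-step weights each worker by the log-odds of their estimated accuracy, so reliable workers count more than a plain majority vote; this is the sense in which the EM refinement improves on the initial estimate.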

Divide and Conquer Kernel Ridge Regression: A Distributed Algorithm with Minimax Optimal Rates

Apr 29, 2014

Yuchen Zhang, John C. Duchi, Martin J. Wainwright

Language-Independent Representor for Neural Machine Translation

Nov 01, 2018

Long Zhou, Yuchen Liu, Jiajun Zhang, Chengqing Zong, Guoping Huang

Learning Halfspaces and Neural Networks with Random Initialization

Nov 25, 2015

Yuchen Zhang, Jason D. Lee, Martin J. Wainwright, Michael I. Jordan

* 31 pages

Optimality guarantees for distributed statistical estimation

Jun 21, 2014

John C. Duchi, Michael I. Jordan, Martin J. Wainwright, Yuchen Zhang

* 34 pages, 1 figure. Preliminary version appearing in Neural Information Processing Systems 2013 (http://papers.nips.cc/paper/4902-information-theoretic-lower-bounds-for-distributed-statistical-estimation-with-communication-constraints)

AceKG: A Large-scale Knowledge Graph for Academic Data Mining

Aug 07, 2018

Ruijie Wang, Yuchen Yan, Jialu Wang, Yuting Jia, Ye Zhang, Weinan Zhang, Xinbing Wang

* CIKM 2018

Imaging around corners with single-pixel detector by computational ghost imaging

Dec 08, 2016

Bin Bai, Jianbin Liu, Yu Zhou, Songlin Zhang, Yuchen He, Zhuo Xu
