Minimax Optimal Convergence Rates for Estimating Ground Truth from Crowdsourced Labels

May 30, 2016

Chao Gao, Dengyong Zhou

Crowdsourcing has become a primary means for label collection in many real-world machine learning applications. A classical method for inferring the true labels from the noisy labels provided by crowdsourcing workers is the Dawid-Skene estimator. In this paper, we prove convergence rates of a projected EM algorithm for the Dawid-Skene estimator. The revealed exponent in the rate of convergence is shown to be optimal via a lower bound argument. Our work resolves the long-standing issue of whether the Dawid-Skene estimator has sound theoretical guarantees besides its good performance observed in practice. In addition, a comparative study with majority voting illustrates both advantages and pitfalls of the Dawid-Skene estimator.

Double or Nothing: Multiplicative Incentive Mechanisms for Crowdsourcing

Dec 16, 2015

Nihar B. Shah, Dengyong Zhou

On the Impossibility of Convex Inference in Human Computation

Nov 21, 2014

Nihar B. Shah, Dengyong Zhou

Provably Optimal Algorithms for Generalized Linear Contextual Bandits

Jun 18, 2017

Lihong Li, Yu Lu, Dengyong Zhou

Contextual bandits are widely used in Internet services, from news recommendation to advertising and Web search. Generalized linear models (logistic regression in particular) have demonstrated stronger performance than linear models in many applications where rewards are binary. However, most theoretical analyses of contextual bandits so far concern linear bandits. In this work, we propose an upper confidence bound based algorithm for generalized linear contextual bandits, which achieves an $\tilde{O}(\sqrt{dT})$ regret over $T$ rounds with $d$-dimensional feature vectors. This regret matches the minimax lower bound, up to logarithmic terms, and improves on the best previous result by a $\sqrt{d}$ factor, assuming the number of arms is fixed. A key component in our analysis is to establish a new, sharp finite-sample confidence bound for maximum-likelihood estimates in generalized linear models, which may be of independent interest. We also analyze a simpler upper confidence bound algorithm, which is useful in practice, and prove it to have optimal regret for certain cases.

In many machine learning applications, crowdsourcing has become the primary means for label collection. In this paper, we study the optimal error rate for aggregating labels provided by a set of non-expert workers. Under the classic Dawid-Skene model, we establish matching upper and lower bounds with an exact exponent $mI(\pi)$ in which $m$ is the number of workers and $I(\pi)$ the average Chernoff information that characterizes the workers' collective ability. Such an exact characterization of the error exponent allows us to state a precise sample size requirement $m>\frac{1}{I(\pi)}\log\frac{1}{\epsilon}$ in order to achieve an $\epsilon$ misclassification error. In addition, our results imply the optimality of various EM algorithms for crowdsourcing initialized by consistent estimators.
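
The sample size requirement above can be turned into a small calculation. The sketch below is illustrative only: the function names are hypothetical, and the closed form used for the Chernoff information assumes a simplified binary "one-coin" setting in which every worker is correct with the same probability $p$, which is not stated in the abstract. It also ignores lower-order terms, so the result should be read as a rough guide rather than the paper's exact statement.

```python
import math


def chernoff_information_one_coin(p):
    """Chernoff information between Bernoulli(p) and Bernoulli(1 - p).

    Assumption (not from the abstract): binary labels and a worker who is
    correct with probability p, 0 < p < 1. By symmetry the optimum is at
    t = 1/2, which gives -log(2 * sqrt(p * (1 - p))).
    """
    return -math.log(2.0 * math.sqrt(p * (1.0 - p)))


def workers_needed(avg_info, epsilon):
    """Smallest integer m satisfying m > (1 / avg_info) * log(1 / epsilon)."""
    if avg_info <= 0:
        raise ValueError("average Chernoff information must be positive")
    if not 0 < epsilon < 1:
        raise ValueError("epsilon must lie in (0, 1)")
    return math.floor(math.log(1.0 / epsilon) / avg_info) + 1


if __name__ == "__main__":
    info = chernoff_information_one_coin(0.8)
    print(info, workers_needed(info, 0.01))
```

Under these assumptions, workers who are each correct 80% of the time give $I(\pi) \approx 0.223$, so a target error of $\epsilon = 0.01$ calls for at least 21 workers per item. As $p$ approaches 0.5 the information tends to zero and the required number of workers grows without bound.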

Statistical Decision Making for Optimal Budget Allocation in Crowd Labeling

Apr 24, 2014

Xi Chen, Qihang Lin, Dengyong Zhou

Approval Voting and Incentives in Crowdsourcing

Sep 07, 2015

Nihar B. Shah, Dengyong Zhou, Yuval Peres

Breaking the Curse of Horizon: Infinite-Horizon Off-Policy Estimation

Oct 29, 2018

Qiang Liu, Lihong Li, Ziyang Tang, Dengyong Zhou

Spectral Methods meet EM: A Provably Optimal Algorithm for Crowdsourcing

Nov 01, 2014

Yuchen Zhang, Xi Chen, Dengyong Zhou, Michael I. Jordan

Crowdsourcing is a popular paradigm for effectively collecting labels at low cost. The Dawid-Skene estimator has been widely used for inferring the true labels from the noisy labels provided by non-expert crowdsourcing workers. However, since the estimator maximizes a non-convex log-likelihood function, it is hard to theoretically justify its performance. In this paper, we propose a two-stage efficient algorithm for multi-class crowd labeling problems. The first stage uses the spectral method to obtain an initial estimate of parameters. Then the second stage refines the estimation by optimizing the objective function of the Dawid-Skene estimator via the EM algorithm. We show that our algorithm achieves the optimal convergence rate up to a logarithmic factor. We conduct extensive experiments on synthetic and real datasets. Experimental results demonstrate that the proposed algorithm is comparable to the most accurate empirical approach, while outperforming several other recently proposed methods.
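
The second stage described above, EM on the Dawid-Skene likelihood, can be sketched in a few lines. This is a minimal illustration, not the paper's algorithm: it replaces the spectral first stage with a plain vote-count initialization, and the function name, the input layout (one dict of `worker -> label` per item), and the `smoothing` constant are all assumptions made for the example.

```python
import math


def dawid_skene_em(labels, num_classes, num_iters=20, smoothing=1e-2):
    """EM for the Dawid-Skene model; returns per-item class posteriors.

    labels: list with one dict per item, mapping worker id -> observed label
            (an integer in range(num_classes)).
    Initialization uses normalized vote counts (an assumption; the paper
    initializes with a spectral method instead).
    """
    K = num_classes
    workers = sorted({w for item in labels for w in item})

    # Initial posteriors: smoothed vote fractions per item.
    q = []
    for item in labels:
        counts = [smoothing] * K
        for lab in item.values():
            counts[lab] += 1.0
        total = sum(counts)
        q.append([c / total for c in counts])

    for _ in range(num_iters):
        # M-step: class prior and one smoothed confusion matrix per worker.
        prior = [sum(qj[k] for qj in q) / len(q) for k in range(K)]
        conf = {w: [[smoothing] * K for _ in range(K)] for w in workers}
        for qj, item in zip(q, labels):
            for w, lab in item.items():
                for k in range(K):
                    conf[w][k][lab] += qj[k]
        for w in workers:
            for k in range(K):
                row_sum = sum(conf[w][k])
                conf[w][k] = [c / row_sum for c in conf[w][k]]

        # E-step: posterior over the true label of each item, in log space.
        new_q = []
        for item in labels:
            log_post = [math.log(prior[k] + 1e-12) for k in range(K)]
            for w, lab in item.items():
                for k in range(K):
                    log_post[k] += math.log(conf[w][k][lab])
            top = max(log_post)
            weights = [math.exp(v - top) for v in log_post]
            total = sum(weights)
            new_q.append([v / total for v in weights])
        q = new_q
    return q


if __name__ == "__main__":
    truth = [0, 1, 0, 1, 0, 1]
    # Three reliable workers and one worker who always flips the label.
    data = [{"a": t, "b": t, "c": t, "d": 1 - t} for t in truth]
    posteriors = dawid_skene_em(data, num_classes=2)
    print([max(range(2), key=lambda k: qj[k]) for qj in posteriors])
```

Because EM only finds a local optimum of the non-convex likelihood, the quality of the starting point matters, which is the motivation the abstract gives for a provably good first stage.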

On the Discrimination-Generalization Tradeoff in GANs

Feb 23, 2018

Pengchuan Zhang, Qiang Liu, Dengyong Zhou, Tao Xu, Xiaodong He

Towards Neural Phrase-based Machine Translation

Sep 24, 2018

Po-Sen Huang, Chong Wang, Sitao Huang, Dengyong Zhou, Li Deng

Action-dependent Control Variates for Policy Optimization via Stein's Identity

Feb 23, 2018

Hao Liu, Yihao Feng, Yi Mao, Dengyong Zhou, Jian Peng, Qiang Liu

Stochastic Variance Reduction Methods for Policy Evaluation

Jun 09, 2017

Simon S. Du, Jianshu Chen, Lihong Li, Lin Xiao, Dengyong Zhou

Sequence Modeling via Segmentations

Jul 18, 2018

Chong Wang, Yining Wang, Po-Sen Huang, Abdelrahman Mohamed, Dengyong Zhou, Li Deng

Regularized Minimax Conditional Entropy for Crowdsourcing

Mar 25, 2015

Dengyong Zhou, Qiang Liu, John C. Platt, Christopher Meek, Nihar B. Shah

Neuro-Symbolic Program Synthesis

Nov 06, 2016

Emilio Parisotto, Abdel-rahman Mohamed, Rishabh Singh, Lihong Li, Dengyong Zhou, Pushmeet Kohli

Neural Phrase-to-Phrase Machine Translation

Nov 06, 2018

Jiangtao Feng, Lingpeng Kong, Po-Sen Huang, Chong Wang, Da Huang, Jiayuan Mao, Kan Qiao, Dengyong Zhou
