Dual Adaptivity: A Universal Algorithm for Minimizing the Adaptive Regret of Convex Functions

Jun 26, 2019

Lijun Zhang, Guanghui Wang, Wei-Wei Tu, Zhi-Hua Zhou

Deep Descriptor Transforming for Image Co-Localization

May 08, 2017

Xiu-Shen Wei, Chen-Lin Zhang, Yao Li, Chen-Wei Xie, Jianxin Wu, Chunhua Shen, Zhi-Hua Zhou

* Accepted by IJCAI 2017

Theoretical Foundation of Co-Training and Disagreement-Based Algorithms

Aug 15, 2017

Wei Wang, Zhi-Hua Zhou

**Click to Read Paper and Get Code**

AUC (area under the ROC curve) is an important evaluation criterion that is widely used in learning tasks such as class-imbalance learning, cost-sensitive learning, and learning to rank. Many learning approaches try to optimize AUC, but owing to the non-convexity and discontinuity of AUC, almost all of them work with surrogate loss functions. Thus, the consistency of these surrogate losses with AUC is crucial; however, it has been almost untouched before. In this paper, we provide a sufficient condition for the asymptotic consistency of learning approaches based on surrogate loss functions. Based on this result, we prove that exponential loss and logistic loss are consistent with AUC, but hinge loss is inconsistent. Then, we derive the $q$-norm hinge loss and general hinge loss that are consistent with AUC. We also derive the consistent bounds for exponential loss and logistic loss, and obtain the consistent bounds for many surrogate loss functions under the non-noise setting. Further, we disclose an equivalence between the exponential surrogate loss of AUC and the exponential surrogate loss of accuracy; one straightforward consequence of this finding is that AdaBoost and RankBoost are equivalent.
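
As a rough illustration of the setting this abstract describes (not code from the paper), the sketch below computes the empirical AUC as the fraction of correctly ranked positive-negative pairs, next to a pairwise surrogate risk. It assumes the surrogate is applied to the score difference between a positive and a negative example; the names `pairwise_auc`, `pairwise_surrogate_risk`, and `SURROGATES` are hypothetical and chosen only for this example.

```python
import math

# Surrogate losses as functions of the score margin t = f(x_pos) - f(x_neg).
# Assumption: the surrogate acts on the pairwise score difference.
SURROGATES = {
    "exponential": lambda t: math.exp(-t),
    "logistic": lambda t: math.log(1.0 + math.exp(-t)),
    "hinge": lambda t: max(0.0, 1.0 - t),
}


def pairwise_auc(pos_scores, neg_scores):
    """Empirical AUC: share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    correct = 0.0
    for sp in pos_scores:
        for sn in neg_scores:
            if sp > sn:
                correct += 1.0
            elif sp == sn:
                correct += 0.5
    return correct / (len(pos_scores) * len(neg_scores))


def pairwise_surrogate_risk(pos_scores, neg_scores, loss):
    """Average surrogate loss over all (positive, negative) score pairs."""
    phi = SURROGATES[loss]
    total = sum(phi(sp - sn) for sp in pos_scores for sn in neg_scores)
    return total / (len(pos_scores) * len(neg_scores))


if __name__ == "__main__":
    pos = [0.9, 0.8]
    neg = [0.1, 0.85]
    print("AUC:", pairwise_auc(pos, neg))
    for name in SURROGATES:
        print(name, "risk:", pairwise_surrogate_risk(pos, neg, name))
```

The AUC itself is a step function of the scores, which is why it is hard to optimize directly; each surrogate above replaces the 0/1 pair indicator with a continuous function of the margin.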

**Click to Read Paper and Get Code**
Deep neural networks have achieved great success in various real applications. Many algorithmic and implementation techniques have been developed; however, theoretical understanding of many aspects of deep neural networks is far from clear. A particularly interesting issue is the usefulness of dropout, which was motivated by the intuition of preventing complex co-adaptation of feature detectors. In this paper, we study the Rademacher complexity of different types of dropout, and our theoretical results disclose that for shallow neural networks (with one or no hidden layer) dropout is able to reduce the Rademacher complexity polynomially, whereas for deep neural networks it can lead to an exponential reduction of the Rademacher complexity.

* 20 pages

**Click to Read Paper and Get Code**

* Artificial Intelligence 203:1-18 2013

* 35 pages

**Click to Read Paper and Get Code**

* 22 pages, 1 figure

**Click to Read Paper and Get Code**

* 9 pages, 1 figure, 4 tables

A Continuously Growing Dataset of Sentential Paraphrases

Aug 01, 2017

Wuwei Lan, Siyu Qiu, Hua He, Wei Xu

* 11 pages, accepted to EMNLP 2017

A Survey on Traffic Signal Control Methods

Apr 17, 2019

Hua Wei, Guanjie Zheng, Vikash Gayah, Zhenhui Li

* 30 pages

A Compositional Textual Model for Recognition of Imperfect Word Images

Nov 27, 2018

Wei Tang, John Corring, Ying Wu, Gang Hua

On the Resistance of Nearest Neighbor to Random Noisy Labels

Sep 13, 2018

Wei Gao, Bin-Bin Yang, Zhi-Hua Zhou

Nearest neighbor has always been one of the most appealing non-parametric approaches in machine learning, pattern recognition, computer vision, etc. Previous empirical studies partly show that nearest neighbor is resistant to noise, yet a deep analysis is lacking. This work presents finite-sample and distribution-dependent bounds on the consistency of nearest neighbor in the random noise setting. The theoretical results show that, for asymmetric noises, k-nearest neighbor is robust enough to classify most data correctly, except for a handful of examples whose labels are totally misled by random noises. For symmetric noises, however, k-nearest neighbor achieves the same consistent rate as in the noise-free setting, which verifies the resistance of k-nearest neighbor to random noisy labels. Motivated by the theoretical analysis, we propose the Robust k-Nearest Neighbor (RkNN) approach to deal with noisy labels. The basic idea is to make unilateral corrections to examples whose labels are totally misled by random noises, and to classify the others directly by utilizing the robustness of k-nearest neighbor. We verify the effectiveness of the proposed algorithm both theoretically and empirically.

* 35 pages
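
The intuition behind the symmetric-noise result in the abstract above can be sketched with a plain k-nearest-neighbor majority vote on one-dimensional data whose training labels are flipped at random. This is only a minimal illustration under assumed settings (1-D inputs, binary labels, a flip rate of 0.2); it is not the RkNN algorithm from the paper, and the helper names `flip_labels`, `knn_predict`, and `accuracy` are hypothetical.

```python
import random
from collections import Counter


def flip_labels(labels, rate, rng):
    """Symmetric noise: flip each binary label independently with probability `rate`."""
    return [1 - y if rng.random() < rate else y for y in labels]


def knn_predict(train_x, train_y, query, k):
    """Majority vote among the k training points closest to `query` (1-D inputs)."""
    order = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def accuracy(train_x, train_y, test_x, test_y, k):
    """Fraction of test points whose k-NN prediction matches the clean test label."""
    hits = sum(knn_predict(train_x, train_y, x, k) == y for x, y in zip(test_x, test_y))
    return hits / len(test_x)


if __name__ == "__main__":
    rng = random.Random(0)
    # Clean concept (an assumption for this demo): label 1 iff x >= 0.
    train_x = [rng.uniform(-1.0, 1.0) for _ in range(500)]
    clean_y = [1 if x >= 0 else 0 for x in train_x]
    noisy_y = flip_labels(clean_y, rate=0.2, rng=rng)
    test_x = [rng.uniform(-1.0, 1.0) for _ in range(200)]
    test_y = [1 if x >= 0 else 0 for x in test_x]
    for k in (1, 15):
        print("k =", k, "accuracy with noisy training labels:",
              accuracy(train_x, noisy_y, test_x, test_y, k))
```

With `k = 1` a single flipped neighbor decides the prediction, whereas a larger odd `k` lets correctly labeled neighbors outvote the flipped ones as long as the flip rate stays below one half.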

Segmentation of ultrasound images of thyroid nodule for assisting fine needle aspiration cytology

Nov 03, 2012

Jie Zhao, Wei Zheng, Li Zhang, Hua Tian

* 15 pages, 13 figures

**Click to Read Paper and Get Code**

Recommender systems have attracted much attention during the past decade. Many attack detection algorithms have been developed for better recommendations, mostly focusing on shilling attacks, where an attack organizer produces a large number of user profiles by the same strategy to promote or demote an item. This work considers a different attack style: unorganized malicious attacks, where attackers individually use a small number of user profiles to attack different items without any organizer. This attack style occurs in many real applications, yet its study remains open. We first formulate the detection of unorganized malicious attacks as a matrix completion problem, and propose the Unorganized Malicious Attacks detection (UMA) approach, a proximal alternating splitting augmented Lagrangian method. We verify, both theoretically and empirically, the effectiveness of our proposed approach.

Effects of the optimisation of the margin distribution on generalisation in deep architectures

Apr 19, 2017

Lech Szymanski, Brendan McCane, Wei Gao, Zhi-Hua Zhou

**Click to Read Paper and Get Code**

* Proceedings of the 30th International Conference on Machine Learning

Single Image Reflection Removal Exploiting Misaligned Training Data and Network Enhancements

Apr 01, 2019

Kaixuan Wei, Jiaolong Yang, Ying Fu, David Wipf, Hua Huang

* Accepted to CVPR 2019; code is available at https://github.com/Vandermode/ERRNet

Revisiting Spatial-Temporal Similarity: A Deep Learning Framework for Traffic Prediction

Nov 03, 2018

Huaxiu Yao, Xianfeng Tang, Hua Wei, Guanjie Zheng, Zhenhui Li

Traffic prediction has drawn increasing attention in the AI research field due to the increasing availability of large-scale traffic data and its importance in the real world. For example, an accurate taxi demand prediction can assist taxi companies in pre-allocating taxis. The key challenge of traffic prediction lies in how to model the complex spatial dependencies and temporal dynamics. Although both factors have been considered in modeling, existing works make strong assumptions about spatial dependence and temporal dynamics, i.e., that spatial dependence is stationary in time and that temporal dynamics are strictly periodic. However, in practice, the spatial dependence could be dynamic (i.e., changing from time to time), and the temporal dynamics could have some perturbation from one period to another. In this paper, we make two important observations: (1) the spatial dependencies between locations are dynamic; and (2) the temporal dependency follows daily and weekly patterns but is not strictly periodic because of its dynamic temporal shifting. To address these two issues, we propose a novel Spatial-Temporal Dynamic Network (STDN), in which a flow gating mechanism is introduced to learn the dynamic similarity between locations, and a periodically shifted attention mechanism is designed to handle long-term periodic temporal shifting. To the best of our knowledge, this is the first work that tackles both issues in a unified framework. Our experimental results on real-world traffic datasets verify the effectiveness of the proposed method.

* Accepted by AAAI 2019

Chinese Poetry Generation with Planning based Neural Network

Dec 07, 2016

Zhe Wang, Wei He, Hua Wu, Haiyang Wu, Wei Li, Haifeng Wang, Enhong Chen

* Accepted paper at COLING 2016
