Models, code, and papers for "Li Zhou":

A Survey on Contextual Multi-armed Bandits

Feb 01, 2016
Li Zhou

In this survey we cover a few stochastic and adversarial contextual bandit algorithms. We analyze each algorithm's assumptions and regret bound.

A Note on Information-Directed Sampling and Thompson Sampling

Mar 24, 2015
Li Zhou

This note introduces three Bayesian-style multi-armed bandit algorithms: Information-Directed Sampling, Thompson Sampling, and Generalized Thompson Sampling. The goal is to give an intuitive explanation of these three algorithms and their regret bounds, and to provide some derivations that are omitted in the original papers.
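As an illustration of the simplest of these algorithms, here is a minimal sketch of Thompson Sampling for a Bernoulli bandit with independent Beta(1, 1) priors. It is not taken from the note itself: the function name `thompson_sampling`, the arm means, the horizon, and the seed are all hypothetical choices made for the example.

```python
import random


def thompson_sampling(true_means, horizon, seed=0):
    """Run Beta-Bernoulli Thompson Sampling and return per-arm pull counts.

    true_means are the (hidden) success probabilities of each arm; they are
    used only to simulate rewards and are an assumption of this sketch.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    successes = [0] * n_arms  # observed rewards of 1 per arm
    failures = [0] * n_arms   # observed rewards of 0 per arm

    for _ in range(horizon):
        # Draw one sample per arm from its Beta posterior (uniform prior).
        samples = [
            rng.betavariate(successes[a] + 1, failures[a] + 1)
            for a in range(n_arms)
        ]
        # Play the arm whose sampled mean is largest.
        arm = max(range(n_arms), key=samples.__getitem__)
        # Simulate a Bernoulli reward and update that arm's posterior counts.
        if rng.random() < true_means[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1

    return [s + f for s, f in zip(successes, failures)]


if __name__ == "__main__":
    pulls = thompson_sampling([0.2, 0.5, 0.8], horizon=2000)
    print("pulls per arm:", pulls)
```

Because each arm is chosen with the posterior probability that it is the best one, pulls should concentrate on the arm with the highest mean as evidence accumulates.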

Personalized Web Search

Feb 03, 2015
Li Zhou

Personalization is important for search engines to improve user experience. Most existing work relies on pure feature engineering: it extracts a large number of session-style features and then trains a ranking model. Here we propose a novel way to model both long-term and short-term user behavior using a multi-armed bandit algorithm. Our algorithm generalizes session information across users well and, as an explore-exploit style algorithm, it generalizes well to new URLs and new users. Experiments show that our algorithm improves performance over the default ranking and outperforms several popular multi-armed bandit algorithms.

MentorNet: Learning Data-Driven Curriculum for Very Deep Neural Networks on Corrupted Labels

Aug 13, 2018
Lu Jiang, Zhengyuan Zhou, Thomas Leung, Li-Jia Li, Li Fei-Fei

Recent deep networks are capable of memorizing the entire dataset even when the labels are completely random. To overcome overfitting on corrupted labels, we propose a novel technique of learning another neural network, called MentorNet, to supervise the training of the base deep network, namely StudentNet. During training, MentorNet provides a curriculum (sample weighting scheme) for StudentNet to focus on the samples whose labels are probably correct. Unlike existing curricula, which are usually predefined by human experts, MentorNet learns a data-driven curriculum dynamically with StudentNet. Experimental results demonstrate that our approach can significantly improve the generalization performance of deep networks trained on corrupted training data. Notably, to the best of our knowledge, we achieve the best-published result on WebVision, a large benchmark containing 2.2 million images of real-world noisy labels. The code is at

* published at ICML 2018 

Deep Reinforcement Learning-based Image Captioning with Embedding Reward

Apr 12, 2017
Zhou Ren, Xiaoyu Wang, Ning Zhang, Xutao Lv, Li-Jia Li

Image captioning is a challenging problem owing to the complexity in understanding the image content and diverse ways of describing it in natural language. Recent advances in deep neural networks have substantially improved the performance of this task. Most state-of-the-art approaches follow an encoder-decoder framework, which generates captions using a sequential recurrent prediction model. However, in this paper, we introduce a novel decision-making framework for image captioning. We utilize a "policy network" and a "value network" to collaboratively generate captions. The policy network serves as a local guidance by providing the confidence of predicting the next word according to the current state. Additionally, the value network serves as a global and lookahead guidance by evaluating all possible extensions of the current state. In essence, it adjusts the goal of predicting the correct words towards the goal of generating captions similar to the ground truth captions. We train both networks using an actor-critic reinforcement learning model, with a novel reward defined by visual-semantic embedding. Extensive experiments and analyses on the Microsoft COCO dataset show that the proposed framework outperforms state-of-the-art approaches across different evaluation metrics.

Transductive Optimization of Top k Precision

Oct 20, 2015
Li-Ping Liu, Thomas G. Dietterich, Nan Li, Zhi-Hua Zhou

Consider a binary classification problem in which the learner is given a labeled training set, an unlabeled test set, and is restricted to choosing exactly $k$ test points to output as positive predictions. Problems of this kind, termed transductive precision@$k$, arise in information retrieval, digital advertising, and reserve design for endangered species. Previous methods separate the training of the model from its use in scoring the test points. This paper introduces a new approach, Transductive Top K (TTK), that seeks to minimize the hinge loss over all training instances under the constraint that exactly $k$ test instances are predicted as positive. The paper presents two optimization methods for this challenging problem. Experiments and analysis confirm the importance of incorporating the knowledge of $k$ into the learning process. Experimental evaluations of the TTK approach show that the performance of TTK matches or exceeds existing state-of-the-art methods on 7 UCI datasets and 3 reserve design problem instances.
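To make the evaluation target concrete, the sketch below computes precision@$k$ for a scored test set: the fraction of true positives among the $k$ highest-scoring points. It illustrates only the metric, not the TTK optimization itself; the function name `precision_at_k` and the toy scores and labels are assumptions made for the example.

```python
def precision_at_k(scores, labels, k):
    """Fraction of positives among the k test points with the highest scores.

    scores: model scores, one per test point (higher means more positive).
    labels: ground-truth labels, 1 for positive and 0 for negative.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not 0 < k <= len(scores):
        raise ValueError("k must be between 1 and the number of test points")

    # Indices of the k highest-scoring points; these are the positive predictions.
    top_k = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    return sum(labels[i] for i in top_k) / k


if __name__ == "__main__":
    scores = [0.9, 0.1, 0.8, 0.4, 0.7]  # hypothetical model scores
    labels = [1, 0, 0, 1, 1]            # hypothetical ground truth
    print(precision_at_k(scores, labels, k=3))
```

In this toy case the three top-scoring points have labels 1, 0, and 1, so two of the three forced positive predictions are correct.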

Multi-domain Dialogue State Tracking as Dynamic Knowledge Graph Enhanced Question Answering

Nov 07, 2019
Li Zhou, Kevin Small

Multi-domain dialogue state tracking (DST) is a critical component for conversational AI systems. The domain ontology (i.e., specification of domains, slots, and values) of a conversational AI system is generally incomplete, making the capability for DST models to generalize to new slots, values, and domains during inference imperative. In this paper, we propose to model multi-domain DST as a question answering problem, referred to as Dialogue State Tracking via Question Answering (DSTQA). Within DSTQA, each turn generates a question asking for the value of a (domain, slot) pair, thus making it naturally extensible to unseen domains, slots, and values. Additionally, we use a dynamically-evolving knowledge graph to explicitly learn relationships between (domain, slot) pairs. Our model has a 5.80% and 12.21% relative improvement over the current state-of-the-art model on MultiWOZ 2.0 and MultiWOZ 2.1 datasets, respectively. Additionally, our model consistently outperforms the state-of-the-art model in domain adaptation settings.

A Fourier Analytical Approach to Estimation of Smooth Functions in Gaussian Shift Model

Nov 05, 2019
Fan Zhou, Ping Li

We study the estimation of $f(\boldsymbol{\theta})$ under the Gaussian shift model $\mathbf{x} = \boldsymbol{\theta}+\boldsymbol{\xi}$, where $\boldsymbol{\theta} \in \mathbb{R}^d$ is an unknown parameter, $\boldsymbol{\xi} \sim \mathcal{N}(\mathbf{0},\boldsymbol{\Sigma})$ is the random noise with covariance matrix $\boldsymbol{\Sigma}$, and $f$ is a given function which belongs to a certain Besov space with smoothness index $s>1$. Let $\sigma^2 = \|\boldsymbol{\Sigma}\|_{op}$ be the operator norm of $\boldsymbol{\Sigma}$ and $\sigma^{-2\alpha} = \mathbf{r}(\boldsymbol{\Sigma})$ be its effective rank with some $0<\alpha<1$ and $\sigma>0$. We develop a new estimator $g(\mathbf{x})$ based on a Fourier analytical approach that achieves effective bias reduction. We show that when the intrinsic dimension of the problem is large enough such that nontrivial bias reduction is needed, the mean square error (MSE) rate of $g(\mathbf{x})$ is $O\big(\sigma^2 \vee \sigma^{2(1-\alpha)s}\big)$ as $\sigma\rightarrow 0$. By developing new methods to establish the minimax lower bounds under the standard Gaussian shift model, we show that this rate is indeed minimax optimal and so is $g(\mathbf{x})$. The minimax rate implies a sharp threshold on the smoothness $s$ such that only for $f$ with smoothness above the threshold can $f(\boldsymbol{\theta})$ be estimated efficiently with an MSE rate of the order $O(\sigma^2)$. Normal approximation and asymptotic efficiency are proved for $g(\mathbf{x})$ under mild restrictions. Furthermore, we propose a data-driven procedure to develop an adaptive estimator when the covariance matrix $\boldsymbol{\Sigma}$ is unknown. Numerical simulations are presented to validate our analysis. The simplicity of implementation and its superiority over the plug-in approach indicate the new estimator can be applied to a broad range of real-world applications.

Feature Fusion Detector for Semantic Cognition of Remote Sensing

Sep 28, 2019
Wei Zhou, Yiying Li

Remote sensing images are of vital importance in many areas, and their value needs to be refined by cognitive approaches. Remote sensing detection is an appropriate way to achieve semantic cognition. However, such detection is challenging because of scale diversity, diversity of views, small objects, and sophisticated light and shadow backgrounds. In this article, inspired by the state-of-the-art detection framework FPN, we propose a novel approach for constructing a feature fusion module that optimizes feature context utilization in detection, calling our system LFFN for Layer-weakening Feature Fusion Network. We explore the inherent relevance of different layers to the final decision, and the incentives of higher-level features to lower-level features. More importantly, we explore the characteristics of different backbone networks in the mining of basic features and the correlation utilization of convolutional channels, and call our upgraded version advanced LFFN. Based on experiments on the remote sensing dataset from Google Earth, our LFFN has proved effective and practical for the semantic cognition of remote sensing, achieving 89% mAP, which is 4.1% higher than that of FPN. Moreover, in terms of generalization performance, LFFN achieves 79.9% mAP on VOC 2007 and 73.0% mAP on VOC 2012 test, and advanced LFFN obtains mAP values of 80.7% and 74.4% on VOC 2007 and 2012 respectively, outperforming the comparable state-of-the-art SSD and Faster R-CNN models.

* 12 pages, 6 figures

STN-Homography: estimate homography parameters directly

Jun 06, 2019
Qiang Zhou, Xin Li

In this paper, we introduce the STN-Homography model to directly estimate the homography matrix between an image pair. Unlike most CNN-based homography estimation methods, which use an alternative 4-point homography parameterization, we prove that, after coordinate normalization, the variance of the elements of the coordinate-normalized $3\times3$ homography matrix is very small and suitable to be regressed well with a CNN. Based on the proposed STN-Homography, we use a hierarchical architecture which stacks several STN-Homography models and successively reduces the estimation error. The effectiveness of the proposed method is shown through experiments on the MSCOCO dataset, in which it significantly outperforms the state-of-the-art. The average processing time of our hierarchical STN-Homography with 1 stage is only 4.87 ms on the GPU, and the processing time for hierarchical STN-Homography with 3 stages is 17.85 ms. The code will soon be open sourced.

Gaussian DAGs on network data

May 26, 2019
Hangjian Li, Qing Zhou

The traditional directed acyclic graph (DAG) model assumes data are generated independently from the underlying joint distribution defined by the DAG. In many applications, however, individuals are linked via a network and thus the independence assumption does not hold. We propose a novel Gaussian DAG model for network data, where the dependence among individual data points (row covariance) is modeled by an undirected graph. Under this model, we develop a maximum penalized likelihood method to estimate the DAG structure and the row correlation matrix. The algorithm iterates between a decoupled lasso regression step and a graphical lasso step. We show with extensive simulated and real network data, that our algorithm improves the accuracy of DAG structure learning by leveraging the information from the estimated row correlations. Moreover, we demonstrate that the performance of existing DAG learning methods can be substantially improved via de-correlation of network data with the estimated row correlation matrix from our algorithm.

* 14 pages, 5 figures 

FSSD: Feature Fusion Single Shot Multibox Detector

May 17, 2018
Zuoxin Li, Fuqiang Zhou

SSD (Single Shot Multibox Detector) is one of the best object detection algorithms, with both high accuracy and fast speed. However, SSD's feature pyramid detection method makes it hard to fuse the features from different scales. In this paper, we propose FSSD (Feature Fusion Single Shot Multibox Detector), an enhanced SSD with a novel and lightweight feature fusion module which can improve the performance significantly over SSD with just a little speed drop. In the feature fusion module, features from different layers with different scales are concatenated together, followed by some down-sampling blocks to generate a new feature pyramid, which is fed to multibox detectors to predict the final detection results. On the Pascal VOC 2007 test, our network can achieve 82.7 mAP (mean average precision) at the speed of 65.8 FPS (frames per second) with the input size 300$\times$300 using a single Nvidia 1080Ti GPU. In addition, our result on COCO is also better than the conventional SSD by a large margin. Our FSSD outperforms many state-of-the-art object detection algorithms in both accuracy and speed. Code is available at

* add project code 

Safety-Aware Apprenticeship Learning

Apr 28, 2018
Weichao Zhou, Wenchao Li

Apprenticeship learning (AL) is a kind of Learning from Demonstration technique where the reward function of a Markov Decision Process (MDP) is unknown to the learning agent and the agent has to derive a good policy by observing an expert's demonstrations. In this paper, we study the problem of how to make AL algorithms inherently safe while still meeting their learning objective. We consider a setting where the unknown reward function is assumed to be a linear combination of a set of state features, and the safety property is specified in Probabilistic Computation Tree Logic (PCTL). By embedding probabilistic model checking inside AL, we propose a novel counterexample-guided approach that can ensure safety while retaining performance of the learnt policy. We demonstrate the effectiveness of our approach on several challenging AL scenarios where safety is essential.

* Accepted by International Conference on Computer Aided Verification (CAV) 2018 

Graph Convolution: A High-Order and Adaptive Approach

Oct 20, 2017
Zhenpeng Zhou, Xiaocheng Li

In this paper, we present a novel convolutional neural network framework for graph modeling, with the introduction of two new modules specially designed for graph-structured data: the $k$-th order convolution operator and the adaptive filtering module. Importantly, our framework of High-order and Adaptive Graph Convolutional Network (HA-GCN) is a general-purpose architecture that fits various node-centric and graph-centric applications, as well as graph generative models. We conduct extensive experiments to demonstrate the advantages of our framework. In particular, our HA-GCN outperforms the state-of-the-art models on node classification and molecule property prediction tasks. It also generates 32% more real molecules on the molecule generation task. Both results will significantly benefit real-world applications such as material design and drug screening.

Sparse Algorithm for Robust LSSVM in Primal Space

Feb 07, 2017
Li Chen, Shuisheng Zhou

Because it enjoys a closed-form solution, the least squares support vector machine (LSSVM) has been widely used for classification and regression problems, with performance comparable to other types of SVMs. However, LSSVM has two drawbacks: it is sensitive to outliers and it lacks sparseness. Robust LSSVM (R-LSSVM) partly overcomes the first drawback via a nonconvex truncated loss function, but the current algorithms for R-LSSVM produce dense solutions, so they still face the second drawback and are inefficient for training large-scale problems. In this paper, we interpret the robustness of R-LSSVM from a re-weighted viewpoint and give a primal R-LSSVM by the representer theorem. The new model may have a sparse solution if the corresponding kernel matrix has low rank. Then, approximating the kernel matrix by a low-rank matrix and smoothing the loss function by an entropy penalty function, we propose a convergent sparse R-LSSVM (SR-LSSVM) algorithm to achieve the sparse solution of primal R-LSSVM, which overcomes the two drawbacks of LSSVM simultaneously. The proposed algorithm has lower complexity than the existing algorithms and is very efficient for training large-scale problems. Many experimental results illustrate that SR-LSSVM can achieve better or comparable performance with less training time than related algorithms, especially for training large-scale problems.

* 22 pages, 4 figures 

Latent Contextual Bandits and their Application to Personalized Recommendations for New Users

Apr 22, 2016
Li Zhou, Emma Brunskill

Personalized recommendations for new users, also known as the cold-start problem, can be formulated as a contextual bandit problem. Existing contextual bandit algorithms generally rely on features alone to capture user variability. Such methods are inefficient in learning new users' interests. In this paper we propose Latent Contextual Bandits. We consider both the benefit of leveraging a set of learned latent user classes for new users, and how we can learn such latent classes from prior users. We show that our approach achieves a better regret bound than existing algorithms. We also demonstrate the benefit of our approach using a large real world dataset and a preliminary user study.

* 25th International Joint Conference on Artificial Intelligence (IJCAI 2016) 

Differentially Private Distributed Online Learning

Jun 23, 2015
Chencheng Li, Pan Zhou

Online learning has been in the spotlight of the machine learning community for a long time. To handle massive data in the Big Data era, a single learner could never efficiently finish this heavy task. Hence, in this paper, we propose a novel distributed online learning algorithm to solve the problem. Compared to a typical centralized online learner, the distributed learners optimize their own learning parameters based on local data sources and communicate with neighbors in a timely manner. However, communication may lead to a privacy breach. Thus, we use differential privacy to preserve the privacy of learners, and study the influence of guaranteeing differential privacy on the utility of the distributed online learning algorithm. Furthermore, by using the results from Kakade and Tewari (2009), we use the regret bounds of online learning to achieve fast convergence rates for offline learning algorithms in distributed scenarios, which provides tighter utility performance than the existing state-of-the-art results. In simulation, we demonstrate that the differentially private offline learning algorithm has high variance, but we can use mini-batches to improve the performance. Finally, the simulations show that the analytical results of our proposed theorems are correct and that our private distributed online learning algorithm is a general framework.

$HS^2$: Active Learning over Hypergraphs

Nov 25, 2018
I Chien, Huozhi Zhou, Pan Li

We propose a hypergraph-based active learning scheme which we term $HS^2$. $HS^2$ generalizes the previously reported algorithm $S^2$, originally proposed for graph-based active learning with pointwise queries [Dasarathy et al., COLT 2015]. Our $HS^2$ method can accommodate hypergraph structures and allows one to ask both pointwise queries and pairwise queries. Based on a novel parametric system particularly designed for hypergraphs, we derive theoretical results on the query complexity of $HS^2$ for the above described generalized settings. Both the theoretical and empirical results show that $HS^2$ requires significantly fewer queries than $S^2$ when one uses $S^2$ over a graph obtained from the corresponding hypergraph via clique expansion.

Cascaded CNN-resBiLSTM-CTC: An End-to-End Acoustic Model For Speech Recognition

Oct 30, 2018
Xinpei Zhou, Jiwei Li, Xi Zhou

Automatic speech recognition (ASR) tasks can be solved by end-to-end deep learning models, which require less preparation of raw data and make it easier to transfer between languages. We propose a novel end-to-end deep learning model architecture, namely cascaded CNN-resBiLSTM-CTC. In the proposed model, we add residual blocks in BiLSTM layers to extract sophisticated phoneme and semantic information together, and apply a cascaded structure to pay more attention to mining information from hard negative samples. By applying both a simple Fast Fourier Transform (FFT) technique and an n-gram language model (LM) rescoring method, we manage to achieve a word error rate (WER) of 3.41% on the LibriSpeech test-clean corpus. Furthermore, we propose a new batch-varied method to speed up the training process in length-varied tasks, which results in 25% less training time.
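For readers unfamiliar with the reported metric, the sketch below shows the standard definition of word error rate: the word-level edit distance (substitutions, insertions, deletions) between a reference transcript and a hypothesis, divided by the number of reference words. It is a generic illustration, not the authors' evaluation script; the function name `word_error_rate` and the example sentences are hypothetical.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution ("sat" -> "sit") and one deletion ("the") over 6 words.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Note that WER can exceed 100% when the hypothesis contains many inserted words, since the denominator counts only reference words.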

* 5 pages, 1 figure, 4 tables. Submitted to 2019 ICASSP (International Conference on Acoustics, Speech, and Signal Processing) 

Deep Model-Based Reinforcement Learning via Estimated Uncertainty and Conservative Policy Optimization

Nov 28, 2019
Qi Zhou, Houqiang Li, Jie Wang

Model-based reinforcement learning algorithms tend to achieve higher sample efficiency than model-free methods. However, due to the inevitable errors of learned models, model-based methods struggle to achieve the same asymptotic performance as model-free methods. In this paper, we propose a Policy Optimization method with Model-Based Uncertainty (POMBU), a novel model-based approach that can effectively improve the asymptotic performance using the uncertainty in Q-values. We derive an upper bound of the uncertainty, based on which we can approximate the uncertainty accurately and efficiently for model-based methods. We further propose an uncertainty-aware policy optimization algorithm that optimizes the policy conservatively to encourage performance improvement with high probability. This can significantly alleviate the overfitting of the policy to inaccurate models. Experiments show POMBU can outperform existing state-of-the-art policy optimization algorithms in terms of sample efficiency and asymptotic performance. Moreover, the experiments demonstrate the excellent robustness of POMBU compared to previous model-based approaches.
