Models, code, and papers for "Wei Cui":

Optical Mapping Near-eye Three-dimensional Display with Correct Focus Cues

May 24, 2017
Wei Cui, Liang Gao

We present an optical mapping near-eye (OMNI) three-dimensional display method for wearable devices. By dividing a display screen into different sub-panels and optically mapping them to various depths, we create a multiplane volumetric image with correct focus cues for depth perception. The resultant system can drive the eye's accommodation to the distance that is consistent with binocular stereopsis, thereby alleviating the vergence-accommodation conflict, the primary cause for eye fatigue and discomfort. Compared with the previous methods, the OMNI display offers prominent advantages in adaptability, image dynamic range, and refresh rate.

* 5 pages, 6 figures, 2 tables, short article for Optics Letters 

Spatial Deep Learning for Wireless Scheduling

Aug 04, 2018
Wei Cui, Kaiming Shen, Wei Yu

The optimal scheduling of interfering links in a dense wireless network with full frequency reuse is a challenging task. The traditional method involves first estimating all the interfering channel strengths then optimizing the scheduling based on the model. This model-based method is however resource and computationally intensive, because channel estimation is expensive in dense networks; further, finding even a locally optimal solution of the resulting optimization problem may be computationally complex. This paper shows that by using a deep learning approach, it is possible to bypass channel estimation and to schedule links efficiently based solely on the geographic locations of transmitters and receivers. This is accomplished by using locally optimal schedules generated using a fractional programming method for randomly deployed device-to-device networks as training data, and by using a novel neural network architecture that takes the geographic spatial convolutions of the interfering or interfered neighboring nodes as input over multiple feedback stages to learn the optimum solution. The resulting neural network gives near-optimal performance for sum-rate maximization and is capable of generalizing to larger deployment areas and to deployments of different link densities. Finally, this paper proposes a novel scheduling approach that utilizes the sum-rate optimal scheduling heuristics over judiciously chosen subsets of links to provide fair scheduling across the network.
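
The sum-rate objective mentioned in this abstract can be illustrated with a minimal sketch. It assumes the standard Shannon-rate model with a fully known gain matrix, which is the model-based setting the paper tries to avoid; the names `sum_rate`, `brute_force_schedule`, and the toy gain values are hypothetical and not taken from the paper.

```python
import itertools
import math

def sum_rate(schedule, gains, power=1.0, noise=1.0):
    """Sum of Shannon rates (bits/s/Hz) over the links that are switched on.

    schedule[i] is 1 if link i transmits, else 0.
    gains[i][j] is the assumed power gain from transmitter j to receiver i.
    """
    total = 0.0
    for i, active in enumerate(schedule):
        if not active:
            continue
        signal = power * gains[i][i]
        # Interference comes only from the other links that are also active.
        interference = sum(
            power * gains[i][j]
            for j, other in enumerate(schedule)
            if other and j != i
        )
        total += math.log2(1.0 + signal / (noise + interference))
    return total

def brute_force_schedule(gains):
    """Exhaustively search all on/off patterns (only feasible for tiny networks)."""
    n = len(gains)
    return max(itertools.product([0, 1], repeat=n),
               key=lambda s: sum_rate(s, gains))

# Toy two-link network in which the cross gains dominate the direct gains.
STRONG_INTERFERENCE = [[1.0, 4.0], [4.0, 1.0]]
```

With these toy gains, activating both links yields a lower sum rate than activating one, so the exhaustive search switches one link off. The search is exponential in the number of links, which is why a learned or heuristic scheduler is attractive for dense networks.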

* This paper is the full version of the paper to be presented at IEEE Global Communications Conference 2018. It includes 30 pages and 13 figures 

ECG Identification under Exercise and Rest Situations via Various Learning Methods

May 11, 2019
Zihan Wang, Yaoguang Li, Wei Cui

With the advancement of information security, human recognition, as its core technology, has attracted increasing attention in the past few years. A myriad of biometric features, including fingerprint, face, and iris, have been applied to security systems, yet these are sometimes considered vulnerable to forgery and spoofing attacks. Because it is difficult to fabricate, the electrocardiogram (ECG) has attracted much attention. Though many works have shown that ECG provides excellent human identification, most current ECG human identification (ECGID) research focuses only on the rest situation. In this manuscript, we overcome the oversimplification of previous research and evaluate performance under both exercise and rest situations, in particular the influence of exercise on ECGID. By applying various existing learning methods to our ECG dataset, we find that current methods, which support the identification of individuals at rest well, do not deliver satisfactory ECGID performance under exercise situations, thereby exposing the deficiency of existing ECG identification methods.

Neural Open Information Extraction

May 11, 2018
Lei Cui, Furu Wei, Ming Zhou

Conventional Open Information Extraction (Open IE) systems are usually built on hand-crafted patterns from other NLP tools such as syntactic parsing, yet they face problems of error propagation. In this paper, we propose a neural Open IE approach with an encoder-decoder framework. Distinct from existing methods, the neural Open IE approach learns highly confident arguments and relation tuples bootstrapped from a state-of-the-art Open IE system. An empirical study on a large benchmark dataset shows that the neural Open IE system significantly outperforms several baselines, while maintaining comparable computational efficiency.

Identifying the Mislabeled Training Samples of ECG Signals using Machine Learning

Dec 11, 2017
Yaoguang Li, Wei Cui, Cong Wang

The classification accuracy of electrocardiogram signals is often affected by diverse factors, among which mislabeled training samples are one of the most influential. To mitigate this negative effect, cross validation is introduced to identify the mislabeled samples. The method utilizes the cooperative advantages of different classifiers to act as a filter for the training samples. With the help of 10-fold cross validation, the filter removes the mislabeled training samples and retains the correctly labeled ones. Consequently, a new training set is provided to the final classifiers to achieve higher classification accuracy. Finally, we numerically show the effectiveness of the proposed method on the MIT-BIH arrhythmia database.
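
The cross-validation filter described here can be sketched roughly as follows. The abstract does not say which classifiers are combined or how their votes are merged, so the two toy classifiers, the "flag only when every classifier disagrees" rule, and the 1-D data below are all assumptions made for illustration.

```python
def nearest_centroid(train, x):
    """Predict the label whose class mean is closest to x (1-D features)."""
    sums = {}
    for value, label in train:
        total, count = sums.get(label, (0.0, 0))
        sums[label] = (total + value, count + 1)
    return min(sums, key=lambda lab: abs(sums[lab][0] / sums[lab][1] - x))

def three_nn(train, x):
    """Predict by majority vote among the three nearest training samples."""
    nearest = sorted(train, key=lambda s: abs(s[0] - x))[:3]
    labels = [lab for _, lab in nearest]
    return max(set(labels), key=labels.count)

def filter_mislabeled(samples, classifiers, k=10):
    """Split samples into (kept, flagged) using k-fold cross validation.

    A held-out sample is flagged when every classifier, trained on the
    other folds, predicts a label different from the one it was given.
    """
    kept, flagged = [], []
    for fold in range(k):
        train = [s for i, s in enumerate(samples) if i % k != fold]
        held_out = [s for i, s in enumerate(samples) if i % k == fold]
        for value, label in held_out:
            votes = [clf(train, value) for clf in classifiers]
            if all(vote != label for vote in votes):
                flagged.append((value, label))
            else:
                kept.append((value, label))
    return kept, flagged

# Toy data: class "A" near 0, class "B" near 10, plus one mislabeled sample.
SAMPLES = ([(i / 10, "A") for i in range(10)]
           + [(10 + i / 10, "B") for i in range(10)]
           + [(0.45, "B")])
KEPT, FLAGGED = filter_mislabeled(SAMPLES, [nearest_centroid, three_nn])
```

On this toy set only the sample `(0.45, "B")` is flagged, and the remaining 20 samples form the cleaned training set that would be passed to the final classifier.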

Deep Learning for Fine-Grained Image Analysis: A Survey

Jul 06, 2019
Xiu-Shen Wei, Jianxin Wu, Quan Cui

Computer vision (CV) is the process of using machines to understand and analyze imagery, which is an integral branch of artificial intelligence. Among various research areas of CV, fine-grained image analysis (FGIA) is a longstanding and fundamental problem, and has become ubiquitous in diverse real-world applications. The task of FGIA targets analyzing visual objects from subordinate categories, e.g., species of birds or models of cars. The small inter-class variations and the large intra-class variations caused by the fine-grained nature make it a challenging problem. With the boom of deep learning, recent years have witnessed remarkable progress in FGIA using deep learning techniques. In this paper, we aim to give a survey on recent advances of deep learning based FGIA techniques in a systematic way. Specifically, we organize the existing studies of FGIA techniques into three major categories: fine-grained image recognition, fine-grained image retrieval and fine-grained image generation. In addition, we also cover some other important issues of FGIA, such as publicly available benchmark datasets and related domain-specific applications. Finally, we conclude this survey by highlighting several directions and open problems which need to be further explored by the community in the future.

* Project page: 

Data-based wind disaster climate identification algorithm and extreme wind speed prediction

Aug 29, 2019
Wei Cui, Teng Ma, Lin Zhao, Yaojun Ge

An extreme wind speed estimation method that considers wind hazard climate types is critical for design wind load calculation for building structures affected by mixed climates. However, it is very difficult to obtain wind hazard climate types from meteorological data records, which restricts the application of extreme wind speed estimation in mixed climates. This paper first proposes a wind hazard type identification algorithm based on a numerical pattern recognition method that utilizes feature extraction and generalization. Next, it compares six commonly used machine learning models using K-fold cross-validation. Finally, it takes meteorological data from three locations near the southeast coast of China as examples to examine the algorithm performance. Based on the classification results, the extreme wind speeds calculated from mixed wind hazard types are compared with those obtained from conventional methods, and the effects on structural design for different return periods are discussed.

Evolutionary Stochastic Gradient Descent for Optimization of Deep Neural Networks

Oct 16, 2018
Xiaodong Cui, Wei Zhang, Zoltán Tüske, Michael Picheny

We propose a population-based Evolutionary Stochastic Gradient Descent (ESGD) framework for optimizing deep neural networks. ESGD combines SGD and gradient-free evolutionary algorithms as complementary algorithms in one framework in which the optimization alternates between the SGD step and evolution step to improve the average fitness of the population. With a back-off strategy in the SGD step and an elitist strategy in the evolution step, it guarantees that the best fitness in the population will never degrade. In addition, individuals in the population optimized with various SGD-based optimizers using distinct hyper-parameters in the SGD step are considered as competing species in a coevolution setting such that the complementarity of the optimizers is also taken into account. The effectiveness of ESGD is demonstrated across multiple applications including speech recognition, image recognition and language modeling, using networks with a variety of deep architectures.
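
The alternation between an SGD step with back-off and an elitist evolution step can be sketched on a toy problem. Everything below is an illustrative assumption rather than the paper's implementation: the quadratic loss, the Gaussian gradient noise standing in for minibatch noise, the mutation-only evolution step, and the per-individual learning rates standing in for distinct optimizers.

```python
import random

def loss(x):
    """Toy fitness: lower is better, minimum at x = 3."""
    return (x - 3.0) ** 2

def noisy_grad(x, rng):
    """Exact gradient plus Gaussian noise, standing in for a minibatch gradient."""
    return 2.0 * (x - 3.0) + rng.gauss(0.0, 0.5)

def sgd_step(x, lr, rng, steps=5):
    """Run a few noisy gradient steps; back off to the parent if the loss got worse."""
    candidate = x
    for _ in range(steps):
        candidate -= lr * noisy_grad(candidate, rng)
    return candidate if loss(candidate) <= loss(x) else x

def evolution_step(population, rng, sigma=0.1):
    """Mutate every individual, then keep the best of parents plus offspring (elitism)."""
    offspring = [(x + rng.gauss(0.0, sigma), lr) for x, lr in population]
    ranked = sorted(population + offspring, key=lambda ind: loss(ind[0]))
    return ranked[:len(population)]

def esgd(generations=20, seed=0):
    """Alternate SGD and evolution steps; return the final population and best-loss history."""
    rng = random.Random(seed)
    # Each individual carries its own learning rate, a stand-in for distinct optimizers.
    population = [(rng.uniform(-10.0, 10.0), lr) for lr in (0.01, 0.05, 0.1, 0.2)]
    best_history = []
    for _ in range(generations):
        population = [(sgd_step(x, lr, rng), lr) for x, lr in population]
        population = evolution_step(population, rng)
        best_history.append(min(loss(x) for x, _ in population))
    return population, best_history

POPULATION, BEST_HISTORY = esgd()
```

Because the SGD step never accepts a worse candidate and the evolution step always retains the best individuals, the best loss recorded in `BEST_HISTORY` cannot increase from one generation to the next, which mirrors the non-degradation guarantee stated in the abstract.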

Unsupervised Machine Commenting with Neural Variational Topic Model

Sep 13, 2018
Shuming Ma, Lei Cui, Furu Wei, Xu Sun

Article comments can provide supplementary opinions and facts for readers, thereby increasing the attraction and engagement of articles. Therefore, automatic commenting is helpful in improving the activeness of communities such as online forums and news websites. Previous work shows that training an automatic commenting system requires large parallel corpora. Although some articles are naturally paired with comments on some websites, most articles and comments on the Internet are unpaired. To fully exploit the unpaired data, we completely remove the need for parallel data and propose a novel unsupervised approach to train an automatic article commenting model, relying on nothing but unpaired articles and comments. Our model is based on a retrieval-based commenting framework, which uses news to retrieve comments based on the similarity of their topics. The topic representation is obtained from a neural variational topic model, which is trained in an unsupervised manner. We evaluate our model on a news comment dataset. Experiments show that our proposed topic-based approach significantly outperforms previous lexicon-based models. The model also profits from paired corpora and achieves state-of-the-art performance under semi-supervised scenarios.

Large-Scale Mixed-Bandwidth Deep Neural Network Acoustic Modeling for Automatic Speech Recognition

Jul 10, 2019
Khoi-Nguyen C. Mac, Xiaodong Cui, Wei Zhang, Michael Picheny

In automatic speech recognition (ASR), wideband (WB) and narrowband (NB) speech signals with different sampling rates typically use separate acoustic models. Therefore, mixed-bandwidth (MB) acoustic modeling has important practical value for ASR system deployment. In this paper, we extensively investigate large-scale MB deep neural network acoustic modeling for ASR using 1,150 hours of WB data and 2,300 hours of NB data. We study various MB strategies including downsampling, upsampling and bandwidth extension for MB acoustic modeling and evaluate their performance on 8 diverse WB and NB test sets from various application domains. To deal with the large amounts of training data, distributed training is carried out on multiple GPUs using synchronous data parallelism.

* Interspeech 2019 

Exploring difference in public perceptions on HPV vaccine between gender groups from Twitter using deep learning

Jul 06, 2019
Jingcheng Du, Chongliang Luo, Qiang Wei, Yong Chen, Cui Tao

In this study, we proposed a convolutional neural network model for gender prediction using English Twitter text as input. An ensemble of the proposed models achieved an accuracy of 0.8237 on gender prediction and compared favorably with the state-of-the-art performance in a recent author profiling task. We further leveraged the trained models to predict gender labels for an HPV vaccine related corpus and identified gender differences in public perceptions regarding the HPV vaccine. The findings are largely consistent with previous survey-based studies.

* This manuscript has been accepted by 2019 KDD Workshop on Applied Data Science for Healthcare 

Transfer Learning for Sequences via Learning to Collocate

Feb 25, 2019
Wanyun Cui, Guangyu Zheng, Zhiqiang Shen, Sihang Jiang, Wei Wang

Transfer learning aims to solve the data sparsity of a target domain by applying information from the source domain. Given a sequence (e.g. a natural language sentence), transfer learning, usually enabled by a recurrent neural network (RNN), represents the sequential information transfer. An RNN uses a chain of repeating cells to model the sequence data. However, previous studies of neural network based transfer learning simply represent the whole sentence by a single vector, which is infeasible for seq2seq and sequence labeling. Meanwhile, such layer-wise transfer learning mechanisms lose the fine-grained cell-level information from the source domain. In this paper, we proposed the aligned recurrent transfer, ART, to achieve cell-level information transfer. ART is under the pre-training framework. Each cell attentively accepts transferred information from a set of positions in the source domain. Therefore, ART learns the cross-domain word collocations in a more flexible way. We conducted extensive experiments on both sequence labeling tasks (POS tagging, NER) and sentence classification (sentiment analysis). ART outperforms the state of the art in all experiments.

* Published at ICLR 2019 

Graph-Adaptive Pruning for Efficient Inference of Convolutional Neural Networks

Nov 21, 2018
Mengdi Wang, Qing Zhang, Jun Yang, Xiaoyuan Cui, Wei Lin

In this work, we propose a graph-adaptive pruning (GAP) method for efficient inference of convolutional neural networks (CNNs). In this method, the network is viewed as a computational graph, in which the vertices denote the computation nodes and edges represent the information flow. Through topology analysis, GAP is capable of adapting to different network structures, especially the widely used cross connections and multi-path data flow in recent novel convolutional models. The models can be adaptively pruned at vertex level as well as edge level without any post-processing, so GAP directly yields practical model compression and inference speed-up. Moreover, it does not need any customized computation library or hardware support. Finetuning is conducted after pruning to restore the model performance. In the finetuning step, we adopt a self-taught knowledge distillation (KD) strategy that utilizes information from the original model, through which the performance of the optimized model can be sufficiently improved without introducing any other teacher model. Experimental results show that the proposed GAP achieves promising results in making inference more efficient; e.g., for ResNeXt-29 on CIFAR10, it obtains 13X model compression and 4.3X practical speed-up with marginal loss of accuracy.

* 7 pages, 7 figures 

LiveBot: Generating Live Video Comments Based on Visual and Textual Contexts

Sep 13, 2018
Shuming Ma, Lei Cui, Damai Dai, Furu Wei, Xu Sun

We introduce the task of automatic live commenting. Live commenting, which is also called `video barrage', is an emerging feature on online video sites that allows real-time comments from viewers to fly across the screen like bullets or roll at the right side of the screen. The live comments are a mixture of opinions on the video and chit-chat with other comments. Automatic live commenting requires AI agents to comprehend the videos and interact with human viewers who also make the comments, so it is a good testbed of an AI agent's ability to deal with both dynamic vision and language. In this work, we construct a large-scale live comment dataset with 2,361 videos and 895,929 live comments. Then, we introduce two neural models to generate live comments based on the visual and textual contexts, which achieve better performance than previous neural baselines such as the sequence-to-sequence model. Finally, we provide a retrieval-based evaluation protocol for automatic live commenting where the model is asked to sort a set of candidate comments based on the log-likelihood score, and is evaluated on metrics such as mean reciprocal rank. Putting it all together, we demonstrate the first `LiveBot'.
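
The retrieval-based evaluation mentioned above ranks candidate comments by log-likelihood and scores the ranking with mean reciprocal rank. A minimal sketch of that metric follows; the function names and the toy scores are hypothetical, and the way candidate sets are built in the paper is not reproduced here.

```python
def reciprocal_rank(scores, gold_indices):
    """Return 1/rank of the best-ranked gold candidate, or 0.0 if none is present.

    scores[i] is the model's log-likelihood for candidate i (higher is better).
    gold_indices is the set of candidate positions that are correct comments.
    """
    ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    for rank, index in enumerate(ranking, start=1):
        if index in gold_indices:
            return 1.0 / rank
    return 0.0

def mean_reciprocal_rank(examples):
    """Average the reciprocal rank over (scores, gold_indices) pairs."""
    return sum(reciprocal_rank(s, g) for s, g in examples) / len(examples)

# Toy evaluation set: the gold comment is ranked 1st in one case and 3rd in the other.
EXAMPLES = [
    ([-1.2, -0.3, -2.5], {1}),
    ([-0.5, -1.0, -0.7], {1}),
]
MRR = mean_reciprocal_rank(EXAMPLES)
```

For these two toy examples the reciprocal ranks are 1 and 1/3, so `MRR` is their average, 2/3.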

Exploiting Persona Information for Diverse Generation of Conversational Responses

May 29, 2019
Haoyu Song, Wei-Nan Zhang, Yiming Cui, Dong Wang, Ting Liu

In human conversations, people can easily carry on and maintain a conversation because they keep their personalities in mind. Given a conversational context with persona information, how a chatbot can exploit that information to generate diverse and sustainable conversations is still a non-trivial task. Previous work on persona-based conversational models successfully makes use of predefined persona information and has shown great promise in delivering more realistic responses. However, these models all learn under the assumption that, given a source input, there is only one target response, whereas in human conversations there are many appropriate responses to a given input message. In this paper, we propose a memory-augmented architecture to exploit persona information from context and incorporate a conditional variational autoencoder model to generate diverse and sustainable conversations. We evaluate the proposed model on a benchmark persona-chat dataset. Both automatic and human evaluations show that our model can deliver more diverse and more engaging persona-based responses than baseline approaches.

* Published as a conference paper at IJCAI 2019 (to appear). 7 pages, 1 figure 

TableBank: Table Benchmark for Image-based Table Detection and Recognition

Mar 05, 2019
Minghao Li, Lei Cui, Shaohan Huang, Furu Wei, Ming Zhou, Zhoujun Li

We present TableBank, a new image-based table detection and recognition dataset built with novel weak supervision from Word and LaTeX documents on the internet. Existing research for image-based table detection and recognition usually fine-tunes pre-trained models on out-of-domain data with a few thousand human-labeled examples, which makes it difficult to generalize to real-world applications. With TableBank, which contains 417K high-quality labeled tables, we build several strong baselines using state-of-the-art models with deep neural networks. We make TableBank publicly available and hope it will empower more deep learning approaches in the table detection and recognition task.

RPC: A Large-Scale Retail Product Checkout Dataset

Jan 22, 2019
Xiu-Shen Wei, Quan Cui, Lei Yang, Peng Wang, Lingqiao Liu

Over recent years, interest has emerged in integrating computer vision technology into the retail industry. Automatic checkout (ACO) is one of the critical problems in this area; it aims to automatically generate the shopping list from images of the products to be purchased. The main challenge of this problem comes from the large scale and the fine-grained nature of the product categories, as well as the difficulty of collecting training images that reflect realistic checkout scenarios, given the continuous update of the products. Despite its significant practical and research value, this problem is not extensively studied in the computer vision community, largely due to the lack of a high-quality dataset. To fill this gap, in this work we propose a new dataset to facilitate relevant research. Our dataset has the following characteristics: (1) It is by far the largest dataset in terms of both product image quantity and product categories. (2) It includes single-product images taken in a controlled environment and multi-product images taken by the checkout system. (3) It provides different levels of annotations for the check-out images. Compared with existing datasets, ours is closer to the realistic setting and can give rise to a variety of research problems. Besides the dataset, we also benchmark the performance on this dataset with various approaches. The dataset and related resources can be found at \url{}.

* Project page: 

KBQA: Learning Question Answering over QA Corpora and Knowledge Bases

Mar 06, 2019
Wanyun Cui, Yanghua Xiao, Haixun Wang, Yangqiu Song, Seung-won Hwang, Wei Wang

Question answering (QA) has become a popular way for humans to access billion-scale knowledge bases. Unlike web search, QA over a knowledge base gives accurate and concise results, provided that natural language questions can be understood and mapped precisely to structured queries over the knowledge base. The challenge, however, is that a human can ask one question in many different ways. Previous approaches have natural limits due to their representations: rule-based approaches only understand a small set of "canned" questions, while keyword-based or synonym-based approaches cannot fully understand the questions. In this paper, we design a new kind of question representation, templates, learned over a billion-scale knowledge base and a million-scale QA corpus. For example, for questions about a city's population, we learn templates such as "What's the population of $city?" and "How many people are there in $city?". We learned 27 million templates for 2782 intents. Based on these templates, our QA system KBQA effectively supports binary factoid questions, as well as complex questions which are composed of a series of binary factoid questions. Furthermore, we expand predicates in the RDF knowledge base, which boosts the coverage of the knowledge base by 57 times. Our QA system beats all other state-of-the-art systems in both effectiveness and efficiency on the QALD benchmarks.
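
The template idea can be illustrated with a small sketch that maps a question to a predicate and looks the answer up. This is only a toy under stated assumptions: the two template strings come from the abstract's example, but the regex-based matching, the `TOY_KB` dictionary, and the entity "Exampleville" with its value are hypothetical, and the paper learns its templates from data rather than hard-coding them.

```python
import re

# The two template strings are the ones quoted in the abstract; the mapping
# to a "population" predicate is an assumption for illustration.
TEMPLATES = {
    "What's the population of $city?": "population",
    "How many people are there in $city?": "population",
}

# Hypothetical knowledge base with a made-up entity and a made-up value.
TOY_KB = {("Exampleville", "population"): 12345}

def compile_template(template):
    """Escape the literal text and turn the $city slot into a named capture group."""
    escaped = re.escape(template)
    pattern = escaped.replace(re.escape("$city"), r"(?P<city>.+)")
    return re.compile("^" + pattern + "$", re.IGNORECASE)

COMPILED = [(compile_template(t), predicate) for t, predicate in TEMPLATES.items()]

def answer(question):
    """Return the knowledge-base value for the first matching template, else None."""
    for regex, predicate in COMPILED:
        match = regex.match(question.strip())
        if match:
            return TOY_KB.get((match.group("city"), predicate))
    return None
```

Both phrasings of the population question resolve to the same predicate, so `answer("How many people are there in Exampleville?")` and `answer("What's the population of Exampleville?")` return the same toy value, while a question that matches no template returns `None`.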

* Proceedings of the VLDB Endowment, Volume 10 Issue 5, January 2017 

Retrieval-Enhanced Adversarial Training for Neural Response Generation

Sep 12, 2018
Qingfu Zhu, Lei Cui, Weinan Zhang, Furu Wei, Yining Chen, Ting Liu

Dialogue systems are usually built on either generation-based or retrieval-based approaches, yet they do not benefit from the advantages of different models. In this paper, we propose a Retrieval-Enhanced Adversarial Training (REAT) method for neural response generation. Distinct from existing approaches, the REAT method leverages an encoder-decoder framework in terms of an adversarial training paradigm, while taking advantage of N-best response candidates from a retrieval-based system to construct the discriminator. An empirical study on a large-scale publicly available benchmark dataset shows that the REAT method significantly outperforms the vanilla Seq2Seq model as well as the conventional adversarial training approach.

MQGrad: Reinforcement Learning of Gradient Quantization in Parameter Server

Apr 22, 2018
Guoxin Cui, Jun Xu, Wei Zeng, Yanyan Lan, Jiafeng Guo, Xueqi Cheng

One of the most significant bottlenecks in training large-scale machine learning models on a parameter server (PS) is the communication overhead, because the model gradients must be exchanged frequently between the workers and servers during the training iterations. Gradient quantization has been proposed as an effective approach to reducing the communication volume. One key issue in gradient quantization is setting the number of bits for quantizing the gradients. A small number of bits can significantly reduce the communication overhead but hurts the gradient accuracy, and vice versa. An ideal quantization method would dynamically balance the communication overhead and model accuracy by adjusting the number of bits according to the knowledge learned from the immediate past training iterations. Existing methods, however, quantize the gradients either with a fixed number of bits or with predefined heuristic rules. In this paper we propose a novel adaptive quantization method within the framework of reinforcement learning. The method, referred to as MQGrad, formalizes the selection of quantization bits as actions in a Markov decision process (MDP) where the MDP state records the information collected from the past optimization iterations (e.g., the sequence of the loss function values). During the training iterations of a machine learning algorithm, MQGrad continuously updates the MDP state according to the changes of the loss function. Based on this information, the MDP learns to select the optimal actions (number of bits) to quantize the gradients. Experimental results on a benchmark dataset show that MQGrad can accelerate the learning of a large-scale deep neural network while maintaining its prediction accuracy.
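
The trade-off between the number of bits and gradient accuracy can be made concrete with a small sketch. It assumes plain uniform quantization over the range of the gradient vector, which may differ from the quantizer used in the paper, and it does not implement the reinforcement-learning bit selection; all function names and the toy gradient values are hypothetical.

```python
def quantize(grads, bits):
    """Uniformly quantize gradients onto 2**bits grid points spanning [min, max]."""
    lo, hi = min(grads), max(grads)
    if hi == lo:
        return list(grads)           # constant vector: nothing to quantize
    intervals = 2 ** bits - 1        # number of gaps between grid points
    step = (hi - lo) / intervals
    # Snap each gradient to its nearest grid point.
    return [lo + round((g - lo) / step) * step for g in grads]

def mean_abs_error(grads, bits):
    """Average absolute difference between the original and quantized gradients."""
    approx = quantize(grads, bits)
    return sum(abs(g - q) for g, q in zip(grads, approx)) / len(grads)

def payload_bits(grads, bits):
    """Bits needed for the quantized codes (ignores the small cost of sending lo and step)."""
    return len(grads) * bits

# Toy gradient vector used only for illustration.
GRADS = [-0.8, -0.1, 0.05, 0.3, 0.9]
```

On this toy vector, 2-bit quantization sends a quarter of the payload of 8-bit quantization but with a larger mean error. An adaptive scheme such as the one described above would choose `bits` per iteration instead of fixing it in advance.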

* 7 pages, 5 figures 
