Models, code, and papers for "Ying Xu":

Learning over inherently distributed data

Jul 30, 2019
Donghui Yan, Ying Xu

The recent decades have seen a surge of interest in distributed computing. Existing work focuses primarily on distributed computing platforms, data query tools, or algorithms that divide big data and conquer at individual machines. It is, however, increasingly common that the data of interest are inherently distributed, i.e., stored at multiple distributed sites due to diverse collection channels, business operations, etc. We propose to enable learning and inference in such a setting via a general framework based on distortion-minimizing local transformations. This framework requires only a small amount of local signatures to be shared among distributed sites, eliminating the need to transmit big data. Computation can be done very efficiently via parallel local computation. The error incurred by distributed computing vanishes as the size of the local signatures increases. As the shared data need not be in their original form, data privacy may also be preserved. Experiments on linear (logistic) regression and Random Forests show the promise of this approach. The framework is expected to apply to a general class of learning and inference tools with the continuity property.

* 26 pages, 9 figures 

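The abstract does not spell out the distortion-minimizing local transformations, so the sketch below is only one plausible instance of the idea: each site compresses its data into a small set of weighted k-means centroids (a hypothetical "local signature") and a central site fits a model on the pooled centroids. All names and choices here are illustrative assumptions, not the paper's method.

```python
# Hypothetical sketch: learning from inherently distributed data by sharing
# small local signatures (weighted, labeled k-means centroids) instead of raw
# data. Assumes vector quantization as the distortion-minimizing transform;
# the framework in the paper may differ.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

def local_signature(X, y, k=32, seed=0):
    """Compress one site's data into k weighted, majority-labeled centroids."""
    km = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(X)
    counts = np.bincount(km.labels_, minlength=k)
    labels = np.array([np.bincount(y[km.labels_ == j]).argmax() if counts[j] else 0
                       for j in range(k)])
    return km.cluster_centers_, labels, counts

rng = np.random.default_rng(0)
sites = []
for s in range(3):                        # three distributed sites
    X = rng.normal(size=(500, 5)) + s * 0.1
    y = (X[:, 0] + X[:, 1] > 0).astype(int)
    sites.append(local_signature(X, y, k=32, seed=s))

# The central site pools only the signatures (96 points instead of 1500).
C = np.vstack([c for c, _, _ in sites])
yc = np.concatenate([l for _, l, _ in sites])
w = np.concatenate([n for _, _, n in sites])
model = LogisticRegression().fit(C, yc, sample_weight=w)
print("pooled-signature model coefficients:", model.coef_.round(2))
```

In this toy setup, increasing the signature size k shrinks the gap to a centralized fit, mirroring the vanishing-error behavior described in the abstract.
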
Inpatient2Vec: Medical Representation Learning for Inpatients

Apr 18, 2019
Ying Wang, Xiao Xu

Representation learning (RL) plays an important role in extracting proper representations from complex medical data for various analysis tasks, such as patient grouping, clinical endpoint prediction and medication recommendation. Medical data can be divided into two typical categories, outpatient and inpatient, which have different data characteristics. However, few existing RL methods are specifically designed for inpatient data, which have strong temporal relations and consistent diagnoses. In addition, for unordered medical activity sets, existing medical RL methods use a simple pooling strategy, which results in indistinguishable contributions among the activities for learning. In this work, we propose Inpatient2Vec, a novel model for learning three kinds of representations for inpatients: medical activity, hospital day and diagnosis. A multi-layer self-attention mechanism with two training tasks is designed to capture the inpatient data characteristics and process the unordered set. Using a real-world dataset, we demonstrate that the proposed approach outperforms competitive baselines on semantic similarity measurement and clinical event prediction tasks.


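A minimal sketch of the set-handling idea described in the abstract: self-attention pooling over an unordered set of medical-activity embeddings, so that activities contribute unequally, unlike mean/max pooling. This is not the Inpatient2Vec architecture; class names and dimensions are assumptions.

```python
# Minimal sketch (not the Inpatient2Vec model): self-attention pooling over an
# unordered set of medical-activity embeddings, producing one hospital-day
# representation with per-activity attention weights.
import torch
import torch.nn as nn

class AttentionSetPool(nn.Module):
    def __init__(self, n_activities, dim=64, heads=4):
        super().__init__()
        self.embed = nn.Embedding(n_activities, dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.query = nn.Parameter(torch.randn(1, 1, dim))   # learned pooling query

    def forward(self, activity_ids, pad_mask):
        x = self.embed(activity_ids)                         # (B, S, dim)
        q = self.query.expand(x.size(0), -1, -1)             # one query per example
        day_repr, weights = self.attn(q, x, x, key_padding_mask=pad_mask)
        return day_repr.squeeze(1), weights                  # (B, dim), attention weights

pool = AttentionSetPool(n_activities=1000)
ids = torch.randint(0, 1000, (2, 7))                         # 2 hospital days, 7 activities each
mask = torch.zeros(2, 7, dtype=torch.bool)                   # no padding in this toy batch
rep, w = pool(ids, mask)
print(rep.shape, w.shape)                                    # torch.Size([2, 64]) torch.Size([2, 1, 7])
```
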
Semi-supervised Target-level Sentiment Analysis via Variational Autoencoder

Oct 24, 2018
Weidi Xu, Ying Tan

Target-level aspect-based sentiment analysis (TABSA) is a long-standing challenge that requires fine-grained semantic reasoning about a certain aspect. As manual annotation of aspects is laborious and time-consuming, the amount of labeled data for supervised learning is limited. This paper proposes a semi-supervised method for the TABSA problem based on the Variational Autoencoder (VAE). The VAE is a powerful deep generative model that models the latent distribution via variational inference. By disentangling the latent representation into the aspect-specific sentiment and the context, the method implicitly induces the underlying sentiment prediction for the unlabeled data, which then benefits the TABSA classifier. Our method is classifier-agnostic, i.e., the classifier is an independent module and various advanced supervised models can be integrated. Experimental results on SemEval 2014 Task 4 show that our method is effective with four classical classifiers. The proposed method outperforms two general semi-supervised methods and achieves competitive performance.

* 8 pages, 5 figures, 6 tables 

Voiceprint recognition of Parkinson patients based on deep learning

Dec 17, 2018
Zhijing Xu, Juan Wang, Ying Zhang, Xiangjian He

More than 90% of Parkinson's Disease (PD) patients suffer from vocal disorders, and speech impairment is already an indicator of PD. This study focuses on PD diagnosis through voiceprint features. In this paper, a method based on Deep Neural Network (DNN) recognition and classification combined with Mini-Batch Gradient Descent (MBGD) is proposed to distinguish PD patients from healthy people using voiceprint features. To extract the voiceprint features from patients, Weighted Mel Frequency Cepstrum Coefficients (WMFCC) are applied. The proposed method is tested on experimental data obtained from voice recordings of three sustained vowels /a/, /o/ and /u/ from participants (48 PD patients and 20 healthy people). The results show that the proposed method achieves a higher diagnostic accuracy, 89.5%, in distinguishing PD patients from healthy people than conventional methods such as Support Vector Machine (SVM) and others mentioned in this paper. The WMFCC approach addresses the problem that the high-order cepstrum coefficients are small and their ability to represent the audio is weak. MBGD reduces the computational load of the loss function and increases the training speed of the system. The DNN classifier enhances the classification ability of the voiceprint features. Together, these approaches provide a solid solution for quick auxiliary diagnosis of PD at an early stage.

* 10 pages, 4 figures 

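The abstract does not give the exact WMFCC weighting or network size, so the pipeline below is only an illustrative stand-in: standard MFCCs from librosa, scaled by a hypothetical weight vector that boosts higher-order coefficients, fed to a small mini-batch-SGD-trained classifier on synthetic tones.

```python
# Illustrative WMFCC + DNN sketch. The weighting scheme, features and data
# here are toy assumptions, not the paper's configuration.
import numpy as np
import librosa
from sklearn.neural_network import MLPClassifier

def wmfcc(y, sr, n_mfcc=20):
    m = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)      # (n_mfcc, frames)
    weights = np.linspace(1.0, 3.0, n_mfcc)[:, None]         # hypothetical per-coefficient weights
    return (m * weights).mean(axis=1)                        # one vector per recording

# Toy data: synthetic sustained-vowel-like tones standing in for the recordings.
sr = 16000
X, y = [], []
for label, f0 in [(0, 120.0), (1, 180.0)]:                   # 0 = healthy, 1 = PD (toy labels)
    for k in range(20):
        t = np.linspace(0, 1.0, sr, endpoint=False)
        sig = np.sin(2 * np.pi * (f0 + 2 * k) * t) + 0.05 * np.random.randn(sr)
        X.append(wmfcc(sig.astype(np.float32), sr))
        y.append(label)

# Mini-batch gradient descent training of a small feed-forward classifier.
clf = MLPClassifier(hidden_layer_sizes=(32, 16), solver="sgd",
                    batch_size=8, learning_rate_init=0.01,
                    max_iter=500, random_state=0)
clf.fit(np.array(X[::2]), y[::2])
print("toy accuracy:", clf.score(np.array(X[1::2]), y[1::2]))
```
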
Online Product Quantization

Mar 24, 2018
Donna Xu, Ivor W. Tsang, Ying Zhang

Approximate nearest neighbor (ANN) search has achieved great success in many tasks. However, existing popular methods for ANN search, such as hashing and quantization methods, are designed for static databases only. They cannot handle well databases whose data distribution evolves dynamically, due to the high computational cost of retraining the model on the new database. In this paper, we address the problem by developing an online product quantization (online PQ) model that incrementally updates the quantization codebook to accommodate the incoming streaming data. Moreover, to further alleviate the issue of large-scale computation for the online PQ update, we design two budget constraints for the model to update only part of the PQ codebook instead of all of it. We derive a loss bound which guarantees the performance of our online PQ model. Furthermore, we develop an online PQ model over a sliding window, with both data insertion and deletion supported, to reflect the real-time behaviour of the data. The experiments demonstrate that our online PQ model is both time-efficient and effective for ANN search in dynamic large-scale databases compared with baseline methods, and that the idea of partial PQ codebook update further reduces the update cost.

* To appear in IEEE Transactions on Knowledge and Data Engineering (DOI: 10.1109/TKDE.2018.2817526) 

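A simplified sketch of the online PQ idea with a partial-update budget: each incoming batch moves only the codewords it is assigned to, and a budget caps how many codewords per subspace are touched. The update rule and budget definition here are assumptions, not the paper's exact algorithm or loss bound.

```python
# Simplified online product quantization with a partial-codebook-update budget
# (not the paper's exact update rule).
import numpy as np

class OnlinePQ:
    def __init__(self, dim, m=4, k=16, lr=0.1, budget=None, seed=0):
        assert dim % m == 0
        self.m, self.d, self.k, self.lr, self.budget = m, dim // m, k, lr, budget
        rng = np.random.default_rng(seed)
        self.codebooks = rng.normal(size=(m, k, self.d))      # one codebook per subspace

    def encode(self, X):
        codes = np.empty((len(X), self.m), dtype=np.int64)
        for j in range(self.m):
            sub = X[:, j * self.d:(j + 1) * self.d]
            dist = ((sub[:, None, :] - self.codebooks[j]) ** 2).sum(-1)
            codes[:, j] = dist.argmin(1)
        return codes

    def partial_update(self, X):
        codes = self.encode(X)
        for j in range(self.m):
            sub = X[:, j * self.d:(j + 1) * self.d]
            hit, counts = np.unique(codes[:, j], return_counts=True)
            if self.budget is not None:                        # touch most-used codewords only
                hit = hit[np.argsort(-counts)][:self.budget]
            for c in hit:
                centroid = sub[codes[:, j] == c].mean(0)
                self.codebooks[j, c] += self.lr * (centroid - self.codebooks[j, c])
        return codes

pq = OnlinePQ(dim=32, m=4, k=16, budget=4)
for _ in range(10):                                            # streaming batches with drift
    batch = np.random.randn(64, 32) + np.random.randn(32) * 0.1
    pq.partial_update(batch)
print(pq.encode(np.random.randn(3, 32)))
```
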
Accelerated Primal-Dual Algorithms for Distributed Smooth Convex Optimization over Networks

Oct 23, 2019
Jinming Xu, Ye Tian, Ying Sun, Gesualdo Scutari

This paper proposes a novel family of primal-dual-based distributed algorithms for smooth, convex, multi-agent optimization over networks that use only gradient information and gossip communications. The algorithms can also employ acceleration on both computation and communications. We provide a unified analysis of their convergence rate, measured in terms of the Bregman distance associated with the saddle-point reformulation of the distributed optimization problem. When acceleration is employed, the rate is shown to be optimal, in the sense that it matches (under the proposed metric) existing complexity lower bounds for distributed algorithms applicable to this class of problems using only gradient information and gossip communications. Preliminary numerical results on distributed least-squares regression problems show that the proposed algorithm compares favorably with existing distributed schemes.


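For orientation, the sketch below shows only the basic gradient-plus-gossip template that such methods build on (plain decentralized gradient descent for distributed least squares over a ring with Metropolis weights); it is not the accelerated primal-dual algorithm proposed in the paper.

```python
# Basic gradient + gossip template for distributed least squares: each agent
# averages with its ring neighbors, then takes a local gradient step. This is
# the primitive the paper's accelerated primal-dual schemes improve upon.
import numpy as np

n_agents, dim = 5, 3
rng = np.random.default_rng(0)
A = [rng.normal(size=(20, dim)) for _ in range(n_agents)]      # local data per agent
x_true = rng.normal(size=dim)
b = [Ai @ x_true + 0.01 * rng.normal(size=20) for Ai in A]

# Ring network with Metropolis weights (doubly stochastic gossip matrix).
W = np.zeros((n_agents, n_agents))
for i in range(n_agents):
    for j in ((i - 1) % n_agents, (i + 1) % n_agents):
        W[i, j] = 1 / 3
    W[i, i] = 1 - W[i].sum()

x = np.zeros((n_agents, dim))
step = 0.01
for _ in range(500):
    grads = np.stack([A[i].T @ (A[i] @ x[i] - b[i]) / len(b[i]) for i in range(n_agents)])
    x = W @ x - step * grads                                   # gossip, then local gradient step

print("max error vs x_true:", float(np.abs(x - x_true).max()))
```
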
Similarity Kernel and Clustering via Random Projection Forests

Aug 28, 2019
Donghui Yan, Songxiang Gu, Ying Xu, Zhiwei Qin

Similarity plays a fundamental role in many areas, including data mining, machine learning, statistics and various applied domains. Inspired by the success of ensemble methods and the flexibility of trees, we propose to learn a similarity kernel called rpf-kernel through random projection forests (rpForests). Our theoretical analysis reveals a highly desirable property of rpf-kernel: far-away (dissimilar) points have a low similarity value while nearby (similar) points have a high similarity, and the similarities have a native interpretation as the probability of points remaining in the same leaf nodes during the growth of rpForests. The learned rpf-kernel leads to an effective clustering algorithm, rpfCluster. On a wide variety of real and benchmark datasets, rpfCluster compares favorably to K-means clustering, spectral clustering and a state-of-the-art clustering ensemble algorithm, Cluster Forests. Our approach is simple to implement and readily adapts to the geometry of the underlying data. Given its desirable theoretical properties and competitive empirical performance when applied to clustering, we expect rpf-kernel to be applicable to many problems of an unsupervised nature or as a regularizer in some supervised or weakly supervised settings.

* 22 pages, 5 figures 

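A minimal sketch of the kernel idea stated in the abstract: grow several random projection trees and define similarity as the fraction of trees in which two points share a leaf, then cluster with the precomputed affinity. The tree construction below is a simplified stand-in for the paper's rpForests.

```python
# Sketch of an rpf-kernel-style similarity: similarity = fraction of random
# projection trees in which two points land in the same leaf.
import numpy as np
from sklearn.cluster import SpectralClustering

def rp_tree_leaves(X, leaf_size=20, rng=None):
    """Recursively split X at the median of a random projection; return leaf ids."""
    if rng is None:
        rng = np.random.default_rng()
    leaf_id = np.zeros(len(X), dtype=int)

    def split(idx, next_id):
        if len(idx) <= leaf_size:
            leaf_id[idx] = next_id
            return next_id + 1
        direction = rng.normal(size=X.shape[1])
        proj = X[idx] @ direction
        cut = np.median(proj)
        left, right = idx[proj <= cut], idx[proj > cut]
        if len(left) == 0 or len(right) == 0:        # degenerate split, stop here
            leaf_id[idx] = next_id
            return next_id + 1
        next_id = split(left, next_id)
        return split(right, next_id)

    split(np.arange(len(X)), 0)
    return leaf_id

X = np.vstack([np.random.randn(100, 2) + c for c in ([0, 0], [6, 0], [0, 6])])
rng = np.random.default_rng(0)
n_trees = 50
K = np.zeros((len(X), len(X)))
for _ in range(n_trees):
    leaves = rp_tree_leaves(X, rng=rng)
    K += (leaves[:, None] == leaves[None, :]).astype(float)
K /= n_trees                                          # similarity in [0, 1]

labels = SpectralClustering(n_clusters=3, affinity="precomputed",
                            random_state=0).fit_predict(K)
print(np.bincount(labels))
```
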
MemNet: A Persistent Memory Network for Image Restoration

Aug 07, 2017
Ying Tai, Jian Yang, Xiaoming Liu, Chunyan Xu

Recently, very deep convolutional neural networks (CNNs) have been attracting considerable attention in image restoration. However, as the depth grows, the long-term dependency problem is rarely realized for these very deep models, which results in the prior states/layers having little influence on the subsequent ones. Motivated by the fact that human thoughts have persistency, we propose a very deep persistent memory network (MemNet) that introduces a memory block, consisting of a recursive unit and a gate unit, to explicitly mine persistent memory through an adaptive learning process. The recursive unit learns multi-level representations of the current state under different receptive fields. The representations and the outputs from the previous memory blocks are concatenated and sent to the gate unit, which adaptively controls how much of the previous states should be reserved and decides how much of the current state should be stored. We apply MemNet to three image restoration tasks, i.e., image denoising, super-resolution and JPEG deblocking. Comprehensive experiments demonstrate the necessity of the MemNet and its unanimous superiority on all three tasks over the state of the art. Code is available at https://github.com/tyshiwo/MemNet.

* Accepted by ICCV 2017 (Spotlight presentation) 

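A compact PyTorch rendering of the memory-block idea as the abstract describes it: a shared-weight recursive unit produces multi-level features of the current state, and a 1x1-conv gate unit fuses them with the outputs of all previous memory blocks. This is a sketch under assumed sizes; the authors' real implementation is at the repository linked above.

```python
# Sketch of a MemNet-style memory block: recursive unit + 1x1-conv gate unit
# over [this block's multi-level states; outputs of previous blocks].
import torch
import torch.nn as nn

class MemoryBlock(nn.Module):
    def __init__(self, channels=64, recursions=3, n_prev=1):
        super().__init__()
        self.recursive_unit = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True))
        self.recursions = recursions
        # Gate unit: 1x1 conv over concatenated current states and previous outputs.
        self.gate = nn.Conv2d(channels * (recursions + n_prev), channels, 1)

    def forward(self, x, prev_outputs):
        states, h = [], x
        for _ in range(self.recursions):               # shared-weight recursion
            h = self.recursive_unit(h)
            states.append(h)
        return self.gate(torch.cat(states + prev_outputs, dim=1))

x = torch.randn(1, 64, 32, 32)
outputs = [x]
for b in range(3):                                      # block b sees b+1 previous outputs
    blk = MemoryBlock(n_prev=b + 1)
    outputs.append(blk(outputs[-1], list(outputs)))
print(outputs[-1].shape)                                # torch.Size([1, 64, 32, 32])
```
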
Variational Autoencoders for Semi-supervised Text Classification

Nov 24, 2016
Weidi Xu, Haoze Sun, Chao Deng, Ying Tan

Although the semi-supervised variational autoencoder (SemiVAE) works in image classification tasks, it fails in text classification tasks when a vanilla LSTM is used as its decoder. From a reinforcement learning perspective, it is verified that the decoder's capability to distinguish between different categorical labels is essential. Therefore, the Semi-supervised Sequential Variational Autoencoder (SSVAE) is proposed, which increases this capability by feeding the label into its decoder RNN at each time step. Two specific decoder structures are investigated and both are verified to be effective. Besides, in order to reduce the computational complexity of training, a novel optimization method is proposed, which estimates the gradient of the unlabeled objective function by sampling, along with two variance reduction techniques. Experimental results on the Large Movie Review Dataset (IMDB) and the AG's News corpus show that the proposed approach significantly improves classification accuracy compared with pure-supervised classifiers, and achieves competitive performance against previous advanced methods. State-of-the-art results can be obtained by integrating other pretraining-based methods.

* 8 pages, 4 figures 

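A small sketch of the decoder trick the abstract highlights: the class label is embedded and concatenated to the word embedding at every decoder time step, so the decoder stays aware of the label while reconstructing. Only the decoder is shown; the encoder, sampling estimator and semi-supervised objective are omitted, and all sizes are assumptions.

```python
# Sketch of a label-conditioned decoder RNN in the spirit of SSVAE: the label
# embedding is appended to the input at *every* time step.
import torch
import torch.nn as nn

class LabelConditionedDecoder(nn.Module):
    def __init__(self, vocab, n_classes, emb=64, label_emb=16, hidden=128, z_dim=32):
        super().__init__()
        self.word_emb = nn.Embedding(vocab, emb)
        self.label_emb = nn.Embedding(n_classes, label_emb)
        self.init_h = nn.Linear(z_dim, hidden)               # latent code sets the initial state
        self.rnn = nn.LSTM(emb + label_emb, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab)

    def forward(self, tokens, label, z):
        B, T = tokens.shape
        lab = self.label_emb(label).unsqueeze(1).expand(B, T, -1)   # repeat label per step
        inp = torch.cat([self.word_emb(tokens), lab], dim=-1)
        h0 = torch.tanh(self.init_h(z)).unsqueeze(0)
        out, _ = self.rnn(inp, (h0, torch.zeros_like(h0)))
        return self.out(out)                                  # (B, T, vocab) logits

dec = LabelConditionedDecoder(vocab=5000, n_classes=2)
logits = dec(torch.randint(0, 5000, (4, 12)),
             torch.tensor([0, 1, 1, 0]),
             torch.randn(4, 32))
print(logits.shape)                                           # torch.Size([4, 12, 5000])
```
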
Searching Action Proposals via Spatial Actionness Estimation and Temporal Path Inference and Tracking

Aug 23, 2016
Nannan Li, Dan Xu, Zhenqiang Ying, Zhihao Li, Ge Li

In this paper, we address the problem of searching for action proposals in unconstrained video clips. Our approach starts from actionness estimation on frame-level bounding boxes, and then aggregates the bounding boxes belonging to the same actor across frames via linking, associating and tracking to generate spatially and temporally continuous action paths. To achieve this, a novel actionness estimation method is first proposed that utilizes both human appearance and motion cues. Then, the association of the action paths is formulated as a maximum set coverage problem with the results of actionness estimation as a prior. To further improve performance, we design an improved optimization objective for the problem and provide a greedy search algorithm to solve it. Finally, a tracking-by-detection scheme is designed to further refine the searched action paths. Extensive experiments on two challenging datasets, UCF-Sports and UCF-101, show that the proposed approach advances state-of-the-art proposal generation performance in terms of both accuracy and proposal quantity.


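The association step is cast as maximum set coverage solved greedily; below is the textbook greedy routine on toy data, with actionness scores acting as element weights. The paper uses an improved objective, so treat this only as the standard baseline.

```python
# Standard greedy routine for weighted maximum set coverage on toy data:
# candidate paths cover detection ids, weighted by actionness scores.
def greedy_max_coverage(candidate_paths, scores, budget):
    """candidate_paths: list of sets of detection ids; scores: actionness weights."""
    chosen, covered = [], set()
    for _ in range(budget):
        best, best_gain = None, 0.0
        for i, path in enumerate(candidate_paths):
            if i in chosen:
                continue
            gain = sum(scores.get(d, 0.0) for d in path - covered)
            if gain > best_gain:
                best, best_gain = i, gain
        if best is None:                       # nothing left adds coverage
            break
        chosen.append(best)
        covered |= candidate_paths[best]
    return chosen, covered

paths = [{1, 2, 3}, {3, 4}, {4, 5, 6}, {1, 6}]
actionness = {d: s for d, s in zip(range(1, 7), [0.9, 0.8, 0.4, 0.7, 0.9, 0.6])}
picked, covered = greedy_max_coverage(paths, actionness, budget=2)
print(picked, covered)                         # [2, 0] {1, 2, 3, 4, 5, 6}
```
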
Learning Trajectory Prediction with Continuous Inverse Optimal Control via Langevin Sampling of Energy-Based Models

Apr 10, 2019
Yifei Xu, Tianyang Zhao, Chris Baker, Yibiao Zhao, Ying Nian Wu

Autonomous driving is a challenging multi-agent domain that requires optimizing complex, mixed cooperative-competitive interactions. Learning to predict contingent distributions over other vehicles' trajectories simplifies the problem, allowing approximate solutions by trajectory optimization with dynamic constraints. We take a model-based approach to prediction, in order to make use of structured prior knowledge of vehicle kinematics and the assumption that other drivers plan trajectories to minimize an unknown cost function. We introduce a novel inverse optimal control (IOC) algorithm to learn other vehicles' cost functions in an energy-based generative model. Langevin sampling, a Monte Carlo based sampling algorithm, is used to directly sample the control sequence. Our algorithm provides greater flexibility than standard IOC methods and can learn higher-level, non-Markovian cost functions defined over entire trajectories. We extend weighted feature-based cost functions with neural networks to obtain NN-augmented cost functions, which combine the advantages of both model-based and model-free learning. Results show that model-based IOC can achieve state-of-the-art vehicle trajectory prediction accuracy and naturally take scene information into account.


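A sketch of the core sampling step: Langevin dynamics on a control sequence under an energy (cost) function, with the gradient obtained by automatic differentiation. The energy below is a toy control-effort-plus-goal cost standing in for the learned NN-augmented cost function, and the step sizes are illustrative.

```python
# Langevin sampling of a control sequence u under an energy E(u):
#   u <- u - (step/2) * dE/du + sqrt(step) * noise
import torch

def toy_energy(u, x0, goal):
    """Roll out unit-mass dynamics and penalize control effort + distance to goal."""
    pos, vel, cost = x0.clone(), torch.zeros_like(x0), 0.0
    for t in range(u.shape[0]):
        vel = vel + u[t]
        pos = pos + vel
        cost = cost + 0.1 * (u[t] ** 2).sum()
    return cost + ((pos - goal) ** 2).sum()

def langevin_sample(energy, u, n_steps=200, step=1e-2):
    u = u.clone().requires_grad_(True)
    for _ in range(n_steps):
        e = energy(u)
        grad, = torch.autograd.grad(e, u)
        with torch.no_grad():
            u = u - 0.5 * step * grad + (step ** 0.5) * torch.randn_like(u)
        u.requires_grad_(True)
    return u.detach()

x0, goal = torch.zeros(2), torch.tensor([5.0, 3.0])
u0 = torch.zeros(10, 2)                               # 10 time steps, 2-D control
u = langevin_sample(lambda u: toy_energy(u, x0, goal), u0)
print("sampled control sequence norm:", u.norm().item())
```
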
Fictitious GAN: Training GANs with Historical Models

Jul 11, 2018
Hao Ge, Yin Xia, Xu Chen, Randall Berry, Ying Wu

Generative adversarial networks (GANs) are powerful tools for learning generative models. In practice, training may suffer from a lack of convergence. GANs are commonly viewed as a two-player zero-sum game between two neural networks. Here, we leverage this game-theoretic view to study the convergence behavior of the training process. Inspired by the fictitious play learning process, a novel training method, referred to as Fictitious GAN, is introduced. Fictitious GAN trains the deep neural networks using a mixture of historical models. Specifically, the discriminator (resp. generator) is updated according to the best response to the mixture outputs from a sequence of previously trained generators (resp. discriminators). It is shown that Fictitious GAN can effectively resolve some convergence issues that cannot be resolved by the standard training approach. It is proved that, asymptotically, the average of the generator outputs has the same distribution as the data samples.

* 19 pages. First three authors have equal contributions 

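A toy 1-D sketch of the historical-mixture idea: keep snapshot buffers of both networks and train each player against the mixture of the opponent's past copies instead of only the latest one. The snapshot schedule and losses below are assumptions; the paper's best-response updates differ in detail.

```python
# Toy Fictitious-GAN-style loop: each network trains against the mixture of
# the opponent's historical snapshots.
import copy
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 1))
D = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1))
optG = torch.optim.Adam(G.parameters(), lr=1e-3)
optD = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()
hist_G, hist_D = [copy.deepcopy(G)], [copy.deepcopy(D)]

for it in range(200):
    real = torch.randn(64, 1) * 0.5 + 2.0                 # target 1-D distribution
    z = torch.randn(64, 4)
    # Discriminator vs the mixture of historical generators.
    fake_mix = torch.cat([g(z).detach() for g in hist_G])
    d_loss = bce(D(real), torch.ones(64, 1)) + \
             bce(D(fake_mix), torch.zeros(len(fake_mix), 1))
    optD.zero_grad(); d_loss.backward(); optD.step()
    # Generator vs the mixture of historical discriminators.
    fake = G(z)
    g_loss = sum(bce(d(fake), torch.ones(64, 1)) for d in hist_D) / len(hist_D)
    optG.zero_grad(); g_loss.backward(); optG.step()
    if it % 50 == 49:                                      # snapshot into the histories
        hist_G.append(copy.deepcopy(G)); hist_D.append(copy.deepcopy(D))

print("generated mean:", G(torch.randn(512, 4)).mean().item())
```
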
Deep Recurrent Models with Fast-Forward Connections for Neural Machine Translation

Jul 23, 2016
Jie Zhou, Ying Cao, Xuguang Wang, Peng Li, Wei Xu

Neural machine translation (NMT) aims at solving machine translation (MT) problems using neural networks and has exhibited promising results in recent years. However, most existing NMT models are shallow, and there is still a performance gap between a single NMT model and the best conventional MT system. In this work, we introduce a new type of linear connection, named fast-forward connections, based on deep Long Short-Term Memory (LSTM) networks, together with an interleaved bi-directional architecture for stacking the LSTM layers. Fast-forward connections play an essential role in propagating gradients and building a deep topology of depth 16. On the WMT'14 English-to-French task, we achieve BLEU=37.7 with a single attention model, which outperforms the corresponding single shallow model by 6.2 BLEU points. This is the first time that a single NMT model has achieved state-of-the-art performance and outperformed the best conventional model, by 0.7 BLEU points. We can still achieve BLEU=36.3 even without using an attention mechanism. After special handling of unknown words and model ensembling, we obtain the best score reported to date on this task, BLEU=40.4. Our models are also validated on the more difficult WMT'14 English-to-German task.

* TACL 2016 

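A tiny sketch of the connection pattern only: each stacked LSTM layer gets a linear "fast-forward" path around it, and layer directions alternate to give an interleaved bi-directional stack. This is far from the full depth-16 NMT system, and the exact placement of the linear paths is an assumption.

```python
# Sketch of an interleaved bi-directional LSTM stack with linear fast-forward
# paths around each layer (pattern illustration only).
import torch
import torch.nn as nn

class FastForwardStack(nn.Module):
    def __init__(self, dim=64, layers=4):
        super().__init__()
        self.lstms = nn.ModuleList(nn.LSTM(dim, dim, batch_first=True) for _ in range(layers))
        self.ff = nn.ModuleList(nn.Linear(dim, dim, bias=False) for _ in range(layers))

    def forward(self, x):
        for i, (lstm, ff) in enumerate(zip(self.lstms, self.ff)):
            inp = x if i % 2 == 0 else torch.flip(x, dims=[1])   # interleave directions
            h, _ = lstm(inp)
            if i % 2 == 1:
                h = torch.flip(h, dims=[1])
            x = h + ff(x)                  # linear fast-forward path around the LSTM
        return x

enc = FastForwardStack()
out = enc(torch.randn(2, 15, 64))          # (batch, tokens, dim)
print(out.shape)
```
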
Multi-task GLOH feature selection for human age estimation

May 06, 2011
Yixiong Liang, Lingbo Liu, Ying Xu, Yao Xiang, Beiji Zou

In this paper, we propose a novel age estimation method based on the GLOH feature descriptor and multi-task learning (MTL). The GLOH feature descriptor, one of the state-of-the-art feature descriptors, is used to capture the age-related local and spatial information of a face image. As the extracted GLOH features are often redundant, MTL is designed to select the most informative feature bins for the age estimation problem, while the corresponding weights are determined by ridge regression. This approach largely reduces the feature dimensionality, which can not only improve performance but also decrease the computational burden. Experiments on the publicly available FG-NET database show that the proposed method achieves performance comparable to previous approaches while using far fewer features.


Grid-GCN for Fast and Scalable Point Cloud Learning

Dec 20, 2019
Qiangeng Xu, Xudong Sun, Cho-ying Wu, Panqu Wang, Ulrich Neumann

Due to the sparsity and irregularity of point cloud data, methods that directly consume points have become popular. Among point-based models, graph convolutional networks (GCNs) achieve notable performance by fully preserving the data granularity and exploiting point interrelations. However, point-based networks spend a significant amount of time on data structuring (e.g., Farthest Point Sampling (FPS) and neighbor point querying), which limits their speed and scalability. In this paper, we present a method, named Grid-GCN, for fast and scalable point cloud learning. Grid-GCN uses a novel data structuring strategy, Coverage-Aware Grid Query (CAGQ). By leveraging the efficiency of grid space, CAGQ improves spatial coverage while reducing the theoretical time complexity. Compared with popular sampling methods such as Farthest Point Sampling (FPS) and Ball Query, CAGQ achieves up to a 50X speed-up. With a Grid Context Aggregation (GCA) module, Grid-GCN achieves state-of-the-art performance on major point cloud classification and segmentation benchmarks with significantly faster runtime than previous studies. Remarkably, Grid-GCN achieves an inference speed of 50 fps on ScanNet using 81920 points per scene as input.


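A rough sketch of the grid-based idea: voxelize the cloud, pick occupied voxels as group centers, and gather neighbors from adjacent voxels, avoiding FPS and brute-force ball queries. The actual coverage-aware sampling in CAGQ is more involved; voxel size, center count and group size below are placeholder values.

```python
# Grid-based neighbor grouping in the spirit of CAGQ (simplified): hash points
# into voxels, sample occupied voxels as centers, gather neighbors from the
# 3x3x3 surrounding voxels.
import numpy as np
from collections import defaultdict

def grid_group(points, voxel=0.2, n_centers=64, k=16, seed=0):
    rng = np.random.default_rng(seed)
    keys = np.floor(points / voxel).astype(int)
    cells = defaultdict(list)
    for i, key in enumerate(map(tuple, keys)):
        cells[key].append(i)
    occupied = list(cells.keys())
    centers = [occupied[i] for i in rng.choice(len(occupied),
                                               min(n_centers, len(occupied)),
                                               replace=False)]
    groups = []
    for c in centers:
        neigh = [i for dx in (-1, 0, 1) for dy in (-1, 0, 1) for dz in (-1, 0, 1)
                 for i in cells.get((c[0] + dx, c[1] + dy, c[2] + dz), [])]
        groups.append(neigh[:k])                     # cap the group size
    return centers, groups

pts = np.random.rand(4096, 3)
centers, groups = grid_group(pts)
print(len(centers), [len(g) for g in groups[:5]])
```
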
Event-based Feature Extraction Using Adaptive Selection Thresholds

Jul 30, 2019
Saeed Afshar, Ying Xu, Jonathan Tapson, André van Schaik, Gregory Cohen

Unsupervised feature extraction algorithms form one of the most important building blocks of machine learning systems. These algorithms are often adapted to the event-based domain to perform online learning in neuromorphic hardware. However, not being designed for this purpose, such algorithms typically require significant simplification during implementation to meet hardware constraints, creating trade-offs with performance. Furthermore, conventional feature extraction algorithms are not designed to generate useful intermediary signals, which are valuable only in the context of neuromorphic hardware limitations. In this work, a novel event-based feature extraction method is proposed that focuses on these issues. The algorithm operates via simple adaptive selection thresholds, which allow a simpler implementation of network homeostasis than previous works by trading off a small amount of information loss in the form of missed events that fall outside the selection thresholds. The behavior of the selection thresholds and the output of the network as a whole are shown to provide uniquely useful signals indicating network weight convergence without the need to access network weights. A novel heuristic method for network size selection is proposed which makes use of noise events and their feature representations. The use of selection thresholds is shown to produce network activation patterns that predict classification accuracy, allowing rapid evaluation and optimization of system parameters without the need to run back-end classifiers. The feature extraction method is tested on both the N-MNIST benchmark dataset and a dataset of airplanes passing through the field of view. Multiple configurations with different classifiers are tested, with the results quantifying the resultant performance gains at each processing stage.

* 15 pages, 9 figures 

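A toy rendering of the adaptive-selection-threshold mechanism as described: an event patch is assigned to its closest feature only if the distance falls inside that feature's threshold; matched features adapt toward the input and tighten their threshold, while every missed event relaxes the thresholds (homeostasis). The constants and update rules are illustrative, not the paper's.

```python
# Toy adaptive-selection-threshold feature learner over a stream of event
# patches: select-and-adapt when inside the winner's threshold, relax all
# thresholds on a missed event.
import numpy as np

rng = np.random.default_rng(0)
n_features, patch_dim = 8, 25                    # e.g. 5x5 event-surface patches
W = rng.random((n_features, patch_dim))
W /= np.linalg.norm(W, axis=1, keepdims=True)
thresh = np.full(n_features, 2.0)                # generous initial thresholds
eta, shrink, relax = 0.01, 0.02, 0.002           # illustrative constants

missed = 0
for _ in range(20000):                           # stream of event-surface patches
    x = rng.random(patch_dim)
    x /= np.linalg.norm(x)
    dist = np.linalg.norm(W - x, axis=1)
    winner = dist.argmin()
    if dist[winner] <= thresh[winner]:           # event selected by a feature
        W[winner] += eta * (x - W[winner])
        thresh[winner] -= shrink                 # become more selective
    else:                                        # missed event: homeostatic relaxation
        thresh += relax
        missed += 1

print("missed events:", missed, "final thresholds:", thresh.round(2))
```

The missed-event count and the threshold trajectory are the kind of intermediary signals the abstract refers to: they can be monitored without reading the weights themselves.
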
Restricting Greed in Training of Generative Adversarial Network

Sep 06, 2018
Haoxuan You, Zhicheng Jiao, Haojun Xu, Jie Li, Ying Wang, Xinbo Gao

Generative adversarial networks (GANs) have received wide research interest in the field of deep learning, and variations of GANs have achieved competitive results on specific tasks. However, the stability of training and the diversity of generated instances are still worth studying further. GAN training can be thought of as a greedy procedure, in which the generative net tries to make the locally optimal choice (minimizing the loss function of the discriminator) in each iteration. Unfortunately, this often makes the generated data resemble only a few modes of the real data and rotate between modes. To alleviate these problems, we propose a novel training strategy to restrict greed in GAN training. With the help of our method, the generated samples can cover more instance modes with a more stable training process. Evaluating our method on several representative datasets, we demonstrate the superiority of the improved training strategy on typical GAN models with different distance metrics.


Deep Reinforcement Learning for Dynamic Treatment Regimes on Medical Registry Data

Jan 28, 2018
Ning Liu, Ying Liu, Brent Logan, Zhiyuan Xu, Jian Tang, Yanzhi Wang

This paper presents the first deep reinforcement learning (DRL) framework to estimate optimal Dynamic Treatment Regimes from observational medical data. This framework is more flexible and adaptive for high-dimensional action and state spaces than existing reinforcement learning methods for modeling real-life complexity in heterogeneous disease progression and treatment choices, with the goal of providing doctors and patients data-driven personalized decision recommendations. The proposed DRL framework comprises (i) a supervised learning step to predict the most probable expert actions, and (ii) a deep reinforcement learning step to estimate the long-term value function of Dynamic Treatment Regimes. Both steps depend on deep neural networks. As a key motivational example, we have implemented the proposed framework on a data set from the Center for International Bone Marrow Transplant Research (CIBMTR) registry database, focusing on the sequence of prevention and treatment for acute and chronic graft-versus-host disease after transplantation. The experimental results demonstrate promising accuracy in predicting human experts' decisions, as well as a high expected reward function for the DRL-based dynamic treatment regimes.


Dataset and Neural Recurrent Sequence Labeling Model for Open-Domain Factoid Question Answering

Sep 01, 2016
Peng Li, Wei Li, Zhengyan He, Xuguang Wang, Ying Cao, Jie Zhou, Wei Xu

While question answering (QA) with neural networks, i.e. neural QA, has achieved promising results in recent years, the lack of large-scale real-world QA datasets is still a challenge for developing and evaluating neural QA systems. To alleviate this problem, we propose WebQA, a large-scale human-annotated real-world QA dataset with more than 42k questions and 556k evidences. As existing neural QA methods treat QA either as a sequence generation problem or as a classification/ranking problem, they face the challenges of expensive softmax computation, handling unseen answers, or a separate candidate answer generation component. In this work, we cast neural QA as a sequence labeling problem and propose an end-to-end sequence labeling model, which overcomes all of the above challenges. Experimental results on WebQA show that our model outperforms the baselines significantly, with an F1 score of 74.69% with word-based input, and the performance drops only 3.72 F1 points with the more challenging character-based input.

* 10 pages, 3 figures, withdraw experimental results on CNN/Daily Mail datasets 

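A small sketch of the "QA as sequence labeling" idea: tag each evidence token with B/I/O labels marking the answer span, so the answer is read off the tags rather than generated or ranked. A question-conditioned BiLSTM tagger stands in for the paper's model; the architecture and sizes below are assumptions.

```python
# Sketch: evidence tokens are tagged B/I/O for the answer span, conditioned on
# a single question vector (a simplified stand-in for the paper's model).
import torch
import torch.nn as nn

class EvidenceTagger(nn.Module):
    def __init__(self, vocab=10000, emb=64, hidden=64, n_tags=3):    # B, I, O
        super().__init__()
        self.emb = nn.Embedding(vocab, emb)
        self.q_enc = nn.LSTM(emb, hidden, batch_first=True)
        self.e_enc = nn.LSTM(emb + hidden, hidden, batch_first=True, bidirectional=True)
        self.tag = nn.Linear(2 * hidden, n_tags)

    def forward(self, question, evidence):
        _, (q, _) = self.q_enc(self.emb(question))                   # (1, B, hidden)
        q = q.squeeze(0).unsqueeze(1).expand(-1, evidence.size(1), -1)
        x = torch.cat([self.emb(evidence), q], dim=-1)               # condition on question
        h, _ = self.e_enc(x)
        return self.tag(h)                                           # (B, T, 3) tag logits

model = EvidenceTagger()
q = torch.randint(0, 10000, (2, 8))
e = torch.randint(0, 10000, (2, 30))
gold = torch.randint(0, 3, (2, 30))                                   # toy B/I/O tags
loss = nn.CrossEntropyLoss()(model(q, e).reshape(-1, 3), gold.reshape(-1))
print(loss.item())
```
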