Research papers and code for "Ping Wang":
Directly learning features from the point cloud has become an active research direction in 3D understanding. Existing learning-based methods usually construct local regions from the point cloud and extract the corresponding features using shared Multi-Layer Perceptrons (MLPs) and max pooling. However, most of these processes do not adequately take the spatial distribution of the point cloud into account, limiting the ability to perceive fine-grained patterns. We design a novel Local Spatial Attention (LSA) module to adaptively generate attention maps according to the spatial distribution of local regions. The feature learning process which integrates with these attention maps can effectively capture the local geometric structure. We further propose the Spatial Feature Extractor (SFE), which constructs a branch architecture, to better aggregate the spatial information with the associated features in each layer of the network. The experiments show that our network, named LSANet, achieves performance on par with or better than state-of-the-art methods when evaluated on challenging benchmark datasets. The source code is available at https://github.com/LinZhuoChen/LSANet.
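As a toy illustration of the idea above, the sketch below generates an attention map from the relative coordinates of a local region via a small shared MLP and a softmax over neighbours, then aggregates features with it. All shapes, weight matrices and the exact aggregation rule here are our assumptions for illustration, not the paper's LSA architecture.

```python
import numpy as np

def local_spatial_attention(rel_coords, features, W1, W2):
    """Toy LSA-style attention (hypothetical shapes and weights).

    rel_coords: (k, 3) coordinates of k neighbours relative to the region centroid.
    features:   (k, d) per-point features of the local region.
    W1, W2:     weights of a small shared MLP mapping coordinates to attention logits.
    """
    h = np.maximum(rel_coords @ W1, 0.0)           # shared MLP on the spatial layout
    logits = h @ W2                                # (k, d) per-point, per-channel logits
    attn = np.exp(logits - logits.max(axis=0))     # numerically stable softmax
    attn /= attn.sum(axis=0, keepdims=True)        # normalise over the k neighbours
    return (attn * features).sum(axis=0)           # attention-weighted aggregation
```

Because the attention map is computed from coordinates rather than features, the aggregation adapts to the spatial distribution of each local region.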

In this paper, we propose a novel iterative convolution-thresholding method (ICTM) that is applicable to a range of variational models for image segmentation. A variational model usually minimizes an energy functional consisting of a fidelity term and a regularization term. In the ICTM, the interface between two different segment domains is implicitly represented by their characteristic functions. The fidelity term is then usually written as a linear functional of the characteristic functions and the regularization term is approximated by a functional of characteristic functions in terms of heat kernel convolution. This allows us to design an iterative convolution-thresholding method to minimize the approximate energy. The method is simple, efficient and enjoys the energy-decaying property. Numerical experiments show that the method is easy to implement, robust and applicable to various image segmentation models.
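A minimal sketch of such an iteration for the two-phase case with a Chan–Vese-type fidelity term is given below. The FFT-based heat-kernel convolution, the parameter values and the stopping rule are our choices, not necessarily the paper's.

```python
import numpy as np

def heat_convolve(u, tau):
    """Heat-kernel convolution G_tau * u via FFT (periodic boundary)."""
    n1, n2 = u.shape
    k1 = 2 * np.pi * np.fft.fftfreq(n1)
    k2 = 2 * np.pi * np.fft.fftfreq(n2)
    mult = np.exp(-tau * (k1[:, None] ** 2 + k2[None, :] ** 2))
    return np.real(np.fft.ifft2(np.fft.fft2(u) * mult))

def ictm_two_phase(f, lam=0.5, tau=1.0, n_iter=50):
    """Sketch of a two-phase ICTM: alternate a convolution step and a
    pointwise thresholding step that minimises the linearised energy."""
    u = (f > f.mean()).astype(float)            # initial characteristic function
    for _ in range(n_iter):
        c1 = f[u > 0.5].mean() if (u > 0.5).any() else f.mean()
        c2 = f[u <= 0.5].mean() if (u <= 0.5).any() else f.mean()
        # heat-kernel convolution approximates the perimeter (regularization) term
        reg = lam * np.sqrt(np.pi / tau) * (heat_convolve(1.0 - u, tau)
                                            - heat_convolve(u, tau))
        phi = (f - c1) ** 2 - (f - c2) ** 2 + reg
        u_new = (phi < 0).astype(float)         # thresholding step
        if np.array_equal(u_new, u):
            break                               # fixed point reached
        u = u_new
    return u
```

Each sweep only needs two FFT convolutions and a pointwise comparison, which is why the method is simple and fast.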

* 13 pages, 4 figures
We study the problem of recovering sparse signals from compressed linear measurements. This problem, often referred to as sparse recovery or sparse reconstruction, has generated a great deal of interest in recent years. To recover the sparse signals, we propose a new method called multiple orthogonal least squares (MOLS), which extends the well-known orthogonal least squares (OLS) algorithm by allowing multiple $L$ indices to be chosen per iteration. Owing to the inclusion of multiple support indices in each selection, the MOLS algorithm converges in far fewer iterations and improves computational efficiency over the conventional OLS algorithm. Theoretical analysis shows that MOLS ($L > 1$) performs exact recovery of all $K$-sparse signals within $K$ iterations if the measurement matrix satisfies the restricted isometry property (RIP) with isometry constant $\delta_{LK} < \frac{\sqrt{L}}{\sqrt{K} + 2 \sqrt{L}}.$ The recovery performance of MOLS in the noisy scenario is also studied. It is shown that stable recovery of sparse signals can be achieved with the MOLS algorithm when the signal-to-noise ratio (SNR) scales linearly with the sparsity level of input signals.
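The greedy structure of MOLS can be sketched as follows. Note that, for brevity, this sketch selects the $L$ atoms per iteration by residual correlation (an OMP-style proxy), not by the exact OLS selection criterion used in the paper.

```python
import numpy as np

def mols(y, Phi, K, L=2):
    """Sketch of multiple orthogonal least squares (simplified selection rule).

    Per iteration: pick L new atoms most correlated with the residual, then
    re-solve the least squares problem on the enlarged support so the new
    residual is orthogonal to all selected atoms.
    """
    m, n = Phi.shape
    support, r = [], y.copy()
    while len(support) < K:
        corr = np.abs(Phi.T @ r)
        corr[support] = -np.inf                  # never reselect an atom
        picks = np.argsort(corr)[-L:]
        support.extend(picks.tolist())
        x_s, *_ = np.linalg.lstsq(Phi[:, support], y, rcond=None)
        r = y - Phi[:, support] @ x_s            # orthogonal residual update
    x = np.zeros(n)
    x[support] = x_s
    return x
```

Selecting $L$ indices per iteration is exactly what reduces the iteration count relative to OLS, at the cost of a slightly larger final support.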

We consider a new variant of AMSGrad. AMSGrad [RKK18] is a popular adaptive gradient-based optimization algorithm that is widely used in training deep neural networks. Our new variant of the algorithm assumes that mini-batch gradients in consecutive iterations have some underlying structure, which makes the gradients sequentially predictable. By exploiting this predictability and some ideas from the field of optimistic online learning, the new algorithm can accelerate convergence and enjoys a tighter regret bound. We conduct experiments on training various neural networks on several datasets to show that the proposed method speeds up convergence in practice.
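The optimistic idea can be sketched as follows: after the usual AMSGrad step, take an extra step using a guess of the next gradient. In this sketch the guess is simply the current first moment; the paper's predictor may well differ, so treat this as a schematic, not the proposed algorithm.

```python
import numpy as np

def optimistic_amsgrad(grad_fn, w0, lr=0.1, b1=0.9, b2=0.999, eps=1e-8, steps=200):
    """Hedged sketch of an optimistic AMSGrad variant (our simplification)."""
    w_base = np.asarray(w0, dtype=float).copy()
    m = np.zeros_like(w_base)
    v = np.zeros_like(w_base)
    v_hat = np.zeros_like(w_base)
    w = w_base.copy()                      # optimistic iterate where gradients are taken
    for _ in range(steps):
        g = grad_fn(w)
        m = b1 * m + (1 - b1) * g          # first moment
        v = b2 * v + (1 - b2) * g * g      # second moment
        v_hat = np.maximum(v_hat, v)       # AMSGrad: monotone second-moment estimate
        denom = np.sqrt(v_hat) + eps
        w_base = w_base - lr * m / denom   # regular AMSGrad step from the base point
        w = w_base - lr * m / denom        # optimistic step with the predicted gradient
    return w_base
```

When consecutive gradients really are predictable, the optimistic step moves the iterate toward where the next update would have taken it anyway, which is the source of the speed-up.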

Previous works utilized a "smaller-norm-less-important" criterion to prune filters with smaller norm values in a convolutional neural network. In this paper, we analyze this norm-based criterion and point out that its effectiveness depends on two requirements that are not always met: (1) the norm deviation of the filters should be large; (2) the minimum norm of the filters should be small. To solve this problem, we propose a novel filter pruning method, namely Filter Pruning via Geometric Median (FPGM), to compress the model regardless of those two requirements. Unlike previous methods, FPGM compresses CNN models by determining and pruning those filters with redundant information via the Geometric Median (GM), rather than those with "relatively less" importance. When applied to two image classification benchmarks, our method validates its usefulness and strengths. Notably, on CIFAR-10, FPGM reduces FLOPs by more than 52% on ResNet-110 with even a 2.69% relative accuracy improvement. Besides, on ILSVRC-2012, FPGM reduces FLOPs by more than 42% on ResNet-101 without a top-5 accuracy drop, which advances the state of the art.
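The core computation can be sketched in a few lines: find the geometric median of a layer's filters (here via Weiszfeld iterations, one standard choice) and prune the filters closest to it, since they are the most replaceable by the others. The pruning ratio and iteration count are illustrative, not the paper's settings.

```python
import numpy as np

def fpgm_prune_mask(filters, prune_ratio=0.3, n_iter=50):
    """Sketch of geometric-median filter pruning.

    filters: (num_filters, ...) array, one filter per leading index.
    Returns a boolean keep-mask over the filters.
    """
    x = filters.reshape(len(filters), -1).astype(float)
    gm = x.mean(axis=0)                        # Weiszfeld iteration for the geometric median
    for _ in range(n_iter):
        d = np.maximum(np.linalg.norm(x - gm, axis=1), 1e-12)
        w = 1.0 / d
        gm = (w[:, None] * x).sum(axis=0) / w.sum()
    dist = np.linalg.norm(x - gm, axis=1)
    n_prune = int(len(x) * prune_ratio)
    prune_idx = np.argsort(dist)[:n_prune]     # nearest to the median = most redundant
    keep = np.ones(len(x), dtype=bool)
    keep[prune_idx] = False
    return keep
```

Note the contrast with norm-based pruning: here a filter is dropped for being close to the "centre" of the layer, regardless of whether its norm is large or small.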

* 9 pages
For massive data, the family of subsampling algorithms is popular to downsize the data volume and reduce computational burden. Existing studies focus on approximating the ordinary least squares estimate in linear regression, where statistical leverage scores are often used to define subsampling probabilities. In this paper, we propose fast subsampling algorithms to efficiently approximate the maximum likelihood estimate in logistic regression. We first establish consistency and asymptotic normality of the estimator from a general subsampling algorithm, and then derive optimal subsampling probabilities that minimize the asymptotic mean squared error of the resultant estimator. An alternative minimization criterion is also proposed to further reduce the computational cost. The optimal subsampling probabilities depend on the full data estimate, so we develop a two-step algorithm to approximate the optimal subsampling procedure. This algorithm is computationally efficient and has a significant reduction in computing time compared to the full data approach. Consistency and asymptotic normality of the estimator from a two-step algorithm are also established. Synthetic and real data sets are used to evaluate the practical performance of the proposed method.
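The two-step procedure described above can be sketched as follows. The subsampling probabilities here follow an |y − p(x)|·‖x‖-style criterion and the helper names are ours; they illustrate the shape of the algorithm, not its exact form in the paper.

```python
import numpy as np

def fit_logistic(X, y, w=None, n_iter=50):
    """Weighted logistic MLE via Newton's method (minimal helper)."""
    n, d = X.shape
    if w is None:
        w = np.ones(n)
    beta = np.zeros(d)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        g = X.T @ (w * (y - p))                       # weighted score
        H = (X * (w * p * (1 - p))[:, None]).T @ X    # weighted information matrix
        beta += np.linalg.solve(H + 1e-8 * np.eye(d), g)
    return beta

def two_step_subsample(X, y, r0=200, r=400, seed=0):
    """Sketch of two-step optimal subsampling for logistic regression."""
    rng = np.random.default_rng(seed)
    n = len(y)
    # Step 1: uniform pilot subsample gives a rough estimate of beta
    pilot = rng.choice(n, size=r0, replace=False)
    beta0 = fit_logistic(X[pilot], y[pilot])
    # Step 2: probabilities proportional to |y - p| * ||x|| (an mVc-style criterion)
    p = 1.0 / (1.0 + np.exp(-X @ beta0))
    scores = np.abs(y - p) * np.linalg.norm(X, axis=1)
    probs = scores / scores.sum()
    idx = rng.choice(n, size=r, replace=True, p=probs)
    # inverse-probability weights correct the sampling bias in the final MLE
    return fit_logistic(X[idx], y[idx], w=1.0 / (n * probs[idx]))
```

Only the pilot fit and one pass over the full data to compute probabilities touch all n observations; the final MLE is solved on r ≪ n points.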

Many monocular visual SLAM algorithms are derived from incremental structure-from-motion (SfM) methods. This work proposes a novel monocular SLAM method which integrates recent advances made in global SfM. In particular, we present two main contributions to visual SLAM. First, we solve the visual odometry problem by a novel rank-1 matrix factorization technique which is more robust to the errors in map initialization. Second, we adopt a recent global SfM method for the pose-graph optimization, which leads to a multi-stage linear formulation and enables L1 optimization for better robustness to false loops. The combination of these two approaches generates more robust reconstruction and is significantly faster (4X) than recent state-of-the-art SLAM systems. We also present a new dataset recorded with ground truth camera motion in a Vicon motion capture room, and compare our method to prior systems on it and established benchmark datasets.

* 3DV 2017 Project Page: https://frobelbest.github.io/gslam
An important and difficult challenge in building computational models for narratives is the automatic evaluation of narrative quality. Quality evaluation connects narrative understanding and generation as generation systems need to evaluate their own products. To circumvent difficulties in acquiring annotations, we employ upvotes in social media as an approximate measure of story quality. We collected 54,484 answers from a crowd-powered question-and-answer website, Quora, and then used active learning to build a classifier that labeled 28,320 answers as stories. To predict the number of upvotes without the use of social network features, we create neural networks that model textual regions and the interdependence among regions, which serve as strong benchmarks for future research. To the best of our knowledge, this is the first large-scale study of automatic evaluation of narrative quality.

* 7 pages, 2 figures. Accepted at the 2017 IJCAI conference
Object proposals are an ensemble of bounding boxes with high potential to contain objects. In order to determine a small set of proposals with high recall, a common scheme is to extract multiple features followed by a ranking algorithm, which however incurs two major challenges: (1) the ranking model often imposes pairwise constraints between proposals, rendering the problem far from an efficient training/testing phase; (2) linear kernels are utilized due to the computational and memory bottleneck of training a kernelized model. In this paper, we remedy these two issues by suggesting a kernelized partial ranking model. In particular, we demonstrate that (i) our partial ranking model reduces the number of constraints from $O(n^2)$ to $O(nk)$, where $n$ is the number of all potential proposals for an image but we are only interested in the top-$k$ of them that have the largest overlap with the ground truth; (ii) we permit non-linear kernels in our model, which are often superior to a linear classifier in terms of accuracy. To mitigate the computational and memory issues, we introduce a consistent weighted sampling (CWS) paradigm that approximates the non-linear kernel and facilitates efficient learning. In fact, as we show, training a linear CWS model amounts to learning a kernelized model. Extensive experiments demonstrate that, equipped with the non-linear kernel and the partial ranking algorithm, recall at top-$k$ proposals can be substantially improved.

* Pattern Recognition, 2017
Imaging genetics research has essentially focused on discovering unique and co-association effects, while typically failing to identify outliers or atypical objects in genetic as well as non-genetic variables. Identifying significant outliers is an essential and challenging issue for imaging genetics and multiple-source data analysis; identified outliers then need to be examined, for instance, for transcription errors. First, we address the influence function (IF) of the kernel mean element, the kernel covariance operator, the kernel cross-covariance operator, kernel canonical correlation analysis (kernel CCA) and multiple kernel CCA. Second, we propose an IF of multiple kernel CCA, which can be applied to more than two datasets. Third, we propose a visualization method to detect influential observations of multiple sources of data based on the IF of kernel CCA and multiple kernel CCA. Finally, the proposed methods are capable of analyzing outliers of subjects usually found in biomedical applications, in which the number of dimensions is large. To examine the outliers, we use the stem-and-leaf display. Experiments on both synthesized and imaging genetics data (e.g., SNP, fMRI, and DNA methylation) demonstrate that the proposed visualization can be applied effectively.

* arXiv admin note: substantial text overlap with arXiv:1602.05563
Neural abstractive text summarization (NATS) has received a lot of attention in the past few years from both industry and academia. In this paper, we introduce an open-source toolkit, namely LeafNATS, for training and evaluation of different sequence-to-sequence based models for the NATS task, and for deploying the pre-trained models to real-world applications. The toolkit is modularized and extensible in addition to maintaining competitive performance in the NATS task. A live news blogging system has also been implemented to demonstrate how these models can aid blog/news editors by providing them suggestions of headlines and summaries of their articles.

* Accepted by NAACL-HLT 2019 demo track
Accurately predicting the time of occurrence of an event of interest is a critical problem in longitudinal data analysis. One of the main challenges in this context is the presence of instances whose event outcomes become unobservable after a certain time point or when some instances do not experience any event during the monitoring period. Such a phenomenon is called censoring which can be effectively handled using survival analysis techniques. Traditionally, statistical approaches have been widely developed in the literature to overcome this censoring issue. In addition, many machine learning algorithms are adapted to effectively handle survival data and tackle other challenging problems that arise in real-world data. In this survey, we provide a comprehensive and structured review of the representative statistical methods along with the machine learning techniques used in survival analysis and provide a detailed taxonomy of the existing methods. We also discuss several topics that are closely related to survival analysis and illustrate several successful applications in various real-world application domains. We hope that this paper will provide a more thorough understanding of the recent advances in survival analysis and offer some guidelines on applying these approaches to solve new problems that arise in applications with censored data.
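As a minimal illustration of how survival analysis handles censoring, the textbook Kaplan–Meier estimator is sketched below; it is a standard statistical baseline of the kind this survey covers, not a method proposed in it.

```python
import numpy as np

def kaplan_meier(times, events):
    """Kaplan--Meier survival curve for right-censored data.

    times:  observed times (event or censoring time for each instance).
    events: 1 if the event occurred, 0 if the observation was censored.
    Returns a list of (event_time, survival_probability) pairs.
    """
    order = np.argsort(times)
    t, e = np.asarray(times)[order], np.asarray(events)[order]
    uniq = np.unique(t[e == 1])              # distinct event times
    surv, s = [], 1.0
    for u in uniq:
        at_risk = np.sum(t >= u)             # still under observation just before u
        d = np.sum((t == u) & (e == 1))      # events occurring at u
        s *= 1.0 - d / at_risk               # product-limit update
        surv.append((u, s))
    return surv
```

Censored instances contribute to the risk sets of all event times before their censoring time and then drop out, which is exactly why discarding them would bias the estimate.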

Maintenance work zones on the road network affect the normal travel of vehicles and increase the risk of traffic accidents. Traffic characteristic analysis in maintenance work zones is a basis for related research such as layout design, traffic control and safety assessment. Due to the difficulty of acquiring vehicle microscopic behaviour data, traditional traffic characteristic analysis has mainly focused on macroscopic characteristics. With the development of data acquisition technology, it has become much easier to obtain a large amount of microscopic behaviour data, which lays a good foundation for analysing traffic characteristics from a new point of view. This paper puts forward a method for expressing and displaying the distribution of vehicle behaviour in maintenance work zones. Large amounts of data can be obtained using portable vehicle microscopic behaviour data acquisition devices. Based on these data, endpoint detection is used to automatically extract the segments of behaviour data with violent fluctuations, i.e. segments where vehicles take actions such as accelerating or turning. Using a support vector machine classifier, the specific behaviour types of the extracted segments can be identified, and together with a data combination method, a total of ten types of behaviour can be recognized. Kernel density analysis is then used to cluster the different behaviour types of all passing vehicles and show their distribution on maps. With this method, how vehicles travel through maintenance work zones and how different vehicle behaviours are distributed can be displayed intuitively on maps, which constitutes a novel traffic characteristic and can shed light on maintenance work zone research such as safety assessment and design methods.
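One common realisation of the endpoint-detection step is a short-time-variance threshold, sketched below: windows whose variance greatly exceeds the typical (median) window variance are flagged as "violent fluctuation" segments. The window length and threshold factor are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def extract_active_segments(signal, win=10, k=3.0):
    """Flag contiguous runs of high-variance windows in a 1-D behaviour signal.

    Returns (start_index, end_index) pairs of samples covered by active windows,
    i.e. candidate manoeuvre segments (braking, turning, ...).
    """
    n = len(signal) - win + 1
    var = np.array([signal[i:i + win].var() for i in range(n)])
    active = var > k * np.median(var)          # threshold relative to typical variance
    segments, start = [], None
    for i, a in enumerate(active):
        if a and start is None:
            start = i                          # run of active windows begins
        elif not a and start is not None:
            segments.append((start, i + win - 2))
            start = None
    if start is not None:
        segments.append((start, len(signal) - 1))
    return segments
```

The extracted segments would then be passed to a classifier (an SVM in the paper) to label the specific behaviour type.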

* 14 pages, 12 figures, 1 table
The study of healthy brain development helps to better understand the brain transformations and brain connectivity patterns which occur from childhood to adulthood. This study presents a sparse machine learning solution across whole-brain functional connectivity (FC) measures of three sets of data, derived from resting-state functional magnetic resonance imaging (rs-fMRI) and task fMRI data, including a working memory n-back task (nb-fMRI) and an emotion identification task (em-fMRI). These multi-modal image data are collected on a sample of adolescents from the Philadelphia Neurodevelopmental Cohort (PNC) for the prediction of brain age. Due to the extremely large variable-to-instance ratio of the PNC data, a high-dimensional matrix with several irrelevant and highly correlated features is generated, and hence a pattern learning approach is necessary to extract significant features. We propose a sparse learner based on the residual errors along the estimation of an inverse problem for the extreme learning machine (ELM) neural network. The purpose of the approach is to overcome the overlearning problem through pruning of several redundant features and their corresponding output weights. The proposed multimodal sparse ELM classifier based on residual errors (RES-ELM) is highly competitive in terms of classification accuracy compared to its counterparts, such as the conventional ELM and the sparse Bayesian learning ELM.
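For readers unfamiliar with the base model, a basic ELM is sketched below: a fixed random hidden layer followed by output weights solved in closed form by (ridge) least squares. The residual-based pruning that distinguishes RES-ELM is deliberately omitted here; this is only the conventional baseline.

```python
import numpy as np

def elm_train(X, T, n_hidden=64, seed=0, ridge=1e-3):
    """Train a basic extreme learning machine.

    The input-to-hidden weights are random and never trained; only the
    output weights beta are fitted, via regularised least squares.
    """
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)                    # random nonlinear feature map
    beta = np.linalg.solve(H.T @ H + ridge * np.eye(n_hidden), H.T @ T)
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta
```

Because training reduces to one linear solve, the ELM is fast even when the feature dimension is large, which is part of its appeal for high variable-to-instance-ratio data.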

Batch Normalization (BN) improves both convergence and generalization in training neural networks. This work studies these phenomena theoretically. We analyze BN using a basic block of neural networks, consisting of a kernel layer, a BN layer, and a nonlinear activation function. This basic network helps us understand the impact of BN in three respects. First, by viewing BN as an implicit regularizer, BN can be decomposed into population normalization (PN) and gamma decay as an explicit regularization. Second, the learning dynamics of BN and this regularization show that training converges with a large maximum and effective learning rate. Third, the generalization of BN is explored using statistical mechanics. Experiments demonstrate that BN in convolutional neural networks shares the same regularization traits as in the above analyses.
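For reference, the training-mode BN forward pass being analyzed is just the following standardise-then-rescale computation; the paper's contribution is decomposing its implicit effect into population normalization plus a gamma-decay regularizer, which this snippet does not attempt to show.

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Batch Normalization forward pass over a mini-batch (training mode).

    x: (batch, features). Each feature is standardised with its batch
    statistics, then rescaled by gamma and shifted by beta.
    """
    mu = x.mean(axis=0)                       # per-feature batch mean
    var = x.var(axis=0)                       # per-feature batch variance
    x_hat = (x - mu) / np.sqrt(var + eps)     # standardised activations
    return gamma * x_hat + beta
```

Because mu and var are mini-batch estimates rather than population quantities, they inject statistics-dependent noise into training, which is the source of the implicit regularization the paper analyzes.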

* Preprint. Work in progress. 17 pages
Many Natural Language Processing and Computational Linguistics applications involve the generation of new texts based on existing texts, such as summarization, text simplification and machine translation. However, a serious problem has haunted these applications for decades: how to automatically and accurately assess the quality of their output. In this paper, we present some preliminary results on one especially useful and challenging problem in NLP system evaluation: how to pinpoint content differences between two text passages (especially large passages such as articles and books). Our idea is intuitive and very different from existing approaches. We treat one text passage as a small knowledge base and ask it a large number of questions to exhaustively identify all content points in it. By comparing the correctly answered questions from two text passages, we are able to compare their content precisely. An experiment using the 2007 DUC summarization corpus clearly shows promising results.

* AAAI 2018
Vehicle Routing Problem with Private fleet and common Carrier (VRPPC) has been proposed to help a supplier manage package delivery services from a single depot to multiple customers. Most of the existing VRPPC works consider deterministic parameters which may not be practical and uncertainty has to be taken into account. In this paper, we propose the Optimal Stochastic Delivery Planning with Deadline (ODPD) to help a supplier plan and optimize the package delivery. The aim of ODPD is to service all customers within a given deadline while considering the randomness in customer demands and traveling time. We formulate the ODPD as a stochastic integer programming, and use the cardinality minimization approach for calculating the deadline violation probability. To accelerate computation, the L-shaped decomposition method is adopted. We conduct extensive performance evaluation based on real customer locations and traveling time from Google Map.

* 7 pages, 6 figures, Vehicular Technology Conference (VTC fall), 2017 IEEE 86th
Discovering pulsars is a significant and meaningful research topic in radio astronomy. With the advent of astronomical instruments such as the Five-hundred-meter Aperture Spherical Telescope (FAST) in China, data volumes and data rates are growing exponentially. This necessitates a focus on artificial intelligence (AI) technologies that can perform automatic pulsar candidate identification to mine large astronomical data sets. Automatic pulsar candidate identification can be considered the task of determining potential candidates for further investigation while eliminating noise from radio frequency interference and other non-pulsar signals. It is hard to improve the performance of deep convolutional neural network (DCNN)-based pulsar identification, because the limited training samples prevent the network from being designed deep enough to learn good features, and because of the crucial class imbalance caused by the very small number of real pulsar samples. To address these problems, we propose a framework that combines a deep convolutional generative adversarial network (DCGAN) with a support vector machine (SVM) to deal with the class imbalance problem and improve pulsar identification accuracy. The DCGAN is used as a sample generation and feature learning model, and the SVM is adopted as the classifier for predicting candidates' labels in the inference stage. The proposed framework is a novel technique that not only solves the class imbalance problem but also learns discriminative feature representations of pulsar candidates instead of computing hand-crafted features in preprocessing steps, which makes it more accurate for automatic pulsar candidate selection. Experiments on two pulsar datasets verify the effectiveness and efficiency of the proposed method.

* arXiv admin note: text overlap with arXiv:1603.05166 by other authors
Many unsupervised kernel methods rely on the estimation of the kernel covariance operator (kernel CO) or kernel cross-covariance operator (kernel CCO). Both kernel CO and kernel CCO are sensitive to contaminated data, even when bounded positive definite kernels are used. To the best of our knowledge, there are few well-founded robust kernel methods for statistical unsupervised learning. In addition, while the influence function (IF) of an estimator can characterize its robustness, asymptotic properties and standard error, the IF of a standard kernel canonical correlation analysis (standard kernel CCA) has not been derived yet. To fill this gap, we first propose a robust kernel covariance operator (robust kernel CO) and a robust kernel cross-covariance operator (robust kernel CCO) based on a generalized loss function instead of the quadratic loss function. Second, we derive the IF for robust kernel CCO and standard kernel CCA. Using the IF of the standard kernel CCA, we can detect influential observations from two sets of data. Finally, we propose a method based on the robust kernel CO and the robust kernel CCO, called robust kernel CCA, which is less sensitive to noise than the standard kernel CCA. The introduced principles can also be applied to many other kernel methods involving kernel CO or kernel CCO. Our experiments on synthesized data and imaging genetics analysis demonstrate that the proposed IF of standard kernel CCA can identify outliers. It is also seen that the proposed robust kernel CCA method performs better for ideal and contaminated data than the standard kernel CCA.

* arXiv admin note: text overlap with arXiv:1602.05563
Inferring topics from an overwhelming amount of short texts is a critical but challenging task for many content analysis applications, such as content characterization, user interest profiling, and emerging topic detection. Existing methods such as probabilistic latent semantic analysis (PLSA) and latent Dirichlet allocation (LDA) cannot solve this problem very well, since only very limited word co-occurrence information is available in short texts. This paper studies how to incorporate external word correlation knowledge into short texts to improve the coherence of topic modeling. Based on recent results in word embeddings, which learn semantic representations for words from a large corpus, we introduce a novel method, the Embedding-based Topic Model (ETM), to learn latent topics from short texts. ETM not only solves the problem of very limited word co-occurrence information by aggregating short texts into long pseudo-texts, but also utilizes a Markov Random Field regularized model that gives correlated words a better chance of being put into the same topic. Experiments on real-world datasets validate the effectiveness of our model compared with state-of-the-art models.
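The aggregation stage can be sketched as follows: embed each short text as its average word vector, cluster the texts (here with plain k-means and a farthest-point initialisation), and concatenate each cluster into a long pseudo-text for a downstream topic model. This is a hypothetical helper illustrating the idea, not ETM's actual aggregation procedure.

```python
import numpy as np

def build_pseudo_texts(docs, word_vecs, n_pseudo=2, n_iter=20):
    """Aggregate short texts into long pseudo-texts via embedding clustering.

    docs:      list of token lists.
    word_vecs: dict mapping words to embedding vectors (assumed to cover docs).
    """
    emb = np.stack([
        np.mean([word_vecs[w] for w in doc if w in word_vecs], axis=0)
        for doc in docs
    ])
    centers = [emb[0]]                                 # greedy farthest-point init
    for _ in range(n_pseudo - 1):
        d = np.min([((emb - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(emb[int(np.argmax(d))])
    centers = np.stack(centers)
    for _ in range(n_iter):                            # Lloyd iterations
        assign = np.argmin(((emb[:, None, :] - centers[None]) ** 2).sum(-1), axis=1)
        for k in range(n_pseudo):
            if np.any(assign == k):
                centers[k] = emb[assign == k].mean(axis=0)
    pseudo = [[] for _ in range(n_pseudo)]             # concatenate clusters
    for doc, k in zip(docs, assign):
        pseudo[int(k)].extend(doc)
    return pseudo
```

The pseudo-texts restore enough word co-occurrence for a standard topic model, which is the problem the short texts could not solve on their own.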
