Models, code, and papers for "Jie An":

Mesh Variational Autoencoders with Edge Contraction Pooling

Aug 07, 2019
Yu-Jie Yuan, Yu-Kun Lai, Jie Yang, Hongbo Fu, Lin Gao

3D shape analysis is an important research topic in computer vision and graphics. While existing methods have generalized image-based deep learning to meshes using graph-based convolutions, the lack of an effective pooling operation restricts the learning capability of their networks. In this paper, we propose a novel pooling operation for mesh datasets with the same connectivity but different geometry, by building a mesh hierarchy using mesh simplification. For this purpose, we develop a modified mesh simplification method to avoid generating highly irregularly sized triangles. Our pooling operation effectively encodes the correspondence between coarser and finer meshes in the hierarchy. We then present a variational auto-encoder structure with the edge contraction pooling and graph-based convolutions, to explore probability latent spaces of 3D surfaces. Our network requires far fewer parameters than the original mesh VAE and thus can handle denser models thanks to our new pooling operation and convolutional kernels. Our evaluation also shows that our method has better generalization ability and is more reliable in various applications, including shape generation, shape interpolation and shape embedding.
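
The pooling idea in the abstract above, where each coarse vertex stands for the fine vertices merged into it by edge contraction, can be illustrated with a minimal sketch. This is not the authors' implementation: the `fine_to_coarse` mapping, the function names, and the choice of averaging for pooling and copying for unpooling are assumptions made for illustration only.

```python
def pool_features(features, fine_to_coarse, num_coarse):
    """Average the feature vectors of all fine vertices mapped to each coarse vertex.

    features: list of per-vertex feature lists on the fine mesh.
    fine_to_coarse: hypothetical mapping, fine vertex index -> coarse vertex index.
    """
    dim = len(features[0])
    sums = [[0.0] * dim for _ in range(num_coarse)]
    counts = [0] * num_coarse
    for fine_idx, coarse_idx in enumerate(fine_to_coarse):
        counts[coarse_idx] += 1
        for d in range(dim):
            sums[coarse_idx][d] += features[fine_idx][d]
    # max(c, 1) guards against coarse vertices that received no fine vertex.
    return [[s / max(c, 1) for s in row] for row, c in zip(sums, counts)]


def unpool_features(pooled, fine_to_coarse):
    """Copy each coarse feature vector back to the fine vertices it represents."""
    return [list(pooled[coarse_idx]) for coarse_idx in fine_to_coarse]


# Contracting the edge between fine vertices 0 and 1 merges them into coarse vertex 0.
fine_features = [[0.0, 0.0], [2.0, 4.0], [6.0, 6.0]]
mapping = [0, 0, 1]
coarse_features = pool_features(fine_features, mapping, num_coarse=2)
restored = unpool_features(coarse_features, mapping)
```

Because the mapping is fixed by the mesh hierarchy, the same table can serve both the encoder (pooling) and the decoder (unpooling) in a sketch like this.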


SDM-NET: Deep Generative Network for Structured Deformable Mesh

Sep 03, 2019
Lin Gao, Jie Yang, Tong Wu, Yu-Jie Yuan, Hongbo Fu, Yu-Kun Lai, Hao Zhang

We introduce SDM-NET, a deep generative neural network which produces structured deformable meshes. Specifically, the network is trained to generate a spatial arrangement of closed, deformable mesh parts, which respect the global part structure of a shape collection, e.g., chairs, airplanes, etc. Our key observation is that while the overall structure of a 3D shape can be complex, the shape can usually be decomposed into a set of parts, each homeomorphic to a box, and the finer-scale geometry of the part can be recovered by deforming the box. The architecture of SDM-NET is that of a two-level variational autoencoder (VAE). At the part level, a PartVAE learns a deformable model of part geometries. At the structural level, we train a Structured Parts VAE (SP-VAE), which jointly learns the part structure of a shape collection and the part geometries, ensuring coherence between global shape structure and surface details. Through extensive experiments and comparisons with the state-of-the-art deep generative models of shapes, we demonstrate the superiority of SDM-NET in generating meshes with visual quality, flexible topology, and meaningful structures, which benefit shape interpolation and other subsequent modeling tasks.

* Conditionally accepted to SIGGRAPH Asia 2019

Wasserstein Distance Guided Cross-Domain Learning

Oct 14, 2019
Jie Su

Domain adaptation aims to generalise a high-performance learner to a target domain (unlabelled data) by leveraging knowledge from a source domain (richly labelled data) that comes from a different but related distribution. Assuming the source and target domain data (e.g. images) come from a joint distribution but follow different marginal distributions, domain adaptation aims to infer the joint distribution from the source and target domains in order to learn domain-invariant features. In this study, I extend the existing state-of-the-art approach to the domain adaptation problem. In particular, I propose a new approach to infer the joint distribution of images from different distributions, namely Wasserstein Distance Guided Cross-Domain Learning (WDGCDL). WDGCDL applies the Wasserstein distance to estimate the divergence between the source and target distributions, which provides good gradient properties and a promising generalisation bound. Moreover, to tackle the training difficulty of the proposed framework, I propose two different training schemes for stable training. Qualitative results show that this new approach is superior to the existing state-of-the-art methods on the standard domain adaptation benchmark.

* 47 pages, Master Thesis 
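
To make the divergence measure in the abstract above concrete, here is a minimal sketch of the Wasserstein-1 distance between two one-dimensional empirical samples of equal size. It is not the thesis's training procedure, which the abstract does not detail; the function name and the toy "source" and "target" samples are assumptions for illustration. In one dimension with equal sample sizes, the optimal transport plan simply matches sorted values.

```python
def wasserstein_1d(source, target):
    """Wasserstein-1 distance between two equal-size 1D empirical samples.

    With equal sample sizes, the optimal coupling pairs the i-th smallest
    source value with the i-th smallest target value.
    """
    if not source or len(source) != len(target):
        raise ValueError("samples must be non-empty and of equal size")
    xs, ys = sorted(source), sorted(target)
    return sum(abs(x - y) for x, y in zip(xs, ys)) / len(xs)


# Toy feature values from a hypothetical source and target domain.
source_features = [0.0, 1.0, 2.0]
target_features = [3.0, 1.0, 2.0]
shift = wasserstein_1d(source_features, target_features)
```

Unlike divergences that saturate when the two samples do not overlap, this quantity keeps growing as the samples move apart, which is the gradient-friendly property the abstract alludes to.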

A family of neighborhood contingency logics

Sep 24, 2018
Jie Fan

This article proposes the axiomatizations of contingency logics of various natural classes of neighborhood frames. In particular, by defining a suitable canonical neighborhood function, we give sound and complete axiomatizations of monotone contingency logic and regular contingency logic, thereby answering two open questions raised by Bakhtiari, van Ditmarsch, and Hansen. The canonical function is inspired by a function proposed by Kuhn in 1995. We show that Kuhn's function is actually equal to a related function originally given by Humberstone.

* 18 pages. arXiv admin note: substantial text overlap with arXiv:1802.03516 

Deep Q-Networks for Accelerating the Training of Deep Neural Networks

Jul 13, 2017
Jie Fu

In this paper, we propose a principled deep reinforcement learning (RL) approach that is able to accelerate the convergence rate of general deep neural networks (DNNs). With our approach, a deep RL agent (synonym for optimizer in this work) is used to automatically learn policies about how to schedule learning rates during the optimization of a DNN. The state features of the agent are learned from the weight statistics of the optimizee during training. The reward function of this agent is designed to learn policies that minimize the optimizee's training time given a certain performance goal. The actions of the agent correspond to changing the learning rate for the optimizee during training. As far as we know, this is the first attempt to use deep RL to learn how to optimize a large-sized DNN. We perform extensive experiments on a standard benchmark dataset and demonstrate the effectiveness of the policies learned by our approach.

* We choose to withdraw this paper. The DQN itself has too many hyperparameters, which makes it almost impossible to be applied to reasonably large datasets. In the later versions (from v4) with SGDR experiments, it seems that the agent only performs random actions 

Importance sampling-based approximate optimal planning and control

Dec 16, 2016
Jie Fu

In this paper, we propose a sampling-based planning and optimal control method for nonlinear systems under non-differentiable constraints. Motivated by developing scalable planning algorithms, we consider the optimal motion plan to be a feedback controller that can be approximated by a weighted sum of given bases. Given this approximate optimal control formulation, our main contribution is to introduce importance sampling, specifically, the model-reference adaptive search algorithm, to iteratively compute the optimal weight parameters, i.e., the weights corresponding to the optimal policy function approximation given the chosen bases. The key idea is to perform the search by iteratively estimating a parametrized distribution which converges to a Dirac delta concentrated on the globally optimal weights. Then, using this direct policy search, we incorporate trajectory-based verification to ensure that, for a class of nonlinear systems, the obtained policy is not only optimal but also robust to bounded disturbances. The correctness and efficiency of the methods are demonstrated through numerical experiments including linear systems with a nonlinear cost function and motion planning for a Dubins car.

* submitted to IEEE ACC 2017 

A Novel Block-DCT and PCA Based Image Perceptual Hashing Algorithm

Jun 18, 2013
Zeng Jie

Image perceptual hashing finds applications in content indexing, large-scale image database management, certification and authentication, and digital watermarking. We propose a Block-DCT and PCA based image perceptual hash in this article and explore the algorithm in the application of tamper detection. The main idea of the algorithm is to integrate the color histogram and DCT coefficients of image blocks as perceptual features, then to compress the perceptual features into an inter-feature with PCA, and finally to threshold the result to create a robust hash. The robustness and discrimination properties of the proposed algorithm are evaluated in detail. Our algorithms first construct a secondary image, derived from the input image by pseudo-randomly extracting features that approximately capture semi-global geometric characteristics. From the secondary image (which does not perceptually resemble the input), we further extract the final features which can be used as a hash value (and can be further suitably quantized). In this paper, we use spectral matrix invariants as embodied by Singular Value Decomposition. Surprisingly, formation of the secondary image turns out to be quite important since it not only introduces further robustness, but also enhances the security properties. Indeed, our experiments reveal that our hashing algorithms extract most of the geometric information from the images and hence are robust to severe perturbations (e.g. up to 50% cropping by area with 20 degree rotations) on images while avoiding misclassification. Experimental results show that the proposed image perceptual hash algorithm can effectively address the tamper detection problem with advantageous robustness and discrimination.

* 7 pages, 5 figures

Targeted Estimation of Heterogeneous Treatment Effect in Observational Survival Analysis

Oct 20, 2019
Jie Zhu, Blanca Gallego

The aim of clinical effectiveness research using repositories of electronic health records is to identify what health interventions 'work best' in real-world settings. Since there are several reasons why the net benefit of intervention may differ across patients, current comparative effectiveness literature focuses on investigating heterogeneous treatment effect and predicting whether an individual might benefit from an intervention. The majority of this literature has concentrated on the estimation of the effect of treatment on binary outcomes. However, many medical interventions are evaluated in terms of their effect on future events, which are subject to loss to follow-up. In this study, we describe a framework for the estimation of heterogeneous treatment effect in terms of differences in time-to-event (survival) probabilities. We divide the problem into three phases: (1) estimation of treatment effect conditioned on unique sets of the covariate vector; (2) identification of features important for heterogeneity using an ensemble of non-parametric variable importance methods; and (3) estimation of treatment effect on the reference classes defined by the previously selected features, using one-step Targeted Maximum Likelihood Estimation. We conducted a series of simulation studies and found that this method performs well when either sample size or event rate is high enough and the number of covariates contributing to the effect heterogeneity is moderate. An application of this method to a clinical case study was conducted by estimating the effect of oral anticoagulants on newly diagnosed non-valvular atrial fibrillation patients using data from the UK Clinical Practice Research Datalink.


Dependency-Guided LSTM-CRF for Named Entity Recognition

Sep 23, 2019
Zhanming Jie, Wei Lu

Dependency tree structures capture long-distance and syntactic relationships between words in a sentence. The syntactic relations (e.g., nominal subject, object) can potentially infer the existence of certain named entities. In addition, the performance of a named entity recognizer could benefit from the long-distance dependencies between the words in dependency trees. In this work, we propose a simple yet effective dependency-guided LSTM-CRF model to encode the complete dependency trees and capture the above properties for the task of named entity recognition (NER). The data statistics show strong correlations between the entity types and dependency relations. We conduct extensive experiments on several standard datasets and demonstrate the effectiveness of the proposed model in improving NER and achieving state-of-the-art performance. Our analysis reveals that the significant improvements mainly result from the dependency relations and long-distance interactions provided by dependency trees.

* 13 pages, 6 figures, accepted by EMNLP 2019 

Average-case Analysis of the Assignment Problem with Independent Preferences

Jun 01, 2019
Yansong Gao, Jie Zhang

The fundamental assignment problem is in search of welfare maximization mechanisms to allocate items to agents when the private preferences over indivisible items are provided by self-interested agents. The mainstream mechanism Random Priority is asymptotically the best mechanism for this purpose, when comparing its welfare to the optimal social welfare using the canonical worst-case approximation ratio. Despite its popularity, the efficiency loss indicated by the worst-case ratio does not have a constant bound. Recently, [Deng, Gao, Zhang 2017] showed that when the agents' preferences are drawn from a uniform distribution, its average-case approximation ratio is upper bounded by 3.718. They left it as an open question whether a constant ratio holds for general scenarios. In this paper, we offer an affirmative answer to this question by showing that the ratio is bounded by $1/\mu$ when the preference values are independent and identically distributed random variables, where $\mu$ is the expectation of the value distribution. This upper bound also improves the upper bound of 3.718 in [Deng, Gao, Zhang 2017] for the uniform distribution. Moreover, under mild conditions, the ratio has a constant bound for any independent random values. En route to these results, we develop powerful tools to show that in most instances the efficiency loss is small.

* To appear in IJCAI 2019 
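
The mechanism discussed in the abstract above can be sketched on a single small instance. Under Random Priority, agents are ordered uniformly at random and each in turn takes their favorite remaining item. The sketch below computes the exact expected welfare by enumerating all orders and compares it with the brute-force optimum. The function names, the square value matrix, the tie-breaking rule, and the toy values are assumptions for illustration; how the paper aggregates this per-instance ratio over random instances is not specified in the abstract and is not reproduced here.

```python
from itertools import permutations


def priority_welfare(values, order):
    """Welfare when agents pick their favorite remaining item in the given order.

    values[i][j] is agent i's value for item j. Ties go to the lowest item index.
    """
    remaining = list(range(len(values[0])))
    welfare = 0.0
    for agent in order:
        best = max(remaining, key=lambda item: values[agent][item])
        welfare += values[agent][best]
        remaining.remove(best)
    return welfare


def optimal_welfare(values):
    """Brute-force maximum welfare over all one-to-one assignments (small n only)."""
    n = len(values)
    return max(
        sum(values[i][assignment[i]] for i in range(n))
        for assignment in permutations(range(n))
    )


def instance_ratio(values):
    """Optimal welfare divided by the exact expected Random Priority welfare."""
    orders = list(permutations(range(len(values))))
    expected = sum(priority_welfare(values, o) for o in orders) / len(orders)
    return optimal_welfare(values) / expected


# Both agents prefer item 0, but only agent 1 gets any value from item 1.
toy_values = [[1.0, 0.0], [1.0, 0.5]]
ratio = instance_ratio(toy_values)
```

On this toy instance the welfare is 1.5 when agent 0 picks first and 1.0 when agent 1 picks first, so the expected welfare is 1.25 against an optimum of 1.5.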

A Review of Semi Supervised Learning Theories and Recent Advances

May 28, 2019
Enmei Tu, Jie Yang

Semi-supervised learning, which emerged at the beginning of this century, is a new type of learning method between traditional supervised learning and unsupervised learning. The main idea of semi-supervised learning is to introduce unlabeled samples into the model training process to avoid performance (or model) degeneration due to an insufficiency of labeled samples. Semi-supervised learning has been applied successfully in many fields. This paper reviews the development process and main theories of semi-supervised learning, as well as its recent advances and importance in solving real-world problems demonstrated by typical application examples.

* Chinese language, 14 pages 

Deep learning based mood tagging for Chinese song lyrics

May 23, 2019
Jie Wang, Xinyan Zhao

Listening to music has been and will always be an indispensable part of our daily life. In recent years, sentiment analysis of music has been widely used in information retrieval systems, personalized recommendation systems and so on. Building on the development of deep learning, this paper seeks an effective approach for mood tagging of Chinese song lyrics. To achieve this goal, both machine-learning and deep-learning models have been studied and compared. A CNN-based model with pre-trained word embeddings is shown to effectively extract the distribution of emotional features of Chinese lyrics, scoring at least 15 percentage points higher than traditional machine-learning methods (i.e. TF-IDF+SVM and LIWC+SVM), and 7 percentage points higher than other deep-learning models (i.e. RNN, LSTM). In this paper, a corpus of more than 160,000 lyrics is used to pre-train the word embeddings that boost mood tagging.


Theme-aware generation model for chinese lyrics

May 23, 2019
Jie Wang, Xinyan Zhao

With the rapid development of neural networks, deep learning has been extended to various natural language generation fields, such as machine translation, dialogue generation and even literature creation. In this paper, we propose a theme-aware language generation model for Chinese music lyrics, which greatly improves the theme-connectivity and coherence of generated paragraphs. A multi-channel sequence-to-sequence (seq2seq) model encodes themes and previous sentences as global and local contextual information. Moreover, an attention mechanism is incorporated for sequence decoding, enabling the model to fuse context into the predicted next text. To prepare an appropriate training corpus, LDA (Latent Dirichlet Allocation) is applied for theme extraction. The generated lyrics are grammatically correct and semantically coherent with the selected themes, which offers a valuable modelling method for other fields including multi-turn chatbots, long paragraph generation, etc.


Combining RGB and Points to Predict Grasping Region for Robotic Bin-Picking

Apr 24, 2019
Quanquan Shao, Jie Hu

This paper focuses on robotic picking tasks in cluttered scenarios. Because of the diversity of objects and the clutter caused by their placement, it is difficult to recognize objects and estimate their poses before grasping. Here, we use U-net, a type of convolutional neural network (CNN), to combine RGB images and depth information to predict the picking region without recognition or pose estimation. The efficiency of different visual inputs to the network was compared, including RGB, RGB-D and RGB-Points. We found that the RGB-Points input achieves a precision of 95.74%.

* 5 pages, 6 figures 

Visual Saliency Maps Can Apply to Facial Expression Recognition

Nov 12, 2018
Zhenyue Qin, Jie Wu

Human eyes concentrate on different facial regions during distinct cognitive activities. We study the use of facial visual saliency maps to classify facial expressions into different emotions. Our results show that our novel method of merely using facial saliency maps can achieve a decent accuracy of 65%, much higher than the chance level of $1/7$. Furthermore, our approach is semi-supervised, i.e., our facial saliency maps are generated from a general saliency prediction algorithm that is not explicitly designed for face images. We also discovered that the classification accuracies of each emotional class using saliency maps demonstrate a strong positive correlation with the accuracies produced by face images. Our work implies that humans may look at different facial areas in order to perceive different emotions.


Reinforcement Learning based Dynamic Model Selection for Short-Term Load Forecasting

Nov 05, 2018
Cong Feng, Jie Zhang

With the growing prevalence of smart grid technology, short-term load forecasting (STLF) becomes particularly important in power system operations. There is a large collection of methods developed for STLF, but selecting a suitable method under varying conditions is still challenging. This paper develops a novel reinforcement learning based dynamic model selection (DMS) method for STLF. A forecasting model pool is first built, including ten state-of-the-art machine learning based forecasting models. Then a Q-learning agent learns the optimal policy of selecting the best forecasting model for the next time step, based on the model performance. The optimal DMS policy is applied to select the best model at each time step with a moving window. Numerical simulations on two-year load and weather data show that the Q-learning algorithm converges fast, resulting in effective and efficient DMS. The developed STLF model with Q-learning based DMS improves the forecasting accuracy by approximately 50%, compared to the state-of-the-art machine learning based STLF models.
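
The Q-learning step described in the abstract above can be sketched with a tabular update. The abstract does not define the state, action, or reward precisely, so the choices here are assumptions for illustration: the state is the index of the currently selected forecasting model, the action is the index of the model chosen for the next time step, and the reward is a hypothetical score that is higher when the chosen model forecasts better. Function names and hyperparameter values are likewise illustrative.

```python
def q_update(q_table, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one standard Q-learning update in place and return the new Q-value."""
    best_next = max(q_table[next_state])
    td_target = reward + gamma * best_next
    q_table[state][action] += alpha * (td_target - q_table[state][action])
    return q_table[state][action]


def greedy_model(q_table, state):
    """Index of the model with the highest Q-value in this state (lowest index on ties)."""
    row = q_table[state]
    return max(range(len(row)), key=lambda action: row[action])


# Two candidate forecasting models; q[s][a] is the value of switching from s to a.
q = [[0.0, 0.0], [0.0, 1.0]]
updated = q_update(q, state=0, action=1, reward=1.0, next_state=1, alpha=0.5, gamma=0.5)
chosen = greedy_model(q, state=0)
```

In a moving-window setup like the one the abstract mentions, an update of this kind would be applied at each time step within the window before the greedy choice is used to pick the next model.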


Semantic WordRank: Generating Finer Single-Document Summarizations

Sep 12, 2018
Hao Zhang, Jie Wang

We present Semantic WordRank (SWR), an unsupervised method for generating an extractive summary of a single document. Built on a weighted word graph with semantic and co-occurrence edges, SWR scores sentences using an article-structure-biased PageRank algorithm with a Softplus function adjustment, and promotes topic diversity using spectral subtopic clustering under the Word-Movers-Distance metric. We evaluate SWR on the DUC-02 and SummBank datasets and show that SWR produces better summaries than the state-of-the-art algorithms over DUC-02 under common ROUGE measures. We then show that, under the same measures over SummBank, SWR outperforms each of the three human annotators (a.k.a. judges) and compares favorably with the combined performance of all judges.

* 12 pages, accepted by IDEAL2018 
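
The biased PageRank component mentioned in the abstract above can be illustrated with a generic personalized PageRank on a weighted graph, computed by power iteration. This is a sketch of the general technique only: SWR's article-structure bias, its Softplus adjustment, and its word-graph construction are not reproduced, and the function name, the `bias` vector, and the toy graph are assumptions for illustration.

```python
def biased_pagerank(weights, bias, damping=0.85, iterations=100):
    """Personalized PageRank by power iteration.

    weights[i][j] is the edge weight from node i to node j.
    bias holds non-negative teleport preferences; it is normalized internally.
    """
    n = len(weights)
    total_bias = sum(bias)
    teleport = [b / total_bias for b in bias]
    out_weight = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = [(1.0 - damping) * t for t in teleport]
        for i in range(n):
            for j in range(n):
                if out_weight[i] == 0:
                    # Dangling node: spread its score according to the teleport vector.
                    new_scores[j] += damping * scores[i] * teleport[j]
                else:
                    new_scores[j] += damping * scores[i] * weights[i][j] / out_weight[i]
        scores = new_scores
    return scores


# A fully connected three-node graph; the bias favors node 0.
graph = [[0.0, 1.0, 1.0], [1.0, 0.0, 1.0], [1.0, 1.0, 0.0]]
uniform_scores = biased_pagerank(graph, bias=[1.0, 1.0, 1.0])
biased_scores = biased_pagerank(graph, bias=[2.0, 1.0, 1.0])
```

With a uniform bias the symmetric graph gives every node the same score, while raising the bias of node 0 lifts its score above the others, which is the effect a structure-based bias is meant to have.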

Dependency-based Hybrid Trees for Semantic Parsing

Sep 01, 2018
Zhanming Jie, Wei Lu

We propose a novel dependency-based hybrid tree model for semantic parsing, which converts natural language utterances into machine-interpretable meaning representations. Unlike previous state-of-the-art models, the semantic information is interpreted as the latent dependency between the natural language words in our joint representation. Such dependency information can capture the interactions between the semantics and natural language words. We integrate a neural component into our model and propose an efficient dynamic-programming algorithm to perform tractable inference. Through extensive experiments on the standard multilingual GeoQuery dataset with eight languages, we demonstrate that our proposed approach is able to achieve state-of-the-art performance across several languages. Analysis also justifies the effectiveness of using our new dependency-based representation.

* Accepted by EMNLP 2018 
