Models, code, and papers for "Jian Sun":

RotatE: Knowledge Graph Embedding by Relational Rotation in Complex Space

Feb 26, 2019
Zhiqing Sun, Zhi-Hong Deng, Jian-Yun Nie, Jian Tang

We study the problem of learning representations of entities and relations in knowledge graphs for predicting missing links. The success of such a task heavily relies on the ability of modeling and inferring the patterns of (or between) the relations. In this paper, we present a new approach for knowledge graph embedding called RotatE, which is able to model and infer various relation patterns including: symmetry/antisymmetry, inversion, and composition. Specifically, the RotatE model defines each relation as a rotation from the source entity to the target entity in the complex vector space. In addition, we propose a novel self-adversarial negative sampling technique for efficiently and effectively training the RotatE model. Experimental results on multiple benchmark knowledge graphs show that the proposed RotatE model is not only scalable, but also able to infer and model various relation patterns and significantly outperform existing state-of-the-art models for link prediction.

* Accepted to ICLR 2019 
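
The rotation idea in the abstract above can be illustrated with a minimal sketch. It assumes each relation is stored as a vector of phases, so that every relation coordinate is a unit-modulus complex number, and it measures the mismatch as a sum of moduli. The function name, the embedding size, and the choice of norm are illustrative assumptions, not details taken from the abstract.

```python
import numpy as np

def rotate_distance(head, phase, tail):
    """Return sum_k |h_k * r_k - t_k| with r_k = exp(i * phase_k), so |r_k| = 1."""
    relation = np.exp(1j * phase)  # unit-modulus complex numbers act as rotations
    return float(np.sum(np.abs(head * relation - tail)))

rng = np.random.default_rng(0)
dim = 4  # hypothetical embedding size
head = rng.normal(size=dim) + 1j * rng.normal(size=dim)
phase_1 = rng.uniform(-np.pi, np.pi, size=dim)
phase_2 = rng.uniform(-np.pi, np.pi, size=dim)

# Apply two relations in sequence: head -> middle -> tail.
middle = head * np.exp(1j * phase_1)
tail = middle * np.exp(1j * phase_2)

# Composition: rotating by the summed phases maps head straight to tail.
composition_gap = rotate_distance(head, phase_1 + phase_2, tail)

# Inversion: the negated phases map tail back to head.
inversion_gap = rotate_distance(tail, -(phase_1 + phase_2), head)

# Symmetry: a phase of pi (or 0) is its own inverse, so the relation holds both ways.
symmetric_phase = np.full(dim, np.pi)
there = head * np.exp(1j * symmetric_phase)
symmetry_gap = rotate_distance(there, symmetric_phase, head)

print(composition_gap, inversion_gap, symmetry_gap)
```

All three gaps are zero up to floating-point error, which is the sense in which rotations can represent composition, inversion, and symmetry.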

DivGraphPointer: A Graph Pointer Network for Extracting Diverse Keyphrases

May 19, 2019
Zhiqing Sun, Jian Tang, Pan Du, Zhi-Hong Deng, Jian-Yun Nie

Keyphrase extraction from documents is useful to a variety of applications such as information retrieval and document summarization. This paper presents an end-to-end method called DivGraphPointer for extracting a set of diversified keyphrases from a document. DivGraphPointer combines the advantages of traditional graph-based ranking methods and recent neural network-based approaches. Specifically, given a document, a word graph is constructed from the document based on word proximity and is encoded with graph convolutional networks, which effectively capture document-level word salience by modeling long-range dependency between words in the document and aggregating multiple appearances of identical words into one node. Furthermore, we propose a diversified point network to generate a set of diverse keyphrases out of the word graph in the decoding process. Experimental results on five benchmark data sets show that our proposed method significantly outperforms the existing state-of-the-art approaches.

* Accepted to SIGIR 2019 

Fast Guided Filter

May 05, 2015
Kaiming He, Jian Sun

The guided filter is a technique for edge-aware image filtering. Because of its nice visual quality, fast speed, and ease of implementation, the guided filter has seen various applications in real products, such as image editing apps in phones and stereo reconstruction, and has been included in official MATLAB and OpenCV. In this note, we point out that the guided filter can be sped up simply, from O(N) time to O(N/s^2) time for a subsampling ratio s. In a variety of applications, this leads to a speedup of >10x with almost no visible degradation. We hope this acceleration will improve performance of current applications and further popularize this filter. Code is released.

* Technical report 
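
The subsampling speed-up described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the released code: it assumes grayscale floating-point images, nearest-neighbour subsampling by slicing, and upsampling by pixel repetition. The helper names `box_mean` and `fast_guided_filter` are hypothetical. The box filters, which dominate the cost, run on roughly N/s^2 pixels.

```python
import numpy as np

def box_mean(img, radius):
    """Mean over a (2r+1) x (2r+1) window, using edge replication and an integral image."""
    k = 2 * radius + 1
    h, w = img.shape
    padded = np.pad(img, radius, mode="edge")
    # Integral image with a leading row and column of zeros.
    integral = np.pad(padded, ((1, 0), (1, 0))).cumsum(axis=0).cumsum(axis=1)
    total = (integral[k:k + h, k:k + w] - integral[:h, k:k + w]
             - integral[k:k + h, :w] + integral[:h, :w])
    return total / (k * k)

def fast_guided_filter(guide, src, radius, eps, s):
    """Guided filter whose linear coefficients are estimated at 1/s resolution."""
    guide_lo = guide[::s, ::s]          # nearest-neighbour subsampling (assumption)
    src_lo = src[::s, ::s]
    r_lo = max(radius // s, 1)          # shrink the window with the image

    mean_g = box_mean(guide_lo, r_lo)
    mean_p = box_mean(src_lo, r_lo)
    cov_gp = box_mean(guide_lo * src_lo, r_lo) - mean_g * mean_p
    var_g = box_mean(guide_lo * guide_lo, r_lo) - mean_g ** 2

    a = cov_gp / (var_g + eps)          # local linear model: output = a * guide + b
    b = mean_p - a * mean_g
    mean_a = box_mean(a, r_lo)
    mean_b = box_mean(b, r_lo)

    h, w = guide.shape

    def upsample(x):
        # Repeat each low-resolution pixel s times, then crop to the full size.
        return np.repeat(np.repeat(x, s, axis=0), s, axis=1)[:h, :w]

    return upsample(mean_a) * guide + upsample(mean_b)

rng = np.random.default_rng(0)
image = rng.random((64, 48))
smoothed = fast_guided_filter(image, image, radius=8, eps=1e-2, s=4)
print(smoothed.shape)
```

Smoother interpolation of the coefficient maps would be a natural refinement; repetition is used here only to keep the sketch dependency-free.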

Convolutional Neural Networks at Constrained Time Cost

Dec 04, 2014
Kaiming He, Jian Sun

Though recent advanced convolutional neural networks (CNNs) have been improving image recognition accuracy, the models are getting more complex and time-consuming. For real-world applications in industrial and commercial scenarios, engineers and developers are often faced with the requirement of a constrained time budget. In this paper, we investigate the accuracy of CNNs under constrained time cost. Under this constraint, the designs of the network architectures should exhibit trade-offs among factors such as depth, number of filters, and filter sizes. With a series of controlled comparisons, we progressively modify a baseline model while preserving its time complexity. This is also helpful for understanding the importance of the factors in network designs. We present an architecture that achieves very competitive accuracy on the ImageNet dataset (11.8% top-5 error, 10-view test), yet is 20% faster than "AlexNet" (16.0% top-5 error, 10-view test).

* 8-page technical report 
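
To make the trade-off among depth, number of filters, and filter size concrete, here is a small sketch that counts multiply-accumulate operations for a stack of convolutional layers. The cost model (input channels x filter area x output channels x output map area) is the usual way to count convolution cost and is an assumption here, as are the layer shapes; the abstract itself does not state a formula.

```python
def conv_cost(layers):
    """Total multiply-accumulates for layers given as (c_in, filter_size, c_out, map_size)."""
    return sum(c_in * k * k * c_out * m * m for c_in, k, c_out, m in layers)

# Hypothetical shapes: one 5x5 layer versus two stacked 3x3 layers on a 28x28 map.
one_wide_filter = [(64, 5, 64, 28)]
two_small_filters = [(64, 3, 64, 28), (64, 3, 64, 28)]

cost_wide = conv_cost(one_wide_filter)
cost_deep = conv_cost(two_small_filters)
ratio = cost_deep / cost_wide

print(cost_wide, cost_deep, ratio)
```

Under this cost model the two 3x3 layers need 18/25 of the operations of the single 5x5 layer, so the saved time could be spent on extra filters or extra depth while keeping the total budget fixed.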

Gromov-Hausdorff Approximation of Metric Spaces with Linear Structure

May 06, 2013
Frédéric Chazal, Jian Sun

In many real-world applications, data come as discrete metric spaces sampled around 1-dimensional filamentary structures that can be seen as metric graphs. In this paper we address the metric reconstruction problem of such filamentary structures from data sampled around them. We prove that they can be approximated, with respect to the Gromov-Hausdorff distance, by well-chosen Reeb graphs (and some of their variants), and we provide an efficient and easy-to-implement algorithm to compute such approximations in almost linear time. We illustrate the performance of our algorithm on a few synthetic and real data sets.


Mutual information is copula entropy

Aug 06, 2008
Jian Ma, Zengqi Sun

We prove that mutual information is actually negative copula entropy, based on which a method for mutual information estimation is proposed.
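
A short derivation sketch of the identity stated above, assuming continuous random variables with joint density $p$, marginal densities $p_i$, marginal distribution functions $F_i$, and a copula density $c$ given by Sklar's theorem. These symbols are introduced here for illustration and do not appear in the abstract.

```latex
\begin{align*}
p(x_1,\dots,x_n) &= c(u_1,\dots,u_n)\prod_{i=1}^{n} p_i(x_i),
  \qquad u_i = F_i(x_i), \\
I(X_1,\dots,X_n) &= \int p(x)\,\log\frac{p(x)}{\prod_{i=1}^{n} p_i(x_i)}\,dx
  = \int p(x)\,\log c(u)\,dx \\
&= \int_{[0,1]^n} c(u)\,\log c(u)\,du = -H_c(u).
\end{align*}
```

The last line uses the change of variables $du = \prod_i p_i(x_i)\,dx$, so that $p(x)\,dx = c(u)\,du$, and $H_c$ denotes the entropy of the copula density.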


Dependence Structure Estimation via Copula

Apr 28, 2008
Jian Ma, Zengqi Sun

We propose a new framework for dependence structure learning via copula. Copula is a statistical theory of dependence and measurement of association. Graphical models are considered as a special case of copula families, named product copula. In this paper, a nonparametric algorithm for copula estimation is presented. Then a Chow-Liu-like method based on a copula dependence measure is proposed to estimate the maximum spanning product copula using only bivariate dependence relations. The advantage of the framework is that learning with the empirical copula focuses only on dependence relations among random variables, without knowing the properties of individual variables. Another advantage is that copula is a universal model of dependence, and therefore the framework based on it can be generalized to deal with a wide range of complex dependence relations. Experiments on both simulated data and real application data show the effectiveness of the proposed method.
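
The claim that the empirical copula depends only on the dependence relations, not on the individual variables, can be illustrated with a minimal rank-based sketch. The function names and the `n + 1` scaling of the ranks are assumptions made for illustration; the abstract does not specify the estimator's details.

```python
import numpy as np

def pseudo_observations(samples):
    """Replace each column by its ranks scaled into (0, 1)."""
    n = samples.shape[0]
    ranks = samples.argsort(axis=0).argsort(axis=0) + 1  # ranks 1..n per column
    return ranks / (n + 1)

def empirical_copula(u, point):
    """Fraction of pseudo-observations that are <= point in every coordinate."""
    return float(np.mean(np.all(u <= np.asarray(point), axis=1)))

rng = np.random.default_rng(0)
samples = rng.normal(size=(200, 2))
samples[:, 1] += 0.8 * samples[:, 0]  # make the two columns dependent

# Strictly increasing transforms change the marginals but not the ranks.
transformed = np.column_stack([np.exp(samples[:, 0]), samples[:, 1] ** 3])

u_original = pseudo_observations(samples)
u_transformed = pseudo_observations(transformed)
print(np.array_equal(u_original, u_transformed))
print(empirical_copula(u_original, [0.5, 0.5]))
```

Because the pseudo-observations are identical before and after the transforms, anything computed from them reflects only how the variables depend on each other.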


Copula Component Analysis

Mar 20, 2007
Jian Ma, Zengqi Sun

A framework named Copula Component Analysis (CCA) for blind source separation is proposed as a generalization of Independent Component Analysis (ICA). It differs from ICA, which assumes independence of sources, in that the underlying components may be dependent, with a structure represented by a copula. By incorporating the dependency structure, much more accurate estimation can in principle be made when the assumption of independence is violated. A two-phase inference method is introduced for CCA, based on the notion of multidimensional ICA.


Channel Pruning for Accelerating Very Deep Neural Networks

Aug 21, 2017
Yihui He, Xiangyu Zhang, Jian Sun

In this paper, we introduce a new channel pruning method to accelerate very deep convolutional neural networks. Given a trained CNN model, we propose an iterative two-step algorithm to effectively prune each layer, by LASSO-regression-based channel selection and least-squares reconstruction. We further generalize this algorithm to multi-layer and multi-branch cases. Our method reduces the accumulated error and enhances the compatibility with various architectures. Our pruned VGG-16 achieves state-of-the-art results, with a 5x speed-up and only a 0.3% increase in error. More importantly, our method is able to accelerate modern networks like ResNet and Xception, suffering only 1.4% and 1.0% accuracy loss, respectively, under a 2x speed-up, which is significant. Code has been made publicly available.

* To appear at ICCV 2017

Instance-aware Semantic Segmentation via Multi-task Network Cascades

Dec 14, 2015
Jifeng Dai, Kaiming He, Jian Sun

Semantic segmentation research has recently witnessed rapid progress, but many leading methods are unable to identify object instances. In this paper, we present Multi-task Network Cascades for instance-aware semantic segmentation. Our model consists of three networks, respectively differentiating instances, estimating masks, and categorizing objects. These networks form a cascaded structure, and are designed to share their convolutional features. We develop an algorithm for the nontrivial end-to-end training of this causal, cascaded structure. Our solution is a clean, single-step training framework and can be generalized to cascades that have more stages. We demonstrate state-of-the-art instance-aware semantic segmentation accuracy on PASCAL VOC. Meanwhile, our method takes only 360ms to test an image using VGG-16, which is two orders of magnitude faster than previous systems for this challenging problem. As a by-product, our method also achieves compelling object detection results which surpass the competitive Fast/Faster R-CNN systems. The method described in this paper is the foundation of our submissions to the MS COCO 2015 segmentation competition, where we won 1st place.

* Tech report. 1st-place winner of MS COCO 2015 segmentation competition 

BoxSup: Exploiting Bounding Boxes to Supervise Convolutional Networks for Semantic Segmentation

May 18, 2015
Jifeng Dai, Kaiming He, Jian Sun

Recent leading approaches to semantic segmentation rely on deep convolutional networks trained with human-annotated, pixel-level segmentation masks. Such pixel-accurate supervision demands expensive labeling effort and limits the performance of deep networks that usually benefit from more training data. In this paper, we propose a method that achieves competitive accuracy but only requires easily obtained bounding box annotations. The basic idea is to iterate between automatically generating region proposals and training convolutional networks. These two steps gradually recover segmentation masks for improving the networks, and vice versa. Our method, called BoxSup, produces competitive results supervised by boxes only, on par with strong baselines fully supervised by masks under the same setting. By leveraging a large number of bounding boxes, BoxSup further unleashes the power of deep convolutional networks and yields state-of-the-art results on PASCAL VOC 2012 and PASCAL-CONTEXT.


Convolutional Feature Masking for Joint Object and Stuff Segmentation

Apr 02, 2015
Jifeng Dai, Kaiming He, Jian Sun

The topic of semantic segmentation has witnessed considerable progress due to the powerful features learned by convolutional neural networks (CNNs). The current leading approaches for semantic segmentation exploit shape information by extracting CNN features from masked image regions. This strategy introduces artificial boundaries on the images and may impact the quality of the extracted features. Besides, the operations on the raw image domain require computing thousands of networks on a single image, which is time-consuming. In this paper, we propose to exploit shape information via masking convolutional features. The proposal segments (e.g., super-pixels) are treated as masks on the convolutional feature maps. The CNN features of segments are directly masked out from these maps and used to train classifiers for recognition. We further propose a joint method to handle objects and "stuff" (e.g., grass, sky, water) in the same framework. State-of-the-art results are demonstrated on benchmarks of PASCAL VOC and the new PASCAL-CONTEXT, with a compelling computational speed.

* IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015 

FoodTracker: A Real-time Food Detection Mobile Application by Deep Convolutional Neural Networks

Sep 16, 2019
Jianing Sun, Katarzyna Radecka, Zeljko Zilic

We present a mobile application that recognizes the food items of a multi-object meal from a single image in real time, and then returns the nutrition facts with components and approximate amounts. Our work is organized in two parts. First, we build a deep convolutional neural network merged with YOLO, a state-of-the-art detection strategy, to achieve simultaneous multi-object recognition and localization with nearly 80% mean average precision. Second, we adapt our model into a mobile application with an extended function for nutrition analysis. After inferring and decoding the model output on the app side, we present detection results that include bounding box position and class label in either real-time or local mode. Our deep learning model is well suited to mobile devices, with negligible inference time and small memory requirements.

* The 16th International Conference on Machine Vision Applications 

HRGE-Net: Hierarchical Relational Graph Embedding Network for Multi-view 3D Shape Recognition

Aug 27, 2019
Xin Wei, Ruixuan Yu, Jian Sun

The view-based approach, which recognizes a 3D shape through its projected 2D images, has achieved state-of-the-art performance for 3D shape recognition. One essential challenge for the view-based approach is how to aggregate the multi-view features extracted from 2D images into a global 3D shape descriptor. In this work, we propose a novel feature aggregation network by fully investigating the relations among views. We construct a relational graph with multi-view images as nodes, and design relational graph embedding by modeling pairwise and neighboring relations among views. By gradually coarsening the graph, we build a hierarchical relational graph embedding network (HRGE-Net) to aggregate the multi-view features into a global shape descriptor. Extensive experiments show that HRGE-Net achieves state-of-the-art performance for 3D shape classification and retrieval on benchmark datasets.


HyperAdam: A Learnable Task-Adaptive Adam for Network Training

Nov 22, 2018
Shipeng Wang, Jian Sun, Zongben Xu

Deep neural networks are traditionally trained using human-designed stochastic optimization algorithms, such as SGD and Adam. Recently, the approach of learning to optimize network parameters has emerged as a promising research topic. However, these learned black-box optimizers sometimes do not fully utilize the experience embodied in human-designed optimizers and therefore have limited generalization ability. In this paper, a new optimizer, dubbed HyperAdam, is proposed that combines the idea of "learning to optimize" with the traditional Adam optimizer. Given a network for training, the parameter update generated by HyperAdam in each iteration is an adaptive combination of multiple updates generated by Adam with varying decay rates. The combination weights and decay rates in HyperAdam are adaptively learned depending on the task. HyperAdam is modeled as a recurrent neural network with AdamCell, WeightCell and StateCell. It is justified to be state-of-the-art for training various networks, such as multilayer perceptrons, CNNs and LSTMs.
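
The core idea, a weighted combination of several Adam updates with different decay rates, can be sketched as below. This is only an illustration under strong assumptions: the combination weights and decay rates are fixed constants here, whereas the abstract says they are learned by a recurrent network, and the function names are hypothetical.

```python
import numpy as np

def adam_direction(grads, beta1, beta2, eps=1e-8):
    """Bias-corrected Adam direction after seeing the gradient history `grads`."""
    m = np.zeros_like(grads[0])
    v = np.zeros_like(grads[0])
    for g in grads:
        m = beta1 * m + (1 - beta1) * g        # first-moment estimate
        v = beta2 * v + (1 - beta2) * g * g    # second-moment estimate
    t = len(grads)
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    return m_hat / (np.sqrt(v_hat) + eps)

def combined_update(grads, decay_rates, weights, lr=1e-3):
    """Convex combination of Adam updates, one per (beta1, beta2) pair."""
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()          # fixed weights (learned in the paper)
    directions = [adam_direction(grads, b1, b2) for b1, b2 in decay_rates]
    return -lr * sum(w * d for w, d in zip(weights, directions))

history = [np.ones(3) for _ in range(5)]       # toy gradient history
update = combined_update(history, [(0.9, 0.999), (0.5, 0.9)], weights=[1.0, 1.0])
print(update)
```

With a constant gradient history every Adam direction is close to 1, so the combined step is close to `-lr` in each coordinate regardless of the weights, which makes the toy case easy to check by hand.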


Learning Spectral Transform Network on 3D Surface for Non-rigid Shape Analysis

Oct 21, 2018
Ruixuan Yu, Jian Sun, Huibin Li

Designing a network on 3D surface for non-rigid shape analysis is a challenging task. In this work, we propose a novel spectral transform network on 3D surface to learn shape descriptors. The proposed network architecture consists of four stages: raw descriptor extraction, surface second-order pooling, mixture of power function-based spectral transform, and metric learning. The proposed network is simple and shallow. Quantitative experiments on challenging benchmarks show its effectiveness for non-rigid shape retrieval and classification, e.g., it achieved the highest accuracies on SHREC14, 15 datasets as well as the Range subset of SHREC17 dataset.

* 16 pages, 3 figures 

GridFace: Face Rectification via Learning Local Homography Transformations

Aug 19, 2018
Erjin Zhou, Zhimin Cao, Jian Sun

In this paper, we propose a method, called GridFace, to reduce facial geometric variations and improve the recognition performance. Our method rectifies the face by local homography transformations, which are estimated by a face rectification network. To encourage the image generation with canonical views, we apply a regularization based on the natural face distribution. We learn the rectification network and recognition network in an end-to-end manner. Extensive experiments show our method greatly reduces geometric variations, and gains significant improvements in unconstrained face recognition scenarios.

* To appear in ECCV 2018 

Automated vehicle's behavior decision making using deep reinforcement learning and high-fidelity simulation environment

Apr 17, 2018
Yingjun Ye, Xiaohui Zhang, Jian Sun

Automated vehicles are deemed to be the key element of the intelligent transportation system of the future. Many studies have been made to improve automated vehicles' abilities in environment recognition and vehicle control, while too little attention has been paid to decision making, and the decision algorithms so far are very preliminary. Therefore, a framework for decision-making training and learning is put forward in this paper. It consists of two parts: the deep reinforcement learning training program and the high-fidelity virtual simulation environment. Then the basic microscopic behavior, car-following, is trained within this framework. In addition, theoretical analysis and experiments were conducted on setting the reward function to accelerate training using deep reinforcement learning. The results show that, on the premise of driving comfort, the efficiency of the trained automated vehicle increases by 7.9% compared to the classical traffic model, the intelligent driver model. Later on, on a more complex three-lane section, we trained an integrated model that combines both car-following and lane-changing behavior, and the average speed grows by a further 2.4%. This indicates that our framework is effective for automated vehicles' decision-making learning.

* 22 pages, 13 figures, CICTP2018 

Harmonic Extension

Sep 22, 2015
Zuoqiang Shi, Jian Sun, Minghao Tian

In this paper, we consider the harmonic extension problem, which is widely used in many applications of machine learning. We find that the traditional graph Laplacian method fails to produce a good approximation of the classical harmonic function. To tackle this problem, we propose a new method called the point integral method (PIM). We consider the harmonic extension problem from the point of view of solving PDEs on manifolds. The basic idea of the PIM is to approximate the harmonicity using an integral equation, which is easy to discretize from points. Based on the integral equation, we explain why the traditional graph Laplacian may fail to approximate the harmonicity in the classical sense and propose a different approach, which we call the volume constraint method (VCM). Theoretically, both the PIM and the VCM compute a harmonic function with convergence guarantees, and practically, they are both simple, amounting to solving a linear system. One important application of the harmonic extension in machine learning is semi-supervised learning. We run a popular semi-supervised learning algorithm by Zhu et al. over a couple of well-known datasets and compare the performance of the aforementioned approaches. Our experiments show that the PIM performs the best.

* 10 pages, 2 figures 
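
For context, the graph Laplacian baseline that the abstract critiques can be sketched in a few lines. This is not the PIM or the VCM; it is the standard harmonic extension on a fully connected graph, assuming Gaussian edge weights with a hypothetical bandwidth `sigma`. Like the proposed methods, it reduces to solving one linear system.

```python
import numpy as np

def graph_harmonic_extension(points, labeled_idx, labeled_values, sigma=1.0):
    """Extend values from labeled points by solving L_uu f_u = -L_ul f_l."""
    n = len(points)
    sq_dist = np.sum((points[:, None, :] - points[None, :, :]) ** 2, axis=-1)
    weights = np.exp(-sq_dist / (2 * sigma ** 2))   # Gaussian edge weights
    np.fill_diagonal(weights, 0.0)
    laplacian = np.diag(weights.sum(axis=1)) - weights

    labeled_idx = np.asarray(labeled_idx)
    labeled_values = np.asarray(labeled_values, dtype=float)
    unlabeled_idx = np.setdiff1d(np.arange(n), labeled_idx)

    l_uu = laplacian[np.ix_(unlabeled_idx, unlabeled_idx)]
    l_ul = laplacian[np.ix_(unlabeled_idx, labeled_idx)]

    f = np.empty(n)
    f[labeled_idx] = labeled_values
    f[unlabeled_idx] = np.linalg.solve(l_uu, -l_ul @ labeled_values)
    return f

# Five points on a line, with the two endpoints labeled 0 and 1.
points = np.arange(5, dtype=float).reshape(-1, 1)
f = graph_harmonic_extension(points, labeled_idx=[0, 4], labeled_values=[0.0, 1.0])
print(f)
```

Each unlabeled value is a weighted average of all the other values, so the result stays between the smallest and largest labels; by symmetry the middle point receives 0.5.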

InfoGraph: Unsupervised and Semi-supervised Graph-Level Representation Learning via Mutual Information Maximization

Jul 31, 2019
Fan-Yun Sun, Jordan Hoffmann, Jian Tang

This paper studies learning the representations of whole graphs in both unsupervised and semi-supervised scenarios. Graph-level representations are critical in a variety of real-world applications such as predicting the properties of molecules and community analysis in social networks. Traditional graph kernel based methods are simple yet effective for obtaining fixed-length representations for graphs, but they suffer from poor generalization due to hand-crafted designs. There are also some recent methods based on language models (e.g., graph2vec), but they tend to only consider certain substructures (e.g., subtrees) as graph representatives. Inspired by recent progress in unsupervised representation learning, in this paper we propose a novel method called InfoGraph for learning graph-level representations. We maximize the mutual information between the graph-level representation and the representations of substructures of different scales (e.g., nodes, edges, triangles). By doing so, the graph-level representations encode aspects of the data that are shared across different scales of substructures. Furthermore, we propose InfoGraph*, an extension of InfoGraph for semi-supervised scenarios. InfoGraph* maximizes the mutual information between unsupervised graph representations learned by InfoGraph and the representations learned by existing supervised methods. As a result, the supervised encoder learns from unlabeled data while preserving the latent semantic space favored by the current supervised task. Experimental results on the tasks of graph classification and molecular property prediction show that InfoGraph is superior to state-of-the-art baselines and InfoGraph* can achieve performance competitive with state-of-the-art semi-supervised models.

* 13 pages 
