Models, code, and papers for "Jie Tang":

Design of Multifunctional Soft Doming Actuator for Soft Machines

Mar 23, 2018
Yichao Tang, Jie Yin

Bilayer bending based soft actuators are widely utilized in soft robotics for locomotion and object gripping. However, soft actuators based on bilayer doming remain largely unexplored despite the often-observed dome-like shapes in undersea animals such as jellyfish and octopus suction cups. Here, based on a simplified model of bending-induced doming of circular bilayer plates with mismatched deformation, we explore the design of a soft doming actuator under pneumatic actuation and its implications for the design of multifunctional soft machines. The bilayer actuator is composed of a patterned embedded pneumatic channel on top for radial expansion and a solid elastomeric layer on the bottom for strain-limiting. We show that both the cavity volume and the bending angle at the rim of the actuated dome can be controlled by tuning the height gradient of the pneumatic channel along the radial direction. We demonstrate its potential multifunctional applications in swimming, adhesion, and gripping, including highly efficient jellyfish-inspired underwater soft robots with a locomotion speed of 84 cm/min and rotation-based soft grippers with low energy cost by harnessing the large rim bending angle, as well as octopus-inspired soft adhesion actuators with a strong and switchable adhesion force of over 10 N by utilizing the large cavity volume.


Semi-supervised Learning on Graphs with Generative Adversarial Nets

Sep 01, 2018
Ming Ding, Jie Tang, Jie Zhang

We investigate how generative adversarial nets (GANs) can help semi-supervised learning on graphs. We first provide insights on the working principles of adversarial learning over graphs and then present GraphSGAN, a novel approach to semi-supervised learning on graphs. In GraphSGAN, the generator and classifier networks play a novel competitive game. At equilibrium, the generator generates fake samples in low-density areas between subgraphs. In order to discriminate fake samples from real ones, the classifier implicitly takes the density property of subgraphs into consideration. An efficient adversarial learning algorithm has been developed to improve traditional normalized graph Laplacian regularization with a theoretical guarantee. Experimental results on several different genres of datasets show that the proposed GraphSGAN significantly outperforms several state-of-the-art methods. GraphSGAN can also be trained using mini-batches and thus enjoys a scalability advantage.

* to appear in CIKM 2018 
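
The abstract above refers to traditional normalized graph Laplacian regularization as the baseline that GraphSGAN improves on. As a rough illustration of that baseline only (not of GraphSGAN itself), the sketch below computes the standard penalty, summing `w * (f_i/sqrt(d_i) - f_j/sqrt(d_j))**2` over undirected edges. The function name, the edge-list format, and the toy path graph are assumptions made for this example.

```python
import math

def normalized_laplacian_penalty(edges, scores):
    """Sum over undirected edges of w * (f_i/sqrt(d_i) - f_j/sqrt(d_j))**2.

    edges  : list of (i, j, weight) tuples, each undirected edge listed once
    scores : list of per-node scalar predictions f
    """
    # Weighted degree of every node.
    degree = [0.0] * len(scores)
    for i, j, w in edges:
        degree[i] += w
        degree[j] += w

    # Penalize differences between degree-normalized scores of neighbors.
    total = 0.0
    for i, j, w in edges:
        diff = scores[i] / math.sqrt(degree[i]) - scores[j] / math.sqrt(degree[j])
        total += w * diff * diff
    return total

# Hypothetical toy graph: a path 0 - 1 - 2 with unit weights.
path = [(0, 1, 1.0), (1, 2, 1.0)]
smooth = normalized_laplacian_penalty(path, [1.0, math.sqrt(2.0), 1.0])
rough = normalized_laplacian_penalty(path, [1.0, 0.0, 0.0])
```

Here `smooth` is zero because the scores are proportional to the square root of the degrees, while `rough` is positive because neighboring nodes disagree.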

Simple and Lightweight Human Pose Estimation

Nov 23, 2019
Zhe Zhang, Jie Tang, Gangshan Wu

Recent research on human pose estimation has achieved significant improvement. However, most existing methods tend to pursue higher scores on benchmark datasets using complex architectures or computationally expensive models, ignoring the deployment costs in practice. In this paper, we investigate the problem of simple and lightweight human pose estimation. We first redesign a lightweight bottleneck block with two non-novel concepts: depthwise convolution and attention mechanism. Then, based on the lightweight block, we present a Lightweight Pose Network (LPN) following the architecture design principles of SimpleBaseline. The model size (#Params) of our small network LPN-50 is only 9% of SimpleBaseline(ResNet50), and the computational complexity (FLOPs) is only 11%. To realize the full potential of our LPN and get more accurate predictions, we also propose an iterative training strategy and a model-agnostic post-processing function, Beta-Soft-Argmax. We empirically demonstrate the effectiveness and efficiency of our methods on the COCO keypoint detection benchmark dataset. In addition, we show the speed advantage of our lightweight network at inference time on a non-GPU platform. Specifically, our LPN-50 achieves an AP score of 68.7 on the COCO test-dev set with only 2.7M parameters and 1.0 GFLOPs, while the inference speed is 33 FPS on an Intel i7-8700K CPU machine.
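
The abstract names a post-processing function called Beta-Soft-Argmax but does not define it. As a hedged illustration of the general idea such a name suggests, the sketch below implements a plain one-dimensional soft-argmax with a temperature parameter `beta`. This is an assumption for illustration, not the paper's exact function; `soft_argmax` and the toy heatmap are hypothetical.

```python
import math

def soft_argmax(scores, beta=1.0):
    """Differentiable stand-in for argmax: softmax-weighted mean of indices.

    beta scales the scores before the softmax; larger values concentrate
    the weight on the highest score, beta = 0 gives the plain mean index.
    """
    peak = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(beta * (s - peak)) for s in scores]
    norm = sum(weights)
    return sum(i * w for i, w in enumerate(weights)) / norm

# Hypothetical 1-D heatmap with a clear peak at index 2.
heatmap = [0.0, 0.0, 10.0, 0.0]
flat_estimate = soft_argmax(heatmap, beta=0.0)
sharp_estimate = soft_argmax(heatmap, beta=10.0)
```

With `beta=0.0` every position is weighted equally, so the estimate is the center of the array; with a large `beta` the estimate approaches the index of the peak.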


Weakly Learning to Match Experts in Online Community

May 07, 2018
Yujie Qian, Jie Tang, Kan Wu

In online question-and-answer (QA) websites like Quora, one central issue is to find (invite) users who are able to provide answers to a given question and at the same time would be unlikely to say "no" to the invitation. The challenge is how to trade off the matching degree between users' expertise and the question topic, and the likelihood of positive response from the invited users. In this paper, we formally formulate the problem and develop a weakly supervised factor graph (WeakFG) model to address the problem. The model explicitly captures expertise matching degree between questions and users. To model the likelihood that an invited user is willing to answer a specific question, we incorporate a set of correlations based on social identity theory into the WeakFG model. We use two different genres of datasets: QA-Expert and Paper-Reviewer, to validate the proposed model. Our experimental results show that the proposed model can significantly outperform (+1.5-10.7% by MAP) the state-of-the-art algorithms for matching users (experts) with community questions. We have also developed an online system to further demonstrate the advantages of the proposed method.

* IJCAI 2018 
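
The improvement quoted above is measured in MAP (mean average precision). For readers unfamiliar with the metric, a minimal sketch of the usual definition follows; the function names and the toy rankings are assumptions for this example and are not taken from the paper's evaluation code.

```python
def average_precision(ranked, relevant):
    """Average of precision@k taken at each position k that holds a relevant item."""
    hits = 0
    precision_sum = 0.0
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / position
    return precision_sum / len(relevant) if relevant else 0.0

def mean_average_precision(queries):
    """Mean of average precision over (ranked_list, relevant_set) pairs."""
    return sum(average_precision(r, rel) for r, rel in queries) / len(queries)

# Hypothetical example: two questions, each with a ranked list of candidate experts.
queries = [
    (["a", "b", "c", "d"], {"a", "c"}),  # relevant experts at ranks 1 and 3
    (["x", "y"], {"y"}),                 # relevant expert at rank 2
]
map_score = mean_average_precision(queries)
```

The first query scores (1/1 + 2/3) / 2 and the second scores 1/2, so `map_score` is their mean.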

Real-Time Robot Localization, Vision, and Speech Recognition on Nvidia Jetson TX1

May 31, 2017
Jie Tang, Yong Ren, Shaoshan Liu

Robotics systems are complex, often consisting of basic services including SLAM for localization and mapping, Convolutional Neural Networks for scene understanding, and Speech Recognition for user interaction. Meanwhile, robots are mobile and usually have tight energy constraints, so integrating these services onto an embedded platform with around 10 W of power consumption is critical to the proliferation of mobile robots. In this paper, we present a case study on integrating real-time localization, vision, and speech recognition services on a mobile SoC, the Nvidia Jetson TX1, within a power envelope of about 10 W. In addition, we explore whether offloading some of the services to a cloud platform can lead to further energy efficiency while meeting the real-time requirements.

* 12 pages, 8 figures 

Multi-Modal Bayesian Embeddings for Learning Social Knowledge Graphs

Apr 20, 2016
Zhilin Yang, Jie Tang, William Cohen

We study the extent to which online social networks can be connected to open knowledge bases. The problem is referred to as learning social knowledge graphs. We propose a multi-modal Bayesian embedding model, GenVector, to learn latent topics that generate word and network embeddings. GenVector leverages large-scale unlabeled data with embeddings and represents data of two modalities (i.e., social network users and knowledge concepts) in a shared latent topic space. Experiments on three datasets show that the proposed method clearly outperforms state-of-the-art methods. We then deploy the method on AMiner, a large-scale online academic search system with a network of 38,049,189 researchers and a knowledge base of 35,415,011 concepts. Our method significantly decreases the error rate in an online A/B test with live users.


Online PCB Defect Detector On A New PCB Defect Dataset

Feb 17, 2019
Sanli Tang, Fan He, Xiaolin Huang, Jie Yang

Previous works on PCB defect detection based on image difference and image processing techniques have already achieved promising performance. However, they sometimes fall short because of unaccounted-for defect patterns or over-sensitivity to some hyper-parameters. In this work, we design a deep model that accurately detects PCB defects from an input pair of a defect-free template and a defective tested image. A novel group pyramid pooling module is proposed to efficiently extract features over a large range of resolutions, which are merged by group to predict PCB defects of the corresponding scales. To train the deep model, a dataset named DeepPCB is established, which contains 1,500 image pairs with annotations including the positions of 6 common types of PCB defects. Experimental results validate the effectiveness and efficiency of the proposed model, which achieves 98.6% mAP at 62 FPS on the DeepPCB dataset. This dataset is now available at: https://github.com/tangsanli5201/DeepPCB.

* 4 pages, 4 figures 

Spectral Network Embedding: A Fast and Scalable Method via Sparsity

Jun 13, 2018
Jie Zhang, Yan Wang, Jie Tang, Ming Ding

Network embedding aims to learn low-dimensional representations of nodes in a network while preserving the network structure and inherent properties. It has attracted tremendous attention recently due to significant progress in downstream network learning tasks, such as node classification, link prediction, and visualization. However, most existing network embedding methods suffer from expensive computations due to the large volume of networks. In this paper, we propose a 10× to 100× faster network embedding method, called Progle, by utilizing the sparsity property of online networks and spectral analysis. In Progle, we first construct a sparse proximity matrix and train the network embedding efficiently via sparse matrix decomposition. Then we introduce a network propagation pattern via spectral analysis to incorporate local and global structure information into the embedding. In addition, this model can be generalized to quickly integrate network information into other insufficiently trained embeddings. Benefiting from sparse spectral network embedding, our experiments on four different datasets show that Progle outperforms or is comparable to state-of-the-art unsupervised comparison approaches (DeepWalk, LINE, node2vec, GraRep, and HOPE) in accuracy, while being 10× faster than the fastest word2vec-based method. Finally, we validate the scalability of Progle both on real large-scale networks and on synthetic networks of multiple scales.


Distributed Simulation Platform for Autonomous Driving

May 31, 2017
Jie Tang, Shaoshan Liu, Chao Wang, Quan Wang

Safety and reliability are the paramount requirements when developing autonomous vehicles. These requirements are guaranteed by massive functional and performance tests. Conducting these tests on real vehicles is extremely expensive and time consuming, and thus it is imperative to develop a simulation platform to perform these tasks. For simulation, we can utilize the Robot Operating System (ROS) for data playback to test newly developed algorithms. However, due to the massive amount of simulation data, performing simulation on a single machine is not practical. Hence, a high-performance distributed simulation platform is a critical piece of autonomous driving development. In this paper we present our experiences of building a production distributed autonomous driving simulation platform. This platform is built upon the Spark distributed framework, for distributed computing management, and ROS, for data playback simulations.

* 12 pages, 7 figures 

PosNeg-Balanced Anchors with Aligned Features for Single-Shot Object Detection

Aug 09, 2019
Qiankun Tang, Shice Liu, Jie Li, Yu Hu

We introduce a novel single-shot object detector to ease the imbalance of foreground-background class by suppressing the easy negatives while increasing the positives. To achieve this, we propose an Anchor Promotion Module (APM) which predicts the probability of each anchor as positive and adjusts their initial locations and shapes to promote both the quality and quantity of positive anchors. In addition, we design an efficient Feature Alignment Module (FAM) to extract aligned features for fitting the promoted anchors with the help of both the location and shape transformation information from the APM. We assemble the two proposed modules to the backbone of VGG-16 and ResNet-101 network with an encoder-decoder architecture. Extensive experiments on MS COCO well demonstrate our model performs competitively with alternative methods (40.0% mAP on the test-dev set) and runs faster (28.6 fps).

* Submitted to a conference, under review 

Graph Adversarial Training: Dynamically Regularizing Based on Graph Structure

Feb 20, 2019
Fuli Feng, Xiangnan He, Jie Tang, Tat-Seng Chua

Recent efforts show that neural networks are vulnerable to small but intentional perturbations on input features in visual classification tasks. Due to the additional consideration of connections between examples (e.g., articles with a citation link tend to be in the same class), graph neural networks could be more sensitive to the perturbations, since the perturbations from connected examples exacerbate the impact on a target example. Adversarial Training (AT), a dynamic regularization technique, can resist the worst-case perturbations on input features and is a promising choice to improve model robustness and generalization. However, existing AT methods focus on standard classification and are less effective when training models on graphs, since they do not model the impact from connected examples. In this work, we explore adversarial training on graphs, aiming to improve the robustness and generalization of models learned on graphs. We propose Graph Adversarial Training (GAT), which takes the impact from connected examples into account when learning to construct and resist perturbations. We give a general formulation of GAT, which can be seen as a dynamic regularization scheme based on the graph structure. To demonstrate the utility of GAT, we employ it on a state-of-the-art graph neural network model, the Graph Convolutional Network (GCN). We conduct experiments on two citation graphs (Citeseer and Cora) and a knowledge graph (NELL), verifying the effectiveness of GAT, which outperforms normal training on GCN by 4.51% in node classification accuracy. Code will be released upon acceptance.


Switchable Adhesion Actuator for Amphibious Climbing Soft Robot

Feb 05, 2019
Yichao Tang, Qiuting Zhang, Gaojian Lin, Jie Yin

Climbing soft robots are of tremendous interest in both science and engineering due to their potential applications in intelligent surveillance, inspection, maintenance, and detection in environments away from the ground. The challenge lies in the design of a fast, robust, switchable adhesion actuator that can easily attach to and detach from vertical surfaces. Here, we propose a new design of a pneumatically actuated bioinspired soft adhesion actuator that works both on ground and under water. It is composed of extremely soft bilayer structures with an embedded spiral pneumatic channel resting on top of a base layer with a cavity. Rather than directly pumping air out of the cavity for suction, as is traditionally done in hard polymer-based adhesion actuators, we inflate air into the top spiral channel so that it deforms into a stable 3D domed shape, achieving negative pressure in the cavity. The characterization of the maximum shear adhesion force of the proposed soft adhesion actuator shows strong and rapid reversible adhesion on multiple types of smooth and semi-smooth surfaces. Based on the switchable adhesion actuator, we design and fabricate a novel load-carrying amphibious climbing soft robot (ACSR) by combining it with a soft bending actuator. We demonstrate that it can operate on a wide range of foreign horizontal and vertical surfaces, including dry, wet, slippery, smooth, and semi-smooth ones, on ground and also under water with a certain load-carrying capability. We show that the vertical climbing speed can reach about 286 mm/min (1.6 body lengths/min) while carrying an object of over 200 g (over 5 times the weight of the ACSR itself) during climbing on ground and under water. This research could largely push the boundaries of soft robot capabilities and multifunctionality in window cleaning and underwater inspection in harsh environments.


Adversarial Attack Type I: Generating False Positives

Sep 03, 2018
Sanli Tang, Xiaolin Huang, Mingjian Chen, Jie Yang

False positive and false negative rates are equally important for evaluating the performance of a classifier. Adversarial examples that increase the false negative rate have been studied in recent years. However, harming a classifier by increasing the false positive rate remains almost unexplored, since it is much more difficult to generate a new and meaningful positive than a negative. To generate false positives, a supervised generative framework is proposed in this paper. Experimental results show that our method is practical and effective in generating such adversarial examples on large-scale image datasets.


Expert Finding in Community Question Answering: A Review

Apr 21, 2018
Sha Yuan, Yu Zhang, Jie Tang, Juan Bautista Cabotà

The recent rapid development of Community Question Answering (CQA) satisfies users' quest for professional and personal knowledge about anything. In CQA, one central issue is to find users with the expertise and willingness to answer the given questions. Expert finding in CQA often exhibits very different challenges compared to traditional methods. Sparse data and new features violate fundamental assumptions of traditional recommendation systems. This paper focuses on reviewing and categorizing the current progress on expert finding in CQA. We classify all the existing solutions into four different categories: matrix factorization based models (MF-based models), gradient boosting tree based models (GBT-based models), deep learning based models (DL-based models) and ranking based models (R-based models). We find that MF-based models outperform other categories of models in the field of expert finding in CQA. Moreover, we use innovative diagrams to clarify several important concepts of ensemble learning, and find that ensemble models with several specific single models can further boost the performance. Further, we compare the performance of different models on different types of matching tasks, including text vs. text, graph vs. text, audio vs. text and video vs. text. The results can help with model selection for expert finding in practice. Finally, we explore some potential future issues in expert finding research in CQA.


CAAD: Computer Architecture for Autonomous Driving

Feb 07, 2017
Shaoshan Liu, Jie Tang, Zhe Zhang, Jean-Luc Gaudiot

We describe the computing tasks involved in autonomous driving and examine existing autonomous driving computing platform implementations. To enable autonomous driving, the computing stack needs to simultaneously provide high performance, low power consumption, and low thermal dissipation, at low cost. We discuss possible approaches to designing computing platforms that will meet these needs.

* 7 pages, 4 figures, accepted by IEEE Computer Magazine 

Cognitive Knowledge Graph Reasoning for One-shot Relational Learning

Jun 13, 2019
Zhengxiao Du, Chang Zhou, Ming Ding, Hongxia Yang, Jie Tang

Inferring new facts from existing knowledge graphs (KG) with explainable reasoning processes is a significant problem and has received much attention recently. However, few studies have focused on relation types unseen in the original KG, given only one or a few instances for training. To bridge this gap, we propose CogKR for one-shot KG reasoning. The one-shot relational learning problem is tackled through two modules: the summary module summarizes the underlying relationship of the given instances, based on which the reasoning module infers the correct answers. Motivated by the dual process theory in cognitive science, in the reasoning module, a cognitive graph is built by iteratively coordinating retrieval (System 1, collecting relevant evidence intuitively) and reasoning (System 2, conducting relational reasoning over collected information). The structural information offered by the cognitive graph enables our model to aggregate pieces of evidence from multiple reasoning paths and explain the reasoning process graphically. Experiments show that CogKR substantially outperforms previous state-of-the-art models on one-shot KG reasoning benchmarks, with relative improvements of 24.3%-29.7% on MRR. The source code is available at https://github.com/THUDM/CogKR.
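
The gains above are reported as relative improvements on MRR (mean reciprocal rank). A minimal sketch of how both quantities are conventionally computed is given below; the function names and the sample numbers are made up for illustration and are not results from the paper.

```python
def mean_reciprocal_rank(ranks):
    """Mean of 1/rank, where rank is the 1-based position of the correct answer."""
    return sum(1.0 / r for r in ranks) / len(ranks)

def relative_improvement(baseline, new):
    """Fractional gain of `new` over `baseline` (0.2 means 20% better)."""
    return (new - baseline) / baseline

# Hypothetical ranks of the correct entity for three test queries.
mrr = mean_reciprocal_rank([1, 2, 4])

# Hypothetical baseline and new MRR values, not taken from the paper.
gain = relative_improvement(0.5, 0.6)
```

Note that a relative improvement is a ratio of scores, so a 20% relative gain from 0.5 corresponds to an absolute gain of 0.1.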


Cognitive Graph for Multi-Hop Reading Comprehension at Scale

Jun 04, 2019
Ming Ding, Chang Zhou, Qibin Chen, Hongxia Yang, Jie Tang

We propose a new CogQA framework for multi-hop question answering in web-scale documents. Inspired by the dual process theory in cognitive science, the framework gradually builds a cognitive graph in an iterative process by coordinating an implicit extraction module (System 1) and an explicit reasoning module (System 2). While giving accurate answers, our framework further provides explainable reasoning paths. Specifically, our implementation based on BERT and graph neural network efficiently handles millions of documents for multi-hop reasoning questions in the HotpotQA fullwiki dataset, achieving a winning joint F1 score of 34.9 on the leaderboard, compared to 23.6 of the best competitor.

* ACL 2019 

Sequential Scenario-Specific Meta Learner for Online Recommendation

Jun 02, 2019
Zhengxiao Du, Xiaowei Wang, Hongxia Yang, Jingren Zhou, Jie Tang

Cold-start problems are long-standing challenges for practical recommendations. Most existing recommendation algorithms rely on extensive observed data and are brittle in recommendation scenarios with few interactions. This paper addresses such problems using few-shot learning and meta learning. Our approach is based on the insight that good generalization from a few examples relies on both a generic model initialization and an effective strategy for adapting this model to newly arising tasks. To accomplish this, we combine scenario-specific learning with model-agnostic sequential meta-learning and unify them into an integrated end-to-end framework, namely the Scenario-specific Sequential Meta learner (or s^2 meta). By doing so, our meta-learner produces a generic initial model by aggregating contextual information from a variety of prediction tasks while effectively adapting to specific tasks by leveraging learning-to-learn knowledge. Extensive experiments on various real-world datasets demonstrate that our proposed model can achieve significant gains over the state of the art for cold-start problems in online recommendation. It is deployed in the Guess You Like session on the front page of Mobile Taobao.

* Accepted to KDD 2019 

End-to-end Learning for Graph Decomposition

Dec 23, 2018
Jie Song, Bjoern Andres, Michael Black, Otmar Hilliges, Siyu Tang

We propose a novel end-to-end trainable framework for the graph decomposition problem. The minimum cost multicut problem is first converted to an unconstrained binary cubic formulation where cycle consistency constraints are incorporated into the objective function. The new optimization problem can be viewed as a Conditional Random Field (CRF) in which the random variables are associated with the binary edge labels of the initial graph and the hard constraints are introduced in the CRF as high-order potentials. The parameters of a standard Neural Network and the fully differentiable CRF are optimized in an end-to-end manner. Furthermore, our method utilizes the cycle constraints as meta-supervisory signals during the learning of the deep feature representations by taking the dependencies between the output random variables into account. We present analyses of the end-to-end learned representations, showing the impact of the joint training, on the task of clustering images of MNIST. We also validate the effectiveness of our approach both for the feature learning and the final clustering on the challenging task of real-world multi-person pose estimation.
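
The abstract describes folding cycle consistency constraints into a binary cubic objective. As a hedged illustration of why a cubic term is a natural fit, the sketch below writes the well-known multicut cycle condition for a single triangle (a valid decomposition never cuts exactly one edge of a cycle) as a degree-3 polynomial in the binary edge labels. The exact objective used in the paper may differ; `triangle_penalty` and the enumeration are assumptions for this example.

```python
from itertools import product

def triangle_penalty(a, b, c):
    """Cubic penalty on binary edge labels (1 = cut, 0 = joined).

    Evaluates to 1 exactly when one edge of the triangle is cut and the
    other two are joined, which is the inconsistent case; 0 otherwise.
    """
    return a * (1 - b) * (1 - c) + b * (1 - a) * (1 - c) + c * (1 - a) * (1 - b)

# Enumerate all 8 labelings of one triangle and record which are penalized.
violations = {
    labels: triangle_penalty(*labels) for labels in product((0, 1), repeat=3)
}
inconsistent = sorted(labels for labels, p in violations.items() if p == 1)
```

Only the three labelings with a single cut edge are penalized, so adding such terms to an objective discourages inconsistent decompositions without explicit hard constraints.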

