Models, code, and papers for "Shu Liu":

Adversarial FDI Attack against AC State Estimation with ANN

Jun 26, 2019
Tian Liu, Tao Shu

Artificial neural networks (ANNs) provide superior accuracy for nonlinear alternating current (AC) state estimation (SE) in the smart grid over traditional methods. However, research has shown that ANNs can be easily fooled by adversarial examples. In this paper, we initiate a new study of adversarial false data injection (FDI) attacks against AC SE with ANN: by injecting a deliberate attack vector into measurements, the attacker can degrade the accuracy of ANN SE while remaining undetected. We propose a population-based algorithm, differential evolution (DE), and a gradient-based algorithm, sequential least squares programming (SLSQP), to generate attack vectors. The performance of these algorithms is evaluated through simulations on IEEE 9-bus, 14-bus and 30-bus systems under various attack scenarios. Simulation results show that DE is more effective than SLSQP in all simulation cases. The attack examples generated by the DE algorithm successfully degrade the ANN SE accuracy with high probability.


Controlled CNN-based Sequence Labeling for Aspect Extraction

May 29, 2019
Lei Shu, Hu Xu, Bing Liu

One key task of fine-grained sentiment analysis on reviews is to extract aspects or features that users have expressed opinions on. This paper focuses on supervised aspect extraction using a modified CNN called controlled CNN (Ctrl). The modified CNN has two types of control modules. Through asynchronous parameter updating, it prevents over-fitting and boosts CNN's performance significantly. This model achieves state-of-the-art results on standard aspect extraction datasets. To the best of our knowledge, this is the first paper to apply control modules to aspect extraction.


Deep Learning with Inaccurate Training Data for Image Restoration

Nov 18, 2018
Bolin Liu, Xiao Shu, Xiaolin Wu

In many applications of deep learning, particularly those in image restoration, it is either very difficult, prohibitively expensive, or outright impossible to obtain paired training data precisely as in the real world. In such cases, one is forced to use synthesized paired data to train the deep convolutional neural network (DCNN). However, due to the unavoidable generalization error in statistical learning, the synthetically trained DCNN often performs poorly on real world data. To overcome this problem, we propose a new general training method that can compensate for, to a large extent, the generalization errors of synthetically trained DCNNs.


Demoiréing of Camera-Captured Screen Images Using Deep Convolutional Neural Network

Apr 11, 2018
Bolin Liu, Xiao Shu, Xiaolin Wu

Taking photos of optoelectronic displays is a direct and spontaneous way of transferring data and keeping records, and it is widely practiced. However, due to the analog signal interference between the pixel grids of the display screen and the camera sensor array, objectionable moiré (alias) patterns appear in captured screen images. As the moiré patterns are structured and highly variant, they are difficult to remove completely without affecting the underlying latent image. In this paper, we propose a deep convolutional neural network (DCNN) approach for demoiréing screen photos. The proposed DCNN consists of a coarse-scale network and a fine-scale network. In the coarse-scale network, the input image is first downsampled and then processed by stacked residual blocks to remove the moiré artifacts. After that, the fine-scale network upsamples the demoiréd low-resolution image back to the original resolution. Extensive experimental results demonstrate that the proposed technique can efficiently remove the moiré patterns from camera-acquired screen images; the new technique outperforms the existing ones.


Learning-Based Dequantization For Image Restoration Against Extremely Poor Illumination

Mar 20, 2018
Chang Liu, Xiaolin Wu, Xiao Shu

All existing image enhancement methods, such as HDR tone mapping, cannot recover A/D quantization losses due to insufficient or excessive lighting (the underflow and overflow problems). The loss of image details due to A/D quantization is complete and cannot be recovered by traditional image processing methods, but the modern data-driven machine learning approach offers a much-needed cure to the problem. In this work we propose a novel approach to restore and enhance images acquired in low and uneven lighting. First, the poor illumination is algorithmically compensated by emulating the effects of artificial supplementary lighting. Then a DCNN trained using only synthetic data recovers the detail lost to quantization.


Unseen Class Discovery in Open-world Classification

Jan 17, 2018
Lei Shu, Hu Xu, Bing Liu

This paper concerns open-world classification, where the classifier not only needs to classify test examples into seen classes that have appeared in training but also reject examples from unseen or novel classes that have not appeared in training. Specifically, this paper focuses on discovering the hidden unseen classes of the rejected examples. Clearly, without prior knowledge this is difficult. However, we do have the data from the seen training classes, which can tell us what kind of similarity/difference is expected for examples from the same class or from different classes. It is reasonable to assume that this knowledge can be transferred to the rejected examples and used to discover the hidden unseen classes in them. This paper aims to solve this problem. It first proposes a joint open classification model with a sub-model for classifying whether a pair of examples belongs to the same or different classes. This sub-model can serve as a distance function for clustering to discover the hidden classes of the rejected examples. Experimental results show that the proposed model is highly promising.


DOC: Deep Open Classification of Text Documents

Sep 25, 2017
Lei Shu, Hu Xu, Bing Liu

Traditional supervised learning makes the closed-world assumption that the classes that appear in the test data must have appeared in training. This also applies to text learning or text classification. As learning is used increasingly in dynamic open environments where some new/test documents may not belong to any of the training classes, identifying these novel documents during classification presents an important problem. This problem is called open-world classification or open classification. This paper proposes a novel deep learning based approach. It outperforms existing state-of-the-art techniques dramatically.

* Accepted at EMNLP 2017

Fast Screening Algorithm for Rotation and Scale Invariant Template Matching

Jul 19, 2017
Bolin Liu, Xiao Shu, Xiaolin Wu

This paper presents a generic pre-processor for expediting conventional template matching techniques. Instead of locating the best-matched patch in the reference image to a query template via exhaustive search, the proposed algorithm rules out regions with no possible matches with minimal computational effort. While working on simple patch features, such as mean, variance and gradient, the fast pre-screening is highly discriminative. Its computational efficiency is gained by using a novel octagonal-star-shaped template and the inclusion-exclusion principle to extract and compare patch features. Moreover, it can handle arbitrary rotation and scaling of reference images effectively. Extensive experiments demonstrate that the proposed algorithm greatly reduces the search space while never missing the best match.
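
To illustrate how the inclusion-exclusion principle can make patch features cheap to extract, here is a minimal sketch using a summed-area table over rectangular patches. It is not the paper's octagonal-star template; the rectangular patch shape and the function names `summed_area_table`, `patch_sum`, and `patch_mean` are assumptions made for illustration only.

```python
def summed_area_table(image):
    """Build a (h+1) x (w+1) table where sat[y][x] is the sum of image[:y][:x]."""
    h, w = len(image), len(image[0])
    sat = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        for x in range(w):
            # Inclusion-exclusion: add the two overlapping prefixes,
            # then subtract the region counted twice.
            sat[y + 1][x + 1] = (image[y][x] + sat[y][x + 1]
                                 + sat[y + 1][x] - sat[y][x])
    return sat


def patch_sum(sat, top, left, height, width):
    """Sum of a rectangular patch from four table lookups."""
    bottom, right = top + height, left + width
    return (sat[bottom][right] - sat[top][right]
            - sat[bottom][left] + sat[top][left])


def patch_mean(sat, top, left, height, width):
    """Mean intensity of a rectangular patch (hypothetical screening feature)."""
    return patch_sum(sat, top, left, height, width) / (height * width)


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
sat = summed_area_table(image)
print(patch_mean(sat, 1, 1, 2, 2))
```

After the table is built once, each patch mean costs four lookups regardless of patch size, which is the kind of saving that makes a pre-screening pass cheap relative to full matching.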


Lifelong Learning CRF for Supervised Aspect Extraction

Apr 29, 2017
Lei Shu, Hu Xu, Bing Liu

This paper makes a focused contribution to supervised aspect extraction. It shows that if the system has performed aspect extraction from many past domains and retained their results as knowledge, Conditional Random Fields (CRF) can leverage this knowledge in a lifelong learning manner to extract in a new domain markedly better than the traditional CRF without using this prior knowledge. The key innovation is that even after CRF training, the model can still improve its extraction with experiences in its applications.

* Accepted at ACL 2017. arXiv admin note: text overlap with arXiv:1612.07940 

Independence Promoted Graph Disentangled Networks

Nov 26, 2019
Yanbei Liu, Xiao Wang, Shu Wu, Zhitao Xiao

We address the problem of disentangled representation learning with independent latent factors in graph convolutional networks (GCNs). Current methods usually learn a node representation by describing its neighborhood as a perceptual whole in a holistic manner, ignoring the entanglement of the latent factors. However, a real-world graph is formed by the complex interaction of many latent factors (e.g., the same hobby, education or work in a social network), and little effort has been made toward exploring disentangled representations in GCNs. In this paper, we propose a novel Independence Promoted Graph Disentangled Network (IPGDN) to learn disentangled node representations while enhancing the independence among node representations. In particular, we first present disentangled representation learning by a neighborhood routing mechanism, and then employ the Hilbert-Schmidt Independence Criterion (HSIC) to enforce independence between the latent representations, which is effectively integrated into a graph convolutional framework as a regularizer at the output layer. Experimental studies on real-world graphs validate our model and demonstrate that our algorithms outperform the state of the art by a wide margin in different network applications, including semi-supervised graph classification, graph clustering and graph visualization.
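
For readers unfamiliar with HSIC, the following is a minimal sketch of the standard biased empirical estimator, trace(KHLH) / (n - 1)^2, where K and L are kernel Gram matrices and H is the centering matrix. The Gaussian kernel, the bandwidth `sigma`, and all function names are assumptions for illustration; the abstract does not state which kernel or estimator the authors use.

```python
import math


def gaussian_gram(xs, sigma=1.0):
    """Gram matrix of a Gaussian kernel over a list of vectors (assumed kernel)."""
    n = len(xs)
    return [[math.exp(-sum((a - b) ** 2 for a, b in zip(xs[i], xs[j]))
                      / (2 * sigma ** 2))
             for j in range(n)] for i in range(n)]


def center(K):
    """Return H K H, where H = I - (1/n) 11^T is the centering matrix."""
    n = len(K)
    row_mean = [sum(row) / n for row in K]
    col_mean = [sum(K[i][j] for i in range(n)) / n for j in range(n)]
    total = sum(row_mean) / n
    return [[K[i][j] - row_mean[i] - col_mean[j] + total
             for j in range(n)] for i in range(n)]


def hsic(xs, ys, sigma=1.0):
    """Biased empirical HSIC; values near 0 suggest independence."""
    n = len(xs)
    Kc = center(gaussian_gram(xs, sigma))
    L = gaussian_gram(ys, sigma)
    # trace(K H L H) = trace((H K H) L) = sum_ij (H K H)_ij * L_ij (L symmetric)
    return sum(Kc[i][j] * L[i][j]
               for i in range(n) for j in range(n)) / (n - 1) ** 2


xs = [[0.0], [1.0], [2.0], [3.0]]
constant = [[0.0]] * 4           # carries no information about xs
print(hsic(xs, xs), hsic(xs, constant))
```

Used as a regularizer, a term of this form would be added to the training loss so that pairs of latent factors are pushed toward a small HSIC value.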


Evaluating Semantic Rationality of a Sentence: A Sememe-Word-Matching Neural Network based on HowNet

Sep 11, 2018
Shu Liu, Jingjing Xu, Xuancheng Ren, Xu Sun

Automatic evaluation of semantic rationality is an important yet challenging task, and current automatic techniques cannot reliably identify whether a sentence is semantically rational. Methods based on language models measure a sentence not by rationality but by commonness. Methods based on similarity to human-written sentences fail if human-written references are not available. In this paper, we propose a novel model called Sememe-Word-Matching Neural Network (SWM-NN) to tackle semantic rationality evaluation by taking advantage of the sememe knowledge base HowNet. The advantage is that our model can utilize a proper combination of sememes to represent the fine-grained semantic meanings of a word within specific contexts. We use the fine-grained semantic representation to help the model learn the semantic dependency among words. To evaluate the effectiveness of the proposed model, we build a large-scale rationality evaluation dataset. Experimental results on this dataset show that the proposed model outperforms the competitive baselines with a 5.4% improvement in accuracy.


Learning Stochastic Behaviour of Aggregate Data

Feb 10, 2020
Shaojun Ma, Shu Liu, Hongyuan Zha, Haomin Zhou

Learning the nonlinear dynamics of aggregate data is a challenging problem since the full trajectory of each individual is not observable; that is, an individual observed at one time point may not be observed at the next time point. One class of existing work investigates such dynamics by requiring complete longitudinal individual-level trajectories. However, in most practical applications, this requirement is unrealistic due to technical limitations, experimental costs and/or privacy issues. The other class of methods learns the dynamics by regarding aggregate behaviour as a stochastic process with or without hidden variables. The performance of such methods may be restricted by complex dynamics, high dimensions and computation costs. In this paper, we propose a new weak-form-based framework to study the hidden dynamics of aggregate data via the Wasserstein generative adversarial network (WGAN) and the Fokker-Planck equation (FPE). Our model falls into the second class of methods, with simple structure and computation. We demonstrate our approach on a series of synthetic and real-world datasets.


GridMask Data Augmentation

Jan 14, 2020
Pengguang Chen, Shu Liu, Hengshuang Zhao, Jiaya Jia

We propose a novel data augmentation method, GridMask, in this paper. It utilizes information removal to achieve state-of-the-art results in a variety of computer vision tasks. We analyze the requirement of information dropping. Then we show the limitations of existing information dropping algorithms and propose our structured method, which is simple yet very effective. It is based on the deletion of regions of the input image. Our extensive experiments show that our method outperforms the latest AutoAugment, which is far more computationally expensive due to the use of reinforcement learning to find the best policies. On the ImageNet dataset for recognition, the COCO2017 dataset for object detection, and the Cityscapes dataset for semantic segmentation, our method notably improves performance over baselines. The extensive experiments demonstrate the effectiveness and generality of the new method.
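
The idea of deleting regions of the input image in a structured way can be sketched as follows. This is only an illustration of a regular grid of dropped squares, not the authors' implementation; the parameter names `unit`, `hole`, `dx`, and `dy` and their defaults are hypothetical and are not taken from the paper.

```python
def grid_mask(height, width, unit=4, hole=2, dx=0, dy=0):
    """Return a height x width mask of 0/1 values; 0 marks a deleted pixel.

    Each unit x unit cell has a hole x hole square zeroed out, and (dx, dy)
    shifts the whole grid (all parameters are illustrative assumptions).
    """
    mask = []
    for y in range(height):
        row = []
        for x in range(width):
            in_hole = (y + dy) % unit < hole and (x + dx) % unit < hole
            row.append(0 if in_hole else 1)
        mask.append(row)
    return mask


def apply_mask(image, mask):
    """Multiply a 2D image elementwise by the mask."""
    return [[pixel * keep for pixel, keep in zip(img_row, mask_row)]
            for img_row, mask_row in zip(image, mask)]


image = [[5] * 8 for _ in range(8)]
mask = grid_mask(8, 8, unit=4, hole=2)
augmented = apply_mask(image, mask)
for row in augmented:
    print(row)
```

In a training pipeline, the grid offsets and cell size would typically be randomized per image so that the network does not see the same deleted regions every time.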


DSGN: Deep Stereo Geometry Network for 3D Object Detection

Jan 10, 2020
Yilun Chen, Shu Liu, Xiaoyong Shen, Jiaya Jia

Most state-of-the-art 3D object detectors heavily rely on LiDAR sensors and there remains a large gap in terms of performance between image-based and LiDAR-based methods, caused by inappropriate representation for the prediction in 3D scenarios. Our method, called Deep Stereo Geometry Network (DSGN), reduces this gap significantly by detecting 3D objects on a differentiable volumetric representation -- 3D geometric volume, which effectively encodes 3D geometric structure for 3D regular space. With this representation, we learn depth information and semantic cues simultaneously. For the first time, we provide a simple and effective one-stage stereo-based 3D detection pipeline that jointly estimates the depth and detects 3D objects in an end-to-end learning manner. Our approach outperforms previous stereo-based 3D detectors (about 10 higher in terms of AP) and even achieves comparable performance with a few LiDAR-based methods on the KITTI 3D object detection leaderboard. Code will be made publicly available.


Mining Disinformation and Fake News: Concepts, Methods, and Recent Advancements

Jan 02, 2020
Kai Shu, Suhang Wang, Dongwon Lee, Huan Liu

In recent years, disinformation, including fake news, has become a global phenomenon due to its explosive growth, particularly on social media. The wide spread of disinformation and fake news can cause detrimental societal effects. Despite the recent progress in detecting disinformation and fake news, it is still non-trivial due to its complexity, diversity, multi-modality, and costs of fact-checking or annotation. The goal of this chapter is to pave the way for appreciating the challenges and advancements via: (1) introducing the types of information disorder on social media and examining their differences and connections; (2) describing important and emerging tasks to combat disinformation for characterization, detection and attribution; and (3) discussing a weak supervision approach to detect disinformation with limited labeled data. We then provide an overview of the chapters in this book that represent the recent advancements in three related parts: (1) user engagements in the dissemination of information disorder; (2) techniques on detecting and mitigating disinformation; and (3) trending issues such as ethics, blockchain, clickbaits, etc. We hope this book will be a convenient entry point for researchers, practitioners, and students to understand the problems and challenges, learn state-of-the-art solutions for their specific needs, and quickly identify new research problems in their domains.

* Submitted as an introductory chapter for the edited book on "Fake News, Disinformation, and Misinformation in Social Media- Emerging Research Challenges and Opportunities", Springer Press 

Identifying Model Weakness with Adversarial Examiner

Nov 25, 2019
Michelle Shu, Chenxi Liu, Weichao Qiu, Alan Yuille

Machine learning models are usually evaluated according to the average case performance on the test set. However, this is not always ideal, because in some sensitive domains (e.g. autonomous driving), it is the worst case performance that matters more. In this paper, we are interested in systematic exploration of the input data space to identify the weakness of the model to be evaluated. We propose to use an adversarial examiner in the testing stage. Different from the existing strategy to always give the same (distribution of) test data, the adversarial examiner will dynamically select the next test data to hand out based on the testing history so far, with the goal being to undermine the model's performance. This sequence of test data not only helps us understand the current model, but also serves as constructive feedback to help improve the model in the next iteration. We conduct experiments on ShapeNet object classification. We show that our adversarial examiner can successfully put more emphasis on the weakness of the model, preventing performance estimates from being overly optimistic.

* To appear in AAAI-20 

Deep causal representation learning for unsupervised domain adaptation

Oct 28, 2019
Raha Moraffah, Kai Shu, Adrienne Raglin, Huan Liu

Studies show that the representations learned by deep neural networks can be transferred to similar prediction tasks in other domains for which we do not have enough labeled data. However, as we transition to higher layers in the model, the representations become more task-specific and less generalizable. Recent research on deep domain adaptation proposed to mitigate this problem by forcing the deep model to learn more transferable feature representations across domains. This is achieved by incorporating domain adaptation methods into the deep learning pipeline. The majority of existing models learn transferable feature representations that are highly correlated with the outcome. However, correlations are not always transferable. In this paper, we propose a novel deep causal representation learning framework for unsupervised domain adaptation, in which we learn domain-invariant causal representations of the input from the source domain. We simulate a virtual target domain using reweighted samples from the source domain and estimate the causal effect of features on the outcomes. The extensive comparative study demonstrates the strengths of the proposed model for unsupervised domain adaptation via causal representations.


Modeling Multi-Action Policy for Task-Oriented Dialogues

Aug 30, 2019
Lei Shu, Hu Xu, Bing Liu, Piero Molino

Dialogue management (DM) plays a key role in the quality of the interaction with the user in a task-oriented dialogue system. In most existing approaches, the agent predicts only one DM policy action per turn. This significantly limits the expressive power of the conversational agent and introduces unwanted turns of interactions that may challenge users' patience. Longer conversations also lead to more errors and the system needs to be more robust to handle them. In this paper, we compare the performance of several models on the task of predicting multiple acts for each turn. A novel policy model is proposed based on a recurrent cell called gated Continue-Act-Slots (gCAS) that overcomes the limitations of the existing models. Experimental results show that gCAS outperforms other approaches. The code is available at https://leishu02.github.io/


Fast Point R-CNN

Aug 16, 2019
Yilun Chen, Shu Liu, Xiaoyong Shen, Jiaya Jia

We present a unified, efficient and effective framework for point-cloud based 3D object detection. Our two-stage approach utilizes both voxel representation and raw point cloud data to exploit their respective advantages. The first-stage network, with voxel representation as input, consists only of light convolutional operations, producing a small number of high-quality initial predictions. The coordinates and indexed convolutional features of each point in the initial predictions are effectively fused with an attention mechanism, preserving both accurate localization and context information. The second stage works on interior points with their fused features to further refine the predictions. Our method is evaluated on the KITTI dataset, in terms of both 3D and Bird's Eye View (BEV) detection, and achieves state-of-the-art results at a detection rate of 15 FPS.

