Models, code, and papers for "Lei Shu":

Controlled CNN-based Sequence Labeling for Aspect Extraction

May 29, 2019
Lei Shu, Hu Xu, Bing Liu

One key task of fine-grained sentiment analysis on reviews is to extract aspects or features that users have expressed opinions on. This paper focuses on supervised aspect extraction using a modified CNN called controlled CNN (Ctrl). The modified CNN has two types of control modules. Through asynchronous parameter updating, it prevents over-fitting and boosts CNN's performance significantly. This model achieves state-of-the-art results on standard aspect extraction datasets. To the best of our knowledge, this is the first paper to apply control modules to aspect extraction.


Unseen Class Discovery in Open-world Classification

Jan 17, 2018
Lei Shu, Hu Xu, Bing Liu

This paper concerns open-world classification, where the classifier not only needs to classify test examples into seen classes that have appeared in training but also reject examples from unseen or novel classes that have not appeared in training. Specifically, this paper focuses on discovering the hidden unseen classes of the rejected examples. Clearly, without prior knowledge this is difficult. However, we do have the data from the seen training classes, which can tell us what kind of similarity/difference is expected for examples from the same class or from different classes. It is reasonable to assume that this knowledge can be transferred to the rejected examples and used to discover the hidden unseen classes in them. This paper aims to solve this problem. It first proposes a joint open classification model with a sub-model for classifying whether a pair of examples belongs to the same or different classes. This sub-model can serve as a distance function for clustering to discover the hidden classes of the rejected examples. Experimental results show that the proposed model is highly promising.
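As a rough illustration of the clustering step described above, the sketch below treats a pairwise same-class probability as a similarity and groups rejected examples into connected components. The function `toy_same_class_prob`, the threshold value, and the one-dimensional toy data are assumptions made for this example; they stand in for a learned sub-model and are not the paper's implementation.

```python
from typing import Callable, Dict, List, Sequence


def cluster_rejected(
    examples: Sequence[float],
    same_class_prob: Callable[[float, float], float],
    threshold: float = 0.5,
) -> List[List[int]]:
    """Group example indices whose distance 1 - p(same class) is below threshold.

    Clusters are the connected components of the "close enough" graph
    (single linkage), tracked with a small union-find structure.
    """
    parent = list(range(len(examples)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(examples)):
        for j in range(i + 1, len(examples)):
            distance = 1.0 - same_class_prob(examples[i], examples[j])
            if distance < threshold:
                parent[find(i)] = find(j)  # merge the two components

    clusters: Dict[int, List[int]] = {}
    for i in range(len(examples)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


def toy_same_class_prob(a: float, b: float) -> float:
    # Hypothetical stand-in for a learned pairwise classifier.
    return 1.0 if abs(a - b) < 1.0 else 0.0


rejected = [0.1, 0.4, 5.0, 5.3, 9.9]
print(cluster_rejected(rejected, toy_same_class_prob))
```

Any clustering algorithm that accepts a custom distance could replace the connected-components pass; it is used here only because it needs no extra dependencies.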


DOC: Deep Open Classification of Text Documents

Sep 25, 2017
Lei Shu, Hu Xu, Bing Liu

Traditional supervised learning makes the closed-world assumption that the classes that appear in the test data must have appeared in training. This also applies to text learning or text classification. As learning is used increasingly in dynamic open environments where some new/test documents may not belong to any of the training classes, identifying these novel documents during classification presents an important problem. This problem is called open-world classification or open classification. This paper proposes a novel deep-learning-based approach. It outperforms existing state-of-the-art techniques dramatically.

* Accepted at EMNLP 2017

Lifelong Learning CRF for Supervised Aspect Extraction

Apr 29, 2017
Lei Shu, Hu Xu, Bing Liu

This paper makes a focused contribution to supervised aspect extraction. It shows that if the system has performed aspect extraction from many past domains and retained their results as knowledge, Conditional Random Fields (CRF) can leverage this knowledge in a lifelong learning manner to extract in a new domain markedly better than the traditional CRF without using this prior knowledge. The key innovation is that even after CRF training, the model can still improve its extraction with experiences in its applications.

* Accepted at ACL 2017. arXiv admin note: text overlap with arXiv:1612.07940 

Supervised Complementary Entity Recognition with Augmented Key-value Pairs of Knowledge

May 29, 2017
Hu Xu, Lei Shu, Philip S. Yu

Extracting opinion targets is an important task in sentiment analysis on product reviews, and complementary entities (products) are one important type of opinion target that may work together with the reviewed product. In this paper, we address the problem of Complementary Entity Recognition (CER) as a supervised sequence labeling task with the capability of expanding domain knowledge as key-value pairs from unlabeled reviews, by automatically learning and enhancing knowledge-based features. We use Conditional Random Field (CRF) as the base learner and augment CRF with knowledge-based features (called the Knowledge-based CRF or KCRF for short). We conduct experiments to show that KCRF effectively improves the performance of the supervised CER task.


Modeling Multi-Action Policy for Task-Oriented Dialogues

Aug 30, 2019
Lei Shu, Hu Xu, Bing Liu, Piero Molino

Dialogue management (DM) plays a key role in the quality of the interaction with the user in a task-oriented dialogue system. In most existing approaches, the agent predicts only one DM policy action per turn. This significantly limits the expressive power of the conversational agent and introduces unwanted turns of interactions that may challenge users' patience. Longer conversations also lead to more errors and the system needs to be more robust to handle them. In this paper, we compare the performance of several models on the task of predicting multiple acts for each turn. A novel policy model is proposed based on a recurrent cell called gated Continue-Act-Slots (gCAS) that overcomes the limitations of the existing models. Experimental results show that gCAS outperforms other approaches. The code is available at https://leishu02.github.io/


Supervised Opinion Aspect Extraction by Exploiting Past Extraction Results

Dec 23, 2016
Lei Shu, Bing Liu, Hu Xu, Annice Kim

One of the key tasks of sentiment analysis of product reviews is to extract product aspects or features that users have expressed opinions on. In this work, we focus on using supervised sequence labeling as the base approach to performing the task. Although several extraction methods using sequence labeling models such as Conditional Random Fields (CRF) and Hidden Markov Models (HMM) have been proposed, we show that this supervised approach can be significantly improved by exploiting the idea of concept sharing across multiple domains. For example, "screen" is an aspect of the iPhone, but the iPhone is not the only product with a screen; many electronic devices have screens too. When "screen" appears in a review of a new domain (or product), it is likely to be an aspect too. Knowing this information enables us to do much better extraction in the new domain. This paper proposes a novel extraction method exploiting this idea in the context of supervised sequence labeling. Experimental results show that it produces markedly better results than without using the past information.

* 10 pages 

A Failure of Aspect Sentiment Classifiers and an Adaptive Re-weighting Solution

Nov 04, 2019
Hu Xu, Bing Liu, Lei Shu, Philip S. Yu

Aspect-based sentiment classification (ASC) is an important task in fine-grained sentiment analysis. Deep supervised ASC approaches typically model this task as a pair-wise classification task that takes an aspect and a sentence containing the aspect and outputs the polarity of the aspect in that sentence. However, we discovered that many existing approaches fail to learn an effective ASC classifier and instead behave more like a sentence-level sentiment classifier, because they have difficulty handling sentences with different polarities for different aspects. This paper first demonstrates this problem using several state-of-the-art ASC models. It then proposes a novel and general adaptive re-weighting (ARW) scheme to adjust the training to dramatically improve ASC for such complex sentences. Experimental results show that the proposed framework is effective. The dataset and code are available at https://github.com/howardhsu/ASC_failure.


Variational Quantum Algorithms for Dimensionality Reduction and Classification

Oct 27, 2019
Jin-Min Liang, Shu-Qian Shen, Ming Li, Lei Li

Dimensionality reduction and classification play a critical role in pattern recognition and machine learning. In this work, we present a quantum neighborhood preserving embedding and a quantum local discriminant embedding for dimensionality reduction and classification. These two algorithms have an exponential speedup over their respective classical counterparts. Along the way, we propose a variational quantum generalized eigenvalue solver (VQGE) that finds the generalized eigenvalues and eigenvectors of a matrix pencil $(\mathcal{G},\mathcal{S})$ with coherence time $O(1)$. We successfully conduct a numerical experiment solving a problem of size $2^5\times2^5$. Moreover, our results offer two optional outputs, in quantum or classical form, which can be directly applied in another quantum or classical machine learning process.


BERT Post-Training for Review Reading Comprehension and Aspect-based Sentiment Analysis

May 04, 2019
Hu Xu, Bing Liu, Lei Shu, Philip S. Yu

Question-answering plays an important role in e-commerce as it allows potential customers to actively seek crucial information about products or services to help their purchase decision making. Inspired by the recent success of machine reading comprehension (MRC) on formal documents, this paper explores the potential of turning customer reviews into a large source of knowledge that can be exploited to answer user questions. We call this problem Review Reading Comprehension (RRC). To the best of our knowledge, no existing work has been done on RRC. In this work, we first build an RRC dataset called ReviewRC based on a popular benchmark for aspect-based sentiment analysis. Since ReviewRC has limited training examples for RRC (and also for aspect-based sentiment analysis), we then explore a novel post-training approach on the popular language model BERT to enhance the performance of fine-tuning of BERT for RRC. To show the generality of the approach, the proposed post-training is also applied to some other review-based tasks such as aspect extraction and aspect sentiment classification in aspect-based sentiment analysis. Experimental results demonstrate that the proposed post-training is highly effective. The datasets and code are available at https://www.cs.uic.edu/~hxu/.

* Accepted by NAACL 2019

Review Conversational Reading Comprehension

Feb 03, 2019
Hu Xu, Bing Liu, Lei Shu, Philip S. Yu

Seeking information about products and services is an important activity of online consumers before making a purchase decision. Inspired by recent research on conversational reading comprehension (CRC) on formal documents, this paper studies the task of leveraging knowledge from a large number of reviews to answer multi-turn questions from consumers or users. Questions spanning multiple turns in a dialogue enable users to ask more specific questions that are hard to ask within a single question as in traditional machine reading comprehension (MRC). In this paper, we first build a dataset and then propose a novel task-adaptation approach to encoding the formulation of the CRC task into a pre-trained language model. This task-adaptation approach is unsupervised and can greatly enhance the performance of the end CRC task, which has only limited supervision. Experimental results show that the proposed approach is highly effective and performs competitively with the supervised approach. We plan to release the datasets and the code in May 2019.


Learning to Accept New Classes without Training

Sep 17, 2018
Hu Xu, Bing Liu, Lei Shu, Philip S. Yu

Classic supervised learning makes the closed-world assumption, meaning that classes seen in testing must have been seen in training. However, in the dynamic world, new or unseen class examples may appear constantly. A model working in such an environment must be able to reject unseen classes (not seen or used in training). If enough data is collected for the unseen classes, the system should incrementally learn to accept/classify them. This learning paradigm is called open-world learning (OWL). Existing OWL methods all need some form of re-training to accept or include the new classes in the overall model. In this paper, we propose a meta-learning approach to the problem. Its key novelty is that it only needs to train a meta-classifier, which can then continually accept new classes when they have enough labeled data for the meta-classifier to use, and also detect/reject future unseen classes. No re-training of the meta-classifier or a new overall classifier covering all old and new classes is needed. In testing, the method only uses the examples of the seen classes (including the newly added classes) on-the-fly for classification and rejection. Experimental results demonstrate the effectiveness of the new approach.
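The accept-or-reject behavior described above can be sketched as a memory of per-class examples queried through a fixed pairwise scorer. Everything named here (`OpenWorldMemory`, `toy_match_score`, taking the maximum score per class, and the 0.5 threshold) is a hypothetical illustration under simplifying assumptions, not the paper's meta-classifier.

```python
from typing import Callable, Dict, List, Optional, Sequence


class OpenWorldMemory:
    """Classify by comparing an input against stored examples of seen classes."""

    def __init__(
        self,
        match_score: Callable[[float, float], float],
        threshold: float = 0.5,
    ) -> None:
        self.match_score = match_score  # fixed scorer, never re-trained here
        self.threshold = threshold
        self.examples: Dict[str, List[float]] = {}

    def add_class(self, label: str, examples: Sequence[float]) -> None:
        # Accepting a new class only stores its (non-empty) example set.
        self.examples[label] = list(examples)

    def predict(self, x: float) -> Optional[str]:
        # Return the best-matching seen class, or None to reject x as unseen.
        best_label, best_score = None, self.threshold
        for label, members in self.examples.items():
            score = max(self.match_score(x, m) for m in members)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def toy_match_score(a: float, b: float) -> float:
    # Hypothetical stand-in for a trained meta-classifier's output in (0, 1].
    return 1.0 / (1.0 + abs(a - b))


memory = OpenWorldMemory(toy_match_score)
memory.add_class("a", [0.0, 0.2])
memory.add_class("b", [5.0])
print(memory.predict(0.1))   # matches class "a"
print(memory.predict(10.0))  # rejected: no stored class is close enough
memory.add_class("c", [10.2])
print(memory.predict(10.0))  # accepted once class "c" has examples
```

The point of the design is that `add_class` changes only the stored examples, so the scorer itself stays untouched when a new class is accepted.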


Lifelong Domain Word Embedding via Meta-Learning

May 25, 2018
Hu Xu, Bing Liu, Lei Shu, Philip S. Yu

Learning high-quality domain word embeddings is important for achieving good performance in many NLP tasks. General-purpose embeddings trained on large-scale corpora are often sub-optimal for domain-specific applications. However, domain-specific tasks often do not have large in-domain corpora for training high-quality domain embeddings. In this paper, we propose a novel lifelong learning setting for domain embedding. That is, when performing the new domain embedding, the system has seen many past domains, and it tries to expand the new in-domain corpus by exploiting the corpora from the past domains via meta-learning. The proposed meta-learner characterizes the similarities of the contexts of the same word in many domain corpora, which helps retrieve relevant data from the past domains to expand the new domain corpus. Experimental results show that domain embeddings produced from such a process improve the performance of the downstream tasks.

* IJCAI 2018 
* 7 pages 

Double Embeddings and CNN-based Sequence Labeling for Aspect Extraction

May 11, 2018
Hu Xu, Bing Liu, Lei Shu, Philip S. Yu

One key task of fine-grained sentiment analysis of product reviews is to extract product aspects or features that users have expressed opinions on. This paper focuses on supervised aspect extraction using deep learning. Unlike other highly sophisticated supervised deep learning models, this paper proposes a novel and yet simple CNN model employing two types of pre-trained embeddings for aspect extraction: general-purpose embeddings and domain-specific embeddings. Without using any additional supervision, this model achieves surprisingly good results, outperforming sophisticated existing state-of-the-art methods. To our knowledge, this paper is the first to report such a double-embedding-based CNN model for aspect extraction and to achieve very good results.

* ACL 2018 

Product Function Need Recognition via Semi-supervised Attention Network

Dec 06, 2017
Hu Xu, Sihong Xie, Lei Shu, Philip S. Yu

Functionality is of utmost importance to customers when they purchase products. However, it is unclear to customers whether a product can really satisfy their functional needs. Further, missing functions may be intentionally hidden by the manufacturers or the sellers. As a result, a customer needs to spend a fair amount of time before purchasing or just purchase the product at his/her own risk. In this paper, we first identify a novel QA corpus that is dense in product functionality information. The annotated corpus can be found at https://www.cs.uic.edu/~hxu/. We then design a neural network called Semi-supervised Attention Network (SAN) to discover product functions from questions. This model leverages unlabeled data as contextual information to perform semi-supervised sequence labeling. We conduct experiments to show that the extracted functions have both high coverage and accuracy, compared with a wide spectrum of baselines.


Dual Attention Network for Product Compatibility and Function Satisfiability Analysis

Dec 06, 2017
Hu Xu, Sihong Xie, Lei Shu, Philip S. Yu

Product compatibility and functionality are of utmost importance to customers when they purchase products, and to sellers and manufacturers when they sell products. Due to the huge number of products available online, it is infeasible to enumerate and test the compatibility and functionality of every product. In this paper, we address two closely related problems: product compatibility analysis and function satisfiability analysis, where the second problem is a generalization of the first problem (e.g., whether a product works with another product can be considered as a special function). We first identify a novel question-answering corpus that is up-to-date regarding product compatibility and functionality information. To allow automatic discovery of product compatibility and functionality, we then propose a deep learning model called Dual Attention Network (DAN). Given a QA pair for a to-be-purchased product, DAN learns to 1) discover complementary products (or functions), and 2) accurately predict the actual compatibility (or satisfiability) of the discovered products (or functions). The challenges addressed by the model include the briefness of QAs, linguistic patterns indicating compatibility, and the appropriate fusion of questions and answers. We conduct experiments to quantitatively and qualitatively show that the identified products and functions have both high coverage and accuracy, compared with a wide spectrum of baselines.


Mining Compatible/Incompatible Entities from Question and Answering via Yes/No Answer Classification using Distant Label Expansion

Dec 14, 2016
Hu Xu, Lei Shu, Jingyuan Zhang, Philip S. Yu

Product Community Question Answering (PCQA) provides useful information about products and their features (aspects) that may not be well addressed by product descriptions and reviews. We observe that a product's compatibility issues with other products are frequently discussed in PCQA and such issues are more frequently addressed for accessories, e.g., via a yes/no question such as "Does this mouse work with windows 10?". In this paper, we address the problem of extracting compatible and incompatible products from yes/no questions in PCQA. This problem naturally fits a two-stage framework: first, we perform Complementary Entity (product) Recognition (CER) on yes/no questions; second, we identify the polarities of yes/no answers to assign the complementary entities a compatibility label (compatible, incompatible or unknown). We leverage an existing unsupervised method for the first stage and, for the second stage, a 3-class classifier that combines a distant PU-learning method (learning from positive and unlabeled examples) with a binary classifier. The benefit of using distant PU-learning is that it can help to expand more implicit yes/no answers without using any human annotated data. We conduct experiments on 4 products to show that the proposed method is effective.

* 9 pages, 1 figure

CER: Complementary Entity Recognition via Knowledge Expansion on Large Unlabeled Product Reviews

Dec 04, 2016
Hu Xu, Sihong Xie, Lei Shu, Philip S. Yu

Product reviews contain a lot of useful information about product features and customer opinions. One important product feature is the complementary entity (products) that may potentially work together with the reviewed product. Knowing complementary entities of the reviewed product is very important because customers want to buy compatible products and avoid incompatible ones. In this paper, we address the problem of Complementary Entity Recognition (CER). Since no existing method can solve this problem, we first propose a novel unsupervised method to utilize syntactic dependency paths to recognize complementary entities. Then we expand category-level domain knowledge about complementary entities using only a few general seed verbs on a large amount of unlabeled reviews. The domain knowledge helps the unsupervised method to adapt to different products and greatly improves the precision of the CER task. The advantage of the proposed method is that it does not require any labeled data for training. We conducted experiments on 7 popular products with about 1200 reviews in total to demonstrate that the proposed approach is effective.

* 10 pages, 2 figures, IEEE BigData 2016 
