Models, code, and papers for "Jiangtong Li":

Fast Neural Chinese Word Segmentation for Long Sentences

Nov 06, 2018
Sufeng Duan, Jiangtong Li, Hai Zhao

Rapidly developed neural models have achieved performance in Chinese word segmentation (CWS) competitive with their traditional counterparts. However, most methods suffer from computational inefficiency, especially on long sentences, because of increasing model complexity and slow decoders. This paper presents a simple neural segmenter that directly labels whether a gap exists between adjacent characters, alleviating this drawback. Our segmenter is fully end-to-end and capable of performing segmentation very fast. We also show performance differences among different tag sets. Experiments show that our segmenter provides performance comparable to the state of the art.
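
The gap-labeling idea is easy to illustrate. Below is a minimal Python sketch (function names are illustrative, not from the paper) that converts a gold segmentation into one binary label per gap between adjacent characters and back:

```python
def words_to_gap_labels(words):
    """Gold segmentation -> one binary label per gap between adjacent characters."""
    boundaries = set()
    pos = 0
    for w in words[:-1]:
        pos += len(w)
        boundaries.add(pos)          # a word boundary sits right before index `pos`
    n_chars = pos + len(words[-1])
    return [1 if i in boundaries else 0 for i in range(1, n_chars)]

def gap_labels_to_words(chars, labels):
    """Inverse mapping: cut the character string at every predicted gap."""
    words, start = [], 0
    for i, cut in enumerate(labels, start=1):
        if cut:
            words.append(chars[start:i])
            start = i
    words.append(chars[start:])
    return words

if __name__ == "__main__":
    gold = ["我们", "喜欢", "自然", "语言", "处理"]
    labels = words_to_gap_labels(gold)
    print(labels)                                    # [0, 1, 0, 1, 0, 1, 0, 1, 0]
    assert gap_labels_to_words("".join(gold), labels) == gold
```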


SJTU-NLP at SemEval-2018 Task 9: Neural Hypernym Discovery with Term Embeddings

May 26, 2018
Zhuosheng Zhang, Jiangtong Li, Hai Zhao, Bingjie Tang

This paper describes a hypernym discovery system for our participation in SemEval-2018 Task 9, which aims to discover the best (set of) candidate hypernyms for input concepts or entities, given the search space of a pre-defined vocabulary. We introduce a neural network architecture for this task and empirically study various neural network models to build latent-space representations for words and phrases. The evaluated models include convolutional neural networks, long short-term memory networks, gated recurrent units, and recurrent convolutional neural networks. We also explore different embedding methods, including word embeddings and sense embeddings, for better performance.

* SemEval-2018, Workshop of NAACL-HLT 2018 
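
As a rough illustration of term-embedding-based hypernym discovery (a nearest-neighbor baseline, not the paper's neural encoders), one can rank the pre-defined vocabulary by cosine similarity to the query term's embedding:

```python
import numpy as np

def rank_hypernym_candidates(query_vec, cand_vecs, cand_terms, k=3):
    """Rank candidate hypernyms by cosine similarity to the query embedding."""
    q = query_vec / np.linalg.norm(query_vec)
    C = cand_vecs / np.linalg.norm(cand_vecs, axis=1, keepdims=True)
    scores = C @ q
    top = np.argsort(-scores)[:k]
    return [(cand_terms[i], round(float(scores[i]), 3)) for i in top]

# toy demo: random vectors stand in for trained term embeddings
rng = np.random.default_rng(0)
vocab = ["animal", "organism", "vehicle", "fruit", "plant"]
vecs = rng.normal(size=(len(vocab), 50))
print(rank_hypernym_candidates(vecs[0], vecs, vocab))
```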

Subword ELMo

Sep 18, 2019
Jiangtong Li, Hai Zhao, Zuchao Li, Wei Bi, Xiaojiang Liu

Embedding from Language Models (ELMo) has shown to be effective for improving many natural language processing (NLP) tasks, and ELMo takes character information to compose word representation to train language models.However, the character is an insufficient and unnatural linguistic unit for word representation.Thus we introduce Embedding from Subword-aware Language Models (ESuLMo) which learns word representation from subwords using unsupervised segmentation over words.We show that ESuLMo can enhance four benchmark NLP tasks more effectively than ELMo, including syntactic dependency parsing, semantic role labeling, implicit discourse relation recognition and textual entailment, which brings a meaningful improvement over ELMo.
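
BPE-style merging is one common choice for the unsupervised word segmentation the abstract mentions (the merge table below is a toy assumption, not learned from data). A minimal sketch of applying ranked merges to split one word into subwords:

```python
def apply_bpe(word, merges):
    """Greedily apply ranked BPE merges to split one word into subwords."""
    symbols = list(word)
    while len(symbols) > 1:
        pairs = [(symbols[i], symbols[i + 1]) for i in range(len(symbols) - 1)]
        candidates = [(merges[p], i) for i, p in enumerate(pairs) if p in merges]
        if not candidates:
            break
        _, i = min(candidates)                       # lowest rank merges first
        symbols[i:i + 2] = [symbols[i] + symbols[i + 1]]
    return symbols

# toy merge table (rank = merge priority, as if learned from a corpus)
merges = {("l", "o"): 0, ("lo", "w"): 1, ("e", "s"): 2, ("es", "t"): 3}
print(apply_bpe("lowest", merges))   # ['low', 'est']
```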


Lattice-Based Transformer Encoder for Neural Machine Translation

Jun 04, 2019
Fengshun Xiao, Jiangtong Li, Hai Zhao, Rui Wang, Kehai Chen

Neural machine translation (NMT) takes deterministic sequences for source representations. However, both word-level and subword-level segmentations offer multiple ways to split a source sequence, depending on the word segmenter or the subword vocabulary size. We hypothesize that this diversity in segmentations may affect NMT performance. To integrate different segmentations with the state-of-the-art NMT model, the Transformer, we propose lattice-based encoders that explore effective word or subword representations automatically during training. We propose two methods: 1) lattice positional encoding and 2) lattice-aware self-attention. The two methods can be used together and are complementary, further improving translation performance. Experimental results show the superiority of lattice-based encoders over the conventional Transformer encoder in both word-level and subword-level representations.

* Accepted by ACL 2019 
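
A minimal numpy sketch of the two ideas (a simplification, not necessarily the paper's exact formulation): encode each lattice node by the sinusoidal encoding of its start character index, so alternative segmentations of the same span share positions, and mask self-attention so that nodes with overlapping character spans, which can never co-occur on one segmentation path, do not attend to each other:

```python
import numpy as np

def sinusoidal(pos, d_model):
    """Standard Transformer sinusoidal positional encoding for one position."""
    i = np.arange(d_model)
    angles = pos / np.power(10000.0, (2 * (i // 2)) / d_model)
    return np.where(i % 2 == 0, np.sin(angles), np.cos(angles))

# A lattice node is (token, start, end) over the source character sequence.
lattice = [("我", 0, 1), ("们", 1, 2), ("我们", 0, 2), ("喜欢", 2, 4)]
pos_enc = np.stack([sinusoidal(start, 8) for _, start, _ in lattice])

def lattice_attention_mask(nodes):
    """Allow attention between nodes whose character spans do not overlap."""
    n = len(nodes)
    mask = np.zeros((n, n), dtype=bool)
    for a, (_, s1, e1) in enumerate(nodes):
        for b, (_, s2, e2) in enumerate(nodes):
            mask[a, b] = (a == b) or (e1 <= s2) or (e2 <= s1)
    return mask

print(lattice_attention_mask(lattice).astype(int))
```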

Modeling Multi-turn Conversation with Deep Utterance Aggregation

Nov 06, 2018
Zhuosheng Zhang, Jiangtong Li, Pengfei Zhu, Hai Zhao, Gongshen Liu

Multi-turn conversation understanding is a major challenge for building intelligent dialogue systems. This work focuses on retrieval-based response matching for multi-turn conversation, where prior work simply concatenates the conversation utterances, ignoring the interactions among previous utterances during context modeling. In this paper, we formulate previous utterances into a context using the proposed deep utterance aggregation model to form a fine-grained context representation. In detail, self-matching attention is first introduced to route the vital information in each utterance. The model then matches a response with each refined utterance, and the final matching score is obtained after attentive turn aggregation. Experimental results show that our model outperforms state-of-the-art methods on three multi-turn conversation benchmarks, including a newly introduced e-commerce dialogue corpus.

* Proceedings of the 27th International Conference on Computational Linguistics (COLING 2018), pages 3740-3752 
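
A toy numpy sketch of the matching-then-aggregation flow (dot-product scores stand in for the paper's learned matching networks):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def aggregate_match(utterance_vecs, response_vec):
    """Match the response against each previous turn, then attentively
    aggregate the per-turn scores into one matching score."""
    scores = utterance_vecs @ response_vec   # one match score per utterance
    weights = softmax(scores)                # attention over turns
    return float(weights @ scores)

rng = np.random.default_rng(1)
context = rng.normal(size=(4, 16))   # 4 previous utterances, 16-d encodings
response = rng.normal(size=16)       # candidate response encoding
print(aggregate_match(context, response))
```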

Effective Subword Segmentation for Text Comprehension

Nov 06, 2018
Zhuosheng Zhang, Hai Zhao, Kangwei Ling, Jiangtong Li, Zuchao Li, Shexia He

Character-level representations have been broadly adopted to alleviate the problem of effectively representing rare or complex words. However, the character itself is not a natural minimal linguistic unit for representation or word embedding composition, since it ignores the linguistic coherence of consecutive characters inside a word. This paper presents a general subword-augmented embedding framework for learning and composing computationally derived subword-level representations. We survey a series of unsupervised segmentation methods for subword acquisition and different subword-augmented strategies for text understanding, showing that subword-augmented embedding significantly improves our baselines in multiple text understanding tasks in both English and Chinese.
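
One simple instance of subword-augmented embedding (an illustrative composition; the paper surveys several strategies): concatenate the word-level vector with the mean of its subword vectors:

```python
import numpy as np

def subword_augmented(word_vec, subword_vecs):
    """Concatenate the word vector with the mean of its subword vectors."""
    return np.concatenate([word_vec, subword_vecs.mean(axis=0)])

rng = np.random.default_rng(2)
word = rng.normal(size=100)                 # word-level embedding
subs = rng.normal(size=(3, 50))             # vectors for the word's 3 subwords
print(subword_augmented(word, subs).shape)  # (150,)
```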


Lingke: A Fine-grained Multi-turn Chatbot for Customer Service

Aug 10, 2018
Pengfei Zhu, Zhuosheng Zhang, Jiangtong Li, Yafang Huang, Hai Zhao

Traditional chatbots usually need massive amounts of human dialogue data, especially when supervised machine learning methods are used. Though they can easily handle single-turn question answering, their performance on multi-turn conversation is usually unsatisfactory. In this paper, we present Lingke, an information-retrieval-augmented chatbot that is able to answer questions based on a given product introduction document and to handle multi-turn conversations. We introduce fine-grained pipeline processing to distill responses from unstructured documents, and attentive sequential context-response matching for multi-turn conversations.

* Accepted by COLING 2018 demonstration paper 
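
A crude stand-in for the document-retrieval step (the paper's pipeline is considerably more fine-grained): pick the document sentence with the highest Jaccard token overlap with the user question:

```python
def retrieve_response(question, document_sentences):
    """Return the document sentence with the highest Jaccard token overlap."""
    q = set(question.lower().split())
    def overlap(sent):
        s = set(sent.lower().split())
        return len(q & s) / (len(q | s) or 1)
    return max(document_sentences, key=overlap)

doc = ["This phone has a 6.1-inch display.",
       "The battery lasts up to 20 hours.",
       "It supports fast wireless charging."]
print(retrieve_response("How long does the battery last?", doc))
```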

Judging Chemical Reaction Practicality From Positive Sample only Learning

Apr 22, 2019
Shu Jiang, Zhuosheng Zhang, Hai Zhao, Jiangtong Li, Yang Yang, Bao-Liang Lu, Ning Xia

Chemical reaction practicality judgment is the core task in all symbol-intelligence-based chemical information processing; for example, it provides an indispensable clue for further automatic synthesis route inference. Considering that chemical reactions can be represented in a language form, we propose a new solution for generally judging the practicality of organic reactions without complex quantum-physical modeling or chemistry knowledge. When tackling practicality judgment as a machine learning task over positive and negative (chemical reaction) samples, all existing studies have to carefully handle a serious insufficiency of negative samples. We propose an auto-construction method that effectively solves this long-standing difficulty. Experimental results show that our model can effectively predict the practicality of chemical reactions, achieving a high accuracy of 99.76% on real large-scale chemical lab reaction practicality judgment.
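
One common auto-construction strategy for negatives (an assumption for illustration; the paper's exact procedure may differ) is to pair each reaction's reactants with a product drawn from a different reaction:

```python
import random

def auto_construct_negatives(positive_reactions, seed=0):
    """Build pseudo-negative reactions by pairing each reactant set with a
    product taken from a different (positive) reaction."""
    rng = random.Random(seed)
    negatives = []
    for reactants, product in positive_reactions:
        other_products = [p for _, p in positive_reactions if p != product]
        negatives.append((reactants, rng.choice(other_products)))
    return negatives

# toy reactions in SMILES-like form: (reactants, product)
positives = [("CCO + HBr", "CCBr"),
             ("C=C + [H][H]", "CC"),
             ("CC(=O)O + CO", "CC(=O)OC")]
for neg in auto_construct_negatives(positives):
    print("pseudo-negative:", neg)
```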

