Bach in 2014: Music Composition with Recurrent Neural Network

Dec 14, 2014

I-Ting Liu, Bhiksha Ramakrishnan

We propose a framework for computer music composition that uses resilient propagation (RProp) and a long short-term memory (LSTM) recurrent neural network. In this paper, we show that an LSTM network learns the structure and characteristics of music pieces properly by demonstrating its ability to recreate music. We also show that, for predicting existing music, RProp outperforms backpropagation through time (BPTT).
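
For readers unfamiliar with RProp, the following is a minimal sketch of a sign-based update step, not the authors' implementation. It assumes the common iRprop- variant (the abstract does not say which variant was used); the function name `rprop_step`, the hyperparameter defaults, and the toy quadratic objective are all illustrative choices.

```python
import numpy as np

def rprop_step(w, grad, prev_grad, step,
               eta_plus=1.2, eta_minus=0.5, step_min=1e-6, step_max=50.0):
    """One iRprop- update (hypothetical helper): only the sign of the gradient is used."""
    sign_change = grad * prev_grad
    # Same sign as last time: grow the per-weight step size.
    step = np.where(sign_change > 0, np.minimum(step * eta_plus, step_max), step)
    # Sign flipped (we overshot a minimum): shrink the step size.
    step = np.where(sign_change < 0, np.maximum(step * eta_minus, step_min), step)
    # iRprop-: after a sign flip, skip the update for that weight this iteration.
    grad = np.where(sign_change < 0, 0.0, grad)
    w = w - np.sign(grad) * step
    return w, grad, step

# Toy usage: minimise f(w) = sum((w - target)^2), whose gradient is 2 * (w - target).
target = np.array([3.0, -2.0])
w = np.zeros(2)
prev_grad = np.zeros(2)
step = np.full(2, 0.1)
for _ in range(200):
    grad = 2.0 * (w - target)
    w, prev_grad, step = rprop_step(w, grad, prev_grad, step)
```

Because the update ignores gradient magnitude, each weight adapts its own step size, which is one commonly cited reason RProp can behave differently from plain gradient-based BPTT updates.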

Protein-protein interaction extraction is a key precondition for constructing a protein knowledge network, and it is very important for biomedical research. This paper extracts directional protein-protein interactions from biological text using an SVM-based method. Experiments were evaluated on the LLL05 corpus with good results. The results show that dependency features are important for protein-protein interaction extraction and that features related to the interaction word are effective for judging the interaction direction. Finally, we analyze the effects of different features and plan the next steps.

* This paper has been withdrawn by the author due to its lack of academic value

This paper is based on our previous work on neural coding. The model is self-organized and supported by existing evidence. First, we briefly introduce the model, and then we use it to explain the neural mechanism of language and reasoning. Moreover, we find that the position of an area determines its importance. Specifically, language-relevant areas are in the capital position of the cortical kingdom. Therefore they are closely related to autonomous consciousness and working memory. In essence, language is a miniature of the real world. In brief, this paper aims to bridge the gap between the molecular mechanisms of neurons and advanced functions such as language and reasoning.

* 6 pages, 3 figures

Based on existing data, we put forward a biological model of the motor system on the neuron scale. We then indicate its implications for statistics and learning. Specifically, neuron firing frequency and synaptic strength are, in essence, probability estimates, and lateral inhibition also has statistical implications. From the standpoint of learning, dendritic competition through retrograde messengers is the foundation of the conditional reflex and grandmother cell coding, which are the kernel mechanisms of motor learning and sensory-motor integration respectively. Finally, we compare the motor system with the sensory system. In short, we aim to bridge the gap between molecular evidence and computational models.

* 8 pages, 4 figures

The coding mechanism of sensory memory on the neuron scale is one of the most important questions in neuroscience. We have put forward a quantitative neural network model that is self-organized, self-similar, and self-adaptive, just like an ecosystem following Darwinian theory. According to this model, neural coding is a many-to-one mapping from objects to neurons, and the whole cerebrum is a real-time statistical Turing machine with powerful representation and learning abilities. This model can reconcile some important disputes, such as temporal coding versus rate-based coding, grandmother cell versus population coding, and decay theory versus interference theory. It also provides explanations for some key questions such as memory consolidation, episodic memory, consciousness, and sentiment. Finally, its philosophical significance is indicated.

* 9 pages, 3 figures

We have put forward a unified quantitative framework of vision and audition, based on existing data and theories. According to this model, the retina is a feedforward network that is self-adaptive to inputs in a specific period. Once fully grown, cells become specialized detectors based on the statistics of their stimulus history. This model provides explanations for the perception mechanisms of colour, shape, depth and motion. Moreover, on this basis we put forward a bold conjecture that a single ear can detect sound direction. This is complementary to existing theories and provides better explanations for sound localization.

* 7 pages, 3 figures

Further properties of the forward-backward envelope with applications to difference-of-convex programming

Oct 18, 2016

Tianxiang Liu, Ting Kei Pong

In this paper, we further study the forward-backward envelope first introduced in [28] and [30] for problems whose objective is the sum of a proper closed convex function and a twice continuously differentiable possibly nonconvex function with Lipschitz continuous gradient. We derive sufficient conditions on the original problem for the corresponding forward-backward envelope to be a level-bounded and Kurdyka-Łojasiewicz function with an exponent of $\frac12$; these results are important for the efficient minimization of the forward-backward envelope by classical optimization algorithms. In addition, we demonstrate how to minimize some difference-of-convex regularized least squares problems by minimizing a suitably constructed forward-backward envelope. Our preliminary numerical results on randomly generated instances of large-scale $\ell_{1-2}$ regularized least squares problems [37] illustrate that an implementation of this approach with a limited-memory BFGS scheme usually outperforms standard first-order methods such as the nonmonotone proximal gradient method in [35].
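
As a rough illustration of the object being studied, the sketch below evaluates a forward-backward envelope for a simplified problem. It assumes a convex least squares term f(x) = 0.5 * ||Ax - b||^2 and g = lam * ||x||_1 rather than the paper's nonconvex $\ell_{1-2}$ setting, and uses the standard definition FBE(x) = min over z of f(x) + <grad f(x), z - x> + g(z) + ||z - x||^2 / (2 * gamma), whose minimizer is the proximal gradient point. The names `soft_threshold` and `forward_backward_envelope` and the random test data are hypothetical.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal mapping of t * ||.||_1 (componentwise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def forward_backward_envelope(x, A, b, lam, gamma):
    """FBE of F(x) = 0.5*||Ax - b||^2 + lam*||x||_1 with step size gamma (illustrative)."""
    r = A @ x - b
    f_x = 0.5 * (r @ r)
    grad = A.T @ r
    # The inner minimizer is the forward-backward (proximal gradient) step from x.
    z = soft_threshold(x - gamma * grad, gamma * lam)
    d = z - x
    return f_x + grad @ d + lam * np.abs(z).sum() + (d @ d) / (2.0 * gamma)

# Small random instance (assumed data, for illustration only).
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 50))
b = rng.standard_normal(20)
x = rng.standard_normal(50)
lam = 0.1
L = np.linalg.norm(A, 2) ** 2      # Lipschitz constant of grad f
gamma = 0.5 / L
F_x = 0.5 * np.sum((A @ x - b) ** 2) + lam * np.abs(x).sum()
fbe_x = forward_backward_envelope(x, A, b, lam, gamma)
```

Choosing z = x in the inner minimization recovers F(x), so the envelope never exceeds the original objective; this is the sense in which it is a lower "envelope" that smooth solvers such as L-BFGS can work with.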

* Theorem 3.3 is added. Included numerical tests on oversampled DCT matrix

Recent advances, such as GPT and BERT, have shown success in incorporating a pre-trained transformer language model and fine-tuning operation to improve downstream NLP systems. However, this framework still has some fundamental problems in effectively incorporating supervised knowledge from other related tasks. In this study, we investigate a transferable BERT (TransBERT) training framework, which can transfer not only general language knowledge from large-scale unlabeled data but also specific kinds of knowledge from various semantically related supervised tasks, for a target task. In particular, we propose utilizing three kinds of transfer tasks, including natural language inference, sentiment classification, and next action prediction, to further train BERT based on a pre-trained model. This enables the model to get a better initialization for the target task. We take story ending prediction as the target task to conduct experiments. The final result, an accuracy of 91.8%, dramatically outperforms previous state-of-the-art baseline methods. Several comparative experiments give some helpful suggestions on how to select transfer tasks. Error analysis shows the strengths and weaknesses of BERT-based models for story ending prediction.

* Accepted and to appear in IJCAI 2019

Constructing Narrative Event Evolutionary Graph for Script Event Prediction

May 16, 2018

Zhongyang Li, Xiao Ding, Ting Liu

Script event prediction requires a model to predict the subsequent event given an existing event context. Previous models based on event pairs or event chains cannot make full use of dense event connections, which may limit their capability of event prediction. To remedy this, we propose constructing an event graph to better utilize the event network information for script event prediction. In particular, we first extract narrative event chains from a large news corpus, and then construct a narrative event evolutionary graph (NEEG) based on the extracted chains. NEEG can be seen as a knowledge base that describes event evolutionary principles and patterns. To solve the inference problem on NEEG, we present a scaled graph neural network (SGNN) to model event interactions and learn better event representations. Instead of computing the representations on the whole graph, SGNN processes only the concerned nodes each time, which makes our model feasible for large-scale graphs. By comparing the similarity between input context event representations and candidate event representations, we can choose the most reasonable subsequent event. Experimental results on the widely used New York Times corpus demonstrate that our model significantly outperforms state-of-the-art baseline methods under the standard multiple choice narrative cloze evaluation.

* This paper has been accepted by IJCAI 2018

Preference-based performance measures for Time-Domain Global Similarity method

Nov 08, 2017

Ting Lan, Jian Liu, Hong Qin

For the Time-Domain Global Similarity (TDGS) method, which transforms the data cleaning problem into a binary classification problem about the physical similarity between channels, directly adopting common performance measures can only guarantee the performance for physical similarity. Nevertheless, practical data cleaning tasks have preferences for the correctness of original data sequences. To obtain general expressions of performance measures based on the preferences of tasks, the mapping relations between the performance of the TDGS method on physical similarity and the correctness of data sequences are investigated using probability theory in this paper. Performance measures for the TDGS method in several common data cleaning tasks are set. Cases in which these preference-based performance measures can be simplified are introduced.

Improvement of training set structure in fusion data cleaning using Time-Domain Global Similarity method

Jun 30, 2017

Jian Liu, Ting Lan, Hong Qin

Traditional data cleaning identifies dirty data by classifying original data sequences, which is a class-imbalanced problem since the proportion of incorrect data is much lower than the proportion of correct data for most diagnostic systems in Magnetic Confinement Fusion (MCF) devices. When machine learning algorithms are used to classify diagnostic data based on a class-imbalanced training set, most classifiers are biased towards the major class and show very poor classification rates on the minor class. By transforming the direct classification problem about original data sequences into a classification problem about the physical similarity between data sequences, the class-balancing effect of the Time-Domain Global Similarity (TDGS) method on training set structure is investigated in this paper. Meanwhile, the impact of the improved training set structure on the data cleaning performance of the TDGS method is demonstrated with an application example in the EAST POlarimetry-INTerferometry (POINT) system.

Improving Fully Convolution Network for Semantic Segmentation

Nov 28, 2016

Bing Shuai, Ting Liu, Gang Wang

Fully Convolution Networks (FCN) have achieved great success in dense prediction tasks including semantic segmentation. In this paper, we begin by discussing the architectural limitations of FCN in building a strong segmentation network. Next, we present our Improved Fully Convolution Network (IFCN). In contrast to FCN, IFCN introduces a context network that progressively expands the receptive fields of feature maps. In addition, dense skip connections are added so that the context network can be effectively optimized. More importantly, these dense skip connections enable IFCN to fuse rich-scale context to make reliable predictions. Empirically, these architectural modifications are shown to significantly enhance segmentation performance. Without any contextual post-processing, IFCN significantly advances the state of the art on the ADE20K (ImageNet scene parsing), Pascal Context, Pascal VOC 2012 and SUN-RGBD segmentation datasets.

Aspect Level Sentiment Classification with Deep Memory Network

Sep 24, 2016

Duyu Tang, Bing Qin, Ting Liu

We introduce a deep memory network for aspect level sentiment classification. Unlike feature-based SVM and sequential neural models such as LSTM, this approach explicitly captures the importance of each context word when inferring the sentiment polarity of an aspect. Such importance degree and text representation are calculated with multiple computational layers, each of which is a neural attention model over an external memory. Experiments on laptop and restaurant datasets demonstrate that our approach performs comparably to a state-of-the-art feature-based SVM system, and substantially better than LSTM and attention-based LSTM architectures. On both datasets we show that multiple computational layers can improve the performance. Moreover, our approach is also fast: with a CPU implementation, the deep memory network with 9 layers is 15 times faster than LSTM.

* published in EMNLP 2016

Image Segmentation Using Hierarchical Merge Tree

Jul 31, 2016

Ting Liu, Mojtaba Seyedhosseini, Tolga Tasdizen

This paper investigates one of the most fundamental computer vision problems: image segmentation. We propose a supervised hierarchical approach to object-independent image segmentation. Starting with over-segmenting superpixels, we use a tree structure to represent the hierarchy of region merging, by which we reduce the problem of segmenting image regions to finding a set of label assignments to tree nodes. We formulate the tree structure as a constrained conditional model to associate region merging with likelihoods predicted using an ensemble boundary classifier. Final segmentations can then be inferred by finding globally optimal solutions to the model efficiently. We also present an iterative training and testing algorithm that generates various tree structures and combines them to emphasize accurate boundaries by segmentation accumulation. Experimental results and comparisons with other very recent methods on six public data sets demonstrate that our approach achieves state-of-the-art region accuracy and is very competitive in image segmentation without semantic priors.

* IEEE Trans. Image Processing 25 (2016) 4596-4607

Latent Feature Based FM Model For Rating Prediction

Oct 29, 2014

Xudong Liu, Bin Zhang, Ting Zhang, Chang Liu

Rating prediction is a basic problem in recommender systems, and one of the most widely used methods is Factorization Machines (FM). However, traditional matrix factorization methods fail to utilize the benefit of implicit feedback, which has been shown to be important in the rating prediction problem. In this work, we consider a specific situation, movie rating prediction, where we assume that a user's watching history has a big influence on his/her rating behavior on an item. We introduce two models, Latent Dirichlet Allocation (LDA) and word2vec, both of which achieve state-of-the-art results in training latent features. Based on these, we propose two feature-based models. One is the Topic-based FM Model, which provides the implicit feedback to the matrix factorization. The other is the Vector-based FM Model, which expresses the order information of the watching history. Empirical results on three datasets demonstrate that our method performs better than the baseline model and confirm that the Vector-based FM Model usually works better as it contains the order information.
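
To make the base model concrete, here is a minimal sketch of a second-order Factorization Machine prediction, the generic FM formulation rather than the Topic-based or Vector-based variants proposed in the paper. It assumes the usual form y = w0 + sum_i w_i x_i + sum_{i<j} <v_i, v_j> x_i x_j; the function names and the random parameters are hypothetical.

```python
import numpy as np

def fm_predict(x, w0, w, V):
    """Second-order FM score in O(n*k) using the standard algebraic identity.

    x: (n,) feature vector, w0: bias, w: (n,) linear weights, V: (n, k) latent factors.
    """
    linear = w0 + w @ x
    xv = V.T @ x                       # (k,) sum_i v_{i,f} x_i for each factor f
    x2v2 = (V ** 2).T @ (x ** 2)       # (k,) sum_i v_{i,f}^2 x_i^2 for each factor f
    pairwise = 0.5 * np.sum(xv ** 2 - x2v2)
    return linear + pairwise

def fm_predict_naive(x, w0, w, V):
    """Same score via the explicit double loop over feature pairs, for checking."""
    y = w0 + w @ x
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            y += (V[i] @ V[j]) * x[i] * x[j]
    return y

# Illustrative random parameters (not learned from any dataset).
rng = np.random.default_rng(1)
n, k = 8, 3
x = rng.standard_normal(n)
w0 = 0.5
w = rng.standard_normal(n)
V = rng.standard_normal((n, k))
fast = fm_predict(x, w0, w, V)
slow = fm_predict_naive(x, w0, w, V)
```

In a setup like the one the abstract describes, extra latent features (for example topic proportions or embedding vectors derived from watching history) would simply be appended to `x`; how the authors encode them is not specified here.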

* 4 pages, 3 figures, Large Scale Recommender Systems: workshop of RecSys 2014

Attribute Acquisition in Ontology based on Representation Learning of Hierarchical Classes and Attributes

Mar 08, 2019

Tianwen Jiang, Ming Liu, Bing Qin, Ting Liu

Attribute acquisition for classes is a key step in ontology construction, which is often done manually by community members. This paper investigates an attention-based automatic paradigm called TransATT for attribute acquisition, which learns the representation of hierarchical classes and attributes in a Chinese ontology. The attributes of an entity can be acquired by merely inspecting its classes, because the entity can be regarded as an instance of its classes and inherits their attributes. To describe the class of an entity explicitly and unambiguously, we propose the class-path to represent the hierarchical classes in the ontology, instead of the terminal class word of the hierarchy based on the hypernym-hyponym relation (i.e., the is-a relation). The high performance of TransATT on attribute acquisition indicates the promising ability of the learned representation of class-paths and attributes. Moreover, we construct a dataset named **BigCilin11k**. To the best of our knowledge, this is the first Chinese dataset with abundant hierarchical classes and entities with attributes.

A Neural Multi-Task Learning Framework to Jointly Model Medical Named Entity Recognition and Normalization

Dec 14, 2018

Sendong Zhao, Ting Liu, Sicheng Zhao, Fei Wang

State-of-the-art studies have demonstrated the superiority of joint modelling over pipeline implementation for medical named entity recognition and normalization due to the mutual benefits between the two processes. To exploit these benefits in a more sophisticated way, we propose a novel deep neural multi-task learning framework with explicit feedback strategies to jointly model recognition and normalization. On one hand, our method benefits from the general representations of both tasks provided by multi-task learning. On the other hand, our method successfully converts hierarchical tasks into a parallel multi-task setting while maintaining the mutual supports between tasks. Both of these aspects improve the model performance. Experimental results demonstrate that our method performs significantly better than state-of-the-art approaches on two publicly available medical literature datasets.

* AAAI-2019

Sequence-to-Sequence Data Augmentation for Dialogue Language Understanding

Jul 04, 2018

Yutai Hou, Yijia Liu, Wanxiang Che, Ting Liu

In this paper, we study the problem of data augmentation for language understanding in task-oriented dialogue systems. In contrast to previous work, which augments an utterance without considering its relation to other utterances, we propose a sequence-to-sequence generation based data augmentation framework that leverages an utterance's semantically equivalent alternatives in the training data. A novel diversity rank is incorporated into the utterance representation to make the model produce diverse utterances, and these diversely augmented utterances help to improve the language understanding module. Experimental results on the Airline Travel Information System dataset and a newly created semantic frame annotation on the Stanford Multi-turn, Multidomain Dialogue Dataset show that our framework achieves significant improvements of 6.38 and 10.04 F-scores respectively when only a training set of hundreds of utterances is available. Case studies also confirm that our method generates diverse utterances.

* Accepted by COLING 2018

A successive difference-of-convex approximation method for a class of nonconvex nonsmooth optimization problems

May 26, 2018

Tianxiang Liu, Ting Kei Pong, Akiko Takeda

We consider a class of nonconvex nonsmooth optimization problems whose objective is the sum of a smooth function and a finite number of nonnegative proper closed possibly nonsmooth functions (whose proximal mappings are easy to compute), some of which are further composed with linear maps. Problems of this kind arise naturally in various applications when different regularizers are introduced for inducing simultaneous structures in the solutions. Solving these problems, however, can be challenging because of the coupled nonsmooth functions: the corresponding proximal mapping can be hard to compute, so standard first-order methods such as the proximal gradient algorithm cannot be applied efficiently. In this paper, we propose a successive difference-of-convex approximation method for solving problems of this kind. In this algorithm, we approximate the nonsmooth functions by their Moreau envelopes in each iteration. Making use of the simple observation that Moreau envelopes of nonnegative proper closed functions are continuous *difference-of-convex* functions, we can then approximately minimize the approximation function by first-order methods with suitable majorization techniques. These first-order methods can be implemented efficiently thanks to the fact that the proximal mapping of *each* nonsmooth function is easy to compute. Under suitable assumptions, we prove that the sequence generated by our method is bounded and any accumulation point is a stationary point of the objective. We also discuss how our method can be applied to concrete applications such as nonconvex fused regularized optimization problems and simultaneously structured matrix optimization problems, and illustrate the performance numerically for these two specific applications.
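
The Moreau envelope used for the approximation can be illustrated on a one-dimensional example. The sketch below assumes the standard definition e_lam f(x) = min over y of f(y) + (x - y)^2 / (2 * lam) and takes f(y) = |y|, a convex stand-in chosen only because its proximal mapping has a closed form; it does not reproduce the nonconvex functions or the algorithm from the paper. The helper names are hypothetical.

```python
import numpy as np

def prox_abs(x, lam):
    """Proximal mapping of lam * |.|: the minimizer in the Moreau envelope definition."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def moreau_env_abs(x, lam):
    """Moreau envelope of |.| evaluated through its proximal mapping."""
    y = prox_abs(x, lam)
    return np.abs(y) + (x - y) ** 2 / (2.0 * lam)

def huber(x, lam):
    """Closed form the envelope should match: quadratic near 0, linear in the tails."""
    return np.where(np.abs(x) <= lam, x ** 2 / (2.0 * lam), np.abs(x) - lam / 2.0)

xs = np.linspace(-3.0, 3.0, 13)
lam = 0.5
env_values = moreau_env_abs(xs, lam)
huber_values = huber(xs, lam)
```

The envelope is a smooth approximation that only requires one proximal evaluation per point, which is why having easy proximal mappings for each nonsmooth term matters for the method described above.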

A refined convergence analysis of pDCA$_e$ with applications to simultaneous sparse recovery and outlier detection

Apr 19, 2018

Tianxiang Liu, Ting Kei Pong, Akiko Takeda

We consider the problem of minimizing a difference-of-convex (DC) function, which can be written as the sum of a smooth convex function with Lipschitz gradient, a proper closed convex function and a continuous possibly nonsmooth concave function. We refine the convergence analysis in [38] for the proximal DC algorithm with extrapolation (pDCA$_e$) and show that the whole sequence generated by the algorithm is convergent when the objective is level-bounded, *without* imposing differentiability assumptions on the concave part. Our analysis is based on a new potential function, and we assume such a function is a Kurdyka-Łojasiewicz (KL) function. We also establish a relationship between our KL assumption and the one used in [38]. Finally, we demonstrate how the pDCA$_e$ can be applied to a class of simultaneous sparse recovery and outlier detection problems arising from robust compressed sensing in signal processing and least trimmed squares regression in statistics. Specifically, we show that the objectives of these problems can be written as level-bounded DC functions whose concave parts are *typically nonsmooth*. Moreover, for a large class of loss functions and regularizers, the KL exponent of the corresponding potential function is shown to be 1/2, which implies that the pDCA$_e$ is locally linearly convergent when applied to these problems. Our numerical experiments show that the pDCA$_e$ usually outperforms the proximal DC algorithm with nonmonotone linesearch [24, Appendix A] in both CPU time and solution quality for this particular application.