Models, code, and papers for "Ruihua Song":
With the wide adoption of intelligent chatbots in daily life, user demands for such systems are evolving from basic task-solving conversations to more casual, friend-like communication. To meet these needs and build an emotional bond with users, it is essential for social chatbots to incorporate more human-like and advanced linguistic features. In this paper, we investigate the use of metaphor, a rhetorical device commonly used by humans, in social chatbots. Our work first designs a metaphor generation framework that generates topic-aware and novel figurative sentences. By embedding the framework into a chatbot system, we then enable the chatbot to communicate with users using figurative language. Human annotators validate the novelty and properness of the generated metaphors. More importantly, we evaluate the effects of employing metaphors in human-chatbot conversations. Experiments indicate that our system effectively arouses user interest in communicating with our chatbot, resulting in significantly longer human-chatbot conversations.
Sound effects play an essential role in producing high-quality radio stories but require enormous labor to add. In this paper, we address the problem of automatically adding sound effects to radio stories with a retrieval-based model. However, directly implementing a tag-based retrieval model leads to many false positives due to the ambiguity of story content. To solve this problem, we introduce a retrieval-based framework hybridized with a semantic inference model, which helps achieve robust retrieval results. Our model relies on carefully designed features extracted from the context of candidate triggers. We collect two story dubbing datasets through crowdsourcing to analyze the setting of adding sound effects and to train and test our proposed methods. We further discuss the importance of each feature and introduce several heuristic rules for the trade-off between precision and recall. Together with text-to-speech technology, our results reveal a promising automatic pipeline for producing high-quality radio stories.
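The false-positive problem described in this abstract can be illustrated with a minimal sketch. It is not the authors' model: the tag table, the file names, and the negation-cue rule are all hypothetical, and the simple context rule only stands in for the semantic inference model to show why looking at the context of a candidate trigger helps.

```python
# Hypothetical tag-to-sound lookup; tags and file names are illustrative only.
SOUND_TAGS = {"rain": "rain.wav", "door": "door_knock.wav", "dog": "dog_bark.wav"}

# Hypothetical cue words suggesting that the trigger is not an audible event.
NEGATING_CUES = {"no", "not", "never", "without"}


def tokenize(sentence):
    """Lowercase the sentence and strip basic punctuation before splitting."""
    return sentence.lower().replace(".", "").replace(",", "").split()


def tag_based_retrieve(sentence):
    """Naive retrieval: every token that matches a tag triggers a sound."""
    return [(tok, SOUND_TAGS[tok]) for tok in tokenize(sentence) if tok in SOUND_TAGS]


def context_filtered_retrieve(sentence, window=3):
    """Keep a trigger only if no negating cue appears in the preceding window.

    This hand-written rule is a stand-in (an assumption) for a learned
    semantic inference model over context features.
    """
    tokens = tokenize(sentence)
    results = []
    for i, tok in enumerate(tokens):
        if tok in SOUND_TAGS:
            context = set(tokens[max(0, i - window):i])
            if not (NEGATING_CUES & context):
                results.append((tok, SOUND_TAGS[tok]))
    return results
```

On a sentence such as "There was no rain that day.", the naive retriever still fires the rain sound, while the context-aware variant suppresses it. A stricter filter of this kind raises precision at the cost of recall, which is the trade-off the abstract mentions.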
Divergent word usage reflects differences among people. In this paper, we present a novel angle for studying word usage divergence: word interpretations. We propose an approach that quantifies semantic differences in interpretations among different groups of people. The effectiveness of our approach is validated by quantitative evaluations. Experimental results indicate that divergences in word interpretations exist. We further apply the approach to two well-studied types of differences between people: gender and region. The detected words with divergent interpretations reveal the unique features of specific groups of people. For gender, we discover that certain differences in interests, social attitudes, and character between males and females are reflected in their divergent interpretations of many words. For region, we find that specific interpretations of certain words reveal the geographical and cultural features of different regions.
Vision is a common source of inspiration for poetry. The objects and the sentimental imprints that one perceives from an image may lead to various feelings depending on the reader. In this paper, we present a system of poetry generation from images that mimics this process. Given an image, we first extract a few keywords representing objects and sentiments perceived from the image. These keywords are then expanded to related ones based on their associations in human-written poems. Finally, verses are generated gradually from the keywords using recurrent neural networks trained on existing poems. Our approach is evaluated by human assessors and compared to other generation baselines. The results show that our method can generate poems that are more artistic than those of the baseline methods. This is one of the few attempts to generate poetry from images. By deploying our proposed approach, XiaoIce has already generated more than 12 million poems for users since its release in July 2017. A book of its poems has been published by Cheers Publishing, which claimed that the book is the first-ever poetry collection written by an AI in human history.
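The keyword expansion step in this abstract can be sketched as follows. This is a minimal illustration that assumes "association" means raw co-occurrence within the same poem; the abstract does not specify the measure, and the function names and toy corpus are hypothetical.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence(poems):
    """Count how often two distinct words appear in the same poem."""
    counts = {}
    for poem in poems:
        words = sorted(set(poem.lower().split()))
        for a, b in combinations(words, 2):
            counts.setdefault(a, Counter())[b] += 1
            counts.setdefault(b, Counter())[a] += 1
    return counts


def expand_keywords(keywords, cooc, top_k=2):
    """Append the top_k most associated words for each seed keyword."""
    expanded = list(keywords)
    for kw in keywords:
        for word, _count in cooc.get(kw, Counter()).most_common(top_k):
            if word not in expanded:
                expanded.append(word)
    return expanded


# Toy corpus (an assumption); a real system would use a large poem collection.
toy_poems = ["moon river silver", "moon river night", "moon silver"]
toy_cooc = build_cooccurrence(toy_poems)
```

Calling `expand_keywords(["moon"], toy_cooc)` keeps the seed keyword first and adds "river" and "silver", which each co-occur with "moon" twice, while "night" (one co-occurrence) is left out. The expanded list would then seed the verse generator.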
A storyboard is a sequence of images that illustrates a story containing multiple sentences, and creating one is a key step in producing different story products. In this paper, we tackle a new multimedia task of automatic storyboard creation to facilitate this process and inspire human artists. Motivated by the fact that our understanding of language is based on our past experience, we propose a novel inspire-and-create framework with a story-to-image retriever that selects relevant cinematic images for inspiration and a storyboard creator that further refines and renders images to improve relevancy and visual consistency. The proposed retriever dynamically employs contextual information in the story with hierarchical attention and applies dense visual-semantic matching to accurately retrieve and ground images. The creator then employs three rendering steps to increase the flexibility of the retrieved images: erasing irrelevant regions, unifying the styles of images, and substituting consistent characters. We carry out extensive experiments on both in-domain and out-of-domain visual story datasets. The proposed model achieves better quantitative performance than state-of-the-art baselines for storyboard creation. Qualitative visualizations and user studies further verify that our approach can create high-quality storyboards even for stories in the wild.