Models, code, and papers for "Zixun Chen":
In order to understand content and automatically extract labels for videos of the game "Honor of Kings", it is necessary to detect and recognize characters (called "hero") together with their camps in the game video. In this paper, we propose an efficient two-stage algorithm to detect and recognize heros in game videos. First, we detect all heros in a video frame based on blood bar template-matching method, and classify them according to their camps (self/ friend/ enemy). Then we recognize the name of each hero using one or more deep convolution neural networks. Our method needs almost no work for labelling training and testing samples in the recognition stage. Experiments show its efficiency and accuracy in the task of hero detection and recognition in game videos.
In this paper, we propose Evebot, an innovative, sequence to sequence (Seq2seq) based, fully generative conversational system for the diagnosis of negative emotions and prevention of depression through positively suggestive responses. The system consists of an assembly of deep-learning based models, including Bi-LSTM based model for detecting negative emotions of users and obtaining psychological counselling related corpus for training the chatbot, anti-language sequence to sequence neural network, and maximum mutual information (MMI) model. As adolescents are reluctant to show their negative emotions in physical interaction, traditional methods of emotion analysis and comforting methods may not work. Therefore, this system puts emphasis on using virtual platform to detect signs of depression or anxiety, channel adolescents' stress and mood, and thus prevent the emergence of mental illness. We launched the integrated chatbot system onto an online platform for real-world campus applications. Through a one-month user study, we observe better results in the increase in positivity than other public chatbots in the control group.
Online personalized news product needs a suitable cover for the article. The news cover demands to be with high image quality, and draw readers' attention at same time, which is extraordinary challenging due to the subjectivity of the task. In this paper, we assess the news cover from image clarity and object salience perspective. We propose an end-to-end multi-task learning network for image clarity assessment and semantic segmentation simultaneously, the results of which can be guided for news cover assessment. The proposed network is based on a modified DeepLabv3+ model. The network backbone is used for multiple scale spatial features exaction, followed by two branches for image clarity assessment and semantic segmentation, respectively. The experiment results show that the proposed model is able to capture important content in images and performs better than single-task learning baselines on our proposed game content based CIA dataset.
With the increasing popularity of E-sport live, Highlight Flashback has been a critical functionality of live platforms, which aggregates the overall exciting fighting scenes in a few seconds. In this paper, we introduce a novel training strategy without any additional annotation to automatically generate highlights for game video live. Considering that the existing manual edited clips contain more highlights than long game live videos, we perform pair-wise ranking constraints across clips from edited and long live videos. A multi-stream framework is also proposed to fuse spatial, temporal as well as audio features extracted from videos. To evaluate our method, we test on long game live videos with an average length of about 15 minutes. Extensive experimental results on videos demonstrate its satisfying performance on highlights generation and effectiveness by the fusion of three streams.