Deep reinforcement learning has become popular over recent years, showing superiority on different visual-input tasks such as playing Atari games and robot navigation. Although objects are important image elements, few work considers enhancing deep reinforcement learning with object characteristics. In this paper, we propose a novel method that can incorporate object recognition processing to deep reinforcement learning models. This approach can be adapted to any existing deep reinforcement learning frameworks. State-of-the-art results are shown in experiments on Atari games. We also propose a new approach called "object saliency maps" to visually explain the actions made by deep reinforcement learning agents.

* 15 pages, 6 figures, Accepted at 3rd Global Conference on Artificial Intelligence (GCAI-17), Miami, 2017
Click to Read Paper
Photorealistic style transfer is a technique which transfers colour from one reference domain to another domain by using deep learning and optimization techniques. Here, we present a technique which we use to transfer style and colour from a reference image to a video.

Click to Read Paper
We show that there is a largely unexplored class of functions (positive polymatroids) that can define proper discrete metrics over pairs of binary vectors and that are fairly tractable to optimize over. By exploiting submodularity, we are able to give hardness results and approximation algorithms for optimizing over such metrics. Additionally, we demonstrate empirically the effectiveness of these metrics and associated algorithms on both a metric minimization task (a form of clustering) and also a metric maximization task (generating diverse k-best lists).

* 15 pages, 1 figure, a short version of this will appear in the NIPS 2015 conference
Click to Read Paper
Autonomous AI systems will be entering human society in the near future to provide services and work alongside humans. For those systems to be accepted and trusted, the users should be able to understand the reasoning process of the system, i.e. the system should be transparent. System transparency enables humans to form coherent explanations of the system's decisions and actions. Transparency is important not only for user trust, but also for software debugging and certification. In recent years, Deep Neural Networks have made great advances in multiple application areas. However, deep neural networks are opaque. In this paper, we report on work in transparency in Deep Reinforcement Learning Networks (DRLN). Such networks have been extremely successful in accurately learning action control in image input domains, such as Atari games. In this paper, we propose a novel and general method that (a) incorporates explicit object recognition processing into deep reinforcement learning models, (b) forms the basis for the development of "object saliency maps", to provide visualization of internal states of DRLNs, thus enabling the formation of explanations and (c) can be incorporated in any existing deep reinforcement learning framework. We present computational results and human experiments to evaluate our approach.

* 8 pages, 5 figures, Accepted at AAAI/ACM Conference on AI, Ethics, and Society 2018
Click to Read Paper
Due to the lack of structured knowledge applied in learning distributed representation of cate- gories, existing work cannot incorporate category hierarchies into entity information. We propose a framework that embeds entities and categories into a semantic space by integrating structured knowledge and taxonomy hierarchy from large knowledge bases. The framework allows to com- pute meaningful semantic relatedness between entities and categories. Our framework can han- dle both single-word concepts and multiple-word concepts with superior performance on concept categorization and yield state of the art results on dataless hierarchical classification.

* 10 pages, submitted to Coling 2016. arXiv admin note: substantial text overlap with arXiv:1605.03924
Click to Read Paper
Due to the lack of structured knowledge applied in learning distributed representation of categories, existing work cannot incorporate category hierarchies into entity information.~We propose a framework that embeds entities and categories into a semantic space by integrating structured knowledge and taxonomy hierarchy from large knowledge bases. The framework allows to compute meaningful semantic relatedness between entities and categories.~Compared with the previous state of the art, our framework can handle both single-word concepts and multiple-word concepts with superior performance in concept categorization and semantic relatedness.

* 10 pages
Click to Read Paper
Existing video indexing and retrieval methods on popular web-based multimedia sharing websites are based on user-provided sparse tagging. This paper proposes a very specific way of searching for video clips, based on the content of the video. We present our work on Content-based Video Indexing and Retrieval using the Correspondence-Latent Dirichlet Allocation (corr-LDA) probabilistic framework. This is a model that provides for auto-annotation of videos in a database with textual descriptors, and brings the added benefit of utilizing the semantic relations between the content of the video and text. We use the concept-level matching provided by corr-LDA to build correspondences between text and multimedia, with the objective of retrieving content with increased accuracy. In our experiments, we employ only the audio components of the individual recordings and compare our results with an SVM-based approach.

* 7 pages
Click to Read Paper