Reinforcement Learning (RL) is a research area that has blossomed tremendously in recent years and has shown remarkable potential for artificial intelligence based opponents in computer games. This success is primarily due to the vast capabilities of Convolutional Neural Networks (ConvNets), which enable algorithms to extract useful information from noisy environments. The Capsule Network (CapsNet) is a recent addition to the family of Deep Learning algorithms and has only barely begun to be explored. The network is an architecture for image classification, with superior performance in classifying the MNIST dataset. CapsNets have not been explored beyond image classification. This thesis introduces the use of CapsNet for Q-Learning based game algorithms. To successfully apply CapsNet in advanced gameplay, three main contributions follow. First, the thesis introduces four new game environments as frameworks for RL research with increasing complexity, namely Flash RL, Deep Line Wars, Deep RTS, and Deep Maze. These environments fill the gap between the relatively simple and the more complex game environments available for RL research, and are used in the thesis to test and explore CapsNet behavior. Second, the thesis introduces a generative modeling approach to produce artificial training data for use in Deep Learning models, including CapsNets. We empirically show that conditional generative modeling can successfully generate game data of sufficient quality to train a Deep Q-Network well. Third, we show that CapsNet is a reliable architecture for Deep Q-Learning based algorithms for game AI. A capsule is a group of neurons that determines the presence of objects in the data, and it has been shown in the literature to increase the robustness of training and predictions while lowering the amount of training data needed. It should, therefore, be ideally suited for gameplay.
* Master Thesis in Computer Science
Reinforcement learning has shown great potential in generalizing over raw sensory data using only a single neural network for value optimization. Several challenges in current state-of-the-art reinforcement learning algorithms prevent them from converging towards the global optimum. It is likely that the solution to these problems lies in short- and long-term planning, exploration, and memory management for reinforcement learning algorithms. Games are often used to benchmark reinforcement learning algorithms because they provide a flexible, reproducible, and easy-to-control environment. Nevertheless, few games feature a state-space where results in exploration, memory, and planning are easily perceived. This paper presents the Dreaming Variational Autoencoder (DVAE), a neural network based generative modeling architecture for exploration in environments with sparse feedback. We further present Deep Maze, a novel and flexible maze engine that challenges DVAE in partially and fully observable state-spaces, long-horizon tasks, and deterministic and stochastic problems. We show initial findings and encourage further work in reinforcement learning driven by generative exploration.
* Best Student Paper Award, Proceedings of the 38th SGAI International
Conference on Artificial Intelligence, Cambridge, UK, 2018, Artificial
Intelligence XXXV, 2018
Reinforcement learning (RL) is an area of research that has blossomed tremendously in recent years and has shown remarkable potential for artificial intelligence based opponents in computer games. This success is primarily due to the vast capabilities of convolutional neural networks, which can extract useful features from noisy and complex data. Games are excellent tools to test and push the boundaries of novel RL algorithms because they give valuable insight into how well an algorithm can perform in isolated environments without real-life consequences. Real-time strategy (RTS) games are a genre of tremendous complexity that challenges the player in short- and long-term planning. Much research focuses on applied RL in RTS games, and novel advances are therefore anticipated in the not too distant future. However, there are to date few environments for testing RTS AIs. Environments in the literature are often either overly simplistic, such as microRTS, or complex and without the possibility for accelerated learning on consumer hardware, like StarCraft II. This paper introduces the Deep RTS game environment for testing cutting-edge artificial intelligence algorithms for RTS games. Deep RTS is a high-performance RTS game made specifically for artificial intelligence research. It supports accelerated learning, meaning that learning can proceed on the order of 50 000 times faster than in existing RTS games. Deep RTS has a flexible configuration, enabling research in several different RTS scenarios, including partially observable state-spaces and varying map complexity. We show that Deep RTS lives up to our promises by comparing its performance with microRTS, ELF, and StarCraft II on high-end consumer hardware. Using Deep RTS, we show that a Deep Q-Network agent beats random-play agents over 70% of the time. Deep RTS is publicly available at https://github.com/cair/DeepRTS.
* Proceedings of the IEEE International Conference on Computational
Intelligence and Games (CIG 2018)
Reinforcement Learning (RL) is a research area that has blossomed tremendously in recent years and has shown remarkable potential in, among other things, successfully playing computer games. However, there exist only a few game platforms that provide the diversity in tasks and state-space needed to advance RL algorithms. The existing platforms offer RL access to Atari games and a few web-based games, but no platform fully exposes access to Flash games. This is unfortunate because applying RL to Flash games has the potential to push the research of RL algorithms. This paper introduces the Flash Reinforcement Learning platform (FlashRL), which attempts to fill this gap by providing an environment for thousands of Flash games on a novel platform for Flash automation. It opens up easy experimentation with RL algorithms for Flash games, which has previously been challenging. The platform shows excellent performance, with as little as 5% CPU utilization on consumer hardware. It shows promising results for novel reinforcement learning algorithms.
* 12 Pages, Proceedings of the 30th Norwegian Informatics Conference,
Oslo, Norway 2017
There have been numerous breakthroughs with reinforcement learning in recent years, perhaps most notably Deep Reinforcement Learning successfully playing and winning relatively advanced computer games. There is undoubtedly an anticipation that Deep Reinforcement Learning will play a major role when the first AI masters the complicated gameplay needed to beat a professional Real-Time Strategy game player. For this to be possible, there needs to be a game environment that targets and fosters AI research, and specifically Deep Reinforcement Learning. Some game environments already exist; however, these are either overly simplistic, such as Atari 2600, or complex, such as StarCraft II from Blizzard Entertainment. We propose a game environment in between Atari 2600 and StarCraft II, particularly targeting Deep Reinforcement Learning algorithm research. The environment is a variant of Tower Line Wars from Warcraft III, Blizzard Entertainment. Further, as a proof of concept that the environment can harbor Deep Reinforcement Learning algorithms, we propose and apply a Deep Q-Reinforcement architecture. The architecture simplifies the state space so that it is applicable to Q-learning, and in turn improves performance compared to current state-of-the-art methods. Our experiments show that the proposed architecture can learn to play the environment well and scores 33% better than standard Deep Q-learning, which in turn demonstrates the usefulness of the game environment.
* Proceedings of the 37th SGAI International Conference on Artificial
Intelligence, Cambridge, UK, 2017, Artificial Intelligence XXXIV, 2017
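Several of the abstracts above build on Q-learning. As a rough illustration of the underlying value update only, here is a minimal tabular sketch on a hypothetical five-cell corridor. The environment, the function names, and the hyperparameters are all assumptions made for illustration; this is not the Deep Q-Reinforcement architecture or any of the game environments described in the papers.

```python
import random


def q_learning_update(q, state, action, reward, next_state, done,
                      alpha=0.1, gamma=0.99):
    """Apply one tabular Q-learning update in place and return the TD error."""
    # Terminal transitions have no future value to bootstrap from.
    best_next = 0.0 if done else max(q[next_state])
    td_error = reward + gamma * best_next - q[state][action]
    q[state][action] += alpha * td_error
    return td_error


def train_corridor(n_states=5, episodes=500, epsilon=0.1, seed=0):
    """Train on a hypothetical 1-D corridor (illustrative assumption).

    Action 0 moves left, action 1 moves right. Reaching the rightmost
    cell yields reward 1 and ends the episode; all other steps yield 0.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap the episode length
            if rng.random() < epsilon:
                action = rng.randrange(2)  # explore
            else:
                # Exploit, breaking ties between equal values at random.
                best = max(q[state])
                action = rng.choice([a for a in (0, 1) if q[state][a] == best])
            next_state = max(state - 1, 0) if action == 0 else state + 1
            done = next_state == n_states - 1
            reward = 1.0 if done else 0.0
            q_learning_update(q, state, action, reward, next_state, done)
            state = next_state
            if done:
                break
    return q


if __name__ == "__main__":
    table = train_corridor()
    # Greedy action for each non-terminal cell.
    print([max((0, 1), key=lambda a: row[a]) for row in table[:-1]])
```

A Deep Q-Network replaces the table `q` with a neural network that estimates the same values from raw observations; the update target keeps the same form.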
With the increasing popularity of online learning, intelligent tutoring systems are regaining attention. In this paper, we introduce adaptive algorithms for the personalized assignment of learning tasks to students so as to improve their performance in online learning environments. As the main contribution of this paper, we propose a novel Skill-Based Task Selector (SBTS) algorithm which is able to approximate a student's skill level based on their performance and consequently suggest adequate assignments. The SBTS is inspired by the class of multi-armed bandit algorithms. However, in contrast to standard multi-armed bandit approaches, the SBTS aims at resolving two questions related to student learning, namely which topics the student should work on, and what level of difficulty the task should have. The SBTS centers on innovative reward and punishment schemes in a task and skill matrix based on the student's behaviour. To verify the algorithm, the complex student behaviour is modelled using a neighbour node selection approach based on empirical estimations of a student's learning curve. The algorithm is evaluated with a practical scenario from a basic Java programming course. The SBTS is able to quickly and accurately adapt to the composite student competency, even with a multitude of student models.
* 6th International Conference on Web Intelligence
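The abstract above describes reward and punishment schemes over a task and skill matrix without giving the rules. To make the general idea concrete, here is a minimal bandit-style sketch under stated assumptions. The class name, the multiplicative `reward` and `penalty` factors, and the rule of shifting preference to a neighbouring difficulty level are all hypothetical choices for illustration, not the SBTS algorithm from the paper.

```python
import random


class TaskSelectorSketch:
    """Hypothetical selector over a (topic, difficulty level) weight matrix."""

    def __init__(self, topics, n_levels, reward=1.2, penalty=0.8, seed=None):
        self.n_levels = n_levels
        self.reward = reward    # assumed multiplicative boost
        self.penalty = penalty  # assumed multiplicative reduction
        self.rng = random.Random(seed)
        # Every (topic, level) cell starts with the same weight.
        self.weights = {(t, d): 1.0 for t in topics for d in range(n_levels)}

    def select(self):
        """Sample a (topic, level) pair with probability proportional to weight."""
        arms = list(self.weights)
        probs = [self.weights[a] for a in arms]
        return self.rng.choices(arms, weights=probs, k=1)[0]

    def update(self, topic, level, solved):
        """Shift preference away from the attempted cell.

        Assumed rule: a solved task boosts the next harder level of the same
        topic, a failed task boosts the next easier level. At the boundary
        levels the target is the attempted cell itself.
        """
        if solved:
            target = min(level + 1, self.n_levels - 1)
        else:
            target = max(level - 1, 0)
        self.weights[(topic, level)] *= self.penalty
        self.weights[(topic, target)] *= self.reward


if __name__ == "__main__":
    selector = TaskSelectorSketch(["loops", "arrays"], n_levels=3, seed=0)
    selector.update("loops", 0, solved=True)
    print(selector.select())
```

Sampling in proportion to weight, rather than always picking the heaviest cell, keeps some exploration across topics, which is the usual motivation for bandit-style selection.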