Hierarchical Reinforcement Learning in Turn-Based Strategy Games
Background
Strategy games offer a rich environment for testing AI approaches. They pose a range of challenges to which learning can be applied, including teamwork, adversarial play, exploration, and division of labour. As a result, a number of research papers have used these games as a testing environment. The goal of this thesis is to investigate the potential application of hierarchical reinforcement learning in an open-source turn-based strategy (TBS) game.
Reinforcement learning (RL) describes a class of learning algorithms that let an agent optimize its behavior by trial and error. In the standard problem formulation, the agent is situated in an environment that can assume a number of states. At each time step, the environment is in a certain state and the agent selects an action to perform. The result of this action is an immediate reward for the agent and a change in the environment state. The agent's goal is to learn a mapping from states to actions (a policy) that maximizes its long-term reward.
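To make the state, action, and reward loop concrete, the following is a minimal sketch of tabular Q-learning, one common RL algorithm, on a toy problem. Everything in it is an assumption made for illustration: the five-cell corridor, the class and method names, and the learning parameters are hypothetical and are not part of FreeCol or of the thesis specification.

```java
import java.util.Random;

/** Tabular Q-learning on a hypothetical 5-cell corridor (illustration only, not FreeCol code). */
class CorridorQLearning {
    static final int STATES = 5;   // cells 0..4; cell 4 is the goal and ends the episode
    static final int ACTIONS = 2;  // 0 = move left, 1 = move right

    /** Environment dynamics: the next state for a state-action pair, clamped to the corridor. */
    static int step(int state, int action) {
        int next = (action == 1) ? state + 1 : state - 1;
        return Math.max(0, Math.min(STATES - 1, next));
    }

    /** Immediate reward: 1 when the goal cell is reached, 0 otherwise. */
    static double reward(int nextState) {
        return nextState == STATES - 1 ? 1.0 : 0.0;
    }

    /** Greedy action for one row of the Q-table; ties go to "right". */
    static int greedy(double[] qRow) {
        return qRow[1] >= qRow[0] ? 1 : 0;
    }

    /** Runs the trial-and-error loop and returns the learned Q-table. */
    static double[][] train(int episodes, long seed) {
        double alpha = 0.5;    // learning rate (assumed value)
        double gamma = 0.9;    // discount factor (assumed value)
        double epsilon = 0.2;  // exploration probability (assumed value)
        Random rng = new Random(seed);
        double[][] q = new double[STATES][ACTIONS];
        for (int e = 0; e < episodes; e++) {
            int s = 0;
            while (s != STATES - 1) {
                // Explore with probability epsilon, otherwise act greedily.
                int a = rng.nextDouble() < epsilon ? rng.nextInt(ACTIONS) : greedy(q[s]);
                int s2 = step(s, a);
                // Move the estimate toward reward plus discounted best future value.
                double target = reward(s2) + gamma * Math.max(q[s2][0], q[s2][1]);
                q[s][a] += alpha * (target - q[s][a]);
                s = s2;
            }
        }
        return q;
    }

    public static void main(String[] args) {
        double[][] q = train(500, 42L);
        for (int s = 0; s < STATES - 1; s++) {
            System.out.println("state " + s + " -> " + (greedy(q[s]) == 1 ? "right" : "left"));
        }
    }
}
```

The learned Q-table plays the role of the state-to-action mapping: reading off the greedy action in each state gives the policy. A game such as FreeCol has far too many states for a plain table, which is part of the motivation for the hierarchical approach.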
Hierarchical reinforcement learning (HRL) is an extension of reinforcement learning that decomposes the learning problem into a hierarchy of smaller subproblems. This decomposition can significantly speed up learning, and allows the solutions for certain subproblems to be reused in multiple places in the hierarchy.
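The decomposition and reuse described above can be sketched structurally as follows. This is only an illustration under stated assumptions: the `Subtask` interface, the `MoveTo` subtask, and the one-dimensional world are hypothetical, and the subtask policy is hand-coded here, whereas in HRL each subtask's policy would itself be learned.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Root task of a hypothetical two-level hierarchy (illustration only, not FreeCol code). */
class HierarchyDemo {
    /** Runs each child subtask to completion, in order, and returns the visited positions. */
    static List<Integer> run(int start, List<Subtask> children) {
        List<Integer> trace = new ArrayList<>();
        int position = start;
        trace.add(position);
        for (Subtask child : children) {
            // Control stays with the child until its own termination condition holds.
            while (!child.isDone(position)) {
                position += child.chooseMove(position);
                trace.add(position);
            }
        }
        return trace;
    }

    public static void main(String[] args) {
        // The same MoveTo subtask type is reused twice with different targets.
        List<Subtask> plan = Arrays.asList(new MoveTo(3), new MoveTo(1));
        System.out.println(run(0, plan));
    }
}

/** A subtask: chooses primitive moves until its own goal is met. */
interface Subtask {
    boolean isDone(int position);
    int chooseMove(int position);  // -1 = step left, +1 = step right
}

/** Reusable subtask: walk along a line to a target cell (hand-coded stand-in for a learned policy). */
class MoveTo implements Subtask {
    private final int target;

    MoveTo(int target) {
        this.target = target;
    }

    public boolean isDone(int position) {
        return position == target;
    }

    public int chooseMove(int position) {
        return position < target ? 1 : -1;
    }
}
```

Because the root task only sees subtasks through the `Subtask` interface, a solution to one subproblem can be plugged into several places in the hierarchy, which is the reuse property mentioned above.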
During this thesis you will implement and evaluate hierarchical reinforcement learning in the FreeCol strategy game. FreeCol is an open-source reimplementation of Sid Meier's Colonization (a Civilization-style TBS game).
Prerequisites
Students should have some knowledge of reinforcement learning (e.g. from the courses Machine Learning or Multi-agent Learning) or be willing to acquire this knowledge during the thesis.
The FreeCol code base is written in Java. The student will be required to understand and modify this implementation.
Requirements
This thesis has two main components:
- Modifying the FreeCol code to allow custom AI clients to be added and training scenarios to be created.
- Implementing and evaluating an HRL client.
Links
Contacts
- Ann Nowe (promotor): ann.nowe@vub.ac.be

