A multi-agent deep reinforcement learning experiment: designing collaborative and competitive agents for grid-based arena environments such as splix.io, paper.io and tileman.io.
| Aggressive | Balanced | Defensive |
| --- | --- | --- |
| Prioritize kills | Land capture and low-risk kills | Prioritize land capture, avoid combat |
This repository has two main components: deep reinforcement learning algorithms and the grid environment.
- Agents control a character in a grid-based map and aim to own as much land as possible by capturing it with their trail.
- An agent can kill another agent (or itself) by colliding with any of that agent's trail segments.
Agents are rewarded for their current land area and each player kill.
A simple reward function is ideal in RL, since it leaves more room for creativity in the AI's behaviour and is also less computationally expensive, which matters because it is evaluated at every step of the game (for every tile moved).
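As a rough sketch, a per-step reward along these lines could be written as below. The names (`LAND_WEIGHT`, `KILL_BONUS`, `agent.owned_tiles`, `agent.kills_this_step`) and coefficient values are illustrative assumptions, not the repository's actual API.

```python
# Hypothetical per-step reward: land currently owned plus a bonus per kill.
# Names and coefficients are illustrative assumptions, not the repo's API.
LAND_WEIGHT = 0.01   # weight on the number of tiles currently owned
KILL_BONUS = 1.0     # flat bonus for each player killed this step

def step_reward(agent) -> float:
    land_term = LAND_WEIGHT * agent.owned_tiles      # current land area
    kill_term = KILL_BONUS * agent.kills_this_step   # kills made this step
    return land_term + kill_term
```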
A player spawns in a 5x5 area of their own land.
A CNN is required for the AI to have spatial understanding - to avoid enemies and maximize the area captured by its path.
Several scalar inputs are also useful for passing continuous values that vary within a range, or discrete values (flags or states we can enumerate). These can be used to force certain behaviours.
The game grid is viewed as an image with 3 color channels.
- The 1st channel represents the state of the grid: empty, player block, or enemy block. Up to `n` unique enemies can be represented (a configurable parameter) within the player's FOV at any instant.
- The 2nd channel represents the trail/path taken by the player.
- The 3rd channel represents the trails/paths of the (up to) `n` enemies simultaneously.
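For illustration, the sketch below builds such a 3-channel observation with NumPy. The helper names (`land_mask`, `trail_mask`) and the integer codes are assumptions made for this example, not the repository's actual encoding.

```python
import numpy as np

# Illustrative 3-channel encoding of the player's FOV.
# Helper names (land_mask, trail_mask) and integer codes are assumptions.
def encode_observation(fov_size: int, player, enemies) -> np.ndarray:
    obs = np.zeros((3, fov_size, fov_size), dtype=np.float32)

    # Channel 0: grid state - empty (0), player land (1), enemy land (2..n+1).
    obs[0][player.land_mask] = 1.0
    for i, enemy in enumerate(enemies):        # up to n unique enemies
        obs[0][enemy.land_mask] = 2.0 + i

    # Channel 1: the trail/path taken by the player.
    obs[1][player.trail_mask] = 1.0

    # Channel 2: trails/paths of the (up to) n enemies, tagged by index.
    for i, enemy in enumerate(enemies):
        obs[2][enemy.trail_mask] = 1.0 + i

    return obs
```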
Training a CNN requires a much larger training set. In RL this would require gathering more state transition tuples or simply experiencing the game for longer.
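For concreteness, a minimal policy network combining the 3-channel image with a vector of scalar inputs might look like the PyTorch sketch below; the layer sizes, names, and default parameters are assumptions for illustration, not the architecture actually used in this repository.

```python
import torch
import torch.nn as nn

class GridPolicy(nn.Module):
    """Illustrative CNN + scalar-input policy head; sizes are assumptions."""

    def __init__(self, fov_size: int = 11, n_scalars: int = 4, n_actions: int = 4):
        super().__init__()
        # Convolutional trunk over the 3-channel FOV image.
        self.conv = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Flatten(),
        )
        conv_out = 32 * fov_size * fov_size
        # Fully connected head that also ingests the scalar inputs.
        self.head = nn.Sequential(
            nn.Linear(conv_out + n_scalars, 128), nn.ReLU(),
            nn.Linear(128, n_actions),
        )

    def forward(self, image: torch.Tensor, scalars: torch.Tensor) -> torch.Tensor:
        features = self.conv(image)                    # (batch, conv_out)
        return self.head(torch.cat([features, scalars], dim=1))
```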
Three primary play styles arise from these rules; these form the primary AIs.
- Aggressive - optimize for player kills over survival and land capturing.
- Balanced - capture land while capitalizing on opportunities for kills and avoiding unnecessary risk.
- Defensive - optimize for land capture, ignoring player kills and any danger.
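One plausible way to realise these classes is to weight the reward terms differently per play style, as in the hypothetical configuration below (the keys and values are illustrative, not the repository's actual parameters).

```python
# Hypothetical reward weightings per play style (values are illustrative).
PLAY_STYLES = {
    "aggressive": {"land_weight": 0.005, "kill_bonus": 2.0},   # kills over land
    "balanced":   {"land_weight": 0.01,  "kill_bonus": 1.0},   # both objectives
    "defensive":  {"land_weight": 0.02,  "kill_bonus": 0.0},   # land only
}
```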
However, these classes can be expanded on with new gameplay objectives.
- Given the position of a mobile target (a player or a position), attempt to achieve a goal. The target may be outside of the FOV.
- Assassin
  - hunts down the target player, optimally pathing towards the target while capturing land
  - ignores other enemy players
  - avoids death from excessive risk
  - kills the target with a high-risk, sacrificial play style
- Defender
  - optimally paths towards the target
  - once the target is found, fights nearby enemies while ensuring the target remains protected
- A group of agents which must maintain a formation to ensure an objective or prevent a failure condition.
- Guardian Formation
  - Defenders surround their target to prevent its death at all costs.
  - Consists of multiple Defender agents which must coordinate with each other and contend with enemy Guardian formations.
- Capturing all of an enemy's land renders them vulnerable and reduces their score.
- Consumer
  - specifically captures enemy land, optimally pathing to the enemy's area
  - ignores other players
This simulation can be run to train AIs.
- Docker
- A GPU which can be exposed using `--gpus` with `docker`. Read more here.
`./run-offline-nvidia.sh` is the entrypoint script and passes all arguments to `main.py`.
- `./run-offline-nvidia.sh`
- `./run-offline-nvidia.sh train`
- `./run-offline-nvidia.sh eval`