Feat: Full Level-Based Foraging(LBF) environment (#218)
Co-authored-by: Sasha Abramowitz <[email protected]>
Co-authored-by: Simon du Toit <[email protected]>
3 people authored Oct 30, 2024
1 parent 04bf8fb commit 85333d7
Showing 31 changed files with 3,054 additions and 5 deletions.
4 changes: 3 additions & 1 deletion README.md
@@ -39,7 +39,7 @@
</div>
<div class="row" align="center">
<img src="docs/env_anim/multi_cvrp.gif" alt="MultiCVRP" width="16%">
- <img src="docs/env_anim/pac_man.gif" alt="PacMan" width="16%">
+ <img src="docs/env_anim/pac_man.gif" alt="PacMan" width="12.9%">
<img src="docs/env_anim/robot_warehouse.gif" alt="RobotWarehouse" width="16%">
<img src="docs/env_anim/rubiks_cube.gif" alt="RubiksCube" width="16%">
<img src="docs/env_anim/sliding_tile_puzzle.gif" alt="SlidingTilePuzzle" width="16%">
@@ -50,6 +50,7 @@
<img src="docs/env_anim/sudoku.gif" alt="Sudoku" width="16%">
<img src="docs/env_anim/tetris.gif" alt="Tetris" width="16%">
<img src="docs/env_anim/tsp.gif" alt="Tetris" width="16%">
<img src="docs/env_anim/lbf.gif" alt="Level-Based Foraging" width="16%">
</div>
</div>

@@ -121,6 +122,7 @@ problems.
| Multi Minimum Spanning Tree Problem | Routing | `MMST-v0` | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/routing/mmst) | [doc](https://instadeepai.github.io/jumanji/environments/mmst/) |
| ᗧ•••ᗣ•• PacMan | Routing | `PacMan-v1` | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/routing/pac_man/) | [doc](https://instadeepai.github.io/jumanji/environments/pac_man/)
| 👾 Sokoban | Routing | `Sokoban-v0` | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/routing/sokoban/) | [doc](https://instadeepai.github.io/jumanji/environments/sokoban/) |
| 🍎 Level-Based Foraging | Routing | `LevelBasedForaging-v0` | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/routing/lbf/) | [doc](https://instadeepai.github.io/jumanji/environments/lbf/) |

<h2 name="install" id="install">Installation 🎬</h2>

9 changes: 9 additions & 0 deletions docs/api/environments/lbf.md
@@ -0,0 +1,9 @@
::: jumanji.environments.routing.lbf.env.LevelBasedForaging
    selection:
        members:
            - __init__
            - reset
            - step
            - observation_spec
            - action_spec
            - render
Binary file added docs/env_anim/lbf.gif
43 changes: 43 additions & 0 deletions docs/environments/lbf.md
@@ -0,0 +1,43 @@
# Level-Based Foraging Environment

<p align="center">
<img src="../env_anim/lbf.gif" width="600"/>
</p>

We provide a JAX jit-able implementation of the [Level-Based Foraging](https://github.com/semitable/lb-foraging/tree/master)
environment.

Level-Based Foraging (LBF) is a mixed cooperative-competitive environment that emphasises coordination between agents. As illustrated above, agents are placed in a grid world and each agent is assigned a level.

To collect food, agents must be adjacent to it and the cumulative level of participating agents must meet or exceed the food's designated level. Agents receive points based on the level of the collected food and their own level.
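
The collection rule above can be sketched in plain Python. This is a minimal illustration, not the environment's JAX implementation: it assumes that "adjacent" means sharing an edge with the food cell and that only agents choosing the load action participate. The helper names `is_adjacent` and `can_collect` are hypothetical. The positions and levels are borrowed from the test fixtures in this commit.

```python
from typing import List, Tuple

Position = Tuple[int, int]  # (row, col)
# (position, level, loading) for a single agent; a hypothetical layout.
AgentInfo = Tuple[Position, int, bool]


def is_adjacent(a: Position, b: Position) -> bool:
    """True when the two cells share an edge (assumed meaning of 'adjacent')."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1]) == 1


def can_collect(food_pos: Position, food_level: int, agents: List[AgentInfo]) -> bool:
    """True when the loading agents next to the food jointly reach its level."""
    cumulative_level = sum(
        level
        for position, level, loading in agents
        if loading and is_adjacent(position, food_pos)
    )
    return cumulative_level >= food_level


# Food of level 4 at (2, 1); agents at (0, 0), (1, 1) and (2, 2).
agents = [((0, 0), 1, True), ((1, 1), 2, True), ((2, 2), 4, False)]
alone = can_collect((2, 1), 4, agents)      # only the level-2 agent is adjacent and loading
agents[2] = ((2, 2), 4, True)               # the level-4 agent now loads too
together = can_collect((2, 1), 4, agents)   # cumulative level 6 meets the food level
```

Here `alone` is `False` and `together` is `True`: the level-2 agent cannot lift the level-4 food by itself, but it can once the level-4 neighbour joins.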

## Observation

The **observation** seen by the agent is a `NamedTuple` containing the following:

- `agents_view`: jax array (int32) of shape `(num_agents, num_obs_features)`, array representing the agent's view of other agents
and food.

- `action_mask`: jax array (bool) of shape `(num_agents, 6)`, array specifying, for each agent,
which action (noop, up, down, left, right, load) is legal.

- `step_count`: jax array (int32) of shape `()`, number of steps elapsed in the current episode.

## Action

The action space is a `MultiDiscreteArray` containing an integer value in `[0, 1, 2, 3, 4, 5]` for each
agent. Each agent can take one of six actions: noop (`0`), up (`1`), down (`2`), left (`3`), right (`4`), or load, i.e. pick up food (`5`).
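
As a rough illustration of how the action indices translate into grid displacements, the sketch below mirrors the `MOVES` table defined in `constants.py` in this commit, using plain Python tuples instead of a JAX array. The `next_position` helper is hypothetical. Keeping an agent in place when a move would leave the grid is an assumption made for the sketch, since the environment itself reports illegal moves through `action_mask`.

```python
from typing import Tuple

NOOP, UP, DOWN, LEFT, RIGHT, LOAD = range(6)

# (row delta, col delta) per action, in the same order as MOVES in constants.py.
MOVES = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1), (0, 0)]


def next_position(position: Tuple[int, int], action: int, grid_size: int) -> Tuple[int, int]:
    """Apply one action to a (row, col) position on a square grid."""
    d_row, d_col = MOVES[action]
    row, col = position[0] + d_row, position[1] + d_col
    if 0 <= row < grid_size and 0 <= col < grid_size:
        return (row, col)
    # Assumption for this sketch: an out-of-bounds move leaves the agent where it is.
    return position


moved_up = next_position((2, 2), UP, grid_size=6)
moved_left = next_position((2, 2), LEFT, grid_size=6)
blocked = next_position((0, 0), UP, grid_size=6)
```

Note that `NOOP` and `LOAD` share the zero displacement: loading changes what the agent does, not where it stands.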

The episode terminates under the following conditions:

- An invalid action is taken, or

- An agent collides with another agent.

## Reward

The reward is equal to the sum of the levels of collected food divided by the level of the agents that collected them.
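
The sentence above leaves the exact formula open, and `env.py` is not shown in this excerpt. The sketch below shows one possible reading, assuming that a collected food item's level is split among the participating agents in proportion to their own levels. The function name `collection_rewards` is hypothetical, and the real environment may differ, for example through the `normalize_reward` and `penalty` options that appear in the test fixtures.

```python
from typing import List


def collection_rewards(food_level: int, participant_levels: List[int]) -> List[float]:
    """Split a food item's level among participants, proportionally to their levels.

    Assumed rule for illustration only: if the participants' cumulative level is
    below the food level, nothing is collected and every reward is zero.
    """
    total_level = sum(participant_levels)
    if total_level < food_level or total_level == 0:
        return [0.0 for _ in participant_levels]
    return [food_level * level / total_level for level in participant_levels]


shared = collection_rewards(food_level=4, participant_levels=[1, 3])
too_weak = collection_rewards(food_level=4, participant_levels=[1, 2])
```

Under this reading the rewards for a single food item always sum to its level, so stronger agents earn a larger share without changing the total available.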

## Registered Versions 📖

- `LevelBasedForaging-v0`, a grid with 2 agents, each with a field of view equal to the grid size (full observation case), and 2 food items, with cooperation between agents forced.
6 changes: 6 additions & 0 deletions jumanji/__init__.py
@@ -140,3 +140,9 @@
register(
    id="SlidingTilePuzzle-v0", entry_point="jumanji.environments:SlidingTilePuzzle"
)

# LevelBasedForaging with a random generator: grid size 8,
# 2 agents, 2 food items, and a maximum agent level of 2.
register(
    id="LevelBasedForaging-v0", entry_point="jumanji.environments:LevelBasedForaging"
)
1 change: 1 addition & 0 deletions jumanji/environments/__init__.py
@@ -50,6 +50,7 @@
from jumanji.environments.routing.cleaner.env import Cleaner
from jumanji.environments.routing.connector.env import Connector
from jumanji.environments.routing.cvrp.env import CVRP
from jumanji.environments.routing.lbf.env import LevelBasedForaging
from jumanji.environments.routing.maze.env import Maze
from jumanji.environments.routing.mmst.env import MMST
from jumanji.environments.routing.multi_cvrp.env import MultiCVRP
2 changes: 1 addition & 1 deletion jumanji/environments/routing/connector/env.py
@@ -89,7 +89,7 @@ class Connector(Environment[State, specs.MultiDiscreteArray, Observation]):
key = jax.random.PRNGKey(0)
state, timestep = jax.jit(env.reset)(key)
env.render(state)
- action = env.action_specc.generate_value()
+ action = env.action_spec.generate_value()
state, timestep = jax.jit(env.step)(state, action)
env.render(state)
```
17 changes: 17 additions & 0 deletions jumanji/environments/routing/lbf/__init__.py
@@ -0,0 +1,17 @@
# Copyright 2022 InstaDeep Ltd. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from jumanji.environments.routing.lbf.env import LevelBasedForaging
from jumanji.environments.routing.lbf.observer import GridObserver, VectorObserver
from jumanji.environments.routing.lbf.types import Agent, Food, Observation, State
205 changes: 205 additions & 0 deletions jumanji/environments/routing/lbf/conftest.py
@@ -0,0 +1,205 @@
# Copyright 2022 InstaDeep Ltd. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import chex
import jax
import jax.numpy as jnp
import pytest

from jumanji.environments.routing.lbf.env import LevelBasedForaging
from jumanji.environments.routing.lbf.generator import RandomGenerator
from jumanji.environments.routing.lbf.types import Agent, Food, State
from jumanji.tree_utils import tree_transpose

# create food and agents for grid that looks like:
# "AGENT" | EMPTY | EMPTY | EMPTY | EMPTY | EMPTY
# EMPTY | "AGENT" | EMPTY | EMPTY | EMPTY | EMPTY
# EMPTY | "FOOD" | "AGENT" | "FOOD" | EMPTY | EMPTY
# EMPTY | EMPTY | EMPTY | EMPTY | EMPTY | EMPTY
# EMPTY | EMPTY | "FOOD" | EMPTY | EMPTY | EMPTY
# EMPTY | EMPTY | EMPTY | EMPTY | EMPTY | EMPTY


@pytest.fixture
def key() -> chex.PRNGKey:
    return jax.random.PRNGKey(42)


@pytest.fixture
def agent0() -> Agent:
    return Agent(
        id=jnp.asarray(0),
        position=jnp.array([0, 0]),
        level=jnp.asarray(1),
        loading=jnp.asarray(False),
    )


@pytest.fixture
def agent1() -> Agent:
    return Agent(
        id=jnp.asarray(1),
        position=jnp.array([1, 1]),
        level=jnp.asarray(2),
        loading=jnp.asarray(False),
    )


@pytest.fixture
def agent2() -> Agent:
    return Agent(
        id=jnp.asarray(2),
        position=jnp.array([2, 2]),
        level=jnp.asarray(4),
        loading=jnp.asarray(False),
    )


@pytest.fixture
def food0() -> Food:
    return Food(
        id=jnp.asarray(0),
        position=jnp.array([2, 1]),
        level=jnp.asarray(4),
        eaten=jnp.asarray(False),
    )


@pytest.fixture
def food1() -> Food:
    return Food(
        id=jnp.asarray(1),
        position=jnp.array([2, 3]),
        level=jnp.asarray(4),
        eaten=jnp.asarray(False),
    )


@pytest.fixture
def food2() -> Food:
    return Food(
        id=jnp.asarray(2),
        position=jnp.array([4, 2]),
        level=jnp.asarray(3),
        eaten=jnp.asarray(False),
    )


@pytest.fixture
def agents(agent0: Agent, agent1: Agent, agent2: Agent) -> Agent:
    return tree_transpose([agent0, agent1, agent2])


@pytest.fixture
def food_items(food0: Food, food1: Food, food2: Food) -> Food:
    return tree_transpose([food0, food1, food2])


@pytest.fixture
def state(agents: Agent, food_items: Food, key: chex.PRNGKey) -> State:
    return State(agents=agents, food_items=food_items, step_count=0, key=key)


@pytest.fixture
def agent_grid() -> chex.Array:
    """Returns the agents' levels at their positions on the grid."""
    return jnp.array(
        [
            [1, 0, 0, 0, 0, 0],
            [0, 2, 0, 0, 0, 0],
            [0, 0, 4, 0, 0, 0],
            [0, 0, 0, 0, 0, 0],
            [0, 0, 0, 0, 0, 0],
            [0, 0, 0, 0, 0, 0],
        ]
    )


@pytest.fixture
def food_grid() -> chex.Array:
    """Returns the food items' levels at their positions on the grid."""
    return jnp.array(
        [
            [0, 0, 0, 0, 0, 0],
            [0, 0, 0, 0, 0, 0],
            [0, 4, 0, 4, 0, 0],
            [0, 0, 0, 0, 0, 0],
            [0, 0, 3, 0, 0, 0],
            [0, 0, 0, 0, 0, 0],
        ]
    )


@pytest.fixture
def random_generator() -> RandomGenerator:
    return RandomGenerator(
        grid_size=8,
        fov=2,
        num_agents=2,
        num_food=2,
        max_agent_level=2,
        force_coop=True,
    )


@pytest.fixture
def lbf_environment() -> LevelBasedForaging:
    generator = RandomGenerator(
        grid_size=8,
        fov=6,
        num_agents=3,
        num_food=3,
        max_agent_level=4,
        force_coop=True,
    )

    return LevelBasedForaging(generator=generator, time_limit=5)


@pytest.fixture
def lbf_env_2s() -> LevelBasedForaging:
    generator = RandomGenerator(
        grid_size=8,
        fov=2,
        num_agents=2,
        num_food=2,
        max_agent_level=2,
        force_coop=False,
    )

    return LevelBasedForaging(generator=generator, time_limit=5)


@pytest.fixture
def lbf_env_grid_obs() -> LevelBasedForaging:
    generator = RandomGenerator(
        grid_size=8,
        fov=6,
        num_agents=3,
        num_food=3,
        max_agent_level=4,
        force_coop=True,
    )

    return LevelBasedForaging(generator=generator, grid_observation=True)


@pytest.fixture
def lbf_with_penalty() -> LevelBasedForaging:
    return LevelBasedForaging(penalty=1.0)


@pytest.fixture
def lbf_with_no_norm_reward() -> LevelBasedForaging:
    return LevelBasedForaging(normalize_reward=False)
32 changes: 32 additions & 0 deletions jumanji/environments/routing/lbf/constants.py
@@ -0,0 +1,32 @@
# Copyright 2022 InstaDeep Ltd. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import jax.numpy as jnp

# Actions
NOOP = 0
UP = 1
DOWN = 2
LEFT = 3
RIGHT = 4
LOAD = 5

# NOOP, UP, DOWN, LEFT, RIGHT, LOAD
MOVES = jnp.array([[0, 0], [-1, 0], [1, 0], [0, -1], [0, 1], [0, 0]])

# viewer constants
_FIGURE_SIZE = (5, 5)

# Define some colors for visualization.
_GRID_COLOR = (0, 0, 0) # black
_LINE_COLOR = (1, 1, 1) # white