# Make your own custom environment

Source: https://www.gymlibrary.dev/content/environment_creation/
To make a custom Reinforcement Learning (RL) environment for agents, you can subclass `gym.Env` within the Gym framework. The documentation provides a tutorial on creating a `GridWorldEnv` as an example.

## Getting Started
1. **Clone the `gym-examples` repository**: This repository contains the code for the examples presented.

```bash
git clone https://github.com/Farama-Foundation/gym-examples
cd gym-examples
```
2. **Set up a virtual environment**: It is recommended to use a virtual environment for the project.

```bash
python -m venv .env
source .env/bin/activate
pip install -e .
```

## Subclassing gym.Env
Before starting, review Gym's API documentation. When creating your custom environment, you will define it by inheriting from the `gym.Env` abstract class (a minimal skeleton is sketched after the list below).
- **Declaration and Initialization**: Your custom environment class should include a `metadata` attribute and an `__init__` method.
  - `metadata` attribute: Specify the supported render modes (e.g., `"human"`, `"rgb_array"`, `"ansi"`) and the rendering framerate (`render_fps`). Every environment should support `None` as a render mode.
  - `__init__` method: This method accepts parameters for your environment (e.g., `size` for grid dimensions). Here, you will define `self.observation_space` and `self.action_space`.
**Example (`GridWorldEnv`)**: This environment consists of a 2D square grid where an agent navigates to a randomly placed target.

- **Observations**: Provided as a dictionary containing the agent's and target's locations, e.g. `{"agent": array([1, 0]), "target": array([0, 3])}`. This is defined using `spaces.Dict`; each location lives in `{0, ..., size - 1}^2` and can be represented with `spaces.Box` (as in the snippet below) or equivalently `MultiDiscrete([size, size])`.
- **Actions**: There are 4 possible actions ("right", "up", "left", "down"), defined using `spaces.Discrete(4)`.
- **Done signal**: Issued when the agent reaches the target.
- **Rewards**: Binary and sparse; 1 upon reaching the target, 0 otherwise.
An example `__init__` method snippet from `GridWorldEnv` would look like this:
```python
import gymnasium as gym
from gymnasium import spaces
import pygame
import numpy as np


class GridWorldEnv(gym.Env):
    metadata = {"render_modes": ["human", "rgb_array"], "render_fps": 4}

    def __init__(self, render_mode=None, size=5):
        self.size = size  # The size of the square grid
        self.window_size = 512  # The size of the PyGame window

        # Observations are dictionaries with the agent's and the target's location.
        # Each location is encoded as an element of {0, ..., size - 1}^2,
        # i.e. MultiDiscrete([size, size]).
        self.observation_space = spaces.Dict(
            {
                "agent": spaces.Box(0, size - 1, shape=(2,), dtype=int),
                "target": spaces.Box(0, size - 1, shape=(2,), dtype=int),
            }
        )

        # We have 4 actions, corresponding to "right", "up", "left", "down"
        self.action_space = spaces.Discrete(4)
```