Basic Usage¶

Minari is a standard dataset hosting interface for Offline Reinforcement Learning applications. Minari is compatible with most of the RL environments that follow the Gymnasium API and facilitates Offline RL dataset handling by providing data collection, dataset hosting, and dataset sampling capabilities.

Installation¶

To install the most recent version of the Minari library run this command:

pip install minari

This will install the minimum required dependencies. Additional dependencies will be prompted for installation based on your use case. To install all dependencies at once, use:

pip install "minari[all]"

If you’d like to start testing or contribute to Minari then please install this project from source with:

git clone https://github.com/Farama-Foundation/Minari.git --single-branch
cd Minari
pip install -e ".[all]"

We support Python with minimum version 3.8 on Linux and macOS.

Using Minari Datasets¶

Download Datasets¶

Minari has a remote storage which provides access to a variety of datasets. The datasets hosted in the remote Farama server can be listed running in the terminal:

minari list remote

                                Minari datasets in Farama server
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓
┃ Name                           ┃ Total Episodes ┃ Total Steps ┃ Dataset Size ┃ Author          ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩
│ D4RL/antmaze/large-diverse-v1  │           1000 │     1000000 │ 605.2 MB     │ Alex Davey      │
│ D4RL/antmaze/large-play-v1     │           1000 │     1000000 │ 605.2 MB     │ Alex Davey      │
│ D4RL/antmaze/medium-diverse-v1 │           1000 │     1000000 │ 605.2 MB     │ Alex Davey      │
│             ...                │       ...      │     ...     │     ...      │       ...       │

Minari supports storing datasets on remote servers, including Google Cloud Platform (GCP) and Hugging Face Hub. To use your own server with Minari, set the MINARI_REMOTE environment variable in the format remote-type://remote-path. For example, to set up a GCP bucket named my-datasets, run the following command:

export MINARI_REMOTE=gcp://my-datasets

To access the Minari datasets of a user or organization on HuggingFace, you can use the format hf://username-or-org. For example, for farama-minari organization:

export MINARI_REMOTE=hf://farama-minari

To download any of the remote datasets into the local storage use the download command:

minari download D4RL/door/human-v2

Load Local Datasets¶

Minari will only be able to load datasets that are stored in your local root directory . To list the local datasets, use the list command:

minari list local

                 Local Minari datasets('/Users/farama/.minari/datasets/')
┏━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┓
┃ Name               ┃ Total Episodes ┃ Total Steps ┃ Dataset Size ┃ Author             ┃
┡━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━┩
│ D4RL/door/human-v2 │             25 │        6729 │ 7.1 MB       │ Rodrigo de Lazcano │
└────────────────────┴────────────────┴─────────────┴──────────────┴────────────────────┘

In order to use any of the dataset sampling features of Minari we first need to load the dataset as a minari.MinariDataset object using the minari.load_dataset() Python function as follows:

import minari
dataset = minari.load_dataset('D4RL/door/human-v2')
print("Observation space:", dataset.observation_space)
print("Action space:", dataset.action_space)
print("Total episodes:", dataset.total_episodes)
print("Total steps:", dataset.total_steps)

Observation space: Box(-inf, inf, (39,), float64)
Action space: Box(-1.0, 1.0, (28,), float32)
Total episodes: 25
Total steps: 6729

Sampling Episodes¶

Minari can retrieve a certain amount of episode shards from the dataset files as a list of minari.EpisodeData objects. The sampling process of the Minari datasets is performed through the method minari.MinariDataset.sample_episodes(). This method is a generator that randomly samples n number of minari.EpisodeData from the minari.MinariDataset. The seed of this generator can be set with minari.MinariDataset.set_seed(). For example:

import minari

dataset = minari.load_dataset("D4RL/door/human-v2")
dataset.set_seed(seed=123)

for i in range(5):
    # sample 5 episodes from the dataset
    episodes = dataset.sample_episodes(n_episodes=5)
    # get id's from the sampled episodes
    ids = list(map(lambda ep: ep.id, episodes))
    print(f"EPISODE ID'S SAMPLE {i}: {ids}")

This code will show the following.

EPISODE ID'S SAMPLE 0: [1, 13, 0, 22, 15]
EPISODE ID'S SAMPLE 1: [3, 10, 23, 7, 18]
EPISODE ID'S SAMPLE 2: [12, 6, 0, 18, 19]
EPISODE ID'S SAMPLE 3: [9, 4, 15, 3, 17]
EPISODE ID'S SAMPLE 4: [19, 4, 12, 17, 21]

Notice that in each sample non of the episodes are sampled more than once but the same episode can be retrieved in different minari.MinariDataset.sample_episodes() calls.

Minari doesn’t serve the purpose of creating replay buffers out of the Minari datasets, we leave this task for the user to make for their specific needs. To create your own buffers and dataloaders, you may need the ability to iterate through an episodes in a deterministic order. This can be achieved with minari.MinariDataset.iterate_episodes(). This method is a generator that iterates over minari.EpisodeData episodes from minari.MinariDataset. Specific indices can be also provided. For example:

import minari

dataset = minari.load_dataset("D4RL/door/human-v2")
episodes_generator = dataset.iterate_episodes(episode_indices=[1, 2, 0])

for episode in episodes_generator:
    print(f"EPISODE ID {episode.id}")

This code will show the following.

EPISODE ID 1
EPISODE ID 2
EPISODE ID 0

In addition, the minari.MinariDataset dataset itself is iterable:.

import minari

dataset = minari.load_dataset("D4RL/door/human-v2")

for episode in dataset:
    print(f"EPISODE ID {episode.id}")

Filter Episodes¶

The episodes in the dataset can be filtered before sampling. This is done with a custom conditional callable passed to minari.MinariDataset.filter_episodes(). The input to the conditional callable is an minari.EpisodeData and the return value must be True if you want to keep the episode or False otherwise. The method will return a new minari.MinariDataset:

import minari

dataset = minari.load_dataset("D4RL/door/human-v2")

print(f'TOTAL EPISODES ORIGINAL DATASET: {dataset.total_episodes}')

# get episodes with mean reward greater than 2
filter_dataset = dataset.filter_episodes(lambda episode: episode.rewards.mean() > 2)

print(f'TOTAL EPISODES FILTER DATASET: {filter_dataset.total_episodes}')

Some episodes were removed from the dataset:

TOTAL EPISODES ORIGINAL DATASET: 25
TOTAL EPISODES FILTER DATASET: 18

Split Dataset¶

Minari provides another utility function to divide a dataset into multiple datasets, minari.split_dataset()

import minari

dataset = minari.load_dataset("D4RL/door/human-v2", download=True)

split_datasets = minari.split_dataset(dataset, sizes=[20, 5], seed=123)

print(f'TOTAL EPISODES FIRST SPLIT: {split_datasets[0].total_episodes}')
print(f'TOTAL EPISODES SECOND SPLIT: {split_datasets[1].total_episodes}')

TOTAL EPISODES FIRST SPLIT: 20
TOTAL EPISODES SECOND SPLIT: 5

Recover Environment¶

From a minari.MinariDataset object we can also recover the Gymnasium environment used to create the dataset, this can be useful for reproducibility or to generate more data for a specific dataset:

import minari

dataset = minari.load_dataset('D4RL/door/human-v2')
env = dataset.recover_environment()

env.reset()
for _ in range(100):
    obs, rew, terminated, truncated, info = env.step(env.action_space.sample())
    if terminated or truncated:
        env.reset()

Note

There are some datasets that provide a different environment for evaluation purposes than the one used for collecting the data. This environment can be recovered by setting to True the eval_env argument:

import minari

dataset = minari.load_dataset('D4RL/door/human-v2')
eval_env = dataset.recover_environment(eval_env=True)

If the dataset doesn’t have an eval_env_spec attribute, the environment used for collecting the data will be retrieved by default.

Combine Minari Datasets¶

In the case of having two or more Minari datasets created with the same environment we can combine these datasets into a single one by using the Minari function minari.combine_datasets(), i.e. the 'AdroitHandDoor-v1' environment has two datasets available in the remote Farama servers, D4RL/door/human-v2 and D4RL/door/expert-v2, we can combine the episodes in these two datasets into a new Minari dataset D4RL/door/all-v0:

minari download D4RL/door/expert-v2
minari combine D4RL/door/human-v2 D4RL/door/expert-v2 --dataset-id=D4RL/door/all-v0
minari list local

                    Local Minari datasets('/Users/farama/.minari/datasets/')
┏━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┓
┃ Name                ┃ Total Episodes ┃ Total Steps ┃ Dataset Size ┃ Author             ┃
┡━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━┩
│ D4RL/door/all-v0    │           5025 │     1006729 │ 1103.5 MB    │ Rodrigo de Lazcano │
│ D4RL/door/expert-v2 │           5000 │     1000000 │ 1096.4 MB    │ Rodrigo de Lazcano │
│ D4RL/door/human-v2  │             25 │        6729 │ 7.1 MB       │ Rodrigo de Lazcano │
└─────────────────────┴────────────────┴─────────────┴──────────────┴────────────────────┘

Create Minari Dataset¶

Collecting Data¶

Minari can abstract the data collection process. This is achieved by using the minari.DataCollector wrapper which stores the environments stepping data in internal memory buffers before saving the dataset into disk. The minari.DataCollector wrapper can also perform caching by scheduling the amount of episodes or steps that are stored in-memory before saving the data in a temporary Minari dataset file . This wrapper also computes relevant metadata of the dataset while collecting the data.

The wrapper is very simple to initialize:

from minari import DataCollector
import gymnasium as gym

env = gym.make('CartPole-v1')
env = DataCollector(env, record_infos=True)

In this example, the minari.DataCollector wraps the ‘CartPole-v1’ environment from Gymnasium. We set record_infos=True so the wrapper will also collect the returned info dictionaries to create the dataset. For the full list of arguments, read the minari.DataCollector documentation.

Save Dataset¶

To create a Minari dataset first we need to step the environment with a given policy to allow the minari.DataCollector to record the data that will comprise the dataset. This is as simple as just looping through the Gymansium MDP API. For our example we will loop through 100 episodes of the 'CartPole-v1' environment with a random policy.

Finally, we need to create the Minari dataset and give it a name id. This is done by calling the minari.DataCollector.create_dataset() Minari function which will move the temporary data recorded in the minari.DataCollector environment to a permanent location in the local Minari root path with the Minari dataset standard structure.

Extending the code example for the 'CartPole-v1' environment we can create the Minari dataset as follows:

import minari
import gymnasium as gym
from minari import DataCollector

env = gym.make('CartPole-v1')
env = DataCollector(env, record_infos=True)

total_episodes = 100

for _ in range(total_episodes):
    env.reset(seed=123)
    while True:
        # random action policy
        action = env.action_space.sample()
        obs, rew, terminated, truncated, info = env.step(action)

        if terminated or truncated:
            break

dataset = env.create_dataset(
    dataset_id="cartpole/test-v0",
    algorithm_name="Random-Policy",
    code_permalink="https://github.com/Farama-Foundation/Minari",
    author="Farama",
    author_email="contact@farama.org"
)

When creating the Minari dataset additional metadata can be added such as the algorithm_name used to compute the actions, a code_permalink with a link to the code used to generate the dataset, as well as the author and author_email.

The minari.DataCollector.create_dataset() function returns a minari.MinariDataset object, dataset in the previous code snippet.

Once the dataset has been created we can check if the Minari dataset id appears in the list of local datasets:

minari list local

        Local Minari datasets('/Users/farama/.minari/datasets/')
┏━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━┓
┃ Name             ┃ Total Episodes ┃ Total Steps ┃ Dataset Size ┃  Author  ┃
┡━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━┩
│ cartpole-test-v0 │            100 │        2059 │ 1.6 MB       │  Farama  │
└──────────────────┴────────────────┴─────────────┴──────────────┴──────────┘

The minari.list_local_datasets() function returns a dictionary with keys the local Minari dataset ids and values their metadata.

There is another optional way of creating a Minari dataset and that is by using the minari.create_dataset_from_buffers() function. The data collection is left to the user instead of using the minari.DataCollector wrapper. The user will be responsible for creating their own buffers to store the stepping data, and these buffers must follow a specific structure specified in the function API documentation.

Checkpoint Minari Dataset¶

When collecting data with the minari.DataCollector wrapper, the recorded data is saved into temporary files and it won’t be permanently saved on disk until the DataCollector.create_dataset() function is called. To prevent losing data for large datasets, it is recommended to create the dataset during data collection and append the data to it using DataCollector.add_to_dataset().

Continuing the 'CartPole-v1' example we can checkpoint the newly created Minari dataset every 10 episodes as follows:

import minari
import gymnasium as gym
from minari import DataCollector

env = gym.make('CartPole-v1')
env = DataCollector(env, record_infos=True)

total_episodes = 100
dataset_id = "cartpole/test-v0"
dataset = None
if dataset_id in minari.list_local_datasets():
    dataset = minari.load_dataset(dataset_id)

for episode_id in range(total_episodes):
    env.reset()
    while True:
        # random action policy
        action = env.action_space.sample()
        obs, rew, terminated, truncated, info = env.step(action)

        if terminated or truncated:
            break

    if (episode_id + 1) % 10 == 0:
        # Update local Minari dataset every 10 episodes.
        # This works as a checkpoint to not lose the already collected data
        if dataset is None:
            dataset = env.create_dataset(
                dataset_id=dataset_id,
                algorithm_name="Random-Policy",
                code_permalink="https://github.com/Farama-Foundation/Minari",
                author="Farama",
                author_email="contact@farama.org"
            )
        else:
            env.add_to_dataset(dataset)

Using Namespaces¶

Namespaces can be used to group together common datasets and provide them with a hierarchical structure. For example, suppose we want to create a series of Classic Control datasets (cartpole, acrobot, e.t.c.) using the dataset creation code above. Instead of specifying dataset_id=cartpole-test-v0, we can use e.g. classic_control/cartpole-test-v0 when creating the dataset. This, and all other datasets with a dataset_id that starts with classic_control/ will now be stored together in the classic_control namespace.

For more flexibility, namespaces can be created and modified directly using the Namespace API.