Basic Usage

Minari is a standard dataset hosting interface for Offline Reinforcement Learning applications. Minari is compatible with most of the RL environments that follow the Gymnasium API and facilitates Offline RL dataset handling by providing data collection, dataset hosting, and dataset sampling capabilities.

Installation

To install the most recent version of the Minari library run this command:

pip install minari

If you’d like to start testing or contribute to Minari then please install this project from source with:

git clone https://github.com/Farama-Foundation/Minari.git
cd Minari
pip install -e .

Minari supports Python 3.8 and later on Linux and macOS.

Create Minari Dataset

Collecting Data

Minari can abstract the data collection process. This is achieved with the minari.DataCollector wrapper, which stores the environment's stepping data in internal memory buffers before saving the dataset to disk. The minari.DataCollector wrapper can also perform caching by scheduling the number of episodes or steps that are stored in memory before saving the data in a temporary Minari dataset file. This wrapper also computes relevant metadata of the dataset while collecting the data.

The wrapper is very simple to initialize:

from minari import DataCollector
import gymnasium as gym

env = gym.make('CartPole-v1')
env = DataCollector(env, record_infos=True, max_buffer_steps=100000)

In this example, minari.DataCollector wraps the 'CartPole-v1' environment from Gymnasium. Two arguments are passed: record_infos (when set to True, the wrapper also collects the returned info dictionaries to create the dataset) and max_buffer_steps, which specifies a caching scheduler by giving the number of data steps to store in memory before moving them to a temporary file on disk. The wrapper accepts more arguments; a detailed description of them can be read in the minari.DataCollector documentation.
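The caching behaviour described for max_buffer_steps can be pictured with a small standalone sketch. It is a toy stand-in written with the standard library only, not Minari's implementation: the StepBuffer class, its flush() method, and the JSON-lines temporary file are all hypothetical names and choices made for illustration.

```python
import json
import os
import tempfile


class StepBuffer:
    """Toy stand-in for a step cache; NOT Minari's implementation."""

    def __init__(self, max_buffer_steps):
        self.max_buffer_steps = max_buffer_steps
        self.steps = []        # steps currently held in memory
        self.flush_count = 0   # how many times the buffer was written out
        # Hypothetical temporary file that receives the flushed steps.
        fd, self.tmp_path = tempfile.mkstemp(suffix=".jsonl")
        os.close(fd)

    def add(self, step):
        self.steps.append(step)
        # Once the in-memory limit is reached, move the steps to disk.
        if len(self.steps) >= self.max_buffer_steps:
            self.flush()

    def flush(self):
        # Append the buffered steps to the temporary file, then free memory.
        with open(self.tmp_path, "a") as f:
            for step in self.steps:
                f.write(json.dumps(step) + "\n")
        self.steps.clear()
        self.flush_count += 1


buffer = StepBuffer(max_buffer_steps=4)
for t in range(10):
    buffer.add({"t": t, "reward": 1.0})

with open(buffer.tmp_path) as f:
    steps_on_disk = sum(1 for _ in f)
os.remove(buffer.tmp_path)

# 10 steps with a limit of 4: two flushes, 8 steps on disk, 2 in memory.
print(buffer.flush_count, steps_on_disk, len(buffer.steps))  # prints: 2 8 2
```

A larger limit means fewer writes to disk at the cost of more memory, which is the trade-off the max_buffer_steps argument exposes.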

Save Dataset

To create a Minari dataset, we first need to step the environment with a given policy so that minari.DataCollector can record the data that will comprise the dataset. This is as simple as looping through the Gymnasium MDP API. For our example we will loop through 100 episodes of the 'CartPole-v1' environment with a random policy.

Finally, we need to create the Minari dataset and give it a name id. This is done by calling the DataCollector.create_dataset() Minari function which will move the temporary data recorded in the minari.DataCollector environment to a permanent location in the local Minari root path with the Minari dataset standard structure.

Extending the code example for the 'CartPole-v1' environment we can create the Minari dataset as follows:

import minari
import gymnasium as gym
from minari import DataCollector

env = gym.make('CartPole-v1')
env = DataCollector(env, record_infos=True, max_buffer_steps=100000)

total_episodes = 100

for _ in range(total_episodes):
    env.reset(seed=123)
    while True:
        # random action policy
        action = env.action_space.sample()
        obs, rew, terminated, truncated, info = env.step(action)

        if terminated or truncated:
            break

dataset = env.create_dataset(
    dataset_id="cartpole-test-v0",
    algorithm_name="Random-Policy",
    code_permalink="https://github.com/Farama-Foundation/Minari",
    author="Farama",
    author_email="contact@farama.org"
)

When creating the Minari dataset additional metadata can be added such as the algorithm_name used to compute the actions, a code_permalink with a link to the code used to generate the dataset, as well as the author and author_email.

The DataCollector.create_dataset() function returns a minari.MinariDataset object (named dataset in the previous code snippet).

Once the dataset has been created we can check if the Minari dataset id appears in the list of local datasets:

minari list local
                     Local Minari datasets('/Users/farama/.minari/datasets/')
┏━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┓
┃ Name             ┃ Total Episodes ┃ Total Steps ┃ Dataset Size ┃ Author ┃ Email              ┃
┡━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━┩
│ cartpole-test-v0 │            100 │        2059 │ 1.6 MB       │ Farama │ contact@farama.org │
└──────────────────┴────────────────┴─────────────┴──────────────┴────────┴────────────────────┘

The minari.list_local_datasets() function returns a dictionary whose keys are the local Minari dataset ids and whose values are their metadata.
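As a rough illustration of working with such a dictionary, the sketch below formats one summary line per dataset id. The helper summarize_datasets is hypothetical, and the metadata key names total_episodes and total_steps are assumptions chosen to mirror the table columns above; check the actual keys in your installation before relying on them.

```python
def summarize_datasets(datasets):
    """Return one summary line per dataset id, sorted by id."""
    lines = []
    for dataset_id in sorted(datasets):
        metadata = datasets[dataset_id]
        # .get() keeps the sketch usable if an assumed key is absent.
        episodes = metadata.get("total_episodes", "?")
        steps = metadata.get("total_steps", "?")
        lines.append(f"{dataset_id}: {episodes} episodes, {steps} steps")
    return lines


# Hand-written stand-in for the returned dictionary, using the values
# shown in the table above (key names are assumptions).
example = {
    "cartpole-test-v0": {"total_episodes": 100, "total_steps": 2059},
}
summary = summarize_datasets(example)
print("\n".join(summary))
```

With Minari installed, the same helper could be given the result of minari.list_local_datasets() instead of the hand-written example.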

There is another optional way of creating a Minari dataset and that is by using the minari.create_dataset_from_buffers() function. The data collection is left to the user instead of using the minari.DataCollector wrapper. The user will be responsible for creating their own buffers to store the stepping data, and these buffers must follow a specific structure specified in the function API documentation.

Checkpoint Minari Dataset

When collecting data with the minari.DataCollector wrapper, the recorded data is saved into temporary files and it won’t be permanently saved to disk until the DataCollector.create_dataset() function is called. For large datasets, to avoid losing all of the collected data, extra data from a minari.DataCollector can be appended to an existing dataset to checkpoint the data collection process.

To checkpoint a dataset we can call the DataCollector.add_to_dataset() method, which the example below uses. Every time DataCollector.create_dataset() or DataCollector.add_to_dataset() is called, the buffers of the minari.DataCollector environment are cleared.

Continuing the 'CartPole-v1' example we can checkpoint the newly created Minari dataset every 10 episodes as follows:

import minari
import gymnasium as gym
from minari import DataCollector

env = gym.make('CartPole-v1')
env = DataCollector(env, record_infos=True, max_buffer_steps=100000)

total_episodes = 100
dataset_name = "cartpole-test-v0"
dataset = None
if dataset_name in minari.list_local_datasets():
    dataset = minari.load_dataset(dataset_name)

for episode_id in range(total_episodes):
    env.reset(seed=123)
    while True:
        # random action policy
        action = env.action_space.sample()
        obs, rew, terminated, truncated, info = env.step(action)

        if terminated or truncated:
            break

    if (episode_id + 1) % 10 == 0:
        # Update local Minari dataset every 10 episodes.
        # This works as a checkpoint to not lose the already collected data
        if dataset is None:
            dataset = env.create_dataset(
                dataset_id=dataset_name,
                algorithm_name="Random-Policy",
                code_permalink="https://github.com/Farama-Foundation/Minari",
                author="Farama",
                author_email="contact@farama.org"
            )
        else:
            env.add_to_dataset(dataset)

Using Minari Datasets

Minari can only load datasets that are stored in your local root directory. In order to use any of the dataset sampling features of Minari we first need to load the dataset as a minari.MinariDataset object using the minari.load_dataset() function as follows:

import minari
dataset = minari.load_dataset('cartpole-test-v0')
print(dataset.id)

This code will show the following.

cartpole-test-v0

Download Remote Datasets

Minari also has remote storage in a Google Cloud Platform (GCP) bucket, which provides access to standardized Minari datasets. The datasets hosted in the remote Farama server can be listed with minari.list_remote_datasets():

minari list remote
                                                Minari datasets in Farama server
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Name                         ┃ Total Episodes ┃ Total Steps ┃ Dataset Size ┃ Author            ┃ Email                  ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━┩
│ antmaze-large-diverse-v0     │           1000 │     1000000 │ 700.5 MB     │ Alex Davey        │ amd1g13@soton.ac.uk    │
│ antmaze-large-play-v0        │           1000 │     1000000 │ 700.5 MB     │ Alex Davey        │ amd1g13@soton.ac.uk    │
│ antmaze-medium-diverse-v0    │           1000 │     1000000 │ 700.5 MB     │ Alex Davey        │ amd1g13@soton.ac.uk    │
│             ...              │       ...      │     ...     │     ...      │        ...        │           ...          │

Like minari.list_local_datasets(), the minari.list_remote_datasets() function returns a dictionary whose keys are the remote Minari dataset ids and whose values are their metadata.

To download any of the remote datasets into the local Minari root path use the function minari.download_dataset():

minari download door-human-v1
minari list local
                            Local Minari datasets('/Users/farama/.minari/datasets/')
┏━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Name          ┃ Total Episodes ┃ Total Steps ┃ Dataset Size ┃ Author             ┃ Email                    ┃
┡━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ door-human-v1 │             25 │        6729 │ 7.1 MB       │ Rodrigo de Lazcano │ rperezvicente@farama.org │
└───────────────┴────────────────┴─────────────┴──────────────┴────────────────────┴──────────────────────────┘

Sampling Episodes

Minari can retrieve a given number of episode shards from the dataset files as a list of minari.EpisodeData objects. Sampling from a Minari dataset is performed through the method minari.MinariDataset.sample_episodes(). This method is a generator that randomly samples n minari.EpisodeData objects from the minari.MinariDataset. The seed of this generator can be set with minari.MinariDataset.set_seed(). For example:

import minari

dataset = minari.load_dataset("door-human-v1", download=True)
dataset.set_seed(seed=123)

for i in range(5):
    # sample 5 episodes from the dataset
    episodes = dataset.sample_episodes(n_episodes=5)
    # get id's from the sampled episodes
    ids = list(map(lambda ep: ep.id, episodes))
    print(f"EPISODE ID'S SAMPLE {i}: {ids}")

This code will show the following.

EPISODE ID'S SAMPLE 0: [1, 13, 0, 22, 15]
EPISODE ID'S SAMPLE 1: [3, 10, 23, 7, 18]
EPISODE ID'S SAMPLE 2: [12, 6, 0, 18, 19]
EPISODE ID'S SAMPLE 3: [9, 4, 15, 3, 17]
EPISODE ID'S SAMPLE 4: [19, 4, 12, 17, 21]

Notice that within each sample none of the episodes is sampled more than once, but the same episode can be retrieved in different minari.MinariDataset.sample_episodes() calls.
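The behaviour just described (unique episodes within one call, possible repeats across calls, reproducibility through a seed) is the usual pattern of repeated sampling without replacement. The sketch below reproduces it with Python's random module as a stand-in; it does not use Minari's random generator, so the ids it prints will differ from the output above.

```python
import random

episode_ids = list(range(25))  # door-human-v1 has 25 episodes
rng = random.Random(123)       # seeded generator, analogous to set_seed()

samples = []
for i in range(5):
    # random.sample draws without replacement, so the ids within one
    # sample are unique; separate draws are independent of each other,
    # so an id may show up again in a later sample.
    ids = rng.sample(episode_ids, k=5)
    samples.append(ids)
    print(f"SAMPLE {i}: {ids}")
```

Re-creating the generator with the same seed yields the same sequence of samples, which is what makes seeded experiments repeatable.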

Minari does not create replay buffers out of Minari datasets; this task is left to the user, who can tailor it to their specific needs. To create your own buffers and dataloaders, you may need to iterate through the episodes in a deterministic order. This can be achieved with minari.MinariDataset.iterate_episodes(). This method is a generator that iterates over the minari.EpisodeData episodes of a minari.MinariDataset. Specific indices can also be provided. For example:

import minari

dataset = minari.load_dataset("door-human-v1", download=True)
episodes_generator = dataset.iterate_episodes(episode_indices=[1, 2, 0])

for episode in episodes_generator:
    print(f"EPISODE ID {episode.id}")

This code will show the following.

EPISODE ID 1
EPISODE ID 2
EPISODE ID 0

In addition, the minari.MinariDataset dataset itself is iterable. However, in this case the indices will have to be filtered separately using minari.MinariDataset.filter_episodes().

import minari

dataset = minari.load_dataset("door-human-v1", download=True)

for episode in dataset:
    print(f"EPISODE ID {episode.id}")

Filter Episodes

The episodes in the dataset can be filtered before sampling. This is done with a custom conditional callable passed to minari.MinariDataset.filter_episodes(). The callable receives one episode at a time (in the example below it reads the episode's rewards attribute) and must return True to keep the episode or False to discard it. The method returns a new minari.MinariDataset:

import minari

dataset = minari.load_dataset("door-human-v1", download=True)

print(f'TOTAL EPISODES ORIGINAL DATASET: {dataset.total_episodes}')

# get episodes with mean reward greater than 2
filter_dataset = dataset.filter_episodes(lambda episode: episode.rewards.mean() > 2)

print(f'TOTAL EPISODES FILTER DATASET: {filter_dataset.total_episodes}')

Some episodes were removed from the dataset:

TOTAL EPISODES ORIGINAL DATASET: 25
TOTAL EPISODES FILTER DATASET: 18
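The filtering contract (a callable that returns True to keep an episode) can be sketched without Minari. RewardEpisode and the standalone filter_episodes function below are hypothetical stand-ins used only to show the predicate pattern; they are not the library's classes.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RewardEpisode:
    """Hypothetical stand-in holding only what the predicate needs."""
    id: int
    rewards: List[float]


def filter_episodes(episodes, condition):
    """Keep the episodes for which condition(episode) is True."""
    return [ep for ep in episodes if condition(ep)]


def mean_reward_above_2(ep):
    return sum(ep.rewards) / len(ep.rewards) > 2


episodes = [
    RewardEpisode(0, [3.0, 3.0]),  # mean 3.0 -> kept
    RewardEpisode(1, [1.0, 2.0]),  # mean 1.5 -> dropped
    RewardEpisode(2, [2.0, 4.0]),  # mean 3.0 -> kept
]

kept = filter_episodes(episodes, mean_reward_above_2)
print([ep.id for ep in kept])  # prints: [0, 2]
```

Any other per-episode criterion, such as a minimum episode length, fits the same shape as long as it returns a boolean.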

Split Dataset

Minari provides another utility function, minari.split_dataset(), to divide a dataset into multiple datasets:

import minari

dataset = minari.load_dataset("door-human-v1", download=True)

split_datasets = minari.split_dataset(dataset, sizes=[20, 5], seed=123)

print(f'TOTAL EPISODES FIRST SPLIT: {split_datasets[0].total_episodes}')
print(f'TOTAL EPISODES SECOND SPLIT: {split_datasets[1].total_episodes}')

This code will show the following.

TOTAL EPISODES FIRST SPLIT: 20
TOTAL EPISODES SECOND SPLIT: 5
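One plausible way to picture a seeded split into fixed sizes is to shuffle the episode indices and cut them into consecutive chunks. The sketch below does this with the standard library; split_indices is a hypothetical helper, and whether Minari shuffles or partitions in exactly this way is an assumption.

```python
import random


def split_indices(total, sizes, seed=None):
    """Shuffle range(total) and cut it into consecutive chunks of `sizes`."""
    if sum(sizes) > total:
        raise ValueError("sizes add up to more than the number of episodes")
    indices = list(range(total))
    random.Random(seed).shuffle(indices)
    splits, start = [], 0
    for size in sizes:
        splits.append(indices[start:start + size])
        start += size
    return splits


first, second = split_indices(total=25, sizes=[20, 5], seed=123)
print(len(first), len(second))  # prints: 20 5
```

Because the chunks are cut from one shuffled list, no episode index can land in both splits.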

Recover Environment

From a minari.MinariDataset object we can also recover the Gymnasium environment used to create the dataset. This can be useful for reproducibility or to generate more data for a specific dataset:

import minari

dataset = minari.load_dataset('cartpole-test-v0')
env = dataset.recover_environment()

env.reset()
for _ in range(100):
    obs, rew, terminated, truncated, info = env.step(env.action_space.sample())
    if terminated or truncated:
        env.reset()

Note

Some datasets provide a different environment for evaluation purposes than the one used for collecting the data. This environment can be recovered by setting the eval_env argument to True:

import minari

dataset = minari.load_dataset('LunarLander-v2-test-v0')
eval_env = dataset.recover_environment(eval_env=True)

If the dataset doesn’t have an eval_env_spec attribute, the environment used for collecting the data will be retrieved by default.

Combine Minari Datasets

Lastly, when two or more Minari datasets were created with the same environment, we can combine them into a single dataset with the Minari function minari.combine_datasets(). For example, the 'AdroitHandDoor-v1' environment has two datasets available in the remote Farama servers, door-human-v1 and door-expert-v1. We can combine the episodes in these two datasets into a new Minari dataset, door-all-v1:

minari download door-expert-v1
minari combine door-human-v1 door-expert-v1 --dataset-id=door-all-v1
minari list local
                             Local Minari datasets('/Users/farama/.minari/datasets/')
┏━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Name           ┃ Total Episodes ┃ Total Steps ┃ Dataset Size ┃ Author             ┃ Email                    ┃
┡━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ door-all-v1    │           5025 │     1006729 │ 1103.5 MB    │ Rodrigo de Lazcano │ rperezvicente@farama.org │
│ door-expert-v1 │           5000 │     1000000 │ 1096.4 MB    │ Rodrigo de Lazcano │ rperezvicente@farama.org │
│ door-human-v1  │             25 │        6729 │ 7.1 MB       │ Rodrigo de Lazcano │ rperezvicente@farama.org │
└────────────────┴────────────────┴─────────────┴──────────────┴────────────────────┴──────────────────────────┘
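The totals in the table can be checked by hand: the combined dataset holds the sum of the episodes and steps of its parts. The short sketch below performs that arithmetic on hand-copied values; the dictionary keys and the combined_totals helper are hypothetical and only serve the check.

```python
# Values copied from the table above; key names are illustrative.
human = {"total_episodes": 25, "total_steps": 6729}
expert = {"total_episodes": 5000, "total_steps": 1000000}


def combined_totals(*datasets):
    """Sum episode and step counts across the given metadata dicts."""
    return {
        "total_episodes": sum(d["total_episodes"] for d in datasets),
        "total_steps": sum(d["total_steps"] for d in datasets),
    }


combined = combined_totals(human, expert)
print(combined)  # prints: {'total_episodes': 5025, 'total_steps': 1006729}
```

Both sums match the door-all-v1 row, which is a quick sanity check that no episodes were lost or duplicated when combining.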