Dataset Standards

Minari Storage

Minari root

Minari stores the offline datasets under a common root directory. The root directory path for the local datasets is set by default to ~/.minari/datasets/. However, this path can be modified by setting the environment variable MINARI_DATASETS_PATH.
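For example, a minimal sketch (the path below is illustrative) of pointing Minari at a custom root directory by setting the variable before Minari is used:

import os

# Illustrative custom root; set MINARI_DATASETS_PATH before loading or creating
# datasets so Minari resolves this directory instead of ~/.minari/datasets/.
os.environ["MINARI_DATASETS_PATH"] = "/data/my_minari_datasets"

import minari  # local datasets will now be looked up under the path above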

The remote datasets are kept in the public Google Cloud Platform (GCP) bucket minari-datasets.

The first level of the root directory tree contains the Minari dataset directories, which are named after the dataset ID. The dataset ID must follow the syntax (env_name-)(dataset_name)(-v(version)), where:

  • env_name: a string that describes the environment from which the dataset was created. For example, if a dataset comes from the AdroitHandDoor environment, env_name can be set to door.

  • dataset_name: a string describing the content of the dataset. For example, if the dataset for the AdroitHandDoor environment was generated from human input, we can set dataset_name to human.

  • version: an integer that represents the version of the dataset, starting from 0 (e.g. the version in door-human-v(version)).

In the end, the ID of the dataset for the initial version of the AdroitHandDoor example will be door-human-v0.
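As an illustration, the following sketch (not part of the Minari API; the regular expression is an assumption based on the syntax above) checks whether a string follows this naming scheme:

import re

# Hypothetical pattern for (env_name-)(dataset_name)(-v(version)).
DATASET_ID_PATTERN = re.compile(
    r"^(?:(?P<env_name>\w+)-)?(?P<dataset_name>\w+)(?:-v(?P<version>\d+))?$"
)

match = DATASET_ID_PATTERN.match("door-human-v0")
print(match.groupdict())  # {'env_name': 'door', 'dataset_name': 'human', 'version': '0'}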

Data files

Each Minari dataset directory contains another directory named data where the files of the collected offline data are stored (more directories are yet to be included for additional information; _docs and policies are WIP). The data directory can contain multiple .hdf5 files storing the offline data. When using a Minari dataset, the offline data is loaded homogeneously from all .hdf5 files as if it were a single file. The names for these files are:

  • main_data.hdf5: the root file. Aside from raw data, it also contains all the metadata of the global dataset and external links to the data in the other files. Minari reads this file when a dataset is loaded.

  • additional_data_x.hdf5: these files contain raw data only. Each one is generated when a checkpoint is made while collecting the offline data with MinariDataset.update_datasets(env).

The following directory tree of a Minari root path contains three datasets named dataset_id-v0, dataset_id-v1, and other_dataset_id-v0. The offline data of dataset_id-v1 is saved in a single main_data.hdf5 file, while the offline data of dataset_id-v0 has been divided into multiple .hdf5 files.

  • minari_root
    • dataset_id-v0
      • data
        • main_data.hdf5
        • additional_data_0.hdf5
        • additional_data_1.hdf5
    • dataset_id-v1
      • data
        • main_data.hdf5
    • other_dataset_id-v0
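Because main_data.hdf5 holds external links to the additional files, opening it with h5py is enough to browse the whole dataset. A minimal sketch (the dataset path is illustrative):

import os
import h5py

# Illustrative local dataset path; external links to additional_data_x.hdf5
# files are resolved transparently by h5py.
path = os.path.expanduser("~/.minari/datasets/dataset_id-v0/data/main_data.hdf5")
with h5py.File(path, "r") as f:
    print(list(f.keys()))  # episode groups, e.g. ['episode_0', 'episode_1', ...]
    print(dict(f.attrs))   # global dataset metadata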

Dataset File Format

Minari datasets are stored in the HDF5 file format using the h5py Python interface. We leverage the hierarchical structure of HDF5 files (groups and datasets) to clearly divide the recorded step data into episode groups, and to add custom metadata to the whole dataset, to each episode group, or to the individual HDF5 datasets that comprise each episode group.

More information about the features of the HDF5 file format can be found in the h5py documentation.

HDF5 file structure

The offline data is organized inside the main_data.hdf5 file into episode groups named after the episode's ID (episode_0, episode_1, ...). Each episode group contains all the stepping data collected from a Gymnasium environment until the environment is terminated or truncated.

The stepping data inside each episode group is divided into the required datasets (StepData) plus other optional groups and nested sub-groups, such as infos. If both the action and observation spaces are simple spaces (not Tuple and not Dict), the hierarchical tree of the Minari dataset HDF5 file will look as follows:

  • main_data.hdf5
    • episode_0
      • observations
      • actions
      • terminations
      • truncations
      • rewards
      • infos
        • infos_datasets
        • infos_subgroup
          • more_datasets
      • additional_groups
        • additional_datasets
    • episode_1
    • episode_2
    • ...
    • episode_id
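For instance, a minimal h5py sketch (dataset path illustrative) that iterates over the episode groups and lists the datasets and sub-groups each one contains:

import os
import h5py

path = os.path.expanduser("~/.minari/datasets/dataset_id-v0/data/main_data.hdf5")
with h5py.File(path, "r") as f:
    for episode_name, episode_group in f.items():
        # e.g. episode_0 ['actions', 'observations', 'rewards', 'terminations', 'truncations']
        print(episode_name, list(episode_group.keys()))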

In the case where the observation space is a relatively complex Dict space with the following definition:

import numpy as np
from gymnasium import spaces

spaces.Dict(
    {
        "component_1": spaces.Box(low=-1, high=1, dtype=np.float32),
        "component_2": spaces.Dict(
            {
                "subcomponent_1": spaces.Box(low=2, high=3, dtype=np.float32),
                "subcomponent_2": spaces.Box(low=4, high=5, dtype=np.float32),
            }
        ),
    }
)

and the action space is a Box space, the resulting Minari dataset HDF5 file will end up looking as follows:

  • main_data.hdf5
    • episode_0
      • observations
        • component_1
        • component_2
          • subcomponent_1
          • subcomponent_2
      • actions
      • terminations
      • truncations
      • rewards
      • infos
        • infos_datasets
        • infos_subgroup
          • more_datasets
      • additional_groups
        • additional_datasets
    • episode_1
    • episode_2
    • ...
    • episode_id
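Under this layout a nested observation component is simply a nested HDF5 path. A minimal sketch (dataset path illustrative), assuming the file was generated from the Dict space above:

import os
import h5py

path = os.path.expanduser("~/.minari/datasets/dataset_id-v0/data/main_data.hdf5")
with h5py.File(path, "r") as f:
    # Dict sub-spaces become groups, simple spaces become datasets.
    subcomponent = f["episode_0/observations/component_2/subcomponent_1"]
    print(subcomponent.shape)  # first dimension is num_steps + 1 (see below)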

Similarly, consider the case where we have a Box space as an observation space and a relatively complex Tuple space as an action space with the following definition:

import numpy as np
from gymnasium import spaces

spaces.Tuple(
    (
        spaces.Box(low=2, high=3, dtype=np.float32),
        spaces.Tuple(
            (
                spaces.Box(low=2, high=3, dtype=np.float32),
                spaces.Box(low=4, high=5, dtype=np.float32),
            )
        ),
    )
)

In this case, the resulting Minari dataset HDF5 file will end up looking as follows:

  • main_data.hdf5
    • episode_0
      • observations
      • actions
        • _index_0
        • _index_1
          • _index_0
          • _index_1
      • terminations
      • truncations
      • rewards
      • infos
        • infos_datasets
        • infos_subgroup
          • more_datasets
      • additional_groups
        • additional_datasets
    • episode_1
    • episode_2
    • ...
    • episode_id

Note how the Tuple space elements are assigned corresponding keys of the format f"_index_{i}", where i is their index within the Tuple space.
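A minimal sketch (dataset path illustrative) of reading these Tuple elements back with h5py:

import os
import h5py

path = os.path.expanduser("~/.minari/datasets/dataset_id-v0/data/main_data.hdf5")
with h5py.File(path, "r") as f:
    outer = f["episode_0/actions/_index_0"]            # first Box of the outer Tuple
    nested = f["episode_0/actions/_index_1/_index_0"]  # first Box of the nested Tuple
    print(outer.shape, nested.shape)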

The required datasets found in the episode groups correspond to the data involved in every Gymnasium step call, obs, rew, terminated, truncated, info = env.step(action): observations, actions, rewards, terminations, and truncations. These datasets are np.ndarray or nested groups of np.ndarray and other groups, depending on the observation and action spaces, and the shapes of the datasets under each required top-level episode key are as follows:

  • actions: shape=(num_steps, action_space_component_shape). If the action space is a Dict or a Tuple, the corresponding entry will be a group instead of a dataset. Within this group there will be nested groups and datasets, as specified by the action space: Dict and Tuple spaces are represented as groups, while Box and Discrete spaces are represented as datasets. All datasets at any level under the top-level key actions have the same num_steps, but action_space_component_shape varies for each particular action space component. For example, a Dict space may contain two Box spaces with different shapes.

  • observations: shape=(num_steps + 1, observation_space_component_shape). Observations nest in the same way as actions if the top-level space is a Tuple or Dict space. The value num_steps + 1 is the same for datasets at any level under observations. These datasets have one additional element because the initial observation of the environment, returned when calling obs, info = env.reset(), is also saved. observation_space_component_shape varies between datasets, depending on the shapes of the simple spaces specified in the observation space.

  • rewards: shape=(num_steps, 1), stores the returned reward in each step.

  • terminations: shape=(num_steps, 1). The dtype is np.bool_ and the last element will be True if the episode finished due to a terminated step return.

  • truncations: shape=(num_steps, 1). The dtype is np.bool_ and the last element will be True if the episode finished due to a truncated step return.

The dtype of the numpy array datasets can be of any type compatible with h5py.
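A minimal sketch (dataset path illustrative) that checks the off-by-one relationship between observations and the other per-step datasets described above:

import os
import h5py

path = os.path.expanduser("~/.minari/datasets/dataset_id-v0/data/main_data.hdf5")
with h5py.File(path, "r") as f:
    episode = f["episode_0"]
    num_steps = episode["actions"].shape[0]
    # observations also store the initial reset() observation, hence one extra entry
    assert episode["observations"].shape[0] == num_steps + 1
    assert episode["rewards"].shape[0] == num_steps
    assert episode["terminations"].shape[0] == num_steps
    assert episode["truncations"].shape[0] == num_steps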

The info dictionary returned by env.step() and env.reset() can optionally be saved in the dataset as a sub-group. Whether to save the info data is set in the DataCollector wrapper with the record_infos argument.
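For example, a minimal sketch of enabling info recording (the environment ID is illustrative; the import assumes the DataCollector wrapper described above):

import gymnasium as gym
from minari import DataCollector

# With record_infos=True the info dicts returned by reset() and step() are
# stored in the "infos" sub-group of each episode.
env = DataCollector(gym.make("CartPole-v1"), record_infos=True)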

Also, additional datasets and nested sub-groups can be saved in each episode. This is useful for environment data that doesn’t participate in each env.step() or env.reset() call of the Gymnasium API, such as the full environment state at each step. This can be achieved by creating a custom StepDataCallback that returns extra keys and nested dictionaries in the returned StepData dictionary.

For example, the Adroit Hand environments in the Gymnasium-Robotics project need to store the full state of the MuJoCo simulation since this information is not present in the observations dataset and the environments are reset by setting an initial state in the simulation.

The following code snippet creates a custom StepDataCallback and adds a new key, state, to the returned StepData dictionary. state is a nested dictionary with np.ndarray values, and its keys are the relevant MuJoCo data that represent the state of the simulation: qpos, qvel, and some other body positions.

from minari import StepDataCallback

class AdroitStepDataCallback(StepDataCallback):
    def __call__(self, env, **kwargs):
        # Build the default StepData dictionary, then attach the full MuJoCo
        # simulation state under a new "state" key.
        step_data = super().__call__(env, **kwargs)
        step_data['state'] = env.get_env_state()
        return step_data
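The custom callback can then be passed to the DataCollector wrapper. A minimal sketch (the environment ID is illustrative and requires Gymnasium-Robotics; the step_data_callback argument follows the wrapper's callback pattern):

import gymnasium as gym
from minari import DataCollector

env = DataCollector(
    gym.make("AdroitHandDoor-v1"),
    step_data_callback=AdroitStepDataCallback,
    record_infos=True,
)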

The episode groups in the HDF5 file will then have the following structure:

  • episode_id
    • observations
    • actions
    • terminations
    • truncations
    • rewards
    • infos
    • state
      • qpos
      • qvel
      • object_body_pos

Default dataset metadata

HDF5 files can have metadata attached to objects as attributes. Minari uses these attributes to add metadata to the global dataset file, to each episode group, as well as to the individual datasets inside each episode. This metadata can be added by the user by overriding the EpisodeMetadataCallback in the DataCollector wrapper. However, there is also some metadata added by default to every dataset.

When creating a Minari dataset with the DataCollector wrapper the default global metadata will be the following:

  • total_episodes (np.int64): Number of episodes in the Minari dataset.
  • total_steps (np.int64): Number of steps in the Minari dataset.
  • env_spec (str): JSON string of the Gymnasium environment spec.
  • dataset_id (str): Identifier of the Minari dataset.
  • code_permalink (str): Link to a repository with the code used to generate the dataset.
  • author (str): Name of the author that created the dataset.
  • author_email (str): Email of the author that created the dataset.
  • algorithm_name (str): Name of the expert policy used to create the dataset.
  • action_space (str): Serialized Gymnasium action space describing the actions in the dataset.
  • observation_space (str): Serialized Gymnasium observation space describing the observations in the dataset.
  • minari_version (str): Version specifier of Minari versions compatible with the dataset.

For each episode group the default metadata attributes are:

  • id (np.int64): ID of the episode, episode_id.
  • total_steps (np.int64): Number of steps in the episode.
  • seed (np.int64): Seed used to reset the episode.

Statistical metrics are also computed and stored as metadata for the individual datasets in each episode (for now they are only computed for the rewards dataset):

  • rewards dataset:
    • max (np.float64): Maximum reward value in the episode.
    • min (np.float64): Minimum reward value in the episode.
    • mean (np.float64): Mean value of the episode rewards.
    • std (np.float64): Standard deviation of the episode rewards.
    • sum (np.float64): Total undiscounted return of the episode.

Observation and Action Spaces

The Minari storage format supports the following observation and action spaces:

Supported Spaces

  • Discrete: Describes a discrete space where {0, 1, ..., n-1} are the possible values our observation can take. An optional argument can be used to shift the values to {a, a+1, ..., a+n-1}.
  • Box: An n-dimensional continuous space. The upper and lower arguments can be used to define bounded spaces.
  • Tuple: Represents a tuple of spaces.
  • Dict: Represents a dictionary of spaces.
  • Text: The elements of this space are bounded strings from a charset. Note: at the moment, we don’t guarantee support for all surrogate pairs.

Space Serialization

Spaces are serialized to a JSON format when saving to disk. This serialization supports all space types supported by Minari and aims to be both human- and machine-readable. The serialized action and observation spaces for the episodes in the dataset are saved as strings in the global HDF5 group metadata in main_data.hdf5 as action_space and observation_space respectively. All episodes in main_data.hdf5 must have observations and actions that comply with these spaces.
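As an illustration, the serialization can be exercised directly. The helper names below (serialize_space and deserialize_space from minari.serialization) are an assumption about the internal module rather than documented public API:

import numpy as np
from gymnasium import spaces
from minari.serialization import deserialize_space, serialize_space  # assumed helpers

space = spaces.Dict({"component_1": spaces.Box(low=-1, high=1, dtype=np.float32)})
encoded = serialize_space(space)      # JSON string, as stored in the HDF5 metadata
decoded = deserialize_space(encoded)  # round-trips back to a Gymnasium space
assert decoded == space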

Minari Data Structures

A Minari dataset is encapsulated in the MinariDataset class, which allows iterating over and sampling episodes, each represented by the EpisodeData data class.

EpisodeData Structure

Episodes can be accessed from a Minari dataset through iteration, random sampling, or even by filtering episodes with an arbitrary condition via the filter_episodes method. Take the following example where we load the door-human-v1 dataset and randomly sample 10 episodes:

import minari
dataset = minari.load_dataset("door-human-v1", download=True)
sampled_episodes = dataset.sample_episodes(10)

The sampled_episodes variable will be a list of 10 EpisodeData elements, each containing episode data. An EpisodeData element is a data class consisting of the following fields:

  • id (np.int64): ID of the episode.
  • seed (np.int64): Seed used to reset the episode.
  • total_steps (np.int64): Number of steps in the episode.
  • observations (np.ndarray, list, tuple, dict): Observations for each step, including the initial observation.
  • actions (np.ndarray, list, tuple, dict): Actions for each step.
  • rewards (np.ndarray): Rewards for each step.
  • terminations (np.ndarray): Terminations for each step.
  • truncations (np.ndarray): Truncations for each step.
  • infos (dict): A dictionary containing additional information.

As mentioned in the Supported Spaces section, many different observation and action spaces are supported, so the data type of these fields depends on the environment being used.
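For instance, a minimal sketch continuing the sampling example above and reading a few fields from each sampled episode:

import minari

dataset = minari.load_dataset("door-human-v1", download=True)
for episode in dataset.sample_episodes(10):
    # rewards, terminations and truncations are per-step arrays of length total_steps
    print(episode.id, episode.total_steps, episode.rewards.sum())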

Additional Information Formatting

When creating a dataset with DataCollector, if the DataCollector is initialized with record_infos=True, an info dict must be provided by every call to the environment’s step and reset functions. The structure of the info dictionary must be the same across steps.

Given that not all Gymnasium environments are guaranteed to provide infos at every step, we provide the StepDataCallback, which can modify the infos of a non-compliant environment so that they have the same structure at every step. An example of this pattern is available in our test test_data_collector_step_data_callback_info_correction in test_step_data_callback.py.
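A minimal sketch of such a callback (the info key and default value are illustrative; the exact StepData key name follows the StepData definition of the installed Minari version):

from minari import StepDataCallback

class InfoCorrectionCallback(StepDataCallback):
    """Give every step an info dict with the same structure."""
    def __call__(self, env, **kwargs):
        step_data = super().__call__(env, **kwargs)
        # Illustrative fixed structure: always expose a "success" flag, defaulting
        # to False when the environment did not provide one.
        step_data["infos"] = {"success": step_data["infos"].get("success", False)}
        return step_data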