Dataset Standards#
Minari Storage#
Minari root#
Minari stores the offline datasets under a common root directory. By default, the root directory for local datasets is `~/.minari/datasets/`; this path can be overridden by setting the environment variable `MINARI_DATASETS_PATH`.

The remote datasets are kept in the public Google Cloud Platform (GCP) bucket `minari-datasets`.
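As an illustrative sketch (not Minari's internal code; the function name `resolve_minari_root` is hypothetical), the local root could be resolved like this, with the environment variable taking precedence over the default path:

```python
import os


def resolve_minari_root() -> str:
    """Resolve the local Minari root directory.

    MINARI_DATASETS_PATH takes precedence over the default
    ~/.minari/datasets. Illustrative sketch only.
    """
    default_root = os.path.join(os.path.expanduser("~"), ".minari", "datasets")
    return os.environ.get("MINARI_DATASETS_PATH", default_root)
```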
The first level of the root directory tree contains the Minari dataset directories, which are named after each dataset's id. The dataset id must follow the syntax `(env_name-)(dataset_name)(-v(version))`, where:

- `env_name`: a string that describes the environment from which the dataset was created. For example, if a dataset comes from the `AdroitHandDoor` environment, `env_name` can be equal to `door`.
- `dataset_name`: a string describing the content of the dataset. For example, if the dataset for the `AdroitHandDoor` environment was generated from human input, we can give the value `human` to `dataset_name`.
- `version`: an integer that represents the version number of the `door-human-v(version)` dataset, starting from `0`.

In the end, the id of the dataset for the initial version of the `AdroitHandDoor` environment example will be `door-human-v0`.
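The id syntax above can be checked with a regular expression. The following parser is an illustrative sketch (the pattern and the function name `parse_dataset_id` are not part of Minari's API):

```python
import re

# (env_name-)(dataset_name)(-v(version)): env_name is optional,
# version is a non-negative integer. Illustrative pattern only.
DATASET_ID_RE = re.compile(
    r"^(?:(?P<env_name>\w+)-)?(?P<dataset_name>\w+)-v(?P<version>\d+)$"
)


def parse_dataset_id(dataset_id: str):
    """Split a dataset id into (env_name, dataset_name, version)."""
    match = DATASET_ID_RE.match(dataset_id)
    if match is None:
        raise ValueError(f"Invalid dataset id: {dataset_id}")
    env_name, dataset_name, version = match.group(
        "env_name", "dataset_name", "version"
    )
    return env_name, dataset_name, int(version)
```

For example, `parse_dataset_id("door-human-v0")` yields `("door", "human", 0)`, while an id with no environment prefix leaves `env_name` as `None`.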
Data files#
Each Minari dataset directory contains another directory named `data` where the collected offline data files are stored (more directories are planned for additional information; `_docs` and `policies` are WIP). The `data` directory can contain multiple `.hdf5` files storing the offline data. When a Minari dataset is used, the offline data is loaded homogeneously from all `.hdf5` files as if they were a single file. The names of these files are:

- `main_data.hdf5`: the root file that, aside from raw data, also contains all the metadata of the global dataset and external links to the data in the other files. Minari reads this file when a dataset is loaded.
- `additional_data_x.hdf5`: these files contain raw data. Each of them is generated after making a checkpoint when collecting the offline data with `MinariDataset.update_datasets(env)`.
The following directory tree of a Minari root path contains three different datasets, named `dataset_id-v0`, `dataset_id-v1`, and `other_dataset_id-v0`. The offline data of `dataset_id-v1` is saved in a single `main_data.hdf5` file, while for `dataset_id-v0` the offline data has been divided into multiple `.hdf5` files.
- minari_root
  - dataset_id-v0
    - data
      - main_data.hdf5
      - additional_data_0.hdf5
      - additional_data_1.hdf5
  - dataset_id-v1
    - data
      - main_data.hdf5
  - other_dataset_id-v0
    - data
      - main_data.hdf5
Dataset File Format#
Minari datasets are stored in the `HDF5` file format by using the `h5py` Python interface. We leverage the hierarchical structure of `HDF5` files, composed of `group` and `dataset` elements, to cleanly divide the recorded step data into episode `groups` and to add custom metadata to the whole dataset, to each episode group, or to the individual `HDF5` `datasets` that comprise each episode group.

More information about the features that the `HDF5` file format supports can be found in this link.
HDF5 file structure#
The offline data is organized inside the `main_data.hdf5` file in episode `groups` named `episode_id`. Each episode group contains all the stepping data from a Gymnasium environment until the environment is `terminated` or `truncated`.

The stepping data inside the episode group is divided into some required `datasets` (`StepData`) plus other optional `groups` and nested `sub-groups` such as `infos`. If both the action and observation spaces are simple spaces (not `Tuple` and not `Dict`), then the hierarchical tree of the Minari dataset `HDF5` file will look as follows:
- main_data.hdf5
  - episode_0
    - observations
    - actions
    - terminations
    - truncations
    - rewards
    - infos
      - infos_datasets
      - infos_subgroup
        - more_datasets
    - additional_groups
      - additional_datasets
  - episode_1
  - episode_2
  - episode_id
In the case where the observation space is a relatively complex `Dict` space with the following definition:

```python
spaces.Dict(
    {
        "component_1": spaces.Box(low=-1, high=1, dtype=np.float32),
        "component_2": spaces.Dict(
            {
                "subcomponent_1": spaces.Box(low=2, high=3, dtype=np.float32),
                "subcomponent_2": spaces.Box(low=4, high=5, dtype=np.float32),
            }
        ),
    }
)
```

and the action space is a `Box` space, the resulting Minari dataset `HDF5` file will look as follows:
- main_data.hdf5
  - episode_0
    - observations
      - component_1
      - component_2
        - subcomponent_1
        - subcomponent_2
    - actions
    - terminations
    - truncations
    - rewards
    - infos
      - infos_datasets
      - infos_subgroup
        - more_datasets
    - additional_groups
      - additional_datasets
  - episode_1
  - episode_2
  - episode_id
Similarly, consider the case where we have a `Box` space as the observation space and a relatively complex `Tuple` space as the action space, with the following definition:

```python
spaces.Tuple(
    (
        spaces.Box(low=2, high=3, dtype=np.float32),
        spaces.Tuple(
            (
                spaces.Box(low=2, high=3, dtype=np.float32),
                spaces.Box(low=4, high=5, dtype=np.float32),
            )
        ),
    )
)
```

In this case, the resulting Minari dataset `HDF5` file will look as follows:
- main_data.hdf5
  - episode_0
    - observations
    - actions
      - _index_0
      - _index_1
        - _index_0
        - _index_1
    - terminations
    - truncations
    - rewards
    - infos
      - infos_datasets
      - infos_subgroup
        - more_datasets
    - additional_groups
      - additional_datasets
  - episode_1
  - episode_2
  - episode_id
Note how the `Tuple` space elements are assigned keys of the format `f"_index_{i}"`, where `i` is their index within the `Tuple` space.
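This naming scheme can be sketched with a small helper that flattens a (possibly nested) Python tuple into a dictionary with `_index_{i}` keys. This is an illustration of the key layout only, not Minari's serialization code:

```python
def tuple_to_indexed_dict(values: tuple) -> dict:
    """Map each element of a (possibly nested) tuple to an ``_index_{i}``
    key, mirroring how Tuple space elements are named in the HDF5
    hierarchy. Illustrative sketch only."""
    return {
        f"_index_{i}": tuple_to_indexed_dict(v) if isinstance(v, tuple) else v
        for i, v in enumerate(values)
    }
```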
The required `datasets` found in the episode groups correspond to the data involved in every Gymnasium step call, `obs, rew, terminated, truncated, info = env.step(action)`: `observations`, `actions`, `rewards`, `terminations`, and `truncations`. These datasets are `np.ndarray` or nested groups of `np.ndarray` and other groups, depending on the observation and action spaces, and the shape of all datasets under each required top-level episode key is as follows:

- `actions`: `shape=(num_steps, action_space_component_shape)`. If the action or observation space is a `Dict` or a `Tuple`, then the corresponding entry will be a group instead of a dataset. Within this group there will be nested groups and datasets, as specified by the action and observation spaces: `Dict` and `Tuple` spaces are represented as groups, while `Box` and `Discrete` spaces are represented as datasets. All datasets at any level under the top-level key `actions` will have the same `num_steps`, but `action_space_component_shape` will vary for each particular action space component. For example, a `Dict` space may contain two `Box` spaces with different shapes.
- `observations`: `shape=(num_steps + 1, observation_space_component_shape)`. Observations nest in the same way as actions if the top-level space is a `Tuple` or `Dict` space. The value of `num_steps + 1` is the same for datasets at any level under `observations`. These datasets have one additional element because the initial observation of the environment, returned by `obs, info = env.reset()`, is also saved. `observation_space_component_shape` will vary between datasets, depending on the shapes of the simple spaces specified in the observation space.
- `rewards`: `shape=(num_steps, 1)`, stores the reward returned at each step.
- `terminations`: `shape=(num_steps, 1)`; the `dtype` is `np.bool` and the last element's value will be `True` if the episode finished due to a `terminated` step return.
- `truncations`: `shape=(num_steps, 1)`; the `dtype` is `np.bool` and the last element's value will be `True` if the episode finished due to a `truncated` step return.

The `dtype` of the NumPy array datasets can be of any type compatible with `h5py`.
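The off-by-one relationship between `observations` and the per-step datasets can be sketched with plain NumPy arrays for a hypothetical episode of five steps (the values here are placeholders, not data read from a real file):

```python
import numpy as np

num_steps = 5          # hypothetical episode length
obs_dim, act_dim = 3, 2  # hypothetical Box space dimensions

# observations hold one extra entry: the initial obs from env.reset()
observations = np.zeros((num_steps + 1, obs_dim), dtype=np.float32)
actions = np.zeros((num_steps, act_dim), dtype=np.float32)
rewards = np.zeros((num_steps, 1), dtype=np.float64)
terminations = np.zeros((num_steps, 1), dtype=bool)
truncations = np.zeros((num_steps, 1), dtype=bool)

# only the final step may mark the episode as terminated
terminations[-1] = True

assert observations.shape[0] == actions.shape[0] + 1
```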
The `info` dictionary returned by `env.step()` and `env.reset()` can optionally be saved in the dataset as a `sub-group`. The option to save the `info` data can be set in the `DataCollectorV0` wrapper with the `record_infos` argument.
Additional `datasets` and nested `sub-groups` can also be saved in each episode. This can be the case for environment data that doesn't participate in each `env.step()` or `env.reset()` call of the Gymnasium API, such as the full environment state at each step. This can be achieved by creating a custom `StepDataCallback` that returns extra keys and nested dictionaries in the returned `StepData` dictionary.

For example, the `Adroit Hand` environments in the `Gymnasium-Robotics` project need to store the full state of the MuJoCo simulation, since this information is not present in the `observations` dataset and the environments are reset by setting an initial state in the simulation.

The following code snippet creates a custom `StepDataCallback` and adds a new key, `state`, to the returned `StepData` dictionary. `state` is a nested dictionary with `np.ndarray` values, whose keys are relevant MuJoCo data that represent the state of the simulation: `qpos`, `qvel`, and some other body positions.
```python
from minari import StepDataCallback


class AdroitStepDataCallback(StepDataCallback):
    def __call__(self, env, **kwargs):
        step_data = super().__call__(env, **kwargs)
        # Store the full MuJoCo simulation state alongside the step data.
        step_data['state'] = env.get_env_state()
        return step_data
```
The episode groups in the HDF5
file will then have the following structure:
- episode_id
  - observations
  - actions
  - terminations
  - truncations
  - rewards
  - infos
  - state
    - qpos
    - qvel
    - object_body_pos
Default dataset metadata#
`HDF5` files can have metadata attached to `objects` as `attributes`. Minari uses these `attributes` to add metadata to the global dataset file, to each episode group, and to the individual datasets inside each episode. This metadata can be extended by the user by overriding the `EpisodeMetadataCallback` in the `DataCollectorV0` wrapper. However, some metadata is also added by default to every dataset.

When creating a Minari dataset with the `DataCollectorV0` wrapper, the default global metadata will be the following:
| Attribute | Type | Description |
|---|---|---|
| `total_episodes` | `np.int64` | Number of episodes in the Minari dataset. |
| `total_steps` | `np.int64` | Number of steps in the Minari dataset. |
| `env_spec` | `str` | json string of the Gymnasium environment spec. |
| `dataset_id` | `str` | Identifier of the Minari dataset. |
| `code_permalink` | `str` | Link to a repository with the code used to generate the dataset. |
| `author` | `str` | Name of the author that created the dataset. |
| `author_email` | `str` | Email of the author that created the dataset. |
| `algorithm_name` | `str` | Name of the expert policy used to create the dataset. |
| `action_space` | `str` | Serialized Gymnasium action space describing actions in the dataset. |
| `observation_space` | `str` | Serialized Gymnasium observation space describing observations in the dataset. |
| `minari_version` | `str` | Version specifier of Minari versions compatible with the dataset. |
For each episode group, the default metadata `attributes` are:

| Attribute | Type | Description |
|---|---|---|
| `id` | `np.int64` | ID of the episode. |
| `total_steps` | `np.int64` | Number of steps in the episode. |
| `seed` | `np.int64` | Seed used to reset the episode. |
Statistical metrics are also computed as metadata for the individual datasets in each episode (for now, they are only computed for the `rewards` dataset):

| Metric | Type | Description |
|---|---|---|
| `max` | `np.float64` | Maximum reward value in the episode. |
| `min` | `np.float64` | Minimum reward value in the episode. |
| `mean` | `np.float64` | Mean value of the episode rewards. |
| `std` | `np.float64` | Standard deviation of the episode rewards. |
| `sum` | `np.float64` | Total undiscounted return of the episode. |
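These per-episode statistics can be reproduced with NumPy over a hypothetical `rewards` array (the values below are placeholders for illustration):

```python
import numpy as np

# hypothetical episode rewards
rewards = np.array([0.0, 1.0, -0.5, 2.5], dtype=np.float64)

metrics = {
    "max": rewards.max(),
    "min": rewards.min(),
    "mean": rewards.mean(),
    "std": rewards.std(),
    "sum": rewards.sum(),  # total undiscounted return of the episode
}
```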
Observation and Action Spaces#
The Minari storage format supports the following observation and action spaces:
Supported Spaces#
| Space | Description |
|---|---|
| `Discrete` | Describes a discrete space where `{start, start + 1, ..., start + n - 1}` are the possible values the space can take. |
| `Box` | An n-dimensional continuous space. The upper and lower bounds of the space can be defined. |
| `Tuple` | Represents a tuple of spaces. |
| `Dict` | Represents a dictionary of spaces. |
| `Text` | The elements of this space are bounded strings from a charset. Note: at the moment, we don't guarantee support for all surrogate pairs. |
Space Serialization#
Spaces are serialized to a JSON format when saving to disk. This serialization supports all space types supported by Minari and aims to be both human- and machine-readable. The serialized action and observation spaces for the episodes in a dataset are saved as strings in the global `HDF5` group metadata of `main_data.hdf5`, as `action_space` and `observation_space` respectively. All episodes in `main_data.hdf5` must have observations and actions that comply with these action and observation spaces.
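As a sketch of the idea, a nested space description can be round-tripped through JSON as below. The dictionary schema here is hypothetical; the exact JSON layout Minari writes may differ:

```python
import json

# Hypothetical description of a Dict space containing one Box component.
# Minari's actual serialization schema may differ -- this only
# illustrates the human/machine-readable round trip.
observation_space = {
    "type": "Dict",
    "spaces": {
        "component_1": {"type": "Box", "low": -1, "high": 1, "dtype": "float32"},
    },
}


def serialize_space(space: dict) -> str:
    """Serialize a space description to a deterministic JSON string."""
    return json.dumps(space, sort_keys=True)


serialized = serialize_space(observation_space)
roundtripped = json.loads(serialized)
```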
Minari Data Structures#
A Minari dataset is encapsulated in the `MinariDataset` class, which allows iterating and sampling through its episodes; each episode is represented by the `EpisodeData` data class.
EpisodeData Structure#
Episodes can be accessed from a Minari dataset through iteration, random sampling, or even filtering episodes from a dataset with an arbitrary condition via the `filter_episodes` method. Take the following example, where we load the `door-human-v1` dataset and randomly sample 10 episodes:

```python
import minari

dataset = minari.load_dataset("door-human-v1", download=True)
sampled_episodes = dataset.sample_episodes(10)
```
The `sampled_episodes` variable will be a list of 10 `EpisodeData` elements, each containing episode data. An `EpisodeData` element is a data class consisting of the following fields:
| Field | Type | Description |
|---|---|---|
| `id` | `np.int64` | ID of the episode. |
| `seed` | `np.int64` | Seed used to reset the episode. |
| `total_timesteps` | `np.int64` | Number of timesteps in the episode. |
| `observations` | `np.ndarray` | Observations for each timestep, including the initial observation. |
| `actions` | `np.ndarray` | Actions for each timestep. |
| `rewards` | `np.ndarray` | Rewards for each timestep. |
| `terminations` | `np.ndarray` | Terminations for each timestep. |
| `truncations` | `np.ndarray` | Truncations for each timestep. |
As mentioned in the `Supported Spaces` section, many different observation and action spaces are supported, so the data types of these fields depend on the environment being used.
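The fields in the table above can be sketched as a plain Python data class. This `EpisodeDataSketch` is a hypothetical stand-in for Minari's `EpisodeData`, with the array fields simplified to lists and the values below invented for illustration:

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class EpisodeDataSketch:
    """Illustrative stand-in for minari's EpisodeData; field names follow
    the table above, array fields are simplified to plain lists."""
    id: int
    seed: Optional[int]
    total_timesteps: int
    observations: Any  # one more entry than actions (initial obs included)
    actions: Any
    rewards: Any
    terminations: Any
    truncations: Any


episode = EpisodeDataSketch(
    id=0,
    seed=42,
    total_timesteps=2,
    observations=[0.0, 0.1, 0.2],  # total_timesteps + 1 entries
    actions=[1, 0],
    rewards=[0.0, 1.0],
    terminations=[False, True],
    truncations=[False, False],
)
```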