Dataset Standards#
Minari Storage#
Minari root#
Minari stores the offline datasets under a common root directory. The root directory path for the local datasets is set by default to `~/.minari/datasets/`. However, this path can be modified by setting the environment variable `MINARI_DATASETS_PATH`.
The remote datasets are kept in the public Google Cloud Platform (GCP) bucket `minari-datasets`.
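As a minimal sketch of the lookup order described above (the helper name `resolve_minari_root` is hypothetical, not part of the Minari API):

```python
import os

def resolve_minari_root():
    """Return the local Minari root, honoring MINARI_DATASETS_PATH."""
    override = os.environ.get("MINARI_DATASETS_PATH")
    if override is not None:
        return override
    # Default root directory for local datasets: ~/.minari/datasets/
    return os.path.join(os.path.expanduser("~"), ".minari", "datasets")
```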
The first level of the root directory tree contains the Minari dataset directories, which are named after the dataset id. The dataset id must follow the syntax `(env_name-)(dataset_name)-v(version)`, where:
- `env_name`: a string that describes the environment from which the dataset was created. If a dataset comes from the `AdroitHandDoor` environment, `env_name` can be equal to `door`.
- `dataset_name`: a string describing the content of the dataset. For example, if the dataset for the `AdroitHandDoor` environment was generated from human input, we can give the value `human` to `dataset_name`.
- `version`: an integer value that represents the version number of the `door-human-v(version)` dataset, starting from `0`.
In the end, the id of the dataset for the initial version of the `AdroitHandDoor` environment example will be `door-human-v0`.
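The id syntax above can be sketched as a small parser. Note that the regular expression below is an illustration of the `(env_name-)(dataset_name)-v(version)` convention, not the validation used by Minari itself; real dataset names may permit additional characters:

```python
import re

# Hypothetical pattern for ids such as "door-human-v0":
# an optional env_name, a dataset_name, and a version suffix.
DATASET_ID_PATTERN = re.compile(
    r"^(?:(?P<env_name>[a-z0-9]+)-)?(?P<dataset_name>[a-z0-9]+)-v(?P<version>\d+)$"
)

def parse_dataset_id(dataset_id):
    """Split a dataset id into (env_name, dataset_name, version)."""
    match = DATASET_ID_PATTERN.match(dataset_id)
    if match is None:
        raise ValueError(f"invalid dataset id: {dataset_id}")
    return match["env_name"], match["dataset_name"], int(match["version"])

print(parse_dataset_id("door-human-v0"))  # ('door', 'human', 0)
```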
Data files#
Each Minari dataset directory contains another directory named `data` where the files of the collected offline data are stored (more directories are yet to be included for additional information; `_docs` and `policies` are WIP). The `data` directory can contain multiple `.hdf5` files storing the offline data. When using a Minari dataset, the offline data is loaded from all of the `.hdf5` files as if they were a single file. The names of these files are:
- `main_data.hdf5`: the root file. Aside from raw data, it also contains all the metadata of the global dataset and external links to the data in the other files. Minari reads this file when a dataset is loaded.
- `additional_data_x.hdf5`: these files contain raw data. Each of them is generated after making a checkpoint when collecting the offline data with `MinariDataset.update_datasets(env)`.
The following directory tree of a Minari root path contains three different datasets, named `dataset_id-v0`, `dataset_id-v1`, and `other_dataset_id-v0`. The offline data of `dataset_id-v1` is saved in a single `main_data.hdf5` file, while for `dataset_id-v0` the offline data has been divided into multiple `.hdf5` files.
- minari_root
  - dataset_id-v0
    - data
      - main_data.hdf5
      - additional_data_0.hdf5
      - additional_data_1.hdf5
  - dataset_id-v1
    - data
      - main_data.hdf5
  - other_dataset_id-v0
    - data
      - main_data.hdf5
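The "external links" mechanism mentioned above is a standard HDF5 feature. The following sketch (with hypothetical episode contents) shows how a root file can expose data stored in a checkpoint file through an `h5py.ExternalLink`, so that reading the root file sees the offline data as if it were a single file:

```python
import os
import tempfile

import h5py

root_dir = tempfile.mkdtemp()
checkpoint_path = os.path.join(root_dir, "additional_data_0.hdf5")
main_path = os.path.join(root_dir, "main_data.hdf5")

# A checkpoint file holding raw episode data.
with h5py.File(checkpoint_path, "w") as checkpoint:
    checkpoint.create_dataset("episode_0/rewards", data=[1.0, 0.0, 1.0])

# The root file links to the checkpoint instead of copying it.
with h5py.File(main_path, "w") as main_file:
    main_file["episode_0"] = h5py.ExternalLink(checkpoint_path, "/episode_0")

# Reading through the root file transparently follows the link.
with h5py.File(main_path, "r") as main_file:
    episode_rewards = main_file["episode_0/rewards"][:]
```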
Dataset File Format#
Minari datasets are stored in HDF5 file format by using the h5py Python interface. We leverage the hierarchical structure of HDF5 files of group and dataset elements to clearly divide the recorded step data into episode groups and add custom metadata to the whole dataset, to each episode group, or to the individual HDF5 datasets that comprise each episode group.
More information about the features supported by the HDF5 file format can be found in the HDF5 documentation.
HDF5 file structure#
The offline data is organized inside the `main_data.hdf5` file into episode groups named `episode_id`. Each episode group contains all the step data from a Gymnasium environment until the environment is terminated or truncated.
The stepping data inside the episode group is divided into some required datasets (StepData) plus other optional groups and nested sub-groups such as infos. The hierarchical tree of the Minari dataset HDF5 file will end up looking as follows:
- main_data.hdf5
  - episode_0
    - observations
    - actions
    - terminations
    - truncations
    - rewards
    - infos
      - infos_datasets
      - infos_subgroup
        - more_datasets
    - additional_groups
      - additional_datasets
  - episode_1
  - episode_2
  - ...
  - episode_id
The required datasets found in the episode groups correspond to the data involved in every Gymnasium step call `obs, rew, terminated, truncated, info = env.step(action)`: `observations`, `actions`, `rewards`, `terminations`, and `truncations`. These datasets are `np.ndarray` objects and their shapes are:
- `actions`: `shape=(number_of_steps, action_space_shape)`. If the action space is a `Dictionary` or a `Tuple`, each step action is flattened before creating the `actions` dataset (currently `Sequence` and `Graph` action spaces are not supported). If using the `DataCollectorV0` wrapper to create the Minari datasets, the saved actions will be automatically flattened by the `StepDataCallback`.
- `observations`: `shape=(number_of_steps + 1, observation_space_shape)`. The observations are also flattened if the observation space of the environment is of type `Dictionary` or `Tuple`. The first axis of the `observations` dataset has an additional element because the initial observation of the environment, returned when calling `obs, info = env.reset()`, is also saved. You can get a transition of the form `(o_t, a_t, o_t+1)` from the datasets in the episode group, where `o_t` is the current observation, `o_t+1` is the next observation after taking action `a_t`, and `t` is the discrete transition index, as follows:

```python
next_observations = observations[1:]
observations = observations[:-1]

# get transition at timestep t
observation = observations[t]            # o_t
action = actions[t]                      # a_t
next_observation = next_observations[t]  # o_t+1
reward = rewards[t]                      # r_t
terminated = terminations[t]
truncated = truncations[t]
```
- `rewards`: `shape=(number_of_steps, 1)`, stores the reward returned at each step.
- `terminations`: `shape=(number_of_steps, 1)`; the `dtype` is `np.bool` and the last element value will be `True` if the episode finished due to a `terminated` step return.
- `truncations`: `shape=(number_of_steps, 1)`; the `dtype` is `np.bool` and the last element value will be `True` if the episode finished due to a `truncated` step return.
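The shape relationship between these datasets can be illustrated with a toy episode of three steps (the observation and action sizes below are hypothetical):

```python
import numpy as np

number_of_steps = 3

# observations also stores the reset observation, hence the extra row.
observations = np.zeros((number_of_steps + 1, 4))
actions = np.zeros((number_of_steps, 2))
rewards = np.zeros((number_of_steps, 1))
terminations = np.zeros((number_of_steps, 1), dtype=bool)
truncations = np.zeros((number_of_steps, 1), dtype=bool)

# Build aligned (o_t, a_t, o_t+1) views as described above.
next_observations = observations[1:]
current_observations = observations[:-1]
```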
The dtype of the numpy array datasets can be of any type compatible with h5py.
The info dictionary returned by `env.step()` and `env.reset()` can optionally be saved in the dataset as a sub-group. The option to save the info data can be set in the `DataCollectorV0` wrapper with the `record_infos` argument.
Also, additional datasets and nested sub-groups can be saved in each episode. This can be the case of environment data that doesn’t participate in each env.step() or env.reset() call in the Gymnasium API, such as the full environment state in each step. This can be achieved by creating a custom StepDataCallback that returns extra keys and nested dictionaries in the StepData dictionary return.
For example, the Adroit Hand environments in the Gymnasium-Robotics project need to store the full state of the MuJoCo simulation since this information is not present in the observations dataset and the environments are reset by setting an initial state in the simulation.
The following code snippet creates a custom `StepDataCallback` and adds a new key, `state`, to the returned `StepData` dictionary. `state` is a nested dictionary with `np.ndarray` values, whose keys are the relevant MuJoCo data that represent the state of the simulation: `qpos`, `qvel`, and some other body positions.
```python
class AdroitStepDataCallback(StepDataCallback):
    def __call__(self, env, **kwargs):
        step_data = super().__call__(env, **kwargs)
        # Store the full MuJoCo simulation state alongside the step data.
        step_data['state'] = env.get_env_state()
        return step_data
```
The episode groups in the HDF5 file will then have the following structure:
- episode_id
  - observations
  - actions
  - terminations
  - truncations
  - rewards
  - infos
  - state
    - qpos
    - qvel
    - object_body_pos
Default dataset metadata#
HDF5 files can have metadata attached to objects as attributes. Minari uses these attributes to add metadata to the global dataset file, to each episode group, as well as to the individual datasets inside each episode. This metadata can be added by the user by overriding the EpisodeMetadataCallback in the DataCollectorV0 wrapper. However, there is also some metadata added by default to every dataset.
When creating a Minari dataset with the `DataCollectorV0` wrapper, the default global metadata will be the following:
| Attribute | Type | Description |
|---|---|---|
| `total_episodes` | `np.int64` | Number of episodes in the Minari dataset. |
| `total_steps` | `np.int64` | Number of steps in the Minari dataset. |
| `flatten_observation` | `bool` | If the observation space had to be flattened. Usually for `Dictionary` or `Tuple` spaces. |
| `flatten_action` | `bool` | If the action space had to be flattened. Usually for `Dictionary` or `Tuple` spaces. |
| `env_spec` | `str` | json string of the Gymnasium environment spec. |
| `dataset_name` | `str` | Name tag of the Minari dataset. |
| `code_permalink` | `str` | Link to a repository with the code used to generate the dataset. |
| `author` | `str` | Author's name that created the dataset. |
| `author_email` | `str` | Email of the author that created the dataset. |
| `algorithm_name` | `str` | Name of the expert policy used to create the dataset. |
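This layered metadata maps directly onto HDF5 attributes. A sketch of attaching attributes at the file, group, and dataset level with h5py (the file path and values are illustrative):

```python
import os
import tempfile

import h5py

path = os.path.join(tempfile.mkdtemp(), "main_data.hdf5")

with h5py.File(path, "w") as f:
    # Global metadata lives on the root file object.
    f.attrs["total_episodes"] = 1
    # Per-episode metadata lives on the episode group.
    episode = f.create_group("episode_0")
    episode.attrs["id"] = 0
    # Per-dataset metadata lives on the individual HDF5 dataset.
    rewards = episode.create_dataset("rewards", data=[1.0, 0.0])
    rewards.attrs["sum"] = 1.0

with h5py.File(path, "r") as f:
    total_episodes = int(f.attrs["total_episodes"])
    episode_id = int(f["episode_0"].attrs["id"])
    rewards_sum = float(f["episode_0/rewards"].attrs["sum"])
```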
For each episode group the default metadata attributes are:
| Attribute | Type | Description |
|---|---|---|
| `id` | `np.int64` | ID of the episode, `episode_id`. |
| `total_steps` | `np.int64` | Number of steps in the episode. |
| `seed` | `np.int64` | Seed used to reset the episode. |
Statistical metrics are also computed as metadata for the individual datasets in each episode (for now these are only computed for the `rewards` dataset):

| Metric | Type | Description |
|---|---|---|
| `max` | `np.float64` | Maximum reward value in the episode. |
| `min` | `np.float64` | Minimum reward value in the episode. |
| `mean` | `np.float64` | Mean value of the episode rewards. |
| `std` | `np.float64` | Standard deviation of the episode rewards. |
| `sum` | `np.float64` | Total undiscounted return of the episode. |
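These five metrics are plain NumPy reductions over an episode's rewards. A sketch with a hypothetical episode:

```python
import numpy as np

# Hypothetical episode rewards; the keys mirror the metrics in the table above.
rewards = np.array([1.0, 0.0, 2.0, 1.0])
metrics = {
    "max": np.max(rewards),
    "min": np.min(rewards),
    "mean": np.mean(rewards),
    "std": np.std(rewards),
    "sum": np.sum(rewards),  # total undiscounted return
}
```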