MinariDataset#

minari.MinariDataset#

class minari.MinariDataset(data: MinariStorage | str | bytes | PathLike, episode_indices: ndarray | None = None)#

Main Minari dataset class to sample data and get metadata information from a dataset.

Initialize properties of the Minari Dataset.

Parameters:

data (Union[MinariStorage, _PathLike]) – source of data.
episode_indices (Optional[np.ndarray]) – slice of episode indices this dataset is pointing to.

Methods#

minari.MinariDataset.sample_episodes(self, n_episodes: int) → Iterable[EpisodeData]#

Sample n_episodes episodes from the dataset.

Parameters: n_episodes (int) – number of episodes to sample.

minari.MinariDataset.iterate_episodes(self, episode_indices: List[int] | None = None) → Iterator[EpisodeData]#

Iterate over episodes from the dataset.

Parameters: episode_indices (Optional[List[int]], optional) – episode indices to iterate over.
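As an illustration of how the iterator might be consumed, the sketch below sums the rewards of each episode. It does not import Minari: `StubEpisode` and `StubDataset` are hypothetical stand-ins that only mimic the `iterate_episodes` signature shown above, and both the `id` and `rewards` fields and the "iterate over everything when `episode_indices` is None" behavior are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, List, Optional

import numpy as np


@dataclass
class StubEpisode:
    """Hypothetical stand-in for EpisodeData; only the fields used below."""
    id: int
    rewards: np.ndarray


class StubDataset:
    """Hypothetical stand-in that mimics the iterate_episodes signature."""

    def __init__(self, episodes: List[StubEpisode]):
        self._episodes = episodes

    def iterate_episodes(
        self, episode_indices: Optional[List[int]] = None
    ) -> Iterator[StubEpisode]:
        # Assumption: None means "iterate over every available episode".
        if episode_indices is None:
            episode_indices = list(range(len(self._episodes)))
        for index in episode_indices:
            yield self._episodes[index]


def episode_returns(dataset, episode_indices=None) -> Dict[int, float]:
    """Map each episode id to the sum of its rewards."""
    return {
        episode.id: float(np.sum(episode.rewards))
        for episode in dataset.iterate_episodes(episode_indices)
    }


dataset = StubDataset([
    StubEpisode(id=0, rewards=np.array([1.0, 0.0, 2.0])),
    StubEpisode(id=1, rewards=np.array([0.5, 0.5])),
    StubEpisode(id=2, rewards=np.array([3.0])),
])
all_returns = episode_returns(dataset)
subset_returns = episode_returns(dataset, episode_indices=[0, 2])
```

Because `episode_returns` only relies on `iterate_episodes`, the same function should work with any object exposing that method, provided the yielded episodes carry the assumed fields.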

minari.MinariDataset.filter_episodes(self, condition: Callable[[EpisodeData], bool]) → MinariDataset#

Filter the dataset episodes with a condition.

The condition must be a callable that takes an EpisodeData instance and returns a bool: True if the condition is met and False otherwise. For example, to keep only the episodes that terminate:

`dataset.filter_episodes(condition=lambda x: x['terminations'][-1])`

Parameters: condition (Callable[[EpisodeData], bool]) – callable that accepts an episode (for the current backend, an h5py episode group) and returns True if the condition is met.
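The filtering semantics can be sketched without Minari installed. In the example below, `StubEpisode` is a hypothetical stand-in for an episode and `keep_matching` is an illustrative helper, not part of the library. The stub exposes `terminations` as an attribute, which is an assumption; depending on the object actually passed to the callable, key access may be needed instead.

```python
from dataclasses import dataclass
from typing import Callable, List

import numpy as np


@dataclass
class StubEpisode:
    """Hypothetical stand-in for an episode; only the field used here."""
    id: int
    terminations: np.ndarray


def ended_in_termination(episode) -> bool:
    """Condition: True if the last step of the episode is flagged as terminated."""
    return bool(episode.terminations[-1])


def keep_matching(
    episodes: List[StubEpisode],
    condition: Callable[[StubEpisode], bool],
) -> List[StubEpisode]:
    """Illustrative helper: keep the episodes for which the condition is True."""
    return [episode for episode in episodes if condition(episode)]


episodes = [
    StubEpisode(id=0, terminations=np.array([False, False, True])),
    StubEpisode(id=1, terminations=np.array([False, False, False])),
    StubEpisode(id=2, terminations=np.array([False, True])),
]
kept_ids = [episode.id for episode in keep_matching(episodes, ended_in_termination)]
```

With a real dataset, a named condition like this would be passed as `dataset.filter_episodes(condition=ended_in_termination)`, which returns a new MinariDataset according to the signature above.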

minari.MinariDataset.set_seed(self, seed: int)#

Set the seed for the random episode sampling generator.
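To show why seeding matters for reproducible sampling, here is a minimal sketch using NumPy only. It is not Minari's implementation: the `episode_indices` array, the `sample_indices` helper, and the choice to sample without replacement are all assumptions made for illustration.

```python
import numpy as np

# Stand-in for MinariDataset.episode_indices (assumed to be an integer array).
episode_indices = np.arange(10)


def sample_indices(seed: int, n_episodes: int) -> np.ndarray:
    """Draw n_episodes distinct indices from a generator seeded with `seed`."""
    rng = np.random.default_rng(seed)
    # Assumption: episodes are drawn without replacement.
    return rng.choice(episode_indices, size=n_episodes, replace=False)


first = sample_indices(seed=42, n_episodes=3)
second = sample_indices(seed=42, n_episodes=3)  # same seed, same draw
other = sample_indices(seed=7, n_episodes=3)    # different seed, independent draw
```

Under these assumptions, calling `set_seed` with the same value before `sample_episodes` would be expected to give the same episodes on each run.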

minari.MinariDataset.recover_environment(self) → Env#

Recover the Gymnasium environment used to create the dataset.

Returns: environment (Env) – Gymnasium environment

minari.MinariDataset.update_dataset_from_collector_env(self, collector_env: DataCollectorV0)#

Add extra data to Minari dataset from collector environment buffers (DataCollectorV0).

This method can be used as a checkpoint when creating a dataset. The new data is written to a new HDF5 file named additional_data_i.hdf5, created in the same directory as main_data.hdf5. The two files are joined by creating external links to each additional episode group: https://docs.h5py.org/en/stable/high/group.html#external-links

Parameters: collector_env (DataCollectorV0) – Collector environment

minari.MinariDataset.update_dataset_from_buffer(self, buffer: List[dict])#

Additional data can be added to the Minari Dataset from a list of episode dictionary buffers.

Each episode dictionary buffer must have the following items:

observations: np.ndarray of step observations. shape = (total_episode_steps + 1, (observation_shape)). Should include the initial and final observations.
actions: np.ndarray of step actions. shape = (total_episode_steps + 1, (action_shape)).
rewards: np.ndarray of step rewards. shape = (total_episode_steps + 1, 1).
terminations: np.ndarray of step terminations. shape = (total_episode_steps + 1, 1).
truncations: np.ndarray of step truncations. shape = (total_episode_steps + 1, 1).

Additional items can be included as long as their values are np.ndarray objects or nested dictionaries.

Parameters: buffer (List[dict]) – list of episode dictionary buffers to add to the dataset
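The buffer layout described above can be sketched with NumPy alone. The helper names `make_episode_buffer` and `check_episode_buffer` are hypothetical, the arrays hold placeholder zeros, and the shapes follow the list above exactly as written; flat observation and action spaces of size `obs_dim` and `act_dim` are assumed for simplicity.

```python
import numpy as np

REQUIRED_KEYS = ("observations", "actions", "rewards", "terminations", "truncations")


def make_episode_buffer(total_episode_steps: int, obs_dim: int, act_dim: int) -> dict:
    """Build one episode dictionary with the documented keys and shapes."""
    n = total_episode_steps + 1
    terminations = np.zeros((n, 1), dtype=bool)
    terminations[-1] = True  # illustrative choice: flag the last entry as terminal
    return {
        "observations": np.zeros((n, obs_dim), dtype=np.float32),
        "actions": np.zeros((n, act_dim), dtype=np.float32),
        "rewards": np.zeros((n, 1), dtype=np.float32),
        "terminations": terminations,
        "truncations": np.zeros((n, 1), dtype=bool),
    }


def check_episode_buffer(episode: dict) -> None:
    """Raise ValueError if a required key is missing or first dimensions disagree."""
    missing = [key for key in REQUIRED_KEYS if key not in episode]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    lengths = {key: episode[key].shape[0] for key in REQUIRED_KEYS}
    if len(set(lengths.values())) != 1:
        raise ValueError(f"inconsistent first dimensions: {lengths}")


# A list of three episode dictionaries, each with five steps.
buffer = [make_episode_buffer(total_episode_steps=5, obs_dim=4, act_dim=2) for _ in range(3)]
for episode in buffer:
    check_episode_buffer(episode)
```

A list built this way is the kind of value that would be passed as `dataset.update_dataset_from_buffer(buffer)` on an existing MinariDataset; the validation helper is only a convenience for catching malformed episodes before that call.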

Attributes#

MinariDataset.spec#

MinariDataset.total_steps#: Total number of steps across all episodes in the Minari dataset.

MinariDataset.total_episodes#: Total number of episodes recorded in the Minari dataset.

MinariDataset.episode_indices#: Indices of the available episodes to sample within the Minari dataset.